Forums

Scheduled task for long run didn't run

I have a scheduled task, longrun.py, which runs in a console but not as a scheduled task. This is the log from the scheduled task. I am using Python 2.7.

hello from python 2

2017-04-25 01:29:10 -- Completed task, took 3.00 seconds, return code was 0.

I have a similar script which is a long run of a few scripts together. That one is able to run, but not the one above, which is supposed to run just one task.

Try adding a few more debug print statements to find out exactly what's happening in your script?

I have confirmed again that longrun.py does not run as a scheduled task. It only runs in a console, where it is able to complete. Also, the scheduled task does not show the log every time; many times it is blank, but I know it is not running because it does not update the file.

I assume that if it runs in a console, the scheduled task should work too.

I searched the older posts; is it something related to cluster allocation? Even when longrun.py does run as a scheduled task, it takes longer than in a console: 10 hours vs 4 hours. And the scheduled task gets killed after 10 hours.

That's the most likely explanation, yes: if your task takes a very long time, it will get killed before the end. Try adding some more debug logging so you can find out how long it's taking, and how far through it gets?
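For example (just a minimal sketch, not tailored to your script -- the log file path is hypothetical), you could write timestamped messages to a file at the start and end of each stage, so there's a trace even when nothing shows up in the task log:

import logging

# write timestamped messages to a file so the scheduled task leaves a trace
# even when its normal output goes missing
logging.basicConfig(
    filename='/home/vinasia/longrun_debug.log',  # hypothetical path, change as needed
    level=logging.DEBUG,
    format='%(asctime)s %(levelname)s %(message)s',
)

logging.debug('script starting')
# ... then at each significant point in the script:
logging.debug('finished loading data')
logging.debug('finished first calculation')
logging.debug('script finished')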

Hi harry,

can you explain more about how I should do that? My longrun.py has 4 files that run one by one. Each of a, b, c and d takes about 1 hour to complete, so it is 4 hours in total when longrun.py runs in a console. The problem is that the scheduled task takes much longer than 4 hours. Why is that?

I do not know what debug logging to use for my script. Can you give me an example?

#!/usr/bin/python
print "hello from python 2"
import logging
import socket
import sys

# configure logging so the debug/info messages below actually show up in the task log
logging.basicConfig(level=logging.DEBUG)

lock_socket = None

def is_lock_free():
    global lock_socket
    lock_socket = socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM)
    try:
        lock_id = "vinasia.lr_asia"   # this should be unique. using your username as a prefix is a convention
        lock_socket.bind('\0' + lock_id)
        logging.debug("Acquired lock %r" % (lock_id,))
        return True
    except socket.error:
        # socket already locked, task must already be running
        logging.info("Failed to acquire lock %r" % (lock_id,))
        return False

if not is_lock_free():
    sys.exit()

import a
import b
import c # this will run first.py
import d

The task servers have more contention than console servers.

Given your setup, I guess you'd have to put the extra logging inside the a/b/c/d modules...

What type of logging do I need to put inside the a, b, c, d modules? Please explain further with an example.

2017-04-23 14:12:31 -- Completed task, took 24.00 seconds, return code was 1.

/bin/bash: line 1:   393 Killed                  python /bin/run_scheduled_task.py /home/vinasia/longrun_aus.py

2017-04-25 03:08:12 -- Completed task, took 46566.00 seconds, return code was 137.


hello from python 2

2017-04-25 14:12:07 -- Completed task, took 1.00 seconds, return code was 0.

That's the latest scheduled task log. It is either taking very long and unable to complete, or not able to start at all. I think the issue is purely low resources on the scheduled task servers.

I'd be surprised if the contention made something not start at all -- where are you seeing that?

By adding logging to the modules -- what Harry meant was that you could do something like this:

from datetime import datetime

print("{} About to run a".format(datetime.now())
import a
print("{} Done running a".format(datetime.now())
print("{} About to run b".format(datetime.now())
import b
print("{} Done running b".format(datetime.now())
print("{} About to run c".format(datetime.now())
import c # this will run first.py
print("{} Done running c".format(datetime.now())
print("{} About to run d".format(datetime.now())
import d
print("{} Done running d".format(datetime.now())

Hi giles, it doesn't show anything with the above code when it fails.

  conn.executemany(self.insert_statement(), data_list)
/bin/bash: line 1: 13077 Killed                  python /bin/run_scheduled_task.py /home/vinasia/longrun_asia.py

2017-04-26 15:28:55 -- Completed task, took 16850.00 seconds, return code was 137.

If you already know it is a cluster problem, why don't you guys do something about it on your side? I just want to know whether it will work in PA. If it will not, I need to find another alternative.

In a console, it works fine, consistently.

As a scheduled task, it fails to complete, and it is slow even when it does complete.

You can log in and test my script if you want to know more.

Hi there,

There are 3 possible reasons a scheduled task will be killed:

  1. it took more than 10 hours to run.

  2. it used more than 3GB of RAM.

  3. you used more than 10x your cpu quota for the day.

Currently there are no notifications for cases 1 and 2, but you will get emails for case 3. We're working on improving that. But looking at our logs, I can see examples of all three of these rules being applied to your scheduled task processes.

eg, memory limit exceeded:

2017-04-26 04:18:19,622 INFO:Killing python /home/vinasia/longrun_asia.py for using 3.00GB
2017-04-26 15:28:55,160 INFO:Killing python /home/vinasia/longrun_asia.py for using 3.00GB
2017-04-27 03:38:08,378 INFO:Killing python /home/vinasia/longrun_asia.py for using 3.00GB

timeouts:

Free User vinasia process 17818 bash /home/vinasia/runall.sh was too old: 2017-04-16 19:08:09.070000
Free User vinasia process 24414 bash /home/vinasia/runall.sh was too old: 2017-04-17 19:08:07.610000
Paying User vinasia process 21305 python /home/vinasia/longrun_aus.py was too old: 2017-04-25 06:40:15.180000

tarpit 10x exceeded:

2017-04-08 19:40:11,128 INFO:Killing process 30823 ['python', '/home/vinasia/SGX/ya_panda_sgx.py'] owned by vinasia
2017-04-07 19:20:10,570 INFO:Killing process 21626 ['python', '/home/vinasia/KLSE/ya_panda_klse.py'] owned by vinasia
2017-04-06 19:20:09,710 INFO:Killing process 758 ['python', '/home/vinasia/KLSE/ya_panda_klse.py'] owned by vinasia

Those are the last three events of each kind. As you can see, the most recent ones are happening because you're exceeding the 3GB memory limit. To avoid those, you'll have to see if you can rewrite your scripts to use less RAM...
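One common shape for that kind of rewrite (a rough sketch only, and just a guess at what the scripts do -- it assumes they load one large data file with pandas and then bulk-insert it, which the "ya_panda_*" names and the executemany traceback above suggest; the database, file name, table and columns here are all made up) is to process the data in fixed-size chunks so that only one chunk is in memory at a time:

from __future__ import print_function
import sqlite3  # stand-in for whatever database connection the real scripts use
import pandas as pd

conn = sqlite3.connect('/home/vinasia/example.db')  # hypothetical database

# read and insert the file chunk by chunk, instead of loading the whole thing
# into one DataFrame and calling executemany on millions of rows at once
for chunk in pd.read_csv('/home/vinasia/big_prices.csv', chunksize=100000):  # hypothetical file
    rows = list(chunk.itertuples(index=False, name=None))  # plain tuples for executemany
    conn.executemany('INSERT INTO prices VALUES (?, ?, ?)', rows)  # hypothetical table/columns
    conn.commit()
    print('inserted', len(rows), 'rows')

conn.close()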

Hi Harry,

what do I need to do to reduce the RAM? I do not know which part of the script I can change to reduce it. How is this RAM calculated? If I separate longrun_asia.py (which runs a, b, c, d) into two scripts which run a, b and c, d, will it reduce the RAM by half? Please advise.

Hi giles or Harry

please advise on the RAM issue

The RAM used by your script will be whatever space Python needs to store the data structures that you load into memory.

If you split out and run each of your parts individually, you may find that that reduces the memory use, but it may be that each of your parts uses too much memory. If that's the case you'll have to look at your code to determine where you're loading large data structures into memory and work out how to reduce the memory use.
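If it helps to narrow down where the memory goes, here is a rough sketch (standard library only; the imports of a and b stand in for however longrun.py actually runs its steps) that prints the peak memory after each stage using the resource module:

from __future__ import print_function
import resource

def peak_memory_gb():
    # ru_maxrss is the peak resident set size; it is reported in kilobytes on Linux
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss / (1024.0 * 1024.0)

print('before step a, peak memory so far:', peak_memory_gb(), 'GB')
import a  # runs step a, as in longrun.py
print('after step a, peak memory so far:', peak_memory_gb(), 'GB')
import b
print('after step b, peak memory so far:', peak_memory_gb(), 'GB')
# ... and so on for c and d; the step whose import makes the number jump
# is the one loading the large data structures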

When I run it using longrun.py, it gets killed. But if I run a, b, c, d from a bash file, it normally runs perfectly.

How does the bash file reduce the RAM use while longrun.py increases it? It is the same scripts run in the same sequence.

Also, does the 3GB RAM limit apply only to scheduled tasks?

Running the scripts alone in a console does not hit the same issue.

It applies to all processes on PythonAnywhere.

Hi Glenn,

If that is so, I suspect it only occurs with longrun.py.

I am running the same task using bash, and it is fine.

Why does longrun.py take up more RAM? I only need longrun.py because, when using bash, it may stop running if some instability happens in PA.

If it's making assumptions about the working directory (using relative paths etc.), then it may be working on different data in the console and in the task.

Hi glenn,

I do not get your meaning. Could you explain it in a simpler way?

When you run something from the command line, the working directory is whatever directory you're in. When it's run as a scheduled task, the working directory is your home directory. If your code uses relative paths, then you may be working on entirely different files to the ones that you expect. To confirm it, you can run your script from your home directory and see how it behaves.
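For example (a minimal sketch; the data file name is made up), you can print the working directory at the top of the script, and build paths from the script's own location so it doesn't matter where it is run from:

from __future__ import print_function
import os

# the working directory is wherever you cd'd to in a console,
# but your home directory when run as a scheduled task
print('working directory:', os.getcwd())

# build paths from the script's own location instead of relying on the working directory
SCRIPT_DIR = os.path.dirname(os.path.abspath(__file__))
data_file = os.path.join(SCRIPT_DIR, 'data.csv')  # 'data.csv' is a hypothetical file
print('will read from:', data_file)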

As I said, the files run fine; they are just slow as scheduled tasks and get killed as scheduled tasks.

All my files are in the home directory by the way, so there is no issue like the one you mentioned.

The 3GB issue happens in the scheduled task using longrun.py.

The 3GB issue did not happen when I separated it into 2 files.

The 3GB issue does not happen when it runs as a bash file scheduled task.

So, the 3GB issue obviously only happens with longrun.py. Is longrun.py taking up more resources? This is the thing you need to investigate.

Bash does split the tasks up so that their memory limit is not exceeded, so that makes logical sense.

So what is the suggestion to make longrun.py run without hitting the 3GB limit? I only do minimal printing.

Also, is the limit lower than it was in January?

Hi there,

The limit hasn't changed since January. We will be adding notifications to the memory killer soon.

Regarding how to adapt your longrun.py so it uses less memory, I wonder if running each separate step in a subprocess might work? Something like this:

from __future__ import print_function
import socket
from datetime import datetime
import subprocess
import time
import psutil


GB_MULTIPLIER = 1024 * 1024 * 1024.0

def is_lock_free():
    global lock_socket
    lock_socket = socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM)
# [etc etc, rest of lock code as normal]


def run_python_process(path):
    print(datetime.now(), 'running', path)
    process = subprocess.Popen(['/usr/bin/python2.7', path])  # run python code in a sub-process
    # every 30s or so, check the current memory usage of the sub-process
    while process.poll() is None:  # None means the process is still running
        time.sleep(30)
        usage = psutil.Process(process.pid).memory_info().rss / GB_MULTIPLIER
        print(path, 'currently using', usage, 'GB')

    # when we exit the loop, the process has completed
    print(datetime.now(), path, 'done, return code was', process.returncode)

# now run a,b,c,d in sub-processes:
run_python_process('/home/myusername/a.py')
run_python_process('/home/myusername/b.py')
run_python_process('/home/myusername/c.py')
run_python_process('/home/myusername/d.py')

print('all done!')