
Implementing a long-running task script

I have a long-running scheduled task that uses bash, and the bash script runs Python scripts. It stopped after 2 hours plus, without any error. I think it is time for me to implement the long-running-task check: https://help.pythonanywhere.com/pages/LongRunningTasks My scheduled task is:

bash /home/vinus/abc.sh && bash /home/vinus/cde.sh

abc.sh contains

python /home/vinus/first.py && python /home/vinus/second.py

cde.sh contains

python /home/vinus/third.py && python /home/vinus/fourth.py

Where should I put the long-running-task code? In a separate Python script? Let's say LR.py, so abc.sh would become:

 python /home/vinus/LR.py && python /home/vinus/first.py && python /home/vinus/LR.py && python /home/vinus/second.py

Also, how should I set lock_id = "my-username.my-task-name"? Should it be lock_id = "vinus.abc.sh"?

hmm you could maybe write a single python script doing the lock stuff which then calls first/second/third/fourth? (breaking if any of them fail)

Do you mean lock_id = "vinus.first.py", lock_id = "vinus.second.py", lock_id = "vinus.third.py", and lock_id = "vinus.fourth.py", each replacing it inside def is_lock_free()?

I think conrad was suggesting just using a Python script to call the other python scripts?

e.g. launcher.py could have code like:

import sys

from first import first_main_function
from second import second_function
from third import function_3  # etc.

# lock code goes here
if not is_lock_free():
    sys.exit()

first_main_function()
second_function()
function_3()  # etc.

Alternatively you can use subprocess.check_call inside your python launcher script?

The end result is that you have just one lock, which is what you probably want...
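For example, a minimal sketch of the subprocess version, reusing is_lock_free from the help page (the paths are just the ones from this thread):

import subprocess
import sys

# ... paste is_lock_free() from https://help.pythonanywhere.com/pages/LongRunningTasks here ...

if not is_lock_free():
    sys.exit()

# check_call raises CalledProcessError as soon as a script exits with a
# non-zero status, so the chain stops at the first failure, like && in bash
for script in ["/home/vinus/first.py", "/home/vinus/second.py",
               "/home/vinus/third.py", "/home/vinus/fourth.py"]:
    subprocess.check_call(["python", script])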

Sorry, can I ask if I still need this script if I use launcher.py?

import logging
import socket
import sys
from my_module import my_long_running_process

lock_socket = None  # we want to keep the socket open until the very end of
                    # our script so we use a global variable to avoid going
                    # out of scope and being garbage-collected

def is_lock_free():
    global lock_socket
    lock_socket = socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM)
    try:
        lock_id = "my-username.my-task-name"   # this should be unique. using your username as a prefix is a convention
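        # the leading '\0' below puts the socket in Linux's abstract
        # namespace: no file is created on disk, and the lock disappears
        # automatically if the process exits or is killed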
        lock_socket.bind('\0' + lock_id)
        logging.debug("Acquired lock %r" % (lock_id,))
        return True
    except socket.error:
        # socket already locked, task must already be running
        logging.info("Failed to acquire lock %r" % (lock_id,))
        return False

if not is_lock_free():
    sys.exit()

my_long_running_process()

Also, what is the first main function in launcher.py, actually? In my script I did not define any functions, and it is a very long script.

Usually the main function in the launcher is what you would use to launch/start all your other Python scripts. (We did not realise that you were using the first script as the equivalent of the lock/launcher.)

That is a bit different in your case, since you did not define functions in your scripts. There are two ways around this:

  1. just put your scripts into functions, as in the sketch below, and do what Harry suggested
  2. don't import your scripts until the main function (but this isn't super recommended)
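For option 1, a rough sketch of what that could look like (first_main_function is just an example name):

# first.py, with the existing top-level code moved into a function
def first_main_function():
    # ... the current body of first.py goes here, indented one level ...
    print("running first.py")

if __name__ == "__main__":
    first_main_function()  # so "python first.py" still works on its own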

I have limited knowledge of socket libraries. Let's say I only have first.py running as a scheduled task. So the longrun.py script will be run prior to first.py?

python /home/vinus/longrun.py && python /home/vinus/first.py

longrun.py is the script from https://help.pythonanywhere.com/pages/LongRunningTasks

Hi there, I would modify longrun.py to look something like this:

import logging
import socket
import sys

lock_socket = None

def is_lock_free():
    global lock_socket
    lock_socket = socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM)
    try:
        lock_id = "my-username.my-task-name"   # this should be unique. using your username as a prefix is a convention
        lock_socket.bind('\0' + lock_id)
        logging.debug("Acquired lock %r" % (lock_id,))
        return True
    except socket.error:
        # socket already locked, task must already be running
        logging.info("Failed to acquire lock %r" % (lock_id,))
        return False

if not is_lock_free():
    sys.exit()

import first  # this will run first.py
import second # this will run second.py
import third # this will run third.py
# etc

then your scheduled task entry can just be "longrun.py".

Hi Harry,

If first.py has to complete before second.py starts to run, can I still use the imports like you did? And what lock_id should I put? Something like "vinus.first"?

You can use any string as your lock id as long as it's unique for you.

Glenn, if second.py should run after first.py is completed, is importing both at the same time, as shown by Harry, correct?

Yes
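Imports run strictly in order: Python executes all of a module's top-level code before moving on to the next statement, so the second import will not start until the first has finished. For example:

# longrun.py, after the lock check
import first   # blocks here until all of first.py's top-level code has run
import second  # only starts once first.py is completely done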

One question: if longrun.py stops while running second.py, will it restart from first.py or from second.py?

My longrun.sh contains the following, and longrun.py imports first.py:

chmod +x /home/vinus/longrun.py
python /home/vinus/longrun.py

The scheduled task log shows:

hello from python 2

2016-12-01 06:54:42 -- Completed task, took 35.00 seconds, return code was 0.

Running it manually in bash works as normal, but as a scheduled task it does not work. Why?

Hi vinus, to answer the first question, it would re-start from first.py, whenever the script is next scheduled to run.

I'm not sure I understand your second question?

Curious about this also.

Hi Harry, longrun.py is what you showed me. I scheduled the task using longrun.sh, which runs longrun.py. When it runs as a scheduled task, it does not do the work and shows this log:

hello from python 2

2016-12-01 06:54:42 -- Completed task, took 35.00 seconds, return code was 0.

But if I run it in a bash console manually, it works and runs first.py. I don't understand why it does not run as a scheduled task.

Why not just schedule longrun.py directly, without the .sh file around it?

I did use to run longrun.py in scheduled tasks. However, the log (from 'view log' in scheduled tasks) is empty, and since it runs for quite some time, I am not sure if it is running. Previously it showed some data while running, but not now.

The scheduled task longrun.py has this log:

/usr/bin/env python2.7: no such Python interpreter

2016-12-01 15:56:15 -- Completed task, took 236.00 seconds, return code was 0.

I tried changing it to /usr/bin/python; still an error:

/usr/bin/python python2.7: no such Python interpreter

2016-12-01 23:59:41 -- Completed task, took 213.00 seconds, return code was 0.

It's not /usr/bin/python python2.7.

It's just /usr/bin/python.

Or, if you want to specify the version, /usr/bin/python2.7.
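That is, the shebang on the first line of the script should be exactly:

#!/usr/bin/python

or, to pin the version:

#!/usr/bin/python2.7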

Somehow it still does not run as a scheduled task, and when I view the log, it is blank.

    #!/usr/bin/python
    print "hello from python 2"
    import logging
    import socket
    import sys

    lock_socket = None

    def is_lock_free():
        global lock_socket
        lock_socket = socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM)
        try:
            lock_id = "vinus.lr_asia"   # this should be unique. using your username as a prefix is a convention
            lock_socket.bind('\0' + lock_id)
            logging.debug("Acquired lock %r" % (lock_id,))
            return True
        except socket.error:
            # socket already locked, task must already be running
            logging.info("Failed to acquire lock %r" % (lock_id,))
            return False

    if not is_lock_free():
        sys.exit()

    import first.py  # this will run first.py

Did you just not wait long enough for the task to start and run?

What I finally get is this:

hello from python 2

2016-12-02 07:05:55 -- Completed task, took 406.00 seconds, return code was 0.

There is no result, and it is not supposed to complete in such a short time.

Have you checked that importing "first" actually runs the code you want it to? Also, I'd be very surprised if import first.py was correct. Import imports modules, not filenames.

I tested it in a console before, and it did run the code I want. The import first.py was my mistake while copying from the actual code and renaming it to 'first'; it is only import first, not import first.py. So I am surprised why it runs in a console but not as a scheduled task.

[screenshot]

This is it running perfectly in a console.

No idea. Try putting prints into your code to see what's happening when you run it as a scheduled task.

Hi Glenn, it finally works after I also edited first.py to use /usr/bin/python.

But will longrun.py slow down the run? A job that normally takes 3 hours is now taking 9 hours plus with longrun.py. From the log I do not see any rerun.

hmm that's quite weird. were you running the 3 jobs asynchronously before? (vs synchronously now?)

Nope, this script only runs one task. BTW, does the scheduled task log save only the last run, overwriting the previous one if the task reruns? I suspect there was a rerun but the log was overwritten.

I tried another script (the one in the picture above) that runs 4 jobs; it is not running with scheduled tasks. I have changed all the scripts to /usr/bin/python.

Are you using the same lock for the second script as the first script?

No, it is a different lock. This longrun.py really runs much longer than it used to. Also, if there is an error while running and it aborts, it will not rerun. For example:

requests.exceptions.ConnectionError: ('Connection aborted.', error(110, 'Connection timed out'))

that sounds like the site that you are scraping is responding slowly and timing out
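If so, it might be worth passing an explicit timeout to requests, so a hung connection fails quickly instead of stalling the whole task. A sketch, since we haven't seen the scraping code (the URL is made up):

import requests

try:
    # without a timeout, requests can block on a slow server for a very long time
    response = requests.get("http://example.com/quotes", timeout=30)
except requests.exceptions.RequestException as e:
    print("request failed: %r" % (e,))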

I ran longrun.py in a console, with imports of first, second, third and fourth. However, the console ran until second and then closed. When I rechecked, the console had been killed, but second.py had completed. I have tried a few times; all runs get as far as second.py. Does the run follow the sequence first, second, third and last fourth when they are imported in that way?

So you think that your script is consistently getting killed after the second one? Do you know if you are doing anything memory intensive etc? We kill processes that suddenly take up multiple gigabytes of ram.

[screenshot]

You see, the console closed while it was running, and there was no error.

I do not think I used a few gigabytes of RAM; I monitored the run closely. It took 3.5 hours before the console was killed.

do you mind us taking a look at your code?

how can I send the code to you privately?

If you just let us know where the file is, we can go take a look. (We can get access to your code, but we always ask for permission first.)

In /home/vinus, look for longrun_asia.py; all the relevant files are there.

Any findings? Also, the scheduled task takes a much longer time than a manual run in a console:

2016-12-05 03:08:22 -- Completed task, took 45353.00 seconds, return code was 137.

that might be because you are in the tarpit

that 137 exit code means that the task was killed (in this case because it was taking over 12hrs to run).

do you know why your code is taking so long? (ie. is it doing intensive computations? or is it waiting for a response? or is it doing lots of file io?)

I am not sure why it takes so long now when it runs as a scheduled task. Even in the tarpit, longrun.py used to run quite fast, but it was killed halfway. What are your findings after looking at my code?

I see that you loop over a bunch of try/excepts where you sleep for 100s if there is an exception. If you also get it to print out something, you can see if you are getting a lot of those exceptions.

Have you calculated how long it will take if all your attempts to scrape wait for 100s?
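e.g. something along these lines (a sketch, since the names in your actual loop will differ):

import datetime
import time

for stock in stocks:        # whatever your loop already iterates over
    try:
        process(stock)      # your existing scraping/parsing code
    except Exception as e:
        # print the time and the exception instead of sleeping silently,
        # so the log shows how often this branch is actually hit
        print("%s  %s failed: %r" % (datetime.datetime.now(), stock, e))
        time.sleep(100)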

The same script, without longrun.py, runs in about 1 hour 15 minutes to 1 hour 30 minutes on average. I have been running that script for a month; it ran fine for two weeks, and has had more problems since longrun.py. It rarely hits the 100s sleep, maybe once or twice per run; otherwise I could not complete the task in an hour and a bit, right?

The issue is: why are there more uncompleted tasks lately? It was said that scheduled tasks are rarely killed, but lately it is very common. The main question: why is the same script that ran smoothly two weeks ago so easily killed recently? I have no idea, but I really need to know why. I could eliminate the 100s wait, but I believe the outcome would be the same, because it rarely happens.

Furthermore, in the screenshot I posted previously, the task was killed within 3 hours 30 minutes. At that point it had completed two of the 4 scripts and was running the third, so it is impossible that all the loops went up to 100s. I would say there is at most one or two per task, maybe none. You can try to run it and prove my words.

[screenshot]

[screenshot]

Running it in bash manually without longrun.py works fine. So it is down to either longrun.py or scheduled tasks.

If the try/excepts that Conrad mentioned are hiding the exceptions, I would suggest putting in some code to show the exceptions, so you can work out what your code is doing.

If it goes through the time.sleep(100), it will print "100". This script was run as a scheduled task without longrun.py; none of the stocks went into that loop, but it still took 4 hours, 3 times more than in a console. I believe something is not right with scheduled tasks nowadays; they are easily killed off. Here is the output:

    NA
    ###############
    PU4D
    ###############
    abc
    old namePU4D
    -3.23
    0
    -8.2
    ###############
    N6GD
    ###############
    abc
    old nameN6GD
    eps1g
    def
    0
    0
    NA
    ###############
    K3GD
    ###############
    abc
    old nameK3GD
    def
    0.1747
    0.0155
    46.62
    ###############
    K3TD
    ###############
    abc
    old nameK3TD
    def
    2.5
    0.6109
    6.57
    4:06:26.330975

We've added some extra capacity to the cluster. Will you let us know how it goes on the next run?

Also, maybe try printing out the datetime for each of those lines, so you can pinpoint which part of your code is taking the longest?
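For example (a sketch; stock stands for whatever your loop variable is):

import datetime
print("%s  %s" % (datetime.datetime.now(), stock))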

Today it is clear: it completed for the first time in over a week, in a normal time frame like 3-4 weeks ago. However, this is without longrun.py, and I also killed off all other scheduled tasks to run these two tasks. I need to test the consistency. What do you mean by adding some capacity to the cluster? How did it affect me these 2 weeks but not before that?

We just added more machines for everyone's tasks, so there would have been less resource contention. It may have just been more usage over the past couple of weeks, so your task started getting slower, and then it got better after we added more machines.