Scheduled Task Stops after 1400 Line Entries

I have a daily scheduled task (an online scraper) that can take as much as 6 hours to complete. When I manually run it, it works fine and goes to full completion, writing around 40k lines of text entries to a text file.

But when I have it set up in the scheduler, it has run correctly on 3 different days, yet each time it wrote exactly (and only) 2401 text entries and nothing more.

Does anyone have any idea what's going on?

PS: there's an error in the title. It should be 2400, but I cannot edit that.

If it's spewing out a lot of output (i.e. lots of prints), it can fill the output buffer. We have a ticket to fix this, but at the moment, the workaround is to print less. Just writing to the file should be fine.
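One way to cut down on prints without losing progress messages is to send them to a log file instead. This is only a rough sketch using Python's standard logging module; the function name make_file_logger, the logger name "scraper" and the log path are made up for illustration and are not anything the scheduler requires.

```python
import logging


def make_file_logger(path, name="scraper"):
    """Return a logger that writes to `path` and never to stdout/stderr."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    # Don't pass messages up to the root logger, which may print to the console.
    logger.propagate = False
    # Attach the file handler only once, even if this is called repeatedly.
    if not logger.handlers:
        handler = logging.FileHandler(path)
        handler.setFormatter(logging.Formatter("%(asctime)s %(message)s"))
        logger.addHandler(handler)
    return logger
```

You would then call something like log = make_file_logger("scraper.log") once at startup and replace each print(...) with log.info(...).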

In general it's a good idea to have scripts create output files directly, rather than relying on output redirection (as much as the Unix authors might disagree with me!), particularly for long-running ones. It can also be helpful to flush after each record if you're worried about crashes etc. but typically this isn't necessary (and could cause your script to run more slowly).

Don't forget to append the newline, however:

import os
# open() does not expand "~", so expand it explicitly
with open(os.path.expanduser("~/output.txt"), "w") as fd:
    for line in my_output_generator():
        fd.write(line + "\n")

Whatever you do, don't be tempted to use fd.write("\n".join(lines)), as I see quite a bit. It builds the whole output in memory before writing anything, which really isn't a good idea if you expect a large amount of output.
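If you do want the occasional flush, something along these lines would work. It's a minimal sketch, not a definitive recipe: write_records and the flush_every parameter are names I've invented here, and the default of 1000 is an arbitrary guess you'd want to tune.

```python
import os


def write_records(path, records, flush_every=1000):
    """Stream records to `path` one line at a time; return how many were written."""
    count = 0
    # open() does not expand "~", so do it explicitly.
    with open(os.path.expanduser(path), "w") as fd:
        for count, record in enumerate(records, start=1):
            fd.write(record + "\n")
            # Flush every `flush_every` records so a crash loses at most
            # that many lines; pass 0 to skip the explicit flushes.
            if flush_every and count % flush_every == 0:
                fd.flush()
    return count
```

Because records can be a generator, only one record is held in memory at a time, e.g. write_records("~/output.txt", my_output_generator()).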