Always-On Tasks Hang when Log Files exceed 10 chunks?


I've noticed a pattern - it seems like when I leave an always-on task running with verbose logging, it eventually hangs. The tasks page will still show it running, but the script is supposed to be giving outputs regularly and it isn't. When I clicked the log button to investigate, the log won't load. Going into the logging folder, I see that there are actually 10 logfiles chunks for this always-on tasks, split up.

This script has mysteriously stopped working twice before, and looking in this logging folder, I see that there were two other sets of logs that exceeded 10 chunks... and the timing lines up with around when the always-on task stopped working.

Is this some sort of a bug? Ideally I'd utilize less verbose logging now that I've got my script more debugged, and perhaps add a line in my code that clears old logs on some interval, but additionally it would be great to not have it crash at 10 logfile chunks.

There's also a good chance it's user error. ;-) Let me know if you've seen this before.

Logs rotate daily. Oldest ones are deleted. What code do you use to produce logs?

Sorry for the delay. It just happened again yesterday. The logs are being produced by a third party / open source library, I think it just is using a "print" command. I have some of my own logging set up using I think the library redirected the outputs of all logs to display on the console, which seems to be what is captured in the always on task logs.

You said that the oldest logs are deleted - I've not experienced this to be the case. Every time I click the log for an always-on task, it shows me every log since the always on task was initiated. These only reset when I delete the task and make a new one. Here's an example of what I see in the folder where all the logs are:

Logs w 10 files

The deleted log will be the log.9.gz one -- we move the .log one to .log.1, then compress the old .log.1 and move it to .log.2.gz, and so on upwards.

Is it possible that the logging is continuing to the .log.1 file? Depending on how files are opened and closed, I can imagine that under some circumstances it might do that.

Another possibility is that the third-party logging you're using is just using print statements without the flush=True kwarg -- if that happens then it would still be running and logging, but the log messages would only be written to the disk when Python's internal buffers filled up, which happens (IIRC) after about 1KiB has been written.

Giles, are you saying that anything exceeding Log.9 gets deleted, or that the system should continue to make and compress log files continually, i.e. log.10, log11, etc?

If the latter, I can find no examples, and in fact the always on task seems to hang once it hits log.9.

It doesn't seem like logging is continuing on Log.1 - I see for example, all entries from July 3, then Log 2 has July 2, Log 3 has July 1....

I'll go and examine the library and see what I find for the method it uses for logging.

Just as a hygiene thing I'm going to try to suppress the logging back to the minimum, which will also help.

I looked into the source code. I see this related to logging:

import logging
logger = logging.getLogger(__name__)

and generally when it logs stuff it uses:

Having trouble locating whatever it is referring to as __name__.

My experience with the logging library you set it at some point with logging.basicConfig(). I have mine set like this:

logging.basicConfig(format='%(asctime)s %(message)s',filename='verbose.log',level='INFO')

I want it to print into that verbose.log file but wherever that open-source library is, it seems to redirect it to the console and it prints in the always-on logs instead of my verbose.log file.

There should never be a log file with a filename larger than .log.9.gz -- if you check out your screenshot earlier, you'll see that that is where it stops. Essentially we have a system that runs once a day, and does the renaming/compressing/deleting process that I described above.

Hey - sorry for the delay over here. Seems like the logs are acting the way you are describing.

That makes me think that perhaps there's an issue with my process - after a certain number of days, it hangs and doesn't restart. (which correlates with but isn't necessarily causally related to having ten log files)

Is there anything I should be aware of for long-running tasks on pythonanywhere? Pitfalls you've seen before? I'm sure I can find some general guidance on a forum somewhere, but anything particular to your platform?

My process consists of two scripts: a "Listener" that logs incoming messages from telegram and registers them in a PostgresSQL DB, and a "worker" that works through that DB, modifying and reposting these signals. Both scripts I intend to run perpetually as "always-on" tasks.

Back again. Did some research on some forums (best one was here) and read that I should check into Memory Usage.

Looking on the pythonanywhere forum here, I'm seeing there isn't currently a good way to track this data. Is it possible for one of your staff to see my historic memory consumption? Particularly, does it seem to grow steadily over time and then reset whenever I restart my always-on tasks?

You can use psutil to get the memory used by your process: The important metric in that is RSS - if your RSS use goes above 3G,, the process will be killed.

Another possibility that could be causing issues is that your "Listener" may be losing its connection to telegram, so it doesn't get any messages and just sits waiting. That could appear as a task that is not running. Have you got code in your task that ensures the connection is working?