Site errors starting at Apr 3 11:45 AM UTC

Looks like you are having some type of issue? My site http://ifatc.org is returning a lot of errors: timeouts, load balancer errors and 500s returned to the client. The PA site access log shows lots of 499s, 502s and 504s mixed with 200s. It started at about Apr 3 11:45 AM UTC, according to both outside monitors and the errors in the access log. Tried reloading the site, no help. No changes on my end.

Seems to have resolved around 12:50 PM UTC. I did reload my web server for good measure, although I wasn't seeing any application-level issues, so I'm not sure if that had anything to do with the resolution.

From a deeper dig, it looks like your site was overloaded for that time. Essentially, you were getting too many hits for the number of workers you have. Each worker can handle one request at a time. When there are no free workers to handle a request, it gets put into a queue and you will see the response times increase. After a while, the queue gets full and you start to see errors. I believe that what you saw was higher traffic (or perhaps just enough of a slowdown from the things you connect to) over that time and not enough workers to service the requests. In general, your web app looks like it has enough workers for the usual load, but needs a few more to handle what happened over that hour.
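
To make the worker model described above concrete, here is a rough back-of-the-envelope sketch. It assumes steady arrivals, one request per worker at a time, and an unbounded queue. The function names and every number in it (5 requests per second, 0.1 s per request, a 100x slowdown) are made up for illustration and are not measurements from this site.

```python
def max_throughput(workers: int, service_time_s: float) -> float:
    """Requests per second a pool can sustain if each worker
    handles exactly one request at a time."""
    return workers / service_time_s


def backlog_after(arrival_rate: float, workers: int,
                  service_time_s: float, duration_s: float) -> float:
    """Requests left waiting after duration_s, assuming a steady
    arrival rate and a queue that never rejects anything."""
    surplus = arrival_rate - max_throughput(workers, service_time_s)
    return max(0.0, surplus * duration_s)


# Hypothetical numbers, chosen only to illustrate the effect.
normal = backlog_after(arrival_rate=5, workers=4, service_time_s=0.1, duration_s=60)
slow_4 = backlog_after(arrival_rate=5, workers=4, service_time_s=10.0, duration_s=60)
slow_6 = backlog_after(arrival_rate=5, workers=6, service_time_s=10.0, duration_s=60)
print(normal, slow_4, slow_6)
```

With these invented numbers, four workers keep up easily at normal speed, but once each request takes 100x longer the queue grows by several requests every second even though traffic has not changed. Under the same assumptions, going from four to six workers only shrinks the backlog slightly.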

Thanks for investigating more deeply. I checked the traffic level prior to the event and it was very much in the normal range and far below peaks. Looking at the access log, there is no spike of traffic. Responses suddenly just get 100x or more slower.
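
One way to check whether a log shows a traffic spike or a slowdown is to bucket requests per minute and compare the request count with the median response time. The sketch below assumes the access log has already been parsed into `(timestamp, status, response_time_seconds)` tuples. The parsing step is left out because the log format is not shown here, and the sample rows, including the year, are invented.

```python
from collections import defaultdict
from datetime import datetime
from statistics import median


def per_minute_summary(records):
    """Group (datetime, status, response_time_s) records by minute and
    report request count, error count and median response time."""
    buckets = defaultdict(list)
    for ts, status, rt in records:
        buckets[ts.replace(second=0, microsecond=0)].append((status, rt))

    summary = {}
    for minute, rows in sorted(buckets.items()):
        summary[minute] = {
            "requests": len(rows),
            # 499 (client gave up) and all 5xx codes count as errors here.
            "errors": sum(1 for status, _ in rows if status >= 499),
            "median_rt": median(rt for _, rt in rows),
        }
    return summary


# Invented sample rows, not taken from the real access log.
sample = [
    (datetime(2000, 4, 3, 11, 44, 10), 200, 0.2),
    (datetime(2000, 4, 3, 11, 44, 40), 200, 0.3),
    (datetime(2000, 4, 3, 11, 45, 5), 504, 30.0),
    (datetime(2000, 4, 3, 11, 45, 50), 499, 25.0),
]
for minute, stats in per_minute_summary(sample).items():
    print(minute, stats)
```

If the request count per minute stays flat while the median response time and error count jump, that points to a slowdown rather than a traffic spike.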

I increased the web workers from four to six for grins but I'm not at all convinced that is the issue.

I wish we had server stats to review so we could check for things like a memory leak. As it is, we are both basically left to speculate.

In any event thanks again for confirming you didn't have any issues you are aware of.

Maybe send us an email to support@pythonanywhere.com, and we can provide more detailed data for you.

Thanks. Done.