Forums

Performance issues??

Are you guys experiencing performance issues of any type? I have a couple of apps that have never had perf issues and are suddenly very slow, including my test site and another completely separate codebase, which suggests it isn't load.

Your admin pages are suddenly really slow too. Seeing 11 sec on my "webapps" admin page. Seeing perf issues on my site from external monitors too (not just me).

19 seconds to refresh your webapps admin page just now.

Hoping to see communication soon!

Me too with a Telegram bot

Recent performance issues. Times are UTC -6.

Currently the site is just timing out a good percentage of the time.

20 seconds to load "webapps" admin page.

PythonAnywhere confirmed over Twitter that they are having issues.

https://twitter.com/pythonanywhere/status/1111664664587296773

PythonAnywhere just replied on that same Twitter thread that they are recovering. Seeing the same for my app.

Thanks for the posts! We had some performance problems on one of our servers. It should have recovered by now. We are investigating the root cause, and how we can monitor for this and be alerted sooner.

We did, thanks! Looking forward to the root cause analysis, etc.

For the future, it would have been helpful if you had proactively tweeted that you were having issues, or put something in the forums, the blog, etc.

I did some of my own troubleshooting initially and then checked all those locations.

I know sometimes it is hard to determine if that is worth doing and to take the time to do it, but more visibility sooner would have been very helpful and would have reduced the duplicate messages.

Will do, we are definitely looking at ways to be more proactive/be alerted earlier ourselves :p

Are you guys going to publish some type of summary of what happened, lessons learned, and next steps? Hoping that you do.

We certainly plan to. A brief summary for now:

  • Your files are stored on file servers, which make the stuff they store available to the systems where your code actually runs (and to our own web servers for the "Files" page inside our own site) via NFS.
  • Those file servers constantly sync the data they store over to backup servers, which mirror the data and give us an easy way to take daily snapshots for disaster recovery purposes.
  • Mid-afternoon on Friday, we noticed that access to file server 3 was taking an excessive amount of time. This meant that people whose storage was on that server saw a sudden slowdown. Additionally, because people can see their files when they're logged in to our own website, our own site slowed down -- each time someone whose storage was on file 3 viewed the "Files" page, the worker process that was handling that request would take a long time to return, which meant that it was tied up and unable to handle other requests.
  • Our alerting system kicked in (essentially because of the slowdown on our own site) and we started investigating.
  • We discovered that the file server was running slowly because the synchronization process that mirrors data to its associated backup server was taking up excessive resources, so we turned off synchronization temporarily.
  • At that point things started recovering; the file server was still quite slow for a while as it worked its way through the backlog of requests, but after ten minutes or so it was back to normal.
  • We monitored for an hour or so, then switched synchronization back on and watched closely.
  • After about 20 minutes the backup server was back in sync, and the synchronization process was running normally.

We're still not sure what caused this behaviour in the synchronization process, but we'll post something if and when we find something out.

You're quite right, we should have tweeted something about this as soon as we realised there was an issue; in general we put status updates like that on Twitter rather than here in the forums because when there's a system issue, the forums might not be available (and likewise the blog -- it's just a regular PythonAnywhere website).
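To illustrate the worker tie-up described in the summary above (a slow file-server read holds a worker, so unrelated requests queue behind it), here is a minimal sketch. It is not PythonAnywhere's code: the thread pool stands in for a fixed set of web worker processes, `time.sleep` stands in for a slow NFS read, and the function name, pool size, and delays are all made up for the example.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def serve(storage_delays, workers=2):
    """Serve one request per delay on a fixed-size pool.

    Returns, for each request, the seconds between the start of the batch
    and the moment that request finished (so queueing time is included).
    """
    start = time.monotonic()

    def handle(delay):
        time.sleep(delay)  # stand-in for reading files over slow shared storage
        return time.monotonic() - start

    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(handle, storage_delays))


# Six quick requests: nothing waits for long.
healthy = serve([0.01] * 6)

# Two requests hit slow storage first and occupy both workers,
# so the four quick requests behind them have to wait.
degraded = serve([0.3, 0.3, 0.01, 0.01, 0.01, 0.01])

print("slowest request, healthy:  %.2fs" % max(healthy))
print("fastest 'quick' request, degraded: %.2fs" % min(degraded[2:]))
```

In the degraded run even the requests that never touch slow storage finish only after the slow ones release a worker, which matches the symptom of the whole site slowing down.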

This is great, thank you very much for sharing the details!

Guys, did you have the same problems again last night?

2019-04-09 00:10:28,534: The request's session was deleted before the request completed. The user may have logged out in a concurrent request, for example.
2019-04-10 21:59:14,757: OSError: write error

The same as it was on March 29.

I think those "OSError: write error" messages are primarily about client disconnects. That said, I did get alerts last night for a perf problem, and when I went to look at the log on pythonanywhere.com the initial page load (from pythonanywhere.com) seemed really slow. Then latency improved. It's not clear whether there was a PA problem, or whether it was something with my site and the one slow PA page load was a coincidence.

I am also experiencing these issues. Sometimes it takes minutes to load my website or the admin pages (dashboard, console, etc.).

@Aleksey we did have some unrelated issues at around the timestamp of that first error message, 2019-04-09 00:10:28,534 -- a database outage.

@cmckulka -- were the problems you saw over the night 9 April to 10 April UTC? Or the night 10 April to 11 April?

@dull -- when were these problems?

My alert fired on 4/10 at 23:20 UTC. That monitor only checks every five minutes (unless it is in a failed state), so I don't know when between 23:15 and 23:20 it started. It only lasted about a minute.

That is when I noticed what appeared to be a slow response on the PythonAnywhere dashboard (I checked immediately). The monitor cleared about a minute later; I verified it manually, and the performance of the PA dashboard page had returned to normal.
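For anyone comparing alert times against server logs, the 23:15 to 23:20 uncertainty above is just the polling interval subtracted from the alert time. A small sketch of that arithmetic, assuming the previous check passed; the function name is made up, and the year 2019 is taken from the log timestamps earlier in the thread:

```python
from datetime import datetime, timedelta, timezone


def onset_window(alert_time, check_interval):
    """Return (earliest, latest) times the problem could have started.

    A monitor that polls every `check_interval` and passed its previous
    check only tells you the problem began after that check and no later
    than the check that failed.
    """
    return alert_time - check_interval, alert_time


# Alert at 23:20 UTC on 10 April (year assumed from the logs above),
# monitor polling every five minutes.
alert = datetime(2019, 4, 10, 23, 20, tzinfo=timezone.utc)
earliest, latest = onset_window(alert, timedelta(minutes=5))
print(earliest.strftime("%H:%M"), "to", latest.strftime("%H:%M"))
```

So when searching server-side logs, it is worth looking at the whole interval before the alert rather than only the minute it fired.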

Thanks! That's interesting, you're the third person to report a slowdown over that period, but there's nothing in our logs at around that time. We'll keep an eye out.

The problems I had were happening at the time I posted

This is very weird. If anyone else is seeing random slowdowns, please do let us know (via email to support@pythonanywhere.com if the site is too slow to post here!)

Right now the site seems to be really slow for me too. More than 30 sec to load some pages.

We had some issues with our database and it took a long time for it to recover. We are back up now.