Forums

Recurrent 500 Server Errors when loading Web pages

Hello everyone,

I discovered this site a few hours ago and signed up for a Hacker account so that I can run my Python script without leaving my computer on all day.

But when checking my script's error logs, I noticed that a non-negligible number of attempts to connect to sites (in my case, making MediaWiki API calls) failed with a 500 status. Can someone explain this to me? Is it related to a bandwidth limit? If so, what is the upper limit on bandwidth use allowed by PythonAnywhere's servers?

Best regards,

I don't know if it's related, but scripts that I ran today with no problems could not consistently reach outside (whitelisted) URLs last night. And now tonight they are having problems again, except this time they can't reach the sites at all. They ran just fine earlier and I haven't changed the code. I ran them locally just now and they are fine, so it isn't an issue with the external servers. One of the sites involved is NCBI's Entrez service, which BioPython uses.

Last night I found myself in the middle of a 502 Bad Gateway while I was coding on my project. Right now none of my scripts/web apps is affected, but seeing 30-minute downtimes happen quite frequently makes me wonder whether uptime can be improved, and how these frequent downtimes affect visitor loyalty and Googlebot.

First things first -- we're really sorry about the outage last night. Between 22:09 and 22:42 UTC there was a DNS outage affecting some regions of Amazon's EC2 service, including the region where PythonAnywhere is based. By the time we were all logged on and investigating, Amazon had sorted it out. We'll look into whether there are ways we can improve our infrastructure to become more resilient to that kind of thing. Apologies also for not posting an update about that one sooner.

Some of the other outages we've had recently are due to certain scaling issues in our own systems -- that is, they're not Amazon's fault! -- and we're working on some changes (including a complete upgrade of our underlying Linux systems from Debian Squeeze to Ubuntu Raring Ringtail). That's going through testing now, and hopefully we'll have a more stable platform when it's done.

Improving our uptime is a really high priority for us right now; that's why there haven't been any features added over the last month or so: we're focusing 100% on making the system more reliable with its current feature set.

Right, moving on to the earlier questions in this thread:

@radotranonkala -- there's no hard bandwidth limit right now. If I understand correctly, you're running a script on PythonAnywhere that's making requests to some other server somewhere else, and it's those requests that are getting 500 errors. Is that right? If so, the problem looks like it must be at the other end -- a 500 error is a server error.

@wayne461 -- access to external internet sites for free users goes via a proxy server (paying customers have direct net access) and the proxy server associated with your account is being hammered by another user who is attempting to log in to an online role playing game several hundred times a second. I'm tracking down the rogue process and will stop it ASAP.
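
As an aside, most Python HTTP libraries will pick the proxy up from the HTTP_PROXY/HTTPS_PROXY environment variables automatically; if you ever need to point one at a proxy explicitly, the pattern with requests looks something like this (the proxy address below is a placeholder for illustration, not necessarily the one your account uses):

```python
import os
import requests

# Placeholder proxy address -- substitute whatever your environment actually uses.
PROXY = os.environ.get("HTTPS_PROXY", "http://proxy.example.com:3128")

proxies = {
    "http": PROXY,
    "https": PROXY,
}

# requests also picks up HTTP_PROXY/HTTPS_PROXY from the environment
# automatically, so explicit configuration is only needed to override that.
response = requests.get(
    "https://en.wikipedia.org/w/api.php",
    params={"action": "query", "meta": "siteinfo", "format": "json"},
    proxies=proxies,
    timeout=30,
)
print(response.status_code)
```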

OK, the proxy problem should be fixed now -- let me know if you have any further problems.

Thanks, Giles! I'm back in business with that proxy issue fixed. At first I thought I should have posted my original concern (https://www.pythonanywhere.com/forums/topic/685/#id_post_4974) from the account that matches my Twitter handle, but that would probably have made it harder for you to sort out, since it was caused by something on the server where that account was running. I guess I should have added to my tweet the PA handle I favor for more 'daily' work tasks. I'll try to remember next time. But thanks for getting back to me.

Thank you, Giles, and thanks to all the other users for their answers. I am happy to know that my error log isn't being flooded with server errors while I'm sleeping :-)

It is still intriguing, as server errors almost never occur when I launch the script at home... Never mind; the most important thing is that my scripts are running fine.

@radotranonkala: I would try logging the entire response from one of those 500s; it might tell you something. Could it be that Wikipedia is rate-limiting connections by IP address? That might explain the behaviour, though I wouldn't expect them to return a 500 for that.
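
Something like this minimal sketch (using the requests library; the query parameters are just an example, swap in whatever your script actually sends) would capture the whole thing:

```python
import logging
import requests

logging.basicConfig(filename="api_errors.log", level=logging.INFO)

# Example MediaWiki API query -- substitute the call your script is making.
url = "https://en.wikipedia.org/w/api.php"
params = {"action": "query", "titles": "Python", "format": "json"}

response = requests.get(url, params=params, timeout=30)
if response.status_code >= 500:
    # Log everything the server sent back: status, headers, and body.
    # A rate-limiting proxy or an overloaded backend will often explain
    # itself in the response body even when the status code is generic.
    logging.error("HTTP %s from %s", response.status_code, response.url)
    logging.error("Headers: %r", dict(response.headers))
    logging.error("Body: %s", response.text[:2000])
```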

Oh, and @fomightez -- no problem! We had a bit of a miscommunication in the office earlier about who should be handling questions on the forums; otherwise we'd have got back to you sooner.

@giles: Wikipedia and other Wikimedia projects have some kind of rate limiting, but only when you try to edit pages, and only when you are a newbie. When I launch the script at home, server errors do occur, but very, very rarely. So I was astonished to get so many server errors here.

The framework I am using automatically limits the pace at which I interact with the MediaWiki API, so I assume it may be a bug in the framework that incorrectly sets the sleep time between two API calls...
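
For reference, the kind of client-side throttle I mean looks roughly like this (my own sketch, not the framework's actual code; the 5-second interval is just an example value):

```python
import time

class Throttle:
    """Enforce a minimum delay between consecutive API calls."""

    def __init__(self, min_interval=5.0):  # seconds; example value only
        self.min_interval = min_interval
        self.last_call = 0.0

    def wait(self):
        # Sleep only for the remainder of the interval, so slow calls
        # don't accumulate extra delay.
        elapsed = time.monotonic() - self.last_call
        if elapsed < self.min_interval:
            time.sleep(self.min_interval - elapsed)
        self.last_call = time.monotonic()

throttle = Throttle(min_interval=5.0)
for title in ["Python", "MediaWiki", "API"]:
    throttle.wait()
    # ... make the API call for `title` here ...
```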

Now everything seems to be fine (the error rate in my scripts is back to "normal"). A big thanks to the staff :-)

Interesting, so they're using 500 errors to report that you're being rate limited? Odd, but I'm sure they have their reasons :-) Possibly the problem is that other people here are also accessing Wikipedia, and they're using the IP address as the key to rate limit on instead of the account. Which would also be weird, but...

I'd expect a 4xx error from a rate-limited service, either 420 if you're Twitter or 429 if you're anybody else. Still, it's entirely possible that Wikimedia haven't read their own page on the subject.
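
If a service does return a proper 429, the polite thing for a client to do is honour the Retry-After header. A rough sketch (the function, backoff values, and retry count here are just illustrative):

```python
import time
import requests

def get_with_retry(url, params=None, max_retries=5):
    """Retry on 429 responses, honouring the Retry-After header if present."""
    for attempt in range(max_retries):
        response = requests.get(url, params=params, timeout=30)
        if response.status_code != 429:
            return response
        # Retry-After may be absent or non-numeric (it can also be an
        # HTTP date); fall back to simple exponential backoff in that case.
        try:
            delay = float(response.headers["Retry-After"])
        except (KeyError, ValueError):
            delay = 2 ** attempt
        time.sleep(delay)
    return response
```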

As for rate-limiting by IP address, that doesn't surprise me. Many people seem to assume that a client IP address maps fairly closely to a unique user, or at least a small group of users. I can see why people think this, but it completely disregards situations such as carrier-grade NAT, which blow that assumption clean out of the water. That said, unless you have an authenticated API, the IP address is about the only reliable token you've got. The correct approach, in my opinion, is to require authentication for all API requests.
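
To make that concrete, rate limiting keyed on an API token rather than the client IP might look roughly like this on the server side (a toy in-memory sketch with made-up limits, not production code):

```python
import time
from collections import defaultdict

WINDOW = 60.0      # seconds -- example values only
MAX_REQUESTS = 30  # per token per window

_request_times = defaultdict(list)  # token -> timestamps of recent requests

def allow_request(api_token):
    """Return True if this token is still within its rate limit."""
    now = time.monotonic()
    # Keep only the timestamps that fall inside the current window.
    recent = [t for t in _request_times[api_token] if now - t < WINDOW]
    _request_times[api_token] = recent
    if len(recent) >= MAX_REQUESTS:
        return False  # caller should respond with 429
    recent.append(now)
    return True
```

Keying on the token means two users behind the same NAT (or carrier-grade NAT) get independent quotas, which is exactly what keying on the IP address gets wrong.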

Once more or less everyone is using IPv6 all the way out to the client (probably in about 100 years, at the current rate of adoption), the "one user one address" assumption will become somewhat more valid.