Flask web app suddenly showing "502 Bad Gateway" : Forums : PythonAnywhere

Flask web app suddenly showing "502 Bad Gateway"

Problem cropped up suddenly on the web server. App runs fine on my local dev server. Please help!

deleted-user-73269 | 2 posts | April 2, 2013, 8:58 p.m.

Having the same problem. It seems to be an issue on PythonAnywhere's side, and not just with Flask apps.

deleted-user-69883 | 2 posts | April 2, 2013, 9:04 p.m.

We're investigating.

giles | 12063 posts | PythonAnywhere staff | April 3, 2013, 12:53 a.m.

Looks like one of our web servers got overloaded, which took out web apps that were hosted there. We've switched everything over to a backup, and I've double-checked that all web apps belonging to people who've posted on this thread are running.

Apologies for the outage. Tomorrow we'll look into the underlying cause and why we didn't get an automated alert.

giles | 12063 posts | PythonAnywhere staff | April 3, 2013, 1:51 a.m.

Tomorrow? It was already tomorrow when you wrote that...☺

Of course I know what you meant. Be sure to get some sleep now...☺

a2j | 684 posts | April 3, 2013, 3:46 a.m.

The problem with monitoring systems is that it's very hard to notice when they stop working. Quis custodiet ipsos custodes?

deleted-user-39880 | 669 posts | April 3, 2013, 8:42 a.m.

Indeed... But it really looks like there was a case here that our systems didn't pick up - most web apps were fine, it was just new ones and those that hadn't been hit for a while.

giles | 12063 posts | PythonAnywhere staff | April 3, 2013, 8:57 a.m.

Ahh, nasty. So presumably the monitoring uses a well-known sample service which is kept nice and active by the fact that the monitoring system polls it frequently, and hence it didn't get "idled out" and suffer the issue. Ouch!

deleted-user-39880 | 669 posts | April 3, 2013, 9:01 a.m.

Precisely. And (thinking about it) the regular polls from the monitoring service (we use Pingdom) would have kept it awake anyway...
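
The failure mode described here can be reproduced in a toy simulation. This is only a sketch, not PythonAnywhere's implementation: the `IDLE_TIMEOUT` value, the `App` class, and the `wakeup_broken` flag are all invented for illustration. It shows why an app that is polled more often than the idle timeout never exercises the wake-up path.

```python
from dataclasses import dataclass
from typing import List, Tuple

IDLE_TIMEOUT = 300  # seconds without a request before an app is idled out (made-up value)


@dataclass
class App:
    name: str
    last_hit: float

    def is_idle(self, now: float) -> bool:
        return now - self.last_hit > IDLE_TIMEOUT


def request(app: App, now: float, wakeup_broken: bool) -> int:
    """Return an HTTP-style status code for one request to `app`."""
    if app.is_idle(now) and wakeup_broken:
        return 502  # the app was idled out and cannot be started again
    app.last_hit = now
    return 200


def simulate(poll_interval: int, duration: int = 3600) -> Tuple[List[int], int]:
    """Poll one app every `poll_interval` seconds; hit another app once at the end."""
    probe = App("monitored-app", last_hit=0)
    quiet = App("rarely-visited-app", last_hit=0)
    probe_statuses = [
        request(probe, t, wakeup_broken=True)
        for t in range(poll_interval, duration + 1, poll_interval)
    ]
    quiet_status = request(quiet, duration, wakeup_broken=True)
    return probe_statuses, quiet_status


probe_statuses, quiet_status = simulate(poll_interval=60)
print(set(probe_statuses), quiet_status)  # {200} 502
```

In this model, a probe that polls less often than the idle timeout (for example `simulate(poll_interval=600)`) does see the 502s, because every one of its requests lands on an idled-out app.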

giles | 12063 posts | PythonAnywhere staff | April 3, 2013, 1:49 p.m.

What about having every service you want to ensure is functioning perform regular log entries? Then if you can't monitor the expected log entry, you alert. Or more than likely I'm missing something...☺
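
A minimal sketch of that idea, assuming each service writes a line of the form `<unix timestamp> <service name> heartbeat` to a shared log. The log format, the function names, and the 120-second threshold are hypothetical, chosen only to make the example concrete.

```python
from typing import Dict, Iterable, List


def latest_heartbeats(log_lines: Iterable[str]) -> Dict[str, float]:
    """Map each service name to the timestamp of its newest heartbeat line."""
    latest: Dict[str, float] = {}
    for line in log_lines:
        parts = line.split(maxsplit=2)
        if len(parts) != 3 or parts[2].strip() != "heartbeat":
            continue  # not a heartbeat entry in the assumed format
        timestamp, service = float(parts[0]), parts[1]
        latest[service] = max(timestamp, latest.get(service, 0.0))
    return latest


def stale_services(latest: Dict[str, float], now: float, max_age: float = 120.0) -> List[str]:
    """Return the services whose newest heartbeat is older than `max_age` seconds."""
    return sorted(name for name, ts in latest.items() if now - ts > max_age)


now = 1_000_000.0
log = [
    "999400 web-2 heartbeat",
    "999910 scheduler heartbeat",
    "999950 web-1 request served",
    "999970 web-1 heartbeat",
]
print(stale_services(latest_heartbeats(log), now))  # ['web-2']
```

A service that has never logged a heartbeat does not appear in the result at all, so a real check would also need a list of the services it expects to see.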

a2j | 684 posts | April 3, 2013, 5 p.m.

Log monitoring is one approach, but it is generally not favoured because of the risk that the logging is working while something else is broken. Generally you want your monitoring to be as close as possible to a user-facing interface so you can catch the widest set of potential issues. In complex systems, it's notoriously hard to predict all the possible failure modes.

Also, web apps are essentially reactive - they don't have any sort of regular timer to use for logging. Daemon-like services could do it, though.

So I assume the PA devs have a dummy user account or something that has a web app with a known response that they can check for. But the fact that the monitoring system kept making requests prevented this app from being "swapped out" and it seems this issue only affected swapped out services. This demonstrates the difficulty in predicting failure modes.
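
A check of that kind might look like the sketch below. The function names, the injected `http_get` parameter, and the example URL are assumptions for illustration and say nothing about how PythonAnywhere's actual checks are written. The check fails on a non-200 status, on a missing marker string, or on a network error.

```python
import urllib.error
import urllib.request
from typing import Callable, Tuple


def http_get(url: str, timeout: float = 10.0) -> Tuple[int, str]:
    """Return (status, body) for a GET request; HTTP error statuses are returned, not raised."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status, resp.read().decode("utf-8", errors="replace")
    except urllib.error.HTTPError as err:
        return err.code, ""


def check(url: str, expected_text: str,
          http_get: Callable[[str], Tuple[int, str]] = http_get) -> Tuple[bool, str]:
    """Return (healthy, reason) for one probe of `url`."""
    try:
        status, body = http_get(url)
    except OSError as exc:  # covers URLError, timeouts, refused connections
        return False, f"request failed: {exc}"
    if status != 200:
        return False, f"unexpected status {status}"
    if expected_text not in body:
        return False, "expected text missing from response"
    return True, "ok"


# Demonstration with a stubbed fetcher, so no network access is needed.
def bad_gateway(url: str) -> Tuple[int, str]:
    return 502, ""


print(check("https://example.invalid/", "known marker", http_get=bad_gateway))
```

Passing the fetcher in as a parameter keeps the check testable offline; in real use the default `http_get` would issue the request.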

In an ideal world I suppose one would monitor real users' apps, but there are many reasons why this is impractical. Still, as long as new monitoring checks are added as new failure modes come to light, the system will become increasingly reliable. Monitoring is something that inevitably has to evolve over time to some extent.

deleted-user-39880 | 669 posts | April 4, 2013, 9:27 a.m.

> What about having every service you want to ensure is functioning perform regular log entries? Then if you can't monitor the expected log entry, you alert. Or more than likely I'm missing something...☺

We could definitely add some stuff like that, but I don't think it would have picked up any of our non-alerting outages so far.

> So I assume the PA devs have a dummy user account or something that has a web app with a known response that they can check for.

We actually use our own blog -- it's a Django app running in a PythonAnywhere hosting account like any other, but we have complete control over what it returns so it's a good proxy for user web apps as a whole -- modulo the whole "never getting swapped out" thing.

giles | 12063 posts | PythonAnywhere staff | April 4, 2013, 11:25 a.m.