502 Bad Gateway

My webapp is showing "502 Bad Gateway". There doesn't appear to be anything wrong with it, and it used to work fine. I guess it's something to do with PA. Please fix asap.

Checking now...

Still investigating. Other sites are OK (apart from an apparently unrelated problem on one other) but there's definitely something odd happening with yours. One possibility is that it's using more memory than it was previously -- has it changed recently?

Alternatively, have you changed your code so that you're using threads? The error in your domain.server.log files looks like it could be associated with that.

OK, it looks like you updated your web app at 7:30 UTC this morning. Perhaps you changed it then but didn't see the error until this morning's update to PythonAnywhere forced it to restart?

It's working now. I just re-uploaded my python files. The weird part is that it didn't appear as if there were any changes to the files. Maybe something got corrupted somehow. Thanks for the quick response.

Glad it's working now!

That's really odd, though. If you take a look at your server log files, you'll see that the web workers are failing to start because of this error:

Fatal Python error: Couldn't create autoTLSkey mapping

That sounds like your app is trying to create thread local storage, which would imply it's trying to run in some kind of multi-threaded mode. And that doesn't work on PythonAnywhere right now.

Definitely odd.

Guys I am getting this error "502 Bad Gateway". It was perfectly fine an hour ago. There were no code/db changes.

It's up again. Do we know why this is happening? Is PythonAnywhere a reliable host?

We were alerted, saw what was happening and then fixed it. Outage was caused by the file storage glitching on a webserver so only affected those users who were actually using that machine at the time.

Just to add -- things like that happen pretty rarely (less than once a fortnight) and normally only cause a 5-minute outage for a small number of users. We're working on getting that even further down.

I'm getting the 502 Bad Gateway right now too. I have seen this problem intermittently over the past few weeks - it doesn't appear to last long but it can be frustrating to a user who hits it.

I have two domains/webapps pointing to the same code; it's only happening on seacemas.dynamo.pe, it's not happening right now on www.seacemas.dynamo.pe. Hope this helps narrow it down. Please look into it. thanks

Investigating.

Well, my investigations have restarted it so it's running again now. But I know that's not really a solution if it keeps crashing!

Is there a possibility that it's trying to grab lots of RAM? The logging output is unclear, but it does look like it might be being killed and not automatically restarted because the system thinks it's out of control and allocating memory without limit.
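One rough way to test the memory hypothesis is to log the process's peak resident set size from inside the app. This is only a sketch using the standard library's `resource` module; the helper name `peak_rss_mb` is made up for illustration, and it assumes Linux, where `ru_maxrss` is reported in kilobytes.

```python
import resource


def peak_rss_mb():
    """Return this process's peak resident set size in megabytes.

    Assumes Linux, where ru_maxrss is in kilobytes (macOS reports bytes).
    """
    peak_kb = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    return peak_kb / 1024.0


if __name__ == "__main__":
    # Log this periodically (for example once per request) to spot growth.
    print("peak RSS: %.1f MB" % peak_rss_mb())
```

If the logged figure keeps climbing between requests, that would be consistent with the worker being killed for excessive memory use.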

Hi giles, if you're referring to my app, I doubt that is the culprit. Both www.seacemas.dynamo.pe and seacemas.dynamo.pe point to the same code and database -- the first was up and active while the other repeatedly gave 502s. I also restarted both webapps (via the Web tab), to no avail. The 502s went away about 15-20 minutes after I wrote the last update up there ^^, which appears to match your restart.

I hope this info helps in your investigations, and that you can find the solution soon. Unfortunately one of my users was unable to do work during the downtime.

That's doubly strange; the restart that I triggered as part of the investigations was basically the same as you would have got by hitting the "Reload" button. So I don't know why mine worked but yours didn't.

One thing that might be relevant is that your apps are running on different web servers in our cluster. This is perfectly normal -- it's part of how our load-balancing system works -- but might point to there having been some kind of issue with the web server where the non-www version was running. I'm looking into that now.

...nothing particularly interesting in the logs, I think -- the server did ask for a DHCP update on its IP address at 18:19:39 UTC, but the new address was the same as the old (as it should be) and that's about an hour before you reported the problem.

But just to double-check -- any idea how long things had been broken before you were able to post your first message yesterday evening?

@giles, sorry for the late response. No idea, but if I was pressed to guess, I would say that it was not more than half an hour, because it was during business hours here. A user alerted me to the problem; I'm guessing he didn't wait for too long before reporting it. Hope that helps.

We are facing the same problem ! Please help us. URL Link: http://frontsquare.pythonanywhere.com/

-FrontSquare Team

Ah! I've just discovered that if you hit a web app while it's reloading (i.e. while the spinner next to the reload button is still going), you end up with a web app that only gives 502 errors. If you reload it again and take care not to hit the site while it's reloading, it will start working again. It looks like we have some sort of a race going on between the reload code and the code that deals with the first hit on a web app. I'll poke around a bit and see what I can see.

Interesting. That sounds like a reasonable hypothesis for my case. It sounds like something that should be fixed, because you can't always control whether other users hit your site while you're refreshing it.

Absolutely. I'm trying to put together an automated repro (and hopefully a fix) today.

We are facing this 502 problem again in the middle of our testing ! Please help us. URL Link: http://frontsquare.pythonanywhere.com/

-FrontSquare Team

Everything seems OK now? We're looking into a fix for the "hits while reloading" issue...

There was an Amazon AWS issue last night, I think that was the cause of this.

Hi, we still face the 502 Bad Gateway error, and this is slowing our testing process... It is also a major concern for us in considering pythonanywhere.com as a full-fledged platform for our organization.

Kindly provide a permanent solution.

-FrontSquare Team

It looks fine now... Are you making sure to wait until the spinner has stopped when you press "reload web app" before you try to visit the site?

ok, it was our fault. sorry. thanks for your help

No problem! Happy coding :-)

Hi @giles and @harry,

I just got bitten by the "502 Bad Gateway" monster again. Only this time it happened at 1:45am local time, and I'm certain no one was hitting my site at the time. I always reload via the button in the web interface. (Is there another way?)

After extensive testing, I've noticed the following: when I time the reload (by watching the spinner), the reload lasts for either ca. 6 seconds, or it takes "a lot more" (around 20 seconds). With the shorter timespan, my app always reloads correctly; with the longer, I always get the 502.

Not scientific, I know, and highly dependent on system load. But I'm hoping this info might be useful for your debugging, because it bears out 100% in my testing. At least, it should help other users who are reloading and reloading "until it works".

Thanks, that's interesting. The 20-second wait sounds like it might be a timeout on our side; we do have several in the "reload" flow.

Just to be sure I understand -- this 502 was also triggered when you hit the reload button, right? I'm wondering if we could put in a workaround where we always check the site for 502s after a reload, and just reload again if it fails. It wouldn't pick up every problem, but it might at least help us work around this one.
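The "check for a 502 after a reload, and reload again if it fails" idea could look roughly like the sketch below. It is not how PythonAnywhere implements anything: `reload_app` is a hypothetical callable standing in for whatever triggers a reload and returns when it has finished, and the retry count and wait time are arbitrary assumptions.

```python
import time
import urllib.error
import urllib.request


def fetch_status(url, timeout=10):
    """Return the HTTP status code for url, including error statuses.

    Connection-level failures (urllib.error.URLError) are not caught here.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code


def reload_until_healthy(url, reload_app, fetch=fetch_status,
                         max_attempts=3, wait_seconds=5):
    """Reload, wait, then check the site; repeat while it answers 502.

    reload_app is a hypothetical callable that performs one reload.
    Returns the number of reloads performed, or raises RuntimeError.
    """
    for attempt in range(1, max_attempts + 1):
        reload_app()
        time.sleep(wait_seconds)  # give the workers time to start
        if fetch(url) != 502:
            return attempt
    raise RuntimeError("still 502 after %d reloads" % max_attempts)
```

As noted above, this would not pick up every problem: it only retries on 502, so a site that comes back with a 500 from broken code is treated as reloaded.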

Not sure I understood you 100%, but if you are asking whether the 502 happens to me when I do the reload, the answer is no. After the 20-second delay, the spinner stops spinning and I'm good. However, if I open my app in another window, I get the 502. (Sorry it wasn't clear.)

Sorry for the slow reply! Thanks, that's exactly what I was asking. Interesting. What's particularly odd is that it's not happening at busy times of day. I assume there's nothing in the log files, as otherwise you'd have told us. Hmm.

This same issue (502 Bad Gateway) occurred for me today. Had not made any alterations to my app.

Reloaded my app via the dashboard and all was well.

But kinda annoying nevertheless.

Same for me, I had a 502 Bad Gateway this morning for about an hour, and once again this afternoon...

I've been struggling with 502 errors today too. Granted, I am fairly new to Python and working with Flask, so I am leaning towards there being an issue with my code and/or my WSGI file configuration (tho I haven't found anything helpful in the logs), but now I really don't know anymore :S

We've had a bad day with file server outages and we'll be taking a good look at how we can prevent the issue from happening again.

i am facing the same issue right now..... pls help

My webapp is working now... I found an error in the source code, but it worked before... Anyway, it works now.

My webapp has been coming up with an "Unhandled exception" error (and no other details). After reloading the app, it's now giving a "502 Bad Gateway". From the looks of the error log, the first instance of the error recorded associated with the "Unhandled exception" error started just before noon yesterday:

2013-09-29 11:43:35,005 :Traceback (most recent call last):
2013-09-29 11:43:35,007 :ImportError: cannot import name current

Which site? Two of your sites are showing a pretty clear import error in the error logs. I can confirm I see the 502 on the third one though, I'll take a look...

Aha. and the third one has now reloaded, and it's showing an import error too. "No module named shopify". Let me know if you need help debugging that, it's probably an issue with your sys.path...
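For a "No module named ..." error caused by `sys.path`, a quick check is to add the project directory to the path and ask Python whether it can find the module. A minimal sketch, assuming Python 3; the directory `/home/yourusername/mysite` and both helper names are placeholders, not anything taken from the site in question.

```python
import importlib.util
import sys


def ensure_on_path(directory):
    """Put directory at the front of sys.path if it is not already listed."""
    if directory not in sys.path:
        sys.path.insert(0, directory)


def is_importable(module_name):
    """Return True if a top-level module can be found on the current sys.path."""
    return importlib.util.find_spec(module_name) is not None


# Hypothetical project directory; replace with wherever the package really lives.
ensure_on_path("/home/yourusername/mysite")
print("shopify importable:", is_importable("shopify"))
```

Running this from the same environment the web app uses shows whether the module is simply not installed there or just not on the path.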

Why 502s sometimes happen instead of 500s is perplexing, though. I wonder if nginx is somehow deciding that sites with broken code, i.e. sites which return 500 pages, are themselves broken, and then just classifies them as bad gateways... It shouldn't...

Very odd - no changes were made to the site in any way (in fact, it hasn't been touched in over two weeks). And for some reason now it's gone back to "Unhandled exception".

The site I'm looking at is "geoffham" ("estore" is currently not being used, and the other should just be displaying a static page, but is giving the same error). The traceback in the error log is still giving "ImportError: cannot import name current". I'm not sure what "Current" is - it's not an import I'm using myself, so I assume it's part of Web2py. In a perhaps related odd glitch, I can't access the "errors" directory (and only that directory) in the "myboxbuyer" application of that Web2py site - it gives a 504 Gateway Error.

Investigating...

Would you mind if I took a look at your files?

Please - go right ahead.

A bit more information from the server log about this problem importing "current":

Traceback (most recent call last):
  File "/bin/user_wsgi_wrapper.py", line 58, in __call__
    app_iterator = self.app(environ, start_response)
  File "/bin/user_wsgi_wrapper.py", line 70, in import_error_application
    raise e
ImportError: cannot import name current

So it appears that the problem originates in /bin/user_wsgi_wrapper.py - not sure if that's been changed globally in the past few days or not.

/bin/user_wsgi_wrapper.py hasn't changed -- it's just mangling the stack trace slightly.

It looks like the problem is that gluon.globals doesn't contain something called current. To see this, if you start a bash shell, then cd to /home/geoffham/web2py, and run Python, then try to run the import line from your WSGI file, you'll see the same error but with a better stacktrace:

>>> from gluon.main import wsgibase as application 
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "gluon/__init__.py", line 15, in <module>
    from globals import current
ImportError: cannot import name current

I see that /home/geoffham/web2py/gluon/globals.pyc was updated at Sep 29 06:18 UTC. Perhaps removing it then reloading the web app would help?
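If more than one compiled file might be stale, the same cleanup can be done in bulk. A small sketch, assuming the matching `.py` sources are all present so that Python can regenerate the bytecode on the next import; `remove_pyc_files` is an illustrative helper, not part of web2py or PythonAnywhere.

```python
import os


def remove_pyc_files(root):
    """Delete every .pyc file under root and return the removed paths."""
    removed = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(".pyc"):
                path = os.path.join(dirpath, name)
                os.remove(path)
                removed.append(path)
    return removed
```

For example, `remove_pyc_files("/home/geoffham/web2py/gluon")` would clear the compiled files in the directory mentioned above; the web app still needs a reload afterwards.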

That's fixed it. Very odd - is routine recompilation of the .pyc a normal feature, or should I be concerned about a potential site invasion attempt?

A good question! I think it's time for me to reach out to Massimo, who created web2py, to see if he has any input. I'll drop him an email and post anything he comes up with that sounds like it might help back here.

I also happen to have this same problem, and there is no .run method call in my code. Please, guys, the people I am working with are already waiting... How can I fix this?

I see you posted on another forum thread; I'll answer you there.