
The webapp runs fine until the app tries to load the TensorFlow model and create a session. The error generated is 502-backend, but the error log is empty. The server log shows that the workers died!

https://ibb.co/g3VeaR

[edit by admin: formatting]

It looks like Tensorflow crashed. Which version are you using? How did you install it?

I am using tensorflow 1.4. I installed it using the "pip2.7 install --upgrade --user tensorflow" command in a bash console.

The model is already trained. I tried running it in a bash console with a custom input; it worked fine and gave the result. I recently limited the number of inter- and intra-op parallelism threads to 1 in the tf.ConfigProto settings and tried to create a session. The error log still says that resources are not available and the worker dies.
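
For reference, the thread-limiting settings I mean look roughly like this (TF 1.x API; the variable name session_conf is just what I use in my code):

import tensorflow as tf

# Limit TensorFlow's own thread pools to a single thread each.
session_conf = tf.ConfigProto(
    intra_op_parallelism_threads=1,  # threads used inside a single op
    inter_op_parallelism_threads=1,  # threads used to run independent ops
)
sess = tf.Session(config=session_conf)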

Hmm, interesting. I just checked the server where your code was running, and it looks like there were some processes that had been left running, probably detached from earlier instances of your website (eg. before your most recent reloads).

A couple of thoughts:

  • Is your model using lots of memory?
  • Are you opening/closing your Tensorflow sessions carefully, for example using with tf.Session() as session: around the code to run stuff?

My model is only 15 MB on disk, so not that big a deal I guess. If it were a memory issue then it would not even work in a bash console, right? Here is my tensorflow code - https://ibb.co/gsuiKm

[edit by admin: formatting]

Hmm. I'm not a Tensorflow expert, but it looks like you might not be closing your tf.Session. Try replacing

sess = tf.Session(config=session_conf)
with sess.as_default():
    # ....

with:

with tf.Session(config=session_conf) as sess:
    with sess.as_default():
        # rest of your code here, indented one extra step

That should make sure that everything is properly shut down when your function exits.

Hey, tried it just now. Still no progress. :(

Out of curiosity: does your code work from a bash console, as opposed to the webapp?

--- nvm- realized that you already tried running it from a bash console and it worked.

how long did it take to run in the bash console?

It took around 5-10 seconds to run from bash console.

Oh, just to double-check: you don't do anything like post back to your webapp or access anything external in your code, right?

Can you please elaborate? What do you mean by "access anything external"? I access the model stored in another file, that's it! I didn't quite get what you meant.

By "anything external", Conrad meant something like hitting an external API via requests or urllib or something like that -- or, indeed, using those libraries to hit your own site from within the view. Are you doing anything like that?

No I am not. My app is not hitting any external API.

Did you take a look at your server log? There seems to be a recurring error:

2017-12-22 12:07:18 terminate called after throwing an instance of '
2017-12-22 12:07:18 std::system_error
2017-12-22 12:07:18 '
2017-12-22 12:07:18   what():
2017-12-22 12:07:18 Resource temporarily unavailable

I know some other PythonAnywhere users are running tensorflow happily though... I'm wondering if it's because you are a free user and you are limited in the number of threads you can start from a webapp.

Maybe try upgrading, reloading your webapp, and running it? If it doesn't work, just downgrade again (e.g. within the hour and you won't be charged, or just let us know and we will refund your payment).

Okay, I am upgrading my account. How many workers do I need to run it without errors? I am signing up for the Hacker account and it has only 2 workers. Open to recommendations.

Hmm, I would say just try the Hacker plan and see. Alternatively, maybe there is a particular TensorFlow configuration that you may be able to change (e.g. to not start any threads, etc.).
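
For what it's worth, the kind of configuration I mean would be something along these lines (a rough sketch for the TF 1.x API; I haven't tested whether these particular settings keep you under the thread limit):

import tensorflow as tf

session_conf = tf.ConfigProto(
    intra_op_parallelism_threads=1,   # one thread inside each op
    inter_op_parallelism_threads=1,   # one thread across independent ops
    use_per_session_threads=True,     # per-session threads rather than a shared global pool
    device_count={'CPU': 1},          # expose only a single CPU device
)

with tf.Session(config=session_conf) as sess:
    # run your model here
    pass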

Just wondering if you ever found a solution to this problem, I am running into exactly that same error when running tensorflow from the flask app. Similarly it has no problem running from the console.

Just to clarify, your code runs fine from a PythonAnywhere console, but when run as a webapp, does not produce any error messages in the PythonAnywhere logs?

The problem was when calling a method from another class from my main flask_app.py. If I run the method directly in the class it works perfectly, but when I trigger that method call from the main app it causes the system error that is above

Can you give us a quick code snippet of what works vs. what doesn't work?

Whenever I run a file I have called test with this code

from ChatbotFramework import ChatbotFramework

chatbot = ChatbotFramework()
testString = chatbot.handleIncoming("What rooms are available?")
print(testString)

I can get the response from the tensorflow model, but in my flask_app.py file whenever I run

chatbot = ChatbotFramework()
testString = chatbot.handleIncoming("What rooms are available?")
print(testString)"

which is imported in the same way, I get the system error.

[edited by admin: formatting]

Is ChatbotFramework a Flask app? Does it do something similar to app.run() in normal Flask?

It isn't a Flask app, just a Python class that has the model.predict call for TensorFlow.

And just to make sure I understand -- the specific error you're seeing is the "Resource temporarily unavailable" one...?

2017-12-22 12:07:18 terminate called after throwing an instance of '
2017-12-22 12:07:18 std::system_error
2017-12-22 12:07:18 '
2017-12-22 12:07:18   what():
2017-12-22 12:07:18 Resource temporarily unavailable

Yeah that is exactly the one

Hmm. Do you know if the ChatbotFramework is spinning off threads of its own?

That is something I am worried about. I don't believe it is, though I could just not be understanding it properly. I can't see anywhere that it would be doing that.

Where are you seeing the error? Is it in the error or server log on the "Web" tab?

Sorry about being so slow giving you the information! The error is in the server logs

Hmm, very odd. There seem to be a bunch of different things that can make Tensorflow crash with that error message. Perhaps a good start would be to find out which line of Python code is triggering it. Could you put print statements before each line in your code where you use ChatbotFramework (including the import) so that we can find out? The output of the prints will go to the server log too.
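
Something along these lines, using the snippet you posted above (the exact messages don't matter, they just need to be distinguishable in the log):

print("about to import ChatbotFramework")
from ChatbotFramework import ChatbotFramework
print("imported ChatbotFramework")

print("about to create ChatbotFramework instance")
chatbot = ChatbotFramework()
print("created instance, about to call handleIncoming")
testString = chatbot.handleIncoming("What rooms are available?")
print("handleIncoming returned: " + str(testString))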