Web app refusing to start

with the following error:

Traceback (most recent call last): False
File "/bin/user_wsgi_wrapper.py", line 58, in call False
app_iterator = self.app(environ, start_response) True
File "/bin/user_wsgi_wrapper.py", line 70, in import_error_application False
raise e True
OSError: [Errno 2] No such file or directory False

Was working fine earlier (and for the last seven days, or so)...

And the webserver response is 'Unhandled Exception'

Yeah same here, I'm seeing 'Unhandled Exception'.

Error log says:

2013-06-26 00:55:33,010 :OSError: [Errno 2] No such file or directory
2013-06-26 00:56:33,106 :Traceback (most recent call last):
2013-06-26 00:56:33,107 :OSError: [Errno 2] No such file or directory
2013-06-26 00:57:33,047 :Traceback (most recent call last):
2013-06-26 00:57:33,048 :OSError: [Errno 2] No such file or directory
2013-06-26 00:58:33,001 :Traceback (most recent call last):
2013-06-26 00:58:33,001 :OSError: [Errno 2] No such file or directory
2013-06-26 00:59:34,437 :Traceback (most recent call last):
2013-06-26 00:59:34,437 :OSError: [Errno 2] No such file or directory

I now have the same problem. It shows "Unhandled Exception", although it worked reliably yesterday and this afternoon.

Error log:

2013-06-26 00:15:20,829 :Traceback (most recent call last):
2013-06-26 00:15:20,830 :OSError: [Errno 2] No such file or directory

Server log:

Traceback (most recent call last): False
File "/bin/user_wsgi_wrapper.py", line 58, in call False
app_iterator = self.app(environ, start_response) True
File "/bin/user_wsgi_wrapper.py", line 70, in import_error_application False
raise e True
OSError: [Errno 2] No such file or directory False

Sounds like the same error as this post, but that doesn't mean that this error is for the same reason. Try reloading your web apps - if that doesn't work, I think perhaps the PA devs will have to have a look at this one.

Reloading does nothing. Creating a new webapp works... until it's reloaded, then it falls back into the same error.

I posted to tech support 7 hours ago and have had no answer. It's probably a serious problem. We can only wait.

I think the problem is that you hit the middle of the night in the UK, where PythonAnywhere is based... I should expect you'll get a response pretty soon.

Sorry for the silence, this happened overnight and our alerting system failed. We're investigating.

Sounds like the problem would only be detected by reloading a webapp - once this issue has been resolved, I guess that might be an extra check for the alerting system - periodically reload a sample web app and check it comes up OK. Should be pretty easy to test by deliberately inserting a syntax error into the code... (^_^)
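To make that concrete, here's a rough sketch of the kind of check I mean. It only covers the "check it comes up OK" half; actually triggering the reload is left out because I don't know how PA does that internally. The function names are mine, and treating the text 'Unhandled Exception' as the failure marker is an assumption based on what people in this thread are seeing.

```python
import urllib.error
import urllib.request

# Assumption: the broken apps serve a page containing this text,
# as reported earlier in this thread.
ERROR_MARKER = "Unhandled Exception"


def looks_healthy(status, body):
    """Return True if the response looks like a working app."""
    return status == 200 and ERROR_MARKER not in body


def check_app(url, timeout=10):
    """Fetch url and return a (healthy, detail) tuple."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            status = resp.status
            body = resp.read().decode("utf-8", errors="replace")
    except urllib.error.HTTPError as exc:
        # 4xx/5xx responses are raised as HTTPError by urlopen.
        return False, f"HTTP {exc.code}"
    except OSError as exc:
        # Covers URLError, timeouts and refused connections.
        return False, f"request failed: {exc}"
    if looks_healthy(status, body):
        return True, "ok"
    return False, f"HTTP {status}, body contains error marker"
```

You'd run check_app against a sample app's URL after each scheduled reload and raise an alert whenever the first element of the result is False.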

EDIT: Oh, and one tip from personal experience - if you haven't already, I would suggest adding a check which only runs once a week, say on Friday lunch time, and is hard-coded to always fail. That's a good way to test your alerting system is still working, rather like a weekly fire alarm test.
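Something like this is all the fire alarm check needs to be. It's a minimal sketch: the function name and the exact window (Fridays, 12:00 to 12:59) are just my choices for illustration.

```python
from datetime import datetime


def fire_alarm_check(now):
    """Deliberately fail during the weekly test window.

    Returns False (check failed) on Fridays between 12:00 and 12:59,
    and True (check passed) at all other times.
    """
    in_window = now.weekday() == 4 and now.hour == 12  # Monday is 0, Friday is 4
    return not in_window


if __name__ == "__main__":
    status = "ok" if fire_alarm_check(datetime.now()) else "FAIL (weekly test)"
    print(status)
```

If Friday lunch time goes by and nobody gets the alert, you know the alerting path itself is broken.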

I remember on a personal server years ago I had a homebrew alerting system (mostly for fun) and at one point I noticed something was broken and wondered why I hadn't been emailed about it. After a little investigation, I realised the MTA had become broken by an upgrade, so I fixed it. Then I suddenly received a whole slew of pending emails - some about the problem, some saying that it couldn't mail me about the problem because it was failing to send email, some saying that it couldn't mail me about the problem sending email because it was failing to send email, etc.

That taught me two lessons - one is the fire alarm test strategy, as I mentioned above; the other is that you never have a monitoring check watching a log file which contains errors from the monitoring infrastructure itself. Pretty embarrassing as even the Romans had figured that out.
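For the second lesson, the fix can be as simple as filtering out the monitor's own lines before deciding what to alert on. This is only a sketch: the "[monitor]" tag and the "ERROR" keyword are made-up conventions, not anything from a real log format.

```python
def alertable_lines(lines, own_tag="[monitor]"):
    """Yield error lines, skipping any written by the monitoring system itself."""
    for line in lines:
        if own_tag in line:
            # Alerting on our own failures is what creates the email loop.
            continue
        if "ERROR" in line:
            yield line
```

The monitor's own failures then need a separate channel, which is exactly what the fire alarm test is there to exercise.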

Sage advice, especially about the fire alarm test!

Unfortunately this problem isn't specifically due to reloading -- we've tried reloading a few test apps and they're fine. We're trying to tie down anything the affected apps have in common.

My site started to work again, thank you :)

Sorry for the outage! It should be resolved now and a post mortem will follow when we've figured everything out.

So, it appears that the underlying problem was an outage on one of our file servers, which caused a knock-on effect on every server that used it.

File storage is currently a single point of failure for a large part of PythonAnywhere, and addressing that is now top of our priority list.

Sounds like you might be having a lot of distributed filesystem fun! This was something I kept meaning to spend some time looking into at a previous job, but never really had the time. I did get as far as working out that the most promising candidates seemed to be:


HDFS
A Java-based distributed filesystem for Hadoop. TCP/IP and RPC-based, implements its own redundancy across machines (no need for RAID) but I don't think it's fully POSIX-compliant and is probably heavily optimised for larger files (many MB).

Ceph
Free software storage platform with libraries available for most major languages including Python. I believe it can also offer an S3-like interface. Offers seamless replication and is designed to require low administrative overhead. A POSIX-compliant filesystem can be layered on top of the underlying object store, with a client included in the Linux kernel as of 2.6.34 - I think a FUSE-based solution also exists. Looks like a clever system, but it might require a high number of servers - not sure if metadata and storage functions can be colocated, for example.

GlusterFS
Now owned by Red Hat, formed of storage servers and clients which communicate with a custom TCP/IP protocol. Data can be accessed with a library, or there's a FUSE-based filesystem interface too. Not too sure on the technical details, but this system looks like one of the simplest - I think it relies on shared configuration as opposed to a centralised metadata cluster.

Lustre
Frankly I don't know a lot about this, but it seems to be quite popular for supercomputing so its performance must be reasonable. Looks like it uses separate storage and metadata nodes, as many of these systems do. Support seems to rely on third-party kernel modules, which might be a pain.


I guess as well as these solutions, you could also consider whether a standard network filesystem (e.g. NFS) along with some low-latency synchronisation (e.g. BitTorrent Sync) might be good enough for practical purposes and less effort to maintain. I seem to remember that autofs can handle multiple hosts, presumably failing over to other hosts if the connection to the first one fails. This isn't something I've ever had to try, however - the only time I've worked with failing over to other NFS hosts, we were dealing with them at the raw NFS protocol level.

Hey, great suggestions! Right now we're reliant on NFS, which is definitely creaking a bit -- so yes, the DFS route is definitely where we're planning to go.

POSIX-compliance is important to us, so we're not looking at HDFS, but Harry spent some time benchmarking Gluster last week -- more to come there, and perhaps we'll blog what we found. Ceph is definitely also on the list, as is Lustre now that you've mentioned it. And we've been considering OpenAFS too -- have you had any experience with that?

No, I haven't played around with any form of AFS. Do you know if it handles redundancy and automatic failover? From what I've just skimmed it still may require a single writable master, although it can have read-only replicas for performance. Presumably one could manually failover to one of the replicas becoming the new master, but I always prefer peer-based architectures where possible as they tend to failover automatically (or rather, the concept of failover no longer applies).

I have the same problem with my website. When is this going to be resolved?

Just a note to say that @joblinks's problem was fixed over email, and wasn't related to this problem - it was a virtualenv that had been broken by the recent Ubuntu upgrade. The error message at the start of this thread is a generic one you'll get if PythonAnywhere is having problems finding part of your web app, and can be caused by any number of things. Right now we believe that the problem that triggered this thread is fixed.
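If you hit this error and want to narrow it down yourself, one rough approach is to check from a console whether the files and directories your web app depends on actually exist. The snippet below is a minimal sketch, not an official diagnostic: the helper name and every path in it are hypothetical placeholders, so substitute your own source directory, WSGI file and virtualenv.

```python
import os


def missing_paths(paths):
    """Return the subset of paths that do not exist on disk."""
    return [p for p in paths if not os.path.exists(p)]


# Hypothetical paths, for illustration only; replace with your own.
candidates = [
    "/home/yourusername/mysite",
    "/home/yourusername/mysite/app.py",
    "/home/yourusername/.virtualenvs/myenv/bin/python",
]

for path in missing_paths(candidates):
    print("missing:", path)
```

Anything it prints is a candidate for the "No such file or directory" in the traceback. If nothing is missing, the cause is probably elsewhere and it's worth emailing support.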

It seems I have run into the same problem for one of my bottle webapps. It was stable before lunch and now is serving an unhandled exception. My other webapps with the same bottle configuration are not affected, even on reload. Let me know if I can provide any more information.

From server log:

Traceback (most recent call last): False
File "/bin/user_wsgi_wrapper.py", line 58, in call False
app_iterator = self.app(environ, start_response) True
File "/bin/user_wsgi_wrapper.py", line 70, in import_error_application False
raise e True
OSError: [Errno 2] No such file or directory False

I don't know if anything changed on PA's end, but after the 10th reload or so over the last hour, everything is back up and running. So thanks if you fixed it. You guys are great!

Hello, we just had a brief outage on one of the web servers. Some users were affected. It should all be fine now. If you are still experiencing problems after a reload, email support and we will sort it out.