
HTTP responses cutting out halfway through, and also SQLite /var/tmp problem

Hello! Two problems started sometime in the last 12 hours.

1. Truncated HTTP responses

HTTP responses from my site are being truncated (i.e. the server only sends some of the data it should). It happens after somewhere between about 30 KB and 128 KB have been served -- the response just ends abruptly. Pages smaller than that work fine.

I can still download my files through the admin dashboard; this only affects HTTP.

You can test this by downloading this random file from my site. It should be 3,929,709 bytes, but you'll only get about 28,490 bytes.

curl http://piecharts.pythonanywhere.com/static/GitX-dev.dmg > test.dmg
curl: (18) transfer closed with 3901219 bytes remaining to read

Error log:

2015-01-31 11:25:13,018 :IOError: write error
2015-01-31 11:25:13,018 :Traceback (most recent call last):
2015-01-31 11:25:13,019 :  File "/bin/user_wsgi_wrapper.py", line 128, in __call__
2015-01-31 11:25:13,019 :    start_response('500 Internal Server Error', [('Content-type', 'text/html')], sys.exc_info())
2015-01-31 11:25:13,019 :IOError: headers already sent

Server log:

2015-01-31 11:12:41 Sat Jan 31 11:12:41 2015 - SIGPIPE: writing to a closed pipe/socket/fd (probably the client disconnected) on request /list (ip 10.190.154.117) !!!
2015-01-31 11:12:41 Sat Jan 31 11:12:41 2015 - uwsgi_response_write_body_do(): Broken pipe [core/writer.c line 322] during GET /list (10.190.154.117)
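If I'm reading those two logs together correctly, they describe a single failure: something closes the connection partway through the response, uwsgi's write to the socket fails, and the WSGI wrapper then tries to start a 500 response, which can't work because the 200 headers (and the first ~28 KB of body) have already gone out. A minimal sketch of that sequence -- this is my guess at the shape of /bin/user_wsgi_wrapper.py, not the real code:

import sys

def error_handling_wrapper(app):
    def wrapped(environ, start_response):
        try:
            # Consume the app's output here so that failures surface
            # inside this try block (simplified; real servers stream).
            return list(app(environ, start_response))
        except Exception:
            # The recovery attempt from the traceback above. The WSGI
            # spec says start_response() must raise once headers have
            # been sent -- hence the second "IOError: headers already sent".
            start_response('500 Internal Server Error',
                           [('Content-type', 'text/html')],
                           sys.exc_info())
            return [b'Internal Server Error']
    return wrapped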

2. SQLite can't write to /var/tmp?

I'm also getting an SQLite error: "sqlite3.OperationalError: database or disk is full". I still have lots of space left in my account. Apparently this happens when /tmp or /var/tmp is full or not writable: https://recursiveramblings.wordpress.com/2014/01/23/working-around-sqlite3-operationalerror-database-or-disk-is-full/

However, /tmp and /var/tmp appear to be empty, and I can write into them from the command line, so I'm not sure that explanation fits.
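In case anyone wants to repeat the check from inside Python rather than a console, "can write into them" means roughly this:

import tempfile

# Try to create and write a small temp file in each directory SQLite
# falls back to for its temporary files.
for d in ("/tmp", "/var/tmp"):
    try:
        with tempfile.NamedTemporaryFile(dir=d) as f:
            f.write(b"x" * 1024)
            f.flush()
        print("%s: writable" % d)
    except (OSError, IOError) as e:
        print("%s: %s" % (d, e))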

Maybe something is killing my app process partway through each HTTP request, and SQLite raises that error as the process is being killed.

The workaround in that article solves the problem for me (by telling SQLite to use memory instead of /tmp), but I still have the problem with the HTTP responses ending early.
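For anyone who doesn't want to click through, the workaround boils down to telling each connection to keep its temporary storage in RAM, something like this (a sketch -- "app.db" is a placeholder for your own database file):

import sqlite3

conn = sqlite3.connect("app.db")  # placeholder path
# Keep temporary tables and indices in memory instead of temp files
# under /tmp or /var/tmp, sidestepping the full/unwritable partition.
conn.execute("PRAGMA temp_store = MEMORY;")

(Pointing the SQLITE_TMPDIR environment variable at a directory you know is writable should achieve the same thing, if you'd rather keep temp data on disk.)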

I should mention I didn't rebuild my virtualenv back in October during the migration to Trusty. I'll do that if you like, but I'm trying to solve one problem at a time. :)

Everything worked fine until today.

I have the same problem. My logs are full of the broken pipes and 500s mentioned above. In my case, the site can't properly send four images (around 100-150 KB each); at least one of them gets cut off somewhere in the middle.

Same/similar issues here, at least with the HTTP responses. What I observe is that PNG images are truncated to 28.4 KB or so when accessed over HTTP, but I can download the (complete) images just fine from the admin interface. Some of those images were generated months ago.

Also (perhaps related), the web2py editor failed to save changes to any edited file today -- it spat out a "communication error".

Same log errors as piecharts.

Hi there -- thanks for reporting this, and sorry for the slow response. We're investigating; it looks like it's a problem with one of our web servers.

We're still investigating what the underlying problem is, but we've rebooted the server in question and that seems to have fixed it. I've double-checked some of the sites belonging to people on this thread and they appear to be OK.

Let us know if anything's still not working -- we'll continue trying to work out what went wrong and how we can sort out our alerting system so that we can detect and fix problems like this faster in the future.

We think we've found the problem -- a partition on one of the web servers had filled up. We're not sure why our monitoring (which, of course, checks disk space) didn't alert us. We're clearing down some unnecessary logfiles and will bounce the server again in a few minutes.

It's back up, and the problem should be fully fixed now. We'll be investigating in depth why our monitoring didn't pick this up, and we'll make appropriate fixes to it. Our system update on Tuesday is already planned to increase the size of the logging partitions on all of our web servers, so either way this problem should not recur.

Confirming that it works for me. Thank you!

Excellent, thanks for confirming that! And apologies again for the slow fix.