
Any way to localise an abnormal performance issue?

I have a light-use web2py app (MySQL db) that has been up and running for a couple of months with no issues (performance or otherwise) until today, when I got a call from the user saying it was very slow. It took me an unusual amount of time (~30 seconds) to log in, and page navigation/refresh is taking a similar amount of time.

Is there an issue affecting performance?

In the time it has taken me to write this, the situation appears to have rectified itself and pages are bursting onto the screen! It would still be good to know if and where there was a problem.

Many thanks
P

[EDIT] On submitting this, I went back to the app to find that response times are back on the floor! So this appears somewhat erratic.

Some ping stats from Dublin (they look OK to me)...

--- www.pythonanywhere.com ping statistics ---  
59 packets transmitted, 59 received, 0% packet loss, time 58039ms  
rtt min/avg/max/mdev = 117.097/255.317/327.917/57.861 ms

Just got this while waiting for a page refresh...

Something went wrong :-(
.../snip/...
Debugging tips
.../snip/...

   Error code: 504-loadbalancer

And this from the server log - I haven't seen HARAKIRI before...

2017-09-08 13:02:14 ERROR:root:IOError: write error  
2017-09-08 13:02:14 ERROR:root:Error running WSGI application  
2017-09-08 13:02:14 ERROR:root:GeneratorExit  
2017-09-08 13:02:14 RuntimeError    
2017-09-08 13:02:14 :   
2017-09-08 13:02:14 generator ignored GeneratorExit  
2017-09-08 13:02:14   
2017-09-08 13:02:47 Fri Sep  8 13:02:36 2017 - *** HARAKIRI ON WORKER 3 (pid: 16241, try: 1) ***  
2017-09-08 13:02:52 Fri Sep  8 13:02:36 2017 - HARAKIRI !!! worker 3 status !!!  
2017-09-08 13:02:52 Fri Sep  8 13:02:36 2017 - HARAKIRI [core 0] 10.0.0.209 - GET /my_app/default/list_persons since 1504875455  
2017-09-08 13:02:52 Fri Sep  8 13:02:36 2017 - HARAKIRI !!! end of worker 3 status !!!  
2017-09-08 13:02:52 DAMN ! worker 3 (pid: 16241) died, killed by signal 9 :( trying respawn ...  
2017-09-08 13:02:52 Respawned uWSGI worker 3 (new pid: 18771)  
2017-09-08 13:02:52 spawned 2 offload threads for uWSGI worker 3  
2017-09-08 13:03:40 Fri Sep  8 13:03:39 2017 - *** HARAKIRI ON WORKER 2 (pid: 16235, try: 1) ***  
2017-09-08 13:03:40 Fri Sep  8 13:03:39 2017 - HARAKIRI !!! worker 2 status !!!  
2017-09-08 13:03:40 Fri Sep  8 13:03:39 2017 - HARAKIRI [core 0] 10.0.0.209 - GET /my_app/default/user/logout since 1504875516  
2017-09-08 13:03:40 Fri Sep  8 13:03:39 2017 - HARAKIRI !!! end of worker 2 status !!!  
2017-09-08 13:03:40 DAMN ! worker 2 (pid: 16235) died, killed by signal 9 :( trying respawn ...  
2017-09-08 13:03:40 Respawned uWSGI worker 2 (new pid: 18818)  
2017-09-08 13:03:40 spawned 2 offload threads for uWSGI worker 2  
2017-09-08 13:04:29 announcing my loyalty to the Emperor...

Looks like you guys are on this already...

I'm guessing there was a server reboot, as everything went offline for a few minutes.
It has only been back a couple of minutes, but FYI normal service appears to have been restored.

Harakiri is what happens to a web app when it takes too long to respond to a request - the system assumes there's something wrong with it and restarts it.
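
To see how long a request had been running before it was killed, you can compare the timestamp on the HARAKIRI status line with the `since` value, which is a Unix epoch time. Here is a minimal Python 3 sketch that does this for the two lines in the log above. The function name `harakiri_durations` is made up for illustration, and it assumes the uWSGI timestamps are in UTC.

```python
import re
from datetime import datetime, timezone

# Matches the "HARAKIRI [core N]" status line in the format shown in the log above.
HARAKIRI_RE = re.compile(
    r"(?P<killed>\w{3} \w{3}\s+\d{1,2} \d{2}:\d{2}:\d{2} \d{4}) - "
    r"HARAKIRI \[core \d+\] \S+ - (?P<method>\S+) (?P<path>\S+) "
    r"since (?P<since>\d+)"
)


def harakiri_durations(log_text):
    """Return a list of (path, seconds_running) for each killed request."""
    results = []
    for match in HARAKIRI_RE.finditer(log_text):
        # Assumption: the uWSGI timestamp is in UTC.
        killed = datetime.strptime(match["killed"], "%a %b %d %H:%M:%S %Y")
        killed = killed.replace(tzinfo=timezone.utc)
        started = int(match["since"])  # Unix epoch seconds
        results.append((match["path"], int(killed.timestamp()) - started))
    return results


sample = """\
2017-09-08 13:02:52 Fri Sep  8 13:02:36 2017 - HARAKIRI [core 0] 10.0.0.209 - GET /my_app/default/list_persons since 1504875455
2017-09-08 13:03:40 Fri Sep  8 13:03:39 2017 - HARAKIRI [core 0] 10.0.0.209 - GET /my_app/default/user/logout since 1504875516
"""

for path, seconds in harakiri_durations(sample):
    print(f"{path}: killed after about {seconds}s")
# /my_app/default/list_persons: killed after about 301s
# /my_app/default/user/logout: killed after about 303s
```

Under the UTC assumption, both requests had been running for roughly five minutes when they were killed.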

In your case, it could be related to the downtime, but if you see it in future, it may be an issue with your web app.
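
If you want to check whether your own code is the slow part, one generic option is to time the functions you suspect and log the ones that take too long. This is only a sketch in plain Python 3, not a web2py or PythonAnywhere feature: the decorator `log_if_slow`, the logger name, and the stand-in `list_persons` function are all hypothetical.

```python
import functools
import logging
import time

logger = logging.getLogger("slow_requests")  # hypothetical logger name


def log_if_slow(threshold_seconds=1.0):
    """Log a warning whenever the wrapped call takes at least threshold_seconds."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                return func(*args, **kwargs)
            finally:
                # Runs whether the call returned normally or raised.
                elapsed = time.perf_counter() - start
                if elapsed >= threshold_seconds:
                    logger.warning("%s took %.2fs", func.__name__, elapsed)
        return wrapper
    return decorator


@log_if_slow(threshold_seconds=0.05)
def list_persons():
    time.sleep(0.1)  # stand-in for a slow database query
    return ["alice", "bob"]
```

If the warnings only show up during a site-wide slowdown, the cause is probably outside your app. If they show up regularly, the named function is a good place to start looking.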

Thanks Glen!

Can/will it be confirmed that there was a general issue and a reboot, so I'm not chasing wild geese?

Yes, we can definitely confirm that there was a system-wide issue. Sorry about that!

There were occasional brief slowdowns -- maybe a few minutes or so long each time -- all morning (UTC timezone) culminating in a very large slowdown at around 13:00 UTC. We then applied a fix, which required rebooting a lot of servers.

Thanks Giles!