pypi connection refused by proxy

Hi all,

I tried to install mongoengine from PyPI and ran into this. Is there something wrong, or is it me?

15:34 ~ $ pip install --user mongoengine
Downloading/unpacking mongoengine
  Cannot fetch index base URL
  Could not find any downloads that satisfy the requirement mongoengine
15:34 ~ $ wget
--2013-03-17 15:35:13--
Resolving proxy.server...
Connecting to proxy.server||:3128... failed: Connection refused.

At first glance it looks like either the proxy server is down, or for some reason it's resolved to the wrong IP address. Until the PA staff get a chance to look at this I would keep trying periodically in case it's a transient issue.

You're right, the proxy server was out -- looks like someone was hammering it, it ran out of disk space, and we didn't get alerted.

I've bounced it and it's working now. Tomorrow we'll try to find out why we didn't get alerted.

Yeah it works. Thanks for the response

Thanks for confirming!

@PAStaff: What do you think of a public facing representation of PA's infrastructure status?

It sounds to me from another thread like there is a monitoring system in place but it didn't detect this issue as quickly as would be expected - so presumably making it public-facing wouldn't have helped because it still wouldn't have updated there either. (^_^)

I've worked on fairly large-scale monitoring systems (based on Nagios as it happens, though I wouldn't necessarily recommend it myself) and they're tricky beasts. Since they only get properly tested when something goes wrong, it's really hard to iron out all the kinks. During other outages (e.g. upgrades) it's a good time to take a minute just to check that the monitoring has started failing all the tests you'd expect as things get taken offline - it's about the only time you can do that live without introducing more disruption to service.

@a2j -- that's an excellent idea. Right now we rely on Pingdom for the bulk of our alerts; they send us emails and have an Android app that starts beeping when we have infrastructure problems. But we just bought a Raspberry Pi and a 32" TV to mount on the wall and show an ongoing system status page so that we have something to look at too (and also because we didn't have a Raspberry Pi in the office, and we wanted one). So when that status page is sorted, we'll look into doing a public-facing version.

@Cartroo -- that's also an excellent idea :-) When we deploy at the moment we always have the moment when everyone's phones start beeping and we think "great, the alerting's working" -- but checking that we got all the alerts we wanted isn't part of our formal deploy checklist. I'll update it so that it is.
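
As a rough sketch of what that checklist step could look like: compare the alerts a deploy is expected to trigger with the ones that actually arrived. The check names below are made up for illustration, and nothing here talks to Pingdom; it only shows the comparison.

```python
def missing_alerts(expected, received):
    """Return the expected alert names that never arrived, sorted."""
    return sorted(set(expected) - set(received))


# Hypothetical check names, for illustration only.
expected = {"web-frontend", "proxy", "file-server"}
received = {"web-frontend", "file-server"}

print(missing_alerts(expected, received))  # ['proxy']
```

Anything in the returned list is a check that should have fired during the deploy and didn't, which is exactly the kind of gap that hid the proxy outage.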

Oh yes, big screens with green lights always instil a nice warm fuzzy feeling. Plus it means you can turn the lights off and pretend you're in a Doctor Who episode.

Easiest approach might be to make it a web site with a simple front page containing just traffic lights that click through to more info. So, your big screen just shows the front page, but if it goes red anybody in the office can get the page in their browser and then click through to find out exactly what's wrong.

Definitely recommend keeping it simple - forget lots of explanatory text, just the colours are good. Nothing more annoying than squinting at a distant screen to try and see whether things are OK. I would also strongly consider a way to show historical results (not necessarily on the front page) so you can tell how long something's been failing and whether it's failing consistently or flapping. Worth noting that existing monitoring systems such as Nagios will do a lot of this for you, but in an irritatingly mediocre fashion. You might like to read this post for some alternative views.
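
To illustrate what I mean by telling consistent failure from flapping, here's a minimal sketch. It assumes you keep a list of past results per check, oldest first, with True meaning the check passed; the function names and the "more than one state change means flapping" rule are my own simplification, not how Nagios does it.

```python
def classify_history(results):
    """Classify a list of check results (True = passed), oldest first."""
    if not results:
        return "unknown"
    if all(results):
        return "ok"
    # Count how often the result changed between consecutive checks.
    changes = sum(1 for a, b in zip(results, results[1:]) if a != b)
    if changes > 1:
        return "flapping"
    return "recovered" if results[-1] else "failing"


def failing_streak(results):
    """Number of consecutive failures at the end of the history."""
    streak = 0
    for passed in reversed(results):
        if passed:
            break
        streak += 1
    return streak


print(classify_history([True, True, False, False]))  # failing
print(classify_history([True, False, True, False]))  # flapping
print(failing_streak([True, True, False, False]))    # 2
```

Multiply the streak by your check interval and you get a rough "failing for how long" figure for the click-through page.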

If you find yourself with failover scenarios, where you have a backup system which cuts in while you perform upgrades to a master, then it's useful to also have a failover monitoring system. Once you've done the upgrades you put the failover system monitoring the upgraded primary and check that everything is green just before dropping it back into service. Not sure how applicable to PA that is, though - it's not as if you can run instances of everyone's web apps on a backup system before putting it live. It's worth thinking about if you ever have master/slave scenarios where failover is part of the upgrade process.

Heh, that's exactly how we were imagining it -- just a grid with one square per server (or other similar unit), coloured based on the server's status. Perhaps with some minimal information in the square and more info on a clickthrough. The contents of each square could be served by the server itself, and then there could be some extra JS code in the grid to say "if you can't get contents for this, colour it red and play air-raid-siren.mp3".

Excellent points re: failover monitoring.

Ah, the old tried and tested traffic lights - green for all OK, amber for some tests failing, red for we couldn't even get the results.
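
In code that mapping is about as small as it sounds. A sketch, assuming each server hands back a list of pass/fail booleans and that None stands for "couldn't fetch anything" (both conventions are just for this example):

```python
def traffic_light(test_results):
    """Map one server's test results to a colour.

    test_results is a list of booleans (True = test passed),
    or None if the results could not be fetched at all.
    """
    if test_results is None:
        return "red"
    if all(test_results):
        # Note: an empty list also counts as green here.
        return "green"
    return "amber"


print(traffic_light([True, True]))   # green
print(traffic_light([True, False]))  # amber
print(traffic_light(None))           # red
```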

The question is, how do you detect that the reason you can't hear air-raid-siren.mp3 is that the speakers are broken?

Maybe we could have backup speakers in case the first set don't work? But then we'd need status monitoring on them too...

Just put a microphone in front of the speakers and direct it out of the speakers again - any time the high-pitched whine stops, you know you've got a failure somewhere.

I found your audio notification solution. Some may think it's overkill, but I say, how else will those of us on this side of the Atlantic hear it? You can also listen to a sample of the solution...☺

@Cartroo -- excellent idea. I always felt our office was lacking a constant high-pitched whine.

@a2j -- that's perfect! Might make it a bit hard to fix the problem if the windows are all exploding around us due to the warning noise, though.

I'm just impressed that your offices are big enough to house such a monster!! If by some strange event you did deploy one, don't forget to build in good exhaust ventilation for that V8 engine...☺

I think it might just fit, if we moved all of the furniture and people out...

In that case it's a good thing it wasn't a serious discussion to begin with...☺