[OpenStack-Infra] Ask.o.o down

Tom Fifield tom at openstack.org
Tue Feb 14 07:50:22 UTC 2017


On 10/02/17 22:39, Jeremy Stanley wrote:
> On 2017-02-10 16:08:51 +0800 (+0800), Tom Fifield wrote:
> [...]
>> Down again, this time with "Network is unreachable".
> [...]
>
> I'm not finding any obvious errors on the server nor relevant
> maintenance notices/trouble tickets from the service provider to
> explain this. I do see conspicuous gaps in network traffic volume
> and system load from ~06:45 to ~08:10 UTC according to cacti:
>
>     http://cacti.openstack.org/?tree_id=1&leaf_id=156
>
> Skipping back through previous days I find some similar gaps
> starting anywhere from 06:30 to 07:00 and ending between 07:00 and
> 08:00 but they don't seem to occur every day and I'm not having much
> luck finding a pattern. It _is_ conspicuously close to when
> /etc/cron.daily scripts get fired from the crontab so might coincide
> with log rotation/service restarts? The graphs don't show these gaps
> correlating with any spikes in CPU, memory or disk activity so it
> doesn't seem to be resource starvation (at least not for any common
> resources we're tracking).
>

Indeed. It's down again today during the same timeslot.

Another idea for the cron-based theory:

https://github.com/openstack/uc-recognition/blob/master/tools/get_active_moderator.py

loops through the list of Ask OpenStack users via the API from a cron
job running on www.openstack.org. I'm not sure when that cron job runs,
but if it fires at a similar time, it could be generating a lot of load.
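
A quick way to test the timing side of this theory is to compare the
job's start time against the gap window seen in cacti. This is only a
minimal sketch: the 06:30 and 08:10 UTC bounds come from the gaps
described above, while the candidate run times are made-up placeholders,
since the real schedule has to be read from the crontab on
www.openstack.org.

```python
from datetime import time

# Gap bounds reported in this thread (UTC): gaps start as early as 06:30
# and the latest one observed ended around 08:10.
GAP_START = time(6, 30)
GAP_END = time(8, 10)


def falls_in_gap(run_at, start=GAP_START, end=GAP_END):
    """Return True if a job starting at run_at (UTC) lands inside the window."""
    return start <= run_at <= end


# Hypothetical schedules, for illustration only. Replace these with the
# actual time taken from the crontab entry once it is known.
candidate_runs = {
    "assumed 06:40 UTC": time(6, 40),
    "assumed 12:00 UTC": time(12, 0),
}

for label, run_at in candidate_runs.items():
    verdict = "overlaps" if falls_in_gap(run_at) else "does not overlap"
    print(f"get_active_moderator.py ({label}) {verdict} the gap window")
```

An overlap would only show that the timing is compatible with the
theory; it would not prove the script is the cause.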




Regards,


Tom


