[OpenStack-Infra] Ask.o.o down

Gene Kuo gene at openstack.org
Tue Mar 7 07:46:38 UTC 2017


Hi All,

I found that ask.o.o is down again.
I'm able to ping the server, but connections to port 80 are refused.
Could someone on the infra team check the server? The same problem happened yesterday at about 07:00 UTC.
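
(For reference, a minimal check along these lines is what distinguishes "connection refused" from a plain network timeout; the snippet is only illustrative:)

  import socket

  # "Connection refused" means the host answered but nothing is listening on
  # port 80 (the web service is down); a timeout would instead point at a
  # network problem.
  try:
      socket.create_connection(("ask.openstack.org", 80), timeout=5).close()
      print("port 80 open")
  except ConnectionRefusedError:
      print("port 80 refused - web service is down")
  except socket.timeout:
      print("timed out - network problem")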

Regards,

Gene Kuo

-----Original Message-----
From: "Tom Fifield" <tom at openstack.org>
Sent: Tuesday, February 21, 2017 2:15am
To: openstack-infra at lists.openstack.org
Subject: Re: [OpenStack-Infra] Ask.o.o down



On 2017-02-21 03:11 PM, Tom Fifield wrote:
>
>
> On 2017-02-14 04:19 PM, Joshua Hesketh wrote:
>>
>>
>> On Tue, Feb 14, 2017 at 7:15 PM, Tom Fifield <tom at openstack.org> wrote:
>>
>>     On 14/02/17 16:11, Joshua Hesketh wrote:
>>
>>         Hey Tom,
>>
>>         Where is that script being fired from (a quick grep doesn't
>>         find it), or is it a tool people are using?
>>
>>         If it's a tool we'd need to make sure whoever is using it gets
>>         a new version to rule it out.
>>
>>
>>     Indeed.
>>
>>
>>     It's fired from a PHP service on www.openstack.org itself, which
>>     writes to the Member database:
>>
>>     https://github.com/OpenStackweb/openstack-org/blob/master/auc-metrics/code/services/ActiveModeratorService.php
>>
>> Right. I wonder if somebody could check the logs to see if the process
>> times out. Sadly looking at that code it looks like any output messages
>> from the script will be discarded.
>>
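
(Side note, not the actual PHP wrapper in ActiveModeratorService.php: a rough Python sketch of how that cron invocation could capture the script's output and enforce a timeout, so a hang or error would at least show up in a log. The log path and invocation below are assumptions.)

  import subprocess

  LOG_PATH = "/var/log/auc-metrics.log"            # assumed log location
  CMD = ["python", "get_active_moderator.py"]      # assumed invocation

  try:
      result = subprocess.run(CMD, capture_output=True, text=True,
                              timeout=600)         # give up after 10 minutes
      output = result.stdout + result.stderr
  except subprocess.TimeoutExpired:
      output = "get_active_moderator.py timed out after 600s\n"

  with open(LOG_PATH, "a") as log:
      log.write(output)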
>
> ... and my patch was deployed, but the site is down today. So, looks
> like it wasn't that.

Though, is it staying down for less time? It came back up just now;
normally it'd be down for another 45 minutes.

Interesting traffic spikes at:
http://cacti.openstack.org/cacti/graph.php?action=view&local_graph_id=2549&rra_id=all

seem to correlate with the outage. Perhaps we can set up some tcpdumps?
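
(Something like the following could run from cron just before the suspect window; a rough sketch only, assuming root access and that eth0 is the public interface.)

  import subprocess
  import time

  CAPTURE_SECONDS = 2 * 60 * 60     # cover roughly 06:30-08:30 UTC
  PCAP_PATH = "/var/tmp/ask-port80-%s.pcap" % time.strftime("%Y%m%d-%H%M")

  # Capture only port-80 traffic, without name resolution, to keep load low.
  proc = subprocess.Popen(["tcpdump", "-i", "eth0", "-n", "-s", "0",
                           "-w", PCAP_PATH, "tcp", "port", "80"])
  try:
      time.sleep(CAPTURE_SECONDS)
  finally:
      proc.terminate()              # stop the capture cleanly
      proc.wait()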

>>
>>
>>     The next step is to update the copy of the script it references:
>>
>>     https://github.com/OpenStackweb/openstack-org/blob/master/auc-metrics/lib/uc-recognition/tools/get_active_moderator.py
>>
>>     I am not sure if this is in place using git submodules or manually,
>>     but will figure it out and get that updated.
>>
>>
>>
>>
>>          - Josh
>>
>>         On Tue, Feb 14, 2017 at 7:07 PM, Tom Fifield <tom at openstack.org> wrote:
>>
>>             On 14/02/17 16:06, Joshua Hesketh wrote:
>>
>>                 Hey,
>>
>>                 I've brought the service back up, but have no new clues
>>                 as to why.
>>
>>
>>             Cheers.
>>
>>             Going to try: https://review.openstack.org/#/c/433478/
>>             to see if this script is the culprit.
>>
>>
>>                 - Josh
>>
>>                 On Tue, Feb 14, 2017 at 6:50 PM, Tom Fifield
>>                 <tom at openstack.org> wrote:
>>
>>                     On 10/02/17 22:39, Jeremy Stanley wrote:
>>
>>                         On 2017-02-10 16:08:51 +0800 (+0800), Tom Fifield wrote:
>>                         [...]
>>
>>                             Down again, this time with "Network is
>>                             unreachable".
>>
>>                         [...]
>>
>>                         I'm not finding any obvious errors on the
>>                         server nor relevant maintenance notices/trouble
>>                         tickets from the service provider to explain
>>                         this. I do see conspicuous gaps in network
>>                         traffic volume and system load from ~06:45 to
>>                         ~08:10 UTC according to cacti:
>>
>>
>>                         http://cacti.openstack.org/?tree_id=1&leaf_id=156
>>
>>                         Skipping back through previous days I find
>>                         some similar gaps starting anywhere from 06:30
>>                         to 07:00 and ending between 07:00 and 08:00
>>                         but they don't seem to occur every day and I'm
>>                         not having much luck finding a pattern. It
>>                         _is_ conspicuously close to when
>>                         /etc/cron.daily scripts get fired from the
>>                         crontab so might coincide with log
>>                         rotation/service restarts? The graphs don't
>>                         show these gaps correlating with any spikes in
>>                         CPU, memory or disk activity so it doesn't
>>                         seem to be resource starvation (at least not
>>                         for any common resources we're tracking).
>>
>>
>>                     Indeed. It's down again today during the same
>>                     timeslot.
>>
>>                     Another idea for the cron-based theory:
>>
>>                     https://github.com/openstack/uc-recognition/blob/master/tools/get_active_moderator.py
>>
>>                     loops through the list of Ask OpenStack users via
>>                     the API on a cron running on www.openstack.org.
>>                     Not sure when that cron runs, but if it's similar,
>>                     this could potentially be a high-load generator.
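
(Illustrative only, not the linked script: a throttled version of that kind of user-list crawl might look roughly like this. The endpoint path and JSON field names below are assumptions.)

  import time
  import requests

  BASE_URL = "https://ask.openstack.org/en/api/v1/users/"   # assumed endpoint
  PAUSE_SECONDS = 2      # spread requests out instead of bursting them

  def fetch_all_users():
      users = []
      page = 1
      while True:
          resp = requests.get(BASE_URL, params={"page": page}, timeout=30)
          resp.raise_for_status()
          data = resp.json()
          users.extend(data.get("users", []))
          # "pages" is assumed to hold the total page count
          if page >= int(data.get("pages", page)):
              break
          page += 1
          time.sleep(PAUSE_SECONDS)
      return users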
>>
>>
>>
>>
>>                     Regards,
>>
>>
>>                     Tom
>>
>>

_______________________________________________
OpenStack-Infra mailing list
OpenStack-Infra at lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-infra



