[OpenStack-Infra] Ask.o.o down

Tom Fifield tom at openstack.org
Tue Feb 21 07:15:18 UTC 2017



On 2017-02-21 03:11 PM, Tom Fifield wrote:
>
>
> On 2017-02-14 04:19 PM, Joshua Hesketh wrote:
>>
>>
>> On Tue, Feb 14, 2017 at 7:15 PM, Tom Fifield <tom at openstack.org> wrote:
>>
>>     On 14/02/17 16:11, Joshua Hesketh wrote:
>>
>>         Hey Tom,
>>
>>         Where is that script being fired from (a quick grep doesn't find
>>         it), or is it a tool people are using?
>>
>>         If it's a tool we'd need to make sure whoever is using it gets
>>         a new version to rule it out.
>>
>>
>>     Indeed.
>>
>>
>>     It's fired from a PHP service on www.openstack.org itself, which
>>     writes to the Member database:
>>
>>     https://github.com/OpenStackweb/openstack-org/blob/master/auc-metrics/code/services/ActiveModeratorService.php
>>
>>
>>
>>
>> Right. I wonder if somebody could check the logs to see if the process
>> times out. Sadly looking at that code it looks like any output messages
>> from the script will be discarded.
>>
>
> ... and my patch was deployed, but the site is down today. So, looks
> like it wasn't that.

Though, is it staying down for less time? It came back up just now;
normally it'd be down for another 45 minutes.

Interesting traffic spikes at:
http://cacti.openstack.org/cacti/graph.php?action=view&local_graph_id=2549&rra_id=all

seem to correlate with the outage. Perhaps we can set up some tcpdumps?
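
For the tcpdump idea, here is a minimal sketch of a time-boxed capture. It
assumes the interface is eth0, that ports 80 and 443 are the ones worth
watching, and that coreutils `timeout` is available on the server. The
function name, output path and defaults are hypothetical:

```python
import shlex


def build_tcpdump_command(interface, out_path, duration_s, ports=(80, 443)):
    """Return an argv list for a time-limited, headers-only tcpdump capture.

    All values here are illustrative assumptions, not the real server config.
    """
    # Single capture filter expression, e.g. "port 80 or port 443".
    port_filter = " or ".join(f"port {p}" for p in ports)
    return [
        "timeout", str(duration_s),  # coreutils timeout ends the capture
        "tcpdump",
        "-i", interface,             # interface to listen on (assumed name)
        "-n",                        # don't resolve addresses to names
        "-s", "128",                 # snaplen: keep headers, drop payloads
        "-w", out_path,              # write raw packets to a pcap file
        port_filter,
    ]


if __name__ == "__main__":
    # Two hours; print the command so it can be reviewed before running.
    cmd = build_tcpdump_command("eth0", "/tmp/ask-outage.pcap", 2 * 60 * 60)
    print(shlex.join(cmd))
```

Started a little before 06:30 UTC, a two-hour limit would cover the window
Jeremy noted, and the resulting pcap could be lined up against the cacti
spikes.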

>>
>>
>>     The next step is to update the copy of the script it references:
>>
>>     https://github.com/OpenStackweb/openstack-org/blob/master/auc-metrics/lib/uc-recognition/tools/get_active_moderator.py
>>
>>
>>     I am not sure if this is in place using git submodules or manually,
>>     but will figure it out and get that updated.
>>
>>
>>
>>
>>          - Josh
>>
>>         On Tue, Feb 14, 2017 at 7:07 PM, Tom Fifield <tom at openstack.org> wrote:
>>
>>             On 14/02/17 16:06, Joshua Hesketh wrote:
>>
>>                 Hey,
>>
>>                 I've brought the service back up, but have no new clues
>>                 as to why.
>>
>>
>>             Cheers.
>>
>>             Going to try: https://review.openstack.org/#/c/433478/
>>             to see if this script is the culprit.
>>
>>
>>                 - Josh
>>
>>                 On Tue, Feb 14, 2017 at 6:50 PM, Tom Fifield <tom at openstack.org> wrote:
>>
>>                     On 10/02/17 22:39, Jeremy Stanley wrote:
>>
>>                         On 2017-02-10 16:08:51 +0800 (+0800), Tom Fifield wrote:
>>                         [...]
>>
>>                             Down again, this time with "Network is unreachable".
>>
>>                         [...]
>>
>>                         I'm not finding any obvious errors on the server nor
>>                         relevant maintenance notices/trouble tickets from the
>>                         service provider to explain this. I do see conspicuous
>>                         gaps in network traffic volume and system load from
>>                         ~06:45 to ~08:10 UTC according to cacti:
>>
>>                         http://cacti.openstack.org/?tree_id=1&leaf_id=156
>>
>>                         Skipping back through previous days I find some similar
>>                         gaps starting anywhere from 06:30 to 07:00 and ending
>>                         between 07:00 and 08:00 but they don't seem to occur
>>                         every day and I'm not having much luck finding a
>>                         pattern. It _is_ conspicuously close to when
>>                         /etc/cron.daily scripts get fired from the crontab so
>>                         might coincide with log rotation/service restarts? The
>>                         graphs don't show these gaps correlating with any
>>                         spikes in CPU, memory or disk activity so it doesn't
>>                         seem to be resource starvation (at least not for any
>>                         common resources we're tracking).
>>
>>
>>                     Indeed. It's down again today during the same timeslot.
>>
>>                     Another idea for the cron-based theory:
>>
>>                     https://github.com/openstack/uc-recognition/blob/master/tools/get_active_moderator.py
>>
>>                     loops through the list of Ask OpenStack users via the
>>                     API on a cron running on www.openstack.org. Not sure
>>                     when that cron runs, but if it's similar, this could
>>                     potentially be a high-load generator.
>>
>>
>>
>>
>>                     Regards,
>>
>>
>>                     Tom
>>
>>
>>                     _______________________________________________
>>                     OpenStack-Infra mailing list
>>                     OpenStack-Infra at lists.openstack.org
>>                     http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-infra
>>
>>
>>
>>
>>
>>
>


