[OpenStack-Infra] Ask.o.o down
Tom Fifield
tom at openstack.org
Tue Feb 21 07:15:18 UTC 2017
On 21 February 2017 at 03:11 PM, Tom Fifield wrote:
>
>
> On 14 February 2017 at 04:19 PM, Joshua Hesketh wrote:
>>
>>
>> On Tue, Feb 14, 2017 at 7:15 PM, Tom Fifield <tom at openstack.org> wrote:
>>
>> On 14/02/17 16:11, Joshua Hesketh wrote:
>>
>> Hey Tom,
>>
>> Where is that script being fired from (a quick grep doesn't find it), or
>> is it a tool people are using?
>>
>> If it's a tool we'd need to make sure whoever is using it gets a new
>> version to rule it out.
>>
>>
>> Indeed.
>>
>>
>> It's fired from a PHP service on www.openstack.org itself, which writes
>> to the Member database:
>>
>> https://github.com/OpenStackweb/openstack-org/blob/master/auc-metrics/code/services/ActiveModeratorService.php
>>
>>
>>
>>
>> Right. I wonder if somebody could check the logs to see if the process
>> times out. Sadly looking at that code it looks like any output messages
>> from the script will be discarded.
>>
>
> ... and my patch was deployed, but the site is down today. So, looks
> like it wasn't that.
Though, is it staying down for less time? It came back up just now -
normally it'd be down for another 45 minutes.
Interesting traffic spikes at:
http://cacti.openstack.org/cacti/graph.php?action=view&local_graph_id=2549&rra_id=all
seem to correlate with the outage. Perhaps we can set up some tcpdumps?
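
Before going as far as tcpdump, one rough way to check the correlation is
to flag samples well above the typical traffic level and see which of them
land near an outage window. The following is only a sketch: the sample
values, the outage times, the threshold factor and both function names are
made up for illustration, and real data would have to be exported from
cacti first.

```python
from datetime import datetime, timedelta
from statistics import median


def find_spikes(samples, factor=3.0):
    """Return timestamps whose value exceeds `factor` times the median."""
    baseline = median(value for _, value in samples)
    return [ts for ts, value in samples if value > factor * baseline]


def spikes_near_outage(spikes, outage_start, outage_end,
                       slack=timedelta(minutes=10)):
    """Keep only spikes inside the outage window, padded by `slack`."""
    return [ts for ts in spikes
            if outage_start - slack <= ts <= outage_end + slack]


# Hypothetical samples: (timestamp, bytes per second) every five minutes.
start = datetime(2017, 2, 21, 6, 0)
values = [100, 110, 95, 105, 900, 950, 100, 98, 102, 99]
samples = [(start + timedelta(minutes=5 * i), v)
           for i, v in enumerate(values)]

spikes = find_spikes(samples)
# Hypothetical outage window, not taken from any real log.
hits = spikes_near_outage(
    spikes,
    outage_start=datetime(2017, 2, 21, 6, 30),
    outage_end=datetime(2017, 2, 21, 7, 15),
)
print([ts.strftime("%H:%M") for ts in hits])
```

If most spikes fall inside or just before the outage windows over several
days, that would make a packet capture during that timeslot more worthwhile.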
>>
>>
>> The next step is to update the copy of the script it references:
>>
>>
>> https://github.com/OpenStackweb/openstack-org/blob/master/auc-metrics/lib/uc-recognition/tools/get_active_moderator.py
>>
>>
>> I am not sure if this is in place using git submodules or manually,
>> but will figure it out and get that updated.
>>
>>
>>
>>
>> - Josh
>>
>> On Tue, Feb 14, 2017 at 7:07 PM, Tom Fifield <tom at openstack.org> wrote:
>>
>> On 14/02/17 16:06, Joshua Hesketh wrote:
>>
>> Hey,
>>
>> I've brought the service back up, but have no new clues as to why.
>>
>>
>> Cheers.
>>
>> Going to try: https://review.openstack.org/#/c/433478/
>> to see if this script is the culprit.
>>
>>
>> - Josh
>>
>> On Tue, Feb 14, 2017 at 6:50 PM, Tom Fifield <tom at openstack.org> wrote:
>>
>> On 10/02/17 22:39, Jeremy Stanley wrote:
>>
>> On 2017-02-10 16:08:51 +0800 (+0800), Tom Fifield wrote:
>> [...]
>>
>> Down again, this time with "Network is unreachable".
>>
>> [...]
>>
>> I'm not finding any obvious errors on the server nor relevant
>> maintenance notices/trouble tickets from the service provider to
>> explain this. I do see conspicuous gaps in network traffic volume
>> and system load from ~06:45 to ~08:10 UTC according to cacti:
>>
>> http://cacti.openstack.org/?tree_id=1&leaf_id=156
>>
>> Skipping back through previous days I find some similar gaps
>> starting anywhere from 06:30 to 07:00 and ending between 07:00 and
>> 08:00 but they don't seem to occur every day and I'm not having much
>> luck finding a pattern. It _is_ conspicuously close to when
>> /etc/cron.daily scripts get fired from the crontab so might coincide
>> with log rotation/service restarts? The graphs don't show these gaps
>> correlating with any spikes in CPU, memory or disk activity so it
>> doesn't seem to be resource starvation (at least not for any common
>> resources we're tracking).
>>
>>
>> Indeed. It's down again today during the same timeslot.
>>
>> Another idea for the cron-based theory:
>>
>>
>>
>>
>> https://github.com/openstack/uc-recognition/blob/master/tools/get_active_moderator.py
>>
>>
>> loops through the list of Ask OpenStack users via the API on a cron
>> running on www.openstack.org. Not sure when that cron runs, but if
>> it's similar, this could potentially be a high-load generator.
>>
>>
>>
>>
>> Regards,
>>
>>
>> Tom
>>
>>
>> _______________________________________________
>> OpenStack-Infra mailing list
>> OpenStack-Infra at lists.openstack.org
>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-infra
>>
>>
>>
>>
>>
>>
>