[OpenStack-Infra] Ask.o.o down

Tom Fifield tom at openstack.org
Tue Feb 21 07:11:48 UTC 2017



On 14/02/17 16:19, Joshua Hesketh wrote:
>
>
> On Tue, Feb 14, 2017 at 7:15 PM, Tom Fifield <tom at openstack.org> wrote:
>
>     On 14/02/17 16:11, Joshua Hesketh wrote:
>
>         Hey Tom,
>
>         Where is that script being fired from (a quick grep doesn't find
>         it), or is it a tool people are using?
>
>         If it's a tool we'd need to make sure whoever is using it gets a new
>         version to rule it out.
>
>
>     Indeed.
>
>
>     It's fired from a PHP service on www.openstack.org itself, which
>     writes to the Member database:
>
>     https://github.com/OpenStackweb/openstack-org/blob/master/auc-metrics/code/services/ActiveModeratorService.php
>
>
>
> Right. I wonder if somebody could check the logs to see if the process
> times out. Sadly, looking at that code, it seems any output messages
> from the script will be discarded.
>

... and my patch was deployed, but the site is down today. So it looks
like the script wasn't the cause.
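
On Josh's point about discarded output: a minimal sketch, in Python rather than the PHP the service actually uses, of a caller that keeps the script's output and notes a timeout. The name run_script and the 600-second default are made up for illustration; this is not what ActiveModeratorService.php does.

```python
import subprocess
import sys


def _as_text(data):
    """Normalise captured output, which may be None, bytes or str."""
    if data is None:
        return ""
    if isinstance(data, bytes):
        return data.decode("utf-8", errors="replace")
    return data


def run_script(cmd, timeout_s=600):
    """Run cmd, keep stdout/stderr, and report whether it timed out.

    Illustrative only: run_script and the 600 s default are assumptions,
    not taken from the openstack-org code.
    """
    try:
        proc = subprocess.run(
            cmd, capture_output=True, text=True, timeout=timeout_s
        )
    except subprocess.TimeoutExpired as exc:
        # The child is killed on timeout; keep whatever it printed so far.
        return {
            "timed_out": True,
            "returncode": None,
            "stdout": _as_text(exc.stdout),
            "stderr": _as_text(exc.stderr),
        }
    return {
        "timed_out": False,
        "returncode": proc.returncode,
        "stdout": proc.stdout,
        "stderr": proc.stderr,
    }


if __name__ == "__main__":
    # Stand-in command; the real caller would pass the script's path.
    print(run_script([sys.executable, "-c", "print('ok')"], timeout_s=5))
```

Writing the returned dict to a log would at least show whether the run finished, failed, or hit the time limit.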

>
>
>     The next step is to update the copy of the script it references:
>
>     https://github.com/OpenStackweb/openstack-org/blob/master/auc-metrics/lib/uc-recognition/tools/get_active_moderator.py
>
>     I am not sure if this is in place using git submodules or manually,
>     but will figure it out and get that updated.
>
>
>
>
>          - Josh
>
>         On Tue, Feb 14, 2017 at 7:07 PM, Tom Fifield <tom at openstack.org> wrote:
>
>             On 14/02/17 16:06, Joshua Hesketh wrote:
>
>                 Hey,
>
>                 I've brought the service back up, but have no new clues
>                 as to why.
>
>
>             Cheers.
>
>             Going to try https://review.openstack.org/#/c/433478/
>             to see if this script is the culprit.
>
>
>                 - Josh
>
>                 On Tue, Feb 14, 2017 at 6:50 PM, Tom Fifield
>                 <tom at openstack.org> wrote:
>
>                     On 10/02/17 22:39, Jeremy Stanley wrote:
>
>                         On 2017-02-10 16:08:51 +0800 (+0800), Tom Fifield wrote:
>                         [...]
>
>                             Down again, this time with "Network is unreachable".
>
>                         [...]
>
>                         I'm not finding any obvious errors on the server nor
>                         relevant maintenance notices/trouble tickets from the
>                         service provider to explain this. I do see conspicuous
>                         gaps in network traffic volume and system load from
>                         ~06:45 to ~08:10 UTC according to cacti:
>
>
>                         http://cacti.openstack.org/?tree_id=1&leaf_id=156
>
>                         Skipping back through previous days I find some similar
>                         gaps starting anywhere from 06:30 to 07:00 and ending
>                         between 07:00 and 08:00 but they don't seem to occur
>                         every day and I'm not having much luck finding a
>                         pattern. It _is_ conspicuously close to when
>                         /etc/cron.daily scripts get fired from the crontab so
>                         might coincide with log rotation/service restarts? The
>                         graphs don't show these gaps correlating with any
>                         spikes in CPU, memory or disk activity so it doesn't
>                         seem to be resource starvation (at least not for any
>                         common resources we're tracking).
>
>
>                     Indeed. It's down again today during the same timeslot.
>
>                     Another idea for the cron-based theory:
>
>                     https://github.com/openstack/uc-recognition/blob/master/tools/get_active_moderator.py
>
>                     loops through the list of Ask OpenStack users via the API
>                     on a cron running on www.openstack.org. Not sure when that
>                     cron runs, but if it's similar, this could potentially be
>                     a high-load generator.
>
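
If a loop over every user were a load concern, pausing between page requests is one way to soften it. A minimal sketch, assuming a hypothetical fetch_page(page) callable that returns a dict with 'users' and 'pages' keys; that shape is an assumption for illustration, not the documented Ask OpenStack API, and this is not how get_active_moderator.py is written.

```python
import time


def iter_users(fetch_page, delay_s=1.0, sleep=time.sleep):
    """Yield users page by page, pausing delay_s between requests.

    fetch_page is a hypothetical callable: fetch_page(page) returns a dict
    with 'users' (a list) and 'pages' (the total number of pages).
    The sleep parameter exists so the pause can be replaced in tests.
    """
    page = 1
    while True:
        data = fetch_page(page)
        for user in data["users"]:
            yield user
        if page >= data["pages"]:
            break
        # Spread the requests out instead of hitting the API back to back.
        sleep(delay_s)
        page += 1
```

The delay only spreads the same number of requests over a longer period, so it would help with short load spikes but not with the total work done.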
>
>
>
>                     Regards,
>
>
>                     Tom
>
>
>                     _______________________________________________
>                     OpenStack-Infra mailing list
>                     OpenStack-Infra at lists.openstack.org
>                     http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-infra
>
>
>
>
>
>


