[nova] New gate bug 1844929, timed out waiting for response from cell during scheduling

Mark Goddard mark at stackhpc.com
Sun Sep 22 16:55:03 UTC 2019


On Sun, 22 Sep 2019, 16:39 Matt Riedemann, <mriedemos at gmail.com> wrote:

> I noticed this while looking at a grenade failure on an unrelated patch:
>
> https://bugs.launchpad.net/nova/+bug/1844929
>
> The details are in the bug but it looks like this showed up around Sept
> 17 and hits mostly on FortNebula nodes but also OVH nodes. It's
> restricted to grenade jobs and while I don't see anything obvious in the
> rabbitmq logs (the only errors are about uwsgi [api] heartbeat issues),
> it's possible that these are slower infra nodes and we're just not
> waiting for something properly during the grenade upgrade. We also don't
> seem to have the mysql logs published during the grenade jobs which we
> need to fix (and recently did fix for devstack jobs [1] but grenade jobs
> are still using devstack-gate so log collection happens there).
>
> I didn't see any changes in nova, grenade or devstack since Sept 16 that
> look like they would be related to this so I'm guessing right now it's
> just a combination of performance on certain infra nodes (slower?) and
> something in grenade/nova not restarting properly or not waiting long
> enough for the upgrade to complete.
>

Julia recently fixed an issue in ironic caused by a low MTU on FortNebula.
This may or may not be related.

> [1]
> https://github.com/openstack/devstack/commit/f92c346131db2c89b930b1a23f8489419a2217dc
>
> --
>
> Thanks,
>
> Matt
>
>