[nova][gate] Thoughts on working around bug 1853453?
Matt Riedemann
mriedemos at gmail.com
Thu Nov 21 17:08:08 UTC 2019
I've been seeing the shelve/unshelve tests fail on guest SSH due to DHCP
lease issues quite a bit recently, and wrote a bug and an elastic-recheck
(e-r) query for it this morning:
http://status.openstack.org/elastic-recheck/#1853453
The problem seems to stem from when these shelve tests run on multinode
jobs and we shelve on one host and unshelve on another.
I have a patch up to nova to force config drive in the nova-next job
where this hits the most:
https://review.opendev.org/#/c/695431
But that's just kind of a stab in the dark to take the metadata API out
of the picture for cloud-init.
If that doesn't help, and we don't know what is causing this or have
ideas to debug it, we might need to consider changing the
shelve/unshelve testing in tempest so that we try to unshelve on the
original host. I realize that is unfortunate, since the whole point
of shelve offloading and unshelving is that you can land on another host
and things are good. But if these tests continue to have a high failure
rate in multinode jobs, we probably need to consider workarounds if no
one is going to dig into the failures and figure out what is going wrong.
Thoughts?
--
Thanks,
Matt