I've been noticing these shelve/unshelve guest ssh fail due to dhcp lease issues quite a bit recently and wrote a bug and e-r query for it this morning: http://status.openstack.org/elastic-recheck/#1853453 The problem seems to stem from when these shelve tests run on multinode jobs and we shelve on one host and unshelve on another. I have a patch up to nova to force config drive in the nova-next job where this hits the most: https://review.opendev.org/#/c/695431 But that's just kind of a stab in the dark to take the metadata API out of the picture for cloud-init. If that doesn't help, and we don't know what is causing this or have ideas to debug it, we might need to consider making a change to shelve/unshelve testing in tempest such that we try to unshelve on original host. Now I realize that is unfortunate since the whole point of shelve offloading and unshelving is that you can land on another host and things are good, but if these tests continue to be a high failure rate in multinode jobs we probably need to consider workarounds if no one is going to dig into the failures and figure out what is going wrong. Thoughts? -- Thanks, Matt