[nova][ops] Trying to get per-instance live migration timeout action spec unstuck

Dan Smith dms at danplanet.com
Thu Jan 3 22:37:03 UTC 2019


>> Or even that libvirt/kvm
>> migration changes in such a way that these no longer make sense even for
>> it. We already have an example of this in-tree today, where the
>> recently-added libvirt post-copy mode makes the 'abort' option invalid.
>
> I'm not following you here. As far as I understand, post-copy in the
> libvirt driver is triggered on the force complete action and only if
> (1) it's available and (2) nova is configured to allow it, otherwise
> the force complete action for the libvirt driver pauses the VM. The
> abort operation aborts the job in libvirt [1] which I believe triggers
> a rollback [2].
>
> [1]
> https://github.com/openstack/nova/blob/8ef3d253a086e4f8575f5221d4515cda421abea2/nova/virt/libvirt/driver.py#L7388
> [2]
> https://github.com/openstack/nova/blob/8ef3d253a086e4f8575f5221d4515cda421abea2/nova/virt/libvirt/driver.py#L7454

Because in nova[0] we currently only switch to post-copy after we decide
we're not making progress, right? If we later allow a configuration where
post-copy is the default from the start (as I believe is the actual
current recommendation from the virt people[1]), and someone triggers a
migration with a short timeout and abort action, we'll not be able to
actually do the abort. I'm guessing we'd just need to refuse a request
where abort is specified with any timeout if post-copy will be used from
the beginning. Since the API user can't know how the virt driver is
configured, we just have to refuse to do the migration and hope they'll
understand :)

0: Sorry, I shouldn't have said "in tree" because I meant "in the
   libvirt world"
1: look for "in summary" here: https://www.berrange.com/posts/2016/05/12/analysis-of-techniques-for-ensuring-migration-completion-with-kvm/
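
To make the refusal case above concrete, here is a rough sketch of the kind of check I mean. None of these names exist in nova: LiveMigrationRequest, MigrationRequestRefused, validate_request and the post_copy_from_start flag are all made up for illustration, and the flag assumes a future config where post-copy is enabled from the start of the migration.

```python
from dataclasses import dataclass
from typing import Optional


class MigrationRequestRefused(Exception):
    """Hypothetical error for a request the virt driver cannot honor."""


@dataclass
class LiveMigrationRequest:
    # Hypothetical per-instance overrides, loosely modeled on the spec.
    timeout: Optional[int] = None         # seconds; None means "use config"
    timeout_action: Optional[str] = None  # "abort" or "force_complete"


def validate_request(request: LiveMigrationRequest,
                     post_copy_from_start: bool) -> None:
    """Refuse 'abort' when the migration would begin in post-copy mode.

    post_copy_from_start is an assumed driver-side setting; it is not
    something nova exposes today.
    """
    # Any timeout combined with 'abort' is refused, since the abort
    # could not actually be carried out once post-copy is running.
    if request.timeout_action == "abort" and post_copy_from_start:
        raise MigrationRequestRefused(
            "timeout action 'abort' cannot be honored when post-copy "
            "is used from the start of the migration")
```

With post_copy_from_start=False the same request passes, which is the problem: the API user can't tell in advance which of the two outcomes they will get.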

--Dan


