On Thu, 3 Jan 2019 18:02:16 -0600, Matt Riedemann <mriedemos@gmail.com> wrote:
On 1/3/2019 5:45 PM, Dan Smith wrote:
You can't abort a post-copy migration once it has started. If we were to add an "always do post-copy" mode to Nova, per the recommendation from the post I linked, then we would start a migration in post-copy mode, which would make it un-cancel-able. That means not only could you not cancel it, but we would have to refuse to start the migration if the user requested an abort action via this new proposed API with any timeout value.
Anyway, my point here is just that libvirt already (but not nova/libvirt yet) has a live migration mode where we would not be able to honor a request of "abort after N seconds". If config specified that, we could warn or fail on startup, but via the API all we'd be able to do is refuse to start the migration. I'm just trying to highlight that baking "force/abort after N seconds" into our API is not just libvirt-specific at the moment, but libvirt-pre-copy-specific.
OK, sorry, I'm following you now. I didn't make the connection that you were talking about something we could do in the future (in nova) to initiate the live migration in post-copy mode. Yeah, I agree: in that case, if the user said abort, we'd just have to reject it and say you can't do that based on how the source host is configured.
This seems like a reasonable way to handle the future case of a live migration initiated in post-copy mode. Overall, I support the idea of adding finer-grained control over live migrations, given that multiple operators have said it would be useful to them and it seems like a relatively simple change. It also sounds like we have answers for the concerns about bad UX: check before the live migration starts whether the driver supports the new parameters, and fail fast if it does not. And in the future, if live migrations can be initiated in post-copy mode, we fail fast in the same way, with instance action info.

-melanie
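For what it's worth, the fail-fast check discussed above might look roughly like the sketch below. This is only an illustration: every name in it (DriverCapabilities, LiveMigrationRequest, check_request, the two capability flags) is hypothetical and does not correspond to existing Nova or libvirt code. It assumes the driver can report, before the migration starts, whether it honors a per-migration timeout action and whether it would start the migration in post-copy mode.

```python
from dataclasses import dataclass
from typing import Optional


class MigrationRequestRejected(Exception):
    """Raised before the migration starts when a request cannot be honored."""


@dataclass
class DriverCapabilities:
    # Hypothetical flags, not real Nova driver capability names.
    supports_timeout_action: bool = False
    starts_in_post_copy: bool = False


@dataclass
class LiveMigrationRequest:
    # Hypothetical per-request parameters from the proposed API.
    timeout_seconds: Optional[int] = None
    timeout_action: Optional[str] = None  # "abort" or "force_complete"


def check_request(caps: DriverCapabilities, req: LiveMigrationRequest) -> None:
    """Reject an unsupported request up front instead of failing mid-migration."""
    if req.timeout_action is None:
        # No per-request override, so there is nothing to validate.
        return
    if not caps.supports_timeout_action:
        raise MigrationRequestRejected(
            "driver does not support a per-migration timeout action")
    if req.timeout_action == "abort" and caps.starts_in_post_copy:
        # A migration that starts in post-copy mode cannot be aborted,
        # so "abort after N seconds" can never be honored.
        raise MigrationRequestRejected(
            "abort is not possible for a migration started in post-copy mode")
```

The caller would run check_request() before touching the instance and record the rejection reason in the instance action, so the user sees why the migration never started.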