[nova] Live migration of a RAM intensive instance failed

Michel Jouvin michel.jouvin at ijclab.in2p3.fr
Mon Sep 18 10:13:04 UTC 2023


Hi,

post-copy looks to me like a very attractive approach for these
heavily loaded VMs, but I wasn't aware that there is an inherent risk
of data loss (unless there is an implementation bug)... Are you sure?

Michel

On 18/09/2023 at 11:29, Rafa wrote:
> As far as I can tell from the source compute logs, the memory is
> copied over successfully and there is no migration timeout.
> But something goes wrong after the instance is paused.
> I first thought it could be the short migration downtime (default=500ms),
> so I increased "live_migration_downtime" to higher values
> (max was 300000ms; just for testing :) ) and nothing changed.
>
> And the error message doesn't say much either.
>
> I don't really want to use post-copy as it can lead to data loss.
>
> Auto Converge doesn't seem to help either.
>
>
>
> On Fri, 15 Sept 2023 at 13:56, <smooney at redhat.com> wrote:
>> On Thu, 2023-09-14 at 19:07 +0200, Rafa wrote:
>>>> Hi, could you share logs from the target compute node as well?
>>> yes, here: https://paste.openstack.org/show/blpJE8krA1N6PVaLTVF0/
>>>
>> if the vm is under heavy memory load then it's advisable to use post-copy live migration.
>>
>> in general live migration is not intended to be used with a vm under load
>> as there is no guarantee that it will ever complete. post-copy live migration
>> can significantly increase the probability that a vm under load will live migrate
>> in a reasonable amount of time.
>>
>> https://docs.openstack.org/nova/latest/configuration/config.html#libvirt.live_migration_permit_post_copy
>>
>> auto converge can also help but it is less important
>>
>> https://docs.openstack.org/nova/latest/configuration/config.html#libvirt.live_migration_permit_auto_converge
>>


