<div dir="ltr"><div>Hi, this is the volume, attached over NetApp NFS, of the VM I am migrating:</div><div>qemu-img info volume-002ff8af-9067-4f84-a01c-d147cdd1f70d<br>image: volume-002ff8af-9067-4f84-a01c-d147cdd1f70d<br>file format: raw<br>virtual size: 40G (42949672960 bytes)<br>disk size: 21G</div><div><br></div><div>As you can see, it is raw and it does not have a base image.</div><div>Ignazio<br></div><div><br></div><div><br></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Fri, 5 Aug 2022 at 10:49, Gorka Eguileor <<a href="mailto:geguileo@redhat.com">geguileo@redhat.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">On 05/08, Ignazio Cassano wrote:<br>

> Hello, first let me thank you for the reply, and sorry to come back to<br>

> ask why the first migration from A to B takes 20 minutes while the<br>

> migration back from B to A takes only a few seconds.<br>

> I wonder whether memory is reorganized after the first migration.<br>

> Does the first live migration lose time fetching memory pages?<br>

> Ignazio<br>

><br>

<br>

Hi,<br>

<br>

I work on Cinder, so my knowledge on live migrations is mostly limited<br>

to the attach/detach flow of the volumes.<br>

<br>

I thought that, if you were using ephemeral Nova volumes (non-Cinder),<br>

maybe the volume had not yet been deleted from the old node. Or maybe<br>

the instance was using a qcow2 base file shared by multiple instances on<br>

the source (each using a different chain on top of it), and this qcow2<br>

was not originally present on the destination (hence the time to copy<br>

it). When we migrate back, other instances on the destination (the<br>

original location) are still using it, so only the difference needs<br>

to be copied.<br>

<br>

But these are just brainstorming ideas, since I don't really know how<br>

Nova handles all this.<br>

<br>

I would recommend setting the Nova log to debug mode on both the source and<br>

destination nodes and looking at where the time difference really is, in<br>

case it's not where you think it is.<br>
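<br>

One way to do that comparison is to look for the longest pauses between<br>

consecutive log lines. This is only a minimal sketch: the function name<br>

largest_gaps is hypothetical, and it assumes each log line starts with a<br>

timestamp such as "2022-08-05 10:49:00.123", which may not match your<br>

logging configuration.<br>

```python
import re
from datetime import datetime

# Assumed timestamp layout at the start of each log line, for example
# "2022-08-05 10:49:00.123 ...". This is an assumption, not a guarantee
# about your logging format; adjust the pattern if yours differs.
TIMESTAMP = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3})")


def largest_gaps(lines, top=3):
    """Return the `top` longest pauses between consecutive timestamped lines.

    Each result is (gap_in_seconds, line_before_gap, line_after_gap).
    Lines without a leading timestamp (tracebacks, wrapped text) are skipped.
    """
    stamped = []
    for line in lines:
        match = TIMESTAMP.match(line)
        if match:
            when = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S.%f")
            stamped.append((when, line.rstrip("\n")))
    # Pair every timestamped line with the next one and measure the pause.
    gaps = [
        ((later[0] - earlier[0]).total_seconds(), earlier[1], later[1])
        for earlier, later in zip(stamped, stamped[1:])
    ]
    return sorted(gaps, key=lambda gap: gap[0], reverse=True)[:top]
```

Running it over the lines of the compute log from each node for the slow<br>

migration and again for the fast one shows which step the extra minutes<br>

fall between.<br>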

<br>

Cheers,<br>

Gorka.<br>

<br>

<br>

> On Fri, 5 Aug 2022 at 10:17, Gorka Eguileor <<a href="mailto:geguileo@redhat.com" target="_blank">geguileo@redhat.com</a>><br>

> wrote:<br>

><br>

> > On 04/08, Ignazio Cassano wrote:<br>

> > > Hi,<br>

> > > I am using cinder volumes.<br>

> > > Ignazio<br>

> > ><br>

> ><br>

> > Hi,<br>

> ><br>

> > In that case there is no volume data being copied for the instance<br>

> > migration, and volume attach on the destination should not account for<br>

> > more than 30 seconds of those 20 minutes, so not much improvement<br>

> > possible there.<br>

> ><br>

> > Cheers,<br>

> > Gorka.<br>

> ><br>

> > > On Thu, 4 Aug 2022 at 16:56, Gorka Eguileor <<a href="mailto:geguileo@redhat.com" target="_blank">geguileo@redhat.com</a>> wrote:<br>

> > ><br>

> > > > On 03/08, Ignazio Cassano wrote:<br>

> > > > > Hello All,<br>

> > > > > I am looking for a solution to speed up live migration.<br>

> > > > > For instances that use RAM heavily, like Java application servers,<br>

> > > > > live migration takes a long time (more than 20 minutes for an 8GB RAM<br>

> > > > > instance), and converge mode is already set to True in nova.conf.<br>

> > > ><br>

> > > > Hi,<br>

> > > ><br>

> > > > Probably doesn't affect your case, but I assume you are using ephemeral<br>

> > > > nova boot volumes.<br>

> > > ><br>

> > > > Have you tried using only Cinder volumes on the VM?<br>

> > > ><br>

> > > > Cheers,<br>

> > > > Gorka.<br>

> > > ><br>

> > > ><br>

> > > > > I also tried post_copy, but it makes no difference.<br>

> > > > > After the first live migration (very slow), if I migrate again it is<br>

> > > > > very fast.<br>

> > > > > I presume the first migration is slow because of memory fragmentation<br>

> > > > > when an instance has been running on the same compute node for a long time.<br>

> > > > > I am looking for a solution, considering that my compute nodes can<br>

> > > > > have a little RAM overcommit. In any case, I am increasing the number of<br>

> > > > > compute nodes to reduce it.<br>

> > > > > Thanks<br>

> > > > > Ignazio<br>

> > > ><br>

> > > ><br>

> ><br>

> ><br>

<br>

</blockquote></div>