[operations][nova] Migrate to shared /var/lib/nova/instances
Hello,

AFAIK live migration requires /var/lib/nova/instances to be shared between the compute nodes (e.g. through an NFS export). If this was not configured initially and each compute node has its own folder, is it possible to merge them somehow without disruptive effects?

Regards
Francesco Di Nucci
On Mon, 2024-06-17 at 14:02 +0200, Francesco Di Nucci wrote:
Hello,
AFAIK live migration requires to have /var/lib/nova/instances shared between the compute nodes (eg. through a NFS export)
That is not the case, and to my knowledge it has never been required. Perhaps it was in the very early days, i.e. before 2013 when I started working on OpenStack, but we have supported live migration with local storage, without NFS, since at least the Havana release and likely before that. Nova instructs libvirt to export the VM storage via the NBD (network block device) server so that it can be transferred to the destination host.
- If this has not been configured initially and each compute node has its own folder, is it possible to merge them somehow without disruptive effects?
Not safely with running workloads, at least not without a lot of work. The only way to move from local non-shared storage to shared storage is to copy the disks, which is generally unsafe to do manually.

The only way I can think of to do this with running workloads is via live migration. In effect you need an empty compute node that is on NFS, then you need to live migrate all instances from a non-NFS host to the NFS host. Once the non-NFS host is drained of workloads, you can mount the NFS share on /var/lib/nova/instances and use it as the next "empty" host. Then you repeat that procedure for every host until all your instances are on NFS.

That will likely reduce the performance of your guests and may cause additional operational issues, as NFS is known to have locking/consistency issues, especially v3. We recommend a minimum NFS version of 4.2.

While /var/lib/nova/instances on NFS is technically supported upstream, our best-practice recommendation is to avoid that configuration and either use local storage, boot from Cinder volumes, or use Ceph if you want a shared network storage solution.
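The rolling drain described above can be sketched as a small planning function. This is only an illustration of the ordering, not a tool that talks to Nova: the function name, the step labels ("live_migrate", "mount_nfs") and the host and instance names are all hypothetical, and it assumes each drained host has room for the next host's instances.

```python
from typing import Dict, List, Tuple

Step = Tuple[str, ...]


def plan_rolling_drain(
    instances_by_host: Dict[str, List[str]], first_nfs_host: str
) -> List[Step]:
    """Return ordered steps for draining non-NFS hosts one at a time.

    instances_by_host maps each non-NFS compute host to its instances.
    first_nfs_host is an empty compute node that already mounts the share.
    """
    if instances_by_host.get(first_nfs_host):
        raise ValueError("the first NFS host must start out empty")

    steps: List[Step] = []
    target = first_nfs_host
    for host, instances in instances_by_host.items():
        if host == first_nfs_host:
            continue
        # Live migrate every instance off the non-NFS host to the NFS target.
        for instance in instances:
            steps.append(("live_migrate", instance, host, target))
        # The drained host can now mount the share and act as the next
        # "empty" NFS-backed target.
        steps.append(("mount_nfs", host))
        target = host
    return steps
```

For example, `plan_rolling_drain({"c1": ["a", "b"], "c2": ["c"]}, "c0")` first moves `a` and `b` to `c0`, then converts `c1`, then moves `c` to `c1`.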
Regards
Francesco Di Nucci
Thank you,

I was reading the Nova docs that only cite NFS for live migration (https://docs.openstack.org/nova/latest/admin/configuring-migrations.html#sha...). Do you know if there is documentation about live migration with Ceph?

We're planning on deploying it, in which case switching from local storage to Ceph (instead of NFS) would be the more sensible approach.

Regards
Francesco Di Nucci
On Tue, 2024-06-18 at 08:33 +0200, Francesco Di Nucci wrote:
Thank you,
I was reading the Nova docs that only cite NFS for live migration (https://docs.openstack.org/nova/latest/admin/configuring-migrations.html#sha...), do you know if there is documentation about live migration with Ceph?
It is more or less the same. The only thing required to make live migration work with libvirt, for local non-shared storage or for Ceph, is covered in the general section:
https://docs.openstack.org/nova/latest/admin/configuring-migrations.html#gen...
Basically, you need to create and distribute a passwordless SSH key or SSH certificate that is in the authorized keys file on each host, and you need to either disable known-host validation or run an SSH key scan, populate a known hosts file with all your compute nodes, and distribute that file to all hosts. We require this SSH access regardless of the storage backend used.
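As a rough sketch of the "key scan and populate a known hosts file" step, the snippet below merges `ssh-keyscan` output from several compute nodes into one de-duplicated file body. The function names are hypothetical, it assumes the OpenSSH `ssh-keyscan` binary is installed, and where the resulting file is placed and how it is distributed is left to your deployment tooling.

```python
import subprocess
from typing import Iterable, List


def scan_host(host: str, timeout: int = 5) -> str:
    """Return the raw ssh-keyscan output for one host."""
    result = subprocess.run(
        ["ssh-keyscan", "-T", str(timeout), host],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


def merge_known_hosts(scans: Iterable[str]) -> str:
    """Merge several ssh-keyscan outputs into one known_hosts body."""
    seen = set()
    entries: List[str] = []
    for scan in scans:
        for line in scan.splitlines():
            line = line.strip()
            # Skip blank lines and any '#' comment lines in the scan output.
            if not line or line.startswith("#"):
                continue
            # Keep the first occurrence of each entry, preserving order.
            if line not in seen:
                seen.add(line)
                entries.append(line)
    return "\n".join(entries) + "\n" if entries else ""
```

A call such as `merge_known_hosts(scan_host(h) for h in hosts)`, with `hosts` being your own list of compute node names, yields the text to write out and copy to every host.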
We're planning on deploying it, in case switching from local storage to Ceph (instead of NFS) would be the more sensible approach
That is even more unsupported. Nova in general does not allow switching the storage backend for existing instances. Officially it is unsupported; unofficially it can work in limited cases, but it is untested.

For non-shared to shared storage while still using qcow/raw, the live migration method I described will work. That will not work for Ceph, and move operations between different storage backends are not officially supported. The only one that kind of works is shelve and unshelve. That should work from local storage to Ceph-backed storage, but again it is not technically supported, although it might work since it would be creating a Ceph volume from a Glance image. Going the other way, Ceph to local via shelve, probably won't work, as the local-storage compute node probably won't be able to pull the shelved image from the Ceph cluster and flatten it.

When you deploy OpenStack, choosing a storage backend for a set of computes is meant to be an upfront decision. You can mix and match if you partition your cloud appropriately, but for running workloads, once they are created the backend they use is effectively fixed without lots of hacks.

I have heard of operators using qemu-img and the rbd/ceph command line client to manually create the Ceph volumes for existing VMs, then changing nova.conf and hard rebooting the instances to have them boot from Ceph. My recollection is that they were able to successfully pull that off just by creating the volume with the correct UUID and copying the data. While it actually did work in that specific case, we don't have any documentation for it, as it really does require a lot of operational knowledge of Nova and Ceph to do correctly, e.g. distributing functional Ceph config and keyrings to the relevant hosts, configuring networking for Ceph if not already done, stopping the guests to safely do the data transfer, et cetera.
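The backend-switching cases mentioned in this reply can be summarized as a small lookup table. This is just a restatement of the text above in code form; the backend and method labels are made up for the illustration and are not Nova configuration values.

```python
from typing import Dict, Tuple

# (source backend, destination backend, method) -> status as described in
# this thread. The keys are illustrative labels, not Nova option names.
MOVE_STATUS: Dict[Tuple[str, str, str], str] = {
    ("local", "nfs", "live-migration"): "works for qcow/raw disks",
    ("local", "ceph", "live-migration"): "does not work",
    ("local", "ceph", "shelve-unshelve"): "might work, not officially supported",
    ("ceph", "local", "shelve-unshelve"): "probably will not work",
}


def move_status(src: str, dst: str, method: str) -> str:
    """Look up what the thread says about moving an instance this way."""
    return MOVE_STATUS.get((src, dst, method), "not covered in this thread")
```

Anything outside these four combinations, including the manual qemu-img/rbd approach, falls outside what is documented or tested.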
Regards
Francesco Di Nucci
Thank you very much, I'll try live migration without shared storage first.

Regarding the change of storage backend, unfortunately the decision is not up to me... thank you for the warning/information.

Best regards
Francesco Di Nucci