[cinder][nova] Migrating servers' root block devices from a cinder backend to another
Hi,

We have several hundred VMs which were built on Cinder block devices as root drives, backed by a SAN. We now want to change their backend from the SAN to Ceph. We can shut down the VMs, but we will not destroy them. I am aware that there is a cinder migrate volume command to change a volume's backend, but it requires that the volume be completely detached. Forcing a detached state on that volume does let the volume migration take place, but the volume's path in the Nova block_device_mapping doesn't update, for obvious reasons.

So, I am considering forcing the volumes to a detached status in Cinder and then manually updating the Nova DB block_device_mapping entry for each volume so that the VM can boot back up afterwards. However, before I start toying with the database and accidentally break stuff, has anyone else ever done something similar? Got any tips or hints on how best to proceed?

Jean-Philippe Méthot
OpenStack system administrator / Administrateur système OpenStack
PlanetHoster inc.
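For concreteness, the sequence being considered looks roughly like this (a sketch only; the volume UUID, backend host, and database names are placeholders, and the connection_info JSON for the new backend would have to be hand-crafted, which is exactly the risky part):

    # Stop the instance so nothing writes to the root volume
    openstack server stop <server-uuid>

    # Force the volume to look detached so Cinder accepts the migration
    cinder reset-state --state available --attach-status detached <volume-uuid>

    # Migrate the volume to the new (Ceph) backend
    cinder migrate <volume-uuid> <ceph-backend-host>

    # Restore the attached status once the migration finishes
    cinder reset-state --state in-use --attach-status attached <volume-uuid>

    # Manually point Nova's block_device_mapping at the new backend
    mysql nova -e "UPDATE block_device_mapping \
        SET connection_info='<new connection_info JSON>' \
        WHERE volume_id='<volume-uuid>' AND deleted=0;"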
We did something similar recently: in an old platform, all of our instances were booted from Cinder volumes (with "delete on terminate" set).

So we added our new Ceph storage to the old platform and removed the instances (after updating delete_on_terminate to 0 in the Nova DB). Then we issued a retype, so cinder-volume performed a `dd` of each volume from the old storage to the new one. We then synced networks/subnets/security groups, started instances with the same fixed IPs, and moved the floating IPs to the new platform.

Since you only have to swap storage, you should experiment with powering off the instances and trying a migrate of the volume, but I suspect you need to either remove the instance or do some really nasty database operations.

I would suggest always going through the API and recreating the instance from the migrated volume instead of changing things in the DB. We had to update delete_on_terminate in the DB, but that was pretty trivial (and I even think there is a spec, not yet implemented, that will allow that from the API).
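A minimal sketch of that workflow, assuming the standard clients and the usual Nova DB schema (the volume type, flavor, and UUIDs are placeholders):

    # Make sure deleting the instance will not delete its root volume
    mysql nova -e "UPDATE block_device_mapping \
        SET delete_on_termination=0 \
        WHERE instance_uuid='<instance-uuid>' AND deleted=0;"

    # Delete the instance; the boot volume stays behind
    openstack server delete <instance-uuid>

    # Retype the volume to the Ceph-backed type; the on-demand policy
    # makes cinder-volume copy the data between backends
    cinder retype --migration-policy on-demand <volume-uuid> <ceph-volume-type>

    # Recreate the instance from the migrated volume
    openstack server create --volume <volume-uuid> --flavor <flavor> \
        --nic port-id=<port-uuid> <server-name>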
I want to do something similar soon and don't want to touch the DB (I experimented with cloning the "controller" and it did not achieve any desired outcome). Is there a way to export an instance from OpenStack, in the form of something like a script that could re-create it on another OpenStack as a like-for-like? I guess this assumes that the instance is Linux-based and has cloud-init enabled.

Tony Pearce | Senior Network Engineer / Infrastructure Lead
Cinglevue International <https://www.cinglevue.com>
Email: tony.pearce@cinglevue.com | Web: http://www.cinglevue.com
Another approach would be to export the data to Glance, download the image, and then upload it on the other side.

There is no ready-made tool that I know of. We used the openstacksdk to simply recreate the steps we did on the CLI: create all the necessary resources on the other side, create new instances from the migrated volumes, and set a fixed IP on the Neutron port to get the same IP address.
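A minimal sketch of those last steps with the plain CLI (network, subnet, IP, and flavor values are placeholders; in practice we scripted the same calls with openstacksdk):

    # Create a port pinned to the instance's old fixed IP
    openstack port create --network <network> \
        --fixed-ip subnet=<subnet>,ip-address=<old-fixed-ip> <port-name>

    # Boot a new instance from the migrated volume using that port
    openstack server create --volume <migrated-volume-uuid> \
        --flavor <flavor> --nic port-id=<port-uuid> <server-name>

    # Re-associate the floating IP on the new platform
    openstack floating ip set --port <port-uuid> <floating-ip>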
Thanks Tobias - that's my last resort. I'd still need to upload that image to the new OpenStack and then build an instance from the image. I'd also need to use metadata to make sure the instance was built with the same components (IP address etc.).

Tony Pearce | Senior Network Engineer / Infrastructure Lead
Cinglevue International <https://www.cinglevue.com>
Email: tony.pearce@cinglevue.com | Web: http://www.cinglevue.com
Hi Tony,

Have you considered doing a "Tech Refresh" process? The companies I've worked at consider VMs ephemeral. When we need to replace a cluster, we build the new one, notify the users to create new VMs there, and then delete the old ones and take down the old cluster. We give them tools like Forklift to help migrate, but we make it their responsibility to create the new VM and move their data over.
We had to update delete_on_terminate in DB but that was pretty trivial (and I even think there is a spec that is not implemented yet that will allow that from API).
Not sure if this is still helpful, but that spec is here: https://review.opendev.org/#/c/580336/

efried
Assuming you're using the Libvirt driver, a hackaround here is to cold migrate the instance to another host; this refreshes the connection_info from c-api and should allow the instance to boot correctly.

FWIW, https://review.opendev.org/#/c/696834/ will hopefully support live attached-volume migrations to and from Ceph volumes, thanks to the recent -blockdev changes in Libvirt and QEMU. I also want to look into offline attached-volume migration in the V cycle. IIRC swap_volume fails when the instance isn't running, but in that case it's essentially a noop for n-cpu, and assuming c-vol rebases the data on the backend it should succeed.

Anyway, hope this helps!

--
Lee Yarwood
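A minimal sketch of that hackaround (the server UUID is a placeholder; cold migration needs spare capacity on another compute host):

    # Power off the instance, then cold migrate it; on the destination
    # host Nova refreshes the volume connection_info from Cinder
    openstack server stop <server-uuid>
    openstack server migrate <server-uuid>

    # Once the server reaches VERIFY_RESIZE, confirm and start it again
    openstack server resize confirm <server-uuid>   # "resize --confirm" on older clients
    openstack server start <server-uuid>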
Assuming you're using the Libvirt driver, a hackaround here is to cold migrate the instance to another host; this refreshes the connection_info from c-api and should allow the instance to boot correctly.
Indeed, I hadn't thought of this. If migration updates the block_device_mapping from the Cinder DB entry, I will not need to make any database modifications. Thank you for your help, it's greatly appreciated.

Jean-Philippe Méthot
OpenStack system administrator / Administrateur système OpenStack
PlanetHoster inc.
To follow up on my last email, I have tested the following hackaround in my testing environment:
Assuming you're using the Libvirt driver, a hackaround here is to cold migrate the instance to another host; this refreshes the connection_info from c-api and should allow the instance to boot correctly.
I can confirm that on OpenStack Queens, this doesn't work. OpenStack can't find the attachment ID, as it was marked as deleted in the Cinder DB. I'm guessing it was marked deleted when I ran cinder reset-state --attach-status detached b70b254f-58cd-4940-b976-6f4dc0209a8c. Even though I then ran cinder reset-state --attach-status attached b70b254f-58cd-4940-b976-6f4dc0209a8c, the original attachment ID did not undelete itself, nor did it update itself to the new path. As a result, Nova seems to look for the attachment ID when trying to migrate and errors out when it can't find it.

That said, I do remember using this workaround in the past on older versions of OpenStack, so this definitely used to work. I guess a change in Cinder probably broke this.

Jean-Philippe Méthot
OpenStack system administrator / Administrateur système OpenStack
PlanetHoster inc.
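For anyone hitting the same thing, a quick way to see what reset-state did to the attachment records (a sketch against the usual Cinder DB schema; the volume UUID is the one from above):

    mysql cinder -e "SELECT id, attach_status, deleted, mountpoint \
        FROM volume_attachment \
        WHERE volume_id='b70b254f-58cd-4940-b976-6f4dc0209a8c';"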
The "delete_on_termination" flag is recorded in the instance's BDM table (https://github.com/openstack/nova/blob/master/nova/db/sqlalchemy/models.py#L...), and it is handled by Nova when deleting the server (Nova cleans up the BDMs and calls cinderclient to delete the target volume). In Cinder, delete_on_termination is not recorded, so you cannot find this field in the Cinder DB.

Clean up BDMs: https://github.com/openstack/nova/blob/master/nova/compute/api.py#L2334-L237...
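For completeness, a quick way to inspect that flag per instance on the Nova side (a sketch, assuming the usual schema; the instance UUID is a placeholder):

    mysql nova -e "SELECT device_name, volume_id, delete_on_termination \
        FROM block_device_mapping \
        WHERE instance_uuid='<instance-uuid>' AND deleted=0;"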
brinzhang
participants (7)
- Albert Braden
- Brin Zhang(张百林)
- Eric Fried
- Jean-Philippe Méthot
- Lee Yarwood
- Tobias Urdin
- Tony Pearce