[openstack-dev] [cinder] making volume available without stopping VM

Volodymyr Litovka doka.ua at gmx.com
Tue Jun 26 09:42:06 UTC 2018


Hi Sean,

thanks for the response; my questions and comments are below.

On 6/25/18 9:42 PM, Sean McGinnis wrote:

> Not sure if it's an option for you, but in the Pike release support was added
> to be able to extend attached volumes. There are several caveats with this
> feature though. I believe it only works with libvirt, and if I remember right,
> only newer versions of libvirt. You need to have notifications working for Nova
> to pick up that Cinder has extended the volume.
The Pike release notes state the following: "It is now possible to signal 
and perform an online volume size change as of the 2.51 microversion 
using the volume-extended external event. Nova will perform the volume 
extension so the host can detect its new size. It will also resize the 
device in QEMU so instance can detect the new disk size without 
rebooting. Currently only the *libvirt compute driver with iSCSI and FC 
volumes supports the online volume size change*." And yes, it doesn't 
work for me since I'm using CEPH as the backend.
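
For reference, my understanding is that on a supported backend (iSCSI/FC) 
the whole thing is a single call against a new-enough Cinder API, roughly 
as follows (a sketch from memory: the microversion and exact CLI syntax 
may be off, and the size is just a placeholder):

$ cinder --os-volume-api-version 3.42 extend <volume-id> <new-size-in-GB>
# Cinder then notifies Nova via the "volume-extended" external event
# (compute API microversion 2.51), and Nova resizes the device in QEMU
# so the guest sees the new size without a reboot.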

The Queens release notes say nothing about changes here. The feature matrix 
(https://docs.openstack.org/nova/queens/user/support-matrix.html) says 
it's supported on libvirt/x86 without any further details. Does anybody 
know whether this feature is implemented in Queens for backends other 
than iSCSI and FC?

The specs mentioned earlier talk about how to make the result of a resize 
visible to the VM immediately, without restarting it, but that is not what 
I'm asking for. My question is how to resize a volume and have the new 
size become available after a restart; the flow I have in mind is sketched 
just below.
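
For clarity, the flow I'm testing looks roughly like this (just a sketch, 
using the volume from the rbd examples below; 30 GB is an arbitrary 
example size, and the reset-state calls are exactly the "forcing" Sean 
refers to):

$ cinder reset-state --state available 5474ca4f-40ad-4151-9916-d9b4e9de14eb
$ cinder extend 5474ca4f-40ad-4151-9916-d9b4e9de14eb 30
$ cinder reset-state --state in-use 5474ca4f-40ad-4151-9916-d9b4e9de14eb
# the guest then sees the new size only after a stop/start of the VM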

>> In fact, I'm ok with delayed resize (upon power-cycle), and it's not an
>> issue for me that VM don't detect changes immediately. What I want to
>> understand is that changes to Cinder (and, thus, underlying changes to CEPH)
>> are safe for VM while it's in active state.
> No, this is not considered safe. You are forcing the volume state to be
> available when it is in fact not.

In the general case I agree with you; for example, I can imagine that 
allocation of new blocks could fail if the volume is declared available. 
But in the particular case of CEPH:

- in short:
# the volume's status in Cinder means nothing to CEPH

- in detail:
# Cinder does the provisioning and maintenance
# kvm/libvirt talks to CEPH directly (after getting the endpoint from 
Cinder via Nova)
# and I see no change in CEPH's view of the volume while it is "available" 
in Cinder (a further check via rbd status is sketched after the outputs 
below):

* in-use:
$ rbd info volumes/volume-5474ca4f-40ad-4151-9916-d9b4e9de14eb
rbd image 'volume-5474ca4f-40ad-4151-9916-d9b4e9de14eb':
     size 20480 MB in 5120 objects
     order 22 (4096 kB objects)
     block_name_prefix: rbd_data.2414a7572c9f46
     format: 2
     features: layering, exclusive-lock, object-map, fast-diff, deep-flatten
     flags:
     create_timestamp: Mon Jun 25 10:47:03 2018
     parent: volumes/volume-42edf442-1dbb-4b6e-8593-1fbfbc821a1a@volume-5474ca4f-40ad-4151-9916-d9b4e9de14eb.clone_snap
     overlap: 3072 MB

* available:
$ rbd info volumes/volume-5474ca4f-40ad-4151-9916-d9b4e9de14eb
rbd image 'volume-5474ca4f-40ad-4151-9916-d9b4e9de14eb':
     size 20480 MB in 5120 objects
     order 22 (4096 kB objects)
     block_name_prefix: rbd_data.2414a7572c9f46
     format: 2
     features: layering, exclusive-lock, object-map, fast-diff, deep-flatten
     flags:
     create_timestamp: Mon Jun 25 10:47:03 2018
     parent: volumes/volume-42edf442-1dbb-4b6e-8593-1fbfbc821a1a@volume-5474ca4f-40ad-4151-9916-d9b4e9de14eb.clone_snap
     overlap: 3072 MB

# and, while copying data inside the guest, CEPH successfully allocates 
additional blocks to the volume:

* before copying (volume is already available in Cinder)
$ rbd du volumes/volume-5474ca4f-40ad-4151-9916-d9b4e9de14eb
NAME                                        PROVISIONED USED
volume-5474ca4f-40ad-4151-9916-d9b4e9de14eb      20480M *2256M*

* after copying (while volume is available in Cinder)
$ rbd du volumes/volume-5474ca4f-40ad-4151-9916-d9b4e9de14eb
NAME                                        PROVISIONED USED
volume-5474ca4f-40ad-4151-9916-d9b4e9de14eb      20480M *2560M*

# which is preserved after the volume goes back to in-use:
$ rbd du volumes/volume-5474ca4f-40ad-4151-9916-d9b4e9de14eb
NAME                                        PROVISIONED USED
volume-5474ca4f-40ad-4151-9916-d9b4e9de14eb      20480M *2560M*
$ rbd info volumes/volume-5474ca4f-40ad-4151-9916-d9b4e9de14eb
rbd image 'volume-5474ca4f-40ad-4151-9916-d9b4e9de14eb':
     size 20480 MB in 5120 objects
     order 22 (4096 kB objects)
     block_name_prefix: rbd_data.2414a7572c9f46
     format: 2
     features: layering, exclusive-lock, object-map, fast-diff, deep-flatten
     flags:
     create_timestamp: Mon Jun 25 10:47:03 2018
     parent: volumes/volume-42edf442-1dbb-4b6e-8593-1fbfbc821a1a@volume-5474ca4f-40ad-4151-9916-d9b4e9de14eb.clone_snap
     overlap: 3072 MB
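
For completeness, I would expect the watcher list to tell the same story, 
i.e. the compute host's librbd client keeps its watch on the image no 
matter what the Cinder status says (I haven't captured this output above, 
so treat it as an expectation, not a measurement):

$ rbd status volumes/volume-5474ca4f-40ad-4151-9916-d9b4e9de14eb
# should show the same watcher (the qemu/librbd client on the compute
# host) both while the volume is "in-use" and while it is forced to
# "available" in Cinder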

Actually, the only safety problem I see is a possible administrative 
race: since the volume is "available", a cloud administrator or any kind 
of automation could break dependencies. In a fully controlled environment 
(where nobody else can modify the volume, reattach it to another instance, 
or do anything else with it), what other kinds of problems could appear 
in this case?
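
One small safeguard I can think of (again, just a sketch): after flipping 
the state back to in-use, verify that the attachment record still points 
to the same instance and device, e.g.:

$ openstack volume show 5474ca4f-40ad-4151-9916-d9b4e9de14eb -c status -c attachments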

Thank you.

> You can get some details from the cinder spec:
>
> https://specs.openstack.org/openstack/cinder-specs/specs/pike/extend-attached-volume.html
>
> And the corresponding Nova spec:
>
> http://specs.openstack.org/openstack/nova-specs/specs/pike/implemented/nova-support-attached-volume-extend.html
>
> You may also want to read through the mailing list thread if you want to get in
> to some of the nitty gritty details behind why certain design choices were
> made:
>
> http://lists.openstack.org/pipermail/openstack-dev/2017-April/115292.html

-- 
Volodymyr Litovka
   "Vision without Execution is Hallucination." -- Thomas Edison
