[openstack-dev] [cinder] LVM snapshot performance issue -- why isn't thin provisioning the default?

Eric Harney eharney at redhat.com
Tue Sep 15 17:38:42 UTC 2015


On 09/15/2015 01:00 PM, Chris Friesen wrote:
> I'm currently trying to work around an issue where activating LVM
> snapshots created through cinder can take a long time (linearly
> related to the amount of data that differs between the original
> volume and the snapshot).  On one system I tested, it took about
> one minute per 25GB of data, so the worst-case boot delay can become
> significant.
> 
> According to Zdenek Kabelac on the LVM mailing list, LVM snapshots were
> not intended to be kept around indefinitely; they were supposed to be
> used only until the backup was taken and then deleted.  He recommends
> using thin provisioning for long-lived snapshots due to differences in
> how the metadata is maintained.  (He also says he's heard reports of
> volume activation taking half an hour, which is clearly crazy when
> instances are waiting to access their volumes.)
> 
> Given the above, is there any reason why we couldn't make thin
> provisioning the default?
> 


My intention is to move toward thin-provisioned LVM as the default -- it
is definitely better suited to our use of LVM.  Previously this was
harder, since some older Ubuntu platforms didn't support it, but in
Liberty we added the ability to specify lvm_type = "auto" [1] to use
thin if it is supported on the platform.
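
As a rough illustration of the "auto" behaviour described above, here is
a minimal sketch.  It is not the driver's actual code: the function name
and the thin_supported flag are hypothetical, and "default" stands for
thick LVM.

```python
def resolve_lvm_type(configured: str, thin_supported: bool) -> str:
    """Return the effective LVM type for a backend (illustrative only).

    `configured` is the value of lvm_type; `thin_supported` is a
    hypothetical flag saying whether the platform's LVM can do thin
    provisioning.
    """
    if configured == "auto":
        # Use thin when the platform supports it, else fall back to thick.
        return "thin" if thin_supported else "default"
    # An explicit setting is used as-is.
    return configured
```

With this sketch, resolve_lvm_type("auto", False) falls back to
"default", which is the case the older Ubuntu platforms would hit.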

The other issue preventing us from using thin by default is that we
default the max oversubscription ratio to 20.  IMO that isn't a safe
thing to do for the reference implementation, since it means that people
who deploy Cinder LVM on smaller storage configurations can easily fill
up their volume group and have things grind to a halt.  I think we want something
closer to the semantics of thick LVM for the default case.
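
To make the concern concrete, here is a simplified model of what a ratio
of 20 allows.  This is only a sketch under stated assumptions: the
function names are hypothetical, and it ignores reserved space and
actual free-space checks that a real scheduler would also consider.

```python
def max_provisioned_gb(total_gb: float, ratio: float) -> float:
    """Upper bound on virtual capacity that may be handed out."""
    return total_gb * ratio


def can_provision(total_gb: float, provisioned_gb: float,
                  requested_gb: float, ratio: float) -> bool:
    """True if the request keeps provisioned capacity within the bound."""
    return provisioned_gb + requested_gb <= max_provisioned_gb(total_gb, ratio)


# A 100 GB volume group with ratio 20 accepts up to 2000 GB of volumes,
# far more than it can physically hold once they fill up.
print(max_provisioned_gb(100, 20))

# With ratio 1.0 the same check behaves like thick LVM: nothing beyond
# the physical 100 GB is accepted.
print(can_provision(100, 90, 20, 1.0))
```

In this model a ratio of 1.0 gives the thick-like semantics I mean,
while 20 lets a small volume group be promised twenty times over.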

We haven't thought through a reasonable migration strategy for how to
handle that.  I'm not sure we can change the default oversubscription
ratio without breaking deployments using other drivers.  (Maybe I'm
wrong about this?)

If we sort out that issue, I don't see any reason we can't switch over
in Mitaka.

[1] https://review.openstack.org/#/c/104653/


