[Openstack-operators] Cinder 10.0.4 (latest Ocata) broken for ceph/rbd
arne.wiebalck at cern.ch
Thu Aug 3 20:24:43 UTC 2017
For a sufficiently large number of volumes, the thin provisioning stats gathering could already break things
before the referenced patch:
It seems, however, that the attempt to gather at least the correct data (used instead of allocated) lowers that
threshold even further.
To allow our c-vol to start (and as we don’t use over-provisioning), we have commented out the
usage stats gathering for now.
> On 03 Aug 2017, at 20:47, Mike Lowe <jomlowe at iu.edu> wrote:
> I did the minor point release update from 10.0.2 to 10.0.4 and found my cinder volume services would go out to lunch during startup. They would do their initial heartbeat and then get marked as dead, never sending another heartbeat. The process was running and there were constant logs about ceph connections, but what was missing was the follow-up to "Initializing RPC dependent components of volume driver RBDDriver (1.2.0)". It never finished the RPC init with "Driver post RPC initialization completed successfully."
> Digging in a little bit with my limited knowledge of the Python librbd bindings, it seems that this commit landed in 10.0.4: https://github.com/openstack/cinder/commit/e72dead5ce085a6ba66f7aad2ff58061842f43d2 Instead of looping over every volume and adding up its size, it now loops over all the volumes, calling diff_iterate from offset 0 to the end. As near as I can tell, this calls whatever you pass in as iterate_cb for every used extent of the volume. So a handful of empty volumes is no problem, but in production, by my count, I would have to call iterate_cb 12.6M times just to add up the bytes used from each extent.
> I’ve filed a bug, https://bugs.launchpad.net/cinder/+bug/1708507, and downgrading to 10.0.2 seems to be an OK workaround.
> TL;DR: if you have ceph, don’t upgrade past 10.0.2 for the time being.
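The per-extent callback cost described above can be sketched roughly as follows. This is not the Cinder driver code: `FakeImage`, `used_bytes` and `pool_usage` are hypothetical names made up for illustration, and `FakeImage` only mimics the shape of the librbd Python call `diff_iterate(offset, length, from_snapshot, iterate_cb)`, whose callback receives `(offset, length, exists)`. It assumes Python 3 and needs no Ceph cluster.

```python
class FakeImage:
    """Hypothetical stand-in for an RBD image (not the real rbd.Image)."""

    def __init__(self, size, extents):
        self._size = size
        self._extents = extents  # list of (offset, length, exists) tuples

    def size(self):
        return self._size

    def diff_iterate(self, offset, length, from_snapshot, iterate_cb):
        # Like the call described above: one callback per extent in range.
        for ext_offset, ext_length, exists in self._extents:
            if offset <= ext_offset < offset + length:
                iterate_cb(ext_offset, ext_length, exists)


def used_bytes(image):
    """Sum the used bytes of one image; also count the callbacks made."""
    total = 0
    calls = 0

    def iterate_cb(offset, length, exists):
        nonlocal total, calls
        calls += 1
        if exists:
            total += length

    # From offset 0 to the end of the image, with no starting snapshot.
    image.diff_iterate(0, image.size(), None, iterate_cb)
    return total, calls


def pool_usage(images):
    """Add up used bytes and callback counts across all images."""
    total = 0
    calls = 0
    for image in images:
        image_total, image_calls = used_bytes(image)
        total += image_total
        calls += image_calls
    return total, calls
```

The number of Python callbacks grows with the number of used extents across all volumes rather than with the number of volumes, which is consistent with a pool of well-filled volumes taking very long to report while a handful of empty ones is no problem.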