[openstack-dev] [cinder][nova] proper syncing of cinder volume state
Duncan Thomas
duncan.thomas at gmail.com
Mon Dec 1 13:30:25 UTC 2014
John:
States where the driver can/should do some cleanup work during the
transition:
attaching -> available or error
detaching -> available or error
error -> available or error
deleting -> deleted or error_deleting
Also possibly wanted in the future, but much harder:
backing_up -> available or error (need to make sure the backup service
copes)
restoring -> error (need to make sure the backup service copes)
I haven't looked at the entire state space yet; these are the obvious ones
off the top of my head.
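
As a rough sketch of how the first four transitions above might be written
down as a table: the names RESET_OUTCOMES, reset_state and driver_cleanup are
made up for illustration and are not existing Cinder code. It also assumes
that the first state listed is the outcome when driver cleanup succeeds and
the second is the outcome when it fails, which the list itself does not spell
out.

```python
# Hypothetical sketch, not Cinder code.
# Maps a stuck state to (state if cleanup succeeds, state if cleanup fails).
RESET_OUTCOMES = {
    "attaching": ("available", "error"),
    "detaching": ("available", "error"),
    "error": ("available", "error"),
    "deleting": ("deleted", "error_deleting"),
}


def reset_state(current, driver_cleanup):
    """Return the state a volume ends up in after a driver-assisted reset.

    `driver_cleanup` is a stand-in callable for whatever the driver would do;
    it returns True on success. An exception is treated as a failure.
    """
    if current not in RESET_OUTCOMES:
        # Anything outside the table is refused rather than guessed at.
        raise ValueError("no driver-assisted reset defined for %r" % current)
    on_success, on_failure = RESET_OUTCOMES[current]
    try:
        ok = bool(driver_cleanup())
    except Exception:
        ok = False
    return on_success if ok else on_failure
```

For example, reset_state("detaching", cleanup) would give "available" if the
hypothetical cleanup callable succeeds and "error" if it does not, while a
state such as "in-use" is rejected outright.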
On 1 December 2014 at 06:30, John Griffith <john.griffith8 at gmail.com> wrote:
> On Fri, Nov 28, 2014 at 11:25 AM, D'Angelo, Scott <scott.dangelo at hp.com>
> wrote:
> > A Cinder blueprint has been submitted to allow the python-cinderclient to
> > involve the back end storage driver in resetting the state of a cinder
> > volume:
> >
> > https://blueprints.launchpad.net/cinder/+spec/reset-state-with-driver
> >
> > and the spec:
> >
> > https://review.openstack.org/#/c/134366
> >
> >
> >
> > This blueprint contains various use cases for a volume that may be listed in
> > the Cinder DataBase in state detaching|attaching|creating|deleting.
> >
> > The Proposed solution involves augmenting the python-cinderclient command
> > ‘reset-state’, but other options are listed, including those that
> > involve Nova, since the state of a volume in the Nova XML found in
> > /etc/libvirt/qemu/<instance_id>.xml may also be out-of-sync with the
> > Cinder DB or storage back end.
> >
> >
> >
> > A related proposal for adding a new non-admin API for changing volume status
> > from ‘attaching’ to ‘error’ has also been proposed:
> >
> > https://review.openstack.org/#/c/137503/
> >
> >
> >
> > Some questions have arisen:
> >
> > 1) Should ‘reset-state’ command be changed at all, since it was originally
> > just to modify the Cinder DB?
> >
> > 2) Should ‘reset-state’ be fixed to prevent the naïve admin from changing
> > the CinderDB to be out-of-sync with the back end storage?
> >
> > 3) Should ‘reset-state’ be kept the same, but augmented with new options?
> >
> > 4) Should a new command be implemented, with possibly a new admin API to
> > properly sync state?
> >
> > 5) Should Nova be involved? If so, should this be done as a separate body of
> > work?
> >
> >
> >
> > This has proven to be a complex issue and there seems to be a good bit of
> > interest. Please provide feedback, comments, and suggestions.
> >
> >
> > _______________________________________________
> > OpenStack-dev mailing list
> > OpenStack-dev at lists.openstack.org
> > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
> >
>
> Hey Scott,
>
> Thanks for posting this to the ML, I stated my opinion on the spec,
> but for completeness:
> My feeling is that reset-state has morphed into something entirely
> different than originally intended. That's actually great, nothing
> wrong there at all. I strongly disagree with the statements that
> "setting the status in the DB only is almost always the wrong thing to
> do". The whole point was to allow the state to be changed in the DB
> so the item could in most cases be deleted. There was never an intent
> (that I'm aware of) to make this some sort of uber resync and heal API
> call.
>
> All of that history aside, I think it would be great to add some
> driver interaction here. I am however very unclear on what that would
> actually include. For example, would you let a Volume's state be
> changed from "Error-Attaching" to "In-Use" and just run through the
> process of retrying an attach? To me that seems like a bad idea. I'm
> much happier with the current state of changing the state from "Error"
> to "Available" (and NOTHING else) so that an operation can be retried,
> or the resource can be deleted. If you start allowing any state
> transition (which sadly we've started to do) you're almost never going
> to get things correct. This also covers almost every situation even
> though it means you have to explicitly retry operations or steps (I
> don't think that's a bad thing) and make the code significantly more
> robust IMO (we have some issues lately with things being robust).
>
> My proposal would be to go back to limiting the things you can do with
> reset-state (basically make it so you can only release the resource back
> to available) and add the driver interaction to clean up any mess if
> possible. This could be a simple driver call added like
> "make_volume_available" whereby the driver just ensures that there are
> no attachments and.... well; honestly nothing else comes to mind as
> being something the driver cares about here. The final option then
> being to add some more power to force-delete.
>
> Is there anything other than attach that matters from a driver? If
> people are talking error-recovery that to me is a whole different
> topic and frankly I think we need to spend more time preventing errors
> as opposed to trying to recover from them via new API calls.
>
> Curious to see if any other folks have input here?
>
> John
>
--
Duncan Thomas