Open Stack

Wed Apr 12 17:30:31 UTC 2017

Thanks for starting this discussion. There is a lot to cover/answer.

On Tue, Apr 11, 2017 at 6:35 PM, Matt Riedemann <mriedemos at gmail.com> wrote:
>
> This is not discoverable at the moment, for the end user or cinder, so I'm
> trying to figure out what the failure mode looks like.
>
> This all starts on the cinder side to extend the size of the attached
> volume. Cinder is going to have to see if Nova is new enough to handle this
> (via the available API versions) before accepting the request and resizing
> the volume. Then Cinder sends the event to Nova. This is where it gets
> interesting.
>
> On the Nova side, if all of the computes aren't new enough, we could just
> fail the request outright with a 409. What does Cinder do then? Rollback the
> volume resize?

This means an extend volume operation would need to check for Nova
support first.
This also means adding a new API call to fetch and discover such
capabilities per instance (from the associated compute node).
If we want to catch errors in volume size extension in Nova, we will
need to find another way, since external events are async.
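
To make the "check for Nova support first" step concrete, here is a
rough sketch of what the gate could look like on the Cinder side. It is
only an illustration: the function names and the required version passed
in are made up for this example and are not existing Cinder or Nova
code. The one real point is that microversions have to be compared
numerically, not as strings.

```python
def parse_microversion(value):
    """Turn a microversion string such as "2.42" into a comparable tuple."""
    major, minor = value.split(".")
    return int(major), int(minor)


def nova_supports_extend(nova_max_version, required_version):
    """Return True if Nova's maximum microversion is new enough.

    required_version is a placeholder supplied by the caller; this sketch
    does not assume which microversion would carry the new event.
    """
    return parse_microversion(nova_max_version) >= parse_microversion(required_version)


def accept_extend_request(volume_attached, nova_max_version, required_version):
    """Hypothetical pre-check run before resizing a volume.

    A detached volume does not involve Nova at all, so only attached
    volumes are gated on the Nova API version.
    """
    if not volume_attached:
        return True
    return nova_supports_extend(nova_max_version, required_version)
```

Note that this only covers the "is Nova new enough" question. It says
nothing about whether the compute node hosting the instance supports
the operation, which is the harder part.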

> But let's say the computes are new enough, but the instance is on a compute
> that does not support the operation. Then what? Do we register an instance
> fault and put the instance into ERROR state? Then the admin would need to
> intervene.
>
> Are there other ideas? Until we have capabilities (info) exposed out of the
> API we're stuck with questions like this.
>

Like TommyLike mentioned in a review, AWS introduced Live Volume
Modifications available on some instance types.
On instance types with limited support, you need to stop/start the
instance or detach/attach the volume.
On instances started before a certain date, you need to stop/start the
instance or detach/attach the volume at least once.
In all cases, the end user needs to extend the partition/filesystem in
the instance.

They have the luxury to fully control the environment and synchronize
the compute service with the volume service.
They may even (speculatively) have bidirectional
orchestration/synchronization/communication between the two.

I have that same luxury since I only support one volume backend and
virt driver combination.
But I now start to grasp the extent of what adding such a feature
requires, especially when it implies cross-service support...

We have a matrix of compute drivers and volume backends to support
with some combinations which might never support online volume
extension.
There is the desire for OpenStack to be interoperable between clouds
so there is a strong incentive to make it work for all combinations.

I will still take the liberty to ask:

Would it be in the realm of possibilities for a deployer to have to
explicitly enable this feature?
A deployer would be able to enable such a feature once all
services/components it chooses to deploy fully support online volume
extension.

I know it won't address cases where a mix of volume backends and
virt drivers is deployed.
So we would still need capabilities discoverability. This includes
volume type capabilities discoverability which I'm not sure exists
today.
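
As a rough sketch of how the two ideas could combine (a deployer
opt-in plus a per-combination capability lookup), something like the
following is what I have in mind. Everything here is hypothetical: the
flag, the function name, and the driver/backend names in the matrix are
placeholders and say nothing about what any real driver or backend
supports.

```python
# Placeholder capability matrix keyed by (virt driver, volume backend).
# The names and values are invented for illustration only.
EXAMPLE_MATRIX = {
    ("driver_a", "backend_x"): True,
    ("driver_a", "backend_y"): False,
    ("driver_b", "backend_x"): False,
}


def can_extend_online(feature_enabled, virt_driver, volume_backend, matrix):
    """Decide whether an online extend should be attempted.

    feature_enabled stands in for a deployer-controlled option that is
    off by default. Unknown combinations are treated as unsupported.
    """
    if not feature_enabled:
        return False
    return matrix.get((virt_driver, volume_backend), False)
```

With the flag off, nothing changes for existing deployments. With it
on, a mixed deployment still needs the lookup, which is why capability
discovery does not go away.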

Let's not get started on how Horizon will discover such capabilities per
instance/volume. That's another can of worms. =)

--
Mathieu

[openstack-dev] [nova][cinder] Can all non-Ironic compute drivers handle volume size extension?
