[Openstack-operators] [nova][cinder] Is there interest in an admin-api to refresh volume connection info?
Matt Riedemann
mriedemos at gmail.com
Fri Jun 9 04:15:31 UTC 2017
On 6/8/2017 1:39 PM, melanie witt wrote:
> On Thu, 8 Jun 2017 08:58:20 -0500, Matt Riedemann wrote:
>> Nova stores the output of the Cinder os-initialize_connection info API
>> in the Nova block_device_mappings table, and uses that later for
>> making volume connections.
>>
>> This data can get out of whack or need to be refreshed, like if your
>> ceph server IP changes, or you need to recycle some secret uuid for
>> your ceph cluster.
>>
>> I think the only ways to do this on the nova side today are via volume
>> detach/re-attach, reboot, migrations, etc - all of which, except live
>> migration, are disruptive to the running guest.
>
> I believe the only way to work around this currently is by doing a 'nova
> shelve' followed by a 'nova unshelve'. That will end up querying the
> connection_info from Cinder and updating the block device mapping record
> for the instance. Maybe detach/re-attach would work too but I can't
> remember trying it.
Shelve has its own fun set of problems, like the fact that it doesn't
terminate the connection to the volume backend on shelve. Maybe that's
not a problem for Ceph, I don't know. You do potentially end up on
another host though, and it's a full delete and spawn of the guest on
that other host. Definitely disruptive.
>
>> I've kicked around the idea of adding some sort of admin API interface
>> for refreshing the BDM.connection_info on-demand if needed by an
>> operator. Does anyone see value in this? Are operators doing stuff
>> like this already, but maybe via direct DB updates?
>>
>> We could have something in the compute API which calls down to the
>> compute for an instance and has it refresh the connection_info from
>> Cinder and updates the BDM table in the nova DB. It could be an admin
>> action API, or part of the os-server-external-events API, like what we
>> have for the 'network-changed' event sent from Neutron which nova uses
>> to refresh the network info cache.
>>
>> Other ideas or feedback here?
>
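To make the external-events option above a bit more concrete, here is a rough sketch of what the request body might look like. The "events" list with "name", "server_uuid" and "tag" keys follows the shape of the existing os-server-external-events API; the event name "volume-connection-refresh", the helper function and the UUIDs are all made up for illustration and do not exist in nova.

```python
import json

# Hypothetical event name: nova does not define this event. It only
# illustrates how a refresh request could ride on the same API that
# Neutron uses for 'network-changed'.
HYPOTHETICAL_EVENT = "volume-connection-refresh"


def build_external_event(name, server_uuid, tag=None):
    """Build a request body shaped like POST /os-server-external-events."""
    event = {"name": name, "server_uuid": server_uuid}
    if tag is not None:
        # For a volume event the tag would plausibly carry the volume ID.
        event["tag"] = tag
    return {"events": [event]}


# Placeholder UUIDs, not real resources.
body = build_external_event(
    HYPOTHETICAL_EVENT,
    server_uuid="11111111-2222-3333-4444-555555555555",
    tag="aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee",
)
print(json.dumps(body, indent=2))
```

An admin action API would look different on the wire, but the compute-side work (ask Cinder for fresh connection_info, update the BDM) would presumably be the same either way.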
> We've discussed this a few times before and we were thinking it might be
> best to handle this transparently and just do a connection_info refresh
> + record update inline with the request flows that will end up reading
> connection_info from the block device mapping records. That way,
> operators won't have to intervene when connection_info changes.
The thing that sucks about this is that we'd be refreshing something
that rarely changes on every volume-related operation on the instance.
That seems like a lot of overhead to me (nova/cinder API interactions,
Cinder interactions with the volume backend, nova-compute round trips
to conductor and the DB to update the BDM table, etc).
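A toy sketch of the overhead concern, with everything in memory: FakeCinder stands in for the Cinder os-initialize_connection call, the BDMs are plain dicts rather than nova objects, and the connection_info layout is only illustrative. None of these names are real nova or cinder code.

```python
import copy


class FakeCinder:
    """Hypothetical stand-in for Cinder; counts how often it is called."""

    def __init__(self, connection_info_by_volume):
        self.connection_info_by_volume = connection_info_by_volume
        self.calls = 0

    def initialize_connection(self, volume_id, connector):
        self.calls += 1
        return copy.deepcopy(self.connection_info_by_volume[volume_id])


def refresh_connection_info(bdms, cinder, connector):
    """Refresh each BDM from Cinder; return the volume IDs that changed."""
    changed = []
    for bdm in bdms:
        fresh = cinder.initialize_connection(bdm["volume_id"], connector)
        if fresh != bdm["connection_info"]:
            # In nova this would be a BDM update through conductor to the DB.
            bdm["connection_info"] = fresh
            changed.append(bdm["volume_id"])
    return changed


# Cinder knows the new monitor IP; the stored BDM still has the old one.
cinder = FakeCinder({
    "vol-1": {"driver_volume_type": "rbd",
              "data": {"hosts": ["10.0.0.5"], "ports": ["6789"]}},
})
bdms = [{
    "volume_id": "vol-1",
    "connection_info": {"driver_volume_type": "rbd",
                        "data": {"hosts": ["10.0.0.1"], "ports": ["6789"]}},
}]
connector = {"host": "compute-1"}

first = refresh_connection_info(bdms, cinder, connector)   # picks up the change
second = refresh_connection_info(bdms, cinder, connector)  # nothing changed
print(first, second, cinder.calls)
```

The second call finds nothing to update but still costs a full round trip to Cinder, which is the part that worries me if it happens on every operation rather than on demand.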
>
> At least in the case of Ceph, as long as a guest is running, it will
> continue to work fine if the monitor IPs or secrets change because it
> will continue to use its existing connection to the Ceph cluster. Things
> go wrong when an instance action such as resize, stop/start, or reboot
> is done because when the instance is taken offline and being brought
> back up, the stale connection_info is read from the block_device_mapping
> table and injected into the instance, and so it loses contact with the
> cluster. If we query Cinder and update the block_device_mapping record
> at the beginning of those actions, the instance will get the new
> connection_info.
>
> -melanie
>
>
--
Thanks,
Matt