[openstack-dev] [Openstack-operators] [nova][cinder] Is there interest in an admin-api to refresh volume connection info?
melanie witt
melwittt at gmail.com
Thu Jun 8 18:39:18 UTC 2017
On Thu, 8 Jun 2017 08:58:20 -0500, Matt Riedemann wrote:
> Nova stores the output of the Cinder os-initialize_connection info API
> in the Nova block_device_mappings table, and uses that later for making
> volume connections.
>
> This data can get out of whack or need to be refreshed, e.g. if your
> Ceph server IP changes, or you need to recycle a secret uuid for your
> Ceph cluster.
>
> I think the only ways to do this on the nova side today are via volume
> detach/re-attach, reboot, migrations, etc - all of which, except live
> migration, are disruptive to the running guest.
I believe the only way to work around this currently is by doing a 'nova
shelve' followed by a 'nova unshelve'. That will end up querying the
connection_info from Cinder and updating the block device mapping record
for the instance. Maybe detach/re-attach would work too, but I can't
remember trying it.
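For anyone wanting to script that workaround, here is a minimal sketch
using python-novaclient. The credentials and the server name 'myserver'
are placeholders, and it assumes shelved_offload_time=0 (the default)
so the instance offloads right away:

    import time

    from keystoneauth1 import identity, session
    from novaclient import client

    auth = identity.Password(auth_url='http://keystone:5000/v3',
                             username='admin', password='secret',
                             project_name='admin',
                             user_domain_id='default',
                             project_domain_id='default')
    sess = session.Session(auth=auth)
    nova = client.Client('2.1', session=sess)

    server = nova.servers.find(name='myserver')
    server.shelve()
    # Wait for the instance to be offloaded before unshelving.
    while nova.servers.get(server.id).status != 'SHELVED_OFFLOADED':
        time.sleep(5)
    # Unshelve re-spawns the instance, which re-queries Cinder for
    # fresh connection_info and updates the BDM record.
    server.unshelve()

Keep in mind this is disruptive: the guest is torn down and re-spawned.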
> I've kicked around the idea of adding some sort of admin API interface
> for refreshing the BDM.connection_info on-demand if needed by an
> operator. Does anyone see value in this? Are operators doing stuff like
> this already, but maybe via direct DB updates?
>
> We could have something in the compute API which calls down to the
> compute for an instance and has it refresh the connection_info from
> Cinder and update the BDM table in the nova DB. It could be an admin
> action API, or part of the os-server-external-events API, like what we
> have for the 'network-changed' event sent from Neutron which nova uses
> to refresh the network info cache.
>
> Other ideas or feedback here?
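(For reference, the 'network-changed' event mentioned above is what
Neutron sends through the os-server-external-events API. A hypothetical
'volume-changed' analogue, which doesn't exist today, could mirror that
shape, re-using the novaclient session from the earlier sketch:

    # 'volume-changed' is a made-up event name for illustration;
    # only events like 'network-changed' exist today.
    nova.server_external_events.create([{
        'server_uuid': server_uuid,
        'name': 'volume-changed',
        'tag': volume_id,  # identifies which volume to refresh
    }])
)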
We've discussed this a few times before and we were thinking it might be
best to handle this transparently and just do a connection_info refresh
+ record update inline with the request flows that will end up reading
connection_info from the block device mapping records. That way,
operators won't have to intervene when connection_info changes.
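To make that concrete, here is a rough sketch of what an inline refresh
helper might look like in the compute manager. The helper name is made
up, but the object and volume API calls are the ones nova already uses:

    from oslo_serialization import jsonutils

    from nova import objects
    from nova.volume import cinder

    def _refresh_bdm_connection_info(context, instance, connector):
        # Hypothetical helper: re-query Cinder for each volume BDM
        # and persist the fresh connection_info before anything
        # reads it again.
        volume_api = cinder.API()
        bdms = objects.BlockDeviceMappingList.get_by_instance_uuid(
            context, instance.uuid)
        for bdm in bdms:
            if not bdm.is_volume:
                continue
            # os-initialize_connection returns up-to-date info, e.g.
            # new Ceph monitor IPs or a recycled secret uuid.
            connection_info = volume_api.initialize_connection(
                context, bdm.volume_id, connector)
            bdm.connection_info = jsonutils.dumps(connection_info)
            bdm.save()

One caveat: calling os-initialize_connection again should be harmless
for Ceph, but other backends might not expect repeated initialize calls
without a terminate in between, so that would need checking per driver.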
At least in the case of Ceph, as long as a guest is running, it will
continue to work fine if the monitor IPs or secrets change because it
will continue to use its existing connection to the Ceph cluster. Things
go wrong when an instance action such as resize, stop/start, or reboot
is performed: when the instance is taken offline and brought back up,
the stale connection_info is read from the block_device_mapping table
and injected into the instance, so it loses contact with the
cluster. If we query Cinder and update the block_device_mapping record
at the beginning of those actions, the instance will get the new
connection_info.
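Concretely, the hook could sit at the top of each of those flows.
Modeled on the existing _power_on flow in the compute manager and the
sketch above, illustrative only:

    def _power_on(self, context, instance):
        network_info = self.network_api.get_instance_nw_info(
            context, instance)

        # Refresh connection_info from Cinder before we read the
        # possibly-stale copy out of the block_device_mapping table.
        connector = self.driver.get_volume_connector(instance)
        _refresh_bdm_connection_info(context, instance, connector)

        block_device_info = self._get_instance_block_device_info(
            context, instance)
        self.driver.power_on(context, instance,
                             network_info, block_device_info)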
-melanie