[Openstack-operators] [nova][cinder] Is there interest in an admin-api to refresh volume connection info?

Matt Riedemann mriedemos at gmail.com
Fri Jun 9 04:15:31 UTC 2017


On 6/8/2017 1:39 PM, melanie witt wrote:
> On Thu, 8 Jun 2017 08:58:20 -0500, Matt Riedemann wrote:
>> Nova stores the output of the Cinder os-initialize_connection info API 
>> in the Nova block_device_mappings table, and uses that later for 
>> making volume connections.
>>
>> This data can get out of whack or need to be refreshed, like if your 
>> ceph server IP changes, or you need to recycle some secret uuid for 
>> your ceph cluster.
>>
>> I think the only ways to do this on the nova side today are via volume 
>> detach/re-attach, reboot, migrations, etc - all of which, except live 
>> migration, are disruptive to the running guest.
> 
> I believe the only way to work around this currently is by doing a 'nova 
> shelve' followed by a 'nova unshelve'. That will end up querying the 
> connection_info from Cinder and updating the block device mapping record 
> for the instance. Maybe detach/re-attach would work too but I can't 
> remember trying it.

Shelve has its own fun set of problems, like the fact that it doesn't 
terminate the connection to the volume backend on shelve. Maybe that's 
not a problem for Ceph, I don't know. You do end up on another host 
though potentially, and it's a full delete and spawn of the guest on 
that other host. Definitely disruptive.

> 
>> I've kicked around the idea of adding some sort of admin API interface 
>> for refreshing the BDM.connection_info on-demand if needed by an 
>> operator. Does anyone see value in this? Are operators doing stuff 
>> like this already, but maybe via direct DB updates?
>>
>> We could have something in the compute API which calls down to the 
>> compute for an instance and has it refresh the connection_info from 
>> Cinder and updates the BDM table in the nova DB. It could be an admin 
>> action API, or part of the os-server-external-events API, like what we 
>> have for the 'network-changed' event sent from Neutron which nova uses 
>> to refresh the network info cache.
>>
>> Other ideas or feedback here?
> 
> We've discussed this a few times before and we were thinking it might be 
> best to handle this transparently and just do a connection_info refresh 
> + record update inline with the request flows that will end up reading 
> connection_info from the block device mapping records. That way, 
> operators won't have to intervene when connection_info changes.

The thing that sucks about this is that we'd be refreshing something 
that rarely changes on every volume-related operation on the instance. 
That seems like a lot of overhead to me (nova/cinder API interactions, 
Cinder interactions with the volume backend, nova-compute round trips 
to conductor and the DB to update the BDM table, etc).
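
To make that cost a bit more concrete, here is a minimal Python sketch of a
refresh pass. Everything in it is a stand-in: FakeCinder,
refresh_connection_info, and the dict-shaped BDM records are hypothetical
names for illustration, not nova or python-cinderclient code, and it assumes
connection_info is kept as a JSON string on the BDM record. It makes one
Cinder call per attached volume and only rewrites the record when the data
actually changed.

```python
import json


class FakeCinder:
    """In-memory stand-in for Cinder (hypothetical, not the real client)."""

    def __init__(self, info_by_volume):
        self.info_by_volume = info_by_volume
        self.calls = 0  # counts simulated API round trips

    def initialize_connection(self, volume_id, connector):
        self.calls += 1
        return self.info_by_volume[volume_id]


def refresh_connection_info(bdms, cinder, connector):
    """Refresh connection_info on volume-backed BDMs.

    Returns the volume IDs whose stored record was rewritten.
    """
    changed = []
    for bdm in bdms:
        volume_id = bdm.get("volume_id")
        if volume_id is None:
            continue  # not a volume-backed mapping
        fresh = cinder.initialize_connection(volume_id, connector)
        raw = bdm.get("connection_info")
        stored = json.loads(raw) if raw else None
        if fresh != stored:
            # Only touch the (simulated) DB record on a real change.
            bdm["connection_info"] = json.dumps(fresh, sort_keys=True)
            changed.append(volume_id)
    return changed


if __name__ == "__main__":
    cinder = FakeCinder({"vol-1": {"hosts": ["10.0.0.2"]}})
    bdms = [{"volume_id": "vol-1",
             "connection_info": json.dumps({"hosts": ["10.0.0.1"]})}]
    connector = {"host": "compute-1"}
    print(refresh_connection_info(bdms, cinder, connector))
    print(refresh_connection_info(bdms, cinder, connector))
    print(cinder.calls)
```

Note that the second pass still costs a Cinder call even though nothing
changed; skipping the write only saves the DB update, not the API round trip.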

> 
> At least in the case of Ceph, as long as a guest is running, it will 
> continue to work fine if the monitor IPs or secrets change because it 
> will continue to use its existing connection to the Ceph cluster. Things 
> go wrong when an instance action such as resize, stop/start, or reboot 
> is done because when the instance is taken offline and being brought 
> back up, the stale connection_info is read from the block_device_mapping 
> table and injected into the instance, and so it loses contact with the 
> cluster. If we query Cinder and update the block_device_mapping record 
> at the beginning of those actions, the instance will get the new 
> connection_info.
> 
> -melanie
> 
> 

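The narrower version of that idea, refreshing only at the start of the
actions that take the guest offline and bring it back up, could be sketched
roughly as below. The action names come from the list quoted above;
REFRESH_ACTIONS, run_action, and the fetch_connection_info callable are made
up for illustration and are not nova code.

```python
# Hypothetical sketch: refresh connection_info only before actions that
# restart the guest, and leave every other operation on the stored copy.
REFRESH_ACTIONS = frozenset({"resize", "stop", "start", "reboot"})


def run_action(action, bdm, fetch_connection_info):
    """Return the connection_info the guest would be brought up with.

    fetch_connection_info stands in for a call to Cinder; it takes a
    volume ID and returns the current connection_info.
    """
    if action in REFRESH_ACTIONS:
        # Replace the stored (possibly stale) copy before using it.
        bdm["connection_info"] = fetch_connection_info(bdm["volume_id"])
    return bdm["connection_info"]
```

That would keep the extra Cinder round trip off operations that never reread
connection_info for a running guest.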

-- 

Thanks,

Matt


