[openstack-dev] [nova][neutron][SR-IOV] Hardware changes and shifting PCI addresses

Daniel P. Berrange berrange at redhat.com
Tue Sep 15 08:29:04 UTC 2015


On Mon, Sep 14, 2015 at 09:34:31PM -0400, Jay Pipes wrote:
> On 09/10/2015 05:23 PM, Brent Eagles wrote:
> >Hi,
> >
> >I was recently informed of a situation that came up when an engineer
> >added an SR-IOV nic to a compute node that was hosting some guests that
> >had VFs attached. Unfortunately, adding the card shuffled the PCI
> >addresses causing some degree of havoc. Basically, the PCI addresses
> >associated with the previously allocated VFs were no longer valid.
> >
> >I tend to consider this a non-issue. The expectation that hosts have
> >relatively static hardware configuration (and kernel/driver configs for
> >that matter) is the price you pay for having pets with direct hardware
> >access. That being said, this did come as a surprise to some of those
> >involved and I don't think we have any messaging around this or advice
> >on how to deal with situations like this.
> >
> >So what should we do? I can't quite see altering OpenStack to deal with
> >this situation (or even how that could work). Has anyone done any
> >research into this problem, even if it is how to recover or extricate
> >a guest that is no longer valid? It seems that at the very least we
> >could use some stern warnings in the docs.
> 
> Hi Brent,
> 
> Interesting issue. We have code in the PCI tracker that ostensibly handles
> this problem:
> 
> https://github.com/openstack/nova/blob/master/nova/pci/manager.py#L145-L164
> 
> But the note from yjiang5 is telling:
> 
> # Pci properties may change while assigned because of
> # hotplug or config changes. Although normally this should
> # not happen.
> # As the devices have been assigned to a instance, we defer
> # the change till the instance is destroyed. We will
> # not sync the new properties with database before that.
> # TODO(yjiang5): Not sure if this is a right policy, but
> # at least it avoids some confusion and, if
> # we can add more action like killing the instance
> # by force in future.
> 
> Basically, if the PCI device tracker notices that an instance is assigned a
> PCI device with an address that no longer exists in the PCI device addresses
> returned from libvirt, it will (eventually, in the _free_instance() method)
> remove the PCI device assignment from the Instance object, but it will make
> no attempt to assign a new PCI device that meets the original PCI device
> specification in the launch request.
> 
> Should we handle this case and attempt a "hot re-assignment of a PCI
> device"? Perhaps. Is it high priority? Not really, IMHO.

Hotplugging new PCI devices into a running host should not have any impact
on existing PCI device addresses - it merely adds new addresses for the
new devices, and existing devices are unchanged. So everything should "just
work" in that case. IIUC, Brent's question was about turning off the host and
cold-plugging/unplugging hardware, which /is/ liable to arbitrarily
re-arrange existing PCI device addresses.
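
To illustrate the cold-plug case, here is a minimal sketch of how an operator
might spot assignments invalidated by a hardware change. It is not Nova code:
the function name and both dictionaries are hypothetical, and it assumes the
address -> (vendor_id, product_id) mapping was recorded before the host was
powered off and can be compared with what the host reports after reboot.

```python
def find_stale_assignments(assigned, present):
    """Classify assigned PCI devices that no longer match the host.

    Both arguments are hypothetical inputs for this sketch:
      assigned: dict of PCI address -> (vendor_id, product_id), recorded
                when the device was allocated to a guest.
      present:  dict of PCI address -> (vendor_id, product_id), as the
                host reports them after the hardware change.

    Returns a dict of PCI address -> "missing" or "changed".
    """
    stale = {}
    for address, identity in assigned.items():
        current = present.get(address)
        if current is None:
            # Nothing lives at this address any more.
            stale[address] = "missing"
        elif current != identity:
            # The address still exists but now holds different hardware.
            stale[address] = "changed"
    return stale


if __name__ == "__main__":
    before = {
        "0000:04:10.0": ("8086", "10ed"),
        "0000:04:10.2": ("8086", "10ed"),
    }
    # After cold-plugging a card, the old VFs moved to bus 05 and an
    # unrelated device now sits at one of the old addresses.
    after = {
        "0000:04:10.0": ("15b3", "1004"),
        "0000:05:10.0": ("8086", "10ed"),
        "0000:05:10.2": ("8086", "10ed"),
    }
    for address, reason in sorted(find_stale_assignments(before, after).items()):
        print(address, reason)
```

A "changed" result is the more dangerous one, since a guest could end up
attached to a device it was never meant to have. An unchanged identity is not
proof of safety either: two identical VFs can swap addresses without this
check noticing.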

> If you'd like to file a bug against Nova, that would be cool, though.

I think it is explicitly out of scope for Nova to deal with this
scenario.

Regards,
Daniel
-- 
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|


