[openstack-dev] [nova] [neutron] PCI pass-through network support
Irena Berezovsky
irenab at mellanox.com
Thu Jan 16 09:43:53 UTC 2014
Ian,
Thank you for writing up the specification we have been discussing.
I have added a few comments on the Google doc [1].
As for live migration support, it can also be done without using libvirt networks.
It is not very elegant, but it works: rename the network interface of the PCI device to a logical name, say one based on the neutron port UUID, and put that name into the interface XML. For example:
If the PCI device's network interface is named eth8 and the neutron port UUID is 02bc4aec-b4f4-436f-b651-024, rename it to something like 'eth02bc4aec-b4'. The interface XML will then look like this:
...
<interface type='direct'>
<mac address='fa:16:3e:46:d3:e8'/>
<source dev='eth02bc4aec-b4' mode='passthrough'/>
<target dev='macvtap0'/>
<model type='virtio'/>
<alias name='net0'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
</interface>
...
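As a rough illustration of the renaming idea, here is a minimal Python sketch. It assumes the logical name is the prefix 'eth' plus the first 11 characters of the port UUID, which matches the example above and stays within the Linux interface name limit (15 characters plus a terminating NUL). The function names are hypothetical and not part of nova or neutron.

```python
import xml.etree.ElementTree as ET

IFNAMSIZ = 16  # Linux interface name buffer size, including the trailing NUL


def port_ifname(port_uuid, prefix="eth"):
    """Derive a logical interface name from a neutron port UUID (assumed scheme)."""
    name = prefix + port_uuid[:11]
    if len(name) >= IFNAMSIZ:
        raise ValueError("interface name too long: %r" % name)
    return name


def direct_interface_xml(port_uuid, mac):
    """Build a macvtap passthrough <interface> element that refers to the
    renamed device, so the same XML is valid on any host that applies the
    same renaming."""
    iface = ET.Element("interface", type="direct")
    ET.SubElement(iface, "mac", address=mac)
    ET.SubElement(iface, "source", dev=port_ifname(port_uuid), mode="passthrough")
    ET.SubElement(iface, "model", type="virtio")
    return ET.tostring(iface, encoding="unicode")


if __name__ == "__main__":
    print(direct_interface_xml("02bc4aec-b4f4-436f-b651-024", "fa:16:3e:46:d3:e8"))
```

Because the name depends only on the port UUID, the target host can rename whichever equivalent device it allocates to the same name before the migration starts.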
[1] https://docs.google.com/document/d/1vadqmurlnlvZ5bv3BlUbFeXRS_wh-dsgi5plSjimWjU/edit?pli=1#heading=h.308b0wqn1zde
BR,
Irena
From: Ian Wells [mailto:ijw.ubuntu at cack.org.uk]
Sent: Thursday, January 16, 2014 2:34 AM
To: OpenStack Development Mailing List (not for usage questions)
Subject: Re: [openstack-dev] [nova] [neutron] PCI pass-through network support
To clarify a couple of Robert's points, since we had a conversation earlier:
On 15 January 2014 23:47, Robert Li (baoli) <baoli at cisco.com> wrote:
--- do we agree that BDF address (or device id, whatever you call it), and node id shouldn't be used as attributes in defining a PCI flavor?
Note that the current spec doesn't actually exclude it as an option. It's just an unwise thing to do. In theory, you could elect to define your flavors using the BDF attribute but determining 'the card in this slot is equivalent to all the other cards in the same slot in other machines' is probably not the best idea... We could lock it out as an option or we could just assume that administrators wouldn't be daft enough to try.
* the compute node needs to know the PCI flavor. [...]
- to support live migration, we need to use it to create network xml
I didn't understand this at first and it took me a while to get what Robert meant here.
This is based on Robert's current code for macvtap-based live migration. The issue is that if you wish to migrate a VM and it's tied to a physical interface, you can't guarantee that the same physical interface is going to be used on the target machine, but at the same time you can't change the libvirt.xml as it comes over with the migrating machine. The answer is to define a network and refer out to it from libvirt.xml. In Robert's current code he's using the group name of the PCI devices to create a network containing the list of equivalent devices (those in the group) that can be macvtapped. Thus when the VM migrates it will find another, equivalent, interface. This falls over in the use case under consideration where a device can be mapped using more than one flavor, so we have to discard the use case or rethink the implementation.
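To make the indirection concrete, the following Python sketch builds the two pieces of XML involved: a libvirt network that pools equivalent devices in passthrough mode, and a guest interface that refers to that network by name rather than to a device. This is not Robert's actual code; the function names, the group name 'physnet1-vfs' and the device names are made up for illustration.

```python
import xml.etree.ElementTree as ET


def pool_network_xml(group_name, devices):
    """Libvirt network named after the PCI group, listing the equivalent
    host interfaces that may be macvtapped in passthrough mode."""
    net = ET.Element("network")
    ET.SubElement(net, "name").text = group_name
    forward = ET.SubElement(net, "forward", mode="passthrough")
    for dev in devices:
        ET.SubElement(forward, "interface", dev=dev)
    return ET.tostring(net, encoding="unicode")


def network_interface_xml(network_name, mac):
    """Guest interface that names only the network, so the domain XML does
    not change when the VM lands on a host with different devices."""
    iface = ET.Element("interface", type="network")
    ET.SubElement(iface, "mac", address=mac)
    ET.SubElement(iface, "source", network=network_name)
    ET.SubElement(iface, "model", type="virtio")
    return ET.tostring(iface, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical group and device names.
    print(pool_network_xml("physnet1-vfs", ["eth8", "eth9"]))
    print(network_interface_xml("physnet1-vfs", "fa:16:3e:46:d3:e8"))
```

Each host defines a network with the same name but its own device list, which is why a device that belongs to more than one flavor does not fit this scheme cleanly.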
There's a more complex solution - I think - where we create a temporary network for each macvtap interface a machine's going to use, with a name based on the instance UUID and port number, and containing the device to map. Before starting the migration we would create a replacement network containing only the new device on the target host; migration would find the network from the name in the libvirt.xml, and the content of that network would behave identically. We'd be creating libvirt networks on the fly and a lot more of them, and we'd need decent cleanup code too ('when freeing a PCI device, delete any network it's a member of'), so it all becomes a lot more hairy.
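The bookkeeping this would need can be sketched in a few lines of Python. It is only an in-memory model under stated assumptions: the naming scheme 'pci-<instance UUID>-<port number>' and the PortNetworks class are hypothetical, and real code would define and remove libvirt networks at the points marked in the comments.

```python
class PortNetworks:
    """Per-host record of temporary single-device networks (illustrative only)."""

    def __init__(self):
        self._device_by_network = {}  # network name -> host device name

    @staticmethod
    def name_for(instance_uuid, port_number):
        # Assumed naming scheme; it must be identical on source and target.
        return "pci-%s-%d" % (instance_uuid, port_number)

    def define(self, instance_uuid, port_number, device):
        """Record a network holding exactly one device.
        (Real code would also define the libvirt network here.)"""
        name = self.name_for(instance_uuid, port_number)
        self._device_by_network[name] = device
        return name

    def free_device(self, device):
        """When freeing a PCI device, delete any network it is a member of.
        (Real code would also remove the libvirt network here.)"""
        stale = [n for n, d in self._device_by_network.items() if d == device]
        for name in stale:
            del self._device_by_network[name]
        return stale


if __name__ == "__main__":
    source, target = PortNetworks(), PortNetworks()
    instance = "instance-uuid-placeholder"  # hypothetical identifier
    old = source.define(instance, 0, "eth8")
    # Before migrating, create the replacement network on the target host.
    new = target.define(instance, 0, "eth3")
    assert old == new  # libvirt.xml keeps referring to the same network name
    # After migration, release the source device and its network.
    print(source.free_device("eth8"))
```

The name is the only thing the migrating libvirt.xml depends on, so the two hosts are free to back it with different devices.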
--
Ian.