[openstack-dev] [nova] [neutron] PCI pass-through network support

John Garbutt john at johngarbutt.com
Thu Dec 19 14:15:26 UTC 2013


Response inline...

On 19 December 2013 13:05, Irena Berezovsky <irenab at mellanox.com> wrote:
> Hi John,
> I totally agree that we should define the use cases both for administration and tenant that powers the VM.
> Since we are trying to support PCI pass-through network, let's focus on the related use cases.
> Please see my comments inline.

Cool.

> Regards,
> Irena
> -----Original Message-----
> From: John Garbutt [mailto:john at johngarbutt.com]
> Sent: Thursday, December 19, 2013 1:42 PM
> To: OpenStack Development Mailing List (not for usage questions)
> Subject: Re: [openstack-dev] [nova] [neutron] PCI pass-through network support
>
> Apologies for being late onto this thread, and not making the meeting the other day.
> Also apologies this is almost totally a top post.
>
> On 17 December 2013 15:09, Ian Wells <ijw.ubuntu at cack.org.uk> wrote:
>> Firstly, I disagree that
>> https://wiki.openstack.org/wiki/PCI_passthrough_SRIOV_support is an
>> accurate reflection of the current state.  It's a very unilateral
>> view, largely because the rest of us had been focussing on the google
>> document that we've been using for weeks.
>
> I haven't seen the google doc. I got involved through the blueprint review of this:
> https://blueprints.launchpad.net/nova/+spec/pci-extra-info
>
> I assume it's this one?
> https://docs.google.com/document/d/1EMwDg9J8zOxzvTnQJ9HwZdiotaVstFWKIuKrPse6JOs
>
> On a quick read, my main concern is separating out the "user" more:
> * administration (defines pci-flavor, defines which hosts can provide it, defines server flavor...)
> * person who boots server (picks server flavor, defines neutron ports)
>
> Note, I don't see the person who boots the server ever seeing the pci-flavor, only understanding the server flavor.
> [IrenaB] I am not sure that folding the PCI device request into the server flavor is the right approach for the PCI pass-through network case. A vNIC is by its nature something dynamic that can be plugged or unplugged after VM boot, while the server flavor is quite static.

I really just meant that the server flavor would specify the type of NIC to attach.

The existing port specs, etc, define how many NICs, and you can hot
plug as normal; it's just that the VIF plugger code is told by the
server flavor whether it can do PCI passthrough, and which devices it
can pick from. The idea being that, combined with the neutron
network-id, you know what to plug.

The more I talk about this approach the more I hate it :(

> We might also want a "nic-flavor" that tells neutron the information it requires, but let's get to that later...
> [IrenaB] nic-flavor is definitely something that we need in order to choose whether a high performance (PCI pass-through) or a virtio (i.e. OVS) nic will be created.

Well, I think it's the right way to go, rather than overloading the
server flavor with hints about which PCI devices you could use.

>> Secondly, I totally disagree with this approach.  This assumes that
>> description of the (cloud-internal, hardware) details of each compute
>> node is best done with data stored centrally and driven by an API.  I
>> don't agree with either of these points.
>
> Possibly, but I would like to first agree on the use cases and data model we want.
>
> Nova has generally gone for APIs over config in recent times.
> Mostly so you can do run-time configuration of the system.
> But let's just see what makes sense when we have the use cases agreed.
>
>>> On 16 December 2013 22:27, Robert Li (baoli) wrote:
>>> I'd like to give you guy a summary of current state, let's discuss it
>>> then.
>>> https://wiki.openstack.org/wiki/PCI_passthrough_SRIOV_support
>>>
>>>
>>> 1)  fade out alias (I think this is OK for all)
>>> 2)  white list becomes pci-flavor (I think this is OK for all)
>>> 3)  address simple regular expression support: only * and a number
>>> range [hex-hex] are supported (I think this is OK?)
>>> 4)  aggregate: now it's clear enough, and won't impact SRIOV (I
>>> think this is irrelevant to SRIOV now)
>
> So... this means we have:
>
> PCI-flavor:
> * e.g. standardGPU, standardGPUnew, fastGPU, hdFlash1TB, etc.
>
> Host mapping:
> * decide which hosts you allow a particular flavor to be used on
> * note, the scheduler still needs to find out if any devices are "free"
>
> flavor (of the server):
> * usual RAM, CPU, Storage
> * use extra specs to add PCI devices
> * example:
> ** add one PCI device, choice of standardGPU or standardGPUnew
> ** also add: one hdFlash1TB
>
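To make that concrete, here is a rough sketch of the data model being
described. Every name, key and value below is made up for illustration,
not an agreed API:

    # Hypothetical pci-flavor definitions (vendor/product ids invented)
    pci_flavors = {
        'standardGPU':    {'vendor_id': '10de', 'product_id': '11b4'},
        'standardGPUnew': {'vendor_id': '10de', 'product_id': '13f2'},
        'hdFlash1TB':     {'vendor_id': '8086', 'product_id': '0953'},
    }

    # Host mapping: which hosts are allowed to provide which pci-flavors
    host_pci_flavors = {
        'compute-01': ['standardGPU', 'standardGPUnew'],
        'compute-02': ['hdFlash1TB'],
    }

    # Server flavor extra specs: "one of standardGPU or standardGPUnew,
    # plus one hdFlash1TB"
    flavor_extra_specs = {
        'pci_devices': 'standardGPU|standardGPUnew:1, hdFlash1TB:1',
    }
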
> Now, the other bit is SRIOV... At a high level:
>
> Neutron:
> * user wants to connect to a particular neutron network
> * user wants a super-fast SRIOV connection
>
> Administration:
> * needs to map PCI devices to the neutron network they connect to
>
> The big question is:
> * is this a specific SRIOV-only (provider) network
> * OR... are other non-SRIOV connections also made to that same network
>
> I feel we have to go for the latter. Imagine a network on VLAN 42: you might want some SRIOV into that network, and some OVS connecting into the same network. The user might have VMs connected using both methods, so wants the same IP address ranges and the same network id spanning both.
> [IrenaB] Agree. An SRIOV connection is the choice for a certain VM on a certain network. The same VM can be connected to another network via a virtio nic, and other VMs can be connected to the same network via virtio nics.

Cool, agreed.
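
As a sketch of what that means at the data level (field names below are
assumptions, not the agreed model), the two ports share a network id
and subnet and differ only in how they are plugged:

    network = {'id': 'net-42', 'segmentation_id': 42,
               'cidr': '10.0.42.0/24'}

    port_sriov  = {'network_id': 'net-42', 'ip': '10.0.42.11',
                   'nic_flavor': '10GBpassthrough'}  # hypothetical attribute
    port_virtio = {'network_id': 'net-42', 'ip': '10.0.42.12',
                   'nic_flavor': 'virtio'}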

> If we go for the latter we either need:
> * some kind of nic-flavor
> ** boot ... --nic nic-id="public-id",nic-flavor="10GBpassthrough"
> ** but neutron could store nic-flavor, and pass it through to VIF driver, and user says port-id
> * OR add NIC config into the server flavor
> ** an extra spec that tells the VIF driver it could use one of this list of PCI devices: (list of pci-flavors)
> * OR do both
>
> I vote for nic-flavor only, because it matches the volume-type we have with cinder.
> [IrenaB] Agree on nic-flavor.

Cool, agreed.
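
A minimal sketch of what that buys us on the Nova side, assuming
Neutron stores the nic-flavor on the port and it arrives with the VIF
(the field name and values are assumptions):

    def choose_plug_method(vif):
        """Hypothetical decision point in a VIF driver: the nic-flavor
        comes from the Neutron port, so the server flavor never needs
        to know about PCI devices."""
        if vif.get('nic_flavor') == '10GBpassthrough':
            return 'pci-passthrough'
        return 'ovs'

    vif = {'network_id': 'net-42', 'nic_flavor': '10GBpassthrough'}
    assert choose_plug_method(vif) == 'pci-passthrough'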

> [IrenaB] How will the nova scheduler base its host allocation decision on the nic-flavor?

Yeah, that's the next bit; I skipped over that on purpose...

Long term, the cross service scheduler should deal with this.
But that's not the answer for Icehouse, and not really an answer.

In the short term, I imagined something like:
* ensure the instance's network_info includes the nic-flavor (we get
this from neutron or the API request)
* we need to give pci-flavors some "extra specs", just like server flavors
* in the pci-flavor extra specs we have the nic-flavor and neutron-network-uuid
* if two pci-flavors match, assume you can freely pick from whichever
is available
* when we populate nic-flavors into network_info we add the
nova-specific pci-flavors
* the scheduler can then be passed the network_info and use that to
pick a host with available slots for the given pci-flavor
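
A rough sketch of that matching, purely to illustrate the idea (the
pci-flavor extra specs, field names and data structures are all
assumptions):

    # pci-flavor "extra specs" linking a nic-flavor to a neutron network
    pci_flavor_specs = {
        'sriov-10g':     {'nic-flavor': '10GBpassthrough',
                          'neutron-network-uuid': 'net-42'},
        'sriov-10g-new': {'nic-flavor': '10GBpassthrough',
                          'neutron-network-uuid': 'net-42'},
    }

    def hosts_for_request(network_info, free_slots_by_host):
        """Keep hosts with a free device in at least one matching
        pci-flavor for every passthrough VIF the instance asks for."""
        def matches(vif):
            return [name for name, spec in pci_flavor_specs.items()
                    if spec['nic-flavor'] == vif.get('nic_flavor')
                    and spec['neutron-network-uuid'] == vif['network_id']]
        return [host for host, slots in free_slots_by_host.items()
                if all(any(slots.get(name, 0) > 0 for name in matches(vif))
                       for vif in network_info if matches(vif))]

    network_info = [{'network_id': 'net-42',
                     'nic_flavor': '10GBpassthrough'}]
    free_slots = {'compute-01': {'sriov-10g-new': 2}, 'compute-02': {}}
    print(hosts_for_request(network_info, free_slots))  # ['compute-01']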

> [IrenaB] Can you please elaborate more on cinder volume-type?

See previous mail from me in response to Ian.

Basically it's very similar to nic-flavor, I feel, in a good way.

It's just that Cinder stores the volume type for each volume, and I
think Neutron should store the nic-flavor for each port.

> However, it does suggest that Nova should leave all the SRIOV work to the VIF driver.
> So the VIF driver, as activated by Neutron, will understand which PCI devices to pass through.
>
> Similar to the plan with brick, we could have an oslo lib that helps you attach SRIOV devices, which could be used by the neutron VIF drivers and the nova PCI passthrough code.
> [IrenaB] Can you please add more details on this?

Basically, if neutron needs to share code with Nova's PCI passthrough
code, we could have a common library for it. Cinder is looking
towards doing this with Brick:
https://github.com/openstack/cinder/tree/master/cinder/brick
If we don't need to do that, then that's cool too.
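
For illustration only, the shared piece might be little more than a
module with the attach/detach plumbing that both sides import; the
module layout and function names below are invented:

    # hypothetical shared lib, in the same spirit as cinder/brick
    def attach_vf(instance_uuid, pci_address, vlan=None, mac=None):
        """Bind an SRIOV virtual function to an instance; callable from
        the Nova PCI passthrough code or a Neutron VIF driver."""
        raise NotImplementedError("sketch only")

    def detach_vf(instance_uuid, pci_address):
        """Release the virtual function when the port is unplugged."""
        raise NotImplementedError("sketch only")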

John


