[openstack-dev] [nova] [neutron] PCI pass-through network support

John Garbutt john at johngarbutt.com
Thu Dec 19 12:56:59 UTC 2013

On 19 December 2013 12:54, John Garbutt <john at johngarbutt.com> wrote:
> On 19 December 2013 12:21, Ian Wells <ijw.ubuntu at cack.org.uk> wrote:
>> John:
>>> At a high level:
>>> Neutron:
>>> * user wants to connect to a particular neutron network
>>> * user wants a super-fast SRIOV connection
>>> Administration:
>>> * needs to map each PCI device to the neutron network it connects to
>>> The big question is:
>>> * is this a specific SRIOV only (provider) network
>>> * OR... are other non-SRIOV connections also made to that same network
>>> I feel we have to go for the latter. Imagine a network on VLAN 42,
>>> you might want some SRIOV into that network, and some OVS connecting
>>> into the same network. The user might have VMs connected using both
>>> methods, so wants the same IP address ranges and same network id
>>> spanning both.
>>> If we go for the latter, we either need:
>>> * some kind of nic-flavor
>>> ** boot ... --nic nic-id:"public-id",nic-flavor:"10GBpassthrough"
>>> ** but neutron could store nic-flavor, and pass it through to VIF
>>> driver, and user says port-id
>>> * OR add NIC config into the server flavor
>>> ** extra spec to say, tell VIF driver it could use one of this list of
>>> PCI devices: (list pci-flavors)
>>> * OR do both
>>> I vote for nic-flavor only, because it matches the volume-type we have
>>> with cinder.
>> I think the issue there is that Nova is managing the supply of PCI devices
>> (which is limited and limited on a per-machine basis).  Indisputably you
>> need to select the NIC you want to use as a passthrough rather than a vnic
>> device, so there's something in the --nic argument, but you have to answer
>> two questions:
>> - how many devices do you need (which is now not a flavor property but in
>> the --nic list, which seems to me an odd place to be defining billable
>> resources)
>> - what happens when someone does nova interface-attach?
> Agreed.

Apologies, I misread what you put, maybe we don't agree...

I am just trying not to make a passthrough NIC an odd special case.

In my mind, it should just be a regular neutron port connection that
happens to be implemented using PCI passthrough.

I agree we need to sort out the scheduling of that, because it's a
finite resource.
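
To make the "finite resource" point concrete, here is a minimal sketch of the kind of check a scheduler would need. It is not existing Nova code: the Host class, the free_pci bookkeeping and the hosts_that_fit helper are all hypothetical names invented for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Host:
    name: str
    # Hypothetical bookkeeping: free passthrough devices per pci-flavor.
    free_pci: dict = field(default_factory=dict)


def hosts_that_fit(hosts, requested):
    """Return the hosts with enough free devices for every requested pci-flavor."""
    return [
        h for h in hosts
        if all(h.free_pci.get(flavor, 0) >= count
               for flavor, count in requested.items())
    ]


hosts = [Host("compute1", {"fastNic": 2}), Host("compute2", {"fastNic": 0})]
candidates = hosts_that_fit(hosts, {"fastNic": 1})
print([h.name for h in candidates])
```

A purely virtual NIC would contribute nothing to the requested counts, so it would never rule a host out.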

> The --nic list specifies how many NICs.
> I was suggesting adding a nic-flavor on each --nic spec to say if it's
> PCI passthrough vs virtual NIC.
>> Cinder's an indirect parallel because the resources it's adding to the
>> hypervisor are virtual and unlimited, I think, or am I missing something
>> here?
> I was referring more to the different "volume-types" i.e. "fast
> volume" or "normal volume".
> And how that is similar to "virtual" vs "fast PCI passthrough" vs "slow
> PCI passthrough"
> Local volumes probably have the same issues as PCI passthrough with
> "finite resources".
> But I am not sure we have a good solution for that yet.
> Mostly, it seems right that Cinder and Neutron own the configuration
> about the volume and network resources.
> The VIF driver and volume drivers seem to have a similar sort of
> relationship with Cinder and Neutron vs Nova.
> Then the issue boils down to visibility into that data so we can
> schedule efficiently, which is no easy problem.
>>> However, it does suggest that Nova should leave all the SRIOV work to
>>> the VIF driver.
>>> So the VIF driver, as activated by neutron, will understand which PCI
>>> devices to pass through.
>>> Similar to the plan with brick, we could have an oslo lib that helps
>>> you attach SRIOV devices that could be used by the neutron VIF drivers
>>> and the nova PCI passthrough code.
>> I'm not clear that this is necessary.
>> At the moment with vNICs, you pass through devices by having a co-operation
>> between Neutron (which configures a way of attaching them to put them on a
>> certain network) and the hypervisor specific code (which creates them in the
>> instance and attaches them as instructed by Neutron).  Why would we not
>> follow the same pattern with passthrough devices?  In this instance, neutron
>> would tell nova that when it's plugging this device it should be a
>> passthrough device, and pass any additional parameters like the VF encap,
>> and Nova would do as instructed, then Neutron would reconfigure whatever
>> parts of the network need to be reconfigured in concert with the
>> hypervisor's settings to make the NIC a part of the specified network.
> I agree, in general terms.
> Firstly, do you agree the neutron network-id can be used for
> passthrough and non-passthrough VIF connections? i.e. a neutron
> network-id does not imply PCI-passthrough.
> Secondly, we need to agree on the information flow around defining the
> "flavor" of the NIC. i.e. virtual or passthroughFast or
> passthroughNormal.
> My gut feeling is that neutron port description should somehow define
> this via a nic-flavor that maps to a group of pci-flavors.
> But from a billing point of view, I like the idea of the server flavor
> saying to the VIF plug code: by the way, for this server, please
> support all the NICs using devices in pciflavor:fastNic, should that be
> possible for the user's given port configuration. But this is leaking
> neutron/networking information into Nova, which seems bad.
> John
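
A rough sketch of the nic-flavor idea above, assuming a nic-flavor is just a name that maps to a list of acceptable pci-flavors, with an empty list meaning a plain virtual NIC. The NIC_FLAVORS table, the "normalNic" pci-flavor, the request dict keys and the pci_flavors_for helper are all made up for illustration; none of this is an existing Nova or Neutron API.

```python
# Hypothetical table: nic-flavor -> acceptable pci-flavors.
# An empty list means "no passthrough device needed" (ordinary virtual NIC).
NIC_FLAVORS = {
    "virtual": [],
    "passthroughFast": ["fastNic"],
    "passthroughNormal": ["fastNic", "normalNic"],
}


def pci_flavors_for(nic_requests):
    """Pair each requested NIC's network id with its acceptable pci-flavors."""
    return [
        (req["net_id"], NIC_FLAVORS[req.get("nic_flavor", "virtual")])
        for req in nic_requests
    ]


# Two NICs on the same network id: one virtual, one passthrough.
requests = [
    {"net_id": "public-id"},
    {"net_id": "public-id", "nic_flavor": "passthroughFast"},
]
print(pci_flavors_for(requests))
```

The point of the example is that the network id says nothing about passthrough; only the per-NIC flavor does, so both kinds of connection can share one network.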

