[openstack-dev] [nova] [neutron] PCI pass-through network support

Irena Berezovsky irenab at mellanox.com
Mon Jan 13 08:50:41 UTC 2014

After many discussions on both IRC and the mailing list, I would like to suggest defining basic use cases for PCI pass-through network support, with an agreed list of limitations and assumptions, and implementing them. By doing this Proof of Concept we will be able to deliver basic PCI pass-through network support in the Icehouse timeframe and better understand how to provide a complete solution, starting from tenant/admin API enhancements, through enhanced nova-neutron communication, and eventually a neutron plugin supporting PCI pass-through networking.
We can try to split the tasks between the currently involved participants and bring up the basic case. Then we can enhance the implementation.
Having more knowledge and experience with the neutron parts, I would like to start working on the neutron mechanism driver support. I have already started to put together the following blueprint doc based on everyone's ideas:

For the basic PCI pass-through networking case we can assume the following:

1. A single provider network (PN1).

2. A white list of SR-IOV PCI devices available for allocation as NICs on neutron networks over the provider network (PN1) is defined on each compute node.

3. Support for a directly assigned SR-IOV PCI pass-through device as a vNIC. (This will limit the number of tests.)

4. More ....

If my suggestion seems reasonable to you, let's try to reach an agreement and split the work during our Monday IRC meeting.


From: Jiang, Yunhong [mailto:yunhong.jiang at intel.com]
Sent: Saturday, January 11, 2014 8:36 AM
To: OpenStack Development Mailing List (not for usage questions)
Subject: Re: [openstack-dev] [nova] [neutron] PCI pass-through network support

Comments below use the prefix [yjiang5_2], including the double confirmation.

I think we (you and me) are mostly on the same page; would you please give a summary, and then we can have the community, including Irena/Robert, check it. We need cores to sponsor it. We should check with John to see if this differs from his mental picture, and we may need a neutron core (I assume Cisco has a bunch of Neutron cores :) ) to sponsor it?

Also, can anyone from Cisco help with the implementation? After this long discussion, we are past the halfway point of the I release and I'm not sure Yongli and I alone can finish it in the I release.


From: Ian Wells [mailto:ijw.ubuntu at cack.org.uk]
Sent: Friday, January 10, 2014 6:34 PM
To: OpenStack Development Mailing List (not for usage questions)
Subject: Re: [openstack-dev] [nova] [neutron] PCI pass-through network support

> OK - so if this is good then I think the question is how we could change the 'pci_whitelist' parameter we have - which, as you say, should either *only* do whitelisting or be renamed - to allow us to add information.  Yongli has something along those lines but it's not flexible and it distinguishes poorly between which bits are extra information and which bits are matching expressions (and it's still called pci_whitelist) - but even with those criticisms it's very close to what we're talking about.  When we have that I think a lot of the rest of the arguments should simply resolve themselves.
> [yjiang5_1] The reason it is not easy to find a flexible/distinguishable change to pci_whitelist is that it combines two things. So a naive solution in my head is: change it to a very generic name, 'pci_devices_information',
> and change the schema to an array of {'device_property': regex_exp, 'group_name': 'g1'} dictionaries, where the device_property expression can be 'address == xxx, vendor_id == xxx' (i.e. similar to the current white list), and we can squeeze more into "pci_devices_information" in the future, like 'network_information' = xxx or the "Neutron specific information" you required in the previous mail.
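As a concrete illustration, the array proposed above might be rendered in Python as follows. All key names and values here ('device_property', 'group_name', 'network_id', the addresses) are hypothetical, taken only from this discussion, not an agreed schema:

```python
# Hypothetical rendering of the proposed 'pci_devices_information' array:
# each entry pairs one matching expression with software-defined extra keys.
pci_devices_information = [
    {'device_property': 'address == 0000:04:*, vendor_id == 8086',
     'group_name': 'g1'},
    {'device_property': 'vendor_id == 15b3',
     'group_name': 'g2',
     'network_id': 'physnet1'},
]

# Everything except the matching expression rides along as extra info.
def extra_info(entry):
    return {k: v for k, v in entry.items() if k != 'device_property'}
```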

We're getting to the stage where an expression parser would be useful, annoyingly, but if we are going to try to squeeze it into JSON, can I suggest:

{ "match": { "class": "Acme inc. discombobulator" }, "info": { "group": "we like teh groups", "volume": "11" } }

[yjiang5_2] Double confirming that 'match' is the whitelist and 'info' is the 'extra info', right? Can the keys be more meaningful, for example s/match/pci_device_property, s/info/pci_device_info, or s/match/pci_devices/, etc.?
Also, I assume 'class' is the class code from the PCI configuration space and is numeric, am I right? Otherwise it's not easy to get the 'Acme inc. discombobulator' information.

> All keys other than 'device_property' become extra information, i.e. software-defined properties. This extra information will be carried with the PCI devices. Some implementation details: A) we can limit the acceptable keys, e.g. we only support 'group_name' and 'network_id', or we can accept any key other than the reserved ones (vendor_id, device_id, etc.).

Not sure we have a good list of reserved keys at the moment, and with two dicts it isn't really necessary, I guess.  I would say that we have one match parser which looks something like this:

# does this PCI device match the expression given?
def match(expression, pci_details, extra_specs):
    for (k, v) in expression.items():
        if k.startswith('e.'):
            mv = extra_specs.get(k[2:])
        else:
            mv = pci_details.get(k)
        if mv != v:
            return False
    return True

Usable for this whitelist matching (where 'e.' just won't match anything) and also for flavor assignment (where 'e.' will indeed match the extra values).
[yjiang5_2] Whether we use the same function or two separate functions for match/flavor checking is an implementation detail and can be discussed in the next step. Of course, we should always avoid code duplication.
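A minimal, self-contained sketch of the matcher discussed above, exercised for both cases. Plain equality stands in for whatever expression matching is finally chosen, so treat this as illustrative only:

```python
# Sketch of the proposed matcher; plain equality replaces the eventual
# expression matching, so this is illustrative, not the final design.
def match(expression, pci_details, extra_specs):
    for k, v in expression.items():
        if k.startswith('e.'):
            # 'e.'-prefixed keys consult the software-defined extra info
            mv = extra_specs.get(k[2:])
        else:
            mv = pci_details.get(k)
        if mv != v:
            return False
    return True

pci_details = {'vendor_id': '8086', 'product_id': '10fb'}
extra_specs = {'group': 'g1'}

# Whitelist matching: only raw PCI properties are in play.
whitelisted = match({'vendor_id': '8086'}, pci_details, {})
# Flavor assignment: 'e.group' matches against the extra values.
in_flavor = match({'e.group': 'g1'}, pci_details, extra_specs)
```

The same function serves both uses because 'e.' keys simply find nothing in an empty extra_specs dict during whitelisting.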

> B) If a device matches 'device_property' in several entries, raise an exception, or use the first one.

Use the first one, I think.  It's easier, and potentially more useful.
[yjiang5] good.

> [yjiang5_1] Another thing that needs discussion is, as you pointed out, that "we would need to add a config param on the control host to decide which flags to group on when doing the stats". I agree with the design, but some details need to be decided.

This is a patch that can come at any point after we do the above stuff (which we need for Neutron), clearly.

> Where should it be defined? If we a) define it on both the control node and the compute nodes, then it should be statically defined (just change pool_keys in "/opt/stack/nova/nova/pci/pci_stats.py" to a configuration parameter). Or b) define it only on the control node; then I assume the control node is the scheduler node, the scheduler manager needs to save this information and present an API to fetch it, and the compute nodes need to fetch it on every update_available_resource() periodic task. I'd prefer option a) as a first step. Your idea?

I think it has to be (a), which is a shame.

[yjiang5] We can extend it to (b) in future.
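Option a) could amount to a plain nova.conf option read identically on every node. A hedged sketch of what that might look like; the option name and default are hypothetical, standing in for the pool_keys list currently hard-coded in nova/pci/pci_stats.py:

```ini
[DEFAULT]
# Hypothetical option replacing the hard-coded pool_keys list;
# both the name and the default shown are illustrative only.
pci_stats_pool_keys = vendor_id, product_id
```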