[openstack-dev] [nova] [neutron] PCI pass-through network support

yongli he yongli.he at intel.com
Thu Dec 19 00:52:46 UTC 2013


On 2013-12-17 23:09, Ian Wells wrote:
> Reiterating from the IRC meeting, largely, so apologies.
>
> Firstly, I disagree that 
> https://wiki.openstack.org/wiki/PCI_passthrough_SRIOV_support is an 
> accurate reflection of the current state.  It's a very
Do you really find anything that is not what you want, except the API part?
Group devices / fade out alias / simplify the regular expression for the
address filter / mark the device allocated to SR-IOV?

Yongli he
> unilateral view, largely because the rest of us had been focussing on 
> the google document that we've been using for weeks.
>
> Secondly, I totally disagree with this approach.  This assumes that 
> description of the (cloud-internal, hardware) details of each compute 
> node is best done with data stored centrally and driven by an API.  I 
> don't agree with either of these points.
>
> Firstly, the best place to describe what's available on a compute node 
> is in the configuration on the compute node. For instance, I describe 
> which interfaces do what in Neutron on the compute node.  This is 
> because when you're provisioning nodes, that's the moment you know how 
> you've attached it to the network and what hardware you've put in it 
> and what you intend the hardware to be for - or conversely your 
> deployment puppet or chef or whatever knows it, and Razor or MAAS has 
> enumerated it, but the activities are equivalent.  Storing it 
> centrally distances the compute node from its descriptive information 
> for no good purpose that I can see and adds the complexity of having 
> to go make remote requests just to start up.
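>
> For reference, the config-driven approach on a compute node today looks
> roughly like this (a sketch; the vendor/product values are illustrative):
>
>     # nova.conf on the compute node
>     pci_passthrough_whitelist = [{"vendor_id": "8086", "product_id": "1520"}]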
>
> Secondly, even if you did store this centrally, it's not clear to me 
> that an API is very useful.  As far as I can see, the need for an API 
> is really the need to manage PCI device flavors.  If you want that to 
> be API-managed, then the rest of a (rather complex) API cascades from 
> that one choice.  Most of the things that API lets you change 
> (expressions describing PCI devices) are the sort of thing that you 
> set once and only revisit when you start - for instance - deploying 
> new hosts in a different way.
>
> I see a parallel in Neutron provider networks.  They're config 
> driven, largely on the compute hosts.  Agents know what ports on their 
> machine (the hardware tie) are associated with provider networks, by 
> provider network name.  The controller takes 'neutron net-create ... 
> --provider:physical_network <name>' and uses that to tie a virtual network to 
> the provider network definition on each host.  What we absolutely 
> don't do is have a complex admin API that lets us say 'in host 
> aggregate 4, provider network x (which I made earlier) is connected to 
> eth6'.
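>
> Concretely, that provider-network flow is just this (a sketch with
> illustrative names):
>
>     # per-host agent config
>     bridge_mappings = physnet1:br-eth1
>
>     # controller side, once per virtual network
>     neutron net-create net1 --provider:network_type vlan \
>         --provider:physical_network physnet1 --provider:segmentation_id 100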
>
> -- 
> Ian.
>
>
>
> On 17 December 2013 03:12, yongli he <yongli.he at intel.com> wrote:
>
>     On 2013-12-16 22:27, Robert Li (baoli) wrote:
>>     Hi Yongli,
>>
>>     The IRC meeting we have for PCI-Passthrough is the forum for
>>     discussion on SR-IOV support in openstack. I think the goal is to
>>     come up with a plan on both the nova and neutron side in support
>>     of the SR-IOV, and the current focus is on the nova side. Since
>>     you've done a lot of work on it already, would you like to lead
>>     tomorrow's discussion at UTC 1400?
>
>     Robert, you lead the meeting very well; I enjoy everything you set
>     up for us. Keep going -:)
>
>     I'd like to give you guys a summary of the current state; let's
>     discuss it then.
>     https://wiki.openstack.org/wiki/PCI_passthrough_SRIOV_support
>
>
>     1)  fade out alias (I think this is OK for all)
>     2)  whitelist becomes pci-flavor (I think this is OK for all)
>     3)  simplified regular expression support for addresses: only *
>     and a number range [hex-hex] are supported. (I think this is OK?)
>     4)  aggregate: now it's clear enough, and won't impact SRIOV. (I
>     think this is irrelevant to SRIOV now)
>
>
>     5)  SRIOV use case: if you suggest a use case, please give a full
>     example like this: [discuss: compare to other solutions]
>
>       * create a pci flavor for the SRIOV
>
>        nova pci-flavor-create name 'vlan-SRIOV' description "xxxxx"
>        nova pci-flavor-update UUID set 'description'='xxxx' 'address'='0000:01:*.7'
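>
>     For the simplified address matching in point 3, an expression can
>     combine '*' with a hex range; a sketch (illustrative values):
>
>        nova pci-flavor-update UUID set 'address'='0000:0[1-3]:*.[0-7]'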
>
>
>
>               Admin config SRIOV
>
>       * create pci-flavor :
>
>         {"name": "privateNIC", "neutron-network-uuid": "uuid-1", ...}
>         {"name": "publicNIC", "neutron-network-uuid": "uuid-2", ...}
>         {"name": "smallGPU", "neutron-network-uuid": "", ...}
>
>       * set aggregate metadata according to the pci-flavors that exist on the hosts
>
>     aggregate metadata and flavor extra specs, for a VM that gets two
>     small GPUs and VIFs attached from the above SRIOV NICs:
>
>         nova aggregate-set-metadata pci-aware-group set 'pci-flavor'='smallGPU,oldGPU,privateNIC,publicNIC'
>
>       * create instance flavor for sriov
>
>          nova flavor-key 100 set 'pci-flavor'='1:privateNIC; 1:publicNIC; 2:smallGPU,oldGPU'
>
>       * User just specifies a quantum port as normal:
>
>         nova boot --flavor "sriov-plus-two-gpu" --image img --nic net-id=uuid-2 --nic net-id=uuid-1 vm-name
>
>
>
>     Yongli
>
>
>>
>>     Thanks,
>>     Robert
>>
>>     On 12/11/13 8:09 PM, "He, Yongli" <yongli.he at intel.com> wrote:
>>
>>         Hi, all
>>
>>         Please continue to focus on the blueprint; it changed after
>>         review. And for this point:
>>
>>
>>         >5. flavor style for sriov: I just listed the flavor style in
>>         the design, but the style
>>         >              --nic
>>         >                   --pci-flavor PowerfullNIC:1
>>         >  is still possible to work, so what's the real impact to
>>         sriov from the flavor design?
>>
>>         >As you can see from the log, Irena has some strong opinions on
>>         this, and I tend to agree with her. The problem we need to
>>         solve is this: we need a means to associate a nic (or port)
>>         with a PCI device that is allocated out of a PCI >flavor or a
>>         PCI group. We think that we presented a complete solution in
>>         our google doc.
>>
>>         It's not so clear; could you please list the key points here?
>>         BTW, the blueprint I sent Monday has changed for this, so
>>         please check.
>>
>>         Yongli he
>>
>>         *From:* Robert Li (baoli) [mailto:baoli at cisco.com]
>>         *Sent:* Wednesday, December 11, 2013 10:18 PM
>>         *To:* He, Yongli; Sandhya Dasu (sadasu); OpenStack
>>         Development Mailing List (not for usage questions); Jiang,
>>         Yunhong; Irena Berezovsky; prashant.upadhyaya at aricent.com;
>>         chris.friesen at windriver.com; Itzik Brown;
>>         john at johngarbutt.com
>>         *Subject:* Re: [openstack-dev] [nova] [neutron] PCI
>>         pass-through network support
>>
>>         Hi Yongli,
>>
>>         Thank you very much for sharing the Wiki with us on Monday so
>>         that we have a better understanding on your ideas and
>>         thoughts. Please see embedded comments.
>>
>>         --Robert
>>
>>         On 12/10/13 8:35 PM, "yongli he" <yongli.he at intel.com> wrote:
>>
>>             On 2013-12-10 22:41, Sandhya Dasu (sadasu) wrote:
>>
>>                 Hi,
>>
>>                    I am trying to resurrect this email thread since
>>                 discussions have split between several threads and are
>>                 becoming hard to keep track of.
>>
>>                 An update:
>>
>>                 New PCI Passthrough meeting time: Tuesdays UTC 1400.
>>
>>                 New PCI flavor proposal from Nova:
>>
>>                 https://wiki.openstack.org/wiki/PCI_configration_Database_and_API#Take_advantage_of_host_aggregate_.28T.B.D.29
>>
>>             Hi, all
>>               sorry for missing the meeting, I was looking for John at
>>             that time. From the log I saw some concerns about the new
>>             design; I list them here and try to clarify them per my
>>             opinion:
>>
>>             1. configuration going to be deprecated: this might impact
>>             SRIOV. If possible, please list what kind of impact this
>>             makes for you.
>>
>>         Regarding the nova API pci-flavor-update, we had a
>>         face-to-face discussion over use of a nova API to
>>         provision/define/configure PCI passthrough list during the
>>         Icehouse summit. I kind of liked the idea initially. As you
>>         can see from the meeting log, however, I later thought that
>>         in a distributed system, using a centralized API to define
>>         resources per compute node, which could come and go any time,
>>         doesn't seem to provide any significant benefit. This is the
>>         reason that I didn't mention it in our google doc
>>         https://docs.google.com/document/d/1EMwDg9J8zOxzvTnQJ9HwZdiotaVstFWKIuKrPse6JOs/edit#
>>
>>         If you agree that pci-flavor and pci-group is kind of the
>>         same thing, then we agree with you that the pci-flavor-create
>>         API is needed. Since pci-flavor or pci-group is global, then
>>         such an API can be used for resource registration/validation
>>         on nova server. In addition, it can be used to facilitate the
>>         display of PCI devices per node, per group, or in the entire
>>         cloud, etc.
>>
>>
>>
>>             2. <baoli>So the API seems to be combining the whitelist
>>             + pci-group
>>                 yeah, it's actually almost the same thing: 'flavor',
>>             'pci-group' or 'group'. The real difference is that this
>>             flavor is going to deprecate the alias, and ties tightly
>>             to the aggregate or flavor.
>>
>>         Well, with pci-group, we recommended deprecating the PCI
>>         alias because we think it is redundant.
>>
>>         We think that specification of PCI requirement in the
>>         flavor's extra spec is still needed as it's a generic means
>>         to allocate PCI devices. In addition, it can be used as
>>         properties in the host aggregate as well.
>>
>>
>>
>>             3. feature:
>>                this design is not to say the feature won't work, but
>>             that it changed. If the auto discovery feature is possible,
>>             we get 'features' from the device, then use the features
>>             to define the pci-flavor. It's also possible to create a
>>             default pci-flavor for this. So the feature concept will
>>             be impacted; my feeling is we should have a separate BP
>>             for features, not in this round of changes, so the only
>>             thing here is to keep the feature possible.
>>
>>         I think that it's ok to have separate BPs. But we think that
>>         auto discovery is an essential part of the design, and
>>         therefore it should be implemented with more helping hands.
>>
>>
>>
>>             4. address regular expression: I'm fine with the
>>             wild-match style.
>>
>>         Sounds good. One side note is that I noticed that the driver
>>         for intel 82576 cards has a strange slot assignment scheme.
>>         So the final definition of it may need to accommodate that as
>>         well.
>>
>>
>>
>>             5. flavor style for sriov: I just listed the flavor style
>>             in the design, but the style
>>                           --nic
>>                                --pci-flavor PowerfullNIC:1
>>                is still possible to work, so what's the real impact to
>>             sriov from the flavor design?
>>
>>         As you can see from the log, Irena has some strong opinions
>>         on this, and I tend to agree with her. The problem we need to
>>         solve is this: we need a means to associate a nic (or port)
>>         with a PCI device that is allocated out of a PCI flavor or a
>>         PCI group. We think that we presented a complete solution in
>>         our google doc.
>>
>>         At this point, I really believe that we should combine our
>>         efforts and ideas. As far as how many BPs are needed, it
>>         should be a trivial matter after we have agreed on a complete
>>         solution.
>>
>>
>>
>>             Yongli He
>>
>>
>>
>>             Thanks,
>>
>>             Sandhya
>>
>>             *From: *Sandhya Dasu <sadasu at cisco.com>
>>             *Reply-To: *"OpenStack Development Mailing List (not for
>>             usage questions)" <openstack-dev at lists.openstack.org>
>>             *Date: *Thursday, November 7, 2013 9:44 PM
>>             *To: *"OpenStack Development Mailing List (not for usage
>>             questions)" <openstack-dev at lists.openstack.org>, "Jiang,
>>             Yunhong" <yunhong.jiang at intel.com>, "Robert Li (baoli)"
>>             <baoli at cisco.com>, Irena Berezovsky
>>             <irenab at mellanox.com>,
>>             "prashant.upadhyaya at aricent.com"
>>             <prashant.upadhyaya at aricent.com>,
>>             "chris.friesen at windriver.com"
>>             <chris.friesen at windriver.com>, "He, Yongli"
>>             <yongli.he at intel.com>, Itzik Brown
>>             <ItzikB at mellanox.com>
>>             *Subject: *Re: [openstack-dev] [nova] [neutron] PCI
>>             pass-through network support
>>
>>             Hi,
>>
>>                  The discussions during the summit were very
>>             productive. Now, we are ready to set up our IRC meeting.
>>
>>             Here are some slots that look like they might work for us.
>>
>>             1. Wed 2 -- 3 pm UTC.
>>
>>             2. Thursday 12 -- 1 pm UTC.
>>
>>             3. Thursday 7 -- 8pm UTC.
>>
>>             Please vote.
>>
>>             Thanks,
>>
>>             Sandhya
>>
>>             *From: *Sandhya Dasu <sadasu at cisco.com>
>>             *Reply-To: *"OpenStack Development Mailing List (not for
>>             usage questions)" <openstack-dev at lists.openstack.org>
>>             *Date: *Tuesday, November 5, 2013 12:03 PM
>>             *To: *"OpenStack Development Mailing List (not for usage
>>             questions)" <openstack-dev at lists.openstack.org>, "Jiang,
>>             Yunhong" <yunhong.jiang at intel.com>, "Robert Li (baoli)"
>>             <baoli at cisco.com>, Irena Berezovsky
>>             <irenab at mellanox.com>,
>>             "prashant.upadhyaya at aricent.com"
>>             <prashant.upadhyaya at aricent.com>,
>>             "chris.friesen at windriver.com"
>>             <chris.friesen at windriver.com>, "He, Yongli"
>>             <yongli.he at intel.com>, Itzik Brown
>>             <ItzikB at mellanox.com>
>>             *Subject: *Re: [openstack-dev] [nova] [neutron] PCI
>>             pass-through network support
>>
>>             Just to clarify, the discussion is planned for 10 AM
>>             Wednesday morning at the developer's lounge.
>>
>>             Thanks,
>>
>>             Sandhya
>>
>>             *From: *Sandhya Dasu <sadasu at cisco.com>
>>             *Reply-To: *"OpenStack Development Mailing List (not for
>>             usage questions)" <openstack-dev at lists.openstack.org>
>>             *Date: *Tuesday, November 5, 2013 11:38 AM
>>             *To: *"OpenStack Development Mailing List (not for usage
>>             questions)" <openstack-dev at lists.openstack.org>, "Jiang,
>>             Yunhong" <yunhong.jiang at intel.com>, "Robert Li (baoli)"
>>             <baoli at cisco.com>, Irena Berezovsky
>>             <irenab at mellanox.com>,
>>             "prashant.upadhyaya at aricent.com"
>>             <prashant.upadhyaya at aricent.com>,
>>             "chris.friesen at windriver.com"
>>             <chris.friesen at windriver.com>, "He, Yongli"
>>             <yongli.he at intel.com>, Itzik Brown
>>             <ItzikB at mellanox.com>
>>             *Subject: *Re: [openstack-dev] [nova] [neutron] PCI
>>             pass-through network support
>>
>>             *Hi,*
>>
>>             *    We are planning to have a discussion at the
>>             developer's lounge tomorrow morning at 10:00 am. Please
>>             feel free to drop by if you are interested.*
>>
>>             *Thanks,*
>>
>>             *Sandhya*
>>
>>             *From: *"Jiang, Yunhong" <yunhong.jiang at intel.com>
>>
>>             *Date: *Thursday, October 31, 2013 6:21 PM
>>             *To: *"Robert Li (baoli)" <baoli at cisco.com>, Irena
>>             Berezovsky <irenab at mellanox.com>,
>>             "prashant.upadhyaya at aricent.com"
>>             <prashant.upadhyaya at aricent.com>,
>>             "chris.friesen at windriver.com"
>>             <chris.friesen at windriver.com>, "He, Yongli"
>>             <yongli.he at intel.com>, Itzik Brown
>>             <ItzikB at mellanox.com>
>>             *Cc: *OpenStack Development Mailing List
>>             <openstack-dev at lists.openstack.org>, "Brian Bowen
>>             (brbowen)" <brbowen at cisco.com>, "Kyle Mestery (kmestery)"
>>             <kmestery at cisco.com>, Sandhya Dasu <sadasu at cisco.com>
>>             *Subject: *RE: [openstack-dev] [nova] [neutron] PCI
>>             pass-through network support
>>
>>             Robert, I think your change request for pci alias should
>>             be covered by the extra info enhancement.
>>             https://blueprints.launchpad.net/nova/+spec/pci-extra-info  and
>>             Yongli is working on it.
>>
>>             I'm not sure how the port profile is passed to the
>>             connected switch; is it a Cisco VM-FEX specific method or
>>             a libvirt method? Sorry, I'm not well versed on the
>>             network side.
>>
>>             --jyh
>>
>>             *From:* Robert Li (baoli) [mailto:baoli at cisco.com]
>>             *Sent:* Wednesday, October 30, 2013 10:13 AM
>>             *To:* Irena Berezovsky; Jiang, Yunhong;
>>             prashant.upadhyaya at aricent.com;
>>             chris.friesen at windriver.com; He, Yongli; Itzik Brown
>>             *Cc:* OpenStack Development Mailing List; Brian Bowen
>>             (brbowen); Kyle Mestery (kmestery); Sandhya Dasu (sadasu)
>>             *Subject:* Re: [openstack-dev] [nova] [neutron] PCI
>>             pass-through network support
>>
>>             Hi,
>>
>>             Regarding physical network mapping, this is what I thought.
>>
>>             Consider the following scenarios:
>>
>>                1. A compute node with SRIOV-only interfaces attached
>>             to a physical network. The node is connected to one
>>             upstream switch.
>>
>>                2. A compute node with both SRIOV interfaces and
>>             non-SRIOV interfaces attached to a physical network. The
>>             node is connected to one upstream switch.
>>
>>                3. In addition to cases 1 & 2, a compute node may have
>>             multiple vNICs that are connected to different upstream
>>             switches.
>>
>>             CASE 1:
>>
>>              -- the mapping from a virtual network (in terms of
>>             neutron) to a physical network is actually done by
>>             binding a port profile to a neutron port. With cisco's
>>             VM-FEX, a port profile is associated with one or multiple
>>             vlans. Once the neutron port is bound with this
>>             port-profile in the upstream switch, it's effectively
>>             plugged into the physical network.
>>
>>              -- since the compute node is connected to one upstream
>>             switch, the existing nova PCI alias will be sufficient.
>>             For example, one can boot a Nova instance that is
>>             attached to a SRIOV port with the following command:
>>
>>                       nova boot --flavor m1.large --image
>>             <image-id> --nic
>>             net-id=<net>,pci-alias=<alias>,sriov=<direct|macvtap>,port-profile=<profile>
>>
>>                 the net-id will be useful in terms of allocating IP
>>             addresses, enabling dhcp, etc. that are associated with
>>             the network.
>>
>>             -- the pci-alias specified in the nova boot command is
>>             used to create a PCI request for scheduling purpose. a
>>             PCI device is bound to a neutron port during the instance
>>             build time in the case of nova boot. Before invoking the
>>             neutron API to create a port, an allocated PCI device out
>>             of a PCI alias will be located from the PCI device list
>>             object. This device info, among other information, will be
>>             sent to neutron to create the port.
>>
>>             CASE 2:
>>
>>             -- Assume that OVS is used for the non-SRIOV interfaces.
>>             An example configuration with the ovs plugin would look like:
>>
>>                         bridge_mappings = physnet1:br-vmfex
>>
>>                         network_vlan_ranges = physnet1:15:17
>>
>>                         tenant_network_type = vlan
>>
>>                 When a neutron network is created, a vlan is either
>>             allocated or specified in the neutron net-create command.
>>             Attaching a physical interface to the bridge (in the
>>             above example br-vmfex) is an administrative task.
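>>
>>             For instance, to pin a vlan from the range above at
>>             network creation (a sketch; names are illustrative):
>>
>>                         neutron net-create net1 --provider:network_type vlan --provider:physical_network physnet1 --provider:segmentation_id 15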
>>
>>             -- to create a Nova instance with non-SRIOV port:
>>
>>                        nova boot --flavor m1.large --image
>>             <image-id> --nic net-id=<net>
>>
>>             -- to create a Nova instance with SRIOV port:
>>
>>                        nova boot --flavor m1.large --image
>>             <image-id> --nic
>>             net-id=<net>,pci-alias=<alias>,sriov=<direct|macvtap>,port-profile=<profile>
>>
>>                 it's essentially the same as in the first case. But
>>             since the net-id is already associated with a vlan, the
>>             vlan associated with the port-profile must be identical
>>             to that vlan. This has to be enforced by neutron.
>>
>>                 Again, since the node is connected to one upstream
>>             switch, the existing nova PCI alias should be sufficient.
>>
>>             CASE 3:
>>
>>             -- A compute node might be connected to multiple upstream
>>             switches, with each being a separate network. This means
>>             SRIOV PFs/VFs are already implicitly associated with
>>             physical networks. In the non-SRIOV case, a physical
>>             interface is associated with a physical network by
>>             plugging it into that network, and attaching this
>>             interface to the ovs bridge that represents this physical
>>             network on the compute node. In the SRIOV case, we need a
>>             way to group the SRIOV VFs that belong to the same
>>             physical networks. The existing nova PCI alias is to
>>             facilitate PCI device allocation by associating
>>             <product_id, vendor_id> with an alias name. This will no
>>             longer be sufficient. But it can be enhanced to achieve
>>             our goal. For example, the PCI device domain, bus (if
>>             their mapping to vNIC is fixed across boots) may be added
>>             into the alias, and the alias name should be
>>             corresponding to a list of tuples.
>>
>>             Another consideration is that a VF or PF might be used on
>>             the host for other purposes. For example, it's possible
>>             for a neutron DHCP server to be bound with a VF.
>>             Therefore, there needs to be a method to exclude some VFs from
>>             a group.  One way is to associate an exclude list with an
>>             alias.
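>>
>>             As a sketch only -- this is the proposed enhancement, not
>>             an existing nova option -- an enhanced alias might look
>>             like:
>>
>>                         # hypothetical: the 'address' and 'exclude' keys do not exist today
>>                         pci_alias = {"name": "physnetA",
>>                                      "devices": [{"vendor_id": "8086", "product_id": "10ed", "address": "0000:06:*.*"}],
>>                                      "exclude": ["0000:06:00.1"]}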
>>
>>             The enhanced PCI alias can be used to support features
>>             other than neutron as well. Essentially, a PCI alias can
>>             be defined as a group of PCI devices associated with a
>>             feature. I'd think that this should be addressed with a
>>             separate blueprint.
>>
>>             Thanks,
>>
>>             Robert
>>
>>             On 10/30/13 12:59 AM, "Irena Berezovsky"
>>             <irenab at mellanox.com> wrote:
>>
>>                 Hi,
>>
>>                 Please see my answers inline
>>
>>                 *From:* Jiang, Yunhong [mailto:yunhong.jiang at intel.com]
>>                 *Sent:* Tuesday, October 29, 2013 10:17 PM
>>                 *To:* Irena Berezovsky; Robert Li (baoli);
>>                 prashant.upadhyaya at aricent.com;
>>                 chris.friesen at windriver.com; He, Yongli;
>>                 Itzik Brown
>>                 *Cc:* OpenStack Development Mailing List; Brian Bowen
>>                 (brbowen); Kyle Mestery (kmestery); Sandhya Dasu (sadasu)
>>                 *Subject:* RE: [openstack-dev] [nova] [neutron] PCI
>>                 pass-through network support
>>
>>                 Your explanation of the virtual network and physical
>>                 network is quite clear and should work well. We need
>>                 to change nova code to achieve it, including getting
>>                 the physical network for the virtual network, passing
>>                 the physical network requirement to the filter
>>                 properties, etc.
>>
>>                 */[IrenaB] /* The physical network is already
>>                 available to nova (networking/nova/api) as a
>>                 virtual network attribute; it is then passed to the
>>                 VIF driver. We will soon push the fix to
>>                 https://bugs.launchpad.net/nova/+bug/1239606, which
>>                 will provide general support for getting this
>>                 information.
>>
>>                 For your port method, you mean we are sure to pass
>>                 the network id to 'nova boot' and nova will create
>>                 the port during VM boot, am I right?  Also, how does
>>                 nova know that it needs to allocate the PCI device
>>                 for the port? I'd suppose that in an SR-IOV NIC
>>                 environment, the user doesn't need to specify the PCI
>>                 requirement. Instead, the PCI requirement should come
>>                 from the network configuration and image property. Or
>>                 do you think the user still needs to pass a flavor
>>                 with the pci request?
>>
>>                 */[IrenaB] There are two ways to apply the port
>>                 method. One is to pass the network id on nova boot
>>                 and use the default vnic type as chosen in the
>>                 neutron config file. The other way is to define a
>>                 port with the required vnic type and other properties
>>                 if applicable, and run 'nova boot' with the port id
>>                 argument. Going forward with nova support for PCI
>>                 device awareness, we do need a way to impact the
>>                 scheduler choice to land the VM on a suitable host
>>                 with an available PCI device that has the required
>>                 connectivity./*
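>>
>>                 (A sketch of the second way, assuming a vnic type
>>                 attribute on the port, which was under design at the
>>                 time; values are illustrative:)
>>
>>                     neutron port-create <net-id> --binding:vnic_type direct
>>                     nova boot --flavor m1.small --image <image-id> --nic port-id=<port-uuid> vm1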
>>
>>                 --jyh
>>
>>                 *From:* Irena Berezovsky [mailto:irenab at mellanox.com]
>>                 *Sent:* Tuesday, October 29, 2013 3:17 AM
>>                 *To:* Jiang, Yunhong; Robert Li (baoli);
>>                 prashant.upadhyaya at aricent.com;
>>                 chris.friesen at windriver.com; He, Yongli;
>>                 Itzik Brown
>>                 *Cc:* OpenStack Development Mailing List; Brian Bowen
>>                 (brbowen); Kyle Mestery (kmestery); Sandhya Dasu (sadasu)
>>                 *Subject:* RE: [openstack-dev] [nova] [neutron] PCI
>>                 pass-through network support
>>
>>                 Hi Jiang, Robert,
>>
>>                 IRC meeting option works for me.
>>
>>                 If I understand your question below, you are looking
>>                 for a way to tie up between requested virtual
>>                 network(s) and requested PCI device(s). The way we
>>                 did it in our solution is to map a
>>                 provider:physical_network to an interface that
>>                 represents the Physical Function. Every virtual
>>                 network is bound to the provider:physical_network, so
>>                 the PCI device should be allocated based on this
>>                 mapping. We can map a PCI alias to the
>>                 provider:physical_network.
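>>
>>                 A minimal sketch of such a mapping, in the style of
>>                 existing L2 agent config (names are illustrative):
>>
>>                     physical_interface_mappings = physnet1:eth2   # eth2 is the PF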
>>
>>                 Another topic to discuss is where the mapping between
>>                 neutron port and PCI device should be managed. One
>>                 way to solve it is to propagate the allocated PCI
>>                 device details to neutron on port creation.
>>
>>                 In case there is no qbg/qbh support, VF networking
>>                 configuration should be applied locally on the Host.
>>
>>                 The question is when and how to apply networking
>>                 configuration on the PCI device?
>>
>>                 We see the following options:
>>
>>                 - It can be done on port creation.
>>
>>                 - It can be done when the nova VIF driver is called
>>                 for vNIC plugging. This will require having all
>>                 networking configuration available to the VIF driver,
>>                 or sending a request to the neutron server to obtain it.
>>
>>                 - It can be done by having a dedicated L2 neutron
>>                 agent on each host that scans for allocated PCI
>>                 devices and then retrieves networking configuration
>>                 from the server and configures the device. The agent
>>                 will also be responsible for managing update requests
>>                 coming from the neutron server.
>>
>>                 For macvtap vNIC type assignment, the networking
>>                 configuration can be applied by a dedicated L2
>>                 neutron agent.
>>
>>                 BR,
>>
>>                 Irena
>>
>>                 *From:* Jiang, Yunhong [mailto:yunhong.jiang at intel.com]
>>                 *Sent:* Tuesday, October 29, 2013 9:04 AM
>>
>>                 *To:* Robert Li (baoli); Irena Berezovsky;
>>                 prashant.upadhyaya at aricent.com;
>>                 chris.friesen at windriver.com; He, Yongli;
>>                 Itzik Brown
>>                 *Cc:* OpenStack Development Mailing List; Brian Bowen
>>                 (brbowen); Kyle Mestery (kmestery); Sandhya Dasu (sadasu)
>>                 *Subject:* RE: [openstack-dev] [nova] [neutron] PCI
>>                 pass-through network support
>>
>>                 Robert, is it possible to have an IRC meeting? I'd
>>                 prefer an IRC meeting because it's more OpenStack
>>                 style and also keeps clear minutes.
>>
>>                 On your flow, can you give a more detailed example?
>>                 For example, I can consider the user specifying the
>>                 instance with a --nic option that specifies a network
>>                 id; then how does nova derive the requirement for the
>>                 PCI device? I assume the network id should define the
>>                 switches that the device can connect to, but how is
>>                 that information translated to the PCI property
>>                 requirement? Will this translation happen before the
>>                 nova scheduler makes the host decision?
>>
>>                 Thanks
>>
>>                 --jyh
>>
>>                 *From:* Robert Li (baoli) [mailto:baoli at cisco.com]
>>                 *Sent:* Monday, October 28, 2013 12:22 PM
>>                 *To:* Irena Berezovsky;
>>                 prashant.upadhyaya at aricent.com; Jiang, Yunhong;
>>                 chris.friesen at windriver.com; He, Yongli;
>>                 Itzik Brown
>>                 *Cc:* OpenStack Development Mailing List; Brian Bowen
>>                 (brbowen); Kyle Mestery (kmestery); Sandhya Dasu (sadasu)
>>                 *Subject:* Re: [openstack-dev] [nova] [neutron] PCI
>>                 pass-through network support
>>
>>                 Hi Irena,
>>
>>                 Thank you very much for your comments. See inline.
>>
>>                 --Robert
>>
>>                 On 10/27/13 3:48 AM, "Irena Berezovsky"
>>                 <irenab at mellanox.com> wrote:
>>
>>                     Hi Robert,
>>
>>                     Thank you very much for sharing the information
>>                     regarding your efforts. Can you please share your
>>                     idea of the end-to-end flow? How do you suggest
>>                     binding Nova and Neutron?
>>
>>                 The end to end flow is actually encompassed in the
>>                 blueprints in a nutshell. I will reiterate it below.
>>                 The binding between Nova and Neutron occurs
>>                 with the neutron v2 API that nova invokes in order to
>>                 provision the neutron services. The vif driver is
>>                 responsible for plugging in an instance onto the
>>                 networking setup that neutron has created on the host.
>>
>>                 Normally, one will invoke "nova boot" api with the
>>                 --nic options to specify the nic with which the
>>                 instance will be connected to the network. It
>>                 currently allows net-id, fixed ip and/or port-id to
>>                 be specified for the option. However, it doesn't
>>                 allow one to specify special networking requirements
>>                 for the instance. Thanks to the nova pci-passthrough
>>                 work, one can specify PCI passthrough device(s) in
>>                 the nova flavor. But it doesn't provide means to tie
>>                 up these PCI devices in the case of ethernet adapters
>>                 with networking services. Therefore the idea is
>>                 actually simple as indicated by the blueprint titles,
>>                 to provide means to tie up SRIOV devices with neutron
>>                 services. A work flow would roughly look like this
>>                 for 'nova boot':
>>
>>                       -- Specifies networking requirements in the
>>                 --nic option. Specifically for SRIOV, allow the
>>                 following to be specified in addition to the existing
>>                 required information:
>>
>>                                . PCI alias
>>
>>                                . direct pci-passthrough/macvtap
>>
>>                                . port profileid that is compliant
>>                 with 802.1Qbh
>>
>>                         The above information is optional. In its
>>                 absence, the existing behavior remains.
>>
>>                      -- if special networking requirements exist,
>>                 Nova api creates PCI requests in the nova instance
>>                 type for scheduling purpose
>>
>>                      -- Nova scheduler schedules the instance based
>>                 on the requested flavor plus the PCI requests that
>>                 are created for networking.
>>
>>                      -- Nova compute invokes neutron services with
>>                 PCI passthrough information if any
>>
>>                      --  Neutron performs its normal operations based
>>                 on the request, such as allocating a port, assigning
>>                 ip addresses, etc. Specific to SRIOV, it should
>>                 validate the information such as profileid, and
>>                 store it in its db. It's also possible to
>>                 associate a port profileid with a neutron network so
>>                 that the port profileid becomes optional in the --nic
>>                 option. Neutron returns to nova the port information,
>>                 especially for PCI passthrough related information in
>>                 the port binding object. Currently, the port binding
>>                 object contains the following information:
>>
>>                           binding:vif_type
>>
>>                           binding:host_id
>>
>>                           binding:profile
>>
>>                           binding:capabilities
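>>
>>                 For an SR-IOV port, such a binding might carry
>>                 something like the following (a sketch; the vif_type
>>                 value and the keys inside binding:profile are
>>                 illustrative, not standardized):
>>
>>                           binding:vif_type = 'direct'
>>                           binding:host_id = 'compute-1'
>>                           binding:profile = {'pci_slot': '0000:06:10.1', 'port-profile': 'my-profile'}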
>>
>>                     -- nova constructs the domain xml and plugs in the
>>                 instance by calling the vif driver. The vif driver
>>                 can build up the interface xml based on the port
>>                 binding information.
>>
>>                     The blueprints you registered make sense. On Nova
>>                     side, there is a need to bind between requested
>>                     virtual network and PCI device/interface to be
>>                     allocated as vNIC.
>>
>>                     On the Neutron side, there is a need to support
>>                     networking configuration of the vNIC. Neutron
>>                     should be able to identify the PCI device/macvtap
>>                     interface in order to apply configuration. I
>>                     think it makes sense to provide neutron
>>                     integration via dedicated Modular Layer 2
>>                     Mechanism Driver to allow PCI pass-through vNIC
>>                     support along with other networking technologies.
>>
>>                 I haven't sorted through this yet. A neutron port
>>                 could be associated with a PCI device or not, which
>>                 is a common feature, IMHO. However, a ML2 driver may
>>                 be needed specific to a particular SRIOV technology.
>>
>>                     During the Havana Release, we introduced Mellanox
>>                     Neutron plugin that enables networking via SRIOV
>>                     pass-through devices or macvtap interfaces.
>>
>>                     We want to integrate our solution with PCI
>>                     pass-through Nova support.  I will be glad to
>>                     share more details if you are interested.
>>
>>                 Good to know that you already have an SRIOV
>>                 implementation. I found out some information online
>>                 about the mlnx plugin, but need more time to get to
>>                 know it better. And certainly I'm interested in
>>                 knowing its details.
>>
>>                     The PCI pass-through networking support is
>>                     planned to be discussed during the summit:
>>                     http://summit.openstack.org/cfp/details/129. I
>>                     think it's worth drilling down into a more detailed
>>                     proposal and presenting it during the summit,
>>                     especially since it impacts both nova and neutron
>>                     projects.
>>
>>                 I agree. Maybe we can steal some time in that discussion.
>>
>>                     Would you be interested in collaboration on this
>>                     effort? Would you be interested in exchanging more
>>                     emails or setting up an IRC/WebEx meeting this
>>                     week before the summit?
>>
>>                 Sure. If folks want to discuss it before the summit,
>>                 we can schedule a webex later this week. Otherwise,
>>                 we can continue the discussion over email.
>>
>>                     Regards,
>>
>>                     Irena
>>
>>                     *From:* Robert Li (baoli) [mailto:baoli at cisco.com]
>>                     *Sent:* Friday, October 25, 2013 11:16 PM
>>                     *To:* prashant.upadhyaya at aricent.com; Irena
>>                     Berezovsky; yunhong.jiang at intel.com;
>>                     chris.friesen at windriver.com;
>>                     yongli.he at intel.com
>>                     *Cc:* OpenStack Development Mailing List; Brian
>>                     Bowen (brbowen); Kyle Mestery (kmestery); Sandhya
>>                     Dasu (sadasu)
>>                     *Subject:* Re: [openstack-dev] [nova] [neutron]
>>                     PCI pass-through network support
>>
>>                     Hi Irena,
>>
>>                     This is Robert Li from Cisco Systems. Recently, I
>>                     was tasked to investigate such support for
>>                     Cisco's systems that support VM-FEX, which is an
>>                     SRIOV technology supporting 802.1Qbh. I was able
>>                     to bring up nova instances with SRIOV interfaces,
>>                     and establish networking between the instances
>>                     that employ the SRIOV interfaces. Certainly,
>>                     this was accomplished with hacking and some
>>                     manual intervention. Based on this experience and
>>                     my study with the two existing nova
>>                     pci-passthrough blueprints that have been
>>                     implemented and committed into Havana
>>                     (https://blueprints.launchpad.net/nova/+spec/pci-passthrough-base and
>>                     https://blueprints.launchpad.net/nova/+spec/pci-passthrough-libvirt),
>>                      I registered a couple of blueprints (one on Nova
>>                     side, the other on the Neutron side):
>>
>>                     https://blueprints.launchpad.net/nova/+spec/pci-passthrough-sriov
>>
>>                     https://blueprints.launchpad.net/neutron/+spec/pci-passthrough-sriov
>>
>>                     in order to address SRIOV support in openstack.
>>
>>                     Please take a look at them and see if they make
>>                     sense, and let me know any comments and
>>                     questions. We can also discuss this in the
>>                     summit, I suppose.
>>
>>                     I noticed that there is another thread on this
>>                     topic, so I'm copying those folks from that thread as well.
>>
>>                     thanks,
>>
>>                     Robert
>>
>>                     On 10/16/13 4:32 PM, "Irena Berezovsky"
>>                     <irenab at mellanox.com> wrote:
>>
>>                         Hi,
>>
>>                         As one of the next steps for PCI pass-through,
>>                         I would like to discuss the support for
>>                         PCI pass-through vNIC.
>>
>>                         While nova takes care of PCI pass-through
>>                         device resource management and VIF
>>                         settings, neutron should manage their
>>                         networking configuration.
>>
>>                         I would like to register a summit proposal to
>>                         discuss the support for PCI pass-through
>>                         networking.
>>
>>                         I am not sure what would be the right topic
>>                         to discuss the PCI pass-through networking,
>>                         since it involves both nova and neutron.
>>
>>                         There is already a session registered by
>>                         Yongli on nova topic to discuss the PCI
>>                         pass-through next steps.
>>
>>                         I think PCI pass-through networking is quite
>>                         a big topic and it's worth having a separate
>>                         discussion.
>>
>>                         Are there any other people who are interested
>>                         in discussing it and sharing their thoughts
>>                         and experience?
>>
>>                         Regards,
>>
>>                         Irena
>>
>
>
