[openstack-dev] [nova] [neutron] PCI pass-through network support
yongli he
yongli.he at intel.com
Thu Dec 19 00:52:46 UTC 2013
On 2013-12-17 23:09, Ian Wells wrote:
> Reiterating from the IRC meeting, largely, so apologies.
>
> Firstly, I disagree that
> https://wiki.openstack.org/wiki/PCI_passthrough_SRIOV_support is an
> accurate reflection of the current state. It's a very
Do you really find anything that isn't what you want, apart from the
API part? Grouping devices / fading out the alias / simplifying the
regular expression for the address filter / marking the device
allocated for SR-IOV?
Yongli He
> unilateral view, largely because the rest of us had been focussing on
> the google document that we've been using for weeks.
>
> Secondly, I totally disagree with this approach. This assumes that
> description of the (cloud-internal, hardware) details of each compute
> node is best done with data stored centrally and driven by an API. I
> don't agree with either of these points.
>
> Firstly, the best place to describe what's available on a compute node
> is in the configuration on the compute node. For instance, I describe
> which interfaces do what in Neutron on the compute node. This is
> because when you're provisioning nodes, that's the moment you know how
> you've attached it to the network and what hardware you've put in it
> and what you intend the hardware to be for - or conversely your
> deployment puppet or chef or whatever knows it, and Razor or MAAS has
> enumerated it, but the activities are equivalent. Storing it
> centrally distances the compute node from its descriptive information
> for no good purpose that I can see and adds the complexity of having
> to go make remote requests just to start up.
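>
> For illustration, this is roughly what that per-node description
> already looks like in nova.conf today (a sketch; the exact option
> syntax varies by release):
>
>     pci_passthrough_whitelist = [{"vendor_id": "8086", "product_id": "10ca"}]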
>
> Secondly, even if you did store this centrally, it's not clear to me
> that an API is very useful. As far as I can see, the need for an API
> is really the need to manage PCI device flavors. If you want that to
> be API-managed, then the rest of a (rather complex) API cascades from
> that one choice. Most of the things that API lets you change
> (expressions describing PCI devices) are the sort of thing that you
> set once and only revisit when you start - for instance - deploying
> new hosts in a different way.
>
> Look at the parallel in Neutron provider networks. They're config
> driven, largely on the compute hosts. Agents know what ports on their
> machine (the hardware tie) are associated with provider networks, by
> provider network name. The controller takes 'neutron net-create ...
> --provider:network 'name'' and uses that to tie a virtual network to
> the provider network definition on each host. What we absolutely
> don't do is have a complex admin API that lets us say 'in host
> aggregate 4, provider network x (which I made earlier) is connected to
> eth6'.
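>
> For reference, that pattern looks roughly like this (a sketch;
> option names as in the OVS plugin of the era):
>
>     # on each compute host, in the agent config
>     bridge_mappings = physnet1:br-eth1
>
>     # centrally, tying a virtual network to the provider network
>     neutron net-create demo-net --provider:network_type vlan \
>         --provider:physical_network physnet1 --provider:segmentation_id 100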
>
> --
> Ian.
>
>
>
> On 17 December 2013 03:12, yongli he <yongli.he at intel.com> wrote:
>
> On 2013-12-16 22:27, Robert Li (baoli) wrote:
>> Hi Yongli,
>>
>> The IRC meeting we have for PCI-Passthrough is the forum for
>> discussion on SR-IOV support in openstack. I think the goal is to
>> come up with a plan on both the nova and neutron side in support
>> of the SR-IOV, and the current focus is on the nova side. Since
>> you've done a lot of work on it already, would you like to lead
>> tomorrow's discussion at UTC 1400?
>
> Robert, you lead the meeting very well. I appreciate you setting
> everything up for us; keep it going :-)
>
> I'd like to give you guys a summary of the current state; let's
> discuss it then.
> https://wiki.openstack.org/wiki/PCI_passthrough_SRIOV_support
>
>
> 1) fade out the alias (I think this is OK for all)
> 2) the whitelist becomes pci-flavor (I think this is OK for all)
> 3) simplified regular-expression support for addresses: only * and
> a number range [hex-hex] are supported. (I think this is OK?)
> 4) aggregate: now it's clear enough, and won't impact SR-IOV.
> (I think this is irrelevant to SR-IOV now)
>
>
> 5) SR-IOV use case: if you suggest a use case, please give a full
> example like this: [discuss: compare to other solutions]
>
> * create a pci flavor for the SRIOV
>
> nova pci-flavor-create name 'vlan-SRIOV' description "xxxxx"
> nova pci-flavor-update UUID set 'description'='xxxx' 'address'='0000:01:*.7'
>
>
>
> Admin config SRIOV
>
> * create pci-flavor :
>
> {"name": "privateNIC", "neutron-network-uuid": "uuid-1", ...}
> {"name": "publicNIC", "neutron-network-uuid": "uuid-2", ...}
> {"name": "smallGPU", "neutron-network-uuid": "", ...}
>
> * set aggregate metadata according to the flavors that exist on the hosts
>
> flavor extra-specs, for a VM that gets two small GPUs and VIFs
> attached from the above SRIOV NICs:
>
> nova aggregate-set-metadata pci-aware-group set 'pci-flavor'='smallGPU,oldGPU, privateNIC,privateNIC'
>
> * create instance flavor for sriov
>
> nova flavor-key 100 set 'pci-flavor'='1:privateNIC; 1:publicNIC; 2:smallGPU,oldGPU'
>
> * User just specifies a quantum port as normal:
>
> nova boot --flavor "sriov-plus-two-gpu" --image img --nic net-id=uuid-2 --nic net-id=uuid-1 vm-name
>
>
>
> Yongli
>
>
>>
>> Thanks,
>> Robert
>>
>> On 12/11/13 8:09 PM, "He, Yongli" <yongli.he at intel.com> wrote:
>>
>> Hi, all
>>
>> Please continue to focus on the blueprint; it changed after
>> review. And for this point:
>>
>>
>> > 5. flavor style for sriov: I just listed the flavor style in
>> > the design, but for the style
>> > --nic
>> > --pci-flavor PowerfullNIC:1
>> > it is still possible to work, so what's the real impact to
>> > SR-IOV from the flavor design?
>>
>> > As you can see from the log, Irena has some strong opinions on
>> > this, and I tend to agree with her. The problem we need to
>> > solve is this: we need a means to associate a nic (or port)
>> > with a PCI device that is allocated out of a PCI flavor or a
>> > PCI group. We think that we presented a complete solution in
>> > our google doc.
>>
>> It's not so clear; could you please list the key points here?
>> By the way, the blueprint I sent Monday has changed for this;
>> please check.
>>
>> Yongli he
>>
>> *From:*Robert Li (baoli) [mailto:baoli at cisco.com]
>> *Sent:* Wednesday, December 11, 2013 10:18 PM
>> *To:* He, Yongli; Sandhya Dasu (sadasu); OpenStack
>> Development Mailing List (not for usage questions); Jiang,
>> Yunhong; Irena Berezovsky; prashant.upadhyaya at aricent.com;
>> chris.friesen at windriver.com; Itzik Brown;
>> john at johngarbutt.com
>> *Subject:* Re: [openstack-dev] [nova] [neutron] PCI
>> pass-through network support
>>
>> Hi Yongli,
>>
>> Thank you very much for sharing the Wiki with us on Monday so
>> that we have a better understanding on your ideas and
>> thoughts. Please see embedded comments.
>>
>> --Robert
>>
>> On 12/10/13 8:35 PM, "yongli he" <yongli.he at intel.com> wrote:
>>
>> On 2013-12-10 22:41, Sandhya Dasu (sadasu) wrote:
>>
>> Hi,
>>
>> I am trying to resurrect this email thread since
>> discussions have split between several threads and it is
>> becoming hard to keep track.
>>
>> An update:
>>
>> New PCI Passthrough meeting time: Tuesdays UTC 1400.
>>
>> New PCI flavor proposal from Nova:
>>
>> https://wiki.openstack.org/wiki/PCI_configration_Database_and_API#Take_advantage_of_host_aggregate_.28T.B.D.29
>>
>> Hi, all
>> sorry for missing the meeting; I was looking for John at that
>> time. From the log I saw some concerns about the new design;
>> I list them here and try to clarify them per my opinion:
>>
>> 1. configuration going to be deprecated: this might impact
>> SR-IOV. If possible, please list what kind of impact this
>> makes for you.
>>
>> Regarding the nova API pci-flavor-update, we had a
>> face-to-face discussion over use of a nova API to
>> provision/define/configure PCI passthrough list during the
>> Icehouse summit. I kind of liked the idea initially. As you
>> can see from the meeting log, however, I later thought that
>> in a distributed system, using a centralized API to define
>> resources per compute node, which could come and go any time,
>> doesn't seem to provide any significant benefit. This is the
>> reason that I didn't mention it in our google doc
>> https://docs.google.com/document/d/1EMwDg9J8zOxzvTnQJ9HwZdiotaVstFWKIuKrPse6JOs/edit#
>>
>> If you agree that pci-flavor and pci-group are kind of the
>> same thing, then we agree with you that the pci-flavor-create
>> API is needed. Since pci-flavor or pci-group is global, then
>> such an API can be used for resource registration/validation
>> on nova server. In addition, it can be used to facilitate the
>> display of PCI devices per node, per group, or in the entire
>> cloud, etc.
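>>
>> As a purely hypothetical sketch of what such display commands
>> could look like under the proposal (the names are illustrative,
>> not an agreed API):
>>
>> nova pci-flavor-list
>> nova pci-flavor-show <uuid-or-name>
>> nova pci-device-list --host compute-1 --pci-flavor privateNIC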
>>
>>
>>
>> 2. <baoli>So the API seems to be combining the whitelist
>> + pci-group
>> yeah, it's actually almost the same thing: 'flavor',
>> 'pci-group', or 'group'. The real difference is that this
>> flavor is going to deprecate the alias, and it ties
>> tightly to the aggregate or flavor.
>>
>> Well, with pci-group, we recommended deprecating the PCI
>> alias because we think it is redundant.
>>
>> We think that specification of PCI requirement in the
>> flavor's extra spec is still needed as it's a generic means
>> to allocate PCI devices. In addition, it can be used as
>> properties in the host aggregate as well.
>>
>>
>>
>> 3. feature:
>> this design is not saying the feature won't work, just
>> that it changes. If the auto-discovery feature is possible,
>> we get 'features' from the device, then use the features
>> to define the pci-flavor. It's also possible to create a
>> default pci-flavor for this. So the feature concept will be
>> impacted; my feeling is that we should have a separate BP
>> for features, not in this round of changes, so the only
>> thing here is to keep the feature possible.
>>
>> I think that it's ok to have separate BPs. But we think that
>> auto discovery is an essential part of the design, and
>> therefore it should be implemented with more helping hands.
>>
>>
>>
>> 4. address regular expression: I'm fine with the
>> wild-match style.
>>
>> Sounds good. One side note is that I noticed that the driver
>> for intel 82576 cards has a strange slot assignment scheme,
>> so the final definition of it may need to accommodate that
>> as well.
>>
>>
>>
>> 5. flavor style for sriov: I just listed the flavor style
>> in the design, but for the style
>> --nic
>> --pci-flavor PowerfullNIC:1
>> it is still possible to work, so what's the real impact to
>> SR-IOV from the flavor design?
>>
>> As you can see from the log, Irena has some strong opinions
>> on this, and I tend to agree with her. The problem we need to
>> solve is this: we need a means to associate a nic (or port)
>> with a PCI device that is allocated out of a PCI flavor or a
>> PCI group. We think that we presented a complete solution in
>> our google doc.
>>
>> At this point, I really believe that we should combine our
>> efforts and ideas. As far as how many BPs are needed, it
>> should be a trivial matter after we have agreed on a complete
>> solution.
>>
>>
>>
>> Yongli He
>>
>>
>>
>> Thanks,
>>
>> Sandhya
>>
>> *From: *Sandhya Dasu <sadasu at cisco.com>
>> *Reply-To: *"OpenStack Development Mailing List (not for
>> usage questions)" <openstack-dev at lists.openstack.org>
>> *Date: *Thursday, November 7, 2013 9:44 PM
>> *To: *"OpenStack Development Mailing List (not for usage
>> questions)" <openstack-dev at lists.openstack.org>, "Jiang,
>> Yunhong" <yunhong.jiang at intel.com>, "Robert Li (baoli)"
>> <baoli at cisco.com>, Irena Berezovsky <irenab at mellanox.com>,
>> "prashant.upadhyaya at aricent.com" <prashant.upadhyaya at aricent.com>,
>> "chris.friesen at windriver.com" <chris.friesen at windriver.com>,
>> "He, Yongli" <yongli.he at intel.com>, Itzik Brown <ItzikB at mellanox.com>
>> *Subject: *Re: [openstack-dev] [nova] [neutron] PCI
>> pass-through network support
>>
>> Hi,
>>
>> The discussions during the summit were very
>> productive. Now, we are ready to set up our IRC meeting.
>>
>> Here are some slots that look like they might work for us.
>>
>> 1. Wed 2 -- 3 pm UTC.
>>
>> 2. Thursday 12 -- 1 pm UTC.
>>
>> 3. Thursday 7 -- 8pm UTC.
>>
>> Please vote.
>>
>> Thanks,
>>
>> Sandhya
>>
>> *From: *Sandhya Dasu <sadasu at cisco.com>
>> *Reply-To: *"OpenStack Development Mailing List (not for
>> usage questions)" <openstack-dev at lists.openstack.org>
>> *Date: *Tuesday, November 5, 2013 12:03 PM
>> *To: *"OpenStack Development Mailing List (not for usage
>> questions)" <openstack-dev at lists.openstack.org>, "Jiang,
>> Yunhong" <yunhong.jiang at intel.com>, "Robert Li (baoli)"
>> <baoli at cisco.com>, Irena Berezovsky <irenab at mellanox.com>,
>> "prashant.upadhyaya at aricent.com" <prashant.upadhyaya at aricent.com>,
>> "chris.friesen at windriver.com" <chris.friesen at windriver.com>,
>> "He, Yongli" <yongli.he at intel.com>, Itzik Brown <ItzikB at mellanox.com>
>> *Subject: *Re: [openstack-dev] [nova] [neutron] PCI
>> pass-through network support
>>
>> Just to clarify, the discussion is planned for 10 AM
>> Wednesday morning at the developer's lounge.
>>
>> Thanks,
>>
>> Sandhya
>>
>> *From: *Sandhya Dasu <sadasu at cisco.com>
>> *Reply-To: *"OpenStack Development Mailing List (not for
>> usage questions)" <openstack-dev at lists.openstack.org>
>> *Date: *Tuesday, November 5, 2013 11:38 AM
>> *To: *"OpenStack Development Mailing List (not for usage
>> questions)" <openstack-dev at lists.openstack.org>, "Jiang,
>> Yunhong" <yunhong.jiang at intel.com>, "Robert Li (baoli)"
>> <baoli at cisco.com>, Irena Berezovsky <irenab at mellanox.com>,
>> "prashant.upadhyaya at aricent.com" <prashant.upadhyaya at aricent.com>,
>> "chris.friesen at windriver.com" <chris.friesen at windriver.com>,
>> "He, Yongli" <yongli.he at intel.com>, Itzik Brown <ItzikB at mellanox.com>
>> *Subject: *Re: [openstack-dev] [nova] [neutron] PCI
>> pass-through network support
>>
>> *Hi,*
>>
>> * We are planning to have a discussion at the
>> developer's lounge tomorrow morning at 10:00 am. Please
>> feel free to drop by if you are interested.*
>>
>> *Thanks,*
>>
>> *Sandhya*
>>
>> *From: *<Jiang>, Yunhong <yunhong.jiang at intel.com>
>>
>> *Date: *Thursday, October 31, 2013 6:21 PM
>> *To: *"Robert Li (baoli)" <baoli at cisco.com>, Irena Berezovsky
>> <irenab at mellanox.com>, "prashant.upadhyaya at aricent.com"
>> <prashant.upadhyaya at aricent.com>, "chris.friesen at windriver.com"
>> <chris.friesen at windriver.com>, "He, Yongli" <yongli.he at intel.com>,
>> Itzik Brown <ItzikB at mellanox.com>
>> *Cc: *OpenStack Development Mailing List
>> <openstack-dev at lists.openstack.org>, "Brian Bowen (brbowen)"
>> <brbowen at cisco.com>, "Kyle Mestery (kmestery)" <kmestery at cisco.com>,
>> Sandhya Dasu <sadasu at cisco.com>
>> *Subject: *RE: [openstack-dev] [nova] [neutron] PCI
>> pass-through network support
>>
>> Robert, I think your change request for the pci alias should
>> be covered by the extra info enhancement,
>> https://blueprints.launchpad.net/nova/+spec/pci-extra-info, and
>> Yongli is working on it.
>>
>> I'm not sure how the port profile is passed to the
>> connected switch; is it a Cisco VM-FEX specific method or a
>> libvirt method? Sorry, I'm not well versed on the network side.
>>
>> --jyh
>>
>> *From:*Robert Li (baoli) [mailto:baoli at cisco.com]
>> *Sent:* Wednesday, October 30, 2013 10:13 AM
>> *To:* Irena Berezovsky; Jiang, Yunhong;
>> prashant.upadhyaya at aricent.com;
>> chris.friesen at windriver.com; He, Yongli; Itzik Brown
>> *Cc:* OpenStack Development Mailing List; Brian Bowen
>> (brbowen); Kyle Mestery (kmestery); Sandhya Dasu (sadasu)
>> *Subject:* Re: [openstack-dev] [nova] [neutron] PCI
>> pass-through network support
>>
>> Hi,
>>
>> Regarding physical network mapping, this is what I thought.
>>
>> Consider the following scenarios:
>>
>> 1. a compute node with SRIOV-only interfaces attached
>> to a physical network. The node is connected to one
>> upstream switch
>>
>> 2. a compute node with both SRIOV interfaces and
>> non-SRIOV interfaces attached to a physical network. The
>> node is connected to one upstream switch
>>
>> 3. in addition to cases 1 & 2, a compute node may have
>> multiple vNICs that are connected to different upstream
>> switches.
>>
>> CASE 1:
>>
>> -- the mapping from a virtual network (in terms of
>> neutron) to a physical network is actually done by
>> binding a port profile to a neutron port. With cisco's
>> VM-FEX, a port profile is associated with one or multiple
>> vlans. Once the neutron port is bound with this
>> port-profile in the upstream switch, it's effectively
>> plugged into the physical network.
>>
>> -- since the compute node is connected to one upstream
>> switch, the existing nova PCI alias will be sufficient.
>> For example, one can boot a Nova instance that is
>> attached to a SRIOV port with the following command:
>>
>> nova boot --flavor m1.large --image
>> <image-id> --nic
>> net-id=<net>,pci-alias=<alias>,sriov=<direct|macvtap>,port-profile=<profile>
>>
>> the net-id will be useful in terms of allocating an IP
>> address, enabling dhcp, and everything else associated
>> with the network.
>>
>> -- the pci-alias specified in the nova boot command is
>> used to create a PCI request for scheduling purposes. A
>> PCI device is bound to a neutron port during the instance
>> build time in the case of nova boot. Before invoking the
>> neutron API to create a port, an allocated PCI device out
>> of a PCI alias will be located from the PCI device list
>> object. This device info among other information will be
>> sent to neutron to create the port.
>>
>> CASE 2:
>>
>> -- Assume that OVS is used for the non-SRIOV interfaces.
>> An example of configuration with ovs plugin would look like:
>>
>> bridge_mappings = physnet1:br-vmfex
>>
>> network_vlan_ranges = physnet1:15:17
>>
>> tenant_network_type = vlan
>>
>> When a neutron network is created, a vlan is either
>> allocated or specified in the neutron net-create command.
>> Attaching a physical interface to the bridge (in the
>> above example br-vmfex) is an administrative task, as shown below.
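>>
>> That step is typically something like this (a sketch; the
>> interface name eth2 is an assumption):
>>
>> ovs-vsctl add-br br-vmfex
>> ovs-vsctl add-port br-vmfex eth2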
>>
>> -- to create a Nova instance with non-SRIOV port:
>>
>> nova boot --flavor m1.large --image
>> <image-id> --nic net-id=<net>
>>
>> -- to create a Nova instance with SRIOV port:
>>
>> nova boot --flavor m1.large --image
>> <image-id> --nic
>> net-id=<net>,pci-alias=<alias>,sriov=<direct|macvtap>,port-profile=<profile>
>>
>> it's essentially the same as in the first case. But
>> since the net-id is already associated with a vlan, the
>> vlan associated with the port-profile must be identical
>> to that vlan. This has to be enforced by neutron.
>>
>> again, since the node is connected to one upstream
>> switch, the existing nova PCI alias should be sufficient.
>>
>> CASE 3:
>>
>> -- A compute node might be connected to multiple upstream
>> switches, with each being a separate network. This means
>> SRIOV PFs/VFs are already implicitly associated with
>> physical networks. In the non-SRIOV case, a physical
>> interface is associated with a physical network by
>> plugging it into that network, and attaching this
>> interface to the ovs bridge that represents this physical
>> network on the compute node. In the SRIOV case, we need a
>> way to group the SRIOV VFs that belong to the same
>> physical networks. The existing nova PCI alias is to
>> facilitate PCI device allocation by associating
>> <product_id, vendor_id> with an alias name. This will no
>> longer be sufficient. But it can be enhanced to achieve
>> our goal. For example, the PCI device domain, bus (if
>> their mapping to vNIC is fixed across boot) may be added
>> into the alias, and the alias name should
>> correspond to a list of tuples.
>>
>> Another consideration is that a VF or PF might be used on
>> the host for other purposes. For example, it's possible
>> for a neutron DHCP server to be bound with a VF.
>> Therefore, there needs to be a method to exclude some VFs
>> from a group. One way is to associate an exclude list with
>> an alias, as sketched below.
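>>
>> A hypothetical enhanced alias entry along those lines might
>> look like this (illustrative only; none of these field names
>> are agreed syntax):
>>
>> pci_alias = {"name": "physnet1-vfs",
>>              "devices": [{"vendor_id": "8086", "product_id": "10ed",
>>                           "address": "0000:06:*.*"}],
>>              "exclude": ["0000:06:10.0"]}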
>>
>> The enhanced PCI alias can be used to support features
>> other than neutron as well. Essentially, a PCI alias can
>> be defined as a group of PCI devices associated with a
>> feature. I'd think that this should be addressed with a
>> separate blueprint.
>>
>> Thanks,
>>
>> Robert
>>
>> On 10/30/13 12:59 AM, "Irena Berezovsky" <irenab at mellanox.com> wrote:
>>
>> Hi,
>>
>> Please see my answers inline
>>
>> *From:*Jiang, Yunhong [mailto:yunhong.jiang at intel.com]
>> *Sent:* Tuesday, October 29, 2013 10:17 PM
>> *To:* Irena Berezovsky; Robert Li (baoli);
>> prashant.upadhyaya at aricent.com;
>> chris.friesen at windriver.com; He, Yongli;
>> Itzik Brown
>> *Cc:* OpenStack Development Mailing List; Brian Bowen
>> (brbowen); Kyle Mestery (kmestery); Sandhya Dasu (sadasu)
>> *Subject:* RE: [openstack-dev] [nova] [neutron] PCI
>> pass-through network support
>>
>> Your explanation of the virtual network and physical
>> network is quite clear and should work well. We need to
>> change the nova code to achieve it, including getting the
>> physical network for the virtual network, passing the
>> physical network requirement to the filter properties,
>> etc.
>>
>> */[IrenaB]/* The physical network is already
>> available to nova (at networking/nova/api) as a
>> virtual network attribute; it is then passed to the VIF
>> driver. We will soon push the fix
>> for https://bugs.launchpad.net/nova/+bug/1239606,
>> which will provide general support for getting this
>> information.
>>
>> For your port method, you mean we are sure to
>> pass the network id to 'nova boot' and nova will
>> create the port during VM boot, am I right? Also,
>> how does nova know that it needs to allocate the PCI
>> device for the port? I'd suppose that in an SR-IOV NIC
>> environment, the user doesn't need to specify the PCI
>> requirement. Instead, the PCI requirement should come
>> from the network configuration and image property. Or
>> do you think the user still needs to pass a flavor
>> with a pci request?
>>
>> */[IrenaB] There are two ways to apply the port method.
>> One is to pass the network id on nova boot and use the
>> default type as chosen in the neutron config file for
>> the vnic type. The other way is to define a port with
>> the required vnic type and other properties if
>> applicable, and run 'nova boot' with the port id
>> argument. Going forward with nova support for PCI
>> device awareness, we do need a way to impact the
>> scheduler choice to land the VM on a suitable Host
>> with an available PCI device that has the required
>> connectivity./*
>>
>> --jyh
>>
>> *From:*Irena Berezovsky [mailto:irenab at mellanox.com]
>> *Sent:* Tuesday, October 29, 2013 3:17 AM
>> *To:* Jiang, Yunhong; Robert Li (baoli);
>> prashant.upadhyaya at aricent.com;
>> chris.friesen at windriver.com; He, Yongli;
>> Itzik Brown
>> *Cc:* OpenStack Development Mailing List; Brian Bowen
>> (brbowen); Kyle Mestery (kmestery); Sandhya Dasu (sadasu)
>> *Subject:* RE: [openstack-dev] [nova] [neutron] PCI
>> pass-through network support
>>
>> Hi Jiang, Robert,
>>
>> IRC meeting option works for me.
>>
>> If I understand your question below, you are looking
>> for a way to tie together the requested virtual
>> network(s) and the requested PCI device(s). The way we
>> did it in our solution is to map a
>> provider:physical_network to an interface that
>> represents the Physical Function. Every virtual
>> network is bound to the provider:physical_network, so
>> the PCI device should be allocated based on this
>> mapping. We can map a PCI alias to the
>> provider:physical_network.
>>
>> Another topic to discuss is where the mapping between
>> neutron port and PCI device should be managed. One
>> way to solve it is to propagate the allocated PCI
>> device details to neutron on port creation.
>>
>> In case there is no qbg/qbh support, VF networking
>> configuration should be applied locally on the Host.
>>
>> The question is when and how to apply networking
>> configuration on the PCI device?
>>
>> We see the following options:
>>
>> * It can be done on port creation.
>>
>> * It can be done when the nova VIF driver is called for
>> vNIC plugging. This will require having all
>> networking configuration available to the VIF driver,
>> or sending a request to the neutron server to obtain it.
>>
>> * It can be done by having a dedicated L2 neutron
>> agent on each Host that scans for allocated PCI
>> devices and then retrieves networking configuration
>> from the server and configures the device. The agent
>> will also be responsible for managing update requests
>> coming from the neutron server.
>>
>> For macvtap vNIC type assignment, the networking
>> configuration can be applied by a dedicated L2
>> neutron agent.
>>
>> BR,
>>
>> Irena
>>
>> *From:*Jiang, Yunhong [mailto:yunhong.jiang at intel.com]
>> *Sent:* Tuesday, October 29, 2013 9:04 AM
>>
>> *To:* Robert Li (baoli); Irena Berezovsky;
>> prashant.upadhyaya at aricent.com;
>> chris.friesen at windriver.com; He, Yongli;
>> Itzik Brown
>> *Cc:* OpenStack Development Mailing List; Brian Bowen
>> (brbowen); Kyle Mestery (kmestery); Sandhya Dasu (sadasu)
>> *Subject:* RE: [openstack-dev] [nova] [neutron] PCI
>> pass-through network support
>>
>> Robert, is it possible to have an IRC meeting? I'd
>> prefer an IRC meeting because it's more openstack
>> style and also keeps the minutes clearly.
>>
>> For your flow, can you give a more detailed example?
>> For example, I can consider the user specifying the
>> instance with the --nic option specifying a network id;
>> then how does nova derive the requirement for the PCI
>> device? I assume the network id should define the
>> switches that the device can connect to, but how is that
>> information translated to the PCI property
>> requirement? Will this translation happen before the
>> nova scheduler makes the host decision?
>>
>> Thanks
>>
>> --jyh
>>
>> *From:*Robert Li (baoli) [mailto:baoli at cisco.com]
>> *Sent:* Monday, October 28, 2013 12:22 PM
>> *To:* Irena Berezovsky;
>> prashant.upadhyaya at aricent.com; Jiang,
>> Yunhong; chris.friesen at windriver.com; He, Yongli;
>> Itzik Brown
>> *Cc:* OpenStack Development Mailing List; Brian Bowen
>> (brbowen); Kyle Mestery (kmestery); Sandhya Dasu (sadasu)
>> *Subject:* Re: [openstack-dev] [nova] [neutron] PCI
>> pass-through network support
>>
>> Hi Irena,
>>
>> Thank you very much for your comments. See inline.
>>
>> --Robert
>>
>> On 10/27/13 3:48 AM, "Irena Berezovsky" <irenab at mellanox.com> wrote:
>>
>> Hi Robert,
>>
>> Thank you very much for sharing the information
>> regarding your efforts. Can you please share your
>> idea of the end-to-end flow? How do you suggest
>> binding Nova and Neutron?
>>
>> The end-to-end flow is actually encompassed in the
>> blueprints in a nutshell. I will reiterate it
>> below. The binding between Nova and Neutron occurs
>> with the neutron v2 API that nova invokes in order to
>> provision the neutron services. The vif driver is
>> responsible for plugging an instance into the
>> networking setup that neutron has created on the host.
>>
>> Normally, one will invoke the "nova boot" api with the
>> --nic option to specify the nic with which the
>> instance will be connected to the network. It
>> currently allows net-id, fixed ip and/or port-id to
>> be specified for the option. However, it doesn't
>> allow one to specify special networking requirements
>> for the instance. Thanks to the nova pci-passthrough
>> work, one can specify PCI passthrough device(s) in
>> the nova flavor. But it doesn't provide means to tie
>> up these PCI devices, in the case of ethernet adapters,
>> with networking services. Therefore the idea is
>> actually simple, as indicated by the blueprint titles:
>> to provide means to tie up SRIOV devices with neutron
>> services. A work flow would roughly look like this
>> for 'nova boot':
>>
>> -- Specifies networking requirements in the
>> --nic option. Specifically for SRIOV, allow the
>> following to be specified in addition to the existing
>> required information:
>>
>> . PCI alias
>>
>> . direct pci-passthrough/macvtap
>>
>> . port profileid that is compliant
>> with 802.1Qbh
>>
>> The above information is optional. In its
>> absence, the existing behavior remains.
>>
>> -- if special networking requirements exist, the
>> Nova api creates PCI requests in the nova instance
>> type for scheduling purposes
>>
>> -- Nova scheduler schedules the instance based
>> on the requested flavor plus the PCI requests that
>> are created for networking.
>>
>> -- Nova compute invokes neutron services with
>> PCI passthrough information if any
>>
>> -- Neutron performs its normal operations based
>> on the request, such as allocating a port, assigning
>> ip addresses, etc. Specific to SRIOV, it should
>> validate information such as the profileid, and
>> store it in its db. It's also possible to
>> associate a port profileid with a neutron network so
>> that the port profileid becomes optional in the --nic
>> option. Neutron returns nova the port information,
>> especially for PCI passthrough related information in
>> the port binding object. Currently, the port binding
>> object contains the following information:
>>
>> binding:vif_type
>>
>> binding:host_id
>>
>> binding:profile
>>
>> binding:capabilities
>>
>> -- nova constructs the domain xml and plugs in the
>> instance by calling the vif driver. The vif driver
>> can build up the interface xml based on the port
>> binding information, as sketched below.
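>>
>> For a direct-assigned VF, that interface xml would look
>> roughly like this (a libvirt sketch; the address and vlan
>> values are assumptions and would come from the allocated
>> PCI device and the network):
>>
>> <interface type='hostdev' managed='yes'>
>>   <source>
>>     <address type='pci' domain='0x0000' bus='0x06' slot='0x10' function='0x0'/>
>>   </source>
>>   <vlan><tag id='100'/></vlan>
>> </interface>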
>>
>> The blueprints you registered make sense. On the Nova
>> side, there is a need to bind the requested
>> virtual network to the PCI device/interface to be
>> allocated as a vNIC.
>>
>> On the Neutron side, there is a need to support
>> networking configuration of the vNIC. Neutron
>> should be able to identify the PCI device/macvtap
>> interface in order to apply configuration. I
>> think it makes sense to provide neutron
>> integration via a dedicated Modular Layer 2
>> Mechanism Driver to allow PCI pass-through vNIC
>> support along with other networking technologies.
>>
>> I haven't sorted through this yet. A neutron port
>> could be associated with a PCI device or not, which
>> is a common feature, IMHO. However, an ML2 driver
>> specific to a particular SRIOV technology may be needed.
>>
>> During the Havana release, we introduced the Mellanox
>> Neutron plugin that enables networking via SRIOV
>> pass-through devices or macvtap interfaces.
>>
>> We want to integrate our solution with PCI
>> pass-through Nova support. I will be glad to
>> share more details if you are interested.
>>
>> Good to know that you already have a SRIOV
>> implementation. I found out some information online
>> about the mlnx plugin, but need more time to get to
>> know it better. And certainly I'm interested in
>> knowing its details.
>>
>> The PCI pass-through networking support is
>> planned to be discussed during the summit:
>> http://summit.openstack.org/cfp/details/129. I
>> think it's worth drilling down into a more detailed
>> proposal and presenting it during the summit,
>> especially since it impacts both the nova and neutron
>> projects.
>>
>> I agree. Maybe we can steal some time in that discussion.
>>
>> Would you be interested in collaboration on this
>> effort? Would you be interested in exchanging more
>> emails or setting up an IRC/WebEx meeting this
>> week before the summit?
>>
>> Sure. If folks want to discuss it before the summit,
>> we can schedule a webex later this week. Otherwise,
>> we can continue the discussion over email.
>>
>> Regards,
>>
>> Irena
>>
>> *From:*Robert Li (baoli) [mailto:baoli at cisco.com]
>> *Sent:* Friday, October 25, 2013 11:16 PM
>> *To:* prashant.upadhyaya at aricent.com; Irena
>> Berezovsky; yunhong.jiang at intel.com;
>> chris.friesen at windriver.com; yongli.he at intel.com
>> *Cc:* OpenStack Development Mailing List; Brian
>> Bowen (brbowen); Kyle Mestery (kmestery); Sandhya
>> Dasu (sadasu)
>> *Subject:* Re: [openstack-dev] [nova] [neutron]
>> PCI pass-through network support
>>
>> Hi Irena,
>>
>> This is Robert Li from Cisco Systems. Recently, I
>> was tasked to investigate such support for
>> Cisco's systems that support VM-FEX, which is an
>> SRIOV technology supporting 802.1Qbh. I was able
>> to bring up nova instances with SRIOV interfaces,
>> and establish networking between the instances
>> that employ the SRIOV interfaces. Certainly,
>> this was accomplished with hacking and some
>> manual intervention. Based on this experience and
>> my study of the two existing nova
>> pci-passthrough blueprints that have been
>> implemented and committed into Havana
>> (https://blueprints.launchpad.net/nova/+spec/pci-passthrough-base and
>> https://blueprints.launchpad.net/nova/+spec/pci-passthrough-libvirt),
>> I registered a couple of blueprints (one on Nova
>> side, the other on the Neutron side):
>>
>> https://blueprints.launchpad.net/nova/+spec/pci-passthrough-sriov
>>
>> https://blueprints.launchpad.net/neutron/+spec/pci-passthrough-sriov
>>
>> in order to address SRIOV support in openstack.
>>
>> Please take a look at them and see if they make
>> sense, and let me know of any comments and
>> questions. We can also discuss this at the
>> summit, I suppose.
>>
>> I noticed that there is another thread on this
>> topic, so I am copying those folks from that thread as well.
>>
>> thanks,
>>
>> Robert
>>
>> On 10/16/13 4:32 PM, "Irena Berezovsky" <irenab at mellanox.com> wrote:
>>
>> Hi,
>>
>> As one of the next steps for PCI pass-through,
>> I would like to discuss the support for
>> PCI pass-through vNICs.
>>
>> While nova takes care of PCI pass-through
>> device resource management and VIF
>> settings, neutron should manage their
>> networking configuration.
>>
>> I would like to register a summit proposal to
>> discuss the support for PCI pass-through
>> networking.
>>
>> I am not sure what would be the right topic
>> under which to discuss PCI pass-through networking,
>> since it involves both nova and neutron.
>>
>> There is already a session registered by
>> Yongli on nova topic to discuss the PCI
>> pass-through next steps.
>>
>> I think PCI pass-through networking is quite
>> a big topic, and it is worth having a separate
>> discussion.
>>
>> Are there any other people who are interested
>> in discussing it and sharing their thoughts and
>> experience?
>>
>> Regards,
>>
>> Irena
>>