[openstack-dev] [nova] [neutron] PCI pass-through network support

Jiang, Yunhong yunhong.jiang at intel.com
Tue Oct 29 20:31:00 UTC 2013


Henry, why do you think the "service VM" needs the entire PF instead of a VF? I think the SR-IOV NIC should provide QoS and performance isolation.

As for assigning an entire PCI device to a guest, that should be OK, since the PF and VF usually have different device IDs. The tricky thing is that, at least for some PCI devices, you can't configure some NICs to have SR-IOV enabled while others do not.
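
A rough sketch of the device ID point, assuming the Intel 82599 PF/VF ID pair purely as an example; the device list and the helper name are made up for illustration and this is not nova's actual whitelist code:

```python
# Sketch only: select PFs or VFs purely by (vendor_id, product_id).
# The device list and helper name are hypothetical; the IDs are the
# Intel 82599 PF (10fb) and its VF (10ed), used only as an example.

PF_ID = ("8086", "10fb")
VF_ID = ("8086", "10ed")


def select_devices(devices, wanted_id):
    """Return addresses of devices whose (vendor, product) pair matches."""
    return [d["address"] for d in devices
            if (d["vendor_id"], d["product_id"]) == wanted_id]


devices = [
    {"address": "0000:09:00.0", "vendor_id": "8086", "product_id": "10fb"},
    {"address": "0000:09:10.0", "vendor_id": "8086", "product_id": "10ed"},
    {"address": "0000:09:10.2", "vendor_id": "8086", "product_id": "10ed"},
]

pf_addresses = select_devices(devices, PF_ID)
vf_addresses = select_devices(devices, VF_ID)
```

Because the two IDs differ, a filter keyed on the ID alone can hand out either whole PFs or individual VFs without extra bookkeeping.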

Thanks
--jyh

> -----Original Message-----
> From: Henry Gessau [mailto:gessau at cisco.com]
> Sent: Tuesday, October 29, 2013 8:10 AM
> To: OpenStack Development Mailing List (not for usage questions)
> Subject: Re: [openstack-dev] [nova] [neutron] PCI pass-through network
> support
> 
> Lots of great info and discussion going on here.
> 
> One additional thing I would like to mention is regarding PF and VF usage.
> 
> Normally VFs will be assigned to instances, and the PF will either not be
> used at all, or maybe some agent in the host of the compute node might have
> access to the PF for something (management?).
> 
> There is a neutron design track around the development of "service VMs".
> These are dedicated instances that run neutron services like routers,
> firewalls, etc. It is plausible that a service VM would like to use PCI
> passthrough and get the entire PF. This would allow it to have complete
> control over a physical link, which I think will be wanted in some cases.
> 
> --
> Henry
> 
> On Tue, Oct 29, at 10:23 am, Irena Berezovsky <irenab at mellanox.com>
> wrote:
> 
> > Hi,
> >
> > I would like to share some details regarding the support provided by the
> > Mellanox plugin. It enables networking via SRIOV pass-through devices or
> > macvtap interfaces. The plugin is available here:
> > https://github.com/openstack/neutron/tree/master/neutron/plugins/mlnx.
> >
> > To support both PCI pass-through device and macvtap interface types of
> > vNICs, we set the neutron port's profile:vnic_type according to the required
> > VIF type and then use the created port to 'nova boot' the VM.
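
A minimal sketch of what such a port request could look like, assuming the vNIC type travels as a "vnic_type" key inside the port's binding:profile dictionary. The value "hostdev" and the network id are placeholders, not confirmed by the message above:

```python
import json


def port_create_body(network_id, vnic_type):
    """Build a port-create request body carrying the vNIC type in binding:profile."""
    return {
        "port": {
            "network_id": network_id,
            # Assumption: the plugin reads the vNIC type from this key.
            "binding:profile": {"vnic_type": vnic_type},
        }
    }


body = port_create_body("NET_ID_PLACEHOLDER", "hostdev")
payload = json.dumps(body, sort_keys=True)
```

The id of the port created from such a body would then be passed to 'nova boot' instead of a bare network id.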
> >
> > To overcome the missing scheduler awareness for PCI devices, which was not
> > yet part of the Havana release, we have an additional service (embedded
> > switch daemon) that runs on each compute node.
> >
> > This service manages SRIOV resource allocation, answers vNIC discovery
> > queries and applies VLAN/MAC configuration using standard Linux APIs (code
> > is here: https://github.com/mellanox-openstack/mellanox-eswitchd). The
> > embedded switch daemon serves as a glue layer between the VIF driver and the
> > Neutron agent.
> >
> > In the Icehouse release, when SRIOV resource allocation is already part of
> > Nova, we plan to eliminate the need for the embedded switch daemon service.
> > So what is left to figure out is how to tie a neutron port to a PCI device
> > and invoke the networking configuration.
> >
> >
> >
> > In our case what we have is actually a hardware VEB that is not programmed
> > via either 802.1Qbg or 802.1Qbh, but configured locally by the Neutron
> > agent. We also support both Ethernet and InfiniBand as the physical network
> > L2 technology. This means that we apply different configuration commands to
> > configure the VF.
> >
> >
> >
> > I guess what we have to figure out is how to support the generic case for
> > PCI device networking support, covering the HW VEB, 802.1Qbg and 802.1Qbh
> > cases.
> >
> >
> >
> > BR,
> >
> > Irena
> >
> >
> >
> > *From:*Robert Li (baoli) [mailto:baoli at cisco.com]
> > *Sent:* Tuesday, October 29, 2013 3:31 PM
> > *To:* Jiang, Yunhong; Irena Berezovsky; prashant.upadhyaya at aricent.com;
> > chris.friesen at windriver.com; He, Yongli; Itzik Brown
> > *Cc:* OpenStack Development Mailing List; Brian Bowen (brbowen); Kyle
> > Mestery (kmestery); Sandhya Dasu (sadasu)
> > *Subject:* Re: [openstack-dev] [nova] [neutron] PCI pass-through network
> > support
> >
> >
> >
> > Hi Yunhong,
> >
> >
> >
> > I haven't looked at Mellanox in much detail. I think that we'll get more
> > details from Irena down the road. Regarding your question, I can only answer
> > based on my experience with Cisco's VM-FEX. In a nutshell:
> >
> >      -- a vNIC is connected to an external switch. Once the host is booted
> > up, all the PFs and VFs provisioned on the vNIC will be created, as well as
> > all the corresponding ethernet interfaces.
> >
> >      -- As far as Neutron is concerned, a neutron port can be associated
> > with a VF. One way to do so is to specify this requirement in the --nic
> > option, providing information such as:
> >
> >                . PCI alias (this is the same alias as defined in your nova
> > blueprints)
> >
> >                . direct pci-passthrough/macvtap
> >
> >                . port profileid that is compliant with 802.1Qbh
> >
> >      -- similar to how you translate the nova flavor with PCI requirements
> > to PCI requests for scheduling purposes, Nova API (the nova api component)
> > can translate the above to PCI requests for scheduling purposes. I can give
> > more detail on this later.
> >
> >
> >
> > Regarding your last question, since the vNIC is already connected with the
> > external switch, the vNIC driver will be responsible for communicating the
> > port profile to the external switch. As you already know, libvirt
> > provides several ways to specify a VM to be booted up with SRIOV. For
> > example, in the following interface definition:
> >
> >
> >
> >   <interface type='hostdev' managed='yes'>
> >     <source>
> >       <address type='pci' domain='0' bus='0x09' slot='0x0' function='0x01'/>
> >     </source>
> >     <mac address='01:23:45:67:89:ab' />
> >     <virtualport type='802.1Qbh'>
> >       <parameters profileid='my-port-profile' />
> >     </virtualport>
> >   </interface>
> >
> >
> >
> > The SRIOV VF (bus 0x09, VF 0x01) will be allocated, and the port profile
> > 'my-port-profile' will be used to provision this VF. Libvirt will be
> > responsible for invoking the vNIC driver to configure this VF with the port
> > profile my-port-profile. The driver will talk to the external switch using
> > the 802.1Qbh standard to complete the VF's configuration and binding with
> > the VM.
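
A minimal sketch of how such an interface definition could be generated, using only the Python standard library. The function name and the 'dddd:bb:ss.f' address input format are assumptions for illustration, and the MAC address is an arbitrary example; this is not the actual nova vif driver code:

```python
import xml.etree.ElementTree as ET


def build_hostdev_interface(pci_address, mac, profileid):
    """Build a libvirt <interface type='hostdev'> element for an 802.1Qbh VF.

    pci_address is a 'dddd:bb:ss.f' string (hypothetical input format).
    """
    domain, bus, rest = pci_address.split(":")
    slot, function = rest.split(".")
    iface = ET.Element("interface", {"type": "hostdev", "managed": "yes"})
    source = ET.SubElement(iface, "source")
    ET.SubElement(source, "address", {
        "type": "pci",
        "domain": "0x" + domain,
        "bus": "0x" + bus,
        "slot": "0x" + slot,
        "function": "0x" + function,
    })
    ET.SubElement(iface, "mac", {"address": mac})
    vport = ET.SubElement(iface, "virtualport", {"type": "802.1Qbh"})
    ET.SubElement(vport, "parameters", {"profileid": profileid})
    return iface


iface = build_hostdev_interface("0000:09:00.1", "52:54:00:12:34:56",
                                "my-port-profile")
xml_text = ET.tostring(iface, encoding="unicode")
```

Only the PCI address, the MAC and the profile id vary per port, which is why those three values are the natural things to carry in the port binding information.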
> >
> >
> >
> > Now that nova PCI passthrough is responsible for
> > discovering/scheduling/allocating a VF, the rest of the puzzle is to
> > associate this PCI device with the feature that's going to use it, and the
> > feature will be responsible for configuring it. You can also see from the
> > above example that, in one implementation of SRIOV, the feature (in this
> > case neutron) may not need to do much in terms of working with the external
> > switch; the work is actually done by libvirt behind the scenes.
> >
> >
> >
> > Now the questions are:
> >
> >         -- how the port profile gets defined/managed
> >
> >         -- how the port profile gets associated with a neutron network
> >
> > The first question will be specific to the particular product, and
> > therefore a particular neutron plugin has to manage that.
> >
> > There may be several approaches to address the second question. For
> > example, in the simplest case, a port profile can be associated with a
> > neutron network. This has some significant drawbacks. Since the port
> > profile defines features for all the ports that use it, the one port profile
> > to one neutron network mapping would mean all the ports on the network
> > will have exactly the same features (for example, QoS characteristics). To
> > make it flexible, the binding of a port profile to a port may be done at
> > port creation time.
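
A minimal sketch of one way to combine the two approaches, assuming a per-network default profile that a port may override at creation time. All names and profile values here are hypothetical:

```python
def resolve_port_profile(port, network_defaults):
    """Pick the profile for a port.

    An explicit per-port profileid wins; otherwise fall back to the
    network-level default, which may be absent (None).
    """
    explicit = port.get("profileid")
    if explicit is not None:
        return explicit
    return network_defaults.get(port["network_id"])


# Hypothetical data: one network has a default profile, one does not.
network_defaults = {"net-a": "gold-qos"}

port_default = {"network_id": "net-a"}
port_override = {"network_id": "net-a", "profileid": "bronze-qos"}
port_unbound = {"network_id": "net-b"}

profile_default = resolve_port_profile(port_default, network_defaults)
profile_override = resolve_port_profile(port_override, network_defaults)
profile_unbound = resolve_port_profile(port_unbound, network_defaults)
```

With this kind of lookup, two ports on the same network can end up with different QoS characteristics, which the strict one-profile-per-network mapping cannot express.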
> >
> >
> >
> > Let me know if the above answered your question.
> >
> >
> >
> > thanks,
> >
> > Robert
> >
> >
> >
> > On 10/29/13 3:03 AM, "Jiang, Yunhong" <yunhong.jiang at intel.com
> > <mailto:yunhong.jiang at intel.com>> wrote:
> >
> >
> >
> >     Robert, is it possible to have an IRC meeting? I'd prefer an IRC meeting
> >     because it's more OpenStack style and also keeps the minutes clearly.
> >
> >
> >
> >     On your flow, can you give a more detailed example? For example, I can
> >     imagine the user specifying the instance with the --nic option giving a
> >     network id, and then how does nova derive the requirement for the PCI
> >     device? I assume the network id should define the switches that the
> >     device can connect to, but how is that information translated to the PCI
> >     property requirement? Will this translation happen before the nova
> >     scheduler makes the host decision?
> >
> >
> >
> >     Thanks
> >
> >     --jyh
> >
> >
> >
> >     *From:*Robert Li (baoli) [mailto:baoli at cisco.com]
> >     *Sent:* Monday, October 28, 2013 12:22 PM
> >     *To:* Irena Berezovsky; prashant.upadhyaya at aricent.com
> >     <mailto:prashant.upadhyaya at aricent.com>; Jiang, Yunhong;
> >     chris.friesen at windriver.com
> <mailto:chris.friesen at windriver.com>; He,
> >     Yongli; Itzik Brown
> >     *Cc:* OpenStack Development Mailing List; Brian Bowen
> (brbowen); Kyle
> >     Mestery (kmestery); Sandhya Dasu (sadasu)
> >     *Subject:* Re: [openstack-dev] [nova] [neutron] PCI pass-through
> network
> >     support
> >
> >
> >
> >     Hi Irena,
> >
> >
> >
> >     Thank you very much for your comments. See inline.
> >
> >
> >
> >     --Robert
> >
> >
> >
> >     On 10/27/13 3:48 AM, "Irena Berezovsky" <irenab at mellanox.com
> >     <mailto:irenab at mellanox.com>> wrote:
> >
> >
> >
> >         Hi Robert,
> >
> >         Thank you very much for sharing the information regarding
> your
> >         efforts. Can you please share your idea of the end to end flow?
> How
> >         do you suggest  to bind Nova and Neutron?
> >
> >
> >
> >     The end to end flow is actually encompassed in the blueprints in a
> >     nutshell. I will reiterate it below. The binding between Nova and
> >     Neutron occurs with the neutron v2 API that nova invokes in order to
> >     provision the neutron services. The vif driver is responsible for
> >     plugging an instance into the networking setup that neutron has
> >     created on the host.
> >
> >
> >
> >     Normally, one will invoke the "nova boot" api with the --nic options to
> >     specify the nic with which the instance will be connected to the
> >     network. It currently allows net-id, fixed ip and/or port-id to be
> >     specified for the option. However, it doesn't allow one to specify
> >     special networking requirements for the instance. Thanks to the nova
> >     pci-passthrough work, one can specify PCI passthrough device(s) in the
> >     nova flavor. But it doesn't provide means to tie up these PCI devices,
> >     in the case of ethernet adaptors, with networking services. Therefore
> >     the idea is actually simple, as indicated by the blueprint titles: to
> >     provide means to tie up SRIOV devices with neutron services. A work flow
> >     would roughly look like this for 'nova boot':
> >
> >
> >
> >           -- Specifies networking requirements in the --nic option.
> >     Specifically for SRIOV, allow the following to be specified in addition
> >     to the existing required information:
> >
> >                    . PCI alias
> >
> >                    . direct pci-passthrough/macvtap
> >
> >                    . port profileid that is compliant with 802.1Qbh
> >
> >
> >
> >             The above information is optional. In its absence, the
> >     existing behavior remains.
> >
> >
> >
> >          -- if special networking requirements exist, Nova api creates PCI
> >     requests in the nova instance type for scheduling purposes
> >
> >
> >
> >          -- Nova scheduler schedules the instance based on the requested
> >     flavor plus the PCI requests that are created for networking.
> >
> >
> >
> >          -- Nova compute invokes neutron services with PCI passthrough
> >     information if any
> >
> >
> >
> >          --  Neutron performs its normal operations based on the request,
> >     such as allocating a port, assigning ip addresses, etc. Specific to
> >     SRIOV, it should validate information such as the profileid, and store
> >     it in its db. It's also possible to associate a port profileid with a
> >     neutron network so that the port profileid becomes optional in the --nic
> >     option. Neutron returns the port information to nova, in particular the
> >     PCI passthrough related information in the port binding object.
> >     Currently, the port binding object contains the following information:
> >
> >               binding:vif_type
> >
> >               binding:host_id
> >
> >               binding:profile
> >
> >               binding:capabilities
> >
> >
> >
> >         -- nova constructs the domain xml and plugs in the instance by
> >     calling the vif driver. The vif driver can build up the interface xml
> >     based on the port binding information.
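
The first three steps of the flow above can be sketched roughly as follows. This is a toy model under stated assumptions: the field names ("pci_alias", "vnic_type", "profileid"), the request shape and the per-host free-device counts are all hypothetical and do not reflect nova's real data structures:

```python
def nic_to_pci_requests(nics):
    """Translate --nic entries that carry a PCI alias into PCI requests."""
    requests = []
    for nic in nics:
        alias = nic.get("pci_alias")
        if alias is None:
            continue  # plain NIC: existing behavior, no PCI request
        requests.append({"alias": alias, "count": 1})
    return requests


def host_passes(free_devices_by_alias, requests):
    """Return True if the host has enough free devices for every request."""
    needed = {}
    for req in requests:
        needed[req["alias"]] = needed.get(req["alias"], 0) + req["count"]
    return all(free_devices_by_alias.get(alias, 0) >= count
               for alias, count in needed.items())


# Two requested NICs: one SRIOV NIC and one ordinary NIC.
nics = [
    {"net_id": "net-a", "pci_alias": "sriov_vf",
     "vnic_type": "direct", "profileid": "my-port-profile"},
    {"net_id": "net-b"},
]
requests = nic_to_pci_requests(nics)

# Hypothetical per-host counts of free devices, keyed by alias.
hosts = {"compute1": {"sriov_vf": 0}, "compute2": {"sriov_vf": 4}}
candidates = [name for name, free in hosts.items()
              if host_passes(free, requests)]
```

The point of doing the translation in the API layer is that the scheduler only ever sees generic PCI requests and needs no networking knowledge.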
> >
> >
> >
> >
> >
> >
> >
> >
> >
> >         The blueprints you registered make sense. On the Nova side, there is
> >         a need to bind the requested virtual network to the PCI
> >         device/interface to be allocated as vNIC.
> >
> >         On the Neutron side, there is a need to support networking
> >         configuration of the vNIC. Neutron should be able to identify the
> >         PCI device/macvtap interface in order to apply configuration. I
> >         think it makes sense to provide neutron integration via a dedicated
> >         Modular Layer 2 Mechanism Driver to allow PCI pass-through vNIC
> >         support along with other networking technologies.
> >
> >
> >
> >     I haven't sorted through this yet. A neutron port could be associated
> >     with a PCI device or not, which is a common feature, IMHO. However, an
> >     ML2 driver specific to a particular SRIOV technology may be needed.
> >
> >
> >
> >
> >
> >         During the Havana release, we introduced the Mellanox Neutron plugin
> >         that enables networking via SRIOV pass-through devices or macvtap
> >         interfaces.
> >
> >         We want to integrate our solution with PCI pass-through Nova
> >         support.  I will be glad to share more details if you are interested.
> >
> >
> >
> >
> >
> >     Good to know that you already have an SRIOV implementation. I found
> >     some information online about the mlnx plugin, but need more time to get
> >     to know it better. And certainly I'm interested in knowing its details.
> >
> >
> >
> >         PCI pass-through networking support is planned to be discussed
> >         during the summit: http://summit.openstack.org/cfp/details/129. I
> >         think it's worth drilling down into a more detailed proposal and
> >         presenting it during the summit, especially since it impacts both
> >         the nova and neutron projects.
> >
> >
> >
> >     I agree. Maybe we can steal some time in that discussion.
> >
> >
> >
> >         Would you be interested in collaborating on this effort? Would you
> >         be interested in exchanging more emails or setting up an IRC/WebEx
> >         meeting this week before the summit?
> >
> >
> >
> >     Sure. If folks want to discuss it before the summit, we can schedule a
> >     webex later this week. Otherwise, we can continue the discussion by
> >     email.
> >
> >
> >
> >
> >
> >
> >
> >         Regards,
> >
> >         Irena
> >
> >
> >
> >         *From:*Robert Li (baoli) [mailto:baoli at cisco.com]
> >         *Sent:* Friday, October 25, 2013 11:16 PM
> >         *To:* prashant.upadhyaya at aricent.com
> >         <mailto:prashant.upadhyaya at aricent.com>; Irena Berezovsky;
> >         yunhong.jiang at intel.com <mailto:yunhong.jiang at intel.com>;
> >         chris.friesen at windriver.com
> <mailto:chris.friesen at windriver.com>;
> >         yongli.he at intel.com <mailto:yongli.he at intel.com>
> >         *Cc:* OpenStack Development Mailing List; Brian Bowen
> (brbowen);
> >         Kyle Mestery (kmestery); Sandhya Dasu (sadasu)
> >         *Subject:* Re: [openstack-dev] [nova] [neutron] PCI
> pass-through
> >         network support
> >
> >
> >
> >         Hi Irena,
> >
> >
> >
> >         This is Robert Li from Cisco Systems. Recently, I was tasked to
> >         investigate such support for Cisco's systems that support VM-FEX,
> >         which is an SRIOV technology supporting 802.1Qbh. I was able to bring
> >         up nova instances with SRIOV interfaces, and establish networking
> >         between the instances that employ the SRIOV interfaces. Certainly,
> >         this was accomplished with hacking and some manual intervention.
> >         Based on this experience and my study of the two existing nova
> >         pci-passthrough blueprints that have been implemented and committed
> >         into Havana
> >         (https://blueprints.launchpad.net/nova/+spec/pci-passthrough-base and
> >         https://blueprints.launchpad.net/nova/+spec/pci-passthrough-libvirt),
> >         I registered a couple of blueprints (one on the Nova side, the other
> >         on the Neutron side):
> >
> >
> >
> >         https://blueprints.launchpad.net/nova/+spec/pci-passthrough-sriov
> >
> >         https://blueprints.launchpad.net/neutron/+spec/pci-passthrough-sriov
> >
> >
> >
> >         in order to address SRIOV support in openstack.
> >
> >
> >
> >         Please take a look at them and see if they make sense, and let me
> >         know any comments and questions. We can also discuss this at the
> >         summit, I suppose.
> >
> >
> >
> >         I noticed that there is another thread on this topic, so I am copying
> >         the folks from that thread as well.
> >
> >
> >
> >         thanks,
> >
> >         Robert
> >
> >
> >
> >         On 10/16/13 4:32 PM, "Irena Berezovsky"
> <irenab at mellanox.com
> >         <mailto:irenab at mellanox.com>> wrote:
> >
> >
> >
> >             Hi,
> >
> >             As one of the next steps for PCI pass-through, I would like to
> >             discuss support for PCI pass-through vNICs.
> >
> >             While nova takes care of PCI pass-through device resource
> >             management and VIF settings, neutron should manage their
> >             networking configuration.
> >
> >             I would like to register a summit proposal to discuss the support
> >             for PCI pass-through networking.
> >
> >             I am not sure what would be the right topic under which to discuss
> >             PCI pass-through networking, since it involves both nova and neutron.
> >
> >             There is already a session registered by Yongli on the nova topic
> >             to discuss the PCI pass-through next steps.
> >
> >             I think PCI pass-through networking is quite a big topic and it is
> >             worth having a separate discussion.
> >
> >             Are there any other people who are interested in discussing it and
> >             sharing their thoughts and experience?
> >
> >
> >
> >             Regards,
> >
> >             Irena
> >
> >
> >
> >
> >
> > _______________________________________________
> > OpenStack-dev mailing list
> > OpenStack-dev at lists.openstack.org
> > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
> >
