[openstack-dev] [nova] [neutron] PCI pass-through network support

Jiang, Yunhong yunhong.jiang at intel.com
Wed Oct 30 04:14:40 UTC 2013



> -----Original Message-----
> From: Isaku Yamahata [mailto:isaku.yamahata at gmail.com]
> Sent: Tuesday, October 29, 2013 8:24 PM
> To: OpenStack Development Mailing List (not for usage questions)
> Cc: isaku.yamahata at gmail.com; Itzik Brown
> Subject: Re: [openstack-dev] [nova] [neutron] PCI pass-through network
> support
> 
> Hi Yunhong.
> 
> On Tue, Oct 29, 2013 at 08:22:40PM +0000,
> "Jiang, Yunhong" <yunhong.jiang at intel.com> wrote:
> 
> > > * describe resource external to nova that is attached to VM in the API
> > > (block device mapping and/or vif references)
> > > * ideally the nova scheduler needs to be aware of the local capacity,
> > > and how that relates to the above information (relates to the cross
> > > service scheduling issues)
> >
> > I think this is possibly a bit different. Volumes are certainly managed
> by Cinder, but PCI devices are currently
> > managed by Nova. So we possibly need Nova to translate the
> information (possibly before the Nova scheduler).
> >
> > > * state of the device should be stored by Neutron/Cinder
> > > (attached/detached, capacity, IP, etc), but still exposed to the
> > > "scheduler"
> >
> > I'm not sure if we can keep the state of the device in Neutron. Currently
> Nova manages all PCI devices.
> 
> Yes, with the current implementation, nova manages PCI devices and it
> works.
> That's great. It will remain so in the Icehouse cycle (maybe also J?).
> 
> But what about the long-term direction?
> Should Neutron know/manage such network-related resources on
> compute nodes?

So you mean PCI device management will be split between Nova and Neutron? For example, non-NIC devices owned by Nova and NIC devices owned by Neutron?

There have been many discussions about scheduler enhancements, like https://etherpad.openstack.org/p/grizzly-split-out-scheduling , so possibly that's the right direction. Let's wait for the summit discussion.

> Will the implementation in Nova be moved into Neutron, like what Cinder
> did?
> Any opinions/thoughts?
> It seems that not so many Neutron developers are interested in PCI
> passthrough at the moment, though.
> 
> There are use cases for this, I think.
> For example, some compute nodes use the OVS plugin, other nodes the LB
> plugin.
> (Right now this may not be easily possible, but it will be with the ML2
> plugin and mechanism drivers.) Users want their VMs to run on nodes with
> the OVS plugin for
> some reason (e.g. performance differences).
> Such usage would be handled similarly.
> 
> Thanks,
> ---
> Isaku Yamahata
> 
> 
> >
> > Thanks
> > --jyh
> >
> >
> > > * connection params get given to Nova from Neutron/Cinder
> > > * nova still has the vif driver or volume driver to make the final
> connection
> > > * the disk should be formatted/expanded, and network info injected in
> > > the same way as before (cloud-init, config drive, DHCP, etc)
> > >
> > > John
> > >
> > > On 29 October 2013 10:17, Irena Berezovsky
> <irenab at mellanox.com>
> > > wrote:
> > > > Hi Jiang, Robert,
> > > >
> > > > IRC meeting option works for me.
> > > >
> > > > If I understand your question below, you are looking for a way to tie
> up
> > > > the requested virtual network(s) and the requested PCI device(s).
> The
> > > way we
> > > > did it in our solution is to map a provider:physical_network to an
> > > > interface that represents the Physical Function. Every virtual
> network is
> > > > bound to the provider:physical_network, so the PCI device should
> be
> > > > allocated based on this mapping. We can map a PCI alias to the
> > > > provider:physical_network.
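To make the mapping concrete, here is a minimal Python sketch of the lookup chain described above (the dictionaries, names, and data are hypothetical illustrations, not the actual Mellanox plugin configuration):

```python
# Hypothetical sketch: resolving a Neutron virtual network to a PCI alias
# via its provider:physical_network, as described above. All names and
# data here are illustrative only.

# provider:physical_network -> Physical Function interface on the host
PHYSNET_TO_PF = {"physnet1": "eth2"}

# Physical Function interface -> Nova PCI alias covering its VFs
PF_TO_PCI_ALIAS = {"eth2": "mlnx_vf"}


def pci_alias_for_network(network):
    """Look up the PCI alias to allocate a VF from for a virtual network."""
    physnet = network["provider:physical_network"]
    pf = PHYSNET_TO_PF[physnet]
    return PF_TO_PCI_ALIAS[pf]
```

Since every virtual network is bound to exactly one provider:physical_network, this lookup is unambiguous once the two maps are configured.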
> > > >
> > > >
> > > >
> > > > Another topic to discuss is where the mapping between neutron
> port
> > > and PCI
> > > > device should be managed. One way to solve it is to propagate the
> > > allocated
> > > > PCI device details to neutron on port creation.
> > > >
> > > > In case there is no qbg/qbh support, the VF networking configuration
> > > should be
> > > > applied locally on the Host.
> > > >
> > > > The question is when and how to apply the networking configuration
> to the
> > > PCI
> > > > device.
> > > >
> > > > We see the following options:
> > > >
> > > > *         It can be done on port creation.
> > > >
> > > > *         It can be done when nova VIF driver is called for vNIC
> > > plugging.
> > > > This will require having all networking configuration available to
> the
> > > VIF
> > > > driver, or sending a request to the Neutron server to obtain it.
> > > >
> > > > *         It can be done by having a dedicated L2 neutron agent
> on
> > > each
> > > > Host that scans for allocated PCI devices and then retrieves
> networking
> > > > configuration from the server and configures the device. The agent
> will
> > > also be
> > > > responsible for managing update requests coming from the
> neutron
> > > > server.
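The agent-based option could be sketched roughly as follows; this is purely illustrative, and the scan/fetch/apply helpers are hypothetical stand-ins for device discovery and RPC to the Neutron server:

```python
import time

# Purely illustrative sketch of the dedicated L2 agent idea described
# above: periodically scan for allocated PCI devices, fetch their
# networking configuration from the Neutron server, and apply it.
# All helper callables here are hypothetical.

def run_agent(scan_devices, fetch_config, apply_config, interval=2):
    """Poll every `interval` seconds; when `interval` is None, do a
    single pass and return the configured set (keeps the sketch testable)."""
    configured = set()
    while True:
        for dev in scan_devices():           # e.g. VFs bound to instances
            if dev not in configured:
                cfg = fetch_config(dev)      # RPC to the Neutron server
                apply_config(dev, cfg)       # e.g. set VLAN/MAC on the VF
                configured.add(dev)
        if interval is None:
            return configured
        time.sleep(interval)
```

Handling update requests pushed from the server would sit alongside this polling loop in a real agent.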
> > > >
> > > >
> > > >
> > > > For macvtap vNIC type assignment, the networking configuration can
> be
> > > > applied by a dedicated L2 neutron agent.
> > > >
> > > >
> > > >
> > > > BR,
> > > >
> > > > Irena
> > > >
> > > >
> > > >
> > > > From: Jiang, Yunhong [mailto:yunhong.jiang at intel.com]
> > > > Sent: Tuesday, October 29, 2013 9:04 AM
> > > >
> > > >
> > > > To: Robert Li (baoli); Irena Berezovsky;
> > > prashant.upadhyaya at aricent.com;
> > > > chris.friesen at windriver.com; He, Yongli; Itzik Brown
> > > >
> > > >
> > > > Cc: OpenStack Development Mailing List; Brian Bowen (brbowen);
> Kyle
> > > Mestery
> > > > (kmestery); Sandhya Dasu (sadasu)
> > > > Subject: RE: [openstack-dev] [nova] [neutron] PCI pass-through
> network
> > > > support
> > > >
> > > >
> > > >
> > > > Robert, is it possible to have an IRC meeting? I'd prefer an IRC meeting
> > > > because it's more OpenStack style and also keeps the minutes
> > > clear.
> > > >
> > > >
> > > >
> > > > For your flow, can you give a more detailed example? For example,
> > > > consider a user booting an instance with a -nic option specifying a network
> id;
> > > > how does Nova derive the requirement for the PCI device? I
> assume
> > > the
> > > > network id should define the switches that the device can connect
> to,
> > > but
> > > > how is that information translated into the PCI property requirement?
> Will
> > > > this translation happen before the Nova scheduler makes the host
> decision?
> > > >
> > > >
> > > >
> > > > Thanks
> > > >
> > > > --jyh
> > > >
> > > >
> > > >
> > > > From: Robert Li (baoli) [mailto:baoli at cisco.com]
> > > > Sent: Monday, October 28, 2013 12:22 PM
> > > > To: Irena Berezovsky; prashant.upadhyaya at aricent.com; Jiang,
> Yunhong;
> > > > chris.friesen at windriver.com; He, Yongli; Itzik Brown
> > > > Cc: OpenStack Development Mailing List; Brian Bowen (brbowen);
> Kyle
> > > Mestery
> > > > (kmestery); Sandhya Dasu (sadasu)
> > > > Subject: Re: [openstack-dev] [nova] [neutron] PCI pass-through
> network
> > > > support
> > > >
> > > >
> > > >
> > > > Hi Irena,
> > > >
> > > >
> > > >
> > > > Thank you very much for your comments. See inline.
> > > >
> > > >
> > > >
> > > > --Robert
> > > >
> > > >
> > > >
> > > > On 10/27/13 3:48 AM, "Irena Berezovsky" <irenab at mellanox.com>
> > > wrote:
> > > >
> > > >
> > > >
> > > > Hi Robert,
> > > >
> > > > Thank you very much for sharing the information regarding your
> efforts.
> > > Can
> > > > you please share your idea of the end to end flow? How do you
> suggest
> > > to
> > > > bind Nova and Neutron?
> > > >
> > > >
> > > >
> > > > The end to end flow is actually encompassed in the blueprints in a
> > > nutshell.
> > > > I will reiterate it below. The binding between Nova and Neutron
> > > occurs
> > > > with the neutron v2 API that nova invokes in order to provision the
> > > neutron
> > > > services. The vif driver is responsible for plugging in an instance onto
> the
> > > > networking setup that neutron has created on the host.
> > > >
> > > >
> > > >
> > > > Normally, one will invoke "nova boot" api with the -nic options to
> specify
> > > > the nic with which the instance will be connected to the network. It
> > > > currently allows net-id, fixed ip and/or port-id to be specified for the
> > > > option. However, it doesn't allow one to specify special networking
> > > > requirements for the instance. Thanks to the nova pci-passthrough
> work,
> > > one
> > > > can specify PCI passthrough device(s) in the nova flavor. But it
> doesn't
> > > > provide means to tie up these PCI devices in the case of ethernet
> > > adapters
> > > > with networking services. Therefore the idea is actually simple, as
> > > indicated
> > > > by the blueprint titles, to provide means to tie up SRIOV devices with
> > > > neutron services. A work flow would roughly look like this for 'nova
> > > boot':
> > > >
> > > >
> > > >
> > > >       -- Specifies networking requirements in the -nic option.
> > > Specifically
> > > > for SRIOV, allow the following to be specified in addition to the
> existing
> > > > required information:
> > > >
> > > >                . PCI alias
> > > >
> > > >                . direct pci-passthrough/macvtap
> > > >
> > > >                . port profileid that is compliant with 802.1Qbh
> > > >
> > > >
> > > >
> > > >         The above information is optional. In its absence,
> the
> > > > existing behavior remains.
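As a rough illustration of what such an extended -nic request might carry in addition to the existing fields (the field names and defaults here are hypothetical, not the blueprint's final syntax):

```python
# Hypothetical sketch of the extra SRIOV fields carried by a -nic option,
# in addition to the existing net-id/fixed-ip/port-id fields. The field
# names and defaults are illustrative only.

def parse_nic_option(spec):
    """Parse a 'key=value,key=value' -nic spec into a dict."""
    nic = dict(item.split("=", 1) for item in spec.split(","))
    # Fill optional SRIOV fields with defaults that preserve the
    # existing behavior when none of them are given.
    nic.setdefault("pci-alias", None)       # no PCI request -> normal vNIC
    nic.setdefault("vnic-type", "virtio")   # or "direct" / "macvtap"
    nic.setdefault("port-profile", None)    # 802.1Qbh profileid, if any
    return nic
```

For example, `--nic net-id=1234,pci-alias=niantic_vf,vnic-type=direct` would yield a NIC request that triggers PCI scheduling, while a plain `--nic net-id=1234` falls back to the existing path.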
> > > >
> > > >
> > > >
> > > >      -- if special networking requirements exist, Nova api creates
> PCI
> > > > requests in the nova instance type for scheduling purposes
> > > >
> > > >
> > > >
> > > >      -- Nova scheduler schedules the instance based on the
> requested
> > > flavor
> > > > plus the PCI requests that are created for networking.
> > > >
> > > >
> > > >
> > > >      -- Nova compute invokes neutron services with PCI
> passthrough
> > > > information if any
> > > >
> > > >
> > > >
> > > >      --  Neutron performs its normal operations based on the
> request,
> > > such
> > > > as allocating a port, assigning ip addresses, etc. Specific to SRIOV, it
> > > > should validate information such as the profileid and store it in
> its
> > > > db. It's also possible to associate a port profileid with a neutron
> network
> > > > so that port profileid becomes optional in the -nic option. Neutron
> > > returns
> > > > the port information to nova, especially the PCI passthrough related
> > > > information in the port binding object. Currently, the port binding
> object
> > > > contains the following information:
> > > >
> > > >           binding:vif_type
> > > >
> > > >           binding:host_id
> > > >
> > > >           binding:profile
> > > >
> > > >           binding:capabilities
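As an illustration, a port carrying SRIOV details through these attributes might look like the sketch below; the binding:* keys are the ones listed above, while the values, especially the profile contents, are hypothetical examples:

```python
# Sketch of a Neutron port dict with the binding attributes listed above.
# The binding:* keys match the list in the text; the values, especially
# the binding:profile contents, are hypothetical examples.

port = {
    "id": "a1b2c3",
    "network_id": "net-1",
    "binding:vif_type": "hostdev",          # hypothetical SRIOV vif type
    "binding:host_id": "compute-1",
    "binding:profile": {                    # PCI passthrough details
        "pci_slot": "0000:06:10.1",         # VF address on the host
        "physical_network": "physnet1",
        "profileid": "vm-fex-profile-1",    # 802.1Qbh port profile
    },
    "binding:capabilities": {"port_filter": False},
}
```

The binding:profile field is the natural carrier for the PCI device details that nova needs back from neutron.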
> > > >
> > > >
> > > >
> > > >     -- nova constructs the domain xml and plugs in the instance by
> > > calling
> > > > the vif driver. The vif driver can build up the interface xml based on
> the
> > > > port binding information.
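For illustration, the vif driver step might assemble a libvirt interface element along these lines (a simplified sketch assuming a hostdev-style SRIOV attachment; the real driver and the PCI address shown are hypothetical):

```python
# Simplified sketch of building a libvirt SRIOV interface XML element
# from port binding information. The real vif driver is more involved;
# the PCI address and profileid here are hypothetical examples.

def build_interface_xml(pci_slot, mac, profileid=None):
    """Render an <interface type='hostdev'> element for a VF at pci_slot."""
    domain, bus, slot_fn = pci_slot.split(":")
    slot, function = slot_fn.split(".")
    profile = ("  <virtualport type='802.1Qbh'>"
               f"<parameters profileid='{profileid}'/></virtualport>\n"
               if profileid else "")
    return (
        "<interface type='hostdev' managed='yes'>\n"
        f"  <mac address='{mac}'/>\n"
        "  <source>\n"
        f"    <address type='pci' domain='0x{domain}' bus='0x{bus}' "
        f"slot='0x{slot}' function='0x{function}'/>\n"
        "  </source>\n"
        f"{profile}"
        "</interface>"
    )
```

The 802.1Qbh virtualport element is only emitted when a port profileid came back in the binding, matching the optional profileid in the -nic flow above.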
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > > The blueprints you registered make sense. On the Nova side, there is a
> need
> > > to
> > > > bind between requested virtual network and PCI device/interface to
> be
> > > > allocated as vNIC.
> > > >
> > > > On the Neutron side, there is a need to support networking
> > > configuration of
> > > > the vNIC. Neutron should be able to identify the PCI device/macvtap
> > > > interface in order to apply configuration. I think it makes sense to
> > > provide
> > > > neutron integration via a dedicated Modular Layer 2 Mechanism
> Driver to
> > > allow
> > > > PCI pass-through vNIC support along with other networking
> technologies.
> > > >
> > > >
> > > >
> > > > I haven't sorted through this yet. A neutron port could be associated
> > > with a
> > > > PCI device or not, which is a common feature, IMHO. However, an
> ML2
> > > driver
> > > > specific to a particular SRIOV technology may be needed.
> > > >
> > > >
> > > >
> > > >
> > > >
> > > > During the Havana Release, we introduced Mellanox Neutron plugin
> that
> > > > enables networking via SRIOV pass-through devices or macvtap
> > > interfaces.
> > > >
> > > > We want to integrate our solution with PCI pass-through Nova
> support.
> > > I
> > > > will be glad to share more details if you are interested.
> > > >
> > > >
> > > >
> > > >
> > > >
> > > > Good to know that you already have an SRIOV implementation. I
> found
> > > some
> > > > information online about the mlnx plugin, but need more time to get
> to
> > > know
> > > > it better. And certainly I'm interested in knowing its details.
> > > >
> > > >
> > > >
> > > > The PCI pass-through networking support is planned to be discussed
> > > during
> > > > the summit: http://summit.openstack.org/cfp/details/129. I think it's
> > > worth
> > > > drilling down into a more detailed proposal and presenting it during the
> > > summit,
> > > > especially since it impacts both nova and neutron projects.
> > > >
> > > >
> > > >
> > > > I agree. Maybe we can steal some time in that discussion.
> > > >
> > > >
> > > >
> > > > Would you be interested in collaborating on this effort? Would you
> be
> > > > interested in exchanging more emails or setting up an IRC/WebEx meeting
> during
> > > this
> > > > week before the summit?
> > > >
> > > >
> > > >
> > > > Sure. If folks want to discuss it before the summit, we can schedule a
> > > webex
> > > > later this week. Otherwise, we can continue the discussion over
> email.
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > > Regards,
> > > >
> > > > Irena
> > > >
> > > >
> > > >
> > > > From: Robert Li (baoli) [mailto:baoli at cisco.com]
> > > > Sent: Friday, October 25, 2013 11:16 PM
> > > > To: prashant.upadhyaya at aricent.com; Irena Berezovsky;
> > > > yunhong.jiang at intel.com; chris.friesen at windriver.com;
> > > yongli.he at intel.com
> > > > Cc: OpenStack Development Mailing List; Brian Bowen (brbowen);
> Kyle
> > > Mestery
> > > > (kmestery); Sandhya Dasu (sadasu)
> > > > Subject: Re: [openstack-dev] [nova] [neutron] PCI pass-through
> network
> > > > support
> > > >
> > > >
> > > >
> > > > Hi Irena,
> > > >
> > > >
> > > >
> > > > This is Robert Li from Cisco Systems. Recently, I was tasked to
> investigate
> > > > such support for Cisco's systems that support VM-FEX, which is an
> SRIOV
> > > > technology supporting 802.1Qbh. I was able to bring up nova
> instances
> > > with
> > > > SRIOV interfaces, and establish networking between the instances
> that
> > > > employ the SRIOV interfaces. Certainly, this was accomplished with
> > > hacking
> > > > and some manual intervention. Based on this experience and my
> study
> > > with the
> > > > two existing nova pci-passthrough blueprints that have been
> > > implemented and
> > > > committed into Havana
> > > > (https://blueprints.launchpad.net/nova/+spec/pci-passthrough-base
> and
> > > >
> https://blueprints.launchpad.net/nova/+spec/pci-passthrough-libvirt),  I
> > > > registered a couple of blueprints (one on Nova side, the other on the
> > > > Neutron side):
> > > >
> > > >
> > > >
> > > > https://blueprints.launchpad.net/nova/+spec/pci-passthrough-sriov
> > > >
> > > >
> https://blueprints.launchpad.net/neutron/+spec/pci-passthrough-sriov
> > > >
> > > >
> > > >
> > > > in order to address SRIOV support in openstack.
> > > >
> > > >
> > > >
> > > > Please take a look at them and see if they make sense, and let me
> know
> > > any
> > > > comments and questions. We can also discuss this in the summit, I
> > > suppose.
> > > >
> > > >
> > > >
> > > > I noticed that there is another thread on this topic, so I am copying those
> folks
> > > > from that thread as well.
> > > >
> > > >
> > > >
> > > > thanks,
> > > >
> > > > Robert
> > > >
> > > >
> > > >
> > > > On 10/16/13 4:32 PM, "Irena Berezovsky" <irenab at mellanox.com>
> > > wrote:
> > > >
> > > >
> > > >
> > > > Hi,
> > > >
> > > > As one of the next steps for PCI pass-through, I would like to discuss
> > > the
> > > > support for PCI pass-through vNICs.
> > > >
> > > > While nova takes care of PCI pass-through device resource
> > > management and
> > > > VIF settings, neutron should manage their networking configuration.
> > > >
> > > > I would like to register a summit proposal to discuss the support for
> PCI
> > > > pass-through networking.
> > > >
> > > > I am not sure what would be the right topic to discuss the PCI
> > > pass-through
> > > > networking, since it involves both nova and neutron.
> > > >
> > > > There is already a session registered by Yongli on the nova topic to
> discuss
> > > the
> > > > PCI pass-through next steps.
> > > >
> > > > I think PCI pass-through networking is quite a big topic and is worth
> > > > a separate discussion.
> > > >
> > > > Are there any other people who are interested in discussing it and sharing
> > > their
> > > > thoughts and experience?
> > > >
> > > >
> > > >
> > > > Regards,
> > > >
> > > > Irena
> > > >
> > > >
> > > >
> > > >
> > > > _______________________________________________
> > > > OpenStack-dev mailing list
> > > > OpenStack-dev at lists.openstack.org
> > > > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
> > > >
> >
> 
> --
> Isaku Yamahata <isaku.yamahata at gmail.com>
> 


