[openstack-dev] [ironic][neutron] SmartNics with Ironic

Moshe Levi moshele at mellanox.com
Thu Oct 4 18:31:20 UTC 2018


Hi Julia,

Apologies that we were not able to be there to better represent the use case.

PSB

From: Julia Kreger <juliaashleykreger at gmail.com>
Sent: Monday, October 1, 2018 11:07 PM
To: OpenStack Development Mailing List (not for usage questions) <openstack-dev at lists.openstack.org>
Cc: isaku.yamahata at intel.com; Eyal Lavee <elavee at mellanox.com>
Subject: Re: [openstack-dev] [ironic][neutron] SmartNics with Ironic

Greetings, Comments in-line.

Thanks,

-Julia

On Sat, Sep 29, 2018 at 11:27 PM Moshe Levi <moshele at mellanox.com> wrote:
Hi Julia,

I don't mind updating the ironic spec [1]. Unfortunately, I wasn't at the PTG, but I had a sync meeting with Isaku.

As I see it, there are 2 use cases:
1. Running the neutron ovs agent on the smartnic.
2. Running the neutron super ovs agent, which manages the ovs running on the smartnic.

My takeaway from the meeting with neutron is that there would not be a neutron ovs agent running on the smartnic. Instead, the configuration would need to be pushed at all times, which is ultimately better security-wise: if the tenant NIC is somehow compromised, it reduces the control plane exposure.
[ML] - Can you elaborate on the security concerns with running the neutron ovs agent on the smart NIC?
If you compare this to the standard virtualization use case, this is as secure if not more secure.
The tenant image runs in the bare metal host and receives only a network interface/port.
The host has no way to access the OS/services/agents running on the smart NIC CPUs, in the same way that a tenant image running in a VM has no way to access the services/agents running in the hypervisor.
It is in fact even more secure, as they are running in physically disjoint hardware and memory (thus not accessible even through side-channel vulnerabilities such as Meltdown/Spectre).

It seems that most of the discussion was around the second use case.

By the time Ironic and Neutron met together, it seemed like the first use case was no longer under consideration. I may be wrong, but very strong preference existed for the second scenario when we met the next day.
[ML] -
We are seeing great interest in smart NICs for bare metal use cases, to provide services (networking, storage and others) to bare metal servers that were previously only possible for VMs.
Conceptually the smart NIC can be thought of as an isolated hypervisor layer for the bare metal host.
The first service we are targeting in this spec is aligning the bare metal networking with the standard neutron ovs agent.
The target is to align the bare metal implementation with the virtualization use case as far as possible, up to the point of actually running (where possible) the same agents on the smart NIC (again acting as a hypervisor for the bare metal host use case).
This allows reusing/aligning the implementation, and it naturally scales with the number of bare metal servers, as opposed to running the agents on controller nodes, which potentially requires scaling the controllers to match the number of bare metal servers.
It also provides a path to providing more advanced services in the smart NIC in the next steps (not limiting the implementation to be OVSDB protocol specific).


This is my understanding of the ironic-neutron PTG meeting:

  1.  Ironic cores don't want to change the deployment interface as proposed in [1].
  2.  We should add a new network_interface for use case 2. But what about the first use case? Should it be a new network_interface as well?
  3.  We should delay the port binding until the baremetal is powered on and the ovs is running.

     *   For the first use case I was thinking to change the neutron server to just keep the port binding information in the neutron DB. Then, when the neutron ovs agent is alive, it will retrieve all the baremetal ports, add them to the ovsdb and start the neutron ovs agent fullsync.
     *   For the second use case the agent is alive, so the agent itself can monitor the ovsdb of the baremetal and configure it when it is up.

  4.  How to notify that the neutron agent successfully/unsuccessfully bound the port?

     *   In both use cases we should use the neutron-ironic notification to make sure the port binding was done successfully.
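
To make point 3 for the first use case concrete, here is a minimal sketch of the deferred binding idea: the server side only records the binding, and the agent claims its ports once it reports in. This is plain Python and not neutron code; BindingStore, PendingBinding, smartnic_host and on_agent_alive are all hypothetical names used for illustration only.

```python
from dataclasses import dataclass, field


@dataclass
class PendingBinding:
    """Binding details kept server-side until the agent is alive."""
    port_id: str
    mac_address: str
    smartnic_host: str  # hypothetical: identifies the smart NIC owning the port


@dataclass
class BindingStore:
    """Stand-in for the neutron DB (an assumption, not the real schema)."""
    pending: dict = field(default_factory=dict)

    def defer(self, binding: PendingBinding) -> None:
        # Record the binding instead of acting on it immediately.
        self.pending[binding.port_id] = binding

    def claim_for(self, smartnic_host: str) -> list:
        # Hand over, and forget, every binding that belongs to this smart NIC.
        claimed = [b for b in self.pending.values()
                   if b.smartnic_host == smartnic_host]
        for b in claimed:
            del self.pending[b.port_id]
        return claimed


def on_agent_alive(store: BindingStore, smartnic_host: str, bridge_ports: set) -> set:
    """Called when the agent on a smart NIC first reports in.

    bridge_ports stands in for the ports known to the local ovsdb.
    """
    for binding in store.claim_for(smartnic_host):
        bridge_ports.add(binding.port_id)
    return bridge_ports
```

In this sketch a full sync would follow once on_agent_alive returns; ports for smart NICs that are still powered off simply stay pending in the store.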

Is my understanding correct?

Not quite.

1) We, as in Ironic, recognize that there would need to be changes; it is the method of how that we would prefer to be explicit and chosen by the interface. The underlying behavior needs to be different, and the new network_interface should support both cases 1 and 2, because that interface contains the logic the conductor needs to determine the appropriate path forward. We should likely also put some guards in place to prevent non-smart interfaces from being used in the same configuration, due to the security issues that creates.
3) I believe this would be more of a matter of the network_interface knowing that the machine is powered up, and attempting to assert configuration through Neutron to push configuration to the smartnic.
3a) The consensus is that the information to access the smartnic is hardware configuration metadata and that ironic should be the source of truth for information about that hardware. The discussion was to push that as needed into neutron to help enable the attachment. I proposed just including it in the binding profile as a possibility, since it is transient information.
3b) As I understood it, this would ultimately be the default operating behavior.
4) Was not discussed, but something along the path is going to have to check and retry as necessary. That item could be in the network_interface code.
4a) This doesn't exist yet.
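
As a rough illustration of 3a and 4 together, the sketch below passes the smartnic access information in a binding profile and retries the bind a bounded number of times. It is plain Python under stated assumptions: the profile keys, build_binding_profile and bind_with_retry are hypothetical, and attempt_bind is a placeholder for whatever call the network_interface would actually make through Neutron.

```python
import time


def build_binding_profile(smartnic_address, credentials_ref):
    """Return a binding-profile dict; both key names are hypothetical."""
    return {
        "smartnic_address": smartnic_address,
        "smartnic_credentials_ref": credentials_ref,
    }


def bind_with_retry(attempt_bind, profile, retries=3, delay=0.0, sleep=time.sleep):
    """Call attempt_bind(profile) until it returns True or retries run out.

    Returns the number of the attempt that succeeded and raises
    RuntimeError if every attempt fails.
    """
    for attempt in range(1, retries + 1):
        if attempt_bind(profile):
            return attempt
        if attempt < retries:
            sleep(delay)  # wait before the next attempt
    raise RuntimeError(f"port binding failed after {retries} attempts")
```

Whether the check lives in the network_interface code or elsewhere along the path is still open, as noted in 4; the sketch only shows the shape of the loop.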
