Greetings folks!

We've had some very positive initial discussions, and great questions have been raised on this topic. I'm going to start widening this discussion in the coming days. If there is sufficient interest, hopefully we can start work on a prototype soon.

-Julia

On Thu, Apr 18, 2024 at 9:11 AM Julia Kreger <juliaashleykreger@gmail.com> wrote:

Greetings folks,


During the Ironic PTG last week, our busiest session was on the topic of networking and what to do in order to move things forward.


I think the best way to frame how we reached the discussion is to keep in mind that projects in our community have been built around clearly delineated boundaries. For example, Nova is for virtualization. Ironic’s area has been bare metal machine deployment and lifecycle management. Neutron’s area has been networking. Ironic has always fit a little more awkwardly into this model due to the nature of bare metal – as evidenced by the networking-related projects (networking-generic-switch, networking-baremetal) living under Ironic’s governance, driven by the need to be able to toggle physical switch ports to ensure a secure provisioning process. Now it's getting even more awkward as the lines between a server and a switch start to blur.


Ironic is one of the few OpenStack projects that has significant use cases outside of a fully integrated OpenStack cluster. Additionally, even in the fully integrated case, bare metal is often not the primary use case considered -- quite understandable since virtualized networking is not a trivial problem.


The combination of these two factors, physical networking being a secondary use case for Neutron and operators wanting physical network automation even without a fully integrated OpenStack, leads to difficulties on both the development side and the operational side. This creates a situation where Ironic’s answer for even basic network automation, “use Neutron and an ML2 plugin”, is really not tenable for many infrastructure operators.


The underlying challenge is that infrastructure operators need something for basic management actions, like attaching or detaching hosts from physical networks, which also works in a model that respects separation of duties. This is further complicated by DPUs, which are changing the landscape by forcing a much more complex interaction for even the basic networking needed to support the deployment of a bare metal host. To attach a network to a host, it is no longer enough to toggle a switchport configuration; it is potentially a matter of toggling a switchport *and* toggling configuration on a DPU device.


And in this week's Ironic PTG session, “Future of networking”[0], discussion of DPUs helped us reach a tipping point. The idea of a separate network interface (for Ironic to use) which was not Neutron grew into a chorus among contributors.


So to put it in different words, Ironic is considering implementing functionality for switch and DPU device network configuration management, in terms of device attachment to networks. This would likely leverage the existing ML2 plugin model such that Ironic can trigger port attachment and detachment directly. We’re not talking about doing anything beyond attachment and detachment. This means aspects like IPAM, Security Groups, Routing, and so on are entirely off the table for us. Our answer for such will continue to be “use Neutron.”
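
To make the intended scope a little more concrete, here is a rough sketch of what an attach/detach-only interface could look like. This is purely illustrative and not a proposed design: every name in it (PortBinding, AttachmentDriver, InMemoryDriver, attach_host) is hypothetical, and it does not reflect Ironic's, Neutron's, or any ML2 plugin's actual API. It only shows the idea that attaching a host may mean touching both a switchport and a DPU, with nothing beyond attach and detach in scope.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass(frozen=True)
class PortBinding:
    """Hypothetical record tying one device port to one network."""
    device: str   # assumed identifier of a switch or a DPU
    port: str     # assumed port name on that device
    network: str  # assumed network identifier, e.g. a VLAN or VNI


class AttachmentDriver(ABC):
    """Hypothetical driver: attach and detach only.

    No IPAM, security groups, or routing, by design.
    """

    @abstractmethod
    def attach(self, binding: PortBinding) -> None: ...

    @abstractmethod
    def detach(self, binding: PortBinding) -> None: ...


class InMemoryDriver(AttachmentDriver):
    """Stand-in for a real switch or DPU driver; just records state."""

    def __init__(self) -> None:
        self.bindings: set[PortBinding] = set()

    def attach(self, binding: PortBinding) -> None:
        self.bindings.add(binding)

    def detach(self, binding: PortBinding) -> None:
        self.bindings.discard(binding)


def attach_host(steps: list[tuple[AttachmentDriver, PortBinding]]) -> None:
    """Apply every attachment for a host; undo completed ones on failure."""
    done: list[tuple[AttachmentDriver, PortBinding]] = []
    try:
        for driver, binding in steps:
            driver.attach(binding)
            done.append((driver, binding))
    except Exception:
        # Roll back in reverse order so the host is never half-attached.
        for driver, binding in reversed(done):
            driver.detach(binding)
        raise
```

In this sketch, attaching a host behind a DPU would be a call like attach_host([(switch_driver, switch_binding), (dpu_driver, dpu_binding)]). The rollback is an assumption on my part about what operators would want, not something that was decided at the PTG.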


The discussion thus far has focused not on replacing existing plugins, but on enabling their use. To use the plugins today, Neutron and the involvement of other OpenStack services are required – which is not viable for approximately half of Ironic’s usage base. For example, a Metal3 user is only using Ironic as an embedded tool, and if their environment requires VXLAN networks to be terminated on a DPU, then they would need to create or invent tooling to facilitate that. Alternatively, we can solve that within Ironic and account for the overall physical machine lifecycle at the same time.


We also see a path where we can better integrate with Neutron and help operators meet more restrictive security requirements by delineating part of the functionality into Ironic’s scope when leveraging upstream solutions. We realize there are many unknowns, and we welcome Neutron contributors to collaborate[1] with us, as this could ultimately impact project scope. Thoughts?


Thanks,


Julia


[0]: https://etherpad.opendev.org/p/ironic-ptg-april-2024#L609

[1]: https://review.opendev.org/c/openstack/ironic-specs/+/916126/