Fellow Open Stackers,
I have been thinking about how to handle SmartNICs, GPUs, and FPGAs across the different projects within OpenStack, with Cyborg taking a leading role.
Cyborg is an important project. It addresses accelerator devices that are part of the server, and potentially of switches and storage.
It addresses 3 different use cases and sets of users, all grouped into a single project.
The first 2 cases cover the application life cycle of device usage.
The last one covers the device life cycle, independent of how the device is used.
Managing the life cycle of devices is Ironic's responsibility. One cannot and should not manage the life cycle of server components independently of the server. Managing server devices outside server management violates customer service agreements with server vendors and breaks server support agreements.
Nova and Neutron get information about all devices and their capabilities from Ironic, and they use it for scheduling. We should avoid creating a new project for every new server component and modifying Nova and Neutron for each new device. (The same will also apply to Cinder and Manila if smart devices are used in their data/control path on a server.)
Finally, we want Cyborg to be usable in a standalone capacity, say for Kubernetes.
Thus, I propose that Cyborg cover use cases 1 & 2 and that Ironic cover use case 3.
Specifically, move all device life-cycle code from Cyborg to Ironic.
Concentrate Cyborg on fulfilling the first 2 use cases.
Simplify integration with Nova and Neutron for using these accelerators by reusing the existing Ironic mechanism.
Create idempotent calls for use case 1 so that Nova and Neutron can use them as part of VM deployment to ensure that devices are programmed as the VM being scheduled needs.
Create idempotent call(s) for use case 2 so that TripleO can set up a device for single-accelerator usage of a node.
[Propose similar model for CNI integration.]
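
To illustrate what I mean by "idempotent" in the calls above, here is a minimal Python sketch. All names in it (Device, AcceleratorService, ensure_programmed, the image string) are hypothetical and made up for this example; none of them are existing Cyborg, Ironic, or Nova APIs. The point is only that the caller states the desired device state, and repeating the call (for example on a retry during VM deployment) does not reprogram the device.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Device:
    """Hypothetical accelerator record; not a Cyborg data model."""
    device_id: str
    loaded_image: Optional[str] = None
    program_count: int = 0  # counts actual programming operations


@dataclass
class AcceleratorService:
    """Hypothetical stand-in for the service exposing the idempotent call."""
    devices: Dict[str, Device] = field(default_factory=dict)

    def ensure_programmed(self, device_id: str, image: str) -> bool:
        """Bring the device to the requested state.

        Returns True if the device had to be programmed, False if it was
        already in the requested state. Repeating the call is harmless.
        """
        device = self.devices[device_id]
        if device.loaded_image == image:
            return False  # already in the desired state: nothing to do
        device.loaded_image = image
        device.program_count += 1
        return True


service = AcceleratorService({"fpga-0": Device("fpga-0")})
first = service.ensure_programmed("fpga-0", "crypto-bitstream")
retry = service.ensure_programmed("fpga-0", "crypto-bitstream")
print(first, retry, service.devices["fpga-0"].program_count)
```

With a call shaped like this, a caller would not need to track whether an earlier attempt succeeded; it could simply issue the same request again.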
Let the discussion start!
Thanks,
Arkady