Hi Paweł,

Thanks for the info. How do you customise the FCOS image? I managed to install the drivers from rpmfusion.org in FCOS37 and they work just fine. I even managed to get the GPU Operator working. But I can't simply install the drivers in FCOS37 and then take an image of the server, e.g. with openstack server image create. Whenever I try to use the modified image with Magnum, the deployment just gets stuck. I believe there is some sort of mechanism in FCOS, similar to cloud-init, that runs only once and that I would need to reset? Or do you really build a custom image from scratch?

Best Regards,
Oliver

On 12. Aug 2024, at 17:25, pawel.kubica@comarch.com wrote:
Hi Olivier,
To automate the deployment of the driver I'm using custom FCOS images with an additional package (nvidia-container-toolkit) and extended Magnum Heat templates (additional scripts) that:
- label GPU nodes with nvidia.com/gpu=present
- install a container image with additional kernel modules for the Nvidia GPU (https://hub.docker.com/r/fifofonix/driver)
- reconfigure the container runtime
- install nvidia-device-plugin (https://nvidia.github.io/k8s-device-plugin)
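
For illustration only, the labelling and device-plugin steps listed above could be sketched roughly as below. This is not the actual Heat template script, which is not shown in this thread. The node name, the Helm repo alias and release name (nvdp), the namespace, and the choice of Helm as the installer are all assumptions. The sketch only builds and prints the commands; it does not run them.

```python
import shlex

# Label and repo URL are taken from the message above.
GPU_LABEL = "nvidia.com/gpu=present"
PLUGIN_REPO = "https://nvidia.github.io/k8s-device-plugin"


def label_commands(gpu_nodes):
    """Build one 'kubectl label node' command per GPU node name."""
    return [
        ["kubectl", "label", "node", name, GPU_LABEL, "--overwrite"]
        for name in gpu_nodes
    ]


def plugin_commands(repo_alias="nvdp", release="nvdp",
                    namespace="nvidia-device-plugin"):
    """Build Helm commands to install the device plugin.

    Installing via Helm, and the alias, release and namespace names,
    are assumptions made for this sketch.
    """
    return [
        ["helm", "repo", "add", repo_alias, PLUGIN_REPO],
        ["helm", "repo", "update"],
        ["helm", "upgrade", "--install", release,
         f"{repo_alias}/nvidia-device-plugin",
         "--namespace", namespace, "--create-namespace"],
    ]


if __name__ == "__main__":
    # "gpu-node-0" is a hypothetical node name.
    for cmd in label_commands(["gpu-node-0"]) + plugin_commands():
        print(shlex.join(cmd))
```

Printing the commands instead of executing them makes it easy to review them before pasting them into a script that runs on the cluster.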
I haven't tried the NVIDIA GPU Operator yet (but I have heard that its support for FCOS does not work properly).
Kind regards