Re: [openstack-hpc] Infiniband + Openstack?
Álvaro,

I tried straight PCI passthrough with our hardware and didn't succeed in getting the stock OFED drivers to initialize properly. It could be that the updated SR-IOV firmware on our cards was breaking this functionality.

But no, we don't yet have anything we can share. I hope that we will soon, and I'll certainly let everyone know when we do.

best,
JP

On Jan 25, 2013, at 10:04 AM, Álvaro López García <alvaro.lopez.garcia@cern.ch> wrote:
Hello there, John.
On Wed 23 Jan 2013 (16:02), John Paul Walters wrote:
Z,
We're working on integrating IB into OpenStack via SR-IOV. It's not quite there yet, but we expect to have it working within a month or so. Right now, I'm unaware of anyone who has IB working inside of a VM; that's what we're developing. To do this, you need a few things: 1) a host that supports SR-IOV, 2) a ConnectX-2 or ConnectX-3 IB card, and 3) alpha SR-IOV-enabled firmware for the cards as well as the corresponding SR-IOV-enabled OFED. Both of those (the firmware and OFED) need to come from Mellanox at the moment; they're not yet released. So far we've been successful in getting RDMA working inside of the VM; oddly, however, IPoIB doesn't work, even though the ports are recognized as being connected. We hope that this will get worked out in later firmware/OFED releases.
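As a rough illustration of the first requirement above, the following Python sketch lists PCI devices that advertise SR-IOV capability. It assumes a Linux host whose kernel exposes the `sriov_totalvfs` attribute under `/sys/bus/pci/devices`; the function name `sriov_capable_devices` is made up for this example. It says nothing about whether the firmware or OFED on the card actually has SR-IOV enabled.

```python
from pathlib import Path


def sriov_capable_devices(sysfs_root="/sys/bus/pci/devices"):
    """Return {pci_address: max_vfs} for devices that advertise SR-IOV.

    Assumption: the kernel exposes a `sriov_totalvfs` file per device.
    Devices without that file, or with a value of 0, are skipped.
    """
    capable = {}
    root = Path(sysfs_root)
    if not root.is_dir():
        return capable
    for dev in sorted(root.iterdir()):
        try:
            max_vfs = int((dev / "sriov_totalvfs").read_text().strip())
        except (OSError, ValueError):
            # File missing or unreadable: treat the device as not capable.
            continue
        if max_vfs > 0:
            capable[dev.name] = max_vfs
    return capable


if __name__ == "__main__":
    for address, max_vfs in sriov_capable_devices().items():
        print(address, max_vfs)
```

An empty result only means no device reports virtual functions through sysfs; the host BIOS and the card firmware would still need to be checked separately.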
We have had IB inside our VMs for a long time, using direct passthrough of the device into the VM without SR-IOV (so we can only run one VM per card).
Be aware that this isn't a Quantum integration, at least not yet. We're going to manage SR-IOV VIFs as resources and schedule accordingly. I know that there are folks using IB for image distribution and perhaps also for block storage. Our target is to get IB into the VM.
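One way to picture "managing SR-IOV VIFs as resources and scheduling accordingly" is the small Python sketch below. It is not the actual OpenStack scheduler code or API: the `Host` class and `schedule` function are hypothetical, and the placement policy (pick the host with the most free virtual functions) is an assumption chosen only to keep the example short.

```python
from dataclasses import dataclass


@dataclass
class Host:
    """A compute node with a fixed number of SR-IOV virtual functions."""
    name: str
    total_vfs: int
    used_vfs: int = 0

    @property
    def free_vfs(self):
        return self.total_vfs - self.used_vfs


def schedule(hosts, vfs_requested=1):
    """Pick a host with enough free VFs and reserve them on it.

    Returns the chosen Host, or None if no host can satisfy the request.
    """
    candidates = [h for h in hosts if h.free_vfs >= vfs_requested]
    if not candidates:
        return None
    # Hypothetical policy: prefer the host with the most free VFs.
    chosen = max(candidates, key=lambda h: h.free_vfs)
    chosen.used_vfs += vfs_requested
    return chosen
```

A real integration would also have to release the virtual functions when an instance is deleted and cope with concurrent requests, both of which this sketch leaves out.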
In our group we're really interested in this. We did some work on it in the past (exposing the card as a resource that could be attached directly to the machines), but we had to abandon it for lack of time. Do you have something working?
For us this worked independently of SR-IOV, since in our PoC we only specified the resources that could be used on each compute node (i.e. the PCI addresses) and added them to a pool. The user could then request resources from this pool.
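The pool described above could look roughly like the following Python sketch. This is not the PoC code from the thread, which was not shared; the class name `PciPool`, its methods, and the PCI addresses in the usage note are invented for illustration.

```python
class PciPool:
    """Tracks which PCI addresses on which compute nodes are free."""

    def __init__(self):
        self._free = {}     # node name -> set of free PCI addresses
        self._claimed = {}  # (node name, PCI address) -> instance id

    def add(self, node, addresses):
        """Register the PCI addresses an operator listed for a node."""
        self._free.setdefault(node, set()).update(addresses)

    def claim(self, instance_id, node=None):
        """Reserve one free device, optionally restricted to one node.

        Returns (node, address), or None if nothing is available.
        """
        nodes = [node] if node is not None else sorted(self._free)
        for name in nodes:
            free = self._free.get(name)
            if free:
                address = min(free)  # deterministic choice for the example
                free.remove(address)
                self._claimed[(name, address)] = instance_id
                return name, address
        return None

    def release(self, node, address):
        """Return a previously claimed device to the pool."""
        if self._claimed.pop((node, address), None) is not None:
            self._free[node].add(address)
```

For example, after `pool.add("node1", ["0000:03:00.0"])`, a call to `pool.claim("vm-a")` would hand out that device, and a second claim on `node1` would return `None` until the device is released. Because the pool only stores addresses, it works the same way whether an address is a whole card or an SR-IOV virtual function.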
Regards,
--
Álvaro López García              aloga@ifca.unican.es
Instituto de Física de Cantabria http://devel.ifca.es/~aloga/
Ed. Juan Jordá, Campus UC        tel: (+34) 942 200 969
Avda. de los Castros s/n
39005 Santander (SPAIN)
_____________________________________________________________________
"I am always doing that which I cannot do, in order that I may learn
how to do it." -- Pablo Picasso