Openstack HPC Infiniband question

Mahendra Paipuri mahendra.paipuri at cnrs.fr
Wed Feb 16 10:20:03 UTC 2022


Hello,

I have worked on IB before, although not on top of OpenStack. As the 
primary purpose of IB is RDMA, you should check whether it is actually 
working on your instances. I am not sure a simple hello-world code is 
sufficient to test RDMA functionality, because if there is any issue 
with the IB stack, MPI implementations tend to fall back to TCP for 
communications. One thing you can do is install linux rdma-core [1] (if 
you have not already done so) and use the `ibstat` command to check 
whether your IB ports are up and running. Then you can build OpenMPI 
with UCX [2] and run a PingPong test [3] to see whether you are getting 
the bandwidth expected for your ConnectX card type; a rough command 
sequence is sketched below.
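To make that concrete, something like the following could work; the 
install prefix, host names and the mlx5_0:1 device string are 
placeholders for your environment, not values taken from your setup:

    # check that the VF's IB port is visible and ACTIVE inside the VM
    ibstat

    # from inside the OpenMPI source tree: build against UCX
    # (paths are examples only)
    ./configure --prefix=$HOME/ompi --with-ucx=/usr --without-verbs
    make -j && make install

    # run the IMB PingPong between two instances, pinning the PML to UCX
    # so the run fails loudly instead of silently falling back to TCP
    mpirun -np 2 --host vm1,vm2 \
           --mca pml ucx -x UCX_NET_DEVICES=mlx5_0:1 \
           ./IMB-MPI1 PingPong

If the measured bandwidth is far below what your ConnectX card should 
deliver, or the run aborts because the ucx PML cannot be selected, that 
usually points at the IB/SR-IOV setup rather than at MPI itself.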

If you are planning to do more HPC tests on an OpenStack cloud, I 
suggest you look into HPC package managers like Spack [4] or 
EasyBuild [5] to build the HPC software stack more easily; a short 
Spack sketch is below. StackHPC has been working on HPC over OpenStack 
clouds and has developed some tools [6] which might be of interest to 
you. I hope that helps!!
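For reference, here is a minimal Spack sketch that builds an MPI/UCX 
stack plus the Intel MPI Benchmarks; the variant names are illustrative 
and worth double-checking with `spack info openmpi` and `spack info ucx` 
before relying on them:

    git clone https://github.com/spack/spack.git
    . spack/share/spack/setup-env.sh

    # OpenMPI built on top of UCX, with the RDMA transports enabled in UCX
    spack install openmpi fabrics=ucx ^ucx +rc +ud +dc
    spack install intel-mpi-benchmarks

    # make mpirun and IMB-MPI1 available in the current shell
    spack load openmpi intel-mpi-benchmarks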

-

Mahendra

[1] https://github.com/linux-rdma/rdma-core

[2] https://openucx.readthedocs.io/en/master/running.html#openmpi-with-ucx

[3] https://www.intel.com/content/www/us/en/develop/documentation/imb-user-guide/top/mpi-1-benchmarks/imb-p2p-benchmarks/pingpong.html

[4] https://spack-tutorial.readthedocs.io/en/latest/

[5] https://docs.easybuild.io/

[6] https://github.com/stackhpc

On 16/02/2022 05:31, Satish Patel wrote:
> Hi all,
>
> I am playing with HPC on an OpenStack cloud deployment and I have a
> Mellanox InfiniBand NIC. I have a couple of deployment questions
> regarding the InfiniBand network. I am new to IB, so excuse me if I ask
> noob questions.
>
> I have configured the Mellanox NIC for SR-IOV and created a flavor with
> the property pci_passthrough:alias='mlx5-sriov-ib:1' to expose a VF to
> my instance. So far so good: I am able to see the IB interface inside
> my VM and it is active. (I am running the SM inside the InfiniBand HW
> switch.)
>
> root@ib-vm:~# ethtool -i ibs5
> driver: mlx5_core[ib_ipoib]
> version: 5.5-1.0.3
> firmware-version: 20.28.1002 (MT_0000000222)
> expansion-rom-version:
> bus-info: 0000:00:05.0
> supports-statistics: yes
> supports-test: yes
> supports-eeprom-access: no
> supports-register-dump: no
> supports-priv-flags: yes
>
> I didn't configure any IP address on the ibs5 interface. For testing
> purposes I compiled an MPI hello-world program as a PoC for my
> InfiniBand network between two instances, and I am able to successfully
> run the MPI sample program.
>
> Somewhere I read about the neutron-mellanox agent for setting up IPoIB
> with network segmentation, but I am not sure how complex that is or
> what the advantages are over simply using SR-IOV passthrough.
>
> Is this the correct way to set up an HPC cluster using OpenStack, or
> is there a better way to design HPC on OpenStack?
>


