OpenStack HPC InfiniBand question

Satish Patel satish.txt at gmail.com
Wed Feb 16 14:44:08 UTC 2022


Thank you Mahendra,

I compiled a hello world program with OpenMPI built against UCX,
turned off the fallback to TCP while running MPI jobs, and it works. I
also tested the IB interface with ib_read_bw between the two nodes and
almost hit 97 Gbps of bandwidth.
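
For reference, this is roughly how I ran the tests; the exact MCA
parameters and UCX transport list may differ for your OpenMPI/UCX
build, and hello_world is just my test binary:

# force the UCX PML and fail instead of falling back to TCP
mpirun -np 2 -H ib-1,ib-2 \
    --mca pml ucx --mca btl ^tcp \
    -x UCX_TLS=rc,sm,self -x UCX_NET_DEVICES=mlx5_0:1 \
    ./hello_world

# IMB PingPong with the same settings (assuming the Intel MPI
# Benchmarks are built on both nodes)
mpirun -np 2 -H ib-1,ib-2 --mca pml ucx --mca btl ^tcp IMB-MPI1 PingPong

# raw verbs bandwidth: server on ib-2, client on ib-1
root@ib-2:~# ib_read_bw -F --report_gbits
root@ib-1:~# ib_read_bw -F --report_gbits ib-2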

ib-1 and ib-2 are my two VM instances running on two different hypervisors.

root@ib-1:~# ib_write_bw -F --report_gbits ib-2
---------------------------------------------------------------------------------------
                    RDMA_Write BW Test
 Dual-port       : OFF          Device         : mlx5_0
 Number of qps   : 1            Transport type : IB
 Connection type : RC           Using SRQ      : OFF
 PCIe relax order: ON
 ibv_wr* API     : ON
 TX depth        : 128
 CQ Moderation   : 1
 Mtu             : 4096[B]
 Link type       : IB
 Max inline data : 0[B]
 rdma_cm QPs     : OFF
 Data ex. method : Ethernet
---------------------------------------------------------------------------------------
 local address: LID 0x26 QPN 0x03f0 PSN 0xae88e6 RKey 0x020464 VAddr 0x007fba82a4b000
 remote address: LID 0x2a QPN 0x03f2 PSN 0x5ac9fa RKey 0x020466 VAddr 0x007fe68a5cf000
---------------------------------------------------------------------------------------
 #bytes     #iterations    BW peak[Gb/sec]    BW average[Gb/sec]   MsgRate[Mpps]
 65536      5000             97.08              97.02     0.185048
---------------------------------------------------------------------------------------


Based on all my validation tests so far, my InfiniBand network looks
to be working. I am just curious how folks run HPC on OpenStack: do
you use IPoIB, RDMA, or is there a better and simpler way to deploy
HPC on OpenStack?
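
For the IPoIB option, I assume it would just mean putting addresses on
the IB interfaces inside the guests, something like the following (the
interface name is from my earlier mail, the addresses are made up):

root@ib-1:~# ip addr add 192.168.100.11/24 dev ibs5
root@ib-1:~# ip link set ibs5 up
root@ib-2:~# ip addr add 192.168.100.12/24 dev ibs5
root@ib-2:~# ip link set ibs5 up

but I am not sure whether people manage this by hand, via cloud-init,
or through the neutron Mellanox agent I mentioned in my first mail.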

On Wed, Feb 16, 2022 at 5:23 AM Mahendra Paipuri
<mahendra.paipuri at cnrs.fr> wrote:
>
> Hello,
>
> I have worked on IB before, although not on top of OpenStack. As the
> primary purpose of IB is RDMA, you should check that it is working
> on your instances. I am not quite sure a simple hello world code is
> sufficient to test RDMA functionality, because if there is any issue
> with the IB stack, MPI implementations tend to fall back to TCP for
> communications. One thing you can do is install the Linux rdma-core
> package [1] (if you have not already done so) and use the `ibstat`
> command to check that your IB ports are up and running. Then you can
> build OpenMPI with UCX [2] and run a PingPong test [3] to see whether
> you are getting the proper bandwidth for your ConnectX card type.
>
> If you are planning to do more HPC tests on an OpenStack cloud, I
> suggest you look into HPC package managers like Spack [4] or
> EasyBuild [5] to build the HPC software stack easily. StackHPC has
> been working on HPC over OpenStack clouds and has developed some
> tools [6] which might be of interest to you. I hope that helps!
>
> -
>
> Mahendra
>
> [1] https://github.com/linux-rdma/rdma-core
>
> [2] https://openucx.readthedocs.io/en/master/running.html#openmpi-with-ucx
>
> [3]
> https://www.intel.com/content/www/us/en/develop/documentation/imb-user-guide/top/mpi-1-benchmarks/imb-p2p-benchmarks/pingpong.html
>
> [4] https://spack-tutorial.readthedocs.io/en/latest/
>
> [5] https://docs.easybuild.io/
>
> [6] https://github.com/stackhpc
>
> On 16/02/2022 05:31, Satish Patel wrote:
> > Hi all,
> >
> > I am playing with an HPC deployment on an OpenStack cloud and I have
> > a Mellanox InfiniBand NIC. I have a couple of deployment questions
> > regarding the InfiniBand network. I am new to IB, so excuse me if I
> > ask noob questions.
> >
> > I have configured the Mellanox card for SR-IOV and created a flavor
> > with the property pci_passthrough:alias='mlx5-sriov-ib:1' to expose a
> > VF to my instance. So far so good: I can see the IB interface inside
> > my VM and it is active. (I am running the subnet manager on the
> > InfiniBand hardware switch.)
> >
> > root@ib-vm:~# ethtool -i ibs5
> > driver: mlx5_core[ib_ipoib]
> > version: 5.5-1.0.3
> > firmware-version: 20.28.1002 (MT_0000000222)
> > expansion-rom-version:
> > bus-info: 0000:00:05.0
> > supports-statistics: yes
> > supports-test: yes
> > supports-eeprom-access: no
> > supports-register-dump: no
> > supports-priv-flags: yes
> >
> > I didn't configure any IP address on the ibs5 interface. For testing
> > purposes I compiled an MPI hello world program as a proof of concept
> > for my InfiniBand network between two instances, and I am able to run
> > the MPI sample program successfully.
> >
> > Somewhere I read about the neutron Mellanox agent for setting up
> > IPoIB with segmentation, but I am not sure how complex that is and
> > what the advantages are over a simple SR-IOV passthrough.
> >
> > Is this the correct way to set up an HPC cluster on OpenStack, or is
> > there a better way to design HPC on OpenStack?
> >
>
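
PS: for completeness, the Nova side of the SR-IOV setup I described in
my quoted mail above looks roughly like this; the vendor/product IDs
below are placeholders, so check yours with lspci -nn on the
hypervisor:

# /etc/nova/nova.conf: the whitelist goes on the compute node, the
# alias on both the API/controller and compute nodes
[pci]
passthrough_whitelist = { "vendor_id": "15b3", "product_id": "101c" }
alias = { "vendor_id": "15b3", "product_id": "101c", "device_type": "type-VF", "name": "mlx5-sriov-ib" }

# flavor property exposing one VF per instance (as in my first mail)
openstack flavor set --property "pci_passthrough:alias"="mlx5-sriov-ib:1" <flavor-name>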


