Hello,

We have NetApp C800 arrays behind our OpenStack deployment, and we have configured Cinder to use the NVMe over TCP protocol.

We are currently seeing the following issue:
- the volumes (namespaces) are correctly mounted on the hypervisors
- the (native) multipath configuration is correctly handled by Cinder (this can be seen in the logs)
- however, the volumes are attached through only a single path

Looking at the NetApp driver code, more specifically at the nvme_library.py file and its initialize_connection function, we find the following at line 724:

portal = (target_portals[0], self.NVME_PORT, self.NVME_TRANSPORT)
data = {
    "target_nqn": str(target_nqn),
    "host_nqn": host_nqn,
    "portals": [portal],
    "vol_uuid": namespace_uuid
}
conn_info = {"driver_volume_type": "nvmeof", "data": data}

If we look at the values in the target_portals list, the array does indeed have all the paths available for the targeted subsystem.
However, the function only returns the first one: target_portals[0].
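
For illustration, here is a minimal sketch of what building one portal tuple per entry of target_portals might look like. This is only my naive guess at a fix, not the actual driver code:

# Hypothetical sketch: build one portal per discovered target portal
# instead of keeping only target_portals[0].
portals = [(p, self.NVME_PORT, self.NVME_TRANSPORT)
           for p in target_portals]
data = {
    "target_nqn": str(target_nqn),
    "host_nqn": host_nqn,
    "portals": portals,
    "vol_uuid": namespace_uuid
}
conn_info = {"driver_volume_type": "nvmeof", "data": data}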

I'm not a developer, so I can't guarantee that this is the cause of the problem we are seeing.

What is certain is that manually connecting the namespace with the nvme command does indeed yield all four paths.
I'll leave open the existing discussion thread that led me to this observation.

Thank you.

------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Vincent Godin vince.mlist@gmail.com

29 July 2025, 15:53 (3 days ago)


To Rajat, Sean, openstack-discuss
Hello,

Here are some results from the host.

An instance is launched by OpenStack on the compute node:

nvme list
Node                  Generic               SN                   Model                                    Namespace  Usage                      Format           FW Rev  
--------------------- --------------------- -------------------- ---------------------------------------- ---------- -------------------------- ---------------- --------
/dev/nvme0n1          /dev/ng0n1            81O3QJXiLzBDAAAAAAAH NetApp ONTAP Controller                  0x2         16.11  GB /  16.11  GB      4 KiB +  0 B   FFFFFFFF

If we have a look at the subsystem:

nvme list-subsys
nvme-subsys0 - NQN=nqn.1992-08.com.netapp:sn.ec2c63655c3d11f0a40ad039eaba99f2:subsystem.openstack-79f1de4a-6645-4b47-9377-f06db6c2e0b5
               hostnqn=nqn.2014-08.org.nvmexpress:uuid:629788a4-04c6-547c-9121-8d7a39c17fe9
               iopolicy=round-robin
\
 +- nvme0 tcp traddr=10.10.184.3,trsvcid=4420,src_addr=10.10.184.33 live

I have only one path.
I disconnect from the subsystem manually:

nvme disconnect -n nqn.1992-08.com.netapp:sn.ec2c63655c3d11f0a40ad039eaba99f2:subsystem.openstack-79f1de4a-6645-4b47-9377-f06db6c2e0b5
NQN:nqn.1992-08.com.netapp:sn.ec2c63655c3d11f0a40ad039eaba99f2:subsystem.openstack-79f1de4a-6645-4b47-9377-f06db6c2e0b5 disconnected 1 controller(s)

I reconnect to the subsystem with a manual command:

nvme connect-all -t tcp  -a 10.10.186.3

nvme list
Node                  Generic               SN                   Model                                    Namespace  Usage                      Format           FW Rev  
--------------------- --------------------- -------------------- ---------------------------------------- ---------- -------------------------- ---------------- --------
/dev/nvme0n2          /dev/ng0n2            81O3QJXiLzBDAAAAAAAH NetApp ONTAP Controller                  0x2         16.11  GB /  16.11  GB      4 KiB +  0 B   FFFFFFFF

And if we look at the subsystem:

nvme list-subsys
nvme-subsys0 - NQN=nqn.1992-08.com.netapp:sn.ec2c63655c3d11f0a40ad039eaba99f2:subsystem.openstack-79f1de4a-6645-4b47-9377-f06db6c2e0b5
               hostnqn=nqn.2014-08.org.nvmexpress:uuid:629788a4-04c6-547c-9121-8d7a39c17fe9
               iopolicy=round-robin
\
 +- nvme5 tcp traddr=10.10.184.3,trsvcid=4420,src_addr=10.10.184.33 live
 +- nvme4 tcp traddr=10.10.186.3,trsvcid=4420,src_addr=10.10.186.33 live
 +- nvme3 tcp traddr=10.10.184.4,trsvcid=4420,src_addr=10.10.184.33 live
 +- nvme2 tcp traddr=10.10.186.4,trsvcid=4420,src_addr=10.10.186.33 live

As you can see, I have four paths.
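
As I understand it, nvme connect-all first queries the discovery service at the given address and then connects to every portal it reports; with native multipath enabled (nvme_core.multipath=Y, see below), the kernel merges the resulting controllers into a single multipath namespace. As a rough illustration of what those per-portal connects amount to (a sketch assuming the four portals listed above, not what os-brick actually runs):

import subprocess

TARGET_NQN = (
    "nqn.1992-08.com.netapp:sn.ec2c63655c3d11f0a40ad039eaba99f2:"
    "subsystem.openstack-79f1de4a-6645-4b47-9377-f06db6c2e0b5")

# The four portals reported by the discovery service above.
PORTALS = [("10.10.184.3", "4420"), ("10.10.186.3", "4420"),
           ("10.10.184.4", "4420"), ("10.10.186.4", "4420")]

for traddr, trsvcid in PORTALS:
    # Each 'nvme connect' creates one controller; the kernel's native
    # multipath then exposes them as the four paths seen in
    # 'nvme list-subsys'.
    subprocess.run(["nvme", "connect", "-t", "tcp", "-n", TARGET_NQN,
                    "-a", traddr, "-s", trsvcid], check=True)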

Configuration details about multipath:

- in nova.conf:
[libvirt]
volume_use_multipath = True
- in cinder.conf:
[DEFAULT]
target_protocol = nvmet_tcp
...
[netapp-backend]
use_multipath_for_image_xfer = True
netapp_storage_protocol = nvme
...

cat /sys/module/nvme_core/parameters/multipath
Y

nova-compute.log
grep -i get_connector_properties /var/log/kolla/nova/nova-compute.log

2025-07-29 14:09:51.553 7 DEBUG os_brick.initiator.connectors.lightos [None req-faf2b0ca-0709-4a70-8302-fa90ad293fd3 4e2ddaf17ee747f2a1f03a392943f80a cb513debb0834ec5b6588356a960bad9 - - default default] LIGHTOS: finally hostnqn: nqn.2014-08.org.nvmexpress:uuid:629788a4-04c6-547c-9121-8d7a39c17fe9 dsc:  get_connector_properties /var/lib/kolla/venv/lib/python3.12/site-packages/os_brick/initiator/connectors/lightos.py:115

2025-07-29 14:09:51.553 7 DEBUG os_brick.utils [None req-faf2b0ca-0709-4a70-8302-fa90ad293fd3 4e2ddaf17ee747f2a1f03a392943f80a cb513debb0834ec5b6588356a960bad9 - - default default] <== get_connector_properties: return (30ms) {'platform': 'x86_64', 'os_type': 'linux', 'ip': '10.10.52.161', 'host': 'pkc-dcp-cpt-03', 'multipath': True, 'enforce_multipath': True, 'initiator': 'iqn.2004-10.com.ubuntu:01:d0bb7aa9bcf1', 'do_local_attach': False, 'nvme_hostid': '5ca8b6d2-aa7d-42d8-bf74-c18484fab68c', 'system uuid': '31343550-3939-5a43-4a44-305930304c48', 'nqn': 'nqn.2014-08.org.nvmexpress:uuid:629788a4-04c6-547c-9121-8d7a39c17fe9', 'nvme_native_multipath': True, 'found_dsc': '', 'host_ips': ['10.20.128.33', '10.10.184.33', '10.10.186.33', '10.10.52.161', '10.10.22.161', '10.234.2.161', '10.10.50.161', '172.17.0.1', 'fe80::7864:3eff:fe13:5e1f', 'fe80::fc16:3eff:fe7f:3430', 'fe80::4c20:48ff:fe0f:2660']} trace_logging_wrapper /var/lib/kolla/venv/lib/python3.12/site-packages/os_brick/utils.py:204

multipathd
systemctl status multipathd.service
○ multipathd.service
     Loaded: masked (Reason: Unit multipathd.service is masked.)
     Active: inactive (dead)

If you can see any reason why OpenStack connects to the subsystem with only one path, please let me know!

Thanks

------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Rajat Dhasmana

30 July 2025, 12:37 (2 days ago)


To me, Sean, openstack-discuss
On Wed, Jul 30, 2025 at 3:15 PM Vincent Godin <vince.mlist@gmail.com> wrote:
Hello guys,

Some more information found in nova-compute.log:

- try iSCSI:

2025-07-29 14:09:51.523 1222 DEBUG oslo_concurrency.processutils [-] Running cmd (subprocess): cat /etc/iscsi/initiatorname.iscsi execute /var/lib/kolla/venv/lib/python3.12/site-packages/oslo_concurrency/processutils.py:349
2025-07-29 14:09:51.528 1222 DEBUG oslo_concurrency.processutils [-] CMD "cat /etc/iscsi/initiatorname.iscsi" returned: 0 in 0.005s execute /var/lib/kolla/venv/lib/python3.12/site-packages/oslo_concurrency/processutils.py:372
2025-07-29 14:09:51.528 1222 DEBUG oslo.privsep.daemon [-] privsep: reply[90a51cdb-5701-4339-b059-fefb0b79b7a5]: (4, ('## DO NOT EDIT OR REMOVE THIS FILE!\n## If you remove this file, the iSCSI daemon will not start.\n## If you change the InitiatorName, existing access control lists\n## may reject this initiator.  The InitiatorName must be unique\n## for each iSCSI initiator.  Do NOT duplicate iSCSI InitiatorNames.\nInitiatorName=iqn.2004-10.com.ubuntu:01:d0bb7aa9bcf1\n', '')) _call_back /var/lib/kolla/venv/lib/python3.12/site-packages/oslo_privsep/daemon.py:503

- try lightos (?):

2025-07-29 14:09:51.552 7 DEBUG os_brick.initiator.connectors.lightos [None req-faf2b0ca-0709-4a70-8302-fa90ad293fd3 4e2ddaf17ee747f2a1f03a392943f80a cb513debb0834ec5b6588356a960bad9 - - default default] LIGHTOS: [Errno 111] ECONNREFUSED find_dsc /var/lib/kolla/venv/lib/python3.12/site-packages/os_brick/initiator/connectors/lightos.py:135
2025-07-29 14:09:51.553 7 INFO os_brick.initiator.connectors.lightos [None req-faf2b0ca-0709-4a70-8302-fa90ad293fd3 4e2ddaf17ee747f2a1f03a392943f80a cb513debb0834ec5b6588356a960bad9 - - default default] Current host hostNQN nqn.2014-08.org.nvmexpress:uuid:629788a4-04c6-547c-9121-8d7a39c17fe9 and IP(s) are ['10.20.128.33', '10.10.184.33', '10.10.186.33', '10.10.52.161', '10.10.22.161', '10.234.2.161', '10.10.50.161', '172.17.0.1', 'fe80::7864:3eff:fe13:5e1f', 'fe80::fc16:3eff:fe7f:3430', 'fe80::4c20:48ff:fe0f:2660']
2025-07-29 14:09:51.553 7 DEBUG os_brick.initiator.connectors.lightos [None req-faf2b0ca-0709-4a70-8302-fa90ad293fd3 4e2ddaf17ee747f2a1f03a392943f80a cb513debb0834ec5b6588356a960bad9 - - default default] LIGHTOS: did not find dsc, continuing anyway. get_connector_properties /var/lib/kolla/venv/lib/python3.12/site-packages/os_brick/initiator/connectors/lightos.py:112
2025-07-29 14:09:51.553 7 DEBUG os_brick.initiator.connectors.lightos [None req-faf2b0ca-0709-4a70-8302-fa90ad293fd3 4e2ddaf17ee747f2a1f03a392943f80a cb513debb0834ec5b6588356a960bad9 - - default default] LIGHTOS: finally hostnqn: nqn.2014-08.org.nvmexpress:uuid:629788a4-04c6-547c-9121-8d7a39c17fe9 dsc:  get_connector_properties /var/lib/kolla/venv/lib/python3.12/site-packages/os_brick/initiator/connectors/lightos.py:115

- then:

2025-07-29 14:09:51.553 7 DEBUG os_brick.utils [None req-faf2b0ca-0709-4a70-8302-fa90ad293fd3 4e2ddaf17ee747f2a1f03a392943f80a cb513debb0834ec5b6588356a960bad9 - - default default] <== get_connector_properties: return (30ms) {'platform': 'x86_64', 'os_type': 'linux', 'ip': '10.10.52.161', 'host': 'pkc-dcp-cpt-03', 'multipath': True, 'enforce_multipath': True, 'initiator': 'iqn.2004-10.com.ubuntu:01:d0bb7aa9bcf1', 'do_local_attach': False, 'nvme_hostid': '5ca8b6d2-aa7d-42d8-bf74-c18484fab68c', 'system uuid': '31343550-3939-5a43-4a44-305930304c48', 'nqn': 'nqn.2014-08.org.nvmexpress:uuid:629788a4-04c6-547c-9121-8d7a39c17fe9', 'nvme_native_multipath': True, 'found_dsc': '', 'host_ips': ['10.20.128.33', '10.10.184.33', '10.10.186.33', '10.10.52.161', '10.10.22.161', '10.234.2.161', '10.10.50.161', '172.17.0.1', 'fe80::7864:3eff:fe13:5e1f', 'fe80::fc16:3eff:fe7f:3430', 'fe80::4c20:48ff:fe0f:2660']} trace_logging_wrapper /var/lib/kolla/venv/lib/python3.12/site-packages/os_brick/utils.py:204

'multipath': True, 'enforce_multipath': True
This shows that the multipath configuration is set correctly.

It would be good to search for this log entry [1] in the cinder-volume logs and check the portals field to verify how many portals the NetApp NVMe driver returns.

[1] Initialize connection info:
https://github.com/openstack/cinder/blob/master/cinder/volume/drivers/netapp/dataontap/nvme_library.py#L732
 
2025-07-29 14:09:51.554 7 DEBUG nova.virt.block_device [None req-faf2b0ca-0709-4a70-8302-fa90ad293fd3 4e2ddaf17ee747f2a1f03a392943f80a cb513debb0834ec5b6588356a960bad9 - - default default] [instance: 3fcb3e36-1890-44f7-9c3c-283c05e91910] Updating existing volume attachment record: b81aea6e-f2ae-4781-8c2e-3b7f1606ba0d _volume_attach /var/lib/kolla/venv/lib/python3.12/site-packages/nova/virt/block_device.py:666
2025-07-29 14:09:53.680 7 DEBUG os_brick.initiator.connectors.nvmeof [None req-faf2b0ca-0709-4a70-8302-fa90ad293fd3 4e2ddaf17ee747f2a1f03a392943f80a cb513debb0834ec5b6588356a960bad9 - - default default] ==> connect_volume: call "{'self': <os_brick.initiator.connectors.nvmeof.NVMeOFConnector object at 0x7cf65c576090>, 'connection_properties': {'target_nqn': 'nqn.1992-08.com.netapp:sn.ec2c63655c3d11f0a40ad039eaba99f2:subsystem.openstack-79f1de4a-6645-4b47-9377-f06db6c2e0b5', 'host_nqn': 'nqn.2014-08.org.nvmexpress:uuid:629788a4-04c6-547c-9121-8d7a39c17fe9', 'portals': [['10.10.184.3', 4420, 'tcp']], 'vol_uuid': '69da9918-7e84-4ee4-b7bb-9b50e3e6d739', 'qos_specs': None, 'access_mode': 'rw', 'encrypted': False, 'cacheable': False, 'enforce_multipath': True}}" trace_logging_wrapper /var/lib/kolla/venv/lib/python3.12/site-packages/os_brick/utils.py:177

'portals': [['10.10.184.3', 4420, 'tcp']]
Here we can see that only one portal is returned by the NetApp driver.
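
For comparison, if the driver returned every path of the subsystem, the portals field would presumably have to look something like this (illustrative only, reusing the four portals seen in the manual connect-all earlier in the thread):

'portals': [['10.10.184.3', 4420, 'tcp'],
            ['10.10.186.3', 4420, 'tcp'],
            ['10.10.184.4', 4420, 'tcp'],
            ['10.10.186.4', 4420, 'tcp']]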