More info: the server is actually running CentOS 7.6 (one of the few that didn't recently get updated)

System has 5 disks configured in an md RAID5 set as md126
md126 : active raid5 sdf[4] sdb[0] sde[3] sdc[1] sdd[2]
      11720536064 blocks super 1.2 level 5, 512k chunk, algorithm 2 [5/5] [UUUUU]
      bitmap: 6/22 pages [24KB], 65536KB chunk

LVM filter excludes the RAID member disks (sdb-sdf) : filter = [ "r|^/dev/sd[bcdef]|" ]
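As a quick sanity check of what that filter pattern covers, here's a small sketch (shell globs used as a stand-in for LVM's regex syntax; device names are just the ones on this host):

```shell
# The filter r|^/dev/sd[bcdef]| should reject exactly the five md members
# sdb..sdf and leave other sd devices visible to LVM.
for dev in /dev/sda /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg; do
  case "$dev" in
    /dev/sd[bcdef]*) echo "$dev: rejected" ;;
    *)               echo "$dev: accepted" ;;
  esac
done
```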

boot.log complains about 5 dm devices:
[FAILED] Failed to start LVM2 PV scan on device 253:55.
[FAILED] Failed to start LVM2 PV scan on device 253:47.
[FAILED] Failed to start LVM2 PV scan on device 253:50.
[FAILED] Failed to start LVM2 PV scan on device 253:56.
[FAILED] Failed to start LVM2 PV scan on device 253:34.

Typical message:
[FAILED] Failed to start LVM2 PV scan on device 253:47.
See 'systemctl status lvm2-pvscan@253:47.service' for details.

output of systemctl status:
systemctl status lvm2-pvscan@253:55.service
● lvm2-pvscan@253:55.service - LVM2 PV scan on device 253:55
   Loaded: loaded (/usr/lib/systemd/system/lvm2-pvscan@.service; static; vendor preset: disabled)
   Active: failed (Result: exit-code) since Thu 2020-09-24 09:26:58 EDT; 5h 44min ago
     Docs: man:pvscan(8)
  Process: 17395 ExecStart=/usr/sbin/lvm pvscan --cache --activate ay %i (code=exited, status=5)
 Main PID: 17395 (code=exited, status=5)

Sep 24 09:26:58 stack3 systemd[1]: Starting LVM2 PV scan on device 253:55...
Sep 24 09:26:58 stack3 lvm[17395]: Multiple VGs found with the same name: skipping encrypted_vg
Sep 24 09:26:58 stack3 lvm[17395]: Use --select vg_uuid=<uuid> in place of the VG name.
Sep 24 09:26:58 stack3 systemd[1]: lvm2-pvscan@253:55.service: main process exited, code=exited, status=5/NOTINSTALLED
Sep 24 09:26:58 stack3 systemd[1]: Failed to start LVM2 PV scan on device 253:55.
Sep 24 09:26:58 stack3 systemd[1]: Unit lvm2-pvscan@253:55.service entered failed state.
Sep 24 09:26:58 stack3 systemd[1]: lvm2-pvscan@253:55.service failed.
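The status=5 failure looks like a knock-on effect of the duplicated VG name, and the log itself suggests the workaround: select the VG by UUID instead of by name. A minimal sketch of that, assuming it's run as root on the affected host; the UUID below is a placeholder, not a value from this system:

```shell
# List VG names alongside their UUIDs to tell apart the two VGs that are
# both named "encrypted_vg" (run as root on the affected host):
#   vgs -o vg_name,vg_uuid
# Then activate the intended copy by UUID, per the log's own hint.
# VG_UUID is a placeholder to be replaced with a real UUID from vgs.
VG_UUID="REPLACE-WITH-REAL-UUID"
echo "vgchange -ay --select vg_uuid=${VG_UUID}"
```

Activating by vg_uuid sidesteps the ambiguous name; it doesn't by itself resolve the duplicate PVs underneath.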



On Thu, Sep 24, 2020 at 2:07 PM Alan Davis <alan.davis@apogee-research.com> wrote:
This morning my CentOS 7.7 RDO packstack installation of Rocky hung. On reboot some of the VMs won't start. This is a primary system and I need to find the most expedient way to recover without losing data. I'm not using LVM thin volumes.

Any help is appreciated.

Looking at nova-compute.log I see errors trying to find LUN 0 during the sysfs stage.

Several machines won't boot because their root disk entries in LVM are seen as PVs, and booting them doesn't find them in the DM subsystem.
Other machines boot, but their attached disks throw LVM errors about duplicate PVs and about preferring the cinder-volumes VG version.

LVM is showing LVs that have both "bare" entries and entries in cinder-volumes, and it's complaining about duplicate PVs, not using lvmetad, and preferring some entries because they are in the dm subsystem.
I've verified that, so far, I haven't lost any data. A "bare" LV that isn't in use by the DM subsystem (because its server won't boot) can be mounted on the OpenStack host, and all data on it is accessible.

This host has rebooted cleanly multiple times in the past. This is the first time it's shown any problems.

Am I missing an LVM filter? (Unlikely, since one wasn't needed before.)
How can I reset the LVM configuration and convince it that it's not seeing duplicate PVs?
How do I ensure that OpenStack sees the right UUID and volume ID?
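On the filter question, one common approach (an assumption on my part, not something verified on this host) is to also reject the raw iSCSI-backed sd paths in lvm.conf, so each PV is seen only once, via its dm device. A sketch, listing only the six devices named in the duplicate-PV warnings (sdm, sdp, sdr, sdt, sdu, sdv); the real device set would need to be confirmed before applying:

```
# /etc/lvm/lvm.conf (sketch; device letters taken from the duplicate-PV
# warnings in the lvs output below -- verify against the actual host)
# Reject the raw sd paths that shadow the cinder-volumes LVs, keep the
# existing RAID-member rejections, accept everything else:
global_filter = [ "r|^/dev/sd[bcdef]|", "r|^/dev/sd[mprtuv]|", "a|.*|" ]
```

After changing the filter, the warning's own advice applies: run "pvscan --cache" to re-enable lvmetad.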

Excerpts from the error log and output of lvs:
--- nova-compute.log --- during VM start
2020-09-24 11:15:27.091 13953 INFO os_brick.initiator.connectors.iscsi [req-8d15fb6a-6324-471e-9497-587885eef8f6 396aeda6552f44fdac5f878b90325ee1 54af92f2bb494355b96024076184d1c8 - default default] Trying to connect to iSCSI portal 172.10.0.40:3260
2020-09-24 11:15:29.721 13953 WARNING nova.compute.manager [req-fd32e16f-c879-402f-a32c-6be45a943c34 48af9a366301467d9fec912fd1c072c6 f9fc7b412a8446d083da1356aa370eb4 - default default] [instance: de7d740c-786a-4aa2-aa09-d447ae7e14b6] Received unexpected event network-vif-unplugged-79aff403-d2e4-4266-bd88-d7bd19d501a9 for instance with vm_state stopped and task_state powering-on.
2020-09-24 11:16:21.361 13953 WARNING os_brick.initiator.connectors.iscsi [req-8d15fb6a-6324-471e-9497-587885eef8f6 396aeda6552f44fdac5f878b90325ee1 54af92f2bb494355b96024076184d1c8 - default default] LUN 0 on iSCSI portal 172.10.0.40:3260 not found on sysfs after logging in.
2020-09-24 11:16:23.482 13953 INFO os_brick.initiator.connectors.iscsi [req-8d15fb6a-6324-471e-9497-587885eef8f6 396aeda6552f44fdac5f878b90325ee1 54af92f2bb494355b96024076184d1c8 - default default] Trying to connect to iSCSI portal 172.10.0.40:3260
2020-09-24 11:17:17.741 13953 WARNING os_brick.initiator.connectors.iscsi [req-8d15fb6a-6324-471e-9497-587885eef8f6 396aeda6552f44fdac5f878b90325ee1 54af92f2bb494355b96024076184d1c8 - default default] LUN 0 on iSCSI portal 172.10.0.40:3260 not found on sysfs after logging in.: VolumeDeviceNotFound: Volume device not found at .
2020-09-24 11:17:21.864 13953 INFO os_brick.initiator.connectors.iscsi [req-8d15fb6a-6324-471e-9497-587885eef8f6 396aeda6552f44fdac5f878b90325ee1 54af92f2bb494355b96024076184d1c8 - default default] Trying to connect to iSCSI portal 172.10.0.40:3260
2020-09-24 11:18:16.113 13953 WARNING os_brick.initiator.connectors.iscsi [req-8d15fb6a-6324-471e-9497-587885eef8f6 396aeda6552f44fdac5f878b90325ee1 54af92f2bb494355b96024076184d1c8 - default default] LUN 0 on iSCSI portal 172.10.0.40:3260 not found on sysfs after logging in.: VolumeDeviceNotFound: Volume device not found at .
2020-09-24 11:18:17.252 13953 INFO nova.compute.manager [req-8d15fb6a-6324-471e-9497-587885eef8f6 396aeda6552f44fdac5f878b90325ee1 54af92f2bb494355b96024076184d1c8 - default default] [instance: de7d740c-786a-4aa2-aa09-d447ae7e14b6] Successfully reverted task state from powering-on on failure for instance.
2020-09-24 11:18:17.279 13953 ERROR oslo_messaging.rpc.server [req-8d15fb6a-6324-471e-9497-587885eef8f6 396aeda6552f44fdac5f878b90325ee1 54af92f2bb494355b96024076184d1c8 - default default] Exception during message handling: VolumeDeviceNotFound: Volume device not found at .
2020-09-24 11:18:17.279 13953 ERROR oslo_messaging.rpc.server Traceback (most recent call last):
2020-09-24 11:18:17.279 13953 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/server.py", line 163, in _process_incoming
2020-09-24 11:18:17.279 13953 ERROR oslo_messaging.rpc.server     res = self.dispatcher.dispatch(message)

 
--- lvs output ---
I've annotated one machine's disks to illustrate the relationship between the volume-* entries in the cinder-volumes VG and the "bare" LVs seen as directly accessible from the host.
There are 3 servers that won't boot; they are the ones whose home/vg_home and encrypted_home/encrypted_vg entries are shown.

  WARNING: Not using lvmetad because duplicate PVs were found.
  WARNING: Use multipath or vgimportclone to resolve duplicate PVs?
  WARNING: After duplicates are resolved, run "pvscan --cache" to enable lvmetad.
  WARNING: Not using device /dev/sdu for PV yZy8Xk-foKT-ovjV-0EZv-VxEM-GqiP-WH7k53. == backup_lv/encrypted_vg
  WARNING: Not using device /dev/sdv for PV tHA9ui-eSIO-MDmI-RM3u-3Bf4-Dznb-Ha3XfP. == varoptgitlab/encrypted_vg
  WARNING: Not using device /dev/sdm for PV 5eoyCa-sMO4-b7O4-jIfh-byZE-L5pS-3lOu0D.
  WARNING: Not using device /dev/sdp for PV 3BI0nV-TP0k-rgPC-PrjH-FT7z-reMe-ec1spj.
  WARNING: Not using device /dev/sdt for PV ILdbcY-VFCm-fnH6-Y3jc-pdWZ-fnl8-PH3TPe. == storage_lv/encrypted_vg
  WARNING: Not using device /dev/sdr for PV zowU2N-oaBh-r4cO-cxgX-YYiq-Kf3q-mqlHfK.
  WARNING: PV yZy8Xk-foKT-ovjV-0EZv-VxEM-GqiP-WH7k53 prefers device /dev/cinder-volumes/volume-c8da1abf-7143-422c-9ee5-b2724a71c8ff because device is in dm subsystem.
  WARNING: PV tHA9ui-eSIO-MDmI-RM3u-3Bf4-Dznb-Ha3XfP prefers device /dev/cinder-volumes/volume-0a12012f-8c2e-41fb-aa0c-a7ae99c62487 because device is in dm subsystem.
  WARNING: PV 5eoyCa-sMO4-b7O4-jIfh-byZE-L5pS-3lOu0D prefers device /dev/cinder-volumes/volume-990a057c-46cc-4a81-ba02-28b72c34791d because device is in dm subsystem.
  WARNING: PV 3BI0nV-TP0k-rgPC-PrjH-FT7z-reMe-ec1spj prefers device /dev/cinder-volumes/volume-b6a9da6e-1958-46ea-90b4-ac1aebed8c04 because device is in dm subsystem.
  WARNING: PV ILdbcY-VFCm-fnH6-Y3jc-pdWZ-fnl8-PH3TPe prefers device /dev/cinder-volumes/volume-302dd53b-7d05-4f6d-9ada-8f2ed6e1d4c6 because device is in dm subsystem.
  WARNING: PV zowU2N-oaBh-r4cO-cxgX-YYiq-Kf3q-mqlHfK prefers device /dev/cinder-volumes/volume-df006472-be7a-4957-972a-1db4463f5d67 because device is in dm subsystem.
  LV                                             VG             Attr       LSize    Pool Origin                                      Data%  Meta%  Move Log Cpy%Sync Convert
  home                                           centos_stack3  -wi-ao----    4.00g                                                                                        
  root                                           centos_stack3  -wi-ao----   50.00g                                                                                        
  swap                                           centos_stack3  -wi-ao----    4.00g                                                                                        
  _snapshot-05b1e46b-1ae3-4cd0-9117-3fb53a6d94b0 cinder-volumes swi-a-s---   20.00g      volume-1d0ff5d5-93a3-44e8-8bfa-a9290765c8c6 0.00                                  
  lv_filestore                                   cinder-volumes -wi-ao----    1.00t                                                                                        
...
  volume-c8da1abf-7143-422c-9ee5-b2724a71c8ff    cinder-volumes -wi-ao----  100.00g                                                                                        
  volume-0a12012f-8c2e-41fb-aa0c-a7ae99c62487    cinder-volumes -wi-ao----   60.00g                                                                                        
  volume-990a057c-46cc-4a81-ba02-28b72c34791d    cinder-volumes -wi-ao----  200.00g                                                                                        
  volume-b6a9da6e-1958-46ea-90b4-ac1aebed8c04    cinder-volumes -wi-ao----   30.00g                                                                                        
  volume-302dd53b-7d05-4f6d-9ada-8f2ed6e1d4c6    cinder-volumes -wi-ao----   60.00g                                                                                        
  volume-df006472-be7a-4957-972a-1db4463f5d67    cinder-volumes -wi-ao----  250.00g                                                                                        
...
  volume-f3250e15-bb9c-43d1-989d-8a8f6635a416    cinder-volumes -wi-ao----   20.00g                                                                                        
  volume-fc1d5fcb-fda1-456b-a89d-582b7f94fb04    cinder-volumes -wi-ao----  300.00g                                                                                        
  volume-fc50a717-0857-4da3-93cb-a55292f7ed6d    cinder-volumes -wi-ao----   20.00g                                                                                        
  volume-ff94e2d6-449b-495d-82e6-0debd694c1dd    cinder-volumes -wi-ao----   20.00g                                                                                        
  data2                                          data2_vg       -wi-a----- <300.00g                                                                                        
  data                                           data_vg        -wi-a-----    1.79t                                                                                        
  backup_lv                                      encrypted_vg   -wi------- <100.00g  == ...WH7k53                                                                                
  storage_lv                                     encrypted_vg   -wi-------  <60.00g  == ...PH3TPe                                                                                
  varoptgitlab_lv                                encrypted_vg   -wi------- <200.00g                                                                                        
  varoptgitlab_lv                                encrypted_vg   -wi-------  <30.00g                                                                                        
  varoptgitlab_lv                                encrypted_vg   -wi-------  <60.00g  == ...Ha3XfP
  encrypted_home                                 home_vg        -wi-a-----  <40.00g                                                                                        
  encrypted_home                                 home_vg        -wi-------  <60.00g                                                                                        
  pub                                            pub_vg         -wi-a-----  <40.00g                                                                                        
  pub_lv                                         pub_vg         -wi------- <250.00g                                                                                        
  rpms                                           repo           -wi-a-----  499.99g                                                                                        
  home                                           vg_home        -wi-a-----  <40.00g                                                                                        
  gtri_pub                                       vg_pub         -wi-a-----   20.00g                                                                                        
  pub                                            vg_pub         -wi-a-----  <40.00g             
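To keep track of which raw device was skipped for which PV UUID, the duplicate-PV warnings above can be tabulated with a one-liner. Shown here against a pasted two-line excerpt; on the host itself the same awk can be fed the real stderr (e.g. lvs 2>&1 | awk '/Not using device/ {print $5, $8}'):

```shell
# Extract "device  PV-UUID" pairs from LVM's "Not using device" warnings.
awk '/Not using device/ {print $5, $8}' <<'EOF'
  WARNING: Not using device /dev/sdu for PV yZy8Xk-foKT-ovjV-0EZv-VxEM-GqiP-WH7k53.
  WARNING: Not using device /dev/sdv for PV tHA9ui-eSIO-MDmI-RM3u-3Bf4-Dznb-Ha3XfP.
EOF
```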
--
Alan Davis
Principal System Administrator
Apogee Research LLC



--
Alan Davis
Principal System Administrator
Apogee Research LLC
Office : 571.384.8941 x26
Cell : 410.701.0518