LVM misconfiguration after openstack packstack server hang and reboot

Alan Davis alan.davis at apogee-research.com
Thu Sep 24 18:07:24 UTC 2020


This morning my CentOS 7.7 RDO packstack installation of Rocky hung. On
reboot some of the VMs won't start. This is a primary system and I need to
find the most expedient way to recover without losing data. I'm not using
LVM thin volumes.

Any help is appreciated.

Looking at nova-compute.log I see errors trying to find LUN 0 during the
sysfs stage.

Several machines won't boot because their root disk entries in LVM are seen
as PVs, and booting the machines doesn't find those disks in the DM
subsystem.
Other machines boot, but their attached disks throw LVM errors about
duplicate PVs and about preferring the cinder-volumes VG version.

LVM is showing LVs that have "bare" entries as well as entries in
cinder-volumes. It complains about duplicate PVs, says it is not using
lvmetad, and prefers some entries because they are in the dm subsystem.
I've verified that, so far, I haven't lost any data. A "bare" LV that is
not in use through the DM subsystem (because its server won't boot) can be
mounted on the openstack host, and all data on it is accessible.

This host has rebooted cleanly multiple times in the past. This is the
first time it's shown any problems.

Am I missing an LVM filter? (Unlikely, since it wasn't needed before.)
How can I reset the LVM configuration and convince it that it's not seeing
duplicate PVs?
How do I ensure that openstack sees the right UUID and volume ID?
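
On the filter question: an lvm.conf filter is an ordered list of accept
("a") and reject ("r") regex patterns. The first pattern that matches a
device decides, and a device that matches no pattern is accepted. The
Python sketch below only models that matching rule, so a candidate filter
can be checked against device names before lvm.conf is touched. The
accepted devices /dev/sda2 and /dev/sdb are hypothetical placeholders, not
this host's actual PVs, and LVM's own regex engine is approximated here
with Python's re module.

```python
import re


def lvm_filter_accepts(patterns, device):
    """Return True if `device` would pass an lvm.conf-style filter list.

    Each pattern looks like 'a|regex|' (accept) or 'r|regex|' (reject);
    the character after the action letter is the delimiter. Patterns are
    tried in order, the first regex that matches decides, and a device
    that matches nothing is accepted.
    """
    for pattern in patterns:
        action, delim = pattern[0], pattern[1]
        # The regex sits between the first and the last delimiter.
        regex = pattern[2:pattern.rindex(delim)]
        if re.search(regex, device):
            return action == "a"
    return True


# Hypothetical filter: accept two assumed host PVs, reject everything else.
candidate_filter = ["a|^/dev/sda2$|", "a|^/dev/sdb$|", "r|.*|"]

if __name__ == "__main__":
    for dev in ("/dev/sda2", "/dev/sdu",
                "/dev/cinder-volumes/volume-c8da1abf-7143-422c-9ee5-b2724a71c8ff"):
        print(dev, "accepted" if lvm_filter_accepts(candidate_filter, dev) else "rejected")
```

With a filter of this shape, neither the iSCSI-attached /dev/sdX devices
nor the cinder-volumes LVs would be scanned for guest PVs by the host.
Whether that is the right fix for this system is an open question.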

Excerpts from the error log and the output of lvs:
--- nova-compute.log --- during VM start
2020-09-24 11:15:27.091 13953 INFO os_brick.initiator.connectors.iscsi [req-8d15fb6a-6324-471e-9497-587885eef8f6 396aeda6552f44fdac5f878b90325ee1 54af92f2bb494355b96024076184d1c8 - default default] Trying to connect to iSCSI portal 172.10.0.40:3260
2020-09-24 11:15:29.721 13953 WARNING nova.compute.manager [req-fd32e16f-c879-402f-a32c-6be45a943c34 48af9a366301467d9fec912fd1c072c6 f9fc7b412a8446d083da1356aa370eb4 - default default] [instance: de7d740c-786a-4aa2-aa09-d447ae7e14b6] Received unexpected event network-vif-unplugged-79aff403-d2e4-4266-bd88-d7bd19d501a9 for instance with vm_state stopped and task_state powering-on.
2020-09-24 11:16:21.361 13953 WARNING os_brick.initiator.connectors.iscsi [req-8d15fb6a-6324-471e-9497-587885eef8f6 396aeda6552f44fdac5f878b90325ee1 54af92f2bb494355b96024076184d1c8 - default default] LUN 0 on iSCSI portal 172.10.0.40:3260 not found on sysfs after logging in.
2020-09-24 11:16:23.482 13953 INFO os_brick.initiator.connectors.iscsi [req-8d15fb6a-6324-471e-9497-587885eef8f6 396aeda6552f44fdac5f878b90325ee1 54af92f2bb494355b96024076184d1c8 - default default] Trying to connect to iSCSI portal 172.10.0.40:3260
2020-09-24 11:17:17.741 13953 WARNING os_brick.initiator.connectors.iscsi [req-8d15fb6a-6324-471e-9497-587885eef8f6 396aeda6552f44fdac5f878b90325ee1 54af92f2bb494355b96024076184d1c8 - default default] LUN 0 on iSCSI portal 172.10.0.40:3260 not found on sysfs after logging in.: VolumeDeviceNotFound: Volume device not found at .
2020-09-24 11:17:21.864 13953 INFO os_brick.initiator.connectors.iscsi [req-8d15fb6a-6324-471e-9497-587885eef8f6 396aeda6552f44fdac5f878b90325ee1 54af92f2bb494355b96024076184d1c8 - default default] Trying to connect to iSCSI portal 172.10.0.40:3260
2020-09-24 11:18:16.113 13953 WARNING os_brick.initiator.connectors.iscsi [req-8d15fb6a-6324-471e-9497-587885eef8f6 396aeda6552f44fdac5f878b90325ee1 54af92f2bb494355b96024076184d1c8 - default default] LUN 0 on iSCSI portal 172.10.0.40:3260 not found on sysfs after logging in.: VolumeDeviceNotFound: Volume device not found at .
2020-09-24 11:18:17.252 13953 INFO nova.compute.manager [req-8d15fb6a-6324-471e-9497-587885eef8f6 396aeda6552f44fdac5f878b90325ee1 54af92f2bb494355b96024076184d1c8 - default default] [instance: de7d740c-786a-4aa2-aa09-d447ae7e14b6] Successfully reverted task state from powering-on on failure for instance.
2020-09-24 11:18:17.279 13953 ERROR oslo_messaging.rpc.server [req-8d15fb6a-6324-471e-9497-587885eef8f6 396aeda6552f44fdac5f878b90325ee1 54af92f2bb494355b96024076184d1c8 - default default] Exception during message handling: VolumeDeviceNotFound: Volume device not found at .
2020-09-24 11:18:17.279 13953 ERROR oslo_messaging.rpc.server Traceback (most recent call last):
2020-09-24 11:18:17.279 13953 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/server.py", line 163, in _process_incoming
2020-09-24 11:18:17.279 13953 ERROR oslo_messaging.rpc.server     res = self.dispatcher.dispatch(message)


--- lvs output ---
I've annotated one machine's disks to illustrate the relationship between
the volume-*** entries in the cinder-volumes VG and the "bare" LVs seen as
directly accessible from the host.
There are 3 servers that won't boot; they are the ones whose home/vg_home
and encrypted_home/encrypted_vg entries are shown.

  WARNING: Not using lvmetad because duplicate PVs were found.
  WARNING: Use multipath or vgimportclone to resolve duplicate PVs?
  WARNING: After duplicates are resolved, run "pvscan --cache" to enable lvmetad.
  WARNING: Not using device /dev/sdu for PV yZy8Xk-foKT-ovjV-0EZv-VxEM-GqiP-WH7k53. == backup_lv/encrypted_vg
  WARNING: Not using device /dev/sdv for PV tHA9ui-eSIO-MDmI-RM3u-3Bf4-Dznb-Ha3XfP. == varoptgitlab/encrypted_vg
  WARNING: Not using device /dev/sdm for PV 5eoyCa-sMO4-b7O4-jIfh-byZE-L5pS-3lOu0D.
  WARNING: Not using device /dev/sdp for PV 3BI0nV-TP0k-rgPC-PrjH-FT7z-reMe-ec1spj.
  WARNING: Not using device /dev/sdt for PV ILdbcY-VFCm-fnH6-Y3jc-pdWZ-fnl8-PH3TPe. == storage_lv/encrypted_vg
  WARNING: Not using device /dev/sdr for PV zowU2N-oaBh-r4cO-cxgX-YYiq-Kf3q-mqlHfK.
  WARNING: PV yZy8Xk-foKT-ovjV-0EZv-VxEM-GqiP-WH7k53 prefers device /dev/cinder-volumes/volume-c8da1abf-7143-422c-9ee5-b2724a71c8ff because device is in dm subsystem.
  WARNING: PV tHA9ui-eSIO-MDmI-RM3u-3Bf4-Dznb-Ha3XfP prefers device /dev/cinder-volumes/volume-0a12012f-8c2e-41fb-aa0c-a7ae99c62487 because device is in dm subsystem.
  WARNING: PV 5eoyCa-sMO4-b7O4-jIfh-byZE-L5pS-3lOu0D prefers device /dev/cinder-volumes/volume-990a057c-46cc-4a81-ba02-28b72c34791d because device is in dm subsystem.
  WARNING: PV 3BI0nV-TP0k-rgPC-PrjH-FT7z-reMe-ec1spj prefers device /dev/cinder-volumes/volume-b6a9da6e-1958-46ea-90b4-ac1aebed8c04 because device is in dm subsystem.
  WARNING: PV ILdbcY-VFCm-fnH6-Y3jc-pdWZ-fnl8-PH3TPe prefers device /dev/cinder-volumes/volume-302dd53b-7d05-4f6d-9ada-8f2ed6e1d4c6 because device is in dm subsystem.
  WARNING: PV zowU2N-oaBh-r4cO-cxgX-YYiq-Kf3q-mqlHfK prefers device /dev/cinder-volumes/volume-df006472-be7a-4957-972a-1db4463f5d67 because device is in dm subsystem.
  LV                                             VG             Attr       LSize    Pool Origin                                      Data%  Meta%  Move Log Cpy%Sync Convert
  home                                           centos_stack3  -wi-ao----    4.00g
  root                                           centos_stack3  -wi-ao----   50.00g
  swap                                           centos_stack3  -wi-ao----    4.00g
  _snapshot-05b1e46b-1ae3-4cd0-9117-3fb53a6d94b0 cinder-volumes swi-a-s---   20.00g      volume-1d0ff5d5-93a3-44e8-8bfa-a9290765c8c6 0.00
  lv_filestore                                   cinder-volumes -wi-ao----    1.00t
...
  volume-c8da1abf-7143-422c-9ee5-b2724a71c8ff    cinder-volumes -wi-ao----  100.00g
  volume-0a12012f-8c2e-41fb-aa0c-a7ae99c62487    cinder-volumes -wi-ao----   60.00g
  volume-990a057c-46cc-4a81-ba02-28b72c34791d    cinder-volumes -wi-ao----  200.00g
  volume-b6a9da6e-1958-46ea-90b4-ac1aebed8c04    cinder-volumes -wi-ao----   30.00g
  volume-302dd53b-7d05-4f6d-9ada-8f2ed6e1d4c6    cinder-volumes -wi-ao----   60.00g
  volume-df006472-be7a-4957-972a-1db4463f5d67    cinder-volumes -wi-ao----  250.00g
...
  volume-f3250e15-bb9c-43d1-989d-8a8f6635a416    cinder-volumes -wi-ao----   20.00g
  volume-fc1d5fcb-fda1-456b-a89d-582b7f94fb04    cinder-volumes -wi-ao----  300.00g
  volume-fc50a717-0857-4da3-93cb-a55292f7ed6d    cinder-volumes -wi-ao----   20.00g
  volume-ff94e2d6-449b-495d-82e6-0debd694c1dd    cinder-volumes -wi-ao----   20.00g
  data2                                          data2_vg       -wi-a----- <300.00g
  data                                           data_vg        -wi-a-----    1.79t
  backup_lv                                      encrypted_vg   -wi------- <100.00g  == ...WH7k53
  storage_lv                                     encrypted_vg   -wi-------  <60.00g  == ...PH3TPe
  varoptgitlab_lv                                encrypted_vg   -wi------- <200.00g
  varoptgitlab_lv                                encrypted_vg   -wi-------  <30.00g
  varoptgitlab_lv                                encrypted_vg   -wi-------  <60.00g  == ...Ha3XfP
  encrypted_home                                 home_vg        -wi-a-----  <40.00g
  encrypted_home                                 home_vg        -wi-------  <60.00g
  pub                                            pub_vg         -wi-a-----  <40.00g
  pub_lv                                         pub_vg         -wi------- <250.00g
  rpms                                           repo           -wi-a-----  499.99g
  home                                           vg_home        -wi-a-----  <40.00g
  gtri_pub                                       vg_pub         -wi-a-----   20.00g
  pub                                            vg_pub         -wi-a-----  <40.00g
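
The duplicate-PV warnings above come in pairs that share a PV UUID: one
names the /dev/sdX device LVM is skipping, the other names the
cinder-volumes LV it prefers. Below is a small Python sketch, assuming the
lvs warning text has been saved or pasted into a string, that pairs them up
by UUID. The function name is illustrative, and the regexes cover only the
two warning formats shown in this output.

```python
import re

# "Not using device /dev/sdu for PV <uuid>."
NOT_USING = re.compile(r"Not using device (\S+) for PV ([\w-]+)\.")
# "PV <uuid> prefers device /dev/cinder-volumes/volume-... because ..."
PREFERS = re.compile(r"PV ([\w-]+) prefers device (\S+) because")


def pair_duplicate_pvs(warning_text):
    """Map each PV UUID to (skipped_device, preferred_device).

    Whitespace is collapsed first so that warnings wrapped across
    several lines (as in an email) still match.
    """
    flat = " ".join(warning_text.split())
    skipped = {uuid: dev for dev, uuid in NOT_USING.findall(flat)}
    preferred = {uuid: dev for uuid, dev in PREFERS.findall(flat)}
    # preferred.get() yields None if no "prefers" line exists for a UUID.
    return {uuid: (dev, preferred.get(uuid)) for uuid, dev in skipped.items()}


if __name__ == "__main__":
    import sys
    for uuid, (skipped_dev, preferred_dev) in pair_duplicate_pvs(sys.stdin.read()).items():
        print(uuid, skipped_dev, preferred_dev)
```

Run as a script, it reads the warning text on standard input and prints one
line per PV UUID with the skipped device and the preferred cinder-volumes
path, which makes it easier to see which iSCSI-attached device shadows which
volume.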
-- 
Alan Davis
Principal System Administrator
Apogee Research LLC