LVM misconfiguration after OpenStack packstack server hang and reboot
Alan Davis
alan.davis at apogee-research.com
Thu Sep 24 18:07:24 UTC 2020
This morning my CentOS 7.7 RDO packstack installation of Rocky hung. On
reboot, some of the VMs won't start. This is a primary system, and I need to
find the most expedient way to recover without losing data. I'm not using
LVM thin volumes.
Any help is appreciated.
Looking at nova-compute.log I see errors trying to find LUN 0 during the
sysfs stage.
Several machines won't boot because the host sees their root disks as bare
LVM PVs, and when they boot the disks aren't found in the DM subsystem.
Other machines boot, but their attached disks throw LVM errors about
duplicate PVs and about preferring the cinder-volumes VG version.
LVM is showing LVs that have both "bare" entries and entries in
cinder-volumes; it complains about duplicate PVs, is not using lvmetad,
and prefers some entries because they are in the DM subsystem.
I've verified that, so far, I haven't lost any data. A "bare" LV that is
not in use by the DM subsystem (because its server won't boot) can be
mounted on the OpenStack host, and all data on it is accessible.
This host has rebooted cleanly multiple times in the past. This is the
first time it's shown any problems.
Am I missing an LVM filter? (Unlikely, since one wasn't needed before.)
How can I reset the LVM configuration and convince it that it isn't seeing
duplicate PVs?
How do I ensure that OpenStack sees the right UUID and volume ID?
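To make the filter question concrete: my understanding is that this kind of
setup is sometimes handled with a global_filter in /etc/lvm/lvm.conf, so the
host's own LVM never scans the cinder-backed devices. A hypothetical sketch
only — the accept patterns below are assumptions and would have to match this
host's actual system disks, not mine:

```
# /etc/lvm/lvm.conf -- sketch, NOT applied; device patterns are assumptions
devices {
    # Accept only the host's own disk(s); reject everything else so LVM
    # stops scanning the iSCSI/cinder-backed /dev/sdX devices and the
    # /dev/cinder-volumes/volume-* LVs.
    global_filter = [ "a|^/dev/sda|", "r|.*|" ]
}
```

If something along those lines is right, I assume it would be followed by
"pvscan --cache", as the warning below suggests.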
Excerpts from the error log and the output of lvs:
--- nova-compute.log --- during VM start
2020-09-24 11:15:27.091 13953 INFO os_brick.initiator.connectors.iscsi [req-8d15fb6a-6324-471e-9497-587885eef8f6 396aeda6552f44fdac5f878b90325ee1 54af92f2bb494355b96024076184d1c8 - default default] Trying to connect to iSCSI portal 172.10.0.40:3260
2020-09-24 11:15:29.721 13953 WARNING nova.compute.manager [req-fd32e16f-c879-402f-a32c-6be45a943c34 48af9a366301467d9fec912fd1c072c6 f9fc7b412a8446d083da1356aa370eb4 - default default] [instance: de7d740c-786a-4aa2-aa09-d447ae7e14b6] Received unexpected event network-vif-unplugged-79aff403-d2e4-4266-bd88-d7bd19d501a9 for instance with vm_state stopped and task_state powering-on.
2020-09-24 11:16:21.361 13953 WARNING os_brick.initiator.connectors.iscsi [req-8d15fb6a-6324-471e-9497-587885eef8f6 396aeda6552f44fdac5f878b90325ee1 54af92f2bb494355b96024076184d1c8 - default default] LUN 0 on iSCSI portal 172.10.0.40:3260 not found on sysfs after logging in.
2020-09-24 11:16:23.482 13953 INFO os_brick.initiator.connectors.iscsi [req-8d15fb6a-6324-471e-9497-587885eef8f6 396aeda6552f44fdac5f878b90325ee1 54af92f2bb494355b96024076184d1c8 - default default] Trying to connect to iSCSI portal 172.10.0.40:3260
2020-09-24 11:17:17.741 13953 WARNING os_brick.initiator.connectors.iscsi [req-8d15fb6a-6324-471e-9497-587885eef8f6 396aeda6552f44fdac5f878b90325ee1 54af92f2bb494355b96024076184d1c8 - default default] LUN 0 on iSCSI portal 172.10.0.40:3260 not found on sysfs after logging in.: VolumeDeviceNotFound: Volume device not found at .
2020-09-24 11:17:21.864 13953 INFO os_brick.initiator.connectors.iscsi [req-8d15fb6a-6324-471e-9497-587885eef8f6 396aeda6552f44fdac5f878b90325ee1 54af92f2bb494355b96024076184d1c8 - default default] Trying to connect to iSCSI portal 172.10.0.40:3260
2020-09-24 11:18:16.113 13953 WARNING os_brick.initiator.connectors.iscsi [req-8d15fb6a-6324-471e-9497-587885eef8f6 396aeda6552f44fdac5f878b90325ee1 54af92f2bb494355b96024076184d1c8 - default default] LUN 0 on iSCSI portal 172.10.0.40:3260 not found on sysfs after logging in.: VolumeDeviceNotFound: Volume device not found at .
2020-09-24 11:18:17.252 13953 INFO nova.compute.manager [req-8d15fb6a-6324-471e-9497-587885eef8f6 396aeda6552f44fdac5f878b90325ee1 54af92f2bb494355b96024076184d1c8 - default default] [instance: de7d740c-786a-4aa2-aa09-d447ae7e14b6] Successfully reverted task state from powering-on on failure for instance.
2020-09-24 11:18:17.279 13953 ERROR oslo_messaging.rpc.server [req-8d15fb6a-6324-471e-9497-587885eef8f6 396aeda6552f44fdac5f878b90325ee1 54af92f2bb494355b96024076184d1c8 - default default] Exception during message handling: VolumeDeviceNotFound: Volume device not found at .
2020-09-24 11:18:17.279 13953 ERROR oslo_messaging.rpc.server Traceback (most recent call last):
2020-09-24 11:18:17.279 13953 ERROR oslo_messaging.rpc.server   File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/server.py", line 163, in _process_incoming
2020-09-24 11:18:17.279 13953 ERROR oslo_messaging.rpc.server     res = self.dispatcher.dispatch(message)
--- lvs output ---
I've annotated one machine's disks to illustrate the relationship between
the volume-*** entries in the cinder-volumes VG and the "bare" LVs seen as
directly accessible from the host.
There are 3 servers that won't boot; they are the ones whose home/vg_home
and encrypted_home/encrypted_vg entries are shown.
WARNING: Not using lvmetad because duplicate PVs were found.
WARNING: Use multipath or vgimportclone to resolve duplicate PVs?
WARNING: After duplicates are resolved, run "pvscan --cache" to enable lvmetad.
WARNING: Not using device /dev/sdu for PV yZy8Xk-foKT-ovjV-0EZv-VxEM-GqiP-WH7k53.   == backup_lv/encrypted_vg
WARNING: Not using device /dev/sdv for PV tHA9ui-eSIO-MDmI-RM3u-3Bf4-Dznb-Ha3XfP.   == varoptgitlab/encrypted_vg
WARNING: Not using device /dev/sdm for PV 5eoyCa-sMO4-b7O4-jIfh-byZE-L5pS-3lOu0D.
WARNING: Not using device /dev/sdp for PV 3BI0nV-TP0k-rgPC-PrjH-FT7z-reMe-ec1spj.
WARNING: Not using device /dev/sdt for PV ILdbcY-VFCm-fnH6-Y3jc-pdWZ-fnl8-PH3TPe.   == storage_lv/encrypted_vg
WARNING: Not using device /dev/sdr for PV zowU2N-oaBh-r4cO-cxgX-YYiq-Kf3q-mqlHfK.
WARNING: PV yZy8Xk-foKT-ovjV-0EZv-VxEM-GqiP-WH7k53 prefers device /dev/cinder-volumes/volume-c8da1abf-7143-422c-9ee5-b2724a71c8ff because device is in dm subsystem.
WARNING: PV tHA9ui-eSIO-MDmI-RM3u-3Bf4-Dznb-Ha3XfP prefers device /dev/cinder-volumes/volume-0a12012f-8c2e-41fb-aa0c-a7ae99c62487 because device is in dm subsystem.
WARNING: PV 5eoyCa-sMO4-b7O4-jIfh-byZE-L5pS-3lOu0D prefers device /dev/cinder-volumes/volume-990a057c-46cc-4a81-ba02-28b72c34791d because device is in dm subsystem.
WARNING: PV 3BI0nV-TP0k-rgPC-PrjH-FT7z-reMe-ec1spj prefers device /dev/cinder-volumes/volume-b6a9da6e-1958-46ea-90b4-ac1aebed8c04 because device is in dm subsystem.
WARNING: PV ILdbcY-VFCm-fnH6-Y3jc-pdWZ-fnl8-PH3TPe prefers device /dev/cinder-volumes/volume-302dd53b-7d05-4f6d-9ada-8f2ed6e1d4c6 because device is in dm subsystem.
WARNING: PV zowU2N-oaBh-r4cO-cxgX-YYiq-Kf3q-mqlHfK prefers device /dev/cinder-volumes/volume-df006472-be7a-4957-972a-1db4463f5d67 because device is in dm subsystem.
LV                                             VG             Attr       LSize    Pool Origin                                      Data%  Meta%  Move Log Cpy%Sync Convert
home                                           centos_stack3  -wi-ao----    4.00g
root                                           centos_stack3  -wi-ao----   50.00g
swap                                           centos_stack3  -wi-ao----    4.00g
_snapshot-05b1e46b-1ae3-4cd0-9117-3fb53a6d94b0 cinder-volumes swi-a-s---   20.00g      volume-1d0ff5d5-93a3-44e8-8bfa-a9290765c8c6 0.00
lv_filestore                                   cinder-volumes -wi-ao----    1.00t
...
volume-c8da1abf-7143-422c-9ee5-b2724a71c8ff    cinder-volumes -wi-ao----  100.00g
volume-0a12012f-8c2e-41fb-aa0c-a7ae99c62487    cinder-volumes -wi-ao----   60.00g
volume-990a057c-46cc-4a81-ba02-28b72c34791d    cinder-volumes -wi-ao----  200.00g
volume-b6a9da6e-1958-46ea-90b4-ac1aebed8c04    cinder-volumes -wi-ao----   30.00g
volume-302dd53b-7d05-4f6d-9ada-8f2ed6e1d4c6    cinder-volumes -wi-ao----   60.00g
volume-df006472-be7a-4957-972a-1db4463f5d67    cinder-volumes -wi-ao----  250.00g
...
volume-f3250e15-bb9c-43d1-989d-8a8f6635a416    cinder-volumes -wi-ao----   20.00g
volume-fc1d5fcb-fda1-456b-a89d-582b7f94fb04    cinder-volumes -wi-ao----  300.00g
volume-fc50a717-0857-4da3-93cb-a55292f7ed6d    cinder-volumes -wi-ao----   20.00g
volume-ff94e2d6-449b-495d-82e6-0debd694c1dd    cinder-volumes -wi-ao----   20.00g
data2                                          data2_vg       -wi-a----- <300.00g
data                                           data_vg        -wi-a-----    1.79t
backup_lv                                      encrypted_vg   -wi------- <100.00g   == ...WH7k53
storage_lv                                     encrypted_vg   -wi-------  <60.00g   == ...PH3TPe
varoptgitlab_lv                                encrypted_vg   -wi------- <200.00g
varoptgitlab_lv                                encrypted_vg   -wi-------  <30.00g
varoptgitlab_lv                                encrypted_vg   -wi-------  <60.00g   == ...Ha3XfP
encrypted_home                                 home_vg        -wi-a-----  <40.00g
encrypted_home                                 home_vg        -wi-------  <60.00g
pub                                            pub_vg         -wi-a-----  <40.00g
pub_lv                                         pub_vg         -wi------- <250.00g
rpms                                           repo           -wi-a-----  499.99g
home                                           vg_home        -wi-a-----  <40.00g
gtri_pub                                       vg_pub         -wi-a-----   20.00g
pub                                            vg_pub         -wi-a-----  <40.00g
--
Alan Davis
Principal System Administrator
Apogee Research LLC