OpenStack Ansible Service troubleshooting

John Ratliff jdratlif at globalnoc.iu.edu
Tue Oct 4 18:56:34 UTC 2022


On Tue, 2022-10-04 at 18:21 +0200, Dmitriy Rabotyagov wrote:
> Hi John.
> 
> Well, it seems you've performed a number of operations that were not
> required in the first place. However, I believe that in the end you
> identified the problem correctly: the systemd-machined service should
> be active and running on nova-compute hosts with the kvm driver.
> I'd suggest looking deeper at why systemd-machined can't be started.
> What does journalctl say about that?

It's not very chatty, though I think your next question might answer
the why.

$ sudo journalctl -u systemd-machined
-- Logs begin at Tue 2022-10-04 17:45:02 UTC, end at Tue 2022-10-04 18:43:45 UTC. --
Oct 04 18:43:37 os-comp1 systemd[1]: Dependency failed for Virtual Machine and Container Registration Service.
Oct 04 18:43:37 os-comp1 systemd[1]: systemd-machined.service: Job systemd-machined.service/start failed with result 'dependency'.

> 
> As one of its dependencies, systemd-machined requires
> /var/lib/machines. I have 2 assumptions there:
> 1. Was systemd-tmpfiles-setup.service activated? We have sometimes
> seen that, due to a race condition at node boot, it was not, which
> resulted in all kinds of weirdness.

It appears to be. The output looks very similar between the broken and
working clusters.

$ sudo systemctl status systemd-tmpfiles-setup
● systemd-tmpfiles-setup.service - Create Volatile Files and Directories
     Loaded: loaded (/lib/systemd/system/systemd-tmpfiles-setup.service; static; vendor preset: enabled)
     Active: active (exited) since Mon 2022-10-03 18:23:53 UTC; 24h ago
       Docs: man:tmpfiles.d(5)
             man:systemd-tmpfiles(8)
   Main PID: 1460 (code=exited, status=0/SUCCESS)
      Tasks: 0 (limit: 8192)
     Memory: 0B
     CGroup: /system.slice/systemd-tmpfiles-setup.service

Warning: journal has been rotated since unit was started, output may be incomplete.

However, /var/lib/machines does not appear to be correct. On the
working cluster, this is mounted as an ext4 filesystem and has a
lost+found directory along with a directory for a defined instance.

There is no mount listed on the broken cluster, and the directory is
empty.
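
The difference between the two clusters can be confirmed without eyeballing the mount output. The following is a minimal sketch that reads the kernel mount table; the function names is_mounted and mount_fstype are made up for illustration, and the optional second argument exists only so the functions can be tried against a sample file. `findmnt /var/lib/machines` should give the same answer interactively.

```shell
#!/bin/sh
# is_mounted PATH [MOUNTS_FILE]
# Succeeds if PATH is listed as a mount point in the mounts table.
# MOUNTS_FILE defaults to /proc/mounts (the override is an assumption
# added for testing, not something systemd or OSA provides).
is_mounted() {
    awk -v p="$1" '$2 == p { found = 1 } END { exit !found }' "${2:-/proc/mounts}"
}

# mount_fstype PATH [MOUNTS_FILE]
# Prints the filesystem type of the last mount listed at PATH, if any.
mount_fstype() {
    awk -v p="$1" '$2 == p { t = $3 } END { if (t != "") print t }' "${2:-/proc/mounts}"
}

# Usage on a compute host:
#   if is_mounted /var/lib/machines; then
#       echo "mounted as $(mount_fstype /var/lib/machines)"
#   else
#       echo "/var/lib/machines is not a mount point"
#   fi
```

On the working cluster this would be expected to report ext4, and on the broken one to report that the path is not a mount point.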

> 2. Do you happen to run nova-compute on the same set of hosts where
> LXC containers are placed? For example, in an AIO setup we manage the
> /var/lib/machines/ mount with the systemd unit var-lib-machines.mount.
> So if you run nova-compute on a controller host or AIO, this is
> another thing to check.

$ sudo journalctl -u var-lib-machines.mount
-- Logs begin at Tue 2022-10-04 18:01:46 UTC, end at Tue 2022-10-04 18:52:53 UTC. --
Oct 04 18:43:37 os-comp1 systemd[1]: Mounting Virtual Machine and Container Storage (Compatibility)...
Oct 04 18:43:37 os-comp1 mount[1272300]: mount: /var/lib/machines: wrong fs type, bad option, bad superblock on /dev/loop0, missing codepage or helper program, or other error.
Oct 04 18:43:37 os-comp1 systemd[1]: var-lib-machines.mount: Mount process exited, code=exited, status=32/n/a
Oct 04 18:43:37 os-comp1 systemd[1]: var-lib-machines.mount: Failed with result 'exit-code'.
Oct 04 18:43:37 os-comp1 systemd[1]: Failed to mount Virtual Machine and Container Storage (Compatibility).

This appears to be the problem. It looks like /dev/loop0 is probably
supposed to reference /var/lib/machines.raw. I tried running fsck on
/dev/loop0, but it doesn't think there is a valid extX filesystem on
any of the superblocks. Maybe /dev/loop0 is not really pointing to
/var/lib/machines.raw? Not sure how to tell if that's the case.
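
One way to tell which file a loop device is attached to is to read its backing_file attribute in sysfs. This is a sketch only: the function name loop_backing_file and the SYSFS_ROOT override are invented here for illustration. `sudo losetup -l` or `sudo losetup -j /var/lib/machines.raw` should report the same association.

```shell
#!/bin/sh
# loop_backing_file DEVICE
# Prints the file backing a loop device, e.g. /dev/loop0, by reading
# /sys/block/<dev>/loop/backing_file.
# SYSFS_ROOT is a hypothetical override so the function can be tried
# without a real loop device; it defaults to /sys.
loop_backing_file() {
    dev=$(basename "$1")
    attr="${SYSFS_ROOT:-/sys}/block/$dev/loop/backing_file"
    if [ -r "$attr" ]; then
        cat "$attr"
    else
        # A detached loop device has no loop/backing_file attribute.
        echo "no backing file recorded for $dev" >&2
        return 1
    fi
}

# Usage on the compute host:
#   loop_backing_file /dev/loop0
```

If this prints something other than /var/lib/machines.raw, or fails, then /dev/loop0 is not attached to the image that var-lib-machines.mount is presumably expecting.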

Maybe I should try to loopback this, or create a blank filesystem
image.
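
Before replacing the image, it may be worth checking whether the file itself still carries an ext filesystem signature, independent of the loop device. The sketch below relies on the ext2/3/4 layout, where the primary superblock starts at byte 1024 and the magic number 0xEF53 sits at offset 56 within it (byte 1080 of the image, stored little-endian). The function name has_ext_magic is made up for illustration; `sudo blkid /var/lib/machines.raw` is the more usual tool and also recognizes other filesystem types.

```shell
#!/bin/sh
# has_ext_magic IMAGE
# Succeeds if IMAGE has the ext2/3/4 magic (bytes 53 ef) at offset 1080.
# This only inspects the primary superblock; it does not validate the
# rest of the filesystem.
has_ext_magic() {
    magic=$(od -An -tx1 -j1080 -N2 "$1" 2>/dev/null | tr -d ' \n')
    [ "$magic" = "53ef" ]
}

# Usage on the compute host (read-only, needs read access to the file):
#   if has_ext_magic /var/lib/machines.raw; then
#       echo "ext superblock magic present"
#   else
#       echo "no ext superblock magic found"
#   fi
```

If the magic is missing from the file as well, the image is probably blank or damaged rather than merely attached to the wrong loop device.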



-- 
John Ratliff
Systems Automation Engineer 
GlobalNOC @ Indiana University