[openstack-ansible][ceph][yoga] wait for all osd to be up

Father Vlasie fv at spots.edu
Mon Aug 22 23:53:04 UTC 2022


I have done a bit more searching. The error seems to be related to the _reporting_ on the OSDs. I tried to get some information from journalctl while the infrastructure playbook was running, and all I could see was this:

Aug 22 22:11:31 compute3 python3[57496]: ansible-ceph_volume Invoked with cluster=ceph action=list objectstore=bluestore dmcrypt=False batch_devices=[] osds_per_device=1 journal_size=5120 journal_devices=[] block_db_size=-1 block_db_devices=[] wal_devices=[] report=False destroy=True data=None data_vg=None journal=None journal_vg=None db=None db_vg=None wal=None wal_vg=None crush_device_class=None osd_fsid=None osd_id=None
Aug 22 22:12:01 compute3 audit[57503]: USER_ACCT pid=57503 uid=0 auid=4294967295 ses=4294967295 msg='op=PAM:accounting grantors=pam_permit acct="root" exe="/usr/sbin/cron" hostname=? addr=? terminal=cron res=success'
Aug 22 22:12:01 compute3 audit[57503]: CRED_ACQ pid=57503 uid=0 auid=4294967295 ses=4294967295 msg='op=PAM:setcred grantors=pam_permit,pam_cap acct="root" exe="/usr/sbin/cron" hostname=? addr=? terminal=cron res=success'
Aug 22 22:12:01 compute3 audit[57503]: SYSCALL arch=c000003e syscall=1 success=yes exit=1 a0=7 a1=7ffe656d1100 a2=1 a3=7fe9c3d53371 items=0 ppid=1725 pid=57503 auid=0 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=(none) ses=1445 comm="cron" exe="/usr/sbin/cron" key=(null)
Aug 22 22:12:01 compute3 audit: PROCTITLE proctitle=2F7573722F7362696E2F43524F4E002D66
Aug 22 22:12:01 compute3 CRON[57503]: pam_unix(cron:session): session opened for user root by (uid=0)

The only thing that stands out to me is that no devices are listed at all (batch_devices=[]). The openstack-ansible ceph documentation never mentions devices either, so I assumed they were being detected automatically. Is that right?
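
To double-check on the node itself, I suppose I can ask ceph-volume directly what it has found, e.g. "ceph-volume inventory" or "ceph-volume lvm list" (both should be read-only), and see whether any disk was ever prepared.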
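
Or do I need to declare the OSD disks explicitly? ceph-ansible has "devices" and "osd_auto_discovery" variables for this, so I am guessing it would be something like the following in /etc/openstack_deploy/user_variables.yml (the device names below are just placeholders for my disks):

# Either list the OSD data disks explicitly...
devices:
  - /dev/sdb
  - /dev/sdc

# ...or let ceph-ansible pick up any empty disks it finds.
osd_auto_discovery: true

Would that be the right place to set it?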

Thank you,

FV

> On Aug 22, 2022, at 1:08 PM, Father Vlasie <fv at spots.edu> wrote:
> 
> 
> Hello everyone,
> 
> I am running setup-infrastructure.yml. I have followed the ceph production example here: https://docs.openstack.org/openstack-ansible/latest/user/ceph/full-deploy.html
> 
> I have set things up so that the compute and storage nodes are the same machines (hyperconverged), and the storage devices are devoid of any volumes or partitions.
> 
> I see the following error: 
> 
> ------
> 
> FAILED - RETRYING: [compute3 -> infra1_ceph-mon_container-0d679d8d]: wait for all osd to be up (1 retries left).
> fatal: [compute3 -> infra1_ceph-mon_container-0d679d8d(192.168.3.145)]: FAILED! => {"attempts": 60, "changed": false, "cmd": ["ceph", "--cluster", "ceph", "osd", "stat", "-f", "json"], "delta": "0:00:00.223291", "end": "2022-08-22 19:36:29.473358", "msg": "", "rc": 0, "start": "2022-08-22 19:36:29.250067", "stderr": "", "stderr_lines": [], "stdout": "\n{\"epoch\":6,\"num_osds\":0,\"num_up_osds\":0,\"osd_up_since\":0,\"num_in_osds\":0,\"osd_in_since\":0,\"num_remapped_pgs\":0}", "stdout_lines": ["", "{\"epoch\":6,\"num_osds\":0,\"num_up_osds\":0,\"osd_up_since\":0,\"num_in_osds\":0,\"osd_in_since\":0,\"num_remapped_pgs\":0}"]}
> 
> ------ 
> 
> I am not sure where to look to find more information. Any help would be much appreciated!
> 
> Thank you,
> 
> FV



