Just wanted to share a few observations from your https://github.com/qw3r3wq/OSP-ussuri/blob/master/v3/node-info.yaml

1. Your mon_max_pg_per_osd should be closer to 100 or 200. You have it set at 4k:

  CephConfigOverrides:
    global:
      mon_max_pg_per_osd: 4096

Maybe you set this to workaround https://ceph.com/community/new-luminous-pg-overdose-protection/ but this is not a good way to do it for any production data. This check was added to avoid setting this value too high so working around it increases the chances you can have the problems the check was made to avoid. I assume this is just a test cluster (1 mon) but I wanted to let you know.

2. Replicas

If you only have one OSD node you need to set "CephPoolDefaultSize: 1" (that should help you with the pg overdose issue too).

3. metrics pool

If you're deploying with telemetry disabled then you don't need a metrics pool.

4. Backend overrides

You shouldn't need GlanceBackend: rbd, GnocchiBackend: rbd, or NovaEnableRbdBackend: true as that gets set by default by using the ceph-ansible env file we've been talking about.

5. DistributedComputeHCICount role

This role is meant to be used with distributed compute nodes which don't run in the same stack as the controller node. They are meant to be used as described in [1]

I think the ComputeHCI node would be a better role to deploy in the same stack as the Controller. Not saying you can't do this but it doesn't look like you're using the role for what it was designed for so I at least wanted to point that out.

[1] https://docs.openstack.org/project-deploy-guide/tripleo-docs/latest/features...

John

On Mon, Sep 21, 2020 at 1:29 PM John Fulton <johfulto@redhat.com> wrote:
On Mon, Sep 21, 2020 at 1:05 PM Ruslanas Gžibovskis <ruslanas@lpic.lt> wrote:
Also, another thing: cat ./ceph-ansible/group_vars/osds.yml shows that the file has not been modified over the last re-deployments. I am deleting it again and removing config-download and everything from swift...
The tripleo-ansible role tripleo_ceph_work_dir will manage that directory for you (recreate it when needed to reflect what is in Heat). It is run when config-download is run.
https://opendev.org/openstack/tripleo-ansible/src/branch/master/tripleo_ansi...
I do not like that it does not overwrite everything, especially when launching a deployment when there is no stack (I mean on the undercloud host, as the overcloud nodes should be cleaned up by the undercloud).
If there is no stack, the stack will be created when you deploy and config-download's directory of playbooks will also be recreated. You shouldn't need to worry about cleaning up the existing config-download directory. You can, but you don't have to.
https://docs.openstack.org/project-deploy-guide/tripleo-docs/latest/deployme...
John
Thank you, will keep updated.
On Mon, 21 Sep 2020 at 19:33, Ruslanas Gžibovskis <ruslanas@lpic.lt> wrote:
I have one thought.
[stack@undercloudv3 v3]$ cat /usr/share/openstack-tripleo-heat-templates/environments/ceph-ansible/ceph-ansible.yaml
resource_registry:
  OS::TripleO::Services::CephMgr: ../../deployment/ceph-ansible/ceph-mgr.yaml
  OS::TripleO::Services::CephMon: ../../deployment/ceph-ansible/ceph-mon.yaml
  OS::TripleO::Services::CephOSD: ../../deployment/ceph-ansible/ceph-osd.yaml
  OS::TripleO::Services::CephClient: ../../deployment/ceph-ansible/ceph-client.yaml

parameter_defaults:
  # Ensure that if user overrides CephAnsiblePlaybook via some env
  # file, we go back to default when they stop passing their env file.
  CephAnsiblePlaybook: ['default']

  CinderEnableIscsiBackend: false
  CinderEnableRbdBackend: true
  CinderBackupBackend: ceph
  NovaEnableRbdBackend: true
  GlanceBackend: rbd
  ## Uncomment below if enabling legacy telemetry
  # GnocchiBackend: rbd
[stack@undercloudv3 v3]$
And my deploy has:
  -e ${_THT}/environments/ceph-ansible/ceph-ansible.yaml \
  -e ${_THT}/environments/ceph-ansible/ceph-rgw.yaml \
  -e ${_THT}/environments/ceph-ansible/ceph-mds.yaml \
  -e ${_THT}/environments/ceph-ansible/ceph-dashboard.yaml \
These are generally the same files, BUT they are specified by the user, so it "might feel like" the user overwrote the default settings?
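As a rough way to reason about this, the sketch below models environment files as plain dictionaries that are merged in the order they are passed, with later files winning key by key. This is a simplified model of how parameter_defaults from several -e files combine, not the actual Heat implementation. The first dictionary copies the values from the ceph-ansible.yaml output above; user_overrides is a hypothetical stand-in for the three backend parameters John mentioned, and both helper functions are made up for illustration.

```python
# Values copied from the ceph-ansible.yaml output quoted above.
# GnocchiBackend is commented out in that file, so it is absent here.
ceph_ansible_defaults = {
    "CephAnsiblePlaybook": ["default"],
    "CinderEnableIscsiBackend": False,
    "CinderEnableRbdBackend": True,
    "CinderBackupBackend": "ceph",
    "NovaEnableRbdBackend": True,
    "GlanceBackend": "rbd",
}

# Hypothetical user environment file, modelled on the three
# backend parameters discussed earlier in the thread.
user_overrides = {
    "GlanceBackend": "rbd",
    "GnocchiBackend": "rbd",
    "NovaEnableRbdBackend": True,
}


def merge_parameter_defaults(*env_files):
    """Simplified model: merge in order, later files win per key."""
    merged = {}
    for params in env_files:
        merged.update(params)
    return merged


def redundant_keys(base, override):
    """Keys in `override` that repeat the value already set in `base`."""
    return sorted(k for k, v in override.items() if k in base and base[k] == v)


merged = merge_parameter_defaults(ceph_ansible_defaults, user_overrides)
print(redundant_keys(ceph_ansible_defaults, user_overrides))
# ['GlanceBackend', 'NovaEnableRbdBackend']
print(merged["GnocchiBackend"])
# rbd
```

Under this model, passing the stock ceph-ansible.yaml explicitly does not change any value. Only keys that a later file sets differently, or that the stock file leaves unset (GnocchiBackend in the output above), end up different in the merged result.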
Also, I am thinking about the things you helped me to find, John, and I recalled something I found strange: the NFS part. It was trying to configure CephNfs... Or should it, even though I do not have it specified? Here is a small part of the output [1]:
  "statically imported: /usr/share/ceph-ansible/roles/ceph-nfs/tasks/create_rgw_nfs_user.yml",
  "statically imported: /usr/share/ceph-ansible/roles/ceph-nfs/tasks/ganesha_selinux_fix.yml",
  "statically imported: /usr/share/ceph-ansible/roles/ceph-nfs/tasks/start_nfs.yml",
-- Ruslanas Gžibovskis +370 6030 7030
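A back-of-the-envelope sketch for points 1 and 2 of John's review (mon_max_pg_per_osd and CephPoolDefaultSize): the PG overdose check roughly compares the number of PG replicas each OSD has to hold against mon_max_pg_per_osd. The function below is an approximation of that arithmetic, not Ceph's own code. The pool counts, PG numbers and OSD count are hypothetical, and the limit of 200 is simply the upper value John suggested.

```python
def pg_replicas_per_osd(pools, num_osds):
    """Approximate PG replicas per OSD.

    pools: iterable of (pg_num, size) pairs, one per pool.
    num_osds: number of OSDs sharing the placement groups.
    """
    if num_osds <= 0:
        raise ValueError("num_osds must be positive")
    return sum(pg_num * size for pg_num, size in pools) / num_osds


SUGGESTED_LIMIT = 200  # upper value suggested in the thread, not read from Ceph

# Hypothetical cluster: 5 pools of 64 PGs each, 4 OSDs on a single node.
pools_size3 = [(64, 3)] * 5   # replica count 3
pools_size1 = [(64, 1)] * 5   # CephPoolDefaultSize: 1

print(pg_replicas_per_osd(pools_size3, 4))  # 240.0, above the limit
print(pg_replicas_per_osd(pools_size1, 4))  # 80.0, below the limit
```

With these made-up numbers, dropping the pool size from 3 to 1 brings the per-OSD count under the limit without raising mon_max_pg_per_osd, which is the effect John describes in point 2.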