Just wanted to share a few observations on your https://github.com/qw3r3wq/OSP-ussuri/blob/master/v3/node-info.yaml
1. Your mon_max_pg_per_osd should be closer to 100 or 200.
You have it set at 4k:
    CephConfigOverrides:
      global:
        mon_max_pg_per_osd: 4096
Maybe you set this to work around https://ceph.com/community/new-luminous-pg-overdose-protection/ but this is not a good way to do it for any production data. The check was added to keep a cluster from ending up with too many PGs per OSD, so raising the limit to get past it increases the chance of hitting the very problems the check was made to avoid. I assume this is just a test cluster (1 mon) but I wanted to let you know.
2. Replicas
If you only have one OSD node you need to set "CephPoolDefaultSize: 1" (that should help you with the pg overdose issue too).
3. metrics pool
If you're deploying with telemetry disabled then you don't need a metrics pool.
4. Backend overrides
You shouldn't need GlanceBackend: rbd, GnocchiBackend: rbd, or NovaEnableRbdBackend: true, as those get set by default when you use the ceph-ansible env file we've been talking about.
5. DistributedComputeHCICount role
This role is meant to be used with distributed compute nodes, which don't run in the same stack as the controller node. They are meant to be used as described in [1]. I think ComputeHCI would be a better role to deploy in the same stack as the Controller. I'm not saying you can't do this, but it doesn't look like you're using the role for what it was designed for, so I at least wanted to point that out.
[1] https://docs.openstack.org/project-deploy-guide/tripleo-docs/latest/features...
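To make points 1 and 2 concrete, here is a rough sketch of the arithmetic behind the PG overdose check. It approximates PGs per OSD as the sum of pg_num * size over all pools, divided by the number of OSDs. The pool count, pg_num, OSD count and the limit of 200 are made-up illustration values, not taken from your cluster, and the real monitor check may differ in its details.

```python
def pgs_per_osd(pools, num_osds):
    """Approximate PG replicas per OSD.

    pools is a list of (pg_num, size) tuples; every replica of a PG
    lands on some OSD, so each pool contributes pg_num * size.
    """
    if num_osds <= 0:
        raise ValueError("num_osds must be positive")
    total_pg_replicas = sum(pg_num * size for pg_num, size in pools)
    return total_pg_replicas / num_osds


# Hypothetical values for illustration only.
ASSUMED_LIMIT = 200   # assumed mon_max_pg_per_osd
NUM_OSDS = 4          # assumed number of OSDs

pools_size3 = [(128, 3)] * 5   # five pools, 128 PGs each, 3 replicas
pools_size1 = [(128, 1)] * 5   # same pools with CephPoolDefaultSize: 1

ratio_size3 = pgs_per_osd(pools_size3, NUM_OSDS)
ratio_size1 = pgs_per_osd(pools_size1, NUM_OSDS)

print(f"size 3: {ratio_size3:.0f} PGs per OSD, over limit: {ratio_size3 > ASSUMED_LIMIT}")
print(f"size 1: {ratio_size1:.0f} PGs per OSD, over limit: {ratio_size1 > ASSUMED_LIMIT}")
```

With these assumed numbers, dropping the replica count from 3 to 1 takes the cluster from 480 to 160 PGs per OSD, which is why setting the pool size to 1 on a single OSD node should also help with the overdose warning without raising the limit.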
John
On Mon, Sep 21, 2020 at 1:29 PM John Fulton johfulto@redhat.com wrote:
On Mon, Sep 21, 2020 at 1:05 PM Ruslanas Gžibovskis ruslanas@lpic.lt wrote:
Another thing: ./ceph-ansible/group_vars/osds.yml looks like it has not been modified over the last few re-deployments. I am deleting it again, and removing config-download and everything from swift...
The tripleo-ansible role tripleo_ceph_work_dir will manage that directory for you (recreate it when needed to reflect what is in Heat). It is run when config-download is run.
https://opendev.org/openstack/tripleo-ansible/src/branch/master/tripleo_ansi...
I do not like that it does not overwrite everything... especially when launching a deployment when there is no stack (I mean on the undercloud host, as the overcloud nodes should be cleaned up by the undercloud).
If there is no stack, the stack will be created when you deploy and config-download's directory of playbooks will also be recreated. You shouldn't need to worry about cleaning up the existing config-download directory. You can, but you don't have to.
https://docs.openstack.org/project-deploy-guide/tripleo-docs/latest/deployme...
John
Thank you, will keep updated.
On Mon, 21 Sep 2020 at 19:33, Ruslanas Gžibovskis ruslanas@lpic.lt wrote:
I have one thought.
[stack@undercloudv3 v3]$ cat /usr/share/openstack-tripleo-heat-templates/environments/ceph-ansible/ceph-ansible.yaml
resource_registry:
  OS::TripleO::Services::CephMgr: ../../deployment/ceph-ansible/ceph-mgr.yaml
  OS::TripleO::Services::CephMon: ../../deployment/ceph-ansible/ceph-mon.yaml
  OS::TripleO::Services::CephOSD: ../../deployment/ceph-ansible/ceph-osd.yaml
  OS::TripleO::Services::CephClient: ../../deployment/ceph-ansible/ceph-client.yaml

parameter_defaults:
  # Ensure that if user overrides CephAnsiblePlaybook via some env
  # file, we go back to default when they stop passing their env file.
  CephAnsiblePlaybook: ['default']

  CinderEnableIscsiBackend: false
  CinderEnableRbdBackend: true
  CinderBackupBackend: ceph
  NovaEnableRbdBackend: true
  GlanceBackend: rbd
  ## Uncomment below if enabling legacy telemetry
  # GnocchiBackend: rbd
[stack@undercloudv3 v3]$
And my deploy has:
  -e ${_THT}/environments/ceph-ansible/ceph-ansible.yaml \
  -e ${_THT}/environments/ceph-ansible/ceph-rgw.yaml \
  -e ${_THT}/environments/ceph-ansible/ceph-mds.yaml \
  -e ${_THT}/environments/ceph-ansible/ceph-dashboard.yaml \
These are generally the same files, BUT they are specified by the user, so might it "feel like" the user overrode the default settings?
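As a way to reason about the question above, here is a simplified model of how parameter_defaults from several -e files combine: files are applied in the order given and a later file wins for any key it repeats. This is only a sketch of that ordering rule, not Heat's actual merge code, and the dictionaries are hand-written stand-ins for the parsed YAML files (the "file" override is a hypothetical example).

```python
def merge_parameter_defaults(env_files):
    """Merge parameter_defaults in order; later files override earlier ones."""
    merged = {}
    for env in env_files:
        merged.update(env.get("parameter_defaults", {}))
    return merged


# Stand-in for environments/ceph-ansible/ceph-ansible.yaml (values from the file above).
ceph_ansible_env = {
    "parameter_defaults": {
        "CinderEnableIscsiBackend": False,
        "CinderEnableRbdBackend": True,
        "CinderBackupBackend": "ceph",
        "NovaEnableRbdBackend": True,
        "GlanceBackend": "rbd",
    }
}

# Stand-in for a user file that repeats values the shipped file already sets.
redundant_user_env = {
    "parameter_defaults": {"GlanceBackend": "rbd", "NovaEnableRbdBackend": True}
}

# Hypothetical user file that really changes a value.
overriding_user_env = {"parameter_defaults": {"GlanceBackend": "file"}}

same = merge_parameter_defaults([ceph_ansible_env, redundant_user_env])
changed = merge_parameter_defaults([ceph_ansible_env, overriding_user_env])

print(same == ceph_ansible_env["parameter_defaults"])  # repeating values changes nothing
print(changed["GlanceBackend"])                        # the later file wins
```

In this model, passing the shipped env file with -e, or repeating its values in a user file, gives the same merged result; only a later file with a different value changes the outcome.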
Also, I have been thinking about the things you helped me to find, John, and I recalled something I found strange: the NFS part. It was trying to configure CephNfs... Or should it, even though I do not have it specified? Here is a small part of the output [1]:
  "statically imported: /usr/share/ceph-ansible/roles/ceph-nfs/tasks/create_rgw_nfs_user.yml",
  "statically imported: /usr/share/ceph-ansible/roles/ceph-nfs/tasks/ganesha_selinux_fix.yml",
  "statically imported: /usr/share/ceph-ansible/roles/ceph-nfs/tasks/start_nfs.yml",
-- Ruslanas Gžibovskis +370 6030 7030