On Thu, Dec 31, 2020 at 5:26 AM Igal Katzir <ikatzir@infinidat.com> wrote:
Hello all,

I am trying to deploy RHOSP16.1 (based on ‘train’ distribution) for Certification purposes.
I have built a container for our cinder driver and am trying to deploy it.
The deployment runs almost to the end and fails at the stage where it tries to configure Pacemaker.
Here is the last message:
"Info: Applying configuration version '1609231063'", "Notice: /Stage[main]/Pacemaker::Corosync/File_line[pcsd_bind_addr]/ensure: created", "Info: /Stage[main]/Pacemaker::Corosync/File_line[pcsd_bind_addr]: Scheduling refresh of Service[pcsd]", "Info: /Stage[main]/Pacemaker::Service/Service[pcsd]: Unscheduling all events on Service[pcsd]", "Info: Class[Pacemaker::Corosync]: Unscheduling all events on Class[Pacemaker::Corosync]", "Notice: /Stage[main]/Tripleo::Profile::Pacemaker::Cinder::Volume_bundle/Pacemaker::Resource::Bundle[openstack-cinder-volume]/Pcmk_bundle[openstack-cinder-volume]: Dependency Pcmk_property[property-overcloud-controller-0-cinder-volume-role] has failures: true", "Info: Creating state file /var/lib/puppet/state/state.yaml", "Notice: Applied catalog in 382.92 seconds", "Changes:", "            Total: 1", "Events:", "          Success: 1", "          Failure: 2", "            Total: 3",

I have verified that all packages on my container image (Pacemaker, Corosync, libqb, and pcs) are installed with the same versions as on the overcloud controller.

Hi Igal,

Thank you for checking these package versions and stating they match the ones installed on the overcloud node. This rules out one of the common reasons for failures when trying to run a customized cinder-volume container image.

But it seems that something is still missing, because deployment with the default openstack-cinder-volume image completes successfully.

This is also good to know.

Can anyone help with debugging this? Let me know if more info is needed.

More info is needed, but it's hard to predict exactly where to look for the root cause of the failure. I'd start by looking at the cinder log file to determine whether the cinder-volume service is even trying to start. Look for /var/log/containers/cinder/cinder-volume.log on the node where pacemaker is trying to run the service. Are there logs indicating the service is trying to start? Or maybe the service is launched, but fails early during startup?
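
If the log file exists, a quick way to answer those questions is to filter it for error-level entries. The following is only a rough sketch: it assumes the usual oslo.log line layout (date, time, PID, level, module), and the helper name error_lines is made up for illustration.

```python
import os

# Path taken from the message above; run this on the node where pacemaker
# is trying to start the service.
LOG_PATH = "/var/log/containers/cinder/cinder-volume.log"


def error_lines(lines, levels=("ERROR", "CRITICAL")):
    """Return the lines whose fourth whitespace-separated field is a level of interest.

    Assumes a layout of "<date> <time> <pid> <LEVEL> <module> ...".
    """
    hits = []
    for line in lines:
        fields = line.split(None, 4)
        if len(fields) >= 4 and fields[3] in levels:
            hits.append(line.rstrip("\n"))
    return hits


if __name__ == "__main__":
    if not os.path.exists(LOG_PATH):
        # No log at all suggests the service never got as far as starting.
        print("no log file at", LOG_PATH)
    else:
        with open(LOG_PATH, errors="replace") as handle:
            for hit in error_lines(handle):
                print(hit)
```

A missing or empty log points toward the container never launching; error entries near the top of the log point toward a failure early in service startup.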

Another possibility is that podman fails to launch the container itself. If that's happening, check for errors in /var/log/messages. One source of this type of failure is a container bind mount whose source directory doesn't exist (docker would auto-create the source directory, but podman does not).
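
The bind mount case can be checked directly on the controller. Here is a minimal sketch, assuming you have your container's volume list as "source:destination[:options]" strings; the helper name missing_bind_sources and the example entries are hypothetical and should be replaced with the mounts your own image is configured with.

```python
import os


def missing_bind_sources(volume_specs):
    """Return the host-side source paths that do not exist.

    Each spec is expected to look like "source:destination[:options]".
    """
    missing = []
    for spec in volume_specs:
        source = spec.split(":", 1)[0]
        if not os.path.exists(source):
            missing.append(source)
    return missing


if __name__ == "__main__":
    # Hypothetical entries for illustration only.
    specs = [
        "/etc/hosts:/etc/hosts:ro",
        "/opt/my-driver/conf:/etc/my-driver:ro",
    ]
    for path in missing_bind_sources(specs):
        print("bind mount source does not exist:", path)
```

Any path it reports would need to be created on the host before podman can start the container.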

You specifically mentioned RHOSP, so if you need additional support then I recommend opening a support case with Red Hat. That will provide a forum for posting private data, such as details of your overcloud deployment and full sosreports.

Alan
 

Thanks in advance,
Igal