[TripleO] Unable to deploy Overcloud Machines

Yatin Karel ykarel at redhat.com
Tue Dec 28 04:37:26 UTC 2021


Hi Anirudh,


On Mon, Dec 27, 2021 at 9:39 PM Anirudh Gupta <anyrude10 at gmail.com> wrote:
>
> Hi Team,
>
> I am trying to deploy TripleO Train with 3 controller and 1 Compute.
> For overcloud images, I have a registry server at undercloud only.
>
> I executed the following command to deploy overcloud
>
> openstack overcloud deploy --templates \
>     -r /home/stack/templates/roles_data.yaml \
>     -e /home/stack/templates/node-info.yaml \
>     -e /usr/share/openstack-tripleo-heat-templates/environments/podman.yaml \
>     -e /home/stack/containers-prepare-parameter.yaml
>
> The command ran for around 1.5 hrs. Initially the stack was created
> successfully, and after that Ansible tasks ran for about 45 mins. It
> then gave the following error in overcloud-controller-0:
>
> 2021-12-27 11:12:27,507 p=181 u=mistral n=ansible | 2021-12-27 11:12:27.506838 | 525400b1-b522-2a06-ea9d-00000000356f |         OK | Debug output for task: Start containers for step 2 | overcloud-novacompute-0 | result={
>     "changed": false,
>     "failed_when_result": false,
>     "start_containers_outputs.stdout_lines | default([]) | union(start_containers_outputs.stderr_lines | default([]))": [
>         "f206c31a781641313aa4a0499c62475efc335de6faea785cd4e855dc32ebb571",
>         "",
>         "Info: Loading facts",
>         "Notice: Compiled catalog for overcloud-novacompute-0.localdomain in environment production in 0.05 seconds",
>         "Info: Applying configuration version '1640604309'",
>         "Notice: /Stage[main]/Tripleo::Profile::Base::Neutron::Ovn_metadata_agent_wrappers/Tripleo::Profile::Base::Neutron::Wrappers::Haproxy[ovn_metadata_haproxy_process_wrapper]/File[/var/lib/neutron/ovn_metadata_haproxy_wrapper]/ensure: defined content as '{md5}5bb050ca70c01981975efad9d8f81f2d'",
>         "Info: Tripleo::Profile::Base::Neutron::Wrappers::Haproxy[ovn_metadata_haproxy_process_wrapper]: Unscheduling all events on Tripleo::Profile::Base::Neutron::Wrappers::Haproxy[ovn_metadata_haproxy_process_wrapper]",
>         "Info: Creating state file /var/lib/puppet/state/state.yaml",
>         "Notice: Applied catalog in 0.01 seconds",
>         "Changes:",
>         "            Total: 1",
>         "Events:",
>         "          Success: 1",
>         "Resources:",
>         "          Changed: 1",
>         "      Out of sync: 1",
>         "          Skipped: 7",
>         "            Total: 8",
>         "Time:",
>         "             File: 0.00",
>         "   Transaction evaluation: 0.01",
>         "   Catalog application: 0.01",
>         "   Config retrieval: 0.09",
>         "         Last run: 1640604309",
>         "            Total: 0.01",
>         "Version:",
>         "           Config: 1640604309",
>         "           Puppet: 5.5.10",
>         "Error executing ['podman', 'container', 'exists', 'nova_compute_init_log']: returned 1",
>         "Did not find container with \"['podman', 'ps', '-a', '--filter', 'label=container_name=nova_compute_init_log', '--filter', 'label=config_id=tripleo_step2', '--format', '{{.Names}}']\" - retrying without config_id",
>         "Did not find container with \"['podman', 'ps', '-a', '--filter', 'label=container_name=nova_compute_init_log', '--format', '{{.Names}}']\"",
>         "Error executing ['podman', 'container', 'exists', 'create_haproxy_wrapper']: returned 1",
>         "Did not find container with \"['podman', 'ps', '-a', '--filter', 'label=container_name=create_haproxy_wrapper', '--filter', 'label=config_id=tripleo_step2', '--format', '{{.Names}}']\" - retrying without config_id",
>         "Did not find container with \"['podman', 'ps', '-a', '--filter', 'label=container_name=create_haproxy_wrapper', '--format', '{{.Names}}']\""
>     ]
> }

This is not the actual error; the output above is only debug output from
a task that reported OK. The actual error is:

puppet-user: Error: /Stage[main]/Tripleo::Profile::Base::Rabbitmq/Rabbitmq_policy[ha-all@/]: Could not evaluate: Command is still failing after 180 seconds expired!
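
One way to pick such lines out of a long ansible.log is to search for the
puppet error prefix directly, which skips the harmless "Error executing
['podman', ...]" debug lines. This is only a sketch: find_real_errors is a
hypothetical helper name, and the log path in the usage comment is an
assumption, so point it at wherever your ansible.log actually lives.

```shell
# find_real_errors is a hypothetical helper, not part of TripleO.
# It prints every puppet error line in the given log, with line numbers.
find_real_errors() {
    grep -n 'puppet-user: Error:' "$1"
}

# Usage (path is an assumption; adjust to your environment):
#   find_real_errors /var/lib/mistral/overcloud/ansible.log
```

Lines matched this way are the ones worth reading first; the surrounding
"Debug output for task" blocks are usually just noise.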

>
> I am also attaching ansible.log file for more information.
>
> Note: On Centos 8, there is no docker, so I didn't pass docker-ha.yml

To enable HA with podman in Train on CentOS 8, you need to pass both
docker-ha.yaml and podman.yaml, in that order (*the order is important
here*):

    -e /usr/share/openstack-tripleo-heat-templates/environments/docker-ha.yaml
    -e /usr/share/openstack-tripleo-heat-templates/environments/podman.yaml

This way you get a deployment with both HA and podman. I agree the
docker-ha name is confusing alongside podman, but it has to be passed to
get the required deployment. From Ussuri onwards HA is turned on by
default, so those releases may work even without docker-ha.yaml, but for
Train at least it is needed.
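
For reference, here is a sketch of the deploy command from the original
mail with docker-ha.yaml added ahead of podman.yaml. The paths under
/home/stack are taken from that mail. THT, RUN and deploy_overcloud are
hypothetical names introduced only for this example: with RUN unset the
command is just printed, so it can be reviewed before running it for real.

```shell
# Sketch only: THT, RUN and deploy_overcloud are illustrative names,
# not part of TripleO.
THT=/usr/share/openstack-tripleo-heat-templates

deploy_overcloud() {
    # With RUN unset this expands to "echo" (dry run, prints the command).
    # Export RUN as an empty string on the undercloud to actually deploy.
    ${RUN-echo} openstack overcloud deploy --templates \
        -r /home/stack/templates/roles_data.yaml \
        -e /home/stack/templates/node-info.yaml \
        -e "$THT/environments/docker-ha.yaml" \
        -e "$THT/environments/podman.yaml" \
        -e /home/stack/containers-prepare-parameter.yaml
}

deploy_overcloud
```

The only change from the original command is the extra -e line for
docker-ha.yaml, placed before podman.yaml.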
>
> Can someone please help in resolving my issue
>
As per your requirement I would suggest running with the above config.

> Regards
> Anirudh Gupta

Thanks and Regards
Yatin Karel