[TripleO] Unable to deploy Overcloud Machines

Anirudh Gupta anyrude10 at gmail.com
Tue Dec 28 05:28:45 UTC 2021


If this is a docker-ha issue, I have already tried that as well.

Since this is CentOS 8, Docker is not available. If I pass docker-ha.yaml,
the deployment fails with the following error:

FATAL | Pull undercloud.ctlplane.localdomain:8787/tripleotraincentos8/centos-binary-cinder-volume:current-tripleo image | overcloud-controller-1 |
error={"changed": true,
"cmd": "docker pull undercloud.ctlplane.localdomain:8787/tripleotraincentos8/centos-binary-cinder-volume:current-tripleo",
"delta": "0:00:00.005932", "end": "2021-12-27 12:42:33.927484",
"msg": "non-zero return code", "rc": 127, "start": "2021-12-27 12:42:33.921552",
"stderr": "/bin/sh: docker: command not found",
"stderr_lines": ["/bin/sh: docker: command not found"],
"stdout": "", "stdout_lines": []}

Regards
Anirudh Gupta

On Tue, Dec 28, 2021 at 10:26 AM Yatin Karel <ykarel at redhat.com> wrote:

> Hi Anirudh,
>
> Sorry, which timer? No timer adjustment is needed for the issue you are
> seeing. If you mean the overcloud deploy timeout, overcloud deploy lets you
> set it with the --timeout option. The best option for now is to try
> docker-ha and podman, in that order, as suggested earlier.
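
A minimal sketch of the --timeout option mentioned above, for reference only (the reply states that no timeout change is needed for this failure). It assumes the value is given in minutes; the value 300 and the helper name print_deploy_with_timeout are illustrative assumptions, not from the thread. The helper only prints the command so it can be reviewed before being run.

```shell
# Hypothetical helper: print the original deploy command with an explicit
# deployment timeout. Nothing is executed here.
print_deploy_with_timeout() {
    timeout_minutes="$1"   # assumed to be minutes, as taken by --timeout
    echo "openstack overcloud deploy --templates" \
         "--timeout ${timeout_minutes}" \
         "-r /home/stack/templates/roles_data.yaml" \
         "-e /home/stack/templates/node-info.yaml" \
         "-e /usr/share/openstack-tripleo-heat-templates/environments/podman.yaml" \
         "-e /home/stack/containers-prepare-parameter.yaml"
}

# 300 is an arbitrary illustrative value.
print_deploy_with_timeout 300
```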
>
>
> Thanks and Regards
> Yatin Karel
>
> On Tue, Dec 28, 2021 at 10:12 AM Anirudh Gupta <anyrude10 at gmail.com>
> wrote:
>
>> Thanks Yatin for your response.
>>
>> Please suggest how this timer can be increased, or any other steps that
>> need to be followed to rectify this.
>>
>> Regards
>> Anirudh Gupta
>>
>> On Tue, Dec 28, 2021 at 10:08 AM Yatin Karel <ykarel at redhat.com> wrote:
>>
>>> Hi Anirudh,
>>>
>>>
>>> On Mon, Dec 27, 2021 at 9:39 PM Anirudh Gupta <anyrude10 at gmail.com>
>>> wrote:
>>> >
>>> > Hi Team,
>>> >
>>> > I am trying to deploy TripleO Train with 3 Controller nodes and 1 Compute node.
>>> > For overcloud images, I have a registry server on the undercloud only.
>>> >
>>> > I executed the following command to deploy the overcloud:
>>> >
>>> > openstack overcloud deploy --templates \
>>> >     -r /home/stack/templates/roles_data.yaml \
>>> >     -e /home/stack/templates/node-info.yaml \
>>> >     -e /usr/share/openstack-tripleo-heat-templates/environments/podman.yaml \
>>> >     -e /home/stack/containers-prepare-parameter.yaml
>>> >
>>> > The command ran for around 1.5 hours. The stack was created successfully at
>>> > first, and Ansible tasks then ran for about 45 minutes. It then gave the
>>> > following error on overcloud-controller-0:
>>> >
>>> > 2021-12-27 11:12:27,507 p=181 u=mistral n=ansible | 2021-12-27 11:12:27.506838 | 525400b1-b522-2a06-ea9d-00000000356f |         OK | Debug output for task: Start containers for step 2 | overcloud-novacompute-0 | result={
>>> >     "changed": false,
>>> >     "failed_when_result": false,
>>> >     "start_containers_outputs.stdout_lines | default([]) | union(start_containers_outputs.stderr_lines | default([]))": [
>>> >         "f206c31a781641313aa4a0499c62475efc335de6faea785cd4e855dc32ebb571",
>>> >         "",
>>> >         "Info: Loading facts",
>>> >         "Notice: Compiled catalog for overcloud-novacompute-0.localdomain in environment production in 0.05 seconds",
>>> >         "Info: Applying configuration version '1640604309'",
>>> >         "Notice: /Stage[main]/Tripleo::Profile::Base::Neutron::Ovn_metadata_agent_wrappers/Tripleo::Profile::Base::Neutron::Wrappers::Haproxy[ovn_metadata_haproxy_process_wrapper]/File[/var/lib/neutron/ovn_metadata_haproxy_wrapper]/ensure: defined content as '{md5}5bb050ca70c01981975efad9d8f81f2d'",
>>> >         "Info: Tripleo::Profile::Base::Neutron::Wrappers::Haproxy[ovn_metadata_haproxy_process_wrapper]: Unscheduling all events on Tripleo::Profile::Base::Neutron::Wrappers::Haproxy[ovn_metadata_haproxy_process_wrapper]",
>>> >         "Info: Creating state file /var/lib/puppet/state/state.yaml",
>>> >         "Notice: Applied catalog in 0.01 seconds",
>>> >         "Changes:",
>>> >         "            Total: 1",
>>> >         "Events:",
>>> >         "          Success: 1",
>>> >         "Resources:",
>>> >         "          Changed: 1",
>>> >         "      Out of sync: 1",
>>> >         "          Skipped: 7",
>>> >         "            Total: 8",
>>> >         "Time:",
>>> >         "             File: 0.00",
>>> >         "   Transaction evaluation: 0.01",
>>> >         "   Catalog application: 0.01",
>>> >         "   Config retrieval: 0.09",
>>> >         "         Last run: 1640604309",
>>> >         "            Total: 0.01",
>>> >          "Version:",
>>> >         "           Config: 1640604309",
>>> >         "           Puppet: 5.5.10",
>>> >         "Error executing ['podman', 'container', 'exists', 'nova_compute_init_log']: returned 1",
>>> >         "Did not find container with \"['podman', 'ps', '-a', '--filter', 'label=container_name=nova_compute_init_log', '--filter', 'label=config_id=tripleo_step2', '--format', '{{.Names}}']\" - retrying without config_id",
>>> >         "Did not find container with \"['podman', 'ps', '-a', '--filter', 'label=container_name=nova_compute_init_log', '--format', '{{.Names}}']\"",
>>> >         "Error executing ['podman', 'container', 'exists', 'create_haproxy_wrapper']: returned 1",
>>> >         "Did not find container with \"['podman', 'ps', '-a', '--filter', 'label=container_name=create_haproxy_wrapper', '--filter', 'label=config_id=tripleo_step2', '--format', '{{.Names}}']\" - retrying without config_id",
>>> >         "Did not find container with \"['podman', 'ps', '-a', '--filter', 'label=container_name=create_haproxy_wrapper', '--format', '{{.Names}}']\""
>>> >     ]
>>> > }
>>>
>>> This is not the actual error. The actual error is:
>>> puppet-user: Error: /Stage[main]/Tripleo::Profile::Base::Rabbitmq/Rabbitmq_policy[ha-all@/]: Could not evaluate: Command is still failing after 180 seconds expired!
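
Since the last debug output in the log is not the real failure, it can help to search the log for the puppet error lines directly. The following is a minimal sketch under assumptions: the helper name find_puppet_errors is hypothetical, and the path to ansible.log depends on where the log was saved.

```shell
# Hypothetical helper: list puppet errors recorded in a deployment log,
# with line numbers. $1 is the path to the log file (location is an assumption).
find_puppet_errors() {
    grep -n 'puppet-user: Error' "$1"
}

# Example usage (path is a placeholder):
#   find_puppet_errors /path/to/ansible.log
```

Matching on the "puppet-user: Error" prefix is a simple choice taken from the error quoted above; other failure types in the log would need a different pattern.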
>>>
>>> >
>>> > I am also attaching ansible.log file for more information.
>>> >
>>> > Note: On CentOS 8, there is no Docker, so I didn't pass docker-ha.yaml
>>>
>>> To enable HA with podman in Train on CentOS 8, you need to pass both
>>> docker-ha.yaml and podman.yaml, in that order (*order is important here*):
>>>
>>> -e /usr/share/openstack-tripleo-heat-templates/environments/docker-ha.yaml -e /usr/share/openstack-tripleo-heat-templates/environments/podman.yaml
>>>
>>> This way you will have a deployment with HA and podman. I agree the
>>> docker-ha name is confusing with podman, but it has to be passed to get the
>>> required deployment. Also, with Ussuri+ HA is turned on by default, so those
>>> releases may work even without passing docker-ha.yaml, but for Train at
>>> least it is needed.
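
Combining this advice with the original command gives something like the sketch below. It assumes the same role and environment file paths as the original command; the helper name print_ha_podman_deploy and the THT_ENV variable are hypothetical, introduced only for illustration. The helper prints the command rather than running it, so the ordering of the -e arguments can be checked first.

```shell
# Directory holding the stock environment files referenced in the thread.
THT_ENV=/usr/share/openstack-tripleo-heat-templates/environments

# Hypothetical helper: print the deploy command with docker-ha.yaml passed
# before podman.yaml, as the reply says the order matters. Nothing is executed.
print_ha_podman_deploy() {
    echo "openstack overcloud deploy --templates" \
         "-r /home/stack/templates/roles_data.yaml" \
         "-e /home/stack/templates/node-info.yaml" \
         "-e ${THT_ENV}/docker-ha.yaml" \
         "-e ${THT_ENV}/podman.yaml" \
         "-e /home/stack/containers-prepare-parameter.yaml"
}

print_ha_podman_deploy
```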
>>> >
>>> > Can someone please help me resolve this issue?
>>> >
>>> As per your requirement I would suggest running with the above config.
>>>
>>> > Regards
>>> > Anirudh Gupta
>>>
>>> Thanks and Regards
>>> Yatin Karel
>>>
>>