[TripleO] Unable to deploy Overcloud Machines

Anirudh Gupta anyrude10 at gmail.com
Tue Dec 28 11:02:06 UTC 2021


Hi Yatin,

Thanks a lot for your help. I am deleting the stack and running the
overcloud deploy command as a process.

Changing the NTP settings worked and let me proceed further.

However, the issues do not seem to end here.

I need some more help from you to get this deployed.

*Issue:*

FATAL | Check Keystone service status | undercloud | item=heat-cfn |
error={"ansible_job_id": "687227427425.307276", "ansible_loop_var":
"tripleo_keystone_resources_service_async_result_item", "attempts": 1,
"changed": false, "extra_data": {"data": null, "details": "The request you
have made requires authentication.", "response":
"{\"error\":{\"code\":401,\"message\":\"The request you have made requires
authentication.\",\"title\":\"Unauthorized\"}}\n"}, "finished": 1, "msg":
"Failed to list services: Client Error for url:
http://10.10.30.222:5000/v3/services, The request you have made requires
authentication.", "tripleo_keystone_resources_service_async_result_item":
{"ansible_job_id": "687227427425.307276", "ansible_loop_var":
"tripleo_keystone_resources_data", "changed": true, "failed": false,
"finished": 0, "results_file": "/root/.ansible_async/687227427425.307276",
"started": 1, "tripleo_keystone_resources_data": {"key": "heat-cfn",
"value": {"endpoints": {"admin": "http://10.10.30.222:8000/v1", "internal":
"http://10.10.30.222:8000/v1", "public": "http://10.10.30.222:8000/v1"},
"region": "regionOne", "service": "cloudformation", "users": {"heat-cfn":
{"password": "3f3tHhxhna1CpRVPMjF7po49F"}}}}}}


PFA the ansible.log file.

Thanks for your help and patience.

Regards
Anirudh Gupta

On Tue, Dec 28, 2021 at 2:28 PM Yatin Karel <ykarel at redhat.com> wrote:

> Hi Anirudh,
>
> I am not sure what is causing this issue, and the shared log file is
> incomplete. I believe you ran the command against the same overcloud
> deployment that was failing earlier (when docker-ha.yaml was not passed).
> If so, to rule out the already-deployed environment as the cause, you
> can delete the overcloud and then redeploy with the correct environment
> files, as used in the last run.
>
> One possible reason I found for the password expiration is that the time
> is not in sync on the overcloud nodes. It would be good to check that as
> well and fix it (by using correct NTP sources) before attempting a
> redeployment.
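A rough sketch of the clock check suggested above, assuming the nodes use chrony for time sync (an assumption; the thread only mentions NTP sources). The helper names `chrony_offset` and `offset_within`, and the 1-second limit in the usage comment, are illustrative and not part of TripleO or chrony.

```shell
# Hypothetical helpers, not part of TripleO or chrony.

# Print the clock offset (in seconds) from `chronyc tracking` output read
# on stdin, taken from its "System time" line.
chrony_offset() {
    awk -F': *' '/^System time/ { split($2, parts, " "); print parts[1] }'
}

# Succeed when offset $1 is no larger than limit $2 (both in seconds).
# Note: an empty offset compares as 0, so confirm chronyc itself succeeded.
offset_within() {
    awk -v offset="$1" -v limit="$2" 'BEGIN { exit !(offset + 0 <= limit + 0) }'
}

# Example, run on each overcloud node (the 1-second limit is arbitrary):
#   offset_within "$(chronyc tracking | chrony_offset)" 1 || echo "clock drift"
```

Running the same check on the undercloud as well makes it easier to see whether only some nodes have drifted.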
>
> Thanks and regards
> Yatin Karel
>
>
>
> On Tue, Dec 28, 2021 at 2:03 PM Anirudh Gupta <anyrude10 at gmail.com> wrote:
>
>> Hi Yatin & Team
>>
>> Thanks for your response.
>>
>> When I executed the command as below, the installation moved ahead and
>> encountered another error.
>>
>> openstack overcloud deploy --templates \
>>     -r /home/stack/templates/roles_data.yaml \
>>     -e /home/stack/templates/node-info.yaml \
>>     -e environment.yaml \
>>     -e /usr/share/openstack-tripleo-heat-templates/environments/docker-ha.yaml \
>>     -e /usr/share/openstack-tripleo-heat-templates/environments/podman.yaml \
>>     -e /home/stack/containers-prepare-parameter.yaml
>>
>> *Issue:*
>> The error was: keystoneauth1.exceptions.http.Unauthorized: The password
>> is expired and needs to be changed for user:
>> 4f7d1dbf58574e64af9e359cb98ccbbc. (HTTP 401) (Request-ID:
>> req-b29aa655-e3ec-4d4b-8ada-397f9a132582)
>>
>> I am attaching the ansible.logs for your reference. It would be a great
>> help if you could suggest some pointers to resolve this issue.
>>
>> Regards
>> Anirudh Gupta
>>
>> On Tue, Dec 28, 2021 at 11:13 AM Yatin Karel <ykarel at redhat.com> wrote:
>>
>>> Hi Anirudh,
>>>
>>> As said, order is important here: docker-ha.yaml should be followed by
>>> podman.yaml. Parameters in an environment file override the same
>>> parameters from environment files passed earlier, and that makes the
>>> deployment use podman instead of docker. The parameter that makes this
>>> switch is "ContainerCli".
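To illustrate the ordering point, here is a small sketch of a pre-flight check that could be run on the argument list before calling the deploy command. The function name `env_order_ok` is hypothetical; it only inspects its arguments and does not call the openstack client.

```shell
# Hypothetical helper: succeed only if docker-ha.yaml appears in the
# argument list and is listed before podman.yaml.
env_order_ok() {
    local i=0 ha=0 podman=0 arg
    for arg in "$@"; do
        i=$((i + 1))
        case "$arg" in
            docker-ha.yaml|*/docker-ha.yaml) ha=$i ;;
            podman.yaml|*/podman.yaml) podman=$i ;;
        esac
    done
    # Both files must be present, with docker-ha.yaml first.
    [ "$ha" -gt 0 ] && [ "$podman" -gt 0 ] && [ "$ha" -lt "$podman" ]
}

# Example:
#   env_order_ok -e .../environments/docker-ha.yaml \
#                -e .../environments/podman.yaml && echo "order looks right"
```

The check is deliberately narrow: it says nothing about whether the files exist or what they contain, only about their relative position.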
>>>
>>>
>>> Thanks and regards
>>> Yatin Karel
>>>
>>> On Tue, Dec 28, 2021 at 10:59 AM Anirudh Gupta <anyrude10 at gmail.com>
>>> wrote:
>>>
>>>> If this is a docker-ha issue, then that has also been tried.
>>>>
>>>> Since this is CentOS 8, there is no docker available. If I pass
>>>> docker-ha.yaml, it gives the following error:
>>>>
>>>> FATAL | Pull
>>>> undercloud.ctlplane.localdomain:8787/tripleotraincentos8/centos-binary-cinder-volume:current-tripleo
>>>> image | overcloud-controller-1 | error={"changed": true, "cmd": "docker
>>>> pull
>>>> undercloud.ctlplane.localdomain:8787/tripleotraincentos8/centos-binary-cinder-volume:current-tripleo",
>>>> "delta": "0:00:00.005932", "end": "2021-12-27 12:42:33.927484", "msg":
>>>> "non-zero return code", "rc": 127, "start": "2021-12-27 12:42:33.921552",
>>>> "stderr": "/bin/sh: docker: command not found", "stderr_lines":
>>>> ["/bin/sh: docker: command not found"], "stdout": "", "stdout_lines": []}
>>>>
>>>> Regards
>>>> Anirudh Gupta
>>>>
>>>> On Tue, Dec 28, 2021 at 10:26 AM Yatin Karel <ykarel at redhat.com> wrote:
>>>>
>>>>> Hi Anirudh,
>>>>>
>>>>> Sorry, which timer? A timer adjustment is not needed for the issue you
>>>>> are seeing. If you mean the overcloud deploy timeout, overcloud deploy
>>>>> provides the --timeout option for that. The best option for now is to
>>>>> try docker-ha and podman, in that order, as suggested earlier.
>>>>>
>>>>>
>>>>> Thanks and Regards
>>>>> Yatin Karel
>>>>>
>>>>> On Tue, Dec 28, 2021 at 10:12 AM Anirudh Gupta <anyrude10 at gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Thanks Yatin for your response.
>>>>>>
>>>>>> Please suggest how this timer can be increased, or any other steps
>>>>>> that need to be followed to rectify this.
>>>>>>
>>>>>> Regards
>>>>>> Anirudh Gupta
>>>>>>
>>>>>> On Tue, Dec 28, 2021 at 10:08 AM Yatin Karel <ykarel at redhat.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Hi Anirudh,
>>>>>>>
>>>>>>>
>>>>>>> On Mon, Dec 27, 2021 at 9:39 PM Anirudh Gupta <anyrude10 at gmail.com>
>>>>>>> wrote:
>>>>>>> >
>>>>>>> > Hi Team,
>>>>>>> >
>>>>>>> > I am trying to deploy TripleO Train with 3 controllers and 1
>>>>>>> > compute.
>>>>>>> > For overcloud images, I have a registry server on the undercloud only.
>>>>>>> >
>>>>>>> > I executed the following command to deploy the overcloud:
>>>>>>> >
>>>>>>> > openstack overcloud deploy --templates \
>>>>>>> >     -r /home/stack/templates/roles_data.yaml \
>>>>>>> >     -e /home/stack/templates/node-info.yaml \
>>>>>>> >     -e /usr/share/openstack-tripleo-heat-templates/environments/podman.yaml \
>>>>>>> >     -e /home/stack/containers-prepare-parameter.yaml
>>>>>>> >
>>>>>>> > The command ran for around 1.5 hrs. Initially the stack was created
>>>>>>> > successfully, and after that Ansible tasks were executed for 45 mins.
>>>>>>> > It then gave the following error in overcloud-controller-0:
>>>>>>> >
>>>>>>> > 2021-12-27 11:12:27,507 p=181 u=mistral n=ansible | 2021-12-27
>>>>>>> 11:12:27.506838 | 525400b1-b522-2a06-ea9d-00000000356f |         OK | Debug
>>>>>>> output for task: Start containers for step 2 | overcloud-novacompute-0 |
>>>>>>> result={
>>>>>>> >     "changed": false,
>>>>>>> >     "failed_when_result": false,
>>>>>>> >     "start_containers_outputs.stdout_lines | default([]) |
>>>>>>> union(start_containers_outputs.stderr_lines | default([]))": [
>>>>>>> >
>>>>>>> "f206c31a781641313aa4a0499c62475efc335de6faea785cd4e855dc32ebb571",
>>>>>>> >         "",
>>>>>>> >         "Info: Loading facts",
>>>>>>> >         "Notice: Compiled catalog for
>>>>>>> overcloud-novacompute-0.localdomain in environment production in 0.05
>>>>>>> seconds",
>>>>>>> >         "Info: Applying configuration version '1640604309'",
>>>>>>> >         "Notice:
>>>>>>> /Stage[main]/Tripleo::Profile::Base::Neutron::Ovn_metadata_agent_wrappers/Tripleo::Profile::Base::Neutron::Wrappers::Haproxy[ovn_metadata_haproxy_process_wrapper]/File[/var/lib/neutron/ovn_metadata_haproxy_wrapper]/ensure:
>>>>>>> defined content as '{md5}5bb050ca70c01981975efad9d8f81f2d'",
>>>>>>> >         "Info:
>>>>>>> Tripleo::Profile::Base::Neutron::Wrappers::Haproxy[ovn_metadata_haproxy_process_wrapper]:
>>>>>>> Unscheduling all events on
>>>>>>> Tripleo::Profile::Base::Neutron::Wrappers::Haproxy[ovn_metadata_haproxy_process_wrapper]",
>>>>>>> >         "Info: Creating state file
>>>>>>> /var/lib/puppet/state/state.yaml",
>>>>>>> >         "Notice: Applied catalog in 0.01 seconds",
>>>>>>> >         "Changes:",
>>>>>>> >         "            Total: 1",
>>>>>>> >         "Events:",
>>>>>>> >         "          Success: 1",
>>>>>>> >         "Resources:",
>>>>>>> >         "          Changed: 1",
>>>>>>> >         "      Out of sync: 1",
>>>>>>> >         "          Skipped: 7",
>>>>>>> >         "            Total: 8",
>>>>>>> >         "Time:",
>>>>>>> >         "             File: 0.00",
>>>>>>> >         "   Transaction evaluation: 0.01",
>>>>>>> >         "   Catalog application: 0.01",
>>>>>>> >         "   Config retrieval: 0.09",
>>>>>>> >         "         Last run: 1640604309",
>>>>>>> >         "            Total: 0.01",
>>>>>>> >          "Version:",
>>>>>>> >         "           Config: 1640604309",
>>>>>>> >         "           Puppet: 5.5.10",
>>>>>>> >         "Error executing ['podman', 'container', 'exists',
>>>>>>> 'nova_compute_init_log']: returned 1",
>>>>>>> >         "Did not find container with \"['podman', 'ps', '-a',
>>>>>>> '--filter', 'label=container_name=nova_compute_init_log', '--filter',
>>>>>>> 'label=config_id=tripleo_step2', '--format', '{{.Names}}']\" - retrying
>>>>>>> without config_id",
>>>>>>> >         "Did not find container with \"['podman', 'ps', '-a',
>>>>>>> '--filter', 'label=container_name=nova_compute_init_log', '--format',
>>>>>>> '{{.Names}}']\"",
>>>>>>> >         "Error executing ['podman', 'container', 'exists',
>>>>>>> 'create_haproxy_wrapper']: returned 1",
>>>>>>> >         "Did not find container with \"['podman', 'ps', '-a',
>>>>>>> '--filter', 'label=container_name=create_haproxy_wrapper', '--filter',
>>>>>>> 'label=config_id=tripleo_step2', '--format', '{{.Names}}']\" - retrying
>>>>>>> without config_id",
>>>>>>> >         "Did not find container with \"['podman', 'ps', '-a',
>>>>>>> '--filter', 'label=container_name=create_haproxy_wrapper', '--format',
>>>>>>> '{{.Names}}']\""
>>>>>>> >     ]
>>>>>>> > }
>>>>>>>
>>>>>>> This is not the actual error. The actual error is: puppet-user: Error:
>>>>>>> /Stage[main]/Tripleo::Profile::Base::Rabbitmq/Rabbitmq_policy[ha-all@/]:
>>>>>>> Could not evaluate: Command is still failing after 180 seconds expired!"
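Since the real failure can be buried far above the last task output, a quick filter over the log can help locate it. This is only a sketch: the function name `find_failures` is hypothetical, and the two patterns are simply the failure markers that appear in the lines quoted in this thread, so other failure formats would be missed.

```shell
# Hypothetical helper: list candidate failure lines, with line numbers,
# from an Ansible log file given as $1.
# Patterns come from error lines quoted in this thread:
#   "FATAL | <task> | <host> | ..."  and  "puppet-user: Error: ..."
find_failures() {
    grep -nE 'FATAL \||puppet-user: Error:' "$1"
}

# Example:
#   find_failures ansible.log | tail -n 20
```

The line numbers in the output make it easy to open the log at the first match and read the surrounding context.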
>>>>>>>
>>>>>>> >
>>>>>>> > I am also attaching ansible.log file for more information.
>>>>>>> >
>>>>>>> > Note: On CentOS 8, there is no docker, so I didn't pass
>>>>>>> > docker-ha.yaml
>>>>>>>
>>>>>>> To enable HA with podman in Train on CentOS 8, you need to pass both
>>>>>>> docker-ha.yaml and podman.yaml, in that order (*order is important
>>>>>>> here*), so: -e
>>>>>>> /usr/share/openstack-tripleo-heat-templates/environments/docker-ha.yaml -e
>>>>>>> /usr/share/openstack-tripleo-heat-templates/environments/podman.yaml.
>>>>>>> This way you will have a deployment with HA and podman. I agree the
>>>>>>> docker-ha name is confusing alongside podman, but it has to be passed
>>>>>>> to get the required deployment. Also, with Ussuri+ HA is turned on by
>>>>>>> default, so those releases may work even without passing docker-ha.yaml,
>>>>>>> but for Train at least it's needed.
>>>>>>> >
>>>>>>> > Can someone please help in resolving my issue
>>>>>>> >
>>>>>>> As per your requirement I would suggest running with the above
>>>>>>> config.
>>>>>>>
>>>>>>> > Regards
>>>>>>> > Anirudh Gupta
>>>>>>>
>>>>>>> Thanks and Regards
>>>>>>> Yatin Karel
>>>>>>>
>>>>>>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: ansible.log
Type: application/octet-stream
Size: 2312730 bytes
Desc: not available
URL: <http://lists.openstack.org/pipermail/openstack-discuss/attachments/20211228/350925c2/attachment-0001.obj>

