Wallaby install via kayobe onto Ubuntu 20 all-in-one host

Mark Goddard mark at stackhpc.com
Mon Jun 21 08:36:07 UTC 2021


On Mon, 21 Jun 2021 at 06:51, Tony Pearce <tonyppe at gmail.com> wrote:
>
> Hi, me again :)
>
> I tested this again on Friday and today (Monday) using a CentOS Ansible Control Host, as well as different installation methods for the OpenStack host (such as a minimal OS install and "server with GUI"). Essentially, the deployment of OpenStack Victoria fails during "kayobe overcloud service deploy" at: TASK [openvswitch : Ensuring OVS bridge is properly setup].
>
> I investigated this, comparing it with a Train version. On Victoria, the host is missing:
> - ifcfg-p-bond0-ovs
> - ifcfg-p-bond0-phy
>
> These interfaces are also not visible in the bridge config shown by "ovs-vsctl show". I tried to manually create the ifcfg files and add the interfaces to the bridge, but I inadvertently created a bridging loop.
>
> Are you aware of this? I am not sure what else I can do to either help the kayobe/kolla-ansible teams or resolve this myself to allow a successful Victoria install - please let me know.

One relevant thing that changed between Train and Victoria is that
Kayobe supports plugging a non-bridge interface directly into OVS,
without the veth pairs. So if your bond0 interface is not a bridge (I
assume it's a bond), then you would no longer get the veth links. I'm
not sure how it would have worked without a bridge previously though.
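
As a rough illustration (the network name and device names below are
only examples), the two cases look something like this in kayobe's
network interface configuration:

    # Bridge: kayobe patches the Linux bridge into OVS via a veth
    # pair (the p-<bridge>-phy / p-<bridge>-ovs interfaces you list).
    external_interface: breth1
    external_bridge_ports:
      - eth1

    # Bond: from Victoria, a bond (or any plain interface) can be
    # plugged into the OVS bridge directly, so no veth pair is created.
    external_interface: bond0
    external_bond_slaves:
      - eth1
      - eth2
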
Mark

>
> Regards,
>
> Tony Pearce
>
>
> On Thu, 17 Jun 2021 at 16:13, Tony Pearce <tonyppe at gmail.com> wrote:
>>
>> Hi Mark,
>>
>> I made some time to test this again today with Victoria on a different Ansible Control Host. During host configure, it fails because it cannot find Python:
>>
>> TASK [Verify that a command can be executed] **********************************************************************************************************************************
>> fatal: [juc-ucsb-5-p]: FAILED! => {"changed": false, "module_stderr": "Shared connection to 192.168.29.235 closed.\r\n", "module_stdout": "/bin/sh: /usr/bin/python3: No such file or directory\r\n", "msg": "The module failed to execute correctly, you probably need to set the interpreter.\nSee stdout/stderr for the exact error", "rc": 127}
>>
>> PLAY RECAP ********************************************************************************************************************************************************************
>> juc-ucsb-5-p               : ok=4    changed=1    unreachable=0    failed=1    skipped=2    rescued=0    ignored=0
>>
>> The play you mentioned previously did run, but it was not applied to the host because no hosts matched:
>>
>> PLAY [Ensure python is installed] *********************************************************************************************************************************************
>> skipping: no hosts matched
>>
>> I looked at `venvs/kayobe/share/kayobe/ansible/kayobe-ansible-user.yml` and a comment in there says the play is only run if the kayobe user account is inaccessible. In my deployment "kayobe_ansible_user" is left commented out ("#kayobe_ansible_user:"), i.e. it is not defined by me. Previously, when I did define it as my management user, it caused an issue with the password. So I'm unsure why this is a problem now.
>>
>> To work around this, I manually installed Python on the host, after which host configure was successful. I tried this twice and had the same experience both times.
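>>
>> For reference, the workaround was simply installing the distro Python 3
>> package on the host before re-running host configure - something like
>> this (assuming a CentOS 8 host, which the python3.6 paths in the
>> traceback further down suggest):
>>
>>     sudo dnf install -y python3
>>
>> I guess setting ansible_python_interpreter for the host would be another
>> way around it, as the error message suggests, but installing python3 was
>> enough to get past this.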
>>
>> Then later, during service deploy it fails here:
>>
>> RUNNING HANDLER [common : Restart fluentd container] **************************************************************************************************************************
>> fatal: [juc-ucsb-5-p]: FAILED! => {"changed": true, "msg": "'Traceback (most recent call last):\\n  File \"/opt/kayobe/venvs/kolla-ansible/lib/python3.6/site-packages/docker/api/client.py\", line 259, in _raise_for_status\\n    response.raise_for_status()\\n  File \"/opt/kayobe/venvs/kolla-ansible/lib/python3.6/site-packages/requests/models.py\", line 941, in raise_for_status\\n    raise HTTPError(http_error_msg, response=self)\\nrequests.exceptions.HTTPError: 500 Server Error: Internal Server Error for url: http+docker://localhost/v1.41/containers/fluentd/start\\n\\nDuring handling of the above exception, another exception occurred:\\n\\nTraceback (most recent call last):\\n  File \"/tmp/ansible_kolla_docker_payload_34omrn2y/ansible_kolla_docker_payload.zip/ansible/modules/kolla_docker.py\", line 1131, in main\\n  File \"/tmp/ansible_kolla_docker_payload_34omrn2y/ansible_kolla_docker_payload.zip/ansible/modules/kolla_docker.py\", line 785, in recreate_or_restart_container\\n  File \"/tmp/ansible_kolla_docker_payload_34omrn2y/ansible_kolla_docker_payload.zip/ansible/modules/kolla_docker.py\", line 817, in start_container\\n  File \"/opt/kayobe/venvs/kolla-ansible/lib/python3.6/site-packages/docker/utils/decorators.py\", line 19, in wrapped\\n    return f(self, resource_id, *args, **kwargs)\\n  File \"/opt/kayobe/venvs/kolla-ansible/lib/python3.6/site-packages/docker/api/container.py\", line 1108, in start\\n    self._raise_for_status(res)\\n  File \"/opt/kayobe/venvs/kolla-ansible/lib/python3.6/site-packages/docker/api/client.py\", line 261, in _raise_for_status\\n    raise create_api_error_from_http_exception(e)\\n  File \"/opt/kayobe/venvs/kolla-ansible/lib/python3.6/site-packages/docker/errors.py\", line 31, in create_api_error_from_http_exception\\n    raise cls(e, response=response, explanation=explanation)\\ndocker.errors.APIError: 500 Server Error: Internal Server Error (\"error while creating mount source path \\'/etc/localtime\\': mkdir /etc/lo
>>
>> The error says that the file already exists. The first time, I simply renamed the /etc/localtime symlink, and that was enough to let the deploy process get past this point of failure. The second time around, the rename alone was not enough because there is a check that the file is present, so after renaming the existing file I issued "touch /etc/localtime", and then this step passed.
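>>
>> In other words, the workaround on the second attempt was roughly the
>> following (the backup filename is just what I happened to use):
>>
>>     sudo mv /etc/localtime /etc/localtime.bak
>>     sudo touch /etc/localtime
>>
>> After that, the fluentd handler passed and the deploy continued.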
>>
>> Lastly, the deploy fails with a blocking action that I cannot resolve myself:
>>
>> TASK [openvswitch : Ensuring OVS bridge is properly setup] ********************************************************************************************************************
>> changed: [juc-ucsb-5-p] => (item=['enp6s0-ovs', 'enp6s0'])
>>
>> This step breaks networking on the host. Looking at the Open vSwitch DB, I think this could be something similar to the issue seen before with Wallaby. The first time I tried this, enp6s0 was configured as part of bond0, as desired. I then tried without a bond0, and both times got the same result.
>> If I reboot the host then I get successful ping replies for a short while before they stop again - the same experience as previously. I believe the pings stop when the bridge config is applied from the container shortly after host boot-up. ovs-vsctl show output: [1]
>>
>> I took a look at the logs [2] but I don't see anything alarming or anything that points to the issue. I previously tried turning off IPv6; that did not help here, although the log message about IPv6 went away.
>>
>> I tried removing the physical interface from the bridge with "ovs-vsctl del-port..." and as soon as I do this, I can ping the host once again. Once I re-add the port to the bridge, I can no longer connect to the host. There are no errors from ovs-vsctl at this point, either.
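>>
>> Concretely, assuming the first item in the task output above is the OVS
>> bridge and the second is the physical port, what I am doing is roughly:
>>
>>     ovs-vsctl del-port enp6s0-ovs enp6s0    # pings to the host resume
>>     ovs-vsctl add-port enp6s0-ovs enp6s0    # connectivity is lost again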
>>
>> [1] ovs-vsctl output screenshot
>> [2] ovs logs screenshot
>>
>> BTW, I am trimming the rest of the mail because it exceeds the 40 kB size limit for the list.
>>
>> Kind regards,
>>
>> Tony Pearce
>>>
>>> ...


