Wallaby install via kayobe onto ubuntu 20 all in one host

Tony Pearce tonyppe at gmail.com
Mon Jun 21 08:58:43 UTC 2021


I see. I should have explained better: I tried to resolve this using a
console connection to the host - outside of any "kayobe" commands - because
the host network breaks during "kayobe service deploy", which causes the
deployment to fail. I tried manually creating the interfaces and adding the
ports.

In short, "host configure" completes without issue (apart from the
localtime and python3 issues already mentioned, which I am able to work
around). After the host is configured, I run "service deploy", which halts
when the host is no longer reachable over IP - this happens at the task
that checks that the bridge is set up.

Over the past week or so I have done enough testing to either confirm or
rule out a local issue on my side, having tried multiple systems and
getting the same result each time.

I think there may be a bug in kayobe / kolla-ansible that is causing the
deployment failure. Are you aware of anything else I could try in order to
be certain?

Kind regards,

Tony Pearce


On Mon, 21 Jun 2021 at 16:36, Mark Goddard <mark at stackhpc.com> wrote:

> On Mon, 21 Jun 2021 at 06:51, Tony Pearce <tonyppe at gmail.com> wrote:
> >
> > Hi, me again :)
> >
> > I tested this again Friday and today (Monday) using a CentOS Ansible
> Control Host as well as different installation methods for the OpenStack
> host (such as a minimal OS install and "Server with GUI"). Essentially, the
> deployment of OpenStack Victoria fails during "kayobe overcloud service
> deploy" at: TASK [openvswitch : Ensuring OVS bridge is properly setup].
> >
> > I investigated this, comparing it with a Train version. On Victoria, the
> host is missing:
> > - ifcfg-p-bond0-ovs
> > - ifcfg-p-bond0-phy
> >
> > And these are not visible in the bridge config as seen with "ovs-vsctl
> show". I tried to manually add the ifcfg and add to the bridge but I
> inadvertently created a bridging loop.
> >
> > Are you guys aware of this? I am not sure what else I can do to either
> help the kayobe/kolla-ansible teams or resolve this to allow a successful
> Victoria install - please let me know?
>
> One relevant thing that changed between Train and Victoria is that
> Kayobe supports plugging a non-bridge interface directly into OVS,
> without the veth pairs. So if your bond0 interface is not a bridge (I
> assume it's a bond), then you would no longer get the veth links. I'm
> not sure how it would have worked without a bridge previously though.
> Mark
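For readers following along: the bridge-with-veth setup Mark describes is expressed in Kayobe's networks.yml. Below is a minimal sketch - the network name (admin_oc), the CIDR, and the applicability of the `_bridge_ports` attribute to this version are assumptions, not taken from the thread's actual config:

```yaml
# etc/kayobe/networks.yml (sketch - network name and CIDR are assumptions)
admin_oc_net_cidr: 192.168.29.0/24
# Naming the interface "breth0" makes it a Linux bridge; Kayobe then plugs
# the bridge into OVS via a veth pair. By contrast, setting the interface
# to "bond0" directly (no bridge_ports) gives the new Victoria-style
# behaviour Mark describes, with no veth links created.
admin_oc_net_interface: breth0
admin_oc_net_bridge_ports:
  - bond0
```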
>
> >
> > Regards,
> >
> > Tony Pearce
> >
> >
> > On Thu, 17 Jun 2021 at 16:13, Tony Pearce <tonyppe at gmail.com> wrote:
> >>
> >> Hi Mark,
> >>
> >> I made some time to test this again today with Victoria on a different
> ACH. During host configure, it fails not finding python:
> >>
> >> TASK [Verify that a command can be executed]
> **********************************************************************************************************************************
> >> fatal: [juc-ucsb-5-p]: FAILED! => {"changed": false, "module_stderr":
> "Shared connection to 192.168.29.235 closed.\r\n", "module_stdout":
> "/bin/sh: /usr/bin/python3: No such file or directory\r\n", "msg": "The
> module failed to execute correctly, you probably need to set the
> interpreter.\nSee stdout/stderr for the exact error", "rc": 127}
> >>
> >> PLAY RECAP
> ********************************************************************************************************************************************************************
> >> juc-ucsb-5-p               : ok=4    changed=1    unreachable=0
> failed=1    skipped=2    rescued=0    ignored=0
> >>
> >> The task you mentioned previously ran, but not against the host,
> because no hosts matched:
> >>
> >> PLAY [Ensure python is installed]
> *********************************************************************************************************************************************
> >> skipping: no hosts matched
> >>
> >> I looked at `venvs/kayobe/share/kayobe/ansible/kayobe-ansible-user.yml`
> and a comment in there says it's only run if the kayobe user account is
> inaccessible. In my deployment I have "#kayobe_ansible_user:" which is not
> defined by me. Previously, I defined it as my management user and it caused
> an issue with the password. So I'm unsure why this is an issue.
> >>
> >> To work around it, I manually installed python3, and the host configure
> was successful this time around. I tried this twice and had the same
> experience both times.
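Before re-running "host configure", it may be worth checking by hand that the interpreter Ansible expects is actually present on the overcloud host - a small sketch (the dnf/apt hint is an assumption about which OS the host runs):

```shell
# Check for the interpreter path named in the error message above.
if command -v python3 >/dev/null 2>&1; then
    echo "python3 found at $(command -v python3)"
else
    echo "python3 missing - install it (dnf or apt) and retry host configure"
fi
```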
> >>
> >> Then later, during service deploy it fails here:
> >>
> >> RUNNING HANDLER [common : Restart fluentd container]
> **************************************************************************************************************************
> >> fatal: [juc-ucsb-5-p]: FAILED! => {"changed": true, "msg": "'Traceback
> (most recent call last):\\n  File
> \"/opt/kayobe/venvs/kolla-ansible/lib/python3.6/site-packages/docker/api/client.py\",
> line 259, in _raise_for_status\\n    response.raise_for_status()\\n  File
> \"/opt/kayobe/venvs/kolla-ansible/lib/python3.6/site-packages/requests/models.py\",
> line 941, in raise_for_status\\n    raise HTTPError(http_error_msg,
> response=self)\\nrequests.exceptions.HTTPError: 500 Server Error: Internal
> Server Error for url:
> http+docker://localhost/v1.41/containers/fluentd/start\\n\\nDuring handling
> of the above exception, another exception occurred:\\n\\nTraceback (most
> recent call last):\\n  File
> \"/tmp/ansible_kolla_docker_payload_34omrn2y/ansible_kolla_docker_payload.zip/ansible/modules/kolla_docker.py\",
> line 1131, in main\\n  File
> \"/tmp/ansible_kolla_docker_payload_34omrn2y/ansible_kolla_docker_payload.zip/ansible/modules/kolla_docker.py\",
> line 785, in recreate_or_restart_container\\n  File
> \"/tmp/ansible_kolla_docker_payload_34omrn2y/ansible_kolla_docker_payload.zip/ansible/modules/kolla_docker.py\",
> line 817, in start_container\\n  File
> \"/opt/kayobe/venvs/kolla-ansible/lib/python3.6/site-packages/docker/utils/decorators.py\",
> line 19, in wrapped\\n    return f(self, resource_id, *args, **kwargs)\\n
> File
> \"/opt/kayobe/venvs/kolla-ansible/lib/python3.6/site-packages/docker/api/container.py\",
> line 1108, in start\\n    self._raise_for_status(res)\\n  File
> \"/opt/kayobe/venvs/kolla-ansible/lib/python3.6/site-packages/docker/api/client.py\",
> line 261, in _raise_for_status\\n    raise
> create_api_error_from_http_exception(e)\\n  File
> \"/opt/kayobe/venvs/kolla-ansible/lib/python3.6/site-packages/docker/errors.py\",
> line 31, in create_api_error_from_http_exception\\n    raise cls(e,
> response=response, explanation=explanation)\\ndocker.errors.APIError: 500
> Server Error: Internal Server Error (\"error while creating mount source
> path \\'/etc/localtime\\': mkdir /etc/lo
> >>
> >> The error says that the file exists. The first time, I simply renamed
> the symlink and that allowed the deploy process to proceed past this point
> of failure. The second time around, renaming alone was not enough because
> there is a check that the file is present, so I issued "touch
> /etc/localtime" after renaming the existing one, and then this passed.
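The rename-plus-touch workaround above can be sketched as a script. It is rehearsed here in a scratch directory rather than the real /etc, so it is safe to try; the UTC zoneinfo path is a typical value, not taken from the host:

```shell
# Rehearse the /etc/localtime workaround in a scratch dir instead of /etc.
etc=$(mktemp -d)
ln -s /usr/share/zoneinfo/UTC "$etc/localtime"   # stand-in for the real symlink
mv "$etc/localtime" "$etc/localtime.orig"        # 1st attempt: rename it aside
touch "$etc/localtime"                           # 2nd attempt: leave a plain file
# The check described in the thread only needs the path to exist as a file:
test -f "$etc/localtime" && echo "workaround ok"
```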
> >>
> >> Lastly, the deploy fails with a blocking action that I cannot resolve
> myself:
> >>
> >> TASK [openvswitch : Ensuring OVS bridge is properly setup]
> ********************************************************************************************************************
> >> changed: [juc-ucsb-5-p] => (item=['enp6s0-ovs', 'enp6s0'])
> >>
> >> This step breaks networking on the host. Looking at the Open vSwitch
> DB, I think this could be something similar to the issue seen before with
> Wallaby. The first time I tried this, enp6s0 was configured in a bond0 as
> desired. I then tried without a bond0 and got the same result both times.
> >> If I reboot the host, I get successful ping replies for a short while
> before they stop again - the same experience as previously. I believe the
> pings stop when the bridge config is applied from the container shortly
> after host boot-up. "ovs-vsctl show" output: [1]
> >>
> >> I took a look at the logs [2], but I don't see anything alarming or
> anything that points to the issue. I previously tried turning off IPv6;
> that did not help here, although the log message about IPv6 went away.
> >>
> >> I tried removing the physical interface from the bridge ("ovs-vsctl
> del-port ...") and, as soon as I do this, I can ping the host once again.
> Once I re-add the port to the bridge, I can no longer connect to the host.
> There are no errors from ovs-vsctl at this point, either.
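The del-port / add-port experiment can be scripted for repeatability. A sketch follows - "breth0" is a placeholder bridge name (substitute whatever "ovs-vsctl show" reports), and it should be run from a console session, since re-adding the port cuts network access again:

```shell
# Sketch of the port-toggle experiment; "breth0" is a placeholder bridge
# name - substitute the bridge and port that "ovs-vsctl show" reports.
if ! command -v ovs-vsctl >/dev/null 2>&1; then
    echo "ovs-vsctl not found - run this on the overcloud host"
    exit 0
fi
ovs-vsctl del-port breth0 enp6s0   # host answers ping again after this
ovs-vsctl show                     # confirm the port has left the bridge
ovs-vsctl add-port breth0 enp6s0   # connectivity is lost once more
```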
> >>
> >> [1] ovs-vsctl output screenshot
> >> [2] ovs logs screenshot
> >>
> >> BTW, I am trimming the rest of the mail because it exceeds the 40 kB
> size limit for the group.
> >>
> >> Kind regards,
> >>
> >> Tony Pearce
> >>>
> >>> ...
>