Hi Experts,
I am facing the issues below while following the TripleO Standalone Deployment guide:
https://docs.openstack.org/project-deploy-guide/tripleo-docs/latest/deployment/standalone.html
Issue 1) Using a custom domain causes a VM instantiation error after the machine reboots
Description
===========
After OpenStack is installed, centos82.localdomain is added to /etc/hosts, and the OVN Southbound database ("ovn-sbctl show") registers the chassis under centos82.localdomain. After the machine is rebooted, "ovn-sbctl show" shows the correct FQDN, centos82.domain.tld, and VM instantiation then fails (Refusing to bind port <UUID> due to no OVN chassis for host: centos82.localdomain).
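In case it helps others check the same state, here is a minimal sketch for seeing which name /etc/hosts maps to the host IP after the install. The function name canonical_name_for_ip is my own hypothetical helper (it is not part of TripleO), and it only assumes the standard hosts file format.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of TripleO: print the first name that a
# hosts-format file (read from stdin) maps to the given IP address.
canonical_name_for_ip() {
    awk -v ip="$1" '
        { sub(/#.*/, "") }                      # drop comments
        $1 == ip && NF >= 2 { print $2; exit }  # first name after the IP
    '
}

# Example on the standalone host (IP taken from this report):
#   canonical_name_for_ip 192.168.156.82 < /etc/hosts
#   hostname -f
# If the two names differ, the host is known under two different FQDNs.
```

Comparing that result with "hostname -f" before and after the reboot should show whether the two names have drifted apart.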
Steps to reproduce
==================
Base installation: https://docs.openstack.org/project-deploy-guide/tripleo-docs/latest/deployment/standalone.html
Tested on CentOS 8.2 (latest updates), several times on both Train and Ussuri.
100% reproducible.
Set the custom FQDN hostname:
hostnamectl set-hostname centos82.domain.tld
hostnamectl set-hostname centos82.domain.tld --transient (tested both with and without this line)
cat /etc/hosts
127.0.0.1 centos82.domain.tld localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 centos82.domain.tld localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.156.82 centos82.domain.tld
Also tested with:
hostnamectl set-hostname centos82.domain.tld
hostnamectl set-hostname centos82.domain.tld --transient (tested both with and without this line)
cat /etc/hosts
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.156.82 centos82.domain.tld centos82
Also tested with:
hostnamectl set-hostname centos82.domain.tld
hostnamectl set-hostname centos82.domain.tld --transient (tested both with and without this line)
cat /etc/hosts
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.156.82 centos82.domain.tld
Part of standalone_parameters.yaml for domain:
# domain name used by the host
NeutronDnsDomain: domain.tld
Follow the guide to install (without Ceph).
(Issue 1B) Installing with Ceph produced a different error; I tested this only once, on Train:
fatal: [undercloud]: FAILED! => {
    "ceph_ansible_std_out_err": [
        "Using /home/stack/standalone-ansible-huva1ccw/ansible.cfg as config file",
        "ERROR! the playbook: /usr/share/ceph-ansible/site-container.yml.sample could not be found"
    ],
    "failed_when_result": true
}
Everything installs successfully, VMs can be instantiated, and those VMs can ping each other.
Then run "reboot" on this standalone host/server.
Instantiate a new VM; its status will be ERROR.
In the Neutron logs:
Refusing to bind port <UUID> due to no OVN chassis for host: centos82.localdomain bind_port
I found the cause in OVN:
After installation (before reboot):
[root@centos82 /]# ovn-sbctl show
Chassis "<UUID>"
    hostname: centos82.localdomain
    Encap geneve
        ip: "192.168.156.82"
        options: {csum="true"}
[root@centos82 /]#
After reboot
[root@centos82 /]# ovn-sbctl show
Chassis "<UUID>"
    hostname: centos82.domain.tld
    Encap geneve
        ip: "192.168.156.82"
        options: {csum="true"}
[root@centos82 /]#
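So it seems the port cannot be bound because Neutron still looks for a chassis under the old hostname (centos82.localdomain), while after the reboot the chassis is registered as centos82.domain.tld.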
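The comparison above can be sketched as a small check, in case somebody wants to reproduce it. The helper names chassis_hostnames and host_has_chassis are my own (hypothetical, not part of OVN or TripleO), and the parsing assumes the "hostname:" line looks like the output shown above; other OVN versions may print it differently.

```shell
#!/usr/bin/env bash
# Hypothetical helpers, not part of OVN or TripleO.

# Print the hostname of every chassis found in "ovn-sbctl show" output on stdin.
chassis_hostnames() {
    awk '$1 == "hostname:" { gsub(/"/, "", $2); print $2 }'
}

# Succeed if the given host name matches one of the chassis hostnames on stdin.
host_has_chassis() {
    chassis_hostnames | grep -Fxq -- "$1"
}

# Example on the standalone host (host name taken from the Neutron error):
#   ovn-sbctl show | host_has_chassis centos82.localdomain \
#       || echo "no OVN chassis registered as centos82.localdomain"
```

On my host I would expect this check to succeed for centos82.localdomain before the reboot and fail afterwards, which matches the Neutron error.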
Environment
===========
1. Exact version of OpenStack you are running: tested on Train and Ussuri
2. CentOS 8.2, latest updates
Other settings (without Ceph) as in the guide: https://docs.openstack.org/project-deploy-guide/tripleo-docs/latest/deployment/standalone.html
Workaround
==========
Don't use a custom domain; use "localdomain" instead, as in the guide:
hostnamectl set-hostname centos82.localdomain
[root@centos82 ~]# cat /etc/hosts
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.156.82 centos82.localdomain centos82
Deploy, instantiate a test VM, reboot, instantiate another test VM: everything works perfectly fine.
Question: For a custom domain, what did I do wrong? Is this expected behavior, or is it a bug?
Issue 2) This is not an OpenStack issue as such (it may be a bug, or it may be my own mistake), but it is still related to the TripleO Standalone Deployment.
I followed the same Standalone Deployment guide (https://docs.openstack.org/project-deploy-guide/tripleo-docs/latest/deployment/standalone.html)
on CentOS 7.6 and 7.7. There, "python-tripleoclient" depends on Docker 1.13.1, and I run into a DNS query issue, which I posted here: https://serverfault.com/questions/1032816/centos-7-and-docker-1-13-1-error-timeout-exceeded-while-awaiting-headers-no
Perhaps somebody knows how to resolve this.
Also, I am wondering whether there is any way for "python-tripleoclient" to use the newer docker-ce 19.x instead of docker 1.13.1?
Thanks
Best regards,
TY