[openstack-ansible] Installing OpenStack with Ansible fails during Keystone playbook on TASK openstack.osa.db_setup
Hello Community,

I am trying to create a two-machine deployment following the OpenStack-Ansible Deployment Guide (https://docs.openstack.org/project-deploy-guide/openstack-ansible/latest/). The two machines are named targethost01 and targethost02, and I am running Ansible from deploymenthost. Every machine has a 4-core CPU, 8 GB of RAM, and a 240 GB SSD. I am using Ubuntu 22.04.1 LTS.

The machine targethost01 has the following network configuration:

    network:
      version: 2
      ethernets:
        enp5s0:
          dhcp4: true
        enp6s0: {}
        enp7s0: {}
        enp8s0: {}
        enp9s0: {}
      vlans:
        vlan.10:
          id: 10
          link: enp6s0
          addresses: [ ]
        vlan.20:
          id: 20
          link: enp7s0
          addresses: [ ]
        vlan.30:
          id: 30
          link: enp8s0
          addresses: [ ]
        vlan.40:
          id: 40
          link: enp9s0
          addresses: [ ]
      bridges:
        br-mgmt:
          addresses: [ 172.29.236.101/22 ]
          mtu: 1500
          interfaces:
            - vlan.10
        br-storage:
          addresses: [ 172.29.244.101/22 ]
          mtu: 1500
          interfaces:
            - vlan.20
        br-vlan:
          addresses: []
          mtu: 1500
          interfaces:
            - vlan.30
        br-vxlan:
          addresses: [ 172.29.240.101/22 ]
          mtu: 1500
          interfaces:
            - vlan.40

And targethost02 has the following network configuration:

    network:
      version: 2
      ethernets:
        enp5s0:
          dhcp4: true
        enp6s0: {}
        enp7s0: {}
        enp8s0: {}
        enp9s0: {}
      vlans:
        vlan.10:
          id: 10
          link: enp6s0
          addresses: [ ]
        vlan.20:
          id: 20
          link: enp7s0
          addresses: [ ]
        vlan.30:
          id: 30
          link: enp8s0
          addresses: [ ]
        vlan.40:
          id: 40
          link: enp9s0
          addresses: [ ]
      bridges:
        br-mgmt:
          addresses: [ 172.29.236.102/22 ]
          mtu: 1500
          interfaces:
            - vlan.10
        br-storage:
          addresses: [ 172.29.244.102/22 ]
          mtu: 1500
          interfaces:
            - vlan.20
        br-vlan:
          addresses: []
          mtu: 1500
          interfaces:
            - vlan.30
        br-vxlan:
          addresses: [ 172.29.240.102/22 ]
          mtu: 1500
          interfaces:
            - vlan.40

On the deploymenthost, /etc/openstack_deploy/openstack_user_config.yml has the following:

    ---
    cidr_networks:
      container: 172.29.236.0/22
      tunnel: 172.29.240.0/22
      storage: 172.29.244.0/22

    used_ips:
      - 172.29.236.1
      - "172.29.236.100,172.29.236.200"
      - "172.29.240.100,172.29.240.200"
      - "172.29.244.100,172.29.244.200"

    global_overrides:
      internal_lb_vip_address: 172.29.236.101
      external_lb_vip_address: "{{ bootstrap_host_public_address | default(ansible_facts['default_ipv4']['address']) }}"
      management_bridge: "br-mgmt"
      provider_networks:
        - network:
            group_binds:
              - all_containers
              - hosts
            type: "raw"
            container_bridge: "br-mgmt"
            container_interface: "eth1"
            container_type: "veth"
            ip_from_q: "container"
            is_container_address: true
        - network:
            group_binds:
              - glance_api
              - cinder_api
              - cinder_volume
              - nova_compute
            type: "raw"
            container_bridge: "br-storage"
            container_type: "veth"
            container_interface: "eth2"
            container_mtu: "9000"
            ip_from_q: "storage"
        - network:
            group_binds:
              - neutron_linuxbridge_agent
            container_bridge: "br-vxlan"
            container_type: "veth"
            container_interface: "eth10"
            container_mtu: "9000"
            ip_from_q: "tunnel"
            type: "vxlan"
            range: "1:1000"
            net_name: "vxlan"
        - network:
            group_binds:
              - neutron_linuxbridge_agent
            container_bridge: "br-vlan"
            container_type: "veth"
            container_interface: "eth11"
            type: "vlan"
            range: "101:200,301:400"
            net_name: "vlan"
        - network:
            group_binds:
              - neutron_linuxbridge_agent
            container_bridge: "br-vlan"
            container_type: "veth"
            container_interface: "eth12"
            host_bind_override: "eth12"
            type: "flat"
            net_name: "flat"

    shared-infra_hosts:
      targethost01:
        ip: 172.29.236.101

    repo-infra_hosts:
      targethost01:
        ip: 172.29.236.101

    coordination_hosts:
      targethost01:
        ip: 172.29.236.101

    os-infra_hosts:
      targethost01:
        ip: 172.29.236.101

    identity_hosts:
      targethost01:
        ip: 172.29.236.101

    network_hosts:
      targethost01:
        ip: 172.29.236.101

    compute_hosts:
      targethost01:
        ip: 172.29.236.101
      targethost02:
        ip: 172.29.236.102

    storage-infra_hosts:
      targethost01:
        ip: 172.29.236.101

    storage_hosts:
      targethost01:
        ip: 172.29.236.101

Also on the deploymenthost, /etc/openstack_deploy/conf.d/haproxy.yml has the following:

    haproxy_hosts:
      targethost01:
        ip: 172.29.236.101

At the Run Playbooks step of the guide, the following two Ansible commands return with unreachable=0 failed=0:

    # openstack-ansible setup-hosts.yml
    # openstack-ansible setup-infrastructure.yml

And verifying the database also returns no error:

    root@deploymenthost:/opt/openstack-ansible/playbooks# ansible galera_container -m shell \
        -a "mysql -h localhost -e 'show status like \"%wsrep_cluster_%\";'"
    Variable files: "-e @/etc/openstack_deploy/user_secrets.yml -e @/etc/openstack_deploy/user_variables.yml "
    [WARNING]: Unable to parse /etc/openstack_deploy/inventory.ini as an inventory source
    targethost01_galera_container-5aa8474a | CHANGED | rc=0 >>
    Variable_name                 Value
    wsrep_cluster_weight          1
    wsrep_cluster_capabilities
    wsrep_cluster_conf_id         1
    wsrep_cluster_size            1
    wsrep_cluster_state_uuid      e7a0c332-97fe-11ed-b0d4-26b30049826d
    wsrep_cluster_status          Primary

But when I execute openstack-ansible setup-openstack.yml, I get this:

    TASK [os_keystone : Fact for apache module mod_auth_openidc to be installed] ***
    ok: [targethost01_keystone_container-76e9b31b]

    TASK [include_role : openstack.osa.db_setup] ***********************************

    TASK [openstack.osa.db_setup : Create database for service] ********************
    failed: [targethost01_keystone_container-76e9b31b -> targethost01_utility_container-dc05dc90(172.29.238.59)] (item=None) => {"censored": "the output has been hidden due to the fact that 'no_log: true' was specified for this result", "changed": false}
    fatal: [targethost01_keystone_container-76e9b31b -> {{ _oslodb_setup_host }}]: FAILED! => {"censored": "the output has been hidden due to the fact that 'no_log: true' was specified for this result", "changed": false}

    PLAY RECAP *********************************************************************
    targethost01_keystone_container-76e9b31b : ok=33 changed=0 unreachable=0 failed=1 skipped=8 rescued=0 ignored=0
    targethost01_utility_container-dc05dc90 : ok=3 changed=0 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0

    EXIT NOTICE [Playbook execution failure] **************************************
    ===============================================================================

First, how can I disable the "censored" warning? I wonder if the uncensored output could give me more clues.

Second, it appears to be a problem creating the database (keystone db sync?). How can I test the database execution inside the LXC containers? I tried to log into one of the containers and ping the host's IP and it works, so they have connectivity.

I set up the passwords with:

    # cd /opt/openstack-ansible
    # ./scripts/pw-token-gen.py --file /etc/openstack_deploy/user_secrets.yml

Any help?

Best Regards.

--
__________________________________
João Marcelo Uchôa de Alencar
jmarcelo.alencar(at)gmail.com
__________________________________
Hi –

The ansible command to test the DB hits the Galera container directly, while the Ansible playbooks are likely using the VIP managed by HAproxy. I suspect that HAproxy has not started properly or is otherwise not serving traffic directed toward the internal_lb_vip_address.

My suggestion at the moment is to check out the logs on the haproxy node to see if it’s working properly, and try testing connectivity from the deploy node via 172.29.236.101:3306. The haproxy logs will likely provide some insight here.

--
James Denton
Principal Architect
Rackspace Private Cloud - OpenStack
james.denton@rackspace.com

From: jmarcelo.alencar@gmail.com <jmarcelo.alencar@gmail.com>
Date: Friday, January 20, 2023 at 6:45 AM
To: openstack-discuss@lists.openstack.org <openstack-discuss@lists.openstack.org>
Subject: [openstack-ansible] Installing OpenStack with Ansible fails during Keystone playbook on TASK openstack.osa.db_setup

> Hello Community, [...]
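A minimal sketch of the connectivity test suggested in the reply, assuming Python 3 is available on the deploy node. The function names `classify_probe` and `probe` are hypothetical and not part of OpenStack-Ansible. It relies on the fact that a MySQL/MariaDB server speaks first and sends a greeting packet, so a port that accepts a TCP connection but sends nothing is probably not being served by a healthy database backend.

```python
import socket


def classify_probe(connected: bool, banner: bytes) -> str:
    """Interpret the outcome of a plain TCP probe against a MySQL/MariaDB port."""
    if not connected:
        return "unreachable"        # TCP connection could not be established
    if banner:
        return "server-greeting"    # the server sent its handshake bytes
    return "no-greeting"            # connected, but nothing was received


def probe(host: str, port: int = 3306, timeout: float = 5.0) -> str:
    """Open a TCP connection and wait briefly for the server greeting."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.settimeout(timeout)
            try:
                banner = sock.recv(128)
            except OSError:
                # Timeout or reset while waiting: treat as "nothing received".
                banner = b""
    except OSError:
        return "unreachable"
    return classify_probe(True, banner)
```

For example, `probe("172.29.236.101")` run from the deploy node tests the internal VIP from this thread. A result of "no-greeting" would point at the load balancer or its backend, not at basic network reachability.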
Hi James Denton,

Thanks for your quick response!!!

So as far as I understand, running "openstack-ansible setup-openstack.yml" will start a keystone installation TASK that connects to HAProxy, which in turn sends the connection to the galera container. The machine targethost01 runs both the containers and HAProxy. From deploymenthost, there is some connectivity to HAProxy:

    root@deploymenthost:/opt/openstack-ansible/playbooks# telnet 172.29.236.101 3306
    Trying 172.29.236.101...
    Connected to 172.29.236.101.
    Escape character is '^]'.
    Connection closed by foreign host.

It appears that HAProxy is listening, but cannot provide a proper reply, so the connection closes. Following your suggestion, on targethost01, HAProxy is running, but complains about no galera backend:

    root@targethost01:~# systemctl status haproxy.service
    ● haproxy.service - HAProxy Load Balancer
         Loaded: loaded (/lib/systemd/system/haproxy.service; enabled; vendor preset: enabled)
         Active: active (running) since Fri 2023-01-20 11:35:40 -03; 33min ago
           Docs: man:haproxy(1)
                 file:/usr/share/doc/haproxy/configuration.txt.gz
        Process: 276870 ExecStartPre=/usr/sbin/haproxy -Ws -f $CONFIG -c -q $EXTRAOPTS (code=exited, status=0/SUCCESS)
       Main PID: 276873 (haproxy)
          Tasks: 5 (limit: 8192)
         Memory: 13.1M
            CPU: 2.165s
         CGroup: /system.slice/haproxy.service
                 ├─276873 /usr/sbin/haproxy -Ws -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid -S /run/haproxy-master.sock
                 └─276875 /usr/sbin/haproxy -Ws -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid -S /run/haproxy-master.sock

    Jan 20 11:35:48 targethost01 haproxy[276875]: Server nova_console-back/targethost01_nova_api_container-56e92564 is DOWN, reason: Layer4 connection problem, info: "Conn>
    Jan 20 11:35:48 targethost01 haproxy[276875]: backend nova_console-back has no server available!
    Jan 20 11:35:49 targethost01 haproxy[276875]: [WARNING] (276875) : Server placement-back/targethost01_placement_container-90ccebb6 is DOWN, reason: Layer4 connection >
    Jan 20 11:35:49 targethost01 haproxy[276875]: Server placement-back/targethost01_placement_container-90ccebb6 is DOWN, reason: Layer4 connection problem, info: "Connec>
    Jan 20 11:35:49 targethost01 haproxy[276875]: [ALERT] (276875) : backend 'placement-back' has no server available!
    Jan 20 11:35:49 targethost01 haproxy[276875]: backend placement-back has no server available!
    Jan 20 11:35:53 targethost01 haproxy[276875]: [WARNING] (276875) : Server galera-back/targethost01_galera_container-5aa8474a is DOWN, reason: Layer4 timeout, check du>
    Jan 20 11:35:53 targethost01 haproxy[276875]: [ALERT] (276875) : backend 'galera-back' has no server available!
    Jan 20 11:35:53 targethost01 haproxy[276875]: Server galera-back/targethost01_galera_container-5aa8474a is DOWN, reason: Layer4 timeout, check duration: 12001ms. 0 act>
    Jan 20 11:35:53 targethost01 haproxy[276875]: backend galera-back has no server available!

It also warns about the other services, but since they are not installed yet, I believe that it is the expected behavior. But galera should have a functional backend, right? The container is running:

    root@targethost01:~# lxc-ls
    targethost01_cinder_api_container-b7ec9bdd
    targethost01_galera_container-5aa8474a
    targethost01_glance_container-b3ce5a33
    targethost01_heat_api_container-57ec2a00
    targethost01_horizon_container-c99d168e
    targethost01_keystone_container-76e9b31b
    targethost01_memcached_container-8edca03c
    targethost01_neutron_server_container-fba7cb77
    targethost01_nova_api_container-56e92564
    targethost01_placement_container-90ccebb6
    targethost01_rabbit_mq_container-2e5c5470
    targethost01_repo_container-00531c23
    targethost01_utility_container-dc05dc90
    targethost01_zookeeper_container-294429e8
    ubuntu-22-amd64

    root@targethost01:~# lxc-info targethost01_galera_container-5aa8474a
    Name:           targethost01_galera_container-5aa8474a
    State:          RUNNING
    PID:            102446
    IP:             10.0.3.53
    IP:             172.29.238.177
    Link:           5aa8474a_eth0
     TX bytes:      811.30 KiB
     RX bytes:      57.49 MiB
     Total bytes:   58.28 MiB
    Link:           5aa8474a_eth1
     TX bytes:      84.35 KiB
     RX bytes:      1.06 MiB
     Total bytes:   1.14 MiB

I can establish a connection and the server waits for a password:

    root@targethost01:~# telnet 172.29.238.177 3306
    Trying 172.29.238.177...
    Connected to 172.29.238.177.
    Escape character is '^]'.
    u 5.5.5-10.6.10-MariaDB-1:10.6.10+maria~ubu2204-log:8PmS7Y:W'Yn=#6%Vbjmcmysql_native_password

Any hints?

Best regards.

On Fri, Jan 20, 2023 at 11:18 AM James Denton <james.denton@rackspace.com> wrote:
> The ansible command to test the DB hits the Galera container directly, [...]

--
__________________________________
João Marcelo Uchôa de Alencar
jmarcelo.alencar(at)gmail.com
__________________________________
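The HAProxy log lines in this message can be summarized with a short script. The following is a rough sketch, assuming the journal output has been saved as text. `down_servers` is a hypothetical helper name, and the regular expression is written only against the line format shown in this thread. It separates backends reported as "Layer4 connection problem" (consistent with services that are not deployed yet) from the "Layer4 timeout" reported for galera-back, which usually suggests that the health check packets are dropped or never answered.

```python
import re

# Matches e.g. "Server galera-back/<server> is DOWN, reason: Layer4 timeout, ..."
# The reason stops at a comma, a newline, or the ">" that marks a truncated line.
DOWN_RE = re.compile(
    r"Server (?P<backend>[\w.-]+)/(?P<server>[\w.-]+) is DOWN, "
    r"reason: (?P<reason>[^,\n>]+)"
)


def down_servers(log_text: str) -> dict[str, str]:
    """Map 'backend/server' to the first DOWN reason found in the log text."""
    result: dict[str, str] = {}
    for match in DOWN_RE.finditer(log_text):
        key = f"{match['backend']}/{match['server']}"
        # Keep the first reason seen; later lines repeat the same event.
        result.setdefault(key, match["reason"].strip())
    return result
```

Calling `down_servers()` on the saved output of `journalctl -u haproxy` returns one entry per server, which makes it easier to spot a backend whose failure reason differs from the rest.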
Hi, Thanks for the details. The MariaDB/Galera healthcheck occurs on port 9200, which may not be functioning. You can verify that in the /etc/haproxy/haproxy.cfg file. In the Galera container, there is a file, /etc/systemd/system/mariadbcheck.socket, which has the details, including the “allow” list. Might be worth looking at that to ensure the haproxy node IP is allowed. -- James Denton Principal Architect Rackspace Private Cloud - OpenStack james.denton@rackspace.com From: jmarcelo.alencar@gmail.com <jmarcelo.alencar@gmail.com> Date: Friday, January 20, 2023 at 9:20 AM To: James Denton <james.denton@rackspace.com>, openstack-discuss@lists.openstack.org <openstack-discuss@lists.openstack.org> Subject: Re: [openstack-ansible] Installing OpenStack with Ansible fails during Keystone playbook on TASK openstack.osa.db_setup CAUTION: This message originated externally, please use caution when clicking on links or opening attachments! Hi James Denton, Thanks for your quick response!!! So as far as I understand, running "openstack-ansible setup-openstack.yml" will start a keystone installation TASK that connects to HAProxy, which in turn sends the connection to the galera container. The machine targethost01 runs both the containers and HAProxy. From deploymenthost, there is some connectivity to HAProxy: root@deploymenthost:/opt/openstack-ansible/playbooks# telnet 172.29.236.101 3306 Trying 172.29.236.101... Connected to 172.29.236.101. Escape character is '^]'. Connection closed by foreign host. It appears that HAProxy is listening, but cannot provide a proper reply, so the connection closes. 
Following your suggestion, on targethost01, HAProxy is running, but complains about no galera backend: root@targethost01:~# systemctl status haproxy.service ● haproxy.service - HAProxy Load Balancer Loaded: loaded (/lib/systemd/system/haproxy.service; enabled; vendor preset: enabled) Active: active (running) since Fri 2023-01-20 11:35:40 -03; 33min ago Docs: man:haproxy(1) file:/usr/share/doc/haproxy/configuration.txt.gz Process: 276870 ExecStartPre=/usr/sbin/haproxy -Ws -f $CONFIG -c -q $EXTRAOPTS (code=exited, status=0/SUCCESS) Main PID: 276873 (haproxy) Tasks: 5 (limit: 8192) Memory: 13.1M CPU: 2.165s CGroup: /system.slice/haproxy.service ├─276873 /usr/sbin/haproxy -Ws -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid -S /run/haproxy-master.sock └─276875 /usr/sbin/haproxy -Ws -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid -S /run/haproxy-master.sock Jan 20 11:35:48 targethost01 haproxy[276875]: Server nova_console-back/targethost01_nova_api_container-56e92564 is DOWN, reason: Layer4 connection problem, info: "Conn> Jan 20 11:35:48 targethost01 haproxy[276875]: backend nova_console-back has no server available! Jan 20 11:35:49 targethost01 haproxy[276875]: [WARNING] (276875) : Server placement-back/targethost01_placement_container-90ccebb6 is DOWN, reason: Layer4 connection > Jan 20 11:35:49 targethost01 haproxy[276875]: Server placement-back/targethost01_placement_container-90ccebb6 is DOWN, reason: Layer4 connection problem, info: "Connec> Jan 20 11:35:49 targethost01 haproxy[276875]: [ALERT] (276875) : backend 'placement-back' has no server available! Jan 20 11:35:49 targethost01 haproxy[276875]: backend placement-back has no server available! Jan 20 11:35:53 targethost01 haproxy[276875]: [WARNING] (276875) : Server galera-back/targethost01_galera_container-5aa8474a is DOWN, reason: Layer4 timeout, check du> Jan 20 11:35:53 targethost01 haproxy[276875]: [ALERT] (276875) : backend 'galera-back' has no server available! 
Jan 20 11:35:53 targethost01 haproxy[276875]: Server galera-back/targethost01_galera_container-5aa8474a is DOWN, reason: Layer4 timeout, check duration: 12001ms. 0 act> Jan 20 11:35:53 targethost01 haproxy[276875]: backend galera-back has no server available! It also warns about the other services, but since they are not installed yet, I believe that it is the expected behavior. But galera should have a functional backend, right? The container is running: root@targethost01:~# lxc-ls targethost01_cinder_api_container-b7ec9bdd targethost01_galera_container-5aa8474a targethost01_glance_container-b3ce5a33 targethost01_heat_api_container-57ec2a00 targethost01_horizon_container-c99d168e targethost01_keystone_container-76e9b31b targethost01_memcached_container-8edca03c targethost01_neutron_server_container-fba7cb77 targethost01_nova_api_container-56e92564 targethost01_placement_container-90ccebb6 targethost01_rabbit_mq_container-2e5c5470 targethost01_repo_container-00531c23 targethost01_utility_container-dc05dc90 targethost01_zookeeper_container-294429e8 ubuntu-22-amd64 root@targethost01:~# lxc-info targethost01_galera_container-5aa8474a Name: targethost01_galera_container-5aa8474a State: RUNNING PID: 102446 IP: 10.0.3.53 IP: 172.29.238.177 Link: 5aa8474a_eth0 TX bytes: 811.30 KiB RX bytes: 57.49 MiB Total bytes: 58.28 MiB Link: 5aa8474a_eth1 TX bytes: 84.35 KiB RX bytes: 1.06 MiB Total bytes: 1.14 MiB I can establish a connection and the server waits for a password: root@targethost01:~# telnet 172.29.238.177 3306 Trying 172.29.238.177... Connected to 172.29.238.177. Escape character is '^]'. u 5.5.5-10.6.10-MariaDB-1:10.6.10+maria~ubu2204-log:8PmS7Y:W'Yn=#6%Vbjmcmysql_native_password Any hints? Best regards. On Fri, Jan 20, 2023 at 11:18 AM James Denton <james.denton@rackspace.com> wrote:
Hi –
The Ansible command you used to test the DB hits the Galera container directly, while the Ansible playbooks are likely using the VIP managed by HAProxy. I suspect that HAProxy has not started properly or is otherwise not serving traffic directed toward the internal_lb_vip_address.
My suggestion at the moment is to check the logs on the HAProxy node to see whether it is working properly, and to test connectivity from the deploy node to 172.29.236.101:3306. The HAProxy logs will likely provide some insight here.
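The connectivity test suggested above can be scripted. What follows is only a minimal sketch, not part of OpenStack-Ansible: `probe_tcp` is a hypothetical helper name, and the addresses in the usage comment are the VIP and Galera container IP quoted elsewhere in this thread. It opens a TCP connection and reads whatever the server sends first (MariaDB sends a handshake greeting on connect), so it tells you only whether the path is open, not whether authentication would succeed.

```python
import socket


def probe_tcp(host, port, timeout=5.0):
    """Try a TCP connection and read the server's first bytes.

    Returns (reachable, greeting). `greeting` is empty if the server
    accepted the connection but sent nothing before the timeout.
    Any connection-level error (refused, unreachable, reset, timeout
    while connecting) is reported as (False, b"").
    """
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.settimeout(timeout)
            try:
                greeting = sock.recv(128)
            except socket.timeout:
                greeting = b""
            return True, greeting
    except OSError:
        return False, b""


# Example usage from the deploy node (addresses taken from this thread):
#   probe_tcp("172.29.236.101", 3306)   # through the HAProxy VIP
#   probe_tcp("172.29.238.177", 3306)   # Galera container directly
```

If the direct probe succeeds while the probe through the VIP fails or returns no greeting, that would point at the HAProxy side rather than at MariaDB itself.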
--
James Denton
Principal Architect
Rackspace Private Cloud - OpenStack
james.denton@rackspace.com
From: jmarcelo.alencar@gmail.com <jmarcelo.alencar@gmail.com>
Date: Friday, January 20, 2023 at 6:45 AM
To: openstack-discuss@lists.openstack.org <openstack-discuss@lists.openstack.org>
Subject: [openstack-ansible] Installing OpenStack with Ansible fails during Keystone playbook on TASK openstack.osa.db_setup
Hello Community,
I am trying to create a two-machine deployment following the OpenStack-Ansible Deployment Guide (https://docs.openstack.org/project-deploy-guide/openstack-ansible/latest/). The two machines are named targethost01 and targethost02, and I am running Ansible from deploymenthost. Every machine has a 4-core CPU, 8 GB of RAM, and a 240 GB SSD. I am using Ubuntu 22.04.1 LTS.
The machine targethost01 has the following network configuration:
network:
  version: 2
  ethernets:
    enp5s0:
      dhcp4: true
    enp6s0: {}
    enp7s0: {}
    enp8s0: {}
    enp9s0: {}
  vlans:
    vlan.10:
      id: 10
      link: enp6s0
      addresses: [ ]
    vlan.20:
      id: 20
      link: enp7s0
      addresses: [ ]
    vlan.30:
      id: 30
      link: enp8s0
      addresses: [ ]
    vlan.40:
      id: 40
      link: enp9s0
      addresses: [ ]
  bridges:
    br-mgmt:
      addresses: [ 172.29.236.101/22 ]
      mtu: 1500
      interfaces:
        - vlan.10
    br-storage:
      addresses: [ 172.29.244.101/22 ]
      mtu: 1500
      interfaces:
        - vlan.20
    br-vlan:
      addresses: []
      mtu: 1500
      interfaces:
        - vlan.30
    br-vxlan:
      addresses: [ 172.29.240.101/22 ]
      mtu: 1500
      interfaces:
        - vlan.40
And targethost02 has the following network configuration:
network:
  version: 2
  ethernets:
    enp5s0:
      dhcp4: true
    enp6s0: {}
    enp7s0: {}
    enp8s0: {}
    enp9s0: {}
  vlans:
    vlan.10:
      id: 10
      link: enp6s0
      addresses: [ ]
    vlan.20:
      id: 20
      link: enp7s0
      addresses: [ ]
    vlan.30:
      id: 30
      link: enp8s0
      addresses: [ ]
    vlan.40:
      id: 40
      link: enp9s0
      addresses: [ ]
  bridges:
    br-mgmt:
      addresses: [ 172.29.236.102/22 ]
      mtu: 1500
      interfaces:
        - vlan.10
    br-storage:
      addresses: [ 172.29.244.102/22 ]
      mtu: 1500
      interfaces:
        - vlan.20
    br-vlan:
      addresses: []
      mtu: 1500
      interfaces:
        - vlan.30
    br-vxlan:
      addresses: [ 172.29.240.102/22 ]
      mtu: 1500
      interfaces:
        - vlan.40
On the deploymenthost, /etc/openstack_deploy/openstack_user_config.yml has the following:
---
cidr_networks:
  container: 172.29.236.0/22
  tunnel: 172.29.240.0/22
  storage: 172.29.244.0/22

used_ips:
  - 172.29.236.1
  - "172.29.236.100,172.29.236.200"
  - "172.29.240.100,172.29.240.200"
  - "172.29.244.100,172.29.244.200"

global_overrides:
  internal_lb_vip_address: 172.29.236.101
  external_lb_vip_address: "{{ bootstrap_host_public_address | default(ansible_facts['default_ipv4']['address']) }}"
  management_bridge: "br-mgmt"
  provider_networks:
    - network:
        group_binds:
          - all_containers
          - hosts
        type: "raw"
        container_bridge: "br-mgmt"
        container_interface: "eth1"
        container_type: "veth"
        ip_from_q: "container"
        is_container_address: true
    - network:
        group_binds:
          - glance_api
          - cinder_api
          - cinder_volume
          - nova_compute
        type: "raw"
        container_bridge: "br-storage"
        container_type: "veth"
        container_interface: "eth2"
        container_mtu: "9000"
        ip_from_q: "storage"
    - network:
        group_binds:
          - neutron_linuxbridge_agent
        container_bridge: "br-vxlan"
        container_type: "veth"
        container_interface: "eth10"
        container_mtu: "9000"
        ip_from_q: "tunnel"
        type: "vxlan"
        range: "1:1000"
        net_name: "vxlan"
    - network:
        group_binds:
          - neutron_linuxbridge_agent
        container_bridge: "br-vlan"
        container_type: "veth"
        container_interface: "eth11"
        type: "vlan"
        range: "101:200,301:400"
        net_name: "vlan"
    - network:
        group_binds:
          - neutron_linuxbridge_agent
        container_bridge: "br-vlan"
        container_type: "veth"
        container_interface: "eth12"
        host_bind_override: "eth12"
        type: "flat"
        net_name: "flat"

shared-infra_hosts:
  targethost01:
    ip: 172.29.236.101

repo-infra_hosts:
  targethost01:
    ip: 172.29.236.101

coordination_hosts:
  targethost01:
    ip: 172.29.236.101

os-infra_hosts:
  targethost01:
    ip: 172.29.236.101

identity_hosts:
  targethost01:
    ip: 172.29.236.101

network_hosts:
  targethost01:
    ip: 172.29.236.101

compute_hosts:
  targethost01:
    ip: 172.29.236.101
  targethost02:
    ip: 172.29.236.102

storage-infra_hosts:
  targethost01:
    ip: 172.29.236.101

storage_hosts:
  targethost01:
    ip: 172.29.236.101
Also on the deploymenthost, /etc/openstack_deploy/conf.d/haproxy.yml has the following:
haproxy_hosts:
  targethost01:
    ip: 172.29.236.101
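As a sanity check on the addressing in these files, the short sketch below (an illustration, not something the deployment guide prescribes) uses Python's `ipaddress` module to confirm which addresses fall inside the `container` network and which are covered by `used_ips`. It assumes that a `"start,end"` entry in `used_ips` denotes an inclusive range; `in_used_ips` is a hypothetical helper, and the two container addresses are the ones reported elsewhere in this thread.

```python
import ipaddress

# Values copied from openstack_user_config.yml in this thread.
container_net = ipaddress.ip_network("172.29.236.0/22")
used_ips = [
    "172.29.236.1",
    "172.29.236.100,172.29.236.200",
    "172.29.240.100,172.29.240.200",
    "172.29.244.100,172.29.244.200",
]


def in_used_ips(addr, entries):
    """Return True if addr is covered by any used_ips entry.

    Assumption: "start,end" means an inclusive range; a bare address
    reserves only itself.
    """
    ip = ipaddress.ip_address(addr)
    for entry in entries:
        start, _, end = str(entry).partition(",")
        low = ipaddress.ip_address(start.strip())
        high = ipaddress.ip_address(end.strip()) if end else low
        if low <= ip <= high:
            return True
    return False


# Addresses that appear in this thread.
addresses = {
    "internal_lb_vip_address": "172.29.236.101",
    "targethost02": "172.29.236.102",
    "galera container": "172.29.238.177",
    "utility container": "172.29.238.59",
}

# Map each name to (inside container network?, reserved by used_ips?).
report = {
    name: (ipaddress.ip_address(addr) in container_net,
           in_used_ips(addr, used_ips))
    for name, addr in addresses.items()
}

for name, (in_net, reserved) in report.items():
    print(f"{name}: in container network={in_net}, in used_ips={reserved}")
```

A /22 starting at 172.29.236.0 spans 172.29.236.0 through 172.29.239.255, so the host addresses and the container addresses in 172.29.238.x all belong to the same management network; only the host addresses sit inside the reserved ranges.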
At the Run Playbooks step of the guide, the following two Ansible commands return with unreachable=0 failed=0:
# openstack-ansible setup-hosts.yml
# openstack-ansible setup-infrastructure.yml
And verifying the database also returns no error:
root@deploymenthost:/opt/openstack-ansible/playbooks# ansible galera_container -m shell \
    -a "mysql -h localhost -e 'show status like \"%wsrep_cluster_%\";'"
Variable files: "-e @/etc/openstack_deploy/user_secrets.yml -e @/etc/openstack_deploy/user_variables.yml "
[WARNING]: Unable to parse /etc/openstack_deploy/inventory.ini as an inventory source
targethost01_galera_container-5aa8474a | CHANGED | rc=0 >>
Variable_name                 Value
wsrep_cluster_weight          1
wsrep_cluster_capabilities
wsrep_cluster_conf_id         1
wsrep_cluster_size            1
wsrep_cluster_state_uuid      e7a0c332-97fe-11ed-b0d4-26b30049826d
wsrep_cluster_status          Primary
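For repeated checks, the status output above can be parsed rather than read by eye. The following is a rough sketch under stated assumptions: `parse_wsrep_status` and `cluster_is_healthy` are hypothetical helper names, and "healthy" is taken here to mean only that `wsrep_cluster_status` is `Primary` and `wsrep_cluster_size` matches the expected node count (1 in this deployment). It says nothing about whether the database is reachable through the load balancer.

```python
def parse_wsrep_status(text):
    """Parse 'show status' output into a dict of wsrep_* variables.

    Lines that do not start with 'wsrep_' (headers, Ansible banners)
    are ignored; variables with no value map to an empty string.
    """
    status = {}
    for line in text.splitlines():
        parts = line.split(None, 1)
        if not parts or not parts[0].startswith("wsrep_"):
            continue
        status[parts[0]] = parts[1].strip() if len(parts) > 1 else ""
    return status


def cluster_is_healthy(status, expected_size=1):
    """True if the node is in a Primary component of the expected size."""
    return (
        status.get("wsrep_cluster_status") == "Primary"
        and status.get("wsrep_cluster_size") == str(expected_size)
    )


# Sample taken from the command output quoted in this message.
sample = """Variable_name\tValue
wsrep_cluster_weight\t1
wsrep_cluster_capabilities\t
wsrep_cluster_conf_id\t1
wsrep_cluster_size\t1
wsrep_cluster_state_uuid\te7a0c332-97fe-11ed-b0d4-26b30049826d
wsrep_cluster_status\tPrimary
"""

status = parse_wsrep_status(sample)
print(cluster_is_healthy(status))
```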
But when I execute openstack-ansible setup-openstack.yml, I get this:
TASK [os_keystone : Fact for apache module mod_auth_openidc to be installed] ***
ok: [targethost01_keystone_container-76e9b31b]

TASK [include_role : openstack.osa.db_setup] ***********************************

TASK [openstack.osa.db_setup : Create database for service] ********************
failed: [targethost01_keystone_container-76e9b31b -> targethost01_utility_container-dc05dc90(172.29.238.59)] (item=None) => {"censored": "the output has been hidden due to the fact that 'no_log: true' was specified for this result", "changed": false}
fatal: [targethost01_keystone_container-76e9b31b -> {{ _oslodb_setup_host }}]: FAILED! => {"censored": "the output has been hidden due to the fact that 'no_log: true' was specified for this result", "changed": false}

PLAY RECAP *********************************************************************
targethost01_keystone_container-76e9b31b : ok=33 changed=0 unreachable=0 failed=1 skipped=8 rescued=0 ignored=0
targethost01_utility_container-dc05dc90 : ok=3 changed=0 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0

EXIT NOTICE [Playbook execution failure] **************************************
===============================================================================
First, how can I disable the "censored" warning? I wonder if the uncensored output could give me more clues. Second, it appears to be a problem creating the database (keystone db sync?). How can I test the database from inside the LXC containers? I logged into one of the containers and pinged the host's IP, and it works, so they have connectivity. I set up the passwords with:
# cd /opt/openstack-ansible
# ./scripts/pw-token-gen.py --file /etc/openstack_deploy/user_secrets.yml
Any help?
Best Regards.
-- __________________________________
João Marcelo Uchôa de Alencar jmarcelo.alencar(at)gmail.com __________________________________
Participants (2): James Denton, jmarcelo.alencar@gmail.com