Re: Re: [openstack-ansible] OpenStack Ansible deployment fails due to lxc containers not having network connection
Hi Jonathan, thank you for your reply! I probably should have specified that I already changed some default values in /etc/ansible/roles/lxc_hosts/defaults/main.yml to prevent a conflict with my storage network. Here's the part that I changed:
```
lxc_net_address: 10.255.255.1
lxc_net_netmask: 255.255.255.0
lxc_net_dhcp_range: 10.255.255.2,10.255.255.253
```
Could there be some other reference to the original default address range which causes the error?
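As a sanity check on the address ranges involved, the conflict Jonathan pointed out can be verified programmatically. This is a minimal sketch (not part of OSA itself), assuming the stock lxc_net default of 10.0.3.0/24 from the lxc_hosts defaults file linked below, and using the `cidr_networks` values from the `openstack_user_config.yml` quoted later in this thread:

```python
import ipaddress

# Deployment networks as declared in openstack_user_config.yml
cidr_networks = {
    "container": "192.168.110.0/24",
    "tunnel": "192.168.32.0/24",
    "storage": "10.0.3.0/24",
}

def conflicts(lxc_net_cidr):
    """Return the names of deployment networks that overlap the lxc_net range."""
    lxc_net = ipaddress.ip_network(lxc_net_cidr)
    return [name for name, cidr in cidr_networks.items()
            if lxc_net.overlaps(ipaddress.ip_network(cidr))]

print(conflicts("10.0.3.0/24"))      # stock lxc_net default -> ['storage']
print(conflicts("10.255.255.0/24"))  # overridden range -> []
```

With the overridden 10.255.255.0/24 range there is no overlap, so the remaining question is whether the override is actually being applied.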
I'm also confused about dnsmasq: running 'apt-get install dnsmasq', I discovered that it wasn't yet installed on the infra host (though installing it didn't solve the problem either). Moreover, I couldn't find dnsmasq among the prerequisites in the OSA deployment guide.
Kind regards, Oliver
On 03/09/2020 18:38, openstack-discuss-request@lists.openstack.org wrote:
Message: 1
Date: Thu, 3 Sep 2020 16:51:51 +0100
From: Jonathan Rosser <jonathan.rosser@rd.bbc.co.uk>
To: openstack-discuss@lists.openstack.org
Subject: Re: [openstack-ansible] OpenStack Ansible deployment fails due to lxc containers not having network connection
Message-ID: <e6746294-ada6-0f29-cdfb-59538396a0e0@rd.bbc.co.uk>
Content-Type: text/plain; charset=utf-8; format=flowed
Hi Oliver,
The default route would normally be via eth0 in the container, which I suspect has some issue.
This is given an address by dnsmasq/DHCP on the host and attached to lxcbr0, which is where I would start to look. I am also seeing that the default address range used for eth0 is in conflict with your storage network, so perhaps this is also something to look at. See https://github.com/openstack/openstack-ansible-lxc_hosts/blob/master/defaults/main.yml#L104
You can join us on IRC at #openstack-ansible for some 'real-time' assistance if necessary.
Regards, Jonathan.
On 03/09/2020 16:18, Oliver Wenz wrote:
I'm trying to deploy OpenStack Ansible. When running the first playbook ```openstack-ansible setup-hosts.yml```, there are errors for all containers during the task ```[openstack_hosts : Remove the blacklisted packages]``` (see below) and the playbook fails.
```
fatal: [infra1_repo_container-1f1565cd]: FAILED! => {"changed": false, "cmd": "apt-get update", "rc": 100,
  "msg": "E: The repository 'http://ubuntu.mirror.lrz.de/ubuntu bionic Release' no longer has a Release file.
          E: The repository 'http://ubuntu.mirror.lrz.de/ubuntu bionic-updates Release' no longer has a Release file.
          E: The repository 'http://ubuntu.mirror.lrz.de/ubuntu bionic-backports Release' no longer has a Release file.
          E: The repository 'http://ubuntu.mirror.lrz.de/ubuntu bionic-security Release' no longer has a Release file.",
  "stdout_lines": [
    "Ign:1 http://ubuntu.mirror.lrz.de/ubuntu bionic InRelease",
    "Ign:2 http://ubuntu.mirror.lrz.de/ubuntu bionic-updates InRelease",
    "Ign:3 http://ubuntu.mirror.lrz.de/ubuntu bionic-backports InRelease",
    "Ign:4 http://ubuntu.mirror.lrz.de/ubuntu bionic-security InRelease",
    "Err:5 http://ubuntu.mirror.lrz.de/ubuntu bionic Release",
    "  Cannot initiate the connection to 192.168.100.6:8000 (192.168.100.6). - connect (101: Network is unreachable)",
    "Err:6 http://ubuntu.mirror.lrz.de/ubuntu bionic-updates Release",
    "  Cannot initiate the connection to 192.168.100.6:8000 (192.168.100.6). - connect (101: Network is unreachable)",
    "Err:7 http://ubuntu.mirror.lrz.de/ubuntu bionic-backports Release",
    "  Cannot initiate the connection to 192.168.100.6:8000 (192.168.100.6). - connect (101: Network is unreachable)",
    "Err:8 http://ubuntu.mirror.lrz.de/ubuntu bionic-security Release",
    "  Cannot initiate the connection to 192.168.100.6:8000 (192.168.100.6). - connect (101: Network is unreachable)",
    "Reading package lists..."
  ]}
```
When I attach to any container and run ```ping 192.168.100.6``` (local DNS), I get the same error (```connect: Network is unreachable```). However, when I specify an interface by running ```ping -I eth1 192.168.100.6``` there is a successful connection. Running ```ip r``` on the infra_cinder container yields:
```
10.0.3.0/24 dev eth2 proto kernel scope link src 10.0.3.5
192.168.110.0/24 dev eth1 proto kernel scope link src 192.168.110.232
```
so there seems to be no default route, which is why the connection fails (similar for the other infra containers). Shouldn't OSA configure this automatically? I didn't find anything regarding a default route on containers in the docs.
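The missing-route symptom can be spotted directly from the `ip r` output: any destination outside the two connected subnets is unreachable unless a `default` entry exists. A minimal illustrative sketch, using the routing table quoted above as input (the actual fix would be on the host/OSA side, not inside the container):

```python
# Routing table as printed by `ip r` in the infra_cinder container
routes = """\
10.0.3.0/24 dev eth2 proto kernel scope link src 10.0.3.5
192.168.110.0/24 dev eth1 proto kernel scope link src 192.168.110.232
"""

# A default route line would start with the word "default"
has_default = any(line.startswith("default") for line in routes.splitlines())
print(has_default)  # False -> traffic to 192.168.100.6 has no route at all
```

With no `default` entry, `ping 192.168.100.6` fails with "Network is unreachable", while `ping -I eth1` works only because the kernel is forced onto a connected interface.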
Here's my openstack_user_config.yml:
```
cidr_networks:
  container: 192.168.110.0/24
  tunnel: 192.168.32.0/24
  storage: 10.0.3.0/24

used_ips:
  - "192.168.110.1,192.168.110.2"
  - "192.168.110.111"
  - "192.168.110.115"
  - "192.168.110.117,192.168.110.118"
  - "192.168.110.131,192.168.110.140"
  - "192.168.110.201,192.168.110.207"
  - "192.168.32.1,192.168.32.2"
  - "192.168.32.201,192.168.32.207"
  - "10.0.3.1"
  - "10.0.3.11,10.0.3.14"
  - "10.0.3.21,10.0.3.24"
  - "10.0.3.31,10.0.3.42"
  - "10.0.3.201,10.0.3.207"

global_overrides:
  # The internal and external VIP should be different IPs, however they
  # do not need to be on separate networks.
  external_lb_vip_address: 192.168.100.168
  internal_lb_vip_address: 192.168.110.201
  management_bridge: "br-mgmt"
  provider_networks:
    - network:
        container_bridge: "br-mgmt"
        container_type: "veth"
        container_interface: "eth1"
        ip_from_q: "container"
        type: "raw"
        group_binds:
          - all_containers
          - hosts
        is_container_address: true
    - network:
        container_bridge: "br-vxlan"
        container_type: "veth"
        container_interface: "eth10"
        ip_from_q: "tunnel"
        type: "vxlan"
        range: "1:1000"
        net_name: "vxlan"
        group_binds:
          - neutron_linuxbridge_agent
    - network:
        container_bridge: "br-ext1"
        container_type: "veth"
        container_interface: "eth12"
        host_bind_override: "eth12"
        type: "flat"
        net_name: "ext_net"
        group_binds:
          - neutron_linuxbridge_agent
    - network:
        container_bridge: "br-storage"
        container_type: "veth"
        container_interface: "eth2"
        ip_from_q: "storage"
        type: "raw"
        group_binds:
          - glance_api
          - cinder_api
          - cinder_volume
          - nova_compute
          - swift-proxy

###
### Infrastructure
###

# galera, memcache, rabbitmq, utility
shared-infra_hosts:
  infra1:
    ip: 192.168.110.201

# repository (apt cache, python packages, etc)
repo-infra_hosts:
  infra1:
    ip: 192.168.110.201

# load balancer
haproxy_hosts:
  infra1:
    ip: 192.168.110.201

###
### OpenStack
###

os-infra_hosts:
  infra1:
    ip: 192.168.110.201

identity_hosts:
  infra1:
    ip: 192.168.110.201

network_hosts:
  infra1:
    ip: 192.168.110.201

compute_hosts:
  compute1:
    ip: 192.168.110.204
  compute2:
    ip: 192.168.110.205
  compute3:
    ip: 192.168.110.206
  compute4:
    ip: 192.168.110.207

storage-infra_hosts:
  infra1:
    ip: 192.168.110.201

storage_hosts:
  lvm-storage1:
    ip: 192.168.110.202
    container_vars:
      cinder_backends:
        lvm:
          volume_backend_name: LVM_iSCSI
          volume_driver: cinder.volume.drivers.lvm.LVMVolumeDriver
          volume_group: cinder_volumes
          iscsi_ip_address: "{{ cinder_storage_address }}"
        limit_container_types: cinder_volume
```
I also asked this question on the server fault stackexchange: https://serverfault.com/questions/1032573/openstack-ansible-deployment-fails...
Kind regards, Oliver
Hi Oliver,
The dnsmasq dependency will be pulled in by lxc, which in turn needs lxc-utils, which then wants dnsmasq-base, as you can see here: https://packages.ubuntu.com/bionic/lxc-utils. You will not find LXC itself as a prerequisite in the documentation because the setup is handled completely by the lxc_hosts ansible role.
For openstack-ansible it is not a good idea to adjust the variables in /etc/ansible/roles/..., because those repositories will be overwritten any time you do a minor/major upgrade.
There is a reference here https://docs.openstack.org/openstack-ansible/latest/reference/configuration/... for overriding variables, and the most common starting point would be to create /etc/openstack_deploy/user_variables.yml and put your customization there.
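For example, the same overrides could be moved verbatim from the role defaults into that file (values taken from the earlier message in this thread); this is a sketch of the idea, not a complete user_variables.yml:

```
# /etc/openstack_deploy/user_variables.yml
lxc_net_address: 10.255.255.1
lxc_net_netmask: 255.255.255.0
lxc_net_dhcp_range: 10.255.255.2,10.255.255.253
```

Variables set here take precedence over the role defaults and survive upgrades, since /etc/openstack_deploy is not touched when the role repositories are re-cloned.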
I would recommend always building an All-In-One deployment in a virtual machine so that you have a reference to compare against when moving away from the 'stock config'. Documentation for the AIO can be found here https://docs.openstack.org/openstack-ansible/ussuri/user/aio/quickstart.html
Regards,
Jonathan.
On 21/09/2020 10:11, Oliver Wenz wrote:
Hi Jonathan, thank you for your reply! I probably should have specified that I already changed some default values in /etc/ansible/roles/lxc_hosts/defaults/main.yml to prevent a conflict with my storage network. Here's the part that I changed:
```
lxc_net_address: 10.255.255.1
lxc_net_netmask: 255.255.255.0
lxc_net_dhcp_range: 10.255.255.2,10.255.255.253
```
Could there be some other reference to the original default address range which causes the error?
I'm also confused about dnsmasq: running 'apt-get install dnsmasq', I discovered that it wasn't yet installed on the infra host (though installing it didn't solve the problem either). Moreover, I couldn't find dnsmasq among the prerequisites in the OSA deployment guide.
Kind regards, Oliver
participants (2)
- Jonathan Rosser
- Oliver Wenz