[Bobcat][Ansible][AIO] how to inspect lxc logs
Hi OpenStack Users Group,
I am building an OpenStack-Ansible AIO (Bobcat) environment on a single AWS EC2 instance. We are trying to build an all-in-one environment and check the functions of Bobcat.
Construction steps: https://docs.openstack.org/openstack-ansible/latest/user/aio/quickstart.html
* I can only use VLANs within the single EC2 instance; this is not a multi-instance setup.
Ansible completed successfully, and I can access and log in to the Horizon console with a browser. I have been able to register an image, but I get an error when I create a volume from that image.
* Even if I create an empty volume, the same error occurs.
Message displayed when volume creation fails:
Message Level: ERROR
Event Id: VOLUME_VOLUME_001_003
User Message: "all schedulelocate volume:Could not find any available weighted backend."
When I tried to look at the LXC logs to debug the environment, I found that under /var/log/lxc there is only the log from the first startup, and there are no debug messages from the services that are actually running. The lxc-* commands do not have an equivalent of "docker logs", so I am having trouble analyzing logs such as Cinder's.
The environment was built with Ansible with "debug: True" defined in /etc/openstack_deploy/user_variables.yml. Is there any other configuration needed to output debug logs for each component? Any advice on what to look at when analyzing each component built with LXC would be appreciated.
My environment is as follows:
* Instance type: m5a.2xlarge
* 8 vCPUs, 32 GiB RAM, EBS storage 100 GB (root) and 50 GB (second)
* OS: Ubuntu jammy 22.04
* OpenStack-Ansible: 28.0.1 (Bobcat)
Thanks and regards.
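For reference, the debug setting mentioned above is defined roughly like this (a minimal sketch of the user_variables.yml entry, following the usual OSA override convention):
```yaml
# /etc/openstack_deploy/user_variables.yml (minimal sketch)
# Global flag that the OpenStack-Ansible roles render into each service's
# configuration to enable debug-level logging.
debug: True
```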
Hey!
I can guess that there's some config issue with the default iSCSI driver for cinder-volume that results in the failure you see. I recall patching something similar a while back, but maybe it was not enough... I would actually try adding `ceph` to the SCENARIO (in case you're considering using Ceph later on) to spawn a Ceph cluster as backing storage for Cinder; the flavor you've picked should be enough to handle that.
Regarding logs: we try to write all logs to journald. So one way to get logs for cinder-volume could be:
lxc-attach -n $(lxc-ls -1 | grep cinder_volume) -- journalctl -u cinder-volume
Or you can just run `lxc-attach -n $(lxc-ls -1 | grep cinder_volume)` and inspect the container once attached. LXC containers, in contrast to Docker, are system containers, not application ones.
Journal logs are also passed to the host under /var/log/journal/ - each container has its own directory based on the container's machine-id. However, I can't recall right away how to map these folders to a specific container, except for something like:
CINDER_MACHINE=$(lxc-attach -n $(lxc-ls -1 | grep cinder_volume) -- cat /etc/machine-id)
journalctl --directory /var/log/journal/$CINDER_MACHINE -f
Hope this helps.
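If it helps, the same machine-id trick can be generalized into a small loop; a rough sketch (using the default paths mentioned above, not verified on your host):
```bash
# Sketch: list every LXC container on the host together with its journal
# directory under /var/log/journal/, using the container's machine-id.
for c in $(lxc-ls -1); do
    mid=$(lxc-attach -n "$c" -- cat /etc/machine-id 2>/dev/null)
    [ -n "$mid" ] && echo "$c -> /var/log/journal/$mid"
done
```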
Also, just in case, you can try a "metal" deployment that does not involve any containers, for simplicity of setup. For that, you can replace `lxc` with `metal` in the SCENARIO for bootstrap-aio.sh.
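A rough sketch of what that change looks like (the scenario string is just an example; adjust it to whatever you are currently passing to bootstrap-aio.sh):
```bash
# Sketch: re-run the AIO bootstrap with a containerless scenario.
# "aio_metal" is the metal counterpart of the default "aio_lxc" scenario.
export SCENARIO='aio_metal'
/opt/openstack-ansible/scripts/bootstrap-aio.sh
```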
Dmitriy, thank you for the information; I can now reach the logs using journalctl.
I am facing the next problem. I rebuilt the environment from scratch, starting with EC2 instance creation, using LXC and Ansible AIO, and I get the error below:
# openstack-ansible setup-infrastructure.yml
:
TASK [openstack.osa.glusterfs : Create gluster peers] *******************************************************************
task path: /etc/ansible/ansible_collections/openstack/osa/roles/glusterfs/tasks/main.yml:101
container_name: "aio1_repo_container-1f209dad"
physical_host: "aio1"
Container confirmed
Using module file /etc/ansible/ansible_collections/gluster/gluster/plugins/modules/gluster_peer.py
Pipelining is enabled.
<aio1> ESTABLISH SSH CONNECTION FOR USER: root
<aio1> SSH: EXEC ssh -C -o ControlMaster=auto -o ControlPersist=300 -o StrictHostKeyChecking=no -o Port=22 -o KbdInteractiveAuthentication=no -o PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey -o PasswordAuthentication=no -o ConnectTimeout=5 -o UserKnownHostsFile=/dev/null -o StrictHostKeyChecking=no -o ServerAliveInterval=64 -o ServerAliveCountMax=1024 -o Compression=no -o TCPKeepAlive=yes -o VerifyHostKeyDNS=no -o ForwardX11=no -o ForwardAgent=yes -T -o 'ControlPath="/root/.ansible/cp/270ca4df34"' aio1 'lxc-attach --clear-env --name aio1_repo_container-1f209dad -- su - root -c '"'"'/bin/sh -c '"'"'"'"'"'"'"'"'/usr/bin/python3 && sleep 0'"'"'"'"'"'"'"'"''"'"''
<aio1> (1, b'\n{"changed": false, "rc": 1, "failed": true, "msg": "peer probe: failed: Probe returned with Transport endpoint is not connected\\n", "invocation": {"module_args": {"nodes": ["aio1-repo-container-1f209dad"], "state": "present", "force": null}}}\n', b'/tmp/ansible_gluster.gluster.gluster_peer_payload_3dl9voiq/ansible_gluster.gluster.gluster_peer_payload.zip/ansible_collections/gluster/gluster/plugins/modules/gluster_peer.py:81: DeprecationWarning: The distutils package is deprecated and slated for removal in Python 3.12. Use setuptools or check PEP 632 for potential alternatives\n')
<aio1> Failed to connect to the host via ssh: /tmp/ansible_gluster.gluster.gluster_peer_payload_3dl9voiq/ansible_gluster.gluster.gluster_peer_payload.zip/ansible_collections/gluster/gluster/plugins/modules/gluster_peer.py:81: DeprecationWarning: The distutils package is deprecated and slated for removal in Python 3.12. Use setuptools or check PEP 632 for potential alternatives
fatal: [aio1_repo_container-1f209dad]: FAILED! => {
    "changed": false,
    "invocation": {
        "module_args": {
            "force": null,
            "nodes": [
                "aio1-repo-container-1f209dad"
            ],
            "state": "present"
        }
    },
    "msg": "peer probe: failed: Probe returned with Transport endpoint is not connected\n",
    "rc": 1
}
NOTIFIED HANDLER systemd_service : Restart changed services for aio1_repo_container-1f209dad
--------------------------------------------------------------------------------------------------------
My construction steps are below:
1. Create the instance
* m5a.2xlarge / SSD 100 GB (root) and 50 GB (second) / Ubuntu 22.04
2. Create the VLAN networks
# vi /etc/netplan/50-cloud-init.yaml
Create VLANs 10 - 40.
3. Install packages
# apt update && apt dist-upgrade -y && apt install -y build-essential git chrony openssh-server python3-dev sudo bridge-utils debootstrap tcpdump vlan python3 linux-modules-extra-$(uname -r)
# reboot
4. Check with lsblk and create two partitions on the second disk
# fdisk /dev/nvme1n1
n p 1 2048 20971520
n p 2 20973568 104857599
p
w
# pvcreate --metadatasize 2048 /dev/nvme1n1p1
# vgcreate cinder-volumes /dev/nvme1n1p1
5. Clone OpenStack-Ansible
# git clone -b 28.0.1 https://opendev.org/openstack/openstack-ansible /opt/openstack-ansible
# cd /opt/openstack-ansible/
# scripts/bootstrap-ansible.sh
6. Configuration for the infrastructure
# cp -ra /opt/openstack-ansible/etc/openstack_deploy /etc/openstack_deploy
# cp etc/openstack_deploy/conf.d/{aodh,gnocchi,ceilometer}.yml.aio /etc/openstack_deploy/conf.d/
# for f in $(ls -1 /etc/openstack_deploy/conf.d/*.aio); do mv -v ${f} ${f%.*}; done
# cp /etc/openstack_deploy/openstack_user_config.yml.test.example /etc/openstack_deploy/openstack_user_config.yml
# export BOOTSTRAP_OPTS="bootstrap_host_data_disk_device=nvme1n1"
# export BOOTSTRAP_OPTS="${BOOTSTRAP_OPTS} bootstrap_host_data_disk_fs_type=lvm"
# export BOOTSTRAP_OPTS="bootstrap_host_public_interface=ens5"
# export SCENARIO='aio_lxc_barbican_ceph_lxb'
# vi ./tests/roles/bootstrap-host/tasks/prepare_data_disk.yml
Correct the device names at the end of lines 66 and 76 to add the "p" prefix (p1 and p2):
: 66 dev: "/dev/{{ _bootstrap_host_data_disk_device }}p1"
: 76 dev: "/dev/{{ _bootstrap_host_data_disk_device }}p2"
# vi tests/roles/bootstrap-host/tasks/prepare_networking.yml
Change lines 207 and 210 below, and the IP addresses ending in .100 to .10:
: 136 address: "172.29.244.10"
: 173 address: "172.29.240.10"
: 207 - 172.29.244.10 # br-storage
: 210 - 172.29.240.10 # br-vxlan
# scripts/bootstrap-aio.sh
Wait 5 - 10 min...
7. Set up /etc/hosts
# vi /etc/hosts
172.29.236.11 infra1
172.29.240.11 infra1
172.29.244.11 infra1
172.29.236.12 compute1
172.29.240.12 compute1
172.29.244.12 compute1
172.29.236.13 storage1
172.29.240.13 storage1
172.29.244.13 storage1
8. Test SSH logins
# ssh aio1
yes
$ exit
# ssh external
yes
$ exit
# ssh infra1
yes
$ exit
# ssh compute1
yes
$ exit
# ssh storage1
9. Create passwords
# cd /opt/openstack-ansible
# ./scripts/pw-token-gen.py --file /etc/openstack_deploy/user_secrets.yml
Operation Complete, [ /etc/openstack_deploy/user_secrets.yml ] is ready
#
10. Run the setup-hosts playbook
# cd /opt/openstack-ansible/playbooks
# openstack-ansible setup-hosts.yml
Wait 25 - 30 min...
11. Copy the original inventory and replace the interface names
# cp /etc/openstack_deploy/openstack_inventory.json /etc/openstack_deploy/openstack_inventory.json.org
# sed -e "s/eth1/ens5/g" /etc/openstack_deploy/openstack_inventory.json > /etc/openstack_deploy/openstack_inventory.json.eth1
# sed -e "s/eth2/ens5/g" /etc/openstack_deploy/openstack_inventory.json.eth1 > /etc/openstack_deploy/openstack_inventory.json.eth2
# cp /etc/openstack_deploy/openstack_inventory.json.eth2 /etc/openstack_deploy/openstack_inventory.json
12. Run the setup-infrastructure playbook
# openstack-ansible setup-infrastructure.yml -e ansible_distribution=Ubuntu -e ansible_distribution_major_version="22" -e ansible_distribution_version="22.04" -e ansible_os_family="Debian" -e ansible_pkg_mgr="apt" -vvv
After about 5 min, the error above occurs.
------------------------
Also, my testing was as follows:
# vi /etc/ansible/ansible_collections/gluster/gluster/plugins/modules/gluster_peer.py
Insert the following on line 80 and save:
: 80 import warnings
: 81 warnings.filterwarnings('ignore', category=DeprecationWarning)
# openstack-ansible setup-infrastructure.yml -e ansible_distribution=Ubuntu -e ansible_distribution_major_version="22" -e ansible_distribution_version="22.04" -e ansible_os_family="Debian" -e ansible_pkg_mgr="apt" -vvv
The DeprecationWarning has disappeared, but the message below appears again.
```
"msg": "peer probe: failed: Probe returned with Transport endpoint is not connected\n",
```
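A rough sketch of checks that could be run inside the repo container before re-running the playbook (the container name is taken from the failed task above; the glusterd unit name is assumed from the stock Ubuntu glusterfs-server package):
```bash
# Sketch: verify glusterd is running and that the peer name resolves
# inside the repo container (names taken from the failed task above).
lxc-attach -n aio1_repo_container-1f209dad -- systemctl status glusterd
lxc-attach -n aio1_repo_container-1f209dad -- getent hosts aio1-repo-container-1f209dad
```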
An additional question:
As shown below, when I run the "gluster peer probe" command in the repo container, the same failure occurs. I found the hostname "aio1-repo-container-8ee985f4" in /var/log/glusterfs/cmd_history.log. When I look this name up with nslookup against the DNS server "172.29.238.62", and then exit the container and look that IP address up in the host's /etc/hosts, the host names (aio1-unbound-container) do not appear to match. This looks like a problem with resolvconf? Does anyone know what it is?
regards.
----
root@aio1-repo-container-8ee985f4:/# gluster peer probe aio1-repo-container-8ee985f4
peer probe: failed: Probe returned with Transport endpoint is not connected
root@aio1-repo-container-8ee985f4:/#
root@aio1-repo-container-8ee985f4:/# nslookup aio1_repo_container-8ee985f4
Server: 172.29.238.62
Address: 172.29.238.62#53
** server can't find aio1_repo_container-8ee985f4: NXDOMAIN
root@aio1-repo-container-8ee985f4:/# exit
exit
# cat /etc/hosts | grep 172.29.238.62
172.29.238.62 aio1-unbound-container-ff4b869c.openstack.local aio1-unbound-container-ff4b869c aio1_unbound_container-ff4b869c
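A quick way to cross-check where the container's resolver is pointing and whether the hyphenated name resolves, as a rough sketch (container and host names are the ones from the transcript above):
```bash
# Sketch: inside the repo container, show the configured resolver and try
# the hyphenated container hostname (underscores are not valid in DNS names).
lxc-attach -n aio1_repo_container-8ee985f4 -- cat /etc/resolv.conf
lxc-attach -n aio1_repo_container-8ee985f4 -- nslookup aio1-repo-container-8ee985f4
# On the host, see which names map to the same address in /etc/hosts.
grep 172.29.238.62 /etc/hosts
```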
Hey,
Sorry for the delay with a reply. Frankly speaking, I'm slightly confused now. So I assume you have added `unbound_hosts` to your openstack_user_config.yml?
Did you happen to delete the /etc/openstack_deploy/openstack_inventory.json file without deleting all containers in advance? Such an action results in the inventory being re-generated from scratch, meaning new container names and IP mappings, so they can overlap as a result. And yes, gluster would totally not like such an overlap. But otherwise I'm a bit confused about how or why records in /etc/hosts would overlap. Though, frankly speaking, I haven't run an unbound scenario here for years now, and it might need some love. There could be a quite valid bug related to unbound as well.
Also, I'm not sure that bootstrap-aio.sh really does the right things for a multi-node deployment. It gives some "reference" for 1-node environments that can be scaled up, but it also makes a couple of opinionated decisions, like creating local loopback devices for lvm/storage and a bunch of dummy interfaces for networking. So I would suggest leaving the AIO environment somewhere nearby and spawning a separate multi-node one with the configuration provided manually, to avoid those opinionated decisions, which may strike back at later stages.
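For reference, an `unbound_hosts` entry in /etc/openstack_deploy/openstack_user_config.yml would look roughly like the other host groups; a minimal sketch with placeholder values:
```yaml
# /etc/openstack_deploy/openstack_user_config.yml (sketch; host name and IP are placeholders)
unbound_hosts:
  aio1:
    ip: 172.29.236.100
```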
participants (3)
- Dmitriy Rabotyagov
- Tatsuro Makita
- tatsuro.makita.bp@nttdata.com