[TripleO][train][rdo] installation of undercloud fails during Run container-puppet tasks step1
Hi all,

I am running a fresh install of RDO Train on CentOS 7. For almost a week I have been facing an error at this step: TASK [Run container-puppet tasks (generate config) during step 1]

I have ansible.log attached, but I cannot find where it is failing. From what I understand of Ansible, a task fails if it finds stderr output. I cannot find an error/failure or anything similar; I see Notices and Warnings, but I believe those are not stderr?

I see containers running and then being removed after some time (as I think they should be).

Could you help me figure out where to dig?

--
Ruslanas Gžibovskis
+370 6030 7030
Alex Schultz <aschultz@redhat.com> replied (Tue, 28 Apr 2020 at 20:10):

On Tue, Apr 28, 2020 at 11:57 AM Ruslanas Gžibovskis <ruslanas@lpic.lt> wrote:
> 2020-04-27 22:27:46,147 p=132230 u=root | TASK [Start containers for step 1 using paunch] **********
> 2020-04-27 22:27:46,148 p=132230 u=root | Monday 27 April 2020 22:27:46 +0200 (0:00:00.137)  0:04:44.326 **********
> 2020-04-27 22:27:46,816 p=132230 u=root | ok: [remote-u]
> 2020-04-27 22:27:46,914 p=132230 u=root | TASK [Debug output for task: Start containers for step 1] **********
> 2020-04-27 22:27:46,915 p=132230 u=root | Monday 27 April 2020 22:27:46 +0200 (0:00:00.767)  0:04:45.093 **********
> 2020-04-27 22:27:46,977 p=132230 u=root | fatal: [remote-u]: FAILED! => {
>     "failed_when_result": true,
>     "outputs.stdout_lines | default([]) | union(outputs.stderr_lines | default([]))": []

Check /var/log/paunch.log. It probably has additional information as to why the containers didn't start. You might also check the output of 'sudo podman ps -a' to see if any containers exited with errors.
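To act on that suggestion mechanically: podman can print one name and exit code per container, and anything nonzero deserves a look. The exact --format fields depend on the podman version (an assumption, not taken from the thread), so this sketch runs the filter over sample output of that shape:

```shell
# Hypothetical real invocation (format fields depend on the podman version):
#   sudo podman ps -a --format '{{.Names}} {{.ExitCode}}'
# Filtering sample output of that shape for nonzero exit codes:
sample='keepalived 0
memcached 137
haproxy 1'
failed=$(printf '%s\n' "$sample" | awk '$2 != 0 {print $1}')
echo "$failed"   # the containers that exited with an error
```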
podman ps -a = clean, no containers at all. I have a watch -d "sudo podman ps -a ; sudo podman images -a ; sudo df -h" running.

paunch.log is empty. (I did several reinstallations.)

I found this in the image logs:

2020-04-29 08:52:49,854 140572 DEBUG urllib3.connectionpool [ ] https://registry-1.docker.io:443 "GET /v2/ HTTP/1.1" 401 87
2020-04-29 08:52:49,855 140572 DEBUG tripleo_common.image.image_uploader [ ] https://registry-1.docker.io/v2/ status code 401
2020-04-29 08:52:49,855 140572 DEBUG tripleo_common.image.image_uploader [ ] Token parameters: params {'scope': 'repository:tripleotrain/centos-binary-zaqar-wsgi:pull', 'service': 'registry.docker.io'}
2020-04-29 08:52:49,731 140572 DEBUG urllib3.connectionpool [ ] https://registry-1.docker.io:443 "GET /v2/ HTTP/1.1" 401 87
2020-04-29 08:52:49,732 140572 DEBUG tripleo_common.image.image_uploader [ ] https://registry-1.docker.io/v2/ status code 401
2020-04-29 08:52:49,732 140572 DEBUG tripleo_common.image.image_uploader [ ] Token parameters: params {'scope': 'repository:tripleotrain/centos-binary-rsyslog:pull', 'service': 'registry.docker.io'}
2020-04-29 08:52:49,583 140572 DEBUG urllib3.connectionpool [ ] https://registry-1.docker.io:443 "GET /v2/ HTTP/1.1" 401 87
2020-04-29 08:52:49,584 140572 DEBUG tripleo_common.image.image_uploader [ ] https://registry-1.docker.io/v2/ status code 401
2020-04-29 08:52:49,584 140572 DEBUG tripleo_common.image.image_uploader [ ] Token parameters: params {'scope': 'repository:tripleotrain/centos-binary-swift-proxy-server:pull', 'service': 'registry.docker.io'}
2020-04-29 08:52:49,586 140572 DEBUG urllib3.connectionpool [ ] Starting new HTTPS connection (1): auth.docker.io:443
2020-04-29 08:52:49,606 140572 DEBUG urllib3.connectionpool [ ] https://registry-1.docker.io:443 "GET /v2/ HTTP/1.1" 401 87
2020-04-29 08:52:49,607 140572 DEBUG tripleo_common.image.image_uploader [ ] https://registry-1.docker.io/v2/ status code 401
2020-04-29 08:52:49,607 140572 DEBUG tripleo_common.image.image_uploader [ ] Token parameters: params {'scope': 'repository:tripleotrain/centos-binary-swift-object:pull', 'service': 'registry.docker.io'}

Later I saw connectionpool retrying, but I have not seen tripleo_common.image.image_uploader doing the same.

Every 2.0s: sudo podman ps -a ; sudo podman images -a ; sudo df -h    Wed Apr 29 09:38:26 2020

CONTAINER ID  IMAGE  COMMAND  CREATED  STATUS  PORTS  NAMES

REPOSITORY                                                TAG              IMAGE ID      CREATED     SIZE
docker.io/tripleotrain/centos-binary-nova-api             current-tripleo  e32831544953  2 days ago  1.39 GB
docker.io/tripleotrain/centos-binary-glance-api           current-tripleo  edbb7dff6427  2 days ago  1.31 GB
docker.io/tripleotrain/centos-binary-mistral-api          current-tripleo  bcb3e95028a3  2 days ago  1.54 GB
docker.io/tripleotrain/centos-binary-ironic-pxe           current-tripleo  2f1eb1da3fa4  2 days ago  909 MB
docker.io/tripleotrain/centos-binary-heat-api             current-tripleo  b425da0e0a89  2 days ago  947 MB
docker.io/tripleotrain/centos-binary-ironic-api           current-tripleo  d0b670006bc6  2 days ago  903 MB
docker.io/tripleotrain/centos-binary-swift-proxy-server   current-tripleo  73432aea0d63  2 days ago  895 MB
docker.io/tripleotrain/centos-binary-neutron-server       current-tripleo  d7b8f19cc5ed  2 days ago  1.1 GB
docker.io/tripleotrain/centos-binary-keystone             current-tripleo  8352bb3fd528  2 days ago  905 MB
docker.io/tripleotrain/centos-binary-zaqar-wsgi           current-tripleo  49a7f0066616  2 days ago  894 MB
docker.io/tripleotrain/centos-binary-placement-api        current-tripleo  096ce1da63d3  2 days ago  1 GB
docker.io/tripleotrain/centos-binary-ironic-inspector     current-tripleo  4505c408a230  2 days ago  817 MB
docker.io/tripleotrain/centos-binary-rabbitmq             current-tripleo  bee62aacf8fb  2 days ago  700 MB
docker.io/tripleotrain/centos-binary-haproxy              current-tripleo  4b11e3d9c95f  2 days ago  692 MB
docker.io/tripleotrain/centos-binary-mariadb              current-tripleo  16cc78bc1e94  2 days ago  845 MB
docker.io/tripleotrain/centos-binary-keepalived           current-tripleo  67de7d2af948  2 days ago  568 MB
docker.io/tripleotrain/centos-binary-memcached            current-tripleo  a1019d76359c  2 days ago  561 MB
docker.io/tripleotrain/centos-binary-iscsid               current-tripleo  c62bc10064c2  2 days ago  527 MB
docker.io/tripleotrain/centos-binary-cron                 current-tripleo  be0199eb5b89  2 days ago  522 MB
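One note on the 401s in that excerpt: a 401 on GET /v2/ is the standard Docker Registry v2 challenge telling an anonymous client to fetch a bearer token (hence the follow-up connection to auth.docker.io), so those lines by themselves are the normal handshake rather than a failure. A rough filter that hides the handshake and keeps anything else (the sample lines below mimic the excerpt's shape and are not from the thread):

```shell
# Sample lines mimicking the image_uploader log excerpt above.
log='image_uploader [ ] https://registry-1.docker.io/v2/ status code 401
image_uploader [ ] Token parameters: params scope=repository:...:pull
image_uploader [ ] https://registry-1.docker.io/v2/ status code 500'
# Hide the expected 401 token-handshake responses; anything left is suspicious.
suspicious=$(printf '%s\n' "$log" | grep 'status code' | grep -v 'status code 401')
echo "$suspicious"
```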
I just now realized that I have seen, in some log, messages about missing puppet dependencies... I cannot find that log again.
On the other installation I have, I see a few more images:

REPOSITORY                                                      TAG              IMAGE ID      CREATED       SIZE
docker.io/tripleotrain/centos-binary-mistral-executor           current-tripleo  826aaf79409f  2 months ago  1.76 GB
docker.io/tripleotrain/centos-binary-nova-compute-ironic        current-tripleo  57f6cccea249  2 months ago  2.02 GB
docker.io/tripleotrain/centos-binary-neutron-l3-agent           current-tripleo  1c342a20093a  2 months ago  1.17 GB
docker.io/tripleotrain/centos-binary-neutron-dhcp-agent         current-tripleo  3a9849df6381  2 months ago  1.16 GB
docker.io/tripleotrain/centos-binary-neutron-server             current-tripleo  3f56961b1a67  2 months ago  1.1 GB
docker.io/tripleotrain/centos-binary-ironic-neutron-agent       current-tripleo  ffaea5a0eafb  2 months ago  951 MB
docker.io/tripleotrain/centos-binary-neutron-openvswitch-agent  current-tripleo  e0bb38de7810  2 months ago  951 MB
docker.io/tripleotrain/centos-binary-mistral-api                current-tripleo  3d1388019c87  2 months ago  1.48 GB
docker.io/tripleotrain/centos-binary-glance-api                 current-tripleo  8e1d04aa8f46  2 months ago  1.31 GB
docker.io/tripleotrain/centos-binary-mistral-engine             current-tripleo  ede1273e60ac  2 months ago  1.44 GB
docker.io/tripleotrain/centos-binary-mistral-event-engine       current-tripleo  4e9f3446fa88  2 months ago  1.44 GB
docker.io/tripleotrain/centos-binary-nova-api                   current-tripleo  b38dce72601d  2 months ago  1.39 GB
docker.io/tripleotrain/centos-binary-zaqar-wsgi                 current-tripleo  7e9acff2d188  2 months ago  893 MB
docker.io/tripleotrain/centos-binary-nova-scheduler             current-tripleo  08121d755b68  2 months ago  1.45 GB
docker.io/tripleotrain/centos-binary-ironic-conductor           current-tripleo  d9810d76bacf  2 months ago  1.09 GB
docker.io/tripleotrain/centos-binary-nova-conductor             current-tripleo  f771753e6e8b  2 months ago  1.28 GB
docker.io/tripleotrain/centos-binary-placement-api              current-tripleo  318fa9e266da  2 months ago  914 MB
docker.io/tripleotrain/centos-binary-ironic-api                 current-tripleo  5c0167e4ca6c  2 months ago  903 MB
docker.io/tripleotrain/centos-binary-ironic-pxe                 current-tripleo  f08a3d35e1ee  2 months ago  908 MB
docker.io/tripleotrain/centos-binary-keystone                   current-tripleo  b876505251f9  2 months ago  905 MB
docker.io/tripleotrain/centos-binary-heat-api                   current-tripleo  7b147f4b215a  2 months ago  946 MB
docker.io/tripleotrain/centos-binary-swift-proxy-server         current-tripleo  34aa5292ac93  2 months ago  894 MB
docker.io/tripleotrain/centos-binary-heat-engine                current-tripleo  515b7034dcd5  2 months ago  946 MB
docker.io/tripleotrain/centos-binary-swift-container            current-tripleo  e7bd1b5f50e5  2 months ago  846 MB
docker.io/tripleotrain/centos-binary-swift-account              current-tripleo  fa8f07aab6c1  2 months ago  846 MB
docker.io/tripleotrain/centos-binary-swift-object               current-tripleo  cdb70d74e5d8  2 months ago  846 MB
docker.io/tripleotrain/centos-binary-ironic-inspector           current-tripleo  8ded64b6dcec  2 months ago  817 MB
docker.io/tripleotrain/centos-binary-mariadb                    current-tripleo  949e61588879  2 months ago  846 MB
docker.io/tripleotrain/centos-binary-cron                       current-tripleo  3579f123aa33  2 months ago  522 MB
docker.io/tripleotrain/centos-binary-rabbitmq                   current-tripleo  75b4deddc0c3  2 months ago  700 MB
docker.io/tripleotrain/centos-binary-haproxy                    current-tripleo  af7d3eadd110  2 months ago  692 MB
docker.io/tripleotrain/centos-binary-keepalived                 current-tripleo  7fc292e41708  2 months ago  568 MB
docker.io/tripleotrain/centos-binary-memcached                  current-tripleo  bddc81718cfc  2 months ago  561 MB
docker.io/tripleotrain/centos-binary-iscsid                     current-tripleo  7456537e0c25  2 months ago  527 MB

I am new to containers. Can I somehow transfer these images between the two installations? As I understand it, that might help?
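For the transfer question: podman images can be exported and imported with the save/load subcommands. A sketch, with the image name taken from the listing above; the podman invocations are kept as strings so the snippet runs even on a machine without podman:

```shell
# Sketch: moving an image between hosts with podman's save/load subcommands.
img='docker.io/tripleotrain/centos-binary-keystone:current-tripleo'
tarball='centos-binary-keystone.tar'
save_cmd="sudo podman save -o $tarball $img"   # run on the host that has the image
load_cmd="sudo podman load -i $tarball"        # run on the host that needs it
echo "$save_cmd"
echo "$load_cmd"
```

Copy the tarball between the hosts (scp, rsync) in between the two commands.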
On Wed, 29 Apr 2020 at 12:02, Wu, Heng <wuh.fnst@cn.fujitsu.com> replied:

> I am new to containers. Can I somehow transfer these images between the two installations? As I understand it, that might help?

The images are downloaded automatically during the installation; you don't need to do it manually. The problem seems to be that the containers failed to start, not errors in the images themselves. You may need to check the configuration of undercloud.conf and the proxy settings (if any).
By the way, I have shrunk ansible.log a bit; you can find it here: http://paste.debian.net/hidden/ffe95b54/

It looks like the containers are going up, and I see them being up, but when I exec:

[root@remote-u stack]# podman run -it d7b8f19cc5ed /bin/bash
Error: error configuring network namespace for container 0f5d43e509f3b673180a677e192dd5f2498f6061680cb0db9ae15d739c2337e3: Missing CNI default network
[root@remote-u stack]#

I get this interesting line. Should I be getting it?

Also, here is my undercloud.conf: http://paste.debian.net/hidden/09feaefa/

Also, as I mentioned previously, paunch.log is empty; I have added +w to it, but it is still empty:

[root@remote-u stack]# cat /var/log/paunch.log
[root@remote-u stack]# ls -la /var/log/paunch.log
-rw-rw-rw-. 1 root root 0 Apr 28 10:59 /var/log/paunch.log
[root@remote-u stack]#

And I do not have a proxy, luckily. But on another site, where we do have a proxy, we face similar issues, and it fails in the same place.
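On the "Missing CNI default network" error: podman raises it when its CNI config directory has no usable network definition. The directory is commonly /etc/cni/net.d, where the default 87-podman-bridge.conflist normally lives (directory and file names are the usual defaults, not confirmed in the thread). A sketch of the emptiness check, run against a scratch directory so it is safe to execute anywhere:

```shell
# Stand-in for /etc/cni/net.d; an empty directory reproduces the condition
# under which 'podman run' reports 'Missing CNI default network'.
cni_dir=$(mktemp -d)
if [ -z "$(ls -A "$cni_dir" 2>/dev/null)" ]; then
  msg="no CNI network configs in $cni_dir"
else
  msg="CNI configs present in $cni_dir"
fi
echo "$msg"
```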
Hi, I just wanted to ask: will it be sufficient to run

ansible-playbook -vvvv -i inventory.yaml common_deploy_steps_tasks_step_1.yaml

? Because testing takes a looong time when running the whole "openstack undercloud install".

I am also trying with a downgraded containernetworking-plugins.x86_64 (0.8.1-2.el7.centos, @extras).

My next thought is to install the packages at the same versions as on the older box, where everything is running.
Hi all,

I also noticed that /var/lib/tripleo-config/container-startup-config-step_6.json looks empty, even though /var/log/containers/stdouts/*.log contains "step": 6 messages.

[root@remote-u stack]# ls -la /var/lib/tripleo-config/container-startup-config-step_*
-rw-------. 1 root root  7774 May  1 11:56 /var/lib/tripleo-config/container-startup-config-step_1.json
-rw-------. 1 root root  9793 May  1 11:56 /var/lib/tripleo-config/container-startup-config-step_2.json
-rw-------. 1 root root 28945 May  1 11:56 /var/lib/tripleo-config/container-startup-config-step_3.json
-rw-------. 1 root root 57302 May  1 11:56 /var/lib/tripleo-config/container-startup-config-step_4.json
-rw-------. 1 root root  4050 May  1 11:56 /var/lib/tripleo-config/container-startup-config-step_5.json
-rw-------. 1 root root     2 Apr 28 10:55 /var/lib/tripleo-config/container-startup-config-step_6.json
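A 2-byte JSON config file is almost certainly just the empty object '{}', i.e. no containers are defined for that step, which may well be expected for step 6 on an undercloud (an assumption, not confirmed in the thread). The size check itself, reproduced on a stand-in file:

```shell
# Stand-in for /var/lib/tripleo-config/container-startup-config-step_6.json.
step6=$(mktemp)
printf '{}' > "$step6"
size=$(wc -c < "$step6" | tr -d ' \t')
echo "step config is $size bytes - no containers defined for this step"
```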
Hi all,

In the keepalived container config-generation output I find:

"+ rc=2",
"+ '[' False = false ']'",
"+ set -e",
"+ '[' 2 -ne 2 -a 2 -ne 0 ']'",
"+ verbosity=",
"+ verbosity=-v",
"+ '[' -z '' ']'",
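The rc=2 in that trace is likely fine: with puppet's --detailed-exitcodes, 0 means no changes, 2 means changes were applied successfully, and 4/6 indicate real failures. The wrapper's test ('[' 2 -ne 2 -a 2 -ne 0 ']') therefore only fails when rc is neither 0 nor 2. Roughly:

```shell
# puppet --detailed-exitcodes: 0 = no changes, 2 = changes applied, 4/6 = failures.
rc=2
if [ "$rc" -ne 2 ] && [ "$rc" -ne 0 ]; then
  result="puppet failed (rc=$rc)"
else
  result="puppet succeeded (rc=$rc)"
fi
echo "$result"   # -> puppet succeeded (rc=2)
```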
Hi all, it took me several days to understand what you were saying, Alex and Heng.

It looks like in ./common_deploy_steps_tasks.yaml paunch is not executed, or is executed without logging:

paunch --verbose apply --file /var/lib/tripleo-config/container-startup-config-step_1.json --config-id tripleo_step1 --default-runtime podman --container-log-path /var/log/paunch.log

When I executed it manually, I got some errors: https://pastebin.com/hnmW0rGX

According to some of the error messages, it might be that it is trying to launch containers with this value for the log file:

... '--log-driver', 'k8s-file', '--log-opt', 'path=/var/log/paunch.log/rabbitmq_init_logs.log' ...
2020-05-01 23:14:48.328 28129 ERROR paunch [ ] stderr: [conmon:e]: Failed to open log file Not a directory
... '--log-driver', 'k8s-file', '--log-opt', 'path=/var/log/paunch.log/mysql_init_logs.log' ...
2020-05-01 23:14:48.812 28129 ERROR paunch [ ] stderr: [conmon:e]: Failed to open log file Not a directory

Also, I see it failed to enable tripleo_memcached in systemctl ("Failed to start memcached container."), and the same for haproxy and keepalived...

Any ideas where to go further?
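The "Not a directory" makes sense: conmon is handed path=/var/log/paunch.log/<container>.log, which treats the --container-log-path argument as a directory, while a regular (empty) file already sits at /var/log/paunch.log, so the open fails with ENOTDIR. The failure mode, reproduced with scratch paths:

```shell
# Simulate /var/log/paunch.log existing as a regular file, then try to create
# a container log "inside" it - this fails with ENOTDIR ('Not a directory').
scratch=$(mktemp -d)
touch "$scratch/paunch.log"                # the stray regular file
if ! touch "$scratch/paunch.log/rabbitmq_init_logs.log" 2>/dev/null; then
  msg="Not a directory: cannot create a log under a regular file"
fi
echo "$msg"
```

Consistent with that, pointing --container-log-path at a directory should make this particular error go away.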
Hi all,

It was my error that paunch did not launch containers when I executed paunch manually. I fixed the log path to point at the containers directory, and it worked. I re-executed the undercloud deploy with --force-stack-update, and it failed again at the same step, still without executing paunch.
Hi all,

After executing:

[stack@remote-u undercloud-ansible-CxDDOO]$ sudo ansible-playbook-2 -vvvv -i inventory.yaml deploy_steps_playbook.yaml

I am not sure who has no attribute: AttributeError: 'module' object has no attribute 'load_config'

TASK [Start containers for step 1 using paunch] *****
task path: /home/stack/builddir/a/undercloud-ansible-CxDDOO/common_deploy_steps_tasks.yaml:257
Saturday 02 May 2020 13:11:01 +0200 (0:00:00.165) 0:04:57.422 **********
Using module file /usr/share/ansible/plugins/modules/paunch.py
Pipelining is enabled.
<10.120.129.222> ESTABLISH LOCAL CONNECTION FOR USER: root
<10.120.129.222> EXEC /bin/sh -c 'TRIPLEO_MINOR_UPDATE=False /usr/bin/python2 && sleep 0'
ok: [remote-u] => {
    "changed": false,
    "failed_when_result": false,
    "module_stderr": "Traceback (most recent call last):\n  File \"<stdin>\", line 114, in <module>\n  File \"<stdin>\", line 106, in _ansiballz_main\n  File \"<stdin>\", line 49, in invoke_module\n  File \"/tmp/ansible_paunch_payload_8IPdGs/__main__.py\", line 250, in <module>\n  File \"/tmp/ansible_paunch_payload_8IPdGs/__main__.py\", line 246, in main\n  File \"/tmp/ansible_paunch_payload_8IPdGs/__main__.py\", line 172, in __init__\nAttributeError: 'module' object has no attribute 'load_config'\n",
    "module_stdout": "",
    "msg": "MODULE FAILURE\nSee stdout/stderr for the exact error",
    "rc": 1
}

TASK [Debug output for task: Start containers for step 1] *****
task path: /home/stack/builddir/a/undercloud-ansible-CxDDOO/common_deploy_steps_tasks.yaml:275
Saturday 02 May 2020 13:11:02 +0200 (0:00:00.901) 0:04:58.323 **********
fatal: [remote-u]: FAILED!
=> {
    "failed_when_result": true,
    "outputs.stdout_lines | default([]) | union(outputs.stderr_lines | default([]))": []
}

NO MORE HOSTS LEFT *****
PLAY RECAP *****
remote-u : ok=242 changed=78 unreachable=0 failed=1 skipped=94 rescued=0 ignored=2
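The AttributeError above is the classic symptom of a version mismatch, as the rest of the thread establishes: the Ansible module shipped by one package calls a load_config attribute that the installed paunch module does not expose. A minimal Python sketch of that failure mode, using stand-in modules rather than the real packages:

```python
import types

# Stand-in for an older utils module that does NOT provide load_config.
# These module and function names are illustrative, not paunch's real API.
old_utils = types.ModuleType("old_utils")

# Stand-in for a newer utils module that does provide it.
new_utils = types.ModuleType("new_utils")
new_utils.load_config = lambda path: {"config_file": path}

def start_containers(utils, config_path):
    """Mirror of the failing call site: the caller assumes the attribute
    exists, so a package mismatch surfaces at runtime as AttributeError."""
    return utils.load_config(config_path)

print(start_containers(new_utils, "/var/lib/tripleo-config/step_1.json"))
try:
    start_containers(old_utils, "/var/lib/tripleo-config/step_1.json")
except AttributeError as exc:
    print("version mismatch:", exc)
```

The mismatch is invisible at install time and only appears when the task runs, which is why the deploy fails at exactly this step.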
Hi all,

Just for your info, the solution was:

su -c "yum downgrade tripleo-ansible.noarch"

yum list | grep -i tripleo-ansible
tripleo-ansible.noarch 0.4.1-1.el7 @centos-openstack-train
tripleo-ansible.noarch 0.5.0-1.el7 centos-openstack-train

What is wrong with *tripleo-ansible.noarch 0.5.0-1.el7 centos-openstack-train*???
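The downgrade effectively picks the newest available version that is older than the broken one. A simplified sketch of that choice, using a plain tuple comparison of dotted versions rather than rpm's full version-comparison algorithm:

```python
def ver_key(version):
    """Sort key for simple dotted versions like '0.4.1'. This is a
    simplified comparison, not rpm's rpmvercmp algorithm."""
    return tuple(int(part) for part in version.split("."))

# The three tripleo-ansible versions mentioned later in the thread:
available = ["0.4.0", "0.5.0", "0.4.1"]
broken = "0.5.0"

# Downgrade target: the newest version strictly older than the broken one.
candidates = [v for v in available if ver_key(v) < ver_key(broken)]
target = max(candidates, key=ver_key)
print(target)  # -> 0.4.1
```

This matches what yum downgrade landed on: 0.4.1-1.el7.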
Ruslanas, if you still have the environment handy, can you please file a Launchpad bug with the appropriate log files attached? Thank you very much!
Ruslanas, do you know if any other dependencies were downgraded when you ran the mentioned command? Looking through the module code, it is pulling in paunch, which has the `load_config` attribute [0] in the stable/train branch. So I'm curious: what packages were impacted by the downgrade, and do you know what version of paunch is installed on the system? As Wes mentioned, a Launchpad bug with this information would be most helpful.

[0] https://github.com/openstack/paunch/blob/stable/train/paunch/utils/common.py...

-- Kevin Carter IRC: Cloudnull
I will file a Launchpad bug tomorrow; today I am almost sleeping ;)

No, only tripleo-ansible. I have 3 versions available: 0.4.0, 0.4.1 and 0.5.0. 0.4.1 works; 0.5.0 does not.

[stack@remote-u ~]$ paunch --version
paunch 5.3.1
[stack@remote-u ~]$ rpm -qa | grep -i paunch
python2-paunch-5.3.1-1.el7.noarch
paunch-services-5.3.1-1.el7.noarch
[stack@remote-u ~]$ uname -a
Linux remote-u.tecloud.sample.xxx 3.10.0-1127.el7.x86_64 #1 SMP Tue Mar 31 23:36:51 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
[stack@remote-u ~]$

It would be very helpful if you could share a howto, or the steps, for filing that bug report :) Is https://bugs.launchpad.net/openstack/+filebug the right place? Which project, tripleo? How should I name it so it is understandable? Which logs would you need? Would ansible.log and the last output of ansible-playbook-2 -vvvv ... with the error message be sufficient?

I can also provide a diff of what changed in that exact file, and the package contents from 0.4.1 and 0.5.0.

If you need more or less, just mention it here, or after I file the bug report; I can recreate the VM and try setting it up.
The bug report should go here: https://bugs.launchpad.net/tripleo/+filebug

* Fill in the summary, click next, and then provide everything you can within the body. Once the bug is created, you can upload files (like the mentioned diff) to the bug report. Any and all information you can share is helpful in tracking down what is going on; please also include the repos enabled so we can try to recreate the issue locally.

Sorry for the issues you've run into, but thanks again for reporting them. If you have any questions, let us know.

-- Kevin Carter IRC: Cloudnull
https://bugs.launchpad.net/tripleo/+bug/1877043

Please check whether anything else is needed; feel free to ask, and I can test any updates you have.

-- Ruslanas Gžibovskis +370 6030 7030
Thanks Ruslanas, the bug report looks great. It looks like Rabi has been able to identify the issue and has submitted a fix for a new upstream release, which should resolve the bug [0].

[0] https://review.opendev.org/#/c/725783

-- Kevin Carter IRC: Cloudnull
participants (5)
- Alex Schultz
- Carter, Kevin
- Ruslanas Gžibovskis
- Wesley Hayutin
- Wu, Heng