[kolla][magnum] Cluster creation failed due to "Waiting for Kubernetes API..."
Hi all, I need some help. I deployed an AIO via Kolla on a bare-metal node. Here is some information about the deployment:
---------------
kolla-ansible: 7.0.1
openstack_release: Rocky
kolla_base_distro: centos
kolla_install_type: source
TLS: disabled
---------------
VMs spawn without issue, but I can't get the Kubernetes cluster creation to complete successfully. It fails due to a "Time out".
I managed to log into the Kubernetes master, and in cloud-init-output.log I can see:
+ echo 'Waiting for Kubernetes API...'
Waiting for Kubernetes API...
++ curl --silent http://127.0.0.1:8080/healthz
+ '[' ok = '' ']'
+ sleep 5
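The trace above shows a loop that polls the local API port until /healthz answers "ok". As a rough illustration only, the same wait can be sketched in Python with an explicit attempt limit, so that it fails visibly instead of looping until the stack times out. The function names, the attempt limit, and the injectable `fetch` and `sleep` parameters are assumptions made for this sketch; only the URL comes from the trace.

```python
import time
import urllib.request
from typing import Callable

HEALTHZ_URL = "http://127.0.0.1:8080/healthz"  # URL taken from the trace above


def fetch_healthz(url: str = HEALTHZ_URL, timeout: float = 3.0) -> str:
    """Return the body of the health endpoint as text."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8", errors="replace")


def wait_for_api(
    fetch: Callable[[], str] = fetch_healthz,
    interval: float = 5.0,
    max_attempts: int = 60,  # hypothetical limit, not from the original script
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll until fetch() returns 'ok'; give up after max_attempts tries."""
    for _ in range(max_attempts):
        try:
            if fetch().strip() == "ok":
                return True
        except OSError:
            # Connection refused, timeouts, etc. (URLError subclasses OSError).
            pass
        sleep(interval)
    return False
```

Calling `wait_for_api()` on the master would return False after the attempt limit if kube-apiserver never comes up, which is the situation described here.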
Checking via systemctl and journalctl I see:
[fedora@kube-clsuter-qamdealetlbi-master-0 log]$ systemctl status kube-apiserver
● kube-apiserver.service - kubernetes-apiserver
   Loaded: loaded (/etc/systemd/system/kube-apiserver.service; enabled; vendor preset: disabled)
   Active: failed (Result: exit-code) since Tue 2019-02-19 15:31:41 UTC; 45min ago
  Process: 3796 ExecStart=/usr/bin/runc --systemd-cgroup run kube-apiserver (code=exited, status=1/FAILURE)
 Main PID: 3796 (code=exited, status=1/FAILURE)

Feb 19 15:31:40 kube-clsuter-qamdealetlbi-master-0.novalocal systemd[1]: kube-apiserver.service: Main process exited, code=exited, status=1/FAILURE
Feb 19 15:31:40 kube-clsuter-qamdealetlbi-master-0.novalocal systemd[1]: kube-apiserver.service: Failed with result 'exit-code'.
Feb 19 15:31:41 kube-clsuter-qamdealetlbi-master-0.novalocal systemd[1]: kube-apiserver.service: Service RestartSec=100ms expired, scheduling restart.
Feb 19 15:31:41 kube-clsuter-qamdealetlbi-master-0.novalocal systemd[1]: kube-apiserver.service: Scheduled restart job, restart counter is at 6.
Feb 19 15:31:41 kube-clsuter-qamdealetlbi-master-0.novalocal systemd[1]: Stopped kubernetes-apiserver.
Feb 19 15:31:41 kube-clsuter-qamdealetlbi-master-0.novalocal systemd[1]: kube-apiserver.service: Start request repeated too quickly.
Feb 19 15:31:41 kube-clsuter-qamdealetlbi-master-0.novalocal systemd[1]: kube-apiserver.service: Failed with result 'exit-code'.
Feb 19 15:31:41 kube-clsuter-qamdealetlbi-master-0.novalocal systemd[1]: Failed to start kubernetes-apiserver.

[fedora@kube-clsuter-qamdealetlbi-master-0 log]$ sudo journalctl -u kube-apiserver
-- Logs begin at Tue 2019-02-19 15:21:36 UTC, end at Tue 2019-02-19 16:17:00 UTC. --
Feb 19 15:31:33 kube-clsuter-qamdealetlbi-master-0.novalocal systemd[1]: Started kubernetes-apiserver.
Feb 19 15:31:34 kube-clsuter-qamdealetlbi-master-0.novalocal runc[2794]: Flag --insecure-bind-address has been deprecated, This flag will be removed in a future version.
Feb 19 15:31:34 kube-clsuter-qamdealetlbi-master-0.novalocal runc[2794]: Flag --insecure-port has been deprecated, This flag will be removed in a future version.
Feb 19 15:31:35 kube-clsuter-qamdealetlbi-master-0.novalocal runc[2794]: Error: error creating self-signed certificates: open /var/run/kubernetes/apiserver.crt: permission denied
: : :
Feb 19 15:31:35 kube-clsuter-qamdealetlbi-master-0.novalocal runc[2794]: error: error creating self-signed certificates: open /var/run/kubernetes/apiserver.crt: permission denied
Feb 19 15:31:35 kube-clsuter-qamdealetlbi-master-0.novalocal systemd[1]: kube-apiserver.service: Main process exited, code=exited, status=1/FAILURE
Feb 19 15:31:35 kube-clsuter-qamdealetlbi-master-0.novalocal systemd[1]: kube-apiserver.service: Failed with result 'exit-code'.
Feb 19 15:31:35 kube-clsuter-qamdealetlbi-master-0.novalocal systemd[1]: kube-apiserver.service: Service RestartSec=100ms expired, scheduling restart.
Feb 19 15:31:35 kube-clsuter-qamdealetlbi-master-0.novalocal systemd[1]: kube-apiserver.service: Scheduled restart job, restart counter is at 1.
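Most of the journal output above is systemd restart bookkeeping; the line that matters is the runc one reporting the certificate error. A small filter like the following can help pick such lines out of a long journal. It is only a sketch: the function name and the keyword heuristic (keep messages from non-systemd processes that contain "error" or "denied") are assumptions for illustration, not part of any OpenStack or systemd tooling.

```python
import re
from typing import Iterable, List

# Matches "<Mon> <day> <HH:MM:SS> <host> <process>[<pid>]: <message>".
_LINE = re.compile(
    r"^\w{3} +\d+ [\d:]{8} (?P<host>\S+) (?P<proc>[^\[\s]+)\[\d+\]: (?P<msg>.*)$"
)


def root_cause_lines(lines: Iterable[str]) -> List[str]:
    """Return distinct error-like messages logged by non-systemd processes."""
    found: List[str] = []
    for line in lines:
        m = _LINE.match(line.strip())
        if not m or m.group("proc") == "systemd":
            continue  # skip unparsable lines and systemd's own restart notices
        msg = m.group("msg")
        if re.search(r"error|denied", msg, re.IGNORECASE) and msg not in found:
            found.append(msg)
    return found
```

Fed with the output of `journalctl -u kube-apiserver`, this would surface the "permission denied" message on /var/run/kubernetes/apiserver.crt and drop the deprecation warnings.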
May I ask for some help with this?
Many thanks /Giuseppe
Hi all... again. I managed to get past the previous issue by not disabling TLS in the cluster template.
From the cloud-init-output.log I see:
Cloud-init v. 17.1 running 'modules:final' at Tue, 19 Feb 2019 17:03:53 +0000. Up 38.08 seconds.
Cloud-init v. 17.1 finished at Tue, 19 Feb 2019 17:13:22 +0000. Datasource DataSourceEc2. Up 607.13 seconds
But the cluster creation keeps on failing.
From the journalctl -f I see a possible issue:
Feb 19 17:42:38 kube-cluster-tls-6hezqcq4ien3-master-0.novalocal runc[2723]: publicURL endpoint for orchestration service in null region not found
Feb 19 17:42:38 kube-cluster-tls-6hezqcq4ien3-master-0.novalocal runc[2723]: Source [heat] Unavailable.
Feb 19 17:42:38 kube-cluster-tls-6hezqcq4ien3-master-0.novalocal runc[2723]: /var/lib/os-collect-config/local-data not found. Skipping
Is anyone familiar with this problem?
Thanks as usual. /Giuseppe
On Tue, 19 Feb 2019 at 17:35, Giuseppe Sannino km.giuseppesannino@gmail.com wrote:
> Hi all, I need some help. I deployed an AIO via Kolla on a bare-metal node. [...]
Can you talk to the Heat API from your master node?
On 20/02/19 6:43 AM, Giuseppe Sannino wrote:
> Hi all... again. I managed to get past the previous issue by not disabling TLS in the cluster template. [...]
I have the same problem. Weird thing is /etc/sysconfig/heat-params has region_name specified in my case!
On 19 Feb 2019, at 22:00, Feilong Wang feilong@catalyst.net.nz wrote:
> Can you talk to the Heat API from your master node?
> --
> Cheers & Best regards, Feilong Wang (王飞龙)
> Senior Cloud Software Engineer Tel: +64-48032246 Email: flwang@catalyst.net.nz
> Catalyst IT Limited Level 6, Catalyst House, 150 Willis Street, Wellington
Hi Feilong, Bharat, thanks for your answer.
@Feilong,
From /etc/kolla/heat-engine/heat.conf I see:
[clients_keystone]
auth_uri = http://10.1.7.201:5000
This should map into auth_url within the k8s master. Within the k8s master in /etc/os-collect-config.conf I see:
[heat]
auth_url = http://10.1.7.201:5000/v3/
:
:
resource_name = kube-master
region_name = null

and from /etc/sysconfig/heat-params (among the others):
:
REGION_NAME="RegionOne"
:
AUTH_URL="http://10.1.7.201:5000/v3"
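The two files quoted above disagree on the region: /etc/os-collect-config.conf has region_name = null while /etc/sysconfig/heat-params has REGION_NAME="RegionOne". A check along these lines could flag that mismatch on a node. This is a sketch under assumptions: the helper names are made up for illustration, and it only relies on the file layouts shown in this thread (an INI-style [heat] section and shell-style KEY="value" lines).

```python
import configparser
import shlex
from typing import Dict, Optional


def parse_heat_params(text: str) -> Dict[str, str]:
    """Parse shell-style KEY="value" lines into a dict."""
    params: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        parts = shlex.split(value)  # strips the surrounding quotes
        params[key.strip()] = parts[0] if parts else ""
    return params


def collect_config_region(text: str) -> Optional[str]:
    """Return region_name from the [heat] section, or None if absent."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    return parser.get("heat", "region_name", fallback=None)


def region_mismatch(collect_conf: str, heat_params: str) -> Optional[str]:
    """Return a message if the two files disagree on the region, else None."""
    agent_region = collect_config_region(collect_conf)
    expected = parse_heat_params(heat_params).get("REGION_NAME")
    if agent_region != expected:
        return (
            f"os-collect-config has region_name={agent_region!r}, "
            f"heat-params has REGION_NAME={expected!r}"
        )
    return None
```

On the master, the two arguments would be the contents of the two files, for example read with `open(path).read()`.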
This URL points at the same address (10.1.7.201) as the "public" Heat endpoint:
openstack endpoint list | grep heat
| 3d5f58c43f6b44f6b54990d6fd9ff55d | RegionOne | heat     | orchestration  | True | internal | http://10.1.7.200:8004/v1/%(tenant_id)s |
| 8c2492cb0ddc48ca94942a4a299a88dc | RegionOne | heat-cfn | cloudformation | True | internal | http://10.1.7.200:8000/v1               |
| b164c4618a784da9ae14da75a6c764a3 | RegionOne | heat     | orchestration  | True | public   | http://10.1.7.201:8004/v1/%(tenant_id)s |
| da203f7d337b4587a0f5fc774c993390 | RegionOne | heat     | orchestration  | True | admin    | http://10.1.7.200:8004/v1/%(tenant_id)s |
| e0d3743e7c604e5c8aa4684df2d1ce53 | RegionOne | heat-cfn | cloudformation | True | public   | http://10.1.7.201:8000/v1               |
| efe0b8418aa24dfca33c243e7eed7e90 | RegionOne | heat-cfn | cloudformation | True | admin    | http://10.1.7.200:8000/v1               |
Connectivity tests:
[fedora@kube-cluster-fed27-k5di3i7stgks-master-0 ~]$ ping 10.1.7.201
PING 10.1.7.201 (10.1.7.201) 56(84) bytes of data.
64 bytes from 10.1.7.201: icmp_seq=1 ttl=63 time=0.285 ms

[fedora@kube-cluster-fed27-k5di3i7stgks-master-0 ~]$ curl http://10.1.7.201:5000/v3/
{"version": {"status": "stable", "updated": "2018-10-15T00:00:00Z", "media-types": [{"base": "application/json", "type": "application/vnd.openstack.identity-v3+json"}], "id": "v3.11", "links": [{"href": "http://10.1.7.201:5000/v3/", "rel": "self"}]}}
Apparently, I can reach that endpoint from within the k8s master.
@Bharat, that file seems to be properly configured to me as well. The problem reported by "systemctl status heat-container-agent" is:
Feb 20 09:33:23 kube-cluster-fed27-k5di3i7stgks-master-0.novalocal runc[2837]: publicURL endpoint for orchestration service in null region not found
Feb 20 09:33:23 kube-cluster-fed27-k5di3i7stgks-master-0.novalocal runc[2837]: Source [heat] Unavailable.
Feb 20 09:33:23 kube-cluster-fed27-k5di3i7stgks-master-0.novalocal runc[2837]: /var/lib/os-collect-config/local-data not found. Skipping
Feb 20 09:33:53 kube-cluster-fed27-k5di3i7stgks-master-0.novalocal runc[2837]: publicURL endpoint for orchestration service in null region not found
Feb 20 09:33:53 kube-cluster-fed27-k5di3i7stgks-master-0.novalocal runc[2837]: Source [heat] Unavailable.
Feb 20 09:33:53 kube-cluster-fed27-k5di3i7stgks-master-0.novalocal runc[2837]: /var/lib/os-collect-config/local-data not found. Skipping
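One way to read the "in null region not found" message is that the agent filters the service catalog by a region literally named "null", which no endpoint has. The following toy lookup illustrates that reading. It is not the Keystone client's code: the `Endpoint` type and `find_endpoint` function are hypothetical, and the assumption that the string "null" is used as the region filter is mine; the catalog entries are a subset of the endpoint list quoted earlier in this message.

```python
from typing import List, NamedTuple, Optional


class Endpoint(NamedTuple):
    region: str
    service_type: str
    interface: str
    url: str


# Subset of the `openstack endpoint list | grep heat` output from this thread.
CATALOG: List[Endpoint] = [
    Endpoint("RegionOne", "orchestration", "internal", "http://10.1.7.200:8004/v1/%(tenant_id)s"),
    Endpoint("RegionOne", "orchestration", "public", "http://10.1.7.201:8004/v1/%(tenant_id)s"),
    Endpoint("RegionOne", "cloudformation", "public", "http://10.1.7.201:8000/v1"),
]


def find_endpoint(
    catalog: List[Endpoint], service_type: str, interface: str, region: str
) -> Optional[str]:
    """Return the URL of the first endpoint matching all three filters."""
    for ep in catalog:
        if (ep.service_type, ep.interface, ep.region) == (service_type, interface, region):
            return ep.url
    return None
```

With region "RegionOne" the public orchestration endpoint is found; with region "null" the lookup returns None even though the endpoint is reachable, which would be consistent with the successful ping and curl tests.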
Still no way forward from my side.
/Giuseppe
On Tue, 19 Feb 2019 at 22:16, Bharat Kunwar bharat@stackhpc.com wrote:
> I have the same problem. Weird thing is /etc/sysconfig/heat-params has region_name specified in my case!
Hi Giuseppe,
What version of heat are you running?
Can you check if you have this patch merged? https://review.openstack.org/579485
Bharat
Sent from my iPhone
On 20 Feb 2019, at 10:38, Giuseppe Sannino km.giuseppesannino@gmail.com wrote:
Hi Feilong, Bharat, thanks for your answer.
@Feilong, From /etc/kolla/heat-engine/heat.conf I see: [clients_keystone] auth_uri = http://10.1.7.201:5000
This should map into auth_url within the k8s master. Within the k8s master in /etc/os-collect-config.conf I see:
[heat] auth_url = http://10.1.7.201:5000/v3/ : : resource_name = kube-master region_name = null
and from /etc/sysconfig/heat-params (among the others): : REGION_NAME="RegionOne" : AUTH_URL="http://10.1.7.201:5000/v3"
This URL corresponds to the "public" Heat endpoint openstack endpoint list | grep heat | 3d5f58c43f6b44f6b54990d6fd9ff55d | RegionOne | heat | orchestration | True | internal | http://10.1.7.200:8004/v1/%(tenant_id)s | | 8c2492cb0ddc48ca94942a4a299a88dc | RegionOne | heat-cfn | cloudformation | True | internal | http://10.1.7.200:8000/v1 | | b164c4618a784da9ae14da75a6c764a3 | RegionOne | heat | orchestration | True | public | http://10.1.7.201:8004/v1/%(tenant_id)s | | da203f7d337b4587a0f5fc774c993390 | RegionOne | heat | orchestration | True | admin | http://10.1.7.200:8004/v1/%(tenant_id)s | | e0d3743e7c604e5c8aa4684df2d1ce53 | RegionOne | heat-cfn | cloudformation | True | public | http://10.1.7.201:8000/v1 | | efe0b8418aa24dfca33c243e7eed7e90 | RegionOne | heat-cfn | cloudformation | True | admin | http://10.1.7.200:8000/v1 |
Connectivity tests: [fedora@kube-cluster-fed27-k5di3i7stgks-master-0 ~]$ ping 10.1.7.201 PING 10.1.7.201 (10.1.7.201) 56(84) bytes of data. 64 bytes from 10.1.7.201: icmp_seq=1 ttl=63 time=0.285 ms
[fedora@kube-cluster-fed27-k5di3i7stgks-master-0 ~]$ curl http://10.1.7.201:5000/v3/ {"version": {"status": "stable", "updated": "2018-10-15T00:00:00Z", "media-types": [{"base": "application/json", "type": "application/vnd.openstack.identity-v3+json"}], "id": "v3.11", "links": [{"href": "http://10.1.7.201:5000/v3/", "rel": "self"}]}}
Apparently, I can reach such endpoint from within the k8s master
@Bharat, that file seems to be properly conifugured to me as well. The problem pointed by "systemctl status heat-container-agent" is with:
Feb 20 09:33:23 kube-cluster-fed27-k5di3i7stgks-master-0.novalocal runc[2837]: publicURL endpoint for orchestration service in null region not found Feb 20 09:33:23 kube-cluster-fed27-k5di3i7stgks-master-0.novalocal runc[2837]: Source [heat] Unavailable. Feb 20 09:33:23 kube-cluster-fed27-k5di3i7stgks-master-0.novalocal runc[2837]: /var/lib/os-collect-config/local-data not found. Skipping Feb 20 09:33:53 kube-cluster-fed27-k5di3i7stgks-master-0.novalocal runc[2837]: publicURL endpoint for orchestration service in null region not found Feb 20 09:33:53 kube-cluster-fed27-k5di3i7stgks-master-0.novalocal runc[2837]: Source [heat] Unavailable. Feb 20 09:33:53 kube-cluster-fed27-k5di3i7stgks-master-0.novalocal runc[2837]: /var/lib/os-collect-config/local-data not found. Skipping
Still no way forward from my side.
/Giuseppe
On Tue, 19 Feb 2019 at 22:16, Bharat Kunwar bharat@stackhpc.com wrote: I have the same problem. Weird thing is /etc/sysconfig/heat-params has region_name specified in my case!
Sent from my iPhone
On 19 Feb 2019, at 22:00, Feilong Wang feilong@catalyst.net.nz wrote:
Can you talk to the Heat API from your master node?
On 20/02/19 6:43 AM, Giuseppe Sannino wrote: Hi all...again, I managed to get over the previous issue by "not disabling" the TLS in the cluster template. From the cloud-init-output.log I see: Cloud-init v. 17.1 running 'modules:final' at Tue, 19 Feb 2019 17:03:53 +0000. Up 38.08 seconds. Cloud-init v. 17.1 finished at Tue, 19 Feb 2019 17:13:22 +0000. Datasource DataSourceEc2. Up 607.13 seconds
But the cluster creation keeps on failing. From the journalctl -f I see a possible issue: Feb 19 17:42:38 kube-cluster-tls-6hezqcq4ien3-master-0.novalocal runc[2723]: publicURL endpoint for orchestration service in null region not found Feb 19 17:42:38 kube-cluster-tls-6hezqcq4ien3-master-0.novalocal runc[2723]: Source [heat] Unavailable. Feb 19 17:42:38 kube-cluster-tls-6hezqcq4ien3-master-0.novalocal runc[2723]: /var/lib/os-collect-config/local-data not found. Skipping
anyone familiar with this problem ?
Thanks as usual. /Giuseppe
-- Cheers & Best regards, Feilong Wang (王飞龙)
Senior Cloud Software Engineer Tel: +64-48032246 Email: flwang@catalyst.net.nz Catalyst IT Limited Level 6, Catalyst House, 150 Willis Street, Wellington
Hi, I think we've hit this, and John Garbutt has added the following configuration for Kolla Ansible in /etc/kolla/config/heat.conf:

[DEFAULT]
region_name_for_services=RegionOne

We'll need a patch in kolla ansible to do that without custom config changes.
Mark
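A minimal sketch of applying this override, assuming the usual Kolla Ansible behaviour where files under /etc/kolla/config are merged into the service configuration. The helper name `set_heat_region` is hypothetical, and the file path and region are passed in rather than hard-coded.

```shell
#!/bin/sh
# Hypothetical helper: write the region override described above into a
# Kolla Ansible custom config file.
set_heat_region() {
    conf="$1"      # e.g. /etc/kolla/config/heat.conf
    region="$2"    # e.g. RegionOne

    mkdir -p "$(dirname "$conf")"

    # Leave the file alone if the option is already present, so the helper
    # can be re-run safely.
    if [ -f "$conf" ] && grep -q '^region_name_for_services' "$conf"; then
        return 0
    fi

    # Assumption: the file does not already contain other sections, so
    # appending a [DEFAULT] header here is safe.
    printf '[DEFAULT]\nregion_name_for_services=%s\n' "$region" >> "$conf"
}
```

After something like `set_heat_region /etc/kolla/config/heat.conf RegionOne`, Heat still has to be reconfigured for the change to reach the containers, for example with `kolla-ansible -i <inventory> reconfigure --tags heat`, where the inventory path depends on the deployment.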
On Wed, 20 Feb 2019 at 11:05, Bharat Kunwar bharat@stackhpc.com wrote:
Hi Giuseppe,
What version of heat are you running?
Can you check if you have this patch merged? https://review.openstack.org/579485
Bharat
On 20 Feb 2019, at 10:38, Giuseppe Sannino km.giuseppesannino@gmail.com wrote:

Hi Feilong, Bharat,
thanks for your answers.

@Feilong,
From /etc/kolla/heat-engine/heat.conf I see:

[clients_keystone]
auth_uri = http://10.1.7.201:5000

This should map to auth_url within the k8s master.
Within the k8s master, in /etc/os-collect-config.conf, I see:

[heat]
auth_url = http://10.1.7.201:5000/v3/
:
:
resource_name = kube-master
region_name = null

and from /etc/sysconfig/heat-params (among others):
:
REGION_NAME="RegionOne"
:
AUTH_URL="http://10.1.7.201:5000/v3"

This URL corresponds to the "public" Heat endpoint:

openstack endpoint list | grep heat
| 3d5f58c43f6b44f6b54990d6fd9ff55d | RegionOne | heat     | orchestration  | True | internal | http://10.1.7.200:8004/v1/%(tenant_id)s |
| 8c2492cb0ddc48ca94942a4a299a88dc | RegionOne | heat-cfn | cloudformation | True | internal | http://10.1.7.200:8000/v1               |
| b164c4618a784da9ae14da75a6c764a3 | RegionOne | heat     | orchestration  | True | public   | http://10.1.7.201:8004/v1/%(tenant_id)s |
| da203f7d337b4587a0f5fc774c993390 | RegionOne | heat     | orchestration  | True | admin    | http://10.1.7.200:8004/v1/%(tenant_id)s |
| e0d3743e7c604e5c8aa4684df2d1ce53 | RegionOne | heat-cfn | cloudformation | True | public   | http://10.1.7.201:8000/v1               |
| efe0b8418aa24dfca33c243e7eed7e90 | RegionOne | heat-cfn | cloudformation | True | admin    | http://10.1.7.200:8000/v1               |

Connectivity tests:
[fedora@kube-cluster-fed27-k5di3i7stgks-master-0 ~]$ ping 10.1.7.201
PING 10.1.7.201 (10.1.7.201) 56(84) bytes of data.
64 bytes from 10.1.7.201: icmp_seq=1 ttl=63 time=0.285 ms

[fedora@kube-cluster-fed27-k5di3i7stgks-master-0 ~]$ curl http://10.1.7.201:5000/v3/
{"version": {"status": "stable", "updated": "2018-10-15T00:00:00Z", "media-types": [{"base": "application/json", "type": "application/vnd.openstack.identity-v3+json"}], "id": "v3.11", "links": [{"href": "http://10.1.7.201:5000/v3/", "rel": "self"}]}}

Apparently, I can reach this endpoint from within the k8s master.

@Bharat, that file seems to be properly configured to me as well.
The problem reported by "systemctl status heat-container-agent" is:

Feb 20 09:33:23 kube-cluster-fed27-k5di3i7stgks-master-0.novalocal runc[2837]: publicURL endpoint for orchestration service in null region not found
Feb 20 09:33:23 kube-cluster-fed27-k5di3i7stgks-master-0.novalocal runc[2837]: Source [heat] Unavailable.
Feb 20 09:33:23 kube-cluster-fed27-k5di3i7stgks-master-0.novalocal runc[2837]: /var/lib/os-collect-config/local-data not found. Skipping
Feb 20 09:33:53 kube-cluster-fed27-k5di3i7stgks-master-0.novalocal runc[2837]: publicURL endpoint for orchestration service in null region not found
Feb 20 09:33:53 kube-cluster-fed27-k5di3i7stgks-master-0.novalocal runc[2837]: Source [heat] Unavailable.
Feb 20 09:33:53 kube-cluster-fed27-k5di3i7stgks-master-0.novalocal runc[2837]: /var/lib/os-collect-config/local-data not found. Skipping

Still no way forward from my side.

/Giuseppe
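The mismatch described in this message (region_name = null in /etc/os-collect-config.conf while heat-params carries RegionOne) can be checked with a small script. This is only a sketch: the function name `check_region` is hypothetical, and it assumes a plain "key = value" layout like the excerpt quoted above.

```shell
#!/bin/sh
# Hypothetical check: print the region_name value from an INI-style file and
# fail if it is missing or the literal string "null".
check_region() {
    conf="$1"   # e.g. /etc/os-collect-config.conf on the k8s master

    # Take the first "region_name = value" line and drop the key part.
    # Longer keys such as region_name_for_services do not match.
    region=$(sed -n 's/^[[:space:]]*region_name[[:space:]]*=[[:space:]]*//p' "$conf" | head -n 1)
    # Strip trailing whitespace from the value.
    region=$(printf '%s' "$region" | sed 's/[[:space:]]*$//')

    if [ -z "$region" ] || [ "$region" = "null" ]; then
        echo "region_name is unset or null in $conf"
        return 1
    fi
    echo "region_name=$region"
}
```

Run on the master as `check_region /etc/os-collect-config.conf`. A non-zero exit status matches the "in null region not found" errors shown in the agent log.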
On Tue, 19 Feb 2019 at 22:16, Bharat Kunwar bharat@stackhpc.com wrote:
I have the same problem. Weird thing is /etc/sysconfig/heat-params has region_name specified in my case!
On 19 Feb 2019, at 22:00, Feilong Wang feilong@catalyst.net.nz wrote:
Can you talk to the Heat API from your master node?
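One way to answer this from the master node is to probe the Heat endpoint itself rather than Keystone. The sketch below assumes `curl` is available on the node; the function names `endpoint_root` and `probe_endpoint` are hypothetical, and the URL is whatever `openstack endpoint list` reports for the public orchestration endpoint.

```shell
#!/bin/sh
# Hypothetical helper: reduce a catalog URL such as
# http://10.1.7.201:8004/v1/%(tenant_id)s to its scheme://host:port/ root.
endpoint_root() {
    printf '%s\n' "$1" | sed -E 's#^(https?://[^/]+).*#\1/#'
}

# Probe the root URL and print the HTTP status code. curl prints 000 when no
# HTTP response was received (connection refused, timeout, DNS failure).
probe_endpoint() {
    curl --silent --output /dev/null --max-time 10 \
         --write-out '%{http_code}\n' "$(endpoint_root "$1")"
}
```

For example, `probe_endpoint 'http://10.1.7.201:8004/v1/%(tenant_id)s'` run on the master: any status other than 000 means something answered over HTTP at that address, so basic connectivity is not the problem.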
Ciao Mark, finally it works! Many many thanks! That was the missing piece of the puzzle.
Just FYI, from the systemctl status for the heat-container-agent I can still see these repetitive log messages:
:
Feb 21 08:00:40 kube-cluster-goddard-lq54faeabuhu-master-0.novalocal runc[2715]: /var/lib/os-collect-config/local-data not found. Skipping
Feb 21 08:01:11 kube-cluster-goddard-lq54faeabuhu-master-0.novalocal runc[2715]: /var/lib/os-collect-config/local-data not found. Skipping
:
This doesn't seem to harm the deployment but I will check further.
Thanks a lot to everyone!
/Giuseppe
On Wed, 20 Feb 2019 at 20:16, Mark Goddard mark@stackhpc.com wrote:
Hi, I think we've hit this, and John Garbutt has added the following configuration for Kolla Ansible in /etc/kolla/config/heat.conf:

[DEFAULT]
region_name_for_services=RegionOne

We'll need a patch in kolla ansible to do that without custom config changes.
Mark
Yes I’ve seen those messages too, I think it’s normal so wouldn’t worry too much. Glad this is sorted!
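To confirm that only this harmless notice is left in the agent log, the known message can be filtered out so anything else stands out. A small sketch, with a hypothetical function name; the unit name heat-container-agent is the one used earlier in the thread.

```shell
#!/bin/sh
# Hypothetical filter: read agent log lines on stdin and print only those
# that are NOT the "local-data not found. Skipping" notice.
drop_local_data_notice() {
    # grep exits non-zero when nothing is left to print; treat that as success.
    grep -v '/var/lib/os-collect-config/local-data not found\. Skipping' || true
}
```

Usage on the master: `sudo journalctl -u heat-container-agent --no-pager | drop_local_data_notice`. Empty output means the log holds nothing but that notice.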
On Thu, 21 Feb 2019 at 10:03, Giuseppe Sannino km.giuseppesannino@gmail.com wrote:
Ciao Mark, finally it works! Many many thanks! That was the missing piece of the puzzle.
Just FYI, from the systemctl status for the heat-container-agent I can still see these repetitive log messages:
:
Feb 21 08:00:40 kube-cluster-goddard-lq54faeabuhu-master-0.novalocal runc[2715]: /var/lib/os-collect-config/local-data not found. Skipping
Feb 21 08:01:11 kube-cluster-goddard-lq54faeabuhu-master-0.novalocal runc[2715]: /var/lib/os-collect-config/local-data not found. Skipping
:
This doesn't seem to harm the deployment but I will check further.
Thanks a lot to everyone!
/Giuseppe
Glad to hear it worked for you. I've raised a bug [1] and proposed a fix [2] in kolla ansible.
Mark

[1] https://bugs.launchpad.net/kolla-ansible/+bug/1817051
[2] https://review.openstack.org/638400
Sounds great!!
Thanks again !
/Giuseppe
On Thu, 21 Feb 2019 at 12:36, Mark Goddard mark@stackhpc.com wrote:
Glad to hear it worked for you. I've raised a bug [1] and proposed a fix [2] in kolla ansible. Mark
[1] https://bugs.launchpad.net/kolla-ansible/+bug/1817051 [2] https://review.openstack.org/638400
On 20/02/19 2:15 PM, Mark Goddard wrote:
Hi,
I think we've hit this, and John Garbutt has added the following configuration for Kolla Ansible in /etc/kolla/config/heat.conf:

[DEFAULT]
region_name_for_services=RegionOne

We'll need a patch in Kolla Ansible to do that without custom config changes.
Mark
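The override Mark describes could be applied idempotently with a short script. This is only a sketch under stated assumptions: the function name `ensure_region_override` is hypothetical, the default path and the `RegionOne` value are taken from the message above, and it assumes the existing file (if any) is plain INI that Python's `configparser` can parse. Rewriting the file this way drops any comments it contained.

```python
import configparser
from pathlib import Path


def ensure_region_override(path="/etc/kolla/config/heat.conf", region="RegionOne"):
    """Set [DEFAULT] region_name_for_services in a heat.conf override file.

    Hypothetical helper. Returns True if the file was written,
    False if the option already had the requested value.
    """
    conf_path = Path(path)
    # interpolation=None leaves any '%' characters in existing values untouched.
    parser = configparser.ConfigParser(interpolation=None)
    # Preserve option names exactly as written instead of lower-casing them.
    parser.optionxform = str
    if conf_path.exists():
        # The default strict mode raises on duplicate options rather than
        # silently dropping one of them.
        parser.read(str(conf_path))
    if parser.defaults().get("region_name_for_services") == region:
        return False
    parser["DEFAULT"]["region_name_for_services"] = region
    conf_path.parent.mkdir(parents=True, exist_ok=True)
    with conf_path.open("w") as handle:
        parser.write(handle)
    return True
```

The change only takes effect once Kolla Ansible redeploys the Heat configuration; its `reconfigure` command is the usual way to do that, though the exact invocation depends on the deployment.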
On Wed, 20 Feb 2019 at 11:05, Bharat Kunwar <bharat@stackhpc.com> wrote:
Hi Giuseppe,
What version of Heat are you running? Can you check whether you have this patch merged? https://review.openstack.org/579485
This patch caused a regression (in combination with corresponding patches to os-collect-config and heat-agents) due to weird things that happen in os-apply-config (https://bugs.launchpad.net/os-apply-config/+bug/1830967).
Details are here: https://storyboard.openstack.org/#!/story/2005797
I've proposed a fix, and once that merges the workaround suggested above will no longer be needed. (Although setting the region name explicitly is a Good Thing to do anyway.)
cheers, Zane.
Bharat
Sent from my iPhone
On 20 Feb 2019, at 10:38, Giuseppe Sannino <km.giuseppesannino@gmail.com> wrote:
Hi Feilong, Bharat,
thanks for your answer.

@Feilong,
From /etc/kolla/heat-engine/heat.conf I see:

[clients_keystone]
auth_uri = http://10.1.7.201:5000

This should map to auth_url within the k8s master.
Within the k8s master, in /etc/os-collect-config.conf I see:

[heat]
auth_url = http://10.1.7.201:5000/v3/
:
:
resource_name = kube-master
region_name = null

and from /etc/sysconfig/heat-params (among others):

:
REGION_NAME="RegionOne"
:
AUTH_URL="http://10.1.7.201:5000/v3"

This URL corresponds to the "public" Heat endpoint:

openstack endpoint list | grep heat
| 3d5f58c43f6b44f6b54990d6fd9ff55d | RegionOne | heat | orchestration | True | internal | http://10.1.7.200:8004/v1/%(tenant_id)s |
| 8c2492cb0ddc48ca94942a4a299a88dc | RegionOne | heat-cfn | cloudformation | True | internal | http://10.1.7.200:8000/v1 |
| b164c4618a784da9ae14da75a6c764a3 | RegionOne | heat | orchestration | True | public | http://10.1.7.201:8004/v1/%(tenant_id)s |
| da203f7d337b4587a0f5fc774c993390 | RegionOne | heat | orchestration | True | admin | http://10.1.7.200:8004/v1/%(tenant_id)s |
| e0d3743e7c604e5c8aa4684df2d1ce53 | RegionOne | heat-cfn | cloudformation | True | public | http://10.1.7.201:8000/v1 |
| efe0b8418aa24dfca33c243e7eed7e90 | RegionOne | heat-cfn | cloudformation | True | admin | http://10.1.7.200:8000/v1 |

Connectivity tests:

[fedora@kube-cluster-fed27-k5di3i7stgks-master-0 ~]$ ping 10.1.7.201
PING 10.1.7.201 (10.1.7.201) 56(84) bytes of data.
64 bytes from 10.1.7.201: icmp_seq=1 ttl=63 time=0.285 ms

[fedora@kube-cluster-fed27-k5di3i7stgks-master-0 ~]$ curl http://10.1.7.201:5000/v3/
{"version": {"status": "stable", "updated": "2018-10-15T00:00:00Z", "media-types": [{"base": "application/json", "type": "application/vnd.openstack.identity-v3+json"}], "id": "v3.11", "links": [{"href": "http://10.1.7.201:5000/v3/", "rel": "self"}]}}

Apparently, I can reach that endpoint from within the k8s master.

@Bharat, that file seems to be properly configured to me as well. The problem reported by "systemctl status heat-container-agent" is:

Feb 20 09:33:23 kube-cluster-fed27-k5di3i7stgks-master-0.novalocal runc[2837]: publicURL endpoint for orchestration service in null region not found
Feb 20 09:33:23 kube-cluster-fed27-k5di3i7stgks-master-0.novalocal runc[2837]: Source [heat] Unavailable.
Feb 20 09:33:23 kube-cluster-fed27-k5di3i7stgks-master-0.novalocal runc[2837]: /var/lib/os-collect-config/local-data not found. Skipping
Feb 20 09:33:53 kube-cluster-fed27-k5di3i7stgks-master-0.novalocal runc[2837]: publicURL endpoint for orchestration service in null region not found
Feb 20 09:33:53 kube-cluster-fed27-k5di3i7stgks-master-0.novalocal runc[2837]: Source [heat] Unavailable.
Feb 20 09:33:53 kube-cluster-fed27-k5di3i7stgks-master-0.novalocal runc[2837]: /var/lib/os-collect-config/local-data not found. Skipping

Still no way forward from my side.
/Giuseppe

On Tue, 19 Feb 2019 at 22:16, Bharat Kunwar <bharat@stackhpc.com> wrote:
I have the same problem. Weird thing is /etc/sysconfig/heat-params has region_name specified in my case!
Sent from my iPhone
On 19 Feb 2019, at 22:00, Feilong Wang <feilong@catalyst.net.nz> wrote:
Can you talk to the Heat API from your master node?
On 20/02/19 6:43 AM, Giuseppe Sannino wrote:
Hi all...again,
I managed to get past the previous issue by not disabling TLS in the cluster template.
From the cloud-init-output.log I see:

Cloud-init v. 17.1 running 'modules:final' at Tue, 19 Feb 2019 17:03:53 +0000. Up 38.08 seconds.
Cloud-init v. 17.1 finished at Tue, 19 Feb 2019 17:13:22 +0000. Datasource DataSourceEc2. Up 607.13 seconds

But the cluster creation keeps failing. From journalctl -f I see a possible issue:

Feb 19 17:42:38 kube-cluster-tls-6hezqcq4ien3-master-0.novalocal runc[2723]: publicURL endpoint for orchestration service in null region not found
Feb 19 17:42:38 kube-cluster-tls-6hezqcq4ien3-master-0.novalocal runc[2723]: Source [heat] Unavailable.
Feb 19 17:42:38 kube-cluster-tls-6hezqcq4ien3-master-0.novalocal runc[2723]: /var/lib/os-collect-config/local-data not found. Skipping

Is anyone familiar with this problem?
Thanks as usual.
/Giuseppe
--
Cheers & Best regards,
Feilong Wang (王飞龙)
--------------------------------------------------------------------------
Senior Cloud Software Engineer
Tel: +64-48032246
Email: flwang@catalyst.net.nz
Catalyst IT Limited
Level 6, Catalyst House, 150 Willis Street, Wellington
--------------------------------------------------------------------------
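The `region_name = null` entry that Giuseppe found in /etc/os-collect-config.conf on the master is the visible symptom of this problem, so it may be worth checking for it directly. Below is a minimal sketch, assuming the file is plain INI with a `[heat]` section as quoted in the thread; the function name `heat_region_is_unset` is hypothetical, and treating "none" or an empty value as unset is an assumption, not something reported in this thread.

```python
import configparser


def heat_region_is_unset(config_text):
    """Return True if the [heat] section has no usable region_name.

    Hypothetical helper: takes the text of os-collect-config.conf.
    """
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(config_text)
    # fallback="" also covers a missing [heat] section or a missing option.
    region = parser.get("heat", "region_name", fallback="").strip()
    # The thread shows the literal string "null"; the other two values
    # are treated as unset by assumption.
    return region.lower() in ("", "null", "none")


# Values copied from the excerpt quoted in the thread.
sample = """[heat]
auth_url = http://10.1.7.201:5000/v3/
resource_name = kube-master
region_name = null
"""
print(heat_region_is_unset(sample))  # prints True
```

On a real master node the text would come from reading /etc/os-collect-config.conf. A True result suggests the region is still not being passed through, so the heat.conf override discussed earlier in the thread is worth checking.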
participants (5)
- Bharat Kunwar
- Feilong Wang
- Giuseppe Sannino
- Mark Goddard
- Zane Bitter