[kolla][magnum] Cluster creation failed due to "Waiting for Kubernetes API..."


Giuseppe Sannino

19 Feb 2019

8:35 a.m.

Hi all,
I need some help. I deployed an AIO via Kolla on a bare-metal node. Here is some information about the deployment:
---------------
kolla-ansible: 7.0.1
openstack_release: Rocky
kolla_base_distro: centos
kolla_install_type: source
TLS: disabled
---------------

VMs spawn without issue, but I can't get the Kubernetes cluster creation to succeed. It fails with a timeout.

I managed to log into the Kubernetes master, and in cloud-init-output.log I can see:
+ echo 'Waiting for Kubernetes API...'
Waiting for Kubernetes API...
++ curl --silent http://127.0.0.1:8080/healthz
+ '[' ok = '' ']'
+ sleep 5

Checking via systemctl and journalctl I see:
[fedora@kube-clsuter-qamdealetlbi-master-0 log]$ systemctl status kube-apiserver
● kube-apiserver.service - kubernetes-apiserver
   Loaded: loaded (/etc/systemd/system/kube-apiserver.service; enabled; vendor preset: disabled)
   Active: failed (Result: exit-code) since Tue 2019-02-19 15:31:41 UTC; 45min ago
  Process: 3796 ExecStart=/usr/bin/runc --systemd-cgroup run kube-apiserver (code=exited, status=1/FAILURE)
 Main PID: 3796 (code=exited, status=1/FAILURE)

Feb 19 15:31:40 kube-clsuter-qamdealetlbi-master-0.novalocal systemd[1]: kube-apiserver.service: Main process exited, code=exited, status=1/FAILURE
Feb 19 15:31:40 kube-clsuter-qamdealetlbi-master-0.novalocal systemd[1]: kube-apiserver.service: Failed with result 'exit-code'.
Feb 19 15:31:41 kube-clsuter-qamdealetlbi-master-0.novalocal systemd[1]: kube-apiserver.service: Service RestartSec=100ms expired, scheduling restart.
Feb 19 15:31:41 kube-clsuter-qamdealetlbi-master-0.novalocal systemd[1]: kube-apiserver.service: Scheduled restart job, restart counter is at 6.
Feb 19 15:31:41 kube-clsuter-qamdealetlbi-master-0.novalocal systemd[1]: Stopped kubernetes-apiserver.
Feb 19 15:31:41 kube-clsuter-qamdealetlbi-master-0.novalocal systemd[1]: kube-apiserver.service: Start request repeated too quickly.
Feb 19 15:31:41 kube-clsuter-qamdealetlbi-master-0.novalocal systemd[1]: kube-apiserver.service: Failed with result 'exit-code'.
Feb 19 15:31:41 kube-clsuter-qamdealetlbi-master-0.novalocal systemd[1]: Failed to start kubernetes-apiserver.

[fedora@kube-clsuter-qamdealetlbi-master-0 log]$ sudo journalctl -u kube-apiserver
-- Logs begin at Tue 2019-02-19 15:21:36 UTC, end at Tue 2019-02-19 16:17:00 UTC. --
Feb 19 15:31:33 kube-clsuter-qamdealetlbi-master-0.novalocal systemd[1]: Started kubernetes-apiserver.
Feb 19 15:31:34 kube-clsuter-qamdealetlbi-master-0.novalocal runc[2794]: Flag --insecure-bind-address has been deprecated, This flag will be removed in a future version.
Feb 19 15:31:34 kube-clsuter-qamdealetlbi-master-0.novalocal runc[2794]: Flag --insecure-port has been deprecated, This flag will be removed in a future version.
Feb 19 15:31:35 kube-clsuter-qamdealetlbi-master-0.novalocal runc[2794]: Error: error creating self-signed certificates: open /var/run/kubernetes/apiserver.crt: permission denied
: : :
Feb 19 15:31:35 kube-clsuter-qamdealetlbi-master-0.novalocal runc[2794]: error: error creating self-signed certificates: open /var/run/kubernetes/apiserver.crt: permission denied
Feb 19 15:31:35 kube-clsuter-qamdealetlbi-master-0.novalocal systemd[1]: kube-apiserver.service: Main process exited, code=exited, status=1/FAILURE
Feb 19 15:31:35 kube-clsuter-qamdealetlbi-master-0.novalocal systemd[1]: kube-apiserver.service: Failed with result 'exit-code'.
Feb 19 15:31:35 kube-clsuter-qamdealetlbi-master-0.novalocal systemd[1]: kube-apiserver.service: Service RestartSec=100ms expired, scheduling restart.
Feb 19 15:31:35 kube-clsuter-qamdealetlbi-master-0.novalocal systemd[1]: kube-apiserver.service: Scheduled restart job, restart counter is at 1.

May I ask for help with this?

Many thanks
/Giuseppe
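The cloud-init trace above shows a probe of http://127.0.0.1:8080/healthz that retries every 5 seconds, which never succeeds once kube-apiserver has failed. A minimal sketch of the same probe with an upper bound, so that a dead service surfaces quickly, might look like this. The function name `wait_for_healthz` and the default attempt count and delay are illustrative assumptions, not part of the Magnum scripts.

```shell
#!/bin/sh
# Hypothetical helper: poll a health URL until it answers "ok" or give up.
# Usage: wait_for_healthz URL [ATTEMPTS] [DELAY_SECONDS]
wait_for_healthz() {
    url=$1
    attempts=${2:-12}
    delay=${3:-5}
    i=0
    while [ "$i" -lt "$attempts" ]; do
        # --silent matches the original probe; --max-time bounds a hung connection.
        if [ "$(curl --silent --max-time 5 "$url")" = "ok" ]; then
            echo "API healthy after $i retries"
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    echo "API still not healthy after $attempts attempts" >&2
    return 1
}
```

On the master this could be called as `wait_for_healthz http://127.0.0.1:8080/healthz`, followed by `sudo journalctl -u kube-apiserver` when it returns non-zero.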


Giuseppe Sannino

19 Feb 2019

9:43 a.m.

Hi all... again,
I managed to get past the previous issue by not disabling TLS in the cluster template.

From cloud-init-output.log I see:
Cloud-init v. 17.1 running 'modules:final' at Tue, 19 Feb 2019 17:03:53 +0000. Up 38.08 seconds.
Cloud-init v. 17.1 finished at Tue, 19 Feb 2019 17:13:22 +0000. Datasource DataSourceEc2. Up 607.13 seconds

But the cluster creation keeps failing.

From journalctl -f I see a possible issue:
Feb 19 17:42:38 kube-cluster-tls-6hezqcq4ien3-master-0.novalocal runc[2723]: publicURL endpoint for orchestration service in null region not found
Feb 19 17:42:38 kube-cluster-tls-6hezqcq4ien3-master-0.novalocal runc[2723]: Source [heat] Unavailable.
Feb 19 17:42:38 kube-cluster-tls-6hezqcq4ien3-master-0.novalocal runc[2723]: /var/lib/os-collect-config/local-data not found. Skipping

Is anyone familiar with this problem?

Thanks as usual.
/Giuseppe

Feilong Wang

1 p.m.

Can you talk to the Heat API from your master node?

--
Cheers & Best regards,
Feilong Wang (王飞龙)
--------------------------------------------------------------------------
Senior Cloud Software Engineer
Tel: +64-48032246
Email: flwang@catalyst.net.nz
Catalyst IT Limited
Level 6, Catalyst House, 150 Willis Street, Wellington
--------------------------------------------------------------------------
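One way to answer the question about reaching the Heat API is to probe the orchestration endpoint itself, since the agent's error names the orchestration service rather than Keystone. A rough sketch, assuming curl is available on the master: the function name `probe_endpoint` is hypothetical, and the URL has to come from your own `openstack endpoint list` output.

```shell
# Hypothetical helper: print the HTTP status code an endpoint returns,
# or fail if no HTTP response arrives within the time limit.
# Usage: probe_endpoint URL
probe_endpoint() {
    # -o /dev/null discards the body; -w '%{http_code}' prints only the status.
    # If curl itself fails, fall back to 000 (curl's code for "no response").
    code=$(curl --silent --max-time 10 -o /dev/null -w '%{http_code}' "$1") || code=000
    if [ -z "$code" ] || [ "$code" = "000" ]; then
        echo "no HTTP response from $1" >&2
        return 1
    fi
    echo "$1 answered with HTTP $code"
}
```

Any HTTP status code, including an authentication error, shows the API is reachable at the network level; a failure here points at routing or firewall rules rather than at the agent's configuration.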

Bharat Kunwar

1:15 p.m.

I have the same problem. The weird thing is that /etc/sysconfig/heat-params has region_name specified in my case!

Giuseppe Sannino

20 Feb 2019

1:38 a.m.

Hi Feilong, Bharat,
thanks for your answers.

@Feilong,
From /etc/kolla/heat-engine/heat.conf I see:
[clients_keystone]
auth_uri = http://10.1.7.201:5000

This should map to auth_url within the k8s master. Within the k8s master, in /etc/os-collect-config.conf I see:

[heat]
auth_url = http://10.1.7.201:5000/v3/
:
:
resource_name = kube-master
region_name = null

and from /etc/sysconfig/heat-params (among others):
:
REGION_NAME="RegionOne"
:
AUTH_URL="http://10.1.7.201:5000/v3"

This URL is on the same address (10.1.7.201) as the "public" Heat endpoint:
openstack endpoint list | grep heat
| 3d5f58c43f6b44f6b54990d6fd9ff55d | RegionOne | heat     | orchestration  | True | internal | http://10.1.7.200:8004/v1/%(tenant_id)s |
| 8c2492cb0ddc48ca94942a4a299a88dc | RegionOne | heat-cfn | cloudformation | True | internal | http://10.1.7.200:8000/v1               |
| b164c4618a784da9ae14da75a6c764a3 | RegionOne | heat     | orchestration  | True | public   | http://10.1.7.201:8004/v1/%(tenant_id)s |
| da203f7d337b4587a0f5fc774c993390 | RegionOne | heat     | orchestration  | True | admin    | http://10.1.7.200:8004/v1/%(tenant_id)s |
| e0d3743e7c604e5c8aa4684df2d1ce53 | RegionOne | heat-cfn | cloudformation | True | public   | http://10.1.7.201:8000/v1               |
| efe0b8418aa24dfca33c243e7eed7e90 | RegionOne | heat-cfn | cloudformation | True | admin    | http://10.1.7.200:8000/v1               |

Connectivity tests:
[fedora@kube-cluster-fed27-k5di3i7stgks-master-0 ~]$ ping 10.1.7.201
PING 10.1.7.201 (10.1.7.201) 56(84) bytes of data.
64 bytes from 10.1.7.201: icmp_seq=1 ttl=63 time=0.285 ms

[fedora@kube-cluster-fed27-k5di3i7stgks-master-0 ~]$ curl http://10.1.7.201:5000/v3/
{"version": {"status": "stable", "updated": "2018-10-15T00:00:00Z", "media-types": [{"base": "application/json", "type": "application/vnd.openstack.identity-v3+json"}], "id": "v3.11", "links": [{"href": "http://10.1.7.201:5000/v3/", "rel": "self"}]}}

Apparently, I can reach that endpoint from within the k8s master.

@Bharat, that file seems to be properly configured to me as well. The problem reported by "systemctl status heat-container-agent" is:

Feb 20 09:33:23 kube-cluster-fed27-k5di3i7stgks-master-0.novalocal runc[2837]: publicURL endpoint for orchestration service in null region not found
Feb 20 09:33:23 kube-cluster-fed27-k5di3i7stgks-master-0.novalocal runc[2837]: Source [heat] Unavailable.
Feb 20 09:33:23 kube-cluster-fed27-k5di3i7stgks-master-0.novalocal runc[2837]: /var/lib/os-collect-config/local-data not found. Skipping
Feb 20 09:33:53 kube-cluster-fed27-k5di3i7stgks-master-0.novalocal runc[2837]: publicURL endpoint for orchestration service in null region not found
Feb 20 09:33:53 kube-cluster-fed27-k5di3i7stgks-master-0.novalocal runc[2837]: Source [heat] Unavailable.
Feb 20 09:33:53 kube-cluster-fed27-k5di3i7stgks-master-0.novalocal runc[2837]: /var/lib/os-collect-config/local-data not found. Skipping

Still no way forward from my side.

/Giuseppe
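The thread reports region_name = null in /etc/os-collect-config.conf next to REGION_NAME="RegionOne" in /etc/sysconfig/heat-params, and the agent's error mentions a "null region". A small sketch for comparing the two values on a master node follows. The function names are hypothetical, and the parsing assumes the files look like the excerpts quoted in the thread.

```shell
# Hypothetical helpers: extract the region from each file and compare them.

# Prints the value of "region_name = ..." from an INI-style file.
collect_config_region() {
    sed -n 's/^[[:space:]]*region_name[[:space:]]*=[[:space:]]*//p' "$1" | head -n 1
}

# Prints the value of REGION_NAME="..." from a shell-style params file,
# with any double quotes stripped.
heat_params_region() {
    sed -n 's/^REGION_NAME=//p' "$1" | tr -d '"' | head -n 1
}

# Usage: compare_regions COLLECT_CONFIG_FILE HEAT_PARAMS_FILE
compare_regions() {
    agent=$(collect_config_region "$1")
    params=$(heat_params_region "$2")
    if [ "$agent" = "$params" ]; then
        echo "regions match: $agent"
    else
        echo "mismatch: agent config has '$agent', heat-params has '$params'"
        return 1
    fi
}
```

Called as `compare_regions /etc/os-collect-config.conf /etc/sysconfig/heat-params`, this reports a mismatch for the values quoted in the thread (null versus RegionOne).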

Bharat Kunwar

3:04 a.m.

Hi Giuseppe, What version of heat are you running? Can you check if you have this patch merged? https://review.openstack.org/579485 https://review.openstack.org/579485 Bharat Sent from my iPhone

...

On 20 Feb 2019, at 10:38, Giuseppe Sannino <km.giuseppesannino@gmail.com> wrote:

Hi Feilong, Bharat, thanks for your answer.

@Feilong, From /etc/kolla/heat-engine/heat.conf I see: [clients_keystone] auth_uri = http://10.1.7.201:5000

This should map into auth_url within the k8s master. Within the k8s master in /etc/os-collect-config.conf I see:

[heat] auth_url = http://10.1.7.201:5000/v3/ : : resource_name = kube-master region_name = null

and from /etc/sysconfig/heat-params (among the others): : REGION_NAME="RegionOne" : AUTH_URL="http://10.1.7.201:5000/v3"

This URL corresponds to the "public" Heat endpoint openstack endpoint list | grep heat | 3d5f58c43f6b44f6b54990d6fd9ff55d | RegionOne | heat | orchestration | True | internal | http://10.1.7.200:8004/v1/%(tenant_id)s | | 8c2492cb0ddc48ca94942a4a299a88dc | RegionOne | heat-cfn | cloudformation | True | internal | http://10.1.7.200:8000/v1 | | b164c4618a784da9ae14da75a6c764a3 | RegionOne | heat | orchestration | True | public | http://10.1.7.201:8004/v1/%(tenant_id)s | | da203f7d337b4587a0f5fc774c993390 | RegionOne | heat | orchestration | True | admin | http://10.1.7.200:8004/v1/%(tenant_id)s | | e0d3743e7c604e5c8aa4684df2d1ce53 | RegionOne | heat-cfn | cloudformation | True | public | http://10.1.7.201:8000/v1 | | efe0b8418aa24dfca33c243e7eed7e90 | RegionOne | heat-cfn | cloudformation | True | admin | http://10.1.7.200:8000/v1 |

Connectivity tests: [fedora@kube-cluster-fed27-k5di3i7stgks-master-0 ~]$ ping 10.1.7.201 PING 10.1.7.201 (10.1.7.201) 56(84) bytes of data. 64 bytes from 10.1.7.201: icmp_seq=1 ttl=63 time=0.285 ms

[fedora@kube-cluster-fed27-k5di3i7stgks-master-0 ~]$ curl http://10.1.7.201:5000/v3/ {"version": {"status": "stable", "updated": "2018-10-15T00:00:00Z", "media-types": [{"base": "application/json", "type": "application/vnd.openstack.identity-v3+json"}], "id": "v3.11", "links": [{"href": "http://10.1.7.201:5000/v3/", "rel": "self"}]}}

Apparently, I can reach such endpoint from within the k8s master

@Bharat, that file seems to be properly conifugured to me as well. The problem pointed by "systemctl status heat-container-agent" is with:

Feb 20 09:33:23 kube-cluster-fed27-k5di3i7stgks-master-0.novalocal runc[2837]: publicURL endpoint for orchestration service in null region not found Feb 20 09:33:23 kube-cluster-fed27-k5di3i7stgks-master-0.novalocal runc[2837]: Source [heat] Unavailable. Feb 20 09:33:23 kube-cluster-fed27-k5di3i7stgks-master-0.novalocal runc[2837]: /var/lib/os-collect-config/local-data not found. Skipping Feb 20 09:33:53 kube-cluster-fed27-k5di3i7stgks-master-0.novalocal runc[2837]: publicURL endpoint for orchestration service in null region not found Feb 20 09:33:53 kube-cluster-fed27-k5di3i7stgks-master-0.novalocal runc[2837]: Source [heat] Unavailable. Feb 20 09:33:53 kube-cluster-fed27-k5di3i7stgks-master-0.novalocal runc[2837]: /var/lib/os-collect-config/local-data not found. Skipping

Still no way forward from my side.

/Giuseppe

...
On Tue, 19 Feb 2019 at 22:16, Bharat Kunwar <bharat@stackhpc.com> wrote: I have the same problem. Weird thing is /etc/sysconfig/heat-params has region_name specified in my case!

Sent from my iPhone

...
On 19 Feb 2019, at 22:00, Feilong Wang <feilong@catalyst.net.nz> wrote:

Can you talk to the Heat API from your master node?

...
On 20/02/19 6:43 AM, Giuseppe Sannino wrote:

Hi all...again,
I managed to get past the previous issue by not disabling TLS in the cluster template. From the cloud-init-output.log I see:

Cloud-init v. 17.1 running 'modules:final' at Tue, 19 Feb 2019 17:03:53 +0000. Up 38.08 seconds.
Cloud-init v. 17.1 finished at Tue, 19 Feb 2019 17:13:22 +0000. Datasource DataSourceEc2. Up 607.13 seconds

But the cluster creation keeps failing. From journalctl -f I see a possible issue:

Feb 19 17:42:38 kube-cluster-tls-6hezqcq4ien3-master-0.novalocal runc[2723]: publicURL endpoint for orchestration service in null region not found
Feb 19 17:42:38 kube-cluster-tls-6hezqcq4ien3-master-0.novalocal runc[2723]: Source [heat] Unavailable.
Feb 19 17:42:38 kube-cluster-tls-6hezqcq4ien3-master-0.novalocal runc[2723]: /var/lib/os-collect-config/local-data not found. Skipping

Is anyone familiar with this problem?

Thanks as usual. /Giuseppe

...

--
Cheers & Best regards,
Feilong Wang (王飞龙)
Senior Cloud Software Engineer
Tel: +64-48032246
Email: flwang@catalyst.net.nz
Catalyst IT Limited
Level 6, Catalyst House, 150 Willis Street, Wellington

Mark Goddard

11:15 a.m.

Hi,

I think we've hit this, and John Garbutt has added the following configuration for Kolla Ansible in /etc/kolla/config/heat.conf:

[DEFAULT]
region_name_for_services=RegionOne

We'll need a patch in kolla ansible to do that without custom config changes.

Mark

On Wed, 20 Feb 2019 at 11:05, Bharat Kunwar <bharat@stackhpc.com> wrote:

...

Hi Giuseppe,

What version of heat are you running?

Can you check if you have this patch merged? https://review.openstack.org/579485

Bharat

Sent from my iPhone

On 20 Feb 2019, at 10:38, Giuseppe Sannino <km.giuseppesannino@gmail.com> wrote:

Hi Feilong, Bharat, thanks for your answer.

@Feilong, from /etc/kolla/heat-engine/heat.conf I see:

[clients_keystone]
auth_uri = http://10.1.7.201:5000

This should map into auth_url within the k8s master. Within the k8s master, in /etc/os-collect-config.conf I see:

[heat]
auth_url = http://10.1.7.201:5000/v3/
:
:
resource_name = kube-master
region_name = null

and from /etc/sysconfig/heat-params (among others):

:
REGION_NAME="RegionOne"
:
AUTH_URL="http://10.1.7.201:5000/v3"

This URL corresponds to the "public" Heat endpoint:

openstack endpoint list | grep heat
| 3d5f58c43f6b44f6b54990d6fd9ff55d | RegionOne | heat | orchestration | True | internal | http://10.1.7.200:8004/v1/%(tenant_id)s |
| 8c2492cb0ddc48ca94942a4a299a88dc | RegionOne | heat-cfn | cloudformation | True | internal | http://10.1.7.200:8000/v1 |
| b164c4618a784da9ae14da75a6c764a3 | RegionOne | heat | orchestration | True | public | http://10.1.7.201:8004/v1/%(tenant_id)s |
| da203f7d337b4587a0f5fc774c993390 | RegionOne | heat | orchestration | True | admin | http://10.1.7.200:8004/v1/%(tenant_id)s |
| e0d3743e7c604e5c8aa4684df2d1ce53 | RegionOne | heat-cfn | cloudformation | True | public | http://10.1.7.201:8000/v1 |
| efe0b8418aa24dfca33c243e7eed7e90 | RegionOne | heat-cfn | cloudformation | True | admin | http://10.1.7.200:8000/v1 |

Connectivity tests:

[fedora@kube-cluster-fed27-k5di3i7stgks-master-0 ~]$ ping 10.1.7.201
PING 10.1.7.201 (10.1.7.201) 56(84) bytes of data.
64 bytes from 10.1.7.201: icmp_seq=1 ttl=63 time=0.285 ms

[fedora@kube-cluster-fed27-k5di3i7stgks-master-0 ~]$ curl http://10.1.7.201:5000/v3/
{"version": {"status": "stable", "updated": "2018-10-15T00:00:00Z", "media-types": [{"base": "application/json", "type": "application/vnd.openstack.identity-v3+json"}], "id": "v3.11", "links": [{"href": "http://10.1.7.201:5000/v3/", "rel": "self"}]}}

Apparently I can reach that endpoint from within the k8s master.

@Bharat, that file seems to be properly configured to me as well. The problem reported by "systemctl status heat-container-agent" is:

Feb 20 09:33:23 kube-cluster-fed27-k5di3i7stgks-master-0.novalocal runc[2837]: publicURL endpoint for orchestration service in null region not found
Feb 20 09:33:23 kube-cluster-fed27-k5di3i7stgks-master-0.novalocal runc[2837]: Source [heat] Unavailable.
Feb 20 09:33:23 kube-cluster-fed27-k5di3i7stgks-master-0.novalocal runc[2837]: /var/lib/os-collect-config/local-data not found. Skipping
Feb 20 09:33:53 kube-cluster-fed27-k5di3i7stgks-master-0.novalocal runc[2837]: publicURL endpoint for orchestration service in null region not found
Feb 20 09:33:53 kube-cluster-fed27-k5di3i7stgks-master-0.novalocal runc[2837]: Source [heat] Unavailable.
Feb 20 09:33:53 kube-cluster-fed27-k5di3i7stgks-master-0.novalocal runc[2837]: /var/lib/os-collect-config/local-data not found. Skipping
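
The mismatch described in this message (region_name = null in /etc/os-collect-config.conf while /etc/sysconfig/heat-params carries REGION_NAME="RegionOne") can be checked with a few lines of Python. This is only a sketch: the function name find_null_region and the inline sample are made up for illustration, and it assumes the file is plain INI as quoted in this thread.

```python
import configparser

# Sample mirroring the [heat] section quoted in this thread (other keys omitted).
SAMPLE = """
[heat]
auth_url = http://10.1.7.201:5000/v3/
resource_name = kube-master
region_name = null
"""


def find_null_region(text):
    """Return True if [heat] region_name is missing, empty, or the literal 'null'."""
    # interpolation=None so that '%' characters in values are left untouched.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    # fallback covers both a missing [heat] section and a missing option.
    region = parser.get("heat", "region_name", fallback="").strip()
    return region.lower() in ("", "null")


if __name__ == "__main__":
    # In a real check the text would come from /etc/os-collect-config.conf.
    print(find_null_region(SAMPLE))
```

A True result would match the "null region" message in the heat-container-agent log above.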

Still no way forward from my side.

/Giuseppe

On Tue, 19 Feb 2019 at 22:16, Bharat Kunwar <bharat@stackhpc.com> wrote:

...

Giuseppe Sannino

21 Feb

2:03 a.m.

Ciao Mark,
finally it works! Many, many thanks! That was the missing piece of the puzzle.

Just FYI, in the systemctl status output for the heat-container-agent I can still see these repetitive log lines:

:
Feb 21 08:00:40 kube-cluster-goddard-lq54faeabuhu-master-0.novalocal runc[2715]: /var/lib/os-collect-config/local-data not found. Skipping
Feb 21 08:01:11 kube-cluster-goddard-lq54faeabuhu-master-0.novalocal runc[2715]: /var/lib/os-collect-config/local-data not found. Skipping
:

This doesn't seem to harm the deployment, but I will check further.

Thanks a lot to everyone!

/Giuseppe

On Wed, 20 Feb 2019 at 20:16, Mark Goddard <mark@stackhpc.com> wrote:

...

Bharat Kunwar

3:03 a.m.

Yes, I've seen those messages too. I think it's normal, so I wouldn't worry too much. Glad this is sorted!

Sent from my iPhone

On 21 Feb 2019, at 11:03, Giuseppe Sannino <km.giuseppesannino@gmail.com> wrote:

...

Mark Goddard

3:36 a.m.

On Thu, 21 Feb 2019 at 10:03, Giuseppe Sannino <km.giuseppesannino@gmail.com> wrote:

...

Ciao Mark, finally it works! Many many thanks! That was the missing piece of the puzzle.

Just FYI, in the systemctl status output for heat-container-agent I can still see these repeated log lines:
:
Feb 21 08:00:40 kube-cluster-goddard-lq54faeabuhu-master-0.novalocal runc[2715]: /var/lib/os-collect-config/local-data not found. Skipping
Feb 21 08:01:11 kube-cluster-goddard-lq54faeabuhu-master-0.novalocal runc[2715]: /var/lib/os-collect-config/local-data not found. Skipping
:

This doesn't seem to harm the deployment but I will check further.

Thanks a lot to everyone!

/Giuseppe

Glad to hear it worked for you. I've raised a bug [1] and proposed a fix [2] in kolla ansible.

Mark

[1] https://bugs.launchpad.net/kolla-ansible/+bug/1817051
[2] https://review.openstack.org/638400

Giuseppe Sannino

8:25 a.m.

Sounds great! Thanks again!

/Giuseppe

On Thu, 21 Feb 2019 at 12:36, Mark Goddard <mark@stackhpc.com> wrote:

...


Zane Bitter

29 May

1:35 p.m.

On 20/02/19 2:15 PM, Mark Goddard wrote:

...

Hi, I think we've hit this, and John Garbutt has added the following configuration for Kolla Ansible in /etc/kolla/config/heat.conf:

[DEFAULT]
region_name_for_services=RegionOne

We'll need a patch in kolla ansible to do that without custom config changes.

Mark
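As a quick sanity check on the workaround above, the following is a minimal sketch that reads an override file's text and reports the value of `region_name_for_services`. The helper name `region_for_services` is hypothetical, and the sketch assumes the override is plain INI that Python's `configparser` can read; it does not talk to Heat or Kolla Ansible.

```python
import configparser


def region_for_services(heat_conf_text):
    """Return region_name_for_services from [DEFAULT], or None if it is unset.

    Hypothetical helper for illustration; it only parses text.
    """
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(heat_conf_text)
    # Options under [DEFAULT] are exposed through defaults().
    return parser.defaults().get("region_name_for_services")


# The override quoted in this thread.
OVERRIDE = """
[DEFAULT]
region_name_for_services=RegionOne
"""

print(region_for_services(OVERRIDE))
```

On a real deployment the text would come from /etc/kolla/config/heat.conf; a `None` result would suggest the override is missing.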

On Wed, 20 Feb 2019 at 11:05, Bharat Kunwar <bharat@stackhpc.com> wrote:

Hi Giuseppe,

What version of heat are you running?

Can you check if you have this patch merged? https://review.openstack.org/579485


This patch caused a regression (in combination with corresponding patches to os-collect-config and heat-agents) due to weird things that happen in os-apply-config (https://bugs.launchpad.net/os-apply-config/+bug/1830967). Details are here:

https://storyboard.openstack.org/#!/story/2005797

I've proposed a fix, and once that merges the workaround suggested above will no longer be needed. (Although setting the region name explicitly is a Good Thing to do anyway.)

cheers,
Zane.

...

Bharat

Sent from my iPhone

On 20 Feb 2019, at 10:38, Giuseppe Sannino <km.giuseppesannino@gmail.com> wrote:

...
Hi Feilong, Bharat, thanks for your answer.

@Feilong,
From /etc/kolla/heat-engine/heat.conf I see:
[clients_keystone]
auth_uri = http://10.1.7.201:5000

This should map to auth_url within the k8s master. Within the k8s master, in /etc/os-collect-config.conf, I see:

[heat]
auth_url = http://10.1.7.201:5000/v3/
:
:
resource_name = kube-master
region_name = null

and from /etc/sysconfig/heat-params (among others):
:
REGION_NAME="RegionOne"
:
AUTH_URL="http://10.1.7.201:5000/v3"
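The two excerpts above disagree on the region: os-collect-config.conf carries `region_name = null` while heat-params carries `REGION_NAME="RegionOne"`. The following is a minimal sketch of how that mismatch could be detected from the file contents. The helper names are hypothetical, the sample strings omit the `:` elision markers from the excerpts, and heat-params is assumed to be simple `KEY="value"` lines.

```python
import configparser
import shlex


def parse_heat_params(text):
    """Parse shell-style KEY="value" lines into a dict (assumed file format)."""
    params = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, raw = line.partition("=")
        tokens = shlex.split(raw)  # strips the surrounding quotes
        params[key] = tokens[0] if tokens else ""
    return params


def collector_region(text):
    """Return region_name from the [heat] section, or None if it is absent."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    return parser.get("heat", "region_name", fallback=None)


def region_mismatch(collect_conf, heat_params):
    """Return (mismatch, collector_value, expected_value)."""
    expected = parse_heat_params(heat_params).get("REGION_NAME")
    actual = collector_region(collect_conf)
    return actual != expected, actual, expected


# Values copied from the excerpts in this message.
COLLECT_CONF = """
[heat]
auth_url = http://10.1.7.201:5000/v3/
resource_name = kube-master
region_name = null
"""
HEAT_PARAMS = '''
REGION_NAME="RegionOne"
AUTH_URL="http://10.1.7.201:5000/v3"
'''

print(region_mismatch(COLLECT_CONF, HEAT_PARAMS))
```

Note that `null` is read here as a literal string, which matches the "null region" wording in the heat-container-agent log.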

This URL corresponds to the "public" Heat endpoint

openstack endpoint list | grep heat
| 3d5f58c43f6b44f6b54990d6fd9ff55d | RegionOne | heat     | orchestration  | True | internal | http://10.1.7.200:8004/v1/%(tenant_id)s |
| 8c2492cb0ddc48ca94942a4a299a88dc | RegionOne | heat-cfn | cloudformation | True | internal | http://10.1.7.200:8000/v1               |
| b164c4618a784da9ae14da75a6c764a3 | RegionOne | heat     | orchestration  | True | public   | http://10.1.7.201:8004/v1/%(tenant_id)s |
| da203f7d337b4587a0f5fc774c993390 | RegionOne | heat     | orchestration  | True | admin    | http://10.1.7.200:8004/v1/%(tenant_id)s |
| e0d3743e7c604e5c8aa4684df2d1ce53 | RegionOne | heat-cfn | cloudformation | True | public   | http://10.1.7.201:8000/v1               |
| efe0b8418aa24dfca33c243e7eed7e90 | RegionOne | heat-cfn | cloudformation | True | admin    | http://10.1.7.200:8000/v1               |
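To illustrate why a region of `null` would find nothing in this listing, here is a simplified sketch of a catalog lookup keyed on region, service type and interface. It is a toy model under stated assumptions, not the actual lookup code used by os-collect-config or keystoneauth; the tuple layout and the `find_endpoint` name are hypothetical.

```python
# (region, service type, interface, URL) rows taken from the listing above.
ENDPOINTS = [
    ("RegionOne", "orchestration", "internal", "http://10.1.7.200:8004/v1/%(tenant_id)s"),
    ("RegionOne", "cloudformation", "internal", "http://10.1.7.200:8000/v1"),
    ("RegionOne", "orchestration", "public", "http://10.1.7.201:8004/v1/%(tenant_id)s"),
    ("RegionOne", "orchestration", "admin", "http://10.1.7.200:8004/v1/%(tenant_id)s"),
    ("RegionOne", "cloudformation", "public", "http://10.1.7.201:8000/v1"),
    ("RegionOne", "cloudformation", "admin", "http://10.1.7.200:8000/v1"),
]


def find_endpoint(endpoints, service_type, interface, region):
    """Return the first URL matching all three keys, or None if nothing matches."""
    for ep_region, ep_type, ep_interface, url in endpoints:
        if (ep_region, ep_type, ep_interface) == (region, service_type, interface):
            return url
    return None


# The region the catalog actually contains.
print(find_endpoint(ENDPOINTS, "orchestration", "public", "RegionOne"))
# The region string that ended up in os-collect-config.conf.
print(find_endpoint(ENDPOINTS, "orchestration", "public", "null"))
```

Every row is registered under RegionOne, so a request for the public orchestration endpoint in a region literally named `null` comes back empty even though the endpoint is reachable.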

Connectivity tests:
[fedora@kube-cluster-fed27-k5di3i7stgks-master-0 ~]$ ping 10.1.7.201
PING 10.1.7.201 (10.1.7.201) 56(84) bytes of data.
64 bytes from 10.1.7.201: icmp_seq=1 ttl=63 time=0.285 ms

[fedora@kube-cluster-fed27-k5di3i7stgks-master-0 ~]$ curl http://10.1.7.201:5000/v3/
{"version": {"status": "stable", "updated": "2018-10-15T00:00:00Z", "media-types": [{"base": "application/json", "type": "application/vnd.openstack.identity-v3+json"}], "id": "v3.11", "links": [{"href": "http://10.1.7.201:5000/v3/", "rel": "self"}]}}

So, apparently, I can reach that endpoint from within the k8s master.

@Bharat, that file seems to be properly configured to me as well. The problem reported by "systemctl status heat-container-agent" is:

Feb 20 09:33:23 kube-cluster-fed27-k5di3i7stgks-master-0.novalocal runc[2837]: publicURL endpoint for orchestration service in null region not found
Feb 20 09:33:23 kube-cluster-fed27-k5di3i7stgks-master-0.novalocal runc[2837]: Source [heat] Unavailable.
Feb 20 09:33:23 kube-cluster-fed27-k5di3i7stgks-master-0.novalocal runc[2837]: /var/lib/os-collect-config/local-data not found. Skipping
Feb 20 09:33:53 kube-cluster-fed27-k5di3i7stgks-master-0.novalocal runc[2837]: publicURL endpoint for orchestration service in null region not found
Feb 20 09:33:53 kube-cluster-fed27-k5di3i7stgks-master-0.novalocal runc[2837]: Source [heat] Unavailable.
Feb 20 09:33:53 kube-cluster-fed27-k5di3i7stgks-master-0.novalocal runc[2837]: /var/lib/os-collect-config/local-data not found. Skipping

Still no way forward from my side.

/Giuseppe

On Tue, 19 Feb 2019 at 22:16, Bharat Kunwar <bharat@stackhpc.com> wrote:

I have the same problem. The weird thing is that /etc/sysconfig/heat-params has region_name specified in my case!

Sent from my iPhone

On 19 Feb 2019, at 22:00, Feilong Wang <feilong@catalyst.net.nz> wrote:

...
Can you talk to the Heat API from your master node?

On 20/02/19 6:43 AM, Giuseppe Sannino wrote:

...
Hi all...again,
I managed to get over the previous issue by "not disabling" the TLS in the cluster template. From the cloud-init-output.log I see:
Cloud-init v. 17.1 running 'modules:final' at Tue, 19 Feb 2019 17:03:53 +0000. Up 38.08 seconds.
Cloud-init v. 17.1 finished at Tue, 19 Feb 2019 17:13:22 +0000. Datasource DataSourceEc2. Up 607.13 seconds

But the cluster creation keeps on failing. From journalctl -f I see a possible issue:
Feb 19 17:42:38 kube-cluster-tls-6hezqcq4ien3-master-0.novalocal runc[2723]: publicURL endpoint for orchestration service in null region not found
Feb 19 17:42:38 kube-cluster-tls-6hezqcq4ien3-master-0.novalocal runc[2723]: Source [heat] Unavailable.
Feb 19 17:42:38 kube-cluster-tls-6hezqcq4ien3-master-0.novalocal runc[2723]: /var/lib/os-collect-config/local-data not found. Skipping

Is anyone familiar with this problem?

Thanks as usual.
/Giuseppe

On Tue, 19 Feb 2019 at 17:35, Giuseppe Sannino <km.giuseppesannino@gmail.com> wrote:

Hi all, I need some help.
I deployed an AIO via Kolla on a baremetal node. Here is some information about the deployment:
---------------
kolla-ansible: 7.0.1
openstack_release: Rocky
kolla_base_distro: centos
kolla_install_type: source
TLS: disabled
---------------

VMs spawn without issue, but I can't get the Kubernetes cluster creation to succeed. It fails with a "Time out".

I managed to log into the Kubernetes master, and from the cloud-init-output.log I can see:
+ echo 'Waiting for Kubernetes API...'
Waiting for Kubernetes API...
++ curl --silent http://127.0.0.1:8080/healthz
+ '[' ok = '' ']'
+ sleep 5
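The trace above shows the provisioning script polling http://127.0.0.1:8080/healthz, comparing the reply with `ok`, and sleeping 5 seconds between attempts. The following is a minimal sketch of the same polling pattern, not the script Magnum ships. The explicit deadline, the function names and the injectable `probe`, `sleep` and `clock` parameters are assumptions added for illustration and testing.

```python
import time
import urllib.error
import urllib.request


def healthz_ok(url="http://127.0.0.1:8080/healthz", timeout=2.0):
    """Return True when the endpoint replies with the body 'ok'."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.read().decode().strip() == "ok"
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout and similar: treat as "not ready yet".
        return False


def wait_for_api(probe=healthz_ok, interval=5.0, deadline=600.0,
                 sleep=time.sleep, clock=time.monotonic):
    """Poll probe() every `interval` seconds until it succeeds or time runs out.

    Returns True on success and False once `deadline` seconds have elapsed.
    """
    start = clock()
    while True:
        if probe():
            return True
        if clock() - start >= deadline:
            return False
        sleep(interval)


# Example use on the master node (performs real HTTP requests):
#     if not wait_for_api():
#         print("kube-apiserver never became healthy")
```

With a deadline the wait ends on its own and reports failure; per the unit status quoted below, kube-apiserver had already exited, so the health check in the trace could not succeed until the cluster creation timed out.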

Checking via systemctl and journalctl I see:
[fedora@kube-clsuter-qamdealetlbi-master-0 log]$ systemctl status kube-apiserver
● kube-apiserver.service - kubernetes-apiserver
   Loaded: loaded (/etc/systemd/system/kube-apiserver.service; enabled; vendor preset: disabled)
   Active: failed (Result: exit-code) since Tue 2019-02-19 15:31:41 UTC; 45min ago
  Process: 3796 ExecStart=/usr/bin/runc --systemd-cgroup run kube-apiserver (code=exited, status=1/FAILURE)
 Main PID: 3796 (code=exited, status=1/FAILURE)

Feb 19 15:31:40 kube-clsuter-qamdealetlbi-master-0.novalocal systemd[1]: kube-apiserver.service: Main process exited, code=exited, status=1/FAILURE
Feb 19 15:31:40 kube-clsuter-qamdealetlbi-master-0.novalocal systemd[1]: kube-apiserver.service: Failed with result 'exit-code'.
Feb 19 15:31:41 kube-clsuter-qamdealetlbi-master-0.novalocal systemd[1]: kube-apiserver.service: Service RestartSec=100ms expired, scheduling restart.
Feb 19 15:31:41 kube-clsuter-qamdealetlbi-master-0.novalocal systemd[1]: kube-apiserver.service: Scheduled restart job, restart counter is at 6.
Feb 19 15:31:41 kube-clsuter-qamdealetlbi-master-0.novalocal systemd[1]: Stopped kubernetes-apiserver.
Feb 19 15:31:41 kube-clsuter-qamdealetlbi-master-0.novalocal systemd[1]: kube-apiserver.service: Start request repeated too quickly.
Feb 19 15:31:41 kube-clsuter-qamdealetlbi-master-0.novalocal systemd[1]: kube-apiserver.service: Failed with result 'exit-code'.
Feb 19 15:31:41 kube-clsuter-qamdealetlbi-master-0.novalocal systemd[1]: Failed to start kubernetes-apiserver.

[fedora@kube-clsuter-qamdealetlbi-master-0 log]$ sudo journalctl -u kube-apiserver
-- Logs begin at Tue 2019-02-19 15:21:36 UTC, end at Tue 2019-02-19 16:17:00 UTC. --
Feb 19 15:31:33 kube-clsuter-qamdealetlbi-master-0.novalocal systemd[1]: Started kubernetes-apiserver.
Feb 19 15:31:34 kube-clsuter-qamdealetlbi-master-0.novalocal runc[2794]: Flag --insecure-bind-address has been deprecated, This flag will be removed in a future version.
Feb 19 15:31:34 kube-clsuter-qamdealetlbi-master-0.novalocal runc[2794]: Flag --insecure-port has been deprecated, This flag will be removed in a future version.
Feb 19 15:31:35 kube-clsuter-qamdealetlbi-master-0.novalocal runc[2794]: Error: error creating self-signed certificates: open /var/run/kubernetes/apiserver.crt: permission denied
:
:
:
Feb 19 15:31:35 kube-clsuter-qamdealetlbi-master-0.novalocal runc[2794]: error: error creating self-signed certificates: open /var/run/kubernetes/apiserver.crt: permission denied
Feb 19 15:31:35 kube-clsuter-qamdealetlbi-master-0.novalocal systemd[1]: kube-apiserver.service: Main process exited, code=exited, status=1/FAILURE
Feb 19 15:31:35 kube-clsuter-qamdealetlbi-master-0.novalocal systemd[1]: kube-apiserver.service: Failed with result 'exit-code'.
Feb 19 15:31:35 kube-clsuter-qamdealetlbi-master-0.novalocal systemd[1]: kube-apiserver.service: Service RestartSec=100ms expired, scheduling restart.
Feb 19 15:31:35 kube-clsuter-qamdealetlbi-master-0.novalocal systemd[1]: kube-apiserver.service: Scheduled restart job, restart counter is at 1.

May I ask for help with this?

Many thanks
/Giuseppe

-- 
Cheers & Best regards,
Feilong Wang (王飞龙)
--------------------------------------------------------------------------
Senior Cloud Software Engineer
Tel: +64-48032246
Email: flwang@catalyst.net.nz
Catalyst IT Limited
Level 6, Catalyst House, 150 Willis Street, Wellington
--------------------------------------------------------------------------


participants (5)

Bharat Kunwar
Feilong Wang
Giuseppe Sannino
Mark Goddard
Zane Bitter