[kolla][magnum] Cluster creation failed due to "Waiting for Kubernetes API..."

Giuseppe Sannino km.giuseppesannino at gmail.com
Thu Feb 21 10:03:38 UTC 2019


Ciao Mark,
finally it works! Many many thanks!
That was the missing piece of the puzzle.
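
In case it helps anyone who lands on this thread later, here is a minimal
sketch of how one could check that the override suggested by Mark is actually
present in a heat.conf (for example /etc/kolla/config/heat.conf, or the
rendered /etc/kolla/heat-engine/heat.conf). The helper name
region_for_services is hypothetical and the parsing relies only on Python's
standard configparser, so treat it as an illustration rather than an official
Kolla or Heat tool:

```python
import configparser


def region_for_services(heat_conf_text):
    """Return region_name_for_services from [DEFAULT], or None if unset.

    Hypothetical helper: it only parses the text, it does not talk to Heat.
    """
    # interpolation=None keeps literal '%' characters intact;
    # strict=False tolerates repeated options in the same file.
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(heat_conf_text)
    return parser.defaults().get("region_name_for_services")


# The override quoted in this thread.
override = """
[DEFAULT]
region_name_for_services=RegionOne
"""

print(region_for_services(override))
```

If the function returns None for the heat.conf in use, the option is not set
there.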

Just FYI, in the systemctl status output for the
heat-container-agent I can still see these repeated log lines:
:
Feb 21 08:00:40 kube-cluster-goddard-lq54faeabuhu-master-0.novalocal
runc[2715]: /var/lib/os-collect-config/local-data not found. Skipping
Feb 21 08:01:11 kube-cluster-goddard-lq54faeabuhu-master-0.novalocal
runc[2715]: /var/lib/os-collect-config/local-data not found. Skipping
:

This doesn't seem to harm the deployment but I will check further.
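
To tell the remaining message apart from the earlier failure, a small sketch
like the following could be used to count the two kinds of lines in the agent
log. The marker strings are copied from the logs in this thread; the function
names and the category labels are made up for illustration, and treating the
local-data line as a harmless notice is only my observation from this
deployment:

```python
# Substrings seen in this thread while the Heat endpoint lookup was failing.
HEAT_ERRORS = (
    "endpoint for orchestration service in null region not found",
    "Source [heat] Unavailable.",
)
# Message that is still logged after the fix (assumed harmless here).
LOCAL_DATA_NOTICE = "/var/lib/os-collect-config/local-data not found. Skipping"


def classify(line):
    """Label one log line as 'heat-error', 'local-data-notice' or 'other'."""
    if any(marker in line for marker in HEAT_ERRORS):
        return "heat-error"
    if LOCAL_DATA_NOTICE in line:
        return "local-data-notice"
    return "other"


def summarize(lines):
    """Count how many lines fall into each category."""
    counts = {"heat-error": 0, "local-data-notice": 0, "other": 0}
    for line in lines:
        counts[classify(line)] += 1
    return counts


host = "kube-cluster-goddard-lq54faeabuhu-master-0.novalocal"
after_fix = [
    f"Feb 21 08:00:40 {host} runc[2715]: {LOCAL_DATA_NOTICE}",
    f"Feb 21 08:01:11 {host} runc[2715]: {LOCAL_DATA_NOTICE}",
]
print(summarize(after_fix))
```

A non-zero "heat-error" count would suggest the region problem is still there.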

Thanks a lot to everyone!

/Giuseppe

On Wed, 20 Feb 2019 at 20:16, Mark Goddard <mark at stackhpc.com> wrote:

> Hi, I think we've hit this, and John Garbutt has added the following
> configuration for Kolla Ansible in /etc/kolla/config/heat.conf:
>
> [DEFAULT]
> region_name_for_services=RegionOne
> We'll need a patch in kolla ansible to do that without custom config
> changes.
> Mark
>
> On Wed, 20 Feb 2019 at 11:05, Bharat Kunwar <bharat at stackhpc.com> wrote:
>
>> Hi Giuseppe,
>>
>> What version of heat are you running?
>>
>> Can you check if you have this patch merged?
>> https://review.openstack.org/579485
>>
>> Bharat
>>
>> On 20 Feb 2019, at 10:38, Giuseppe Sannino <km.giuseppesannino at gmail.com>
>> wrote:
>>
>> Hi Feilong, Bharat,
>> thanks for your answer.
>>
>> @Feilong,
>> From /etc/kolla/heat-engine/heat.conf I see:
>> [clients_keystone]
>> auth_uri = http://10.1.7.201:5000
>>
>> This should map to auth_url within the k8s master.
>> Within the k8s master, in /etc/os-collect-config.conf, I see:
>>
>> [heat]
>> auth_url = http://10.1.7.201:5000/v3/
>> :
>> :
>> resource_name = kube-master
>> region_name = null
>>
>>
>> and in /etc/sysconfig/heat-params (among other settings):
>> :
>> REGION_NAME="RegionOne"
>> :
>> AUTH_URL="http://10.1.7.201:5000/v3"
>>
>> This address (10.1.7.201) matches the "public" Heat endpoint:
>> openstack endpoint list | grep heat
>> | 3d5f58c43f6b44f6b54990d6fd9ff55d | RegionOne | heat     | orchestration  | True | internal | http://10.1.7.200:8004/v1/%(tenant_id)s |
>> | 8c2492cb0ddc48ca94942a4a299a88dc | RegionOne | heat-cfn | cloudformation | True | internal | http://10.1.7.200:8000/v1               |
>> | b164c4618a784da9ae14da75a6c764a3 | RegionOne | heat     | orchestration  | True | public   | http://10.1.7.201:8004/v1/%(tenant_id)s |
>> | da203f7d337b4587a0f5fc774c993390 | RegionOne | heat     | orchestration  | True | admin    | http://10.1.7.200:8004/v1/%(tenant_id)s |
>> | e0d3743e7c604e5c8aa4684df2d1ce53 | RegionOne | heat-cfn | cloudformation | True | public   | http://10.1.7.201:8000/v1               |
>> | efe0b8418aa24dfca33c243e7eed7e90 | RegionOne | heat-cfn | cloudformation | True | admin    | http://10.1.7.200:8000/v1               |
>>
>> Connectivity tests:
>> [fedora@kube-cluster-fed27-k5di3i7stgks-master-0 ~]$ ping 10.1.7.201
>> PING 10.1.7.201 (10.1.7.201) 56(84) bytes of data.
>> 64 bytes from 10.1.7.201: icmp_seq=1 ttl=63 time=0.285 ms
>>
>> [fedora@kube-cluster-fed27-k5di3i7stgks-master-0 ~]$ curl http://10.1.7.201:5000/v3/
>> {"version": {"status": "stable", "updated": "2018-10-15T00:00:00Z",
>> "media-types": [{"base": "application/json", "type":
>> "application/vnd.openstack.identity-v3+json"}], "id": "v3.11", "links":
>> [{"href": "http://10.1.7.201:5000/v3/", "rel": "self"}]}}
>>
>>
>> Apparently, I can reach this endpoint from within the k8s master.
>>
>>
>> @Bharat,
>> that file seems to be properly configured to me as well.
>> The problem reported by "systemctl status heat-container-agent" is:
>>
>> Feb 20 09:33:23 kube-cluster-fed27-k5di3i7stgks-master-0.novalocal
>> runc[2837]: publicURL endpoint for orchestration service in null region not
>> found
>> Feb 20 09:33:23 kube-cluster-fed27-k5di3i7stgks-master-0.novalocal
>> runc[2837]: Source [heat] Unavailable.
>> Feb 20 09:33:23 kube-cluster-fed27-k5di3i7stgks-master-0.novalocal
>> runc[2837]: /var/lib/os-collect-config/local-data not found. Skipping
>> Feb 20 09:33:53 kube-cluster-fed27-k5di3i7stgks-master-0.novalocal
>> runc[2837]: publicURL endpoint for orchestration service in null region not
>> found
>> Feb 20 09:33:53 kube-cluster-fed27-k5di3i7stgks-master-0.novalocal
>> runc[2837]: Source [heat] Unavailable.
>> Feb 20 09:33:53 kube-cluster-fed27-k5di3i7stgks-master-0.novalocal
>> runc[2837]: /var/lib/os-collect-config/local-data not found. Skipping
>>
>>
>> Still no way forward from my side.
>>
>> /Giuseppe
>>
>> On Tue, 19 Feb 2019 at 22:16, Bharat Kunwar <bharat at stackhpc.com> wrote:
>>
>>> I have the same problem. Weird thing is /etc/sysconfig/heat-params has
>>> region_name specified in my case!
>>>
>>> On 19 Feb 2019, at 22:00, Feilong Wang <feilong at catalyst.net.nz> wrote:
>>>
>>> Can you talk to the Heat API from your master node?
>>>
>>>
>>> On 20/02/19 6:43 AM, Giuseppe Sannino wrote:
>>>
>>> Hi all...again,
>>> I managed to get over the previous issue by "not disabling" the TLS in
>>> the cluster template.
>>> From the cloud-init-output.log I see:
>>> Cloud-init v. 17.1 running 'modules:final' at Tue, 19 Feb 2019 17:03:53
>>> +0000. Up 38.08 seconds.
>>> Cloud-init v. 17.1 finished at Tue, 19 Feb 2019 17:13:22 +0000.
>>> Datasource DataSourceEc2.  Up 607.13 seconds
>>>
>>> But the cluster creation keeps on failing.
>>> From the journalctl -f I see a possible issue:
>>> Feb 19 17:42:38 kube-cluster-tls-6hezqcq4ien3-master-0.novalocal
>>> runc[2723]: publicURL endpoint for orchestration service in null region not
>>> found
>>> Feb 19 17:42:38 kube-cluster-tls-6hezqcq4ien3-master-0.novalocal
>>> runc[2723]: Source [heat] Unavailable.
>>> Feb 19 17:42:38 kube-cluster-tls-6hezqcq4ien3-master-0.novalocal
>>> runc[2723]: /var/lib/os-collect-config/local-data not found. Skipping
>>>
>>> Is anyone familiar with this problem?
>>>
>>> Thanks as usual.
>>> /Giuseppe
>>>
>>> On Tue, 19 Feb 2019 at 17:35, Giuseppe Sannino <
>>> km.giuseppesannino at gmail.com> wrote:
>>>
>>>> Hi all,
>>>> I need some help.
>>>> I deployed an AIO via Kolla on a baremetal node. Here is some information
>>>> about the deployment:
>>>> ---------------
>>>> kolla-ansible: 7.0.1
>>>> openstack_release: Rocky
>>>> kolla_base_distro: centos
>>>> kolla_install_type: source
>>>> TLS: disabled
>>>> ---------------
>>>>
>>>>
>>>> VMs spawn without issue, but I can't get the "Kubernetes cluster
>>>> creation" to succeed. It fails with a "Time out".
>>>>
>>>> I managed to log into the Kubernetes master, and in cloud-init-output.log I
>>>> can see:
>>>> + echo 'Waiting for Kubernetes API...'
>>>> Waiting for Kubernetes API...
>>>> ++ curl --silent http://127.0.0.1:8080/healthz
>>>> + '[' ok = '' ']'
>>>> + sleep 5
>>>>
>>>>
>>>> Checking via systemctl and journalctl I see:
>>>> [fedora@kube-clsuter-qamdealetlbi-master-0 log]$ systemctl status kube-apiserver
>>>> ● kube-apiserver.service - kubernetes-apiserver
>>>>    Loaded: loaded (/etc/systemd/system/kube-apiserver.service; enabled;
>>>> vendor preset: disabled)
>>>>    Active: failed (Result: exit-code) since Tue 2019-02-19 15:31:41
>>>> UTC; 45min ago
>>>>   Process: 3796 ExecStart=/usr/bin/runc --systemd-cgroup run
>>>> kube-apiserver (code=exited, status=1/FAILURE)
>>>>  Main PID: 3796 (code=exited, status=1/FAILURE)
>>>>
>>>> Feb 19 15:31:40 kube-clsuter-qamdealetlbi-master-0.novalocal
>>>> systemd[1]: kube-apiserver.service: Main process exited, code=exited,
>>>> status=1/FAILURE
>>>> Feb 19 15:31:40 kube-clsuter-qamdealetlbi-master-0.novalocal
>>>> systemd[1]: kube-apiserver.service: Failed with result 'exit-code'.
>>>> Feb 19 15:31:41 kube-clsuter-qamdealetlbi-master-0.novalocal
>>>> systemd[1]: kube-apiserver.service: Service RestartSec=100ms expired,
>>>> scheduling restart.
>>>> Feb 19 15:31:41 kube-clsuter-qamdealetlbi-master-0.novalocal
>>>> systemd[1]: kube-apiserver.service: Scheduled restart job, restart counter
>>>> is at 6.
>>>> Feb 19 15:31:41 kube-clsuter-qamdealetlbi-master-0.novalocal
>>>> systemd[1]: Stopped kubernetes-apiserver.
>>>> Feb 19 15:31:41 kube-clsuter-qamdealetlbi-master-0.novalocal
>>>> systemd[1]: kube-apiserver.service: Start request repeated too quickly.
>>>> Feb 19 15:31:41 kube-clsuter-qamdealetlbi-master-0.novalocal
>>>> systemd[1]: kube-apiserver.service: Failed with result 'exit-code'.
>>>> Feb 19 15:31:41 kube-clsuter-qamdealetlbi-master-0.novalocal
>>>> systemd[1]: Failed to start kubernetes-apiserver.
>>>>
>>>> [fedora@kube-clsuter-qamdealetlbi-master-0 log]$ sudo journalctl -u kube-apiserver
>>>> -- Logs begin at Tue 2019-02-19 15:21:36 UTC, end at Tue 2019-02-19
>>>> 16:17:00 UTC. --
>>>> Feb 19 15:31:33 kube-clsuter-qamdealetlbi-master-0.novalocal
>>>> systemd[1]: Started kubernetes-apiserver.
>>>> Feb 19 15:31:34 kube-clsuter-qamdealetlbi-master-0.novalocal
>>>> runc[2794]: Flag --insecure-bind-address has been deprecated, This flag
>>>> will be removed in a future version.
>>>> Feb 19 15:31:34 kube-clsuter-qamdealetlbi-master-0.novalocal
>>>> runc[2794]: Flag --insecure-port has been deprecated, This flag will be
>>>> removed in a future version.
>>>> Feb 19 15:31:35 kube-clsuter-qamdealetlbi-master-0.novalocal
>>>> runc[2794]: Error: error creating self-signed certificates: open
>>>> /var/run/kubernetes/apiserver.crt: permission denied
>>>> :
>>>> :
>>>> :
>>>> Feb 19 15:31:35 kube-clsuter-qamdealetlbi-master-0.novalocal
>>>> runc[2794]: error: error creating self-signed certificates: open
>>>> /var/run/kubernetes/apiserver.crt: permission denied
>>>> Feb 19 15:31:35 kube-clsuter-qamdealetlbi-master-0.novalocal
>>>> systemd[1]: kube-apiserver.service: Main process exited, code=exited,
>>>> status=1/FAILURE
>>>> Feb 19 15:31:35 kube-clsuter-qamdealetlbi-master-0.novalocal
>>>> systemd[1]: kube-apiserver.service: Failed with result 'exit-code'.
>>>> Feb 19 15:31:35 kube-clsuter-qamdealetlbi-master-0.novalocal
>>>> systemd[1]: kube-apiserver.service: Service RestartSec=100ms expired,
>>>> scheduling restart.
>>>> Feb 19 15:31:35 kube-clsuter-qamdealetlbi-master-0.novalocal
>>>> systemd[1]: kube-apiserver.service: Scheduled restart job, restart counter
>>>> is at 1.
>>>>
>>>>
>>>> May I ask for help with this?
>>>>
>>>> Many thanks
>>>> /Giuseppe
>>>>
>>>> --
>>> Cheers & Best regards,
>>> Feilong Wang (王飞龙)
>>> --------------------------------------------------------------------------
>>> Senior Cloud Software Engineer
>>> Tel: +64-48032246
>>> Email: flwang at catalyst.net.nz
>>> Catalyst IT Limited
>>> Level 6, Catalyst House, 150 Willis Street, Wellington
>>> --------------------------------------------------------------------------
>>>
>>>


More information about the openstack-discuss mailing list