[magnum] k8s cluster deploy stuck timeout
Hi all,

I have installed Magnum with no issues. After the installation, I tried to deploy a Kubernetes cluster, but after some time the creation process times out, and this always happens during the kube_cluster_deploy stage.

I'm running 17.0.0 (Bobcat), so I tried the Fedora CoreOS image fedora-coreos-38.20230806.3.0 with kube_tag v1.26.8-rancher1, as stated in the documentation, but no joy. I also tried different combinations of Fedora CoreOS and kube_tag, with the same result.

I can ssh into the VMs and podman exec into the containers, but if I run kubectl version I get the following error:

The connection to the server localhost:8080 was refused - did you specify the right host or port?

and tailing /var/log/heat-config/heat-config-script/ I get the following message:

++ kubectl get --raw=/healthz
The connection to the server localhost:8080 was refused - did you specify the right host or port?

Any idea what might be the problem?

TIA
Jaime
Hi Oliver,

thanks for the links, I'll give CAPI a go.

Jaime

On 22/04/2024 19:35, Oliver Weinmann wrote:
Hi,
Debugging this is quite difficult. But in case you are interested I have a blog post about it:
* https://www.roksblog.de/deploy-kubernetes-clusters-in-openstack-within-minutes-with-magnum/
I highly recommend using CAPI (Cluster API) with Magnum. It is much faster, supports newer k8s releases, and is much less failure-prone.
* https://www.roksblog.de/openstack-magnum-cluster-api-driver/
Cheers, Oliver
Hi Jaime,

Did you check the status of the heat service in your Fedora CoreOS VM? Problems usually show up there, and they are what causes the timeouts. I previously hit an error where the Fedora CoreOS VM could not connect to the OpenStack API, and I solved it by making sure the VM could reach the OpenStack API.
Hi Jaime,

You can check whether all the resources in the heat stack were created properly.

$ openstack coe cluster show <cluster>

to get the stack, then

$ openstack stack list -n5 <stack_id>

If you can SSH to the nodes, check whether the services come up with `systemctl`. kubelet, kube-apiserver, etc. should be up.

You can also check the heat log on the worker(s), /var/log/heat-config/*

Regards,
Jake
Hi all,

I have changed the cert_manager_type from x509keypair to barbican and enabled TLS, and I have a cluster running now. Unfortunately, openstack coe cluster list shows an unhealthy cluster :(

Jake, running openstack stack list -n5 <stack_id> gives an error; openstack stack list doesn't seem to take any extra argument.

Yes, I can ssh to the nodes. I'll check the kube* services you mentioned.

Thanks
Jaime
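A note on the failing command: `openstack stack list` lists stacks and takes no stack ID, so the suggestion above was presumably meant as `openstack stack resource list -n 5 <stack_id>`, which shows one stack's resources including nested stacks. The following is only a minimal sketch of how that check could be scripted, assuming the openstack CLI with the Heat plugin is installed and credentials are loaded in the environment. The function names are made up for illustration, and the JSON keys `resource_name` and `resource_status` are assumed to match the CLI's column names.

```python
import json
import subprocess


def list_stack_resources(stack_id, depth=5):
    """Return the rows of `openstack stack resource list` as a list of dicts.

    Assumes the openstack CLI (with the Heat plugin) is on PATH and that
    OS_* credentials are already set in the environment.
    """
    result = subprocess.run(
        ["openstack", "stack", "resource", "list",
         "-n", str(depth), "-f", "json", stack_id],
        check=True, capture_output=True, text=True,
    )
    return json.loads(result.stdout)


def unfinished_resources(rows):
    """Keep only resources whose status does not end in _COMPLETE.

    The key name `resource_status` is an assumption about the JSON output.
    """
    return [row for row in rows
            if not row.get("resource_status", "").endswith("_COMPLETE")]
```

Calling `unfinished_resources(list_stack_resources("<stack_id>"))` and printing each row's `resource_name` and `resource_status` should point at the resource the stack is waiting on, for example kube_cluster_deploy in the case described in this thread. The stack ID itself appears in the output of `openstack coe cluster show <cluster>`.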
participants (4)
- Jaime Ibar
- Jake Yip
- Oliver Weinmann
- pahrialtkj@gmail.com