Hi,
I recently deployed OpenStack Wallaby using kolla-ansible, with Magnum enabled. Magnum worked fine for me a while ago and I was able to spin up K8s clusters without a problem, but I think that was on Ussuri. I went through the Magnum troubleshooting guide but couldn't solve my problem. Magnum spins up the master node and I can log in via SSH using its floating IP. After waiting a few minutes, I checked the logs and saw this:
role.kubernetes.io/master=""
+ echo 'Trying to label master node with node-role.kubernetes.io/master=""'
+ sleep 5s
++ kubectl get --raw=/healthz
Error from server (InternalError): an error on the server ("[+]ping ok\n[+]log ok\n[+]etcd ok\n[+]poststarthook/start-kube-apiserver-admission-initializer ok\n[+]poststarthook/generic-apiserver-start-informers ok\n[+]poststarthook/start-apiextensions-informers ok\n[+]poststarthook/start-apiextensions-controllers ok\n[-]poststarthook/crd-informer-synced failed: reason withheld\n[-]poststarthook/bootstrap-controller failed: reason withheld\n[-]poststarthook/rbac/bootstrap-roles failed: reason withheld\n[-]poststarthook/scheduling/bootstrap-system-priority-classes failed: reason withheld\n[+]poststarthook/apiserver/bootstrap-system-flowcontrol-configuration ok\n[+]poststarthook/start-cluster-authentication-info-controller ok\n[+]poststarthook/start-kube-aggregator-informers ok\n[+]poststarthook/apiservice-registration-controller ok\n[+]poststarthook/apiservice-status-available-controller ok\n[+]poststarthook/kube-apiserver-autoregistration ok\n[-]autoregister-completion failed: reason withheld\n[+]poststarthook/apiservice-openapi-controller ok\nhealthz check failed") has prevented the request from succeeding
+ '[' ok = '' ']'
+ echo 'Trying to label master node with node-role.kubernetes.io/master=""'
+ sleep 5s
Trying to label master node with node-role.kubernetes.io/master=""
++ kubectl get --raw=/healthz
+ '[' ok = ok ']'
+ kubectl patch node k8s-test-small-cal-zwe5xmigugwj-master-0 --patch '{"metadata": {"labels": {"node-role.kubernetes.io/master": ""}}}'
Error from server (NotFound): nodes "k8s-test-small-cal-zwe5xmigugwj-master-0" not found
+ echo 'Trying to label master node with node-role.kubernetes.io/master=""'
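In case it helps with reading the long /healthz error above, here is a small sketch that prints only the checks marked [-] (failed). The string is a shortened copy of that first response; the variable name healthz and the function name failed_checks are just my own placeholders, not anything from Magnum.

```shell
#!/bin/sh
# Shortened copy of the first /healthz response from the log above.
# The "\n" sequences are literal backslash-n, exactly as kubectl printed them.
healthz='[+]ping ok\n[+]etcd ok\n[-]poststarthook/crd-informer-synced failed: reason withheld\n[-]poststarthook/bootstrap-controller failed: reason withheld\n[-]poststarthook/rbac/bootstrap-roles failed: reason withheld\n[-]poststarthook/scheduling/bootstrap-system-priority-classes failed: reason withheld\n[-]autoregister-completion failed: reason withheld\nhealthz check failed'

# failed_checks (hypothetical helper): expand the "\n" escapes, keep only
# lines starting with "[-]", and strip the marker and the failure suffix.
failed_checks() {
    printf '%b\n' "$1" | grep '^\[-]' | sed -e 's/^\[-]//' -e 's/ failed:.*$//'
}

failed_checks "$healthz"
```

For the response above this lists five checks: crd-informer-synced, bootstrap-controller, rbac/bootstrap-roles, scheduling/bootstrap-system-priority-classes and autoregister-completion. On the second attempt /healthz returned ok, so these only failed while the API server was still starting up.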
Running kubectl get nodes returns nothing, even when appending --all-namespaces. I pretty much followed the documentation that I wrote when I was using Ussuri. I wonder what has changed since then that would make this fail.
I googled for hours but could not find similar issues; the few I did find were about a version mismatch between the k8s server and client, which is definitely not the case here. I also tried this on Xena, and it fails there too.
I have the feeling that the issue is network related, but I do not see any problems spinning up instances, and communication between instances works fine.
Here are my current configs:
globals.yml
[vagrant@seed ~]$ grep ^[^#] /etc/kolla/globals.yml
---
kolla_base_distro: "centos"
kolla_install_type: "source"
openstack_release: "wallaby"
kolla_internal_vip_address: "192.168.45.222"
kolla_external_vip_address: "192.168.2.222"
network_interface: "eth2"
neutron_external_interface: "eth1"
keepalived_virtual_router_id: "222"
enable_haproxy: "yes"
enable_magnum: "yes"
multinode hosts file
control[01:03] ansible_user=vagrant ansible_password=vagrant ansible_become=true api_interface=eth3
compute[01:02] ansible_user=vagrant ansible_password=vagrant ansible_become=true api_interface=eth3
[control]
# These hostname must be resolvable from your deployment host
control[01:03]
# The above can also be specified as follows:
#control[01:03] ansible_user=vagrant ansible_password=vagrant ansible_become=true
#compute[01:02] ansible_user=vagrant ansible_password=vagrant ansible_become=true
# The network nodes are where your l3-agent and loadbalancers will run
# This can be the same as a host in the control group
[network]
control[01:03]
#network01
#network02
[compute]
compute[01:02]
[monitoring]
control[01:03]
#monitoring01
# When compute nodes and control nodes use different interfaces,
# you need to comment out "api_interface" and other interfaces from the globals.yml
# and specify like below:
#compute01 neutron_external_interface=eth0 api_interface=em1 storage_interface=em1 tunnel_interface=em1
[storage]
control[01:03]
#storage01
[deployment]
localhost ansible_connection=local
cat /etc/kolla/config/magnum.conf
[trust]
cluster_user_trust = True
Sorry for the formatting. I am sending this from a smartphone with plenty of copy and paste.
Oliver