[magnum] Kubernetes, multiple versions: kube-system pods in pending status on newly spun-up clusters (since last Thursday)

Namrata Sitlani nsitlani03 at gmail.com
Tue Nov 26 15:33:35 UTC 2019


Hello folks,

As of last week (14 Nov), our Magnum (Rocky) environment has stopped
spinning up working Kubernetes clusters. To be precise, Magnum still reports
the cluster status as CREATE_COMPLETE, but once the cluster is up, all of
its Kubernetes pods are stuck in the Pending state.

We use the following commands to create the Kubernetes clusters:
http://paste.openstack.org/show/786348/.
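For readers without access to the paste, the commands are along the lines of
the sketch below; the image, flavor, and network names are placeholders from
our environment, not recommendations:

```shell
# Sketch of the cluster creation; actual commands are in the paste above.
openstack coe cluster template create k8s-template \
  --image fedora-atomic-27 \
  --external-network public \
  --master-flavor m1.small \
  --flavor m1.small \
  --coe kubernetes \
  --labels kube_tag=v1.14.8

# Create a cluster with one master and two minion nodes from that template.
openstack coe cluster create k8s-cluster \
  --cluster-template k8s-template \
  --master-count 1 \
  --node-count 2
```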

All pods are stuck in the Pending state, which indicates that the scheduler
could not place them on a minion node.

The deployment fails with the following output: "0/2 nodes are available: 2
node(s) had taints that the pod didn't tolerate".
For reference: http://paste.openstack.org/show/786717/.
For reference, the output of kubectl get all -A:
http://paste.openstack.org/show/786729/
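For anyone reproducing this, the taints can be listed directly from the
nodes; a minimal sketch, assuming kubectl is pointed at the affected cluster:

```shell
# Show each node together with any taints set on it.
kubectl get nodes -o custom-columns=NAME:.metadata.name,TAINTS:.spec.taints
```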

If we manually remove the NoSchedule taint from the minion nodes, all the
pods start running.
For reference: http://paste.openstack.org/show/786718/
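For reference, the manual workaround amounts to the following. This is a
sketch: we assume the taint in question is the cloud provider's
"uninitialized" taint; the exact key on our nodes is in the paste above.

```shell
# Remove the taint from every node; the trailing '-' means "delete this taint".
# Assumed taint key: node.cloudprovider.kubernetes.io/uninitialized (NoSchedule).
kubectl taint nodes --all node.cloudprovider.kubernetes.io/uninitialized:NoSchedule-
```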

But even after this manual fix, the openstack-cloud-controller-manager pods
are still missing, so any interaction from the Kubernetes control plane with
OpenStack services is non-functional.

We assume that the missing openstack-cloud-controller-manager pod is also
the cause of the taint issue we are encountering.

For the node taint issue,
https://ask.openstack.org/en/question/120442/magnum-kubernetes-noschedule-taint/
suggests adding

[trust]
cluster_user_trust = true

to magnum.conf. OpenStack-Ansible exposes a variable named
magnum_cluster_user_trust that can be set to true for this purpose. However,
the default for this variable is already True, and we have confirmed that
cluster_user_trust = True is present in our environment as well.
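For completeness, the override we have in place looks like this (a sketch;
in an OSA deployment this would typically live in
/etc/openstack_deploy/user_variables.yml):

```yaml
# OpenStack-Ansible override (the default is already True):
magnum_cluster_user_trust: True
```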


Note: the Kubernetes cluster deployed here uses kube_tag v1.14.8, but we get
the same result with other kube_tag versions such as v1.13.12.

Ideally, we should not have to remove the taints manually, so could you
please confirm our findings and help us find a way forward?
We can provide more logs if needed.

Thanks
Namrata Sitlani