[magnum] autoscaling feature not working
Folks,

I am trying to enable the autoscaling feature in Magnum by setting the label auto_scaling_enabled=true, but autoscaling is not working. Frankly, I don't understand the workflow for scaling worker nodes up and down. How does Magnum know that it needs to scale up or down without monitoring? On what basis does it trigger that action?
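For context: with the Cluster API driver, autoscaling is controlled through cluster labels. A minimal sketch, assuming the standard Magnum label names and a placeholder cluster template:

openstack coe cluster create my-cluster \
  --cluster-template k8s-capi-template \
  --node-count 1 \
  --labels auto_scaling_enabled=true,min_node_count=1,max_node_count=4

min_node_count and max_node_count bound how far the autoscaler may grow or shrink the default worker group.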
Hi Satish,

If this is with the CAPI driver, it provisions a cluster-autoscaler instance which deploys new nodes when there are pods stuck in the Pending state. Have a look at the info here: https://github.com/kubernetes/autoscaler/tree/master/cluster-autoscaler

Thanks,
Mohammed
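The trigger condition is simply unschedulable pods; what the autoscaler reacts to can be listed directly with plain kubectl (no driver-specific tooling assumed):

kubectl get pods -A --field-selector=status.phase=Pending

When that list is non-empty and a node group is below its maximum, the autoscaler scales the group up; nodes that sit idle get scaled back down.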
Hi Mohammed,

Yes, I am using the CAPI driver. In the k8s management cluster my autoscaler deployments are stuck in Pending. Did I miss something?

root@os2-capi-01:~# kubectl get deploy -A
NAMESPACE                           NAME                                            READY   UP-TO-DATE   AVAILABLE   AGE
capi-kubeadm-bootstrap-system       capi-kubeadm-bootstrap-controller-manager      1/1     1            1           8d
capi-kubeadm-control-plane-system   capi-kubeadm-control-plane-controller-manager  1/1     1            1           8d
capi-system                         capi-controller-manager                         1/1     1            1           8d
capo-system                         capo-controller-manager                         1/1     1            1           8d
cert-manager                        cert-manager                                    1/1     1            1           8d
cert-manager                        cert-manager-cainjector                         1/1     1            1           8d
cert-manager                        cert-manager-webhook                            1/1     1            1           8d
kube-system                         calico-kube-controllers                         1/1     1            1           8d
kube-system                         coredns                                         2/2     2            2           8d
kube-system                         dns-autoscaler                                  1/1     1            1           8d
magnum-system                       kube-2lke8-autoscaler                           0/1     1            0           4h25m
magnum-system                       kube-d6n1t-autoscaler                           0/1     1            0           14h
magnum-system                       kube-dmmks-autoscaler                           0/1     1            0           14h

## The pods are in Pending status
root@os2-capi-01:~# kubectl get pod -n magnum-system
NAME                                     READY   STATUS    RESTARTS   AGE
kube-2lke8-autoscaler-77bd94cc6f-5xbjg   0/1     Pending   0          4h25m
kube-d6n1t-autoscaler-7486955bd4-fwfc9   0/1     Pending   0          14h
kube-dmmks-autoscaler-596f9d48c-wzlbk    0/1     Pending   0          14h

## Logs are empty
root@os2-capi-01:~# kubectl logs kube-2lke8-autoscaler-77bd94cc6f-5xbjg -n magnum-system
root@os2-capi-01:~# kubectl logs kube-dmmks-autoscaler-596f9d48c-wzlbk -n magnum-system
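For a pod stuck in Pending, the scheduler records its reason under Events; describing one of the pods from the listing above shows it:

root@os2-capi-01:~# kubectl -n magnum-system describe pod kube-2lke8-autoscaler-77bd94cc6f-5xbjg

The Events section at the bottom and the Node-Selectors field are the two things to check.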
Hi Mohammed,

After digging into it, I found the autoscaler is not getting scheduled on any node. I am getting the following error. Full output is here: https://paste.opendev.org/show/bmJY3ZrRf07S1Q6OTaeA/

Events:
  Type     Reason            Age                From               Message
  ----     ------            ----               ----               -------
  Warning  FailedScheduling  40m                default-scheduler  0/3 nodes are available: 3 node(s) didn't match Pod's node affinity/selector. preemption: 0/3 nodes are available: 3 Preemption is not helpful for scheduling..
  Warning  FailedScheduling  15m (x5 over 35m)  default-scheduler  0/3 nodes are available: 3 node(s) didn't match Pod's node affinity/selector. preemption: 0/3 nodes are available: 3 Preemption is not helpful for scheduling..

Do you think it is caused by the following node selector?

Node-Selectors: openstack-control-plane=enabled
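A FailedScheduling event like this usually means no node carries the label the pod's nodeSelector requires; comparing the two sides makes the mismatch visible (pod name taken from the paste above):

root@os2-capi-01:~# kubectl -n magnum-system get pod kube-2lke8-autoscaler-77bd94cc6f-5xbjg -o jsonpath='{.spec.nodeSelector}'
root@os2-capi-01:~# kubectl get nodes --show-labels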
Yes, this is an assumption we're making, since we run things with Atmosphere. Can you run:

kubectl label node <node-name> openstack-control-plane=enabled

Thanks,
Mohammed
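The command needs a target node name; to make every node in the management cluster eligible, kubectl can also label them all in one shot (plain kubectl syntax, nothing driver-specific assumed):

kubectl label nodes --all openstack-control-plane=enabled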
Hi Mohammed,

Yes! Labelling the nodes with openstack-control-plane=enabled fixed the autoscaler scheduling issue, and now I can see the deployment running. Thank you for the command.

For testing, I deployed a sample nginx app and created hundreds of replicas to see whether the autoscaler adds more worker nodes:

# kubectl scale deployment --replicas=150 nginx-deployment

As soon as my pods got stuck in Pending, the autoscaler started creating new Nova VMs, but as soon as a VM came up it was suddenly deleted. This happens in a loop: create VM, delete VM.

I0111 04:35:13.427995       1 orchestrator.go:310] Final scale-up plan: [{MachineDeployment/magnum-system/kube-al0rs-default-worker-7cxtx 1->2 (max: 4)}]
I0111 04:35:13.428019       1 orchestrator.go:582] Scale-up: setting group MachineDeployment/magnum-system/kube-al0rs-default-worker-7cxtx size to 2
I0111 04:35:49.576173       1 static_autoscaler.go:405] 1 unregistered nodes present

Here are the full autoscaler logs: https://pastebin.com/AMQbGhRj

Do you know what could be wrong here, or have any clue where I should look for the culprit?
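One way to reproduce this kind of load test end to end, using the same deployment name and replica count as above:

# Create the test workload and scale it past current capacity
kubectl create deployment nginx-deployment --image=nginx
kubectl scale deployment --replicas=150 nginx-deployment

# On the management cluster, watch the CAPI MachineDeployment while the
# autoscaler reacts; the create/delete loop shows up as a flapping
# replica count
kubectl -n magnum-system get machinedeployments -w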
I think this is a bug which we solved in the latest version of the Cluster API driver for Magnum. I suggest updating to the latest and trying on a new cluster.

Mohammed
Thank you for the update.

FYI, I deployed CAPI just last month at 0.13.0 (I will update and get back to you):

# pip list | grep magnum-cluster-api
magnum-cluster-api             0.13.0
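Upgrading the driver in place would look roughly like this; the restart step is an assumption, since it depends on how Magnum is deployed (a systemd-managed magnum-conductor is assumed here):

# pip install --upgrade magnum-cluster-api
# systemctl restart magnum-conductor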
Hi Mohammed,

Damn!! I owe you a donut!! The CAPI driver update to 0.13.3 fixed my issue. Now autoscaling is working like a charm.

(venv-openstack) root@os2-mon-01:~/k8s# kubectl get nodes
NAME                                    STATUS   ROLES                  AGE   VERSION
kube-9ikbz-default-worker-mgxlf-dkbbg   Ready    worker                 69s   v1.27.4   <---- just popped up after applying load
kube-9ikbz-default-worker-mgxlf-hmcm4   Ready    worker                 11m   v1.27.4
kube-9ikbz-q5qlf-q5rnd                  Ready    control-plane,master   13m   v1.27.4
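Scale-down can be verified the same way: drop the load and watch the autoscaler remove the surplus worker once it has been idle past the scale-down delay (nginx-deployment is the test app from earlier in the thread):

kubectl scale deployment --replicas=1 nginx-deployment
kubectl get nodes -w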