On Thu, Jan 11, 2024 at 2:04 AM Mohammed Naser <mnaser@vexxhost.com> wrote:

I think this is a bug that we fixed in the latest version of the Cluster API driver for Magnum.

I suggest updating to the latest version and trying again on a new cluster.

From: Satish Patel <satish.txt@gmail.com>
Sent: Wednesday, January 10, 2024 11:51:24 PM
To: Mohammed Naser <mnaser@vexxhost.com>
Cc: OpenStack Discuss <openstack-discuss@lists.openstack.org>
Subject: Re: [magnum] autoscaling feature not working

Hi Mohammed,

Yes! Running "kubectl label node <node-name> openstack-control-plane=enabled" fixed the autoscaler scheduling issue, and now I can see the deployment is running. Thank you for the command.

For testing, I deployed a sample nginx app and created over a hundred replicas to see whether the autoscaler adds more worker nodes.

# kubectl scale deployment --replicas=150 nginx-deployment

As soon as my pods were stuck in Pending, the autoscaler started creating new Nova VMs, but as soon as a VM came up, it was deleted again. This keeps happening in a loop: create VM, delete VM.

I0111 04:35:13.427995 1 orchestrator.go:310] Final scale-up plan: [{MachineDeployment/magnum-system/kube-al0rs-default-worker-7cxtx 1->2 (max: 4)}]
I0111 04:35:13.428019 1 orchestrator.go:582] Scale-up: setting group MachineDeployment/magnum-system/kube-al0rs-default-worker-7cxtx size to 2

I0111 04:35:49.576173 1 static_autoscaler.go:405] 1 unregistered nodes present

Here are the full autoscaler logs: https://pastebin.com/AMQbGhRj
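
For triaging a saved copy of the log, a small script can pull out the scale-up plans and the unregistered-node counts so the create/delete loop is easier to follow. This is only a sketch: the two regular expressions are assumptions modeled on the three log lines quoted above, not on a documented cluster-autoscaler log format, and the `summarize` helper is a hypothetical name.

```python
import re

# Assumed patterns, derived only from the log lines quoted in this message.
PLAN_ENTRY_RE = re.compile(r"\{(\S+) (\d+)->(\d+) \(max: (\d+)\)\}")
UNREGISTERED_RE = re.compile(r"(\d+) unregistered nodes present")


def summarize(lines):
    """Collect (group, old_size, new_size, max_size) plans and unregistered counts."""
    plans = []
    unregistered = []
    for line in lines:
        if "Final scale-up plan:" in line:
            for group, old, new, maximum in PLAN_ENTRY_RE.findall(line):
                plans.append((group, int(old), int(new), int(maximum)))
            continue
        match = UNREGISTERED_RE.search(line)
        if match:
            unregistered.append(int(match.group(1)))
    return plans, unregistered


SAMPLE = [
    "I0111 04:35:13.427995 1 orchestrator.go:310] Final scale-up plan: "
    "[{MachineDeployment/magnum-system/kube-al0rs-default-worker-7cxtx 1->2 (max: 4)}]",
    "I0111 04:35:13.428019 1 orchestrator.go:582] Scale-up: setting group "
    "MachineDeployment/magnum-system/kube-al0rs-default-worker-7cxtx size to 2",
    "I0111 04:35:49.576173 1 static_autoscaler.go:405] 1 unregistered nodes present",
]

plans, unregistered = summarize(SAMPLE)
print(plans)
print(unregistered)
```

Running it over the full log, instead of `SAMPLE`, shows how often a scale-up is followed by an unregistered-node message.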

Do you know what could be wrong here, or where I should look for the culprit?

On Wed, Jan 10, 2024 at 4:09 PM Mohammed Naser <mnaser@vexxhost.com> wrote:
Yes, this is an assumption we’re making since we’re running things with Atmosphere.

Can you try running `kubectl label node <node-name> openstack-control-plane=enabled`?

Thanks

From: Satish Patel <satish.txt@gmail.com>
Date: Wednesday, January 10, 2024 at 12:57 PM
To: Mohammed Naser <mnaser@vexxhost.com>
Cc: OpenStack Discuss <openstack-discuss@lists.openstack.org>
Subject: Re: [magnum] autoscaling feature not working

Hi Mohammed,

After digging into it, I found that the autoscaler pod is not getting scheduled on any node. I am getting the following error (full output is here: https://paste.opendev.org/show/bmJY3ZrRf07S1Q6OTaeA/):
Events:
  Type     Reason            Age                From               Message
  ----     ------            ----               ----               -------
  Warning  FailedScheduling  40m                default-scheduler  0/3 nodes are available: 3 node(s) didn't match Pod's node affinity/selector. preemption: 0/3 nodes are available: 3 Preemption is not helpful for scheduling..
  Warning  FailedScheduling  15m (x5 over 35m)  default-scheduler  0/3 nodes are available: 3 node(s) didn't match Pod's node affinity/selector. preemption: 0/3 nodes are available: 3 Preemption is not helpful for scheduling..
Do you think it is caused by the following node selector?
Node-Selectors:  openstack-control-plane=enabled
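
The "didn't match Pod's node affinity/selector" message follows from how a node selector works: a pod can only land on a node whose labels contain every key/value pair in the selector. A minimal sketch of that rule, assuming three nodes with hypothetical names and labels (the real scheduler also checks taints, resources, and affinity rules):

```python
def matches_node_selector(node_labels, node_selector):
    """True when every key/value pair in the selector is present on the node."""
    return all(node_labels.get(key) == value for key, value in node_selector.items())


def schedulable_nodes(nodes, node_selector):
    """Names of the nodes whose labels satisfy the selector."""
    return [name for name, labels in nodes.items()
            if matches_node_selector(labels, node_selector)]


# Selector taken from the pod description above.
selector = {"openstack-control-plane": "enabled"}

# Hypothetical node names and labels, for illustration only.
nodes = {
    "node-a": {"kubernetes.io/os": "linux"},
    "node-b": {"kubernetes.io/os": "linux"},
    "node-c": {"kubernetes.io/os": "linux"},
}

print(schedulable_nodes(nodes, selector))  # -> [] (0/3 nodes match)

# Same cluster after adding the missing label to one node.
labeled = {**nodes, "node-a": {**nodes["node-a"], "openstack-control-plane": "enabled"}}
print(schedulable_nodes(labeled, selector))  # -> ['node-a']
```

With no node carrying the `openstack-control-plane=enabled` label, the pod has nowhere to go and stays in Pending.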
On Thu, Jan 4, 2024 at 9:39 AM Satish Patel <satish.txt@gmail.com> wrote:
Hi Mohammed,

Yes, I am using the CAPI driver. In the k8s management cluster, my autoscaler is in Pending status. Did I miss something?
root@os2-capi-01:~# kubectl get deploy -A
NAMESPACE                           NAME                                            READY   UP-TO-DATE   AVAILABLE   AGE
capi-kubeadm-bootstrap-system       capi-kubeadm-bootstrap-controller-manager       1/1     1            1           8d
capi-kubeadm-control-plane-system   capi-kubeadm-control-plane-controller-manager   1/1     1            1           8d
capi-system                         capi-controller-manager                         1/1     1            1           8d
capo-system                         capo-controller-manager                         1/1     1            1           8d
cert-manager                        cert-manager                                    1/1     1            1           8d
cert-manager                        cert-manager-cainjector                         1/1     1            1           8d
cert-manager                        cert-manager-webhook                            1/1     1            1           8d
kube-system                         calico-kube-controllers                         1/1     1            1           8d
kube-system                         coredns                                         2/2     2            2           8d
kube-system                         dns-autoscaler                                  1/1     1            1           8d
magnum-system                       kube-2lke8-autoscaler                           0/1     1            0           4h25m
magnum-system                       kube-d6n1t-autoscaler                           0/1     1            0           14h
magnum-system                       kube-dmmks-autoscaler                           0/1     1            0           14h
 
## It is in pending status 
 
root@os2-capi-01:~# kubectl get pod -n magnum-system
NAME                                     READY   STATUS    RESTARTS   AGE
kube-2lke8-autoscaler-77bd94cc6f-5xbjg   0/1     Pending   0          4h25m
kube-d6n1t-autoscaler-7486955bd4-fwfc9   0/1     Pending   0          14h
kube-dmmks-autoscaler-596f9d48c-wzlbk    0/1     Pending   0          14h
 
## Logs are empty.. 
 
root@os2-capi-01:~# kubectl logs kube-2lke8-autoscaler-77bd94cc6f-5xbjg -n magnum-system
root@os2-capi-01:~# kubectl logs kube-dmmks-autoscaler-596f9d48c-wzlbk -n magnum-system
On Thu, Jan 4, 2024 at 1:16 AM Mohammed Naser <mnaser@vexxhost.com> wrote:

Hi Satish:

If this is based on the CAPI driver, it provisions a cluster-autoscaler instance, which deploys new nodes when there are pods in the Pending state.

Have a look at the info here: https://github.com/kubernetes/autoscaler/tree/master/cluster-autoscaler
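
To make the trigger concrete: the scale-up decision is driven by pods that cannot be scheduled, not by CPU or memory monitoring. The following is a toy model of that decision, not the cluster-autoscaler algorithm. The real autoscaler simulates scheduling against each node group, while this sketch assumes a fixed, hypothetical `pods_per_node` capacity, and `scale_up_target` is an illustrative name.

```python
import math


def scale_up_target(current_size, max_size, pending_pods, pods_per_node):
    """Return the node-group size a toy autoscaler would request."""
    if pending_pods <= 0 or current_size >= max_size:
        return current_size  # nothing is waiting, or the group is already at its limit
    extra_nodes = math.ceil(pending_pods / pods_per_node)
    return min(current_size + extra_nodes, max_size)  # never exceed the group maximum


# No Pending pods: the group stays at 1 node.
print(scale_up_target(current_size=1, max_size=4, pending_pods=0, pods_per_node=10))
# A few Pending pods: one extra node is enough.
print(scale_up_target(current_size=1, max_size=4, pending_pods=5, pods_per_node=10))
# Many Pending pods: the request is capped at max_size.
print(scale_up_target(current_size=1, max_size=4, pending_pods=100, pods_per_node=10))
```

The cap is why a node group never grows past its configured maximum, no matter how many pods are waiting.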

Thanks

Mohammed

From: Satish Patel <satish.txt@gmail.com>
Sent: Wednesday, January 3, 2024 3:43:40 PM
To: OpenStack Discuss <openstack-discuss@lists.openstack.org>
Subject: [magnum] autoscaling feature not working

Folks,

I am trying to enable the autoscaling feature in Magnum with "auto_scaling_enabled=true", but somehow autoscaling is not working. Honestly, I don't understand the workflow for scaling worker nodes up and down.

How does magnum know that it requires scaling up and down without monitoring? On what basis will it trigger that action?