[Openstack] Lack of Balance solution such as Watcher.

Sean Mooney smooney at redhat.com
Thu Mar 16 08:46:51 UTC 2023


On Thu, 2023-03-16 at 02:03 +0100, Dmitriy Rabotyagov wrote:
> Ultimately, I don't fully understand the reasons behind the need for such a service.
> 
> Fighting high load by migrating instances between computes is
> fighting the consequences rather than the root cause, not to mention that it
> brings more negative effects than positive ones for the experience of end users,
> as you're just moving the problem to another place and affecting more workloads
> with degraded performance.
> 
> If you are struggling with high load on a daily basis, then you have too high
> a cpu_allocation_ratio set for your computes, as high load issues always come from
> attempts to oversell too aggressively.
> 
> If you have workloads in the cloud that always utilize all CPUs available -
> then you should consider having flavors and aggregates with cpu-pinning,
> meaning providing physical CPUs for such workloads.
> 
> Also, don't forget that it's worth setting more realistic numbers for
> reserved resources on computes, because the default 2gb of RAM is usually too
> small.

I tend to agree, although there are some things you can do in the nova scheduler to help,
e.g. preferring spreading over packing.

For cpu load in particular you can also enable the metrics weigher.

I have not read this thread in detail, although skimming it I see references to ceilometer.
Nova's metrics weigher has no dependency on it.
The metrics weigher
https://github.com/openstack/nova/blob/master/nova/scheduler/weights/metrics.py
is configured by adding weight_setting in the scheduler config
https://docs.openstack.org/nova/latest/configuration/config.html#metrics.weight_setting

    [metrics]
    weight_setting = name1=1.0, name2=-1.0

and enabling the monitors in the nova-compute config
https://docs.openstack.org/nova/latest/configuration/config.html#DEFAULT.compute_monitors

    [DEFAULT]
    compute_monitors = cpu.virt_driver

^ that is the only one we support

The data fields we report are set here
https://github.com/openstack/nova/blob/master/nova/compute/monitors/cpu/virt_driver.py#L52-L101

The more interesting values are
"cpu.iowait.percent", "cpu.idle.percent" and "cpu.percent".
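
As a rough illustration of what values like these represent, here is a minimal sketch that derives them from two samples of cumulative host CPU time counters. This is not the nova code: the function name and the counter keys ("user", "kernel", "iowait", "idle", "total") are assumptions made for the example, and it assumes cpu.percent covers user, kernel and iowait time. The linked virt_driver.py is the authoritative source.

```python
def cpu_fractions(prev, curr):
    """Return utilisation fractions between two cumulative CPU-time samples.

    `prev` and `curr` are hypothetical dicts of cumulative counters with the
    keys "user", "kernel", "iowait", "idle" and "total".
    """
    total = float(curr["total"] - prev["total"])
    if total <= 0:
        # No time elapsed between samples, nothing meaningful to report.
        return {}

    def delta(key):
        return curr[key] - prev[key]

    # Assumption: "used" time is user + kernel + iowait, so iowait is
    # counted inside cpu.percent while idle time is not.
    used = delta("user") + delta("kernel") + delta("iowait")
    return {
        "cpu.iowait.percent": delta("iowait") / total,
        "cpu.idle.percent": delta("idle") / total,
        "cpu.percent": used / total,
    }


previous = {"user": 0, "kernel": 0, "iowait": 0, "idle": 0, "total": 0}
current = {"user": 50, "kernel": 20, "iowait": 10, "idle": 20, "total": 100}
sample = cpu_fractions(previous, current)
```

In this sketch the values come out as fractions between 0 and 1 rather than 0 to 100; what matters for weighing is only that every host reports on the same scale.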

We have a fairly large internal cloud that is used for dev and CI, and for about the last 12 to 18 months they
have been using this to help balance the scheduling of instances, as we have a mix of hypervisor SKUs
and this helps balance system load.

    [metrics]
    weight_setting = cpu.iowait.percent=-1.0, cpu.percent=-1.0, cpu.idle.percent=1.0

You want iowait and cpu.percent to be negative since you want to avoid hosts with high iowait or high cpu utilisation,
and you would want to prefer idle hosts if your intent is to balance load.

iowait is actually included in cpu.percent, and in fact cpu.percent is basically cpu load - idle, so

    [metrics]
    weight_setting = cpu.percent=-1.0

would have a similar effect, but you might want the extra granularity to weight iowait vs idle differently.
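
To make the sign convention concrete, here is a minimal sketch of how a weight_setting string can be turned into a per-host score as a weighted sum of the reported metrics. It is a simplification and not the nova implementation: the helper names, host names and metric values are made up for the example, and it leaves out details the real weigher handles, such as missing metrics and normalising weights across hosts.

```python
def parse_weight_setting(setting):
    """Parse 'name1=1.0, name2=-1.0' into a list of (name, ratio) pairs."""
    pairs = []
    for item in setting.split(","):
        name, _, ratio = item.strip().partition("=")
        pairs.append((name.strip(), float(ratio)))
    return pairs


def raw_weight(host_metrics, setting):
    """Weighted sum of a host's metrics; a higher score is preferred."""
    return sum(ratio * host_metrics[name]
               for name, ratio in parse_weight_setting(setting))


SETTING = "cpu.iowait.percent=-1.0, cpu.percent=-1.0, cpu.idle.percent=1.0"

# Hypothetical metric values for two hosts, on a 0..1 scale.
hosts = {
    "busy-host": {"cpu.iowait.percent": 0.10,
                  "cpu.percent": 0.80,
                  "cpu.idle.percent": 0.20},
    "idle-host": {"cpu.iowait.percent": 0.01,
                  "cpu.percent": 0.15,
                  "cpu.idle.percent": 0.85},
}

# Hosts ordered from most to least preferred.
ranked = sorted(hosts, key=lambda h: raw_weight(hosts[h], SETTING), reverse=True)
```

With these made-up numbers the mostly idle host scores higher than the busy one, because the negative ratios penalise iowait and utilisation while the positive ratio rewards idle time.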

So if you find the normal cpu/ram/disk weighers are not sufficient to balance based on load, check out the
metrics weigher and see if that helps. Just be aware that collecting the cpu metrics and providing them
to the scheduler will increase rabbitmq load a little, since we periodically have to update those values for
each compute. If you have a lot of computes that might be problematic. It's one of the reasons we
decided not to add more metrics like this.



> 
> 
> 
> Wed, 15 Mar 2023, 13:11 Nguyễn Hữu Khôi <nguyenhuukhoinw at gmail.com>:
> 
> > Hello.
> > I cannot use it because the cpu_util metric is missing. I tried to make it work but
> > have not managed to yet. It needs some code to make it work. It seems no one cares
> > about balancing resources on the cloud.
> > 
> > On Wed, Mar 15, 2023, 6:26 PM Thomas Goirand <zigo at debian.org> wrote:
> > 
> > > On 12/11/22 01:59, Nguyễn Hữu Khôi wrote:
> > > > Watcher is not good because it needs a cpu metric
> > > > such as cpu load from Ceilometer, which is removed, so we cannot use it.
> > > 
> > > Hi!
> > > 
> > > What do you mean by "Ceilometer [is] removed"? It certainly isn't dead,
> > > and it works well... If by that, you mean "ceilometer-api" is removed,
> > > then yes, but then you can use gnocchi.
> > > 
> > > Cheers,
> > > 
> > > Thomas Goirand (zigo)
> > > 
> > > 




More information about the openstack-discuss mailing list