Thank you very much for sharing!
I will dig into it.
Nguyen Huu Khoi


On Thu, Mar 16, 2023 at 4:54 PM Sean Mooney <smooney@redhat.com> wrote:
On Thu, 2023-03-16 at 10:35 +0100, Dmitriy Rabotyagov wrote:
> Oh, thanks for that detailed explanation!
> I have been looking at the metrics weigher for years and have read through
> the code a couple of times, but never got it properly configured. That is
> very helpful, thanks a lot!

That tells me I should probably update the docs...
>
> On Thu, 16 Mar 2023, 09:46 Sean Mooney <smooney@redhat.com> wrote:
>
> > On Thu, 2023-03-16 at 02:03 +0100, Dmitriy Rabotyagov wrote:
> > > In the end, I don't fully understand the reasons behind the need for such
> > > a service.
> > >
> > > Fighting high load by migrating instances between computes means
> > > fighting the consequences rather than the root cause. It also brings
> > > more negative effects than positive ones for the end users' experience,
> > > as you're just moving the problem to another place and affecting more
> > > workloads with degraded performance.
> > >
> > > If you are struggling with high load on a daily basis, then you have too
> > > high a cpu_allocation_ratio set for your computes. High load issues
> > > always come from attempts to oversell too aggressively.
> > >
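
To make the overselling arithmetic above concrete, here is a minimal Python
sketch. It assumes the usual placement capacity formula,
(total - reserved) * allocation_ratio. The function name, host size, reserved
count and ratios are hypothetical and not taken from this thread.

```python
def schedulable_vcpus(physical_cpus, reserved_cpus, cpu_allocation_ratio):
    """Number of vCPUs the scheduler may hand out on one compute host.

    Assumption: capacity = (total - reserved) * allocation_ratio.
    """
    return int((physical_cpus - reserved_cpus) * cpu_allocation_ratio)


# Hypothetical host: 64 host CPUs, 4 of them reserved for the host itself.
for ratio in (1.0, 4.0, 16.0):
    print(ratio, schedulable_vcpus(64, 4, ratio))
```

The higher the ratio, the more vCPUs compete for the same physical CPUs, which
is why a ratio that is too high shows up as constant high load.
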
> > > If you have workloads in the cloud that always utilize all available
> > > CPUs, then you should consider having flavors and aggregates with
> > > CPU pinning, meaning providing physical CPUs for such workloads.
> > >
> > > Also, don't forget that it's worth setting more realistic numbers for
> > > reserved resources on computes, because the default 2 GB of RAM is
> > > usually too small.
> > I tend to agree, although there are some things you can do in the nova
> > scheduler to help,
> > e.g. preferring spreading over packing.
> >
> > For CPU load in particular you can also enable the metrics weigher.
> >
> > I have not read this thread in detail, although skimming it I see
> > references to ceilometer.
> > nova's metrics weigher has no dependency on it.
> > The metrics weigher
> >
> > https://github.com/openstack/nova/blob/master/nova/scheduler/weights/metrics.py
> > is configured by adding weight_setting in the scheduler config
> >
> > https://docs.openstack.org/nova/latest/configuration/config.html#metrics.weight_setting
> >
> >     [metrics]
> >     weight_setting = name1=1.0, name2=-1.0
> > and enabling the monitors in the nova-compute config
> >
> > https://docs.openstack.org/nova/latest/configuration/config.html#DEFAULT.compute_monitors
> >     [DEFAULT]
> >     compute_monitors = cpu.virt_driver
> >
> > ^ that is the only one we support
> >
> > The data fields we report are set here
> >
> > https://github.com/openstack/nova/blob/master/nova/compute/monitors/cpu/virt_driver.py#L52-L101
> >
> > The more interesting values are
> > "cpu.iowait.percent", "cpu.idle.percent" and "cpu.percent"
> >
> > We have a fairly large internal cloud that is used for dev and CI, and for
> > about the last 12 to 18 months they have been using this to help balance
> > the scheduling of instances, as we have a mix of hypervisor SKUs and this
> > helps balance system load.
> >
> >     [metrics]
> >     weight_setting = cpu.iowait.percent=-1.0, cpu.percent=-1.0, cpu.idle.percent=1.0
> >
> > You want iowait and cpu.percent to be negative since you want to avoid
> > hosts with high iowait or high CPU utilisation,
> > and you would want to prefer idle hosts if your intent is to balance load.
> >
> > iowait is actually included in cpu.percent, and in fact cpu.percent is
> > basically cpu load - idle, so
> >     [metrics]
> >     weight_setting = cpu.percent=-1.0
> > would have a similar effect, but you might want the extra granularity to
> > weight iowait vs idle differently.
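
As a rough illustration of how such a weight_setting ranks hosts, here is a
minimal Python sketch of a weighted sum over reported metrics. It is not
nova's implementation: the helper names, host names and metric values are
hypothetical, and the real weigher also normalises the result across hosts
before combining it with the other weighers.

```python
def parse_weight_setting(setting):
    """Parse 'name1=1.0, name2=-1.0' into a list of (name, ratio) pairs."""
    pairs = []
    for item in setting.split(","):
        item = item.strip()
        if not item:
            continue
        name, _, ratio = item.partition("=")
        pairs.append((name.strip(), float(ratio)))
    return pairs


def raw_weight(metrics, pairs):
    """Weighted sum of one host's reported metric values."""
    return sum(metrics[name] * ratio for name, ratio in pairs)


pairs = parse_weight_setting(
    "cpu.iowait.percent=-1.0, cpu.percent=-1.0, cpu.idle.percent=1.0")

# Hypothetical hosts with made-up values, expressed as fractions of 1.
hosts = {
    "busy-host": {"cpu.iowait.percent": 0.10,
                  "cpu.percent": 0.80,
                  "cpu.idle.percent": 0.20},
    "idle-host": {"cpu.iowait.percent": 0.01,
                  "cpu.percent": 0.10,
                  "cpu.idle.percent": 0.90},
}

# Highest weight first: that host is preferred for the next instance.
ranked = sorted(hosts, key=lambda h: raw_weight(hosts[h], pairs), reverse=True)
print(ranked)
```

With negative ratios on iowait and cpu.percent and a positive ratio on idle,
the mostly idle host ends up ahead of the busy one.
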
> >
> > So if you find the normal CPU/RAM/disk weighers are not sufficient to
> > balance based on load, check out the metrics weigher and see if that
> > helps. Just be aware that collecting the CPU metrics and providing them
> > to the scheduler will increase rabbitmq load a little, since we
> > periodically have to update those values for each compute. If you have a
> > lot of computes that might be problematic. It's one of the reasons we
> > decided not to add more metrics like this.
> >
> >
> >
> > >
> > >
> > >
> > > On Wed, 15 Mar 2023, 13:11 Nguyễn Hữu Khôi <nguyenhuukhoinw@gmail.com> wrote:
> > >
> > > > Hello.
> > > > I cannot use it because the cpu_util metric is missing. I have tried
> > > > to make it work but have not managed to yet. It needs some code to
> > > > make it work. It seems no one cares about balancing resources in the
> > > > cloud.
> > > >
> > > > On Wed, Mar 15, 2023, 6:26 PM Thomas Goirand <zigo@debian.org> wrote:
> > > >
> > > > > On 12/11/22 01:59, Nguyễn Hữu Khôi wrote:
> > > > > > Watcher is not good because it needs a CPU metric
> > > > > > such as cpu load in Ceilometer, which is removed, so we cannot use it.
> > > > >
> > > > > Hi!
> > > > >
> > > > > What do you mean by "Ceilometer [is] removed"? It certainly isn't dead,
> > > > > and it works well... If by that, you mean "ceilometer-api" is removed,
> > > > > then yes, but then you can use gnocchi.
> > > > >
> > > > > Cheers,
> > > > >
> > > > > Thomas Goirand (zigo)
> > > > >
> > > > >
> >
> >