[ops][largescale-sig] How many compute nodes in a single cluster ?

David Ivey david.j.ivey at gmail.com
Wed Feb 3 15:03:24 UTC 2021


I never thought about it being like a CI cloud, but it would be very
similar in usage. I should clarify that those counts are actually physical
cores (AMD Epyc), so 128 and 256 threads, and yes, at least 1 TB of RAM per
node with Ceph shared storage.

That 400 actually caps out at about 415 instances per compute (the same
cap for both the 64- and 128-core hosts), which is where I run into
kernel/libvirt issues and nf_conntrack hits its limits and crashes. I don't
have specifics to give at the moment regarding that issue; I will have to
try to reproduce it once my other environment is freed up so I can test
again. I was in a hurry the last time it happened and did not get a chance
to gather all the information for a bug report.
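
For anyone hitting the same ceiling: the conntrack table size itself is
tunable. As a rough sketch only (the numbers below are illustrative, not
what I run in production), raising the limit on a compute node looks
something like this:

    # /etc/sysctl.d/99-conntrack.conf -- example values only
    net.netfilter.nf_conntrack_max = 1048576
    # buckets are normally sized around max/4; on older kernels this one is
    # read-only via sysctl and has to be set through the nf_conntrack
    # module parameter (hashsize) instead
    net.netfilter.nf_conntrack_buckets = 262144

    # apply without a reboot, then watch headroom vs the ceiling
    sysctl --system
    sysctl net.netfilter.nf_conntrack_count net.netfilter.nf_conntrack_max

That only buys headroom, of course; with iptables-based security groups
every extra instance still multiplies the number of tracked flows.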

Switching to the Python (native) binding for OVS, plus some tuning of
MariaDB, RabbitMQ, HAProxy and memcached, is how I got to the point where I
can accommodate that rate of turnover.
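
To give a flavour of what I mean by tuning, the knobs are roughly the ones
below. Treat the values as placeholders rather than my production settings;
the right numbers depend entirely on the environment.

    # MariaDB (my.cnf) -- the APIs hold a lot of connections under churn
    [mysqld]
    max_connections = 4096

    # oslo.db section in nova.conf / neutron.conf -- SQLAlchemy pool sizing
    [database]
    max_pool_size = 10
    max_overflow = 20

    # HAProxy defaults -- raise the connection ceiling and API timeouts
    maxconn 8192
    timeout client 300s
    timeout server 300s

    # memcached -- more memory and connections for caching
    memcached -m 2048 -c 4096

The general theme is that under this kind of churn the control plane runs
out of connections and pool slots long before the hardware runs out of
anything.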

On Wed, Feb 3, 2021 at 9:40 AM Sean Mooney <smooney at redhat.com> wrote:

> On Wed, 2021-02-03 at 09:05 -0500, David Ivey wrote:
> > I am not sure simply going off the number of compute nodes is a good
> > representation of scaling issues. I think it has a lot more to do with
> > density/networks/ports and the rate of churn in the environment, but I
> > could be wrong. For example, I only have 80 high-density computes (64 or
> > 128 CPUs with ~400 instances per compute) and I run into the same scaling
> > issues that are described in the Large Scale SIG, and I have to do a lot
> > of tuning to keep the environment stable. My environment is also somewhat
> > unique in the way it gets used, as I quite often have 2k to 4k instances
> > torn down and rebuilt within an hour or two, so my APIs are constantly
> > bombarded.
> Actually your environment sounds like a pretty typical CI cloud, where
> instances often have short lifetimes and you often have high density and
> large turnover.
> But you are correct that compute node scale alone is not a good indicator.
> Port, volume and instance counts are definitely factors, as is the
> workload profile.
>
> I'm just assuming your cloud is a CI cloud, but in terms of generic
> workload profiles that would seem to be the closest approximation I'm
> aware of to that type of creation and deletion in an hour period.
>
> 400 instances per compute, while a lot, is really not that unreasonable,
> assuming your typical host has 1+ TB of RAM and your guests typically have
> fewer than 4-8 cores each. With only 128 CPUs, going much above that would
> be oversubscribing the CPUs quite heavily. We generally don't recommend
> exceeding about 4x oversubscription for CPUs, even though the default is
> 16; that default is based on legacy reasons and effectively assumes
> website-hosting type workloads where the bottleneck is not CPU but disk
> and network IO.
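
For reference, the ratio Sean is describing is the cpu_allocation_ratio
knob in nova.conf. A minimal sketch of dialing it back from the legacy 16x
default to the suggested 4x (on newer releases initial_cpu_allocation_ratio
and the placement inventory come into play as well, so check your release
notes):

    [DEFAULT]
    # legacy default is 16.0; ~4x is the guidance above
    cpu_allocation_ratio = 4.0

With 128 CPUs that caps a host at roughly 512 vCPUs of guest capacity
instead of 2048.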
>
> With 400 instances per host, that also equates to at least 400 neutron
> ports. If you are using iptables, that's actually at least 1200 interfaces
> on the host (hybrid plug adds a tap, a qvb and a qvo device per port),
> which definitely has scaling issues on agent restart or host reboot.
>
> Using the python binding for ovs can help a lot, as can changing to the
> ovs firewall driver, since that removes the linux bridge and veth pair
> created for each neutron port when doing hybrid plug.
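
Those two changes both live in the OVS agent config. A rough sketch of what
I mean (double-check option names against your release; on recent releases
the native interfaces are already the default and some of these options are
deprecated):

    # /etc/neutron/plugins/ml2/openvswitch_agent.ini
    [ovs]
    ovsdb_interface = native
    of_interface = native

    [securitygroup]
    firewall_driver = openvswitch

Note that moving existing computes from iptables_hybrid to the openvswitch
firewall driver is not just a config flip; ports that were plugged the old
way keep their bridge/veth plumbing until the instances are re-plugged or
migrated.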
>
> >
> > On Tue, Feb 2, 2021 at 3:15 PM Erik Olof Gunnar Andersson <
> > eandersson at blizzard.com> wrote:
> >
> > > > The old value of 500 nodes max has not been true for a very long
> > > > time. RabbitMQ and the DB still tend to be the bottleneck to scaling
> > > > beyond 1500 nodes, though, outside of the operational overhead.
> > >
> > > We manage our scale with regions as well. With 1k nodes our RabbitMQ
> > > isn't breaking a sweat, and there are no signs that the database is
> > > hitting any limits. Our issues have mostly been limited to scaling
> > > Neutron, and to VM scheduling on Nova due to NUMA pinning.
> > > ------------------------------
> > > *From:* Sean Mooney <smooney at redhat.com>
> > > *Sent:* Tuesday, February 2, 2021 9:50 AM
> > > *To:* openstack-discuss at lists.openstack.org <
> > > openstack-discuss at lists.openstack.org>
> > > *Subject:* Re: [ops][largescale-sig] How many compute nodes in a single
> > > cluster ?
> > >
> > > On Tue, 2021-02-02 at 17:37 +0000, Arnaud Morin wrote:
> > > > Hey all,
> > > >
> > > > I will start the answers :)
> > > >
> > > > At OVH, our hard limit is around 1500 hypervisors per region.
> > > > It also depends a lot on the number of instances (and neutron ports).
> > > > The effects if we try to go above this number:
> > > > - load on the control plane (db/rabbit) increases a lot
> > > > - "burst" load is hard to manage (e.g. a restart of all neutron
> > > >   agents or nova-computes puts high pressure on the control plane)
> > > > - and of course, the failure domain is bigger
> > > >
> > > > Note that we don't use cells.
> > > > We are deploying multiple regions, but this is painful for our
> > > > clients to manage / understand.
> > > > We are looking for a solution to unify the regions, but we have not
> > > > found anything that fits our needs so far.
> > >
> > > I assume you do not see cells v2 as a replacement for multiple regions
> > > because they do not provide independent fault domains, and also
> > > because they are only a nova feature, so they do not solve scaling
> > > issues in other services like neutron which are stretched across all
> > > cells.
> > >
> > > Cells are a scaling mechanism, but the larger the cloud the harder it
> > > is to upgrade, and cells do not help with that; in fact, by adding
> > > more controllers they hinder upgrades.
> > >
> > > Separate regions can all be upgraded independently and can be fault
> > > tolerant if you don't share services between regions and use
> > > federation to avoid sharing keystone.
> > >
> > >
> > > Glad to hear you can manage 1500 compute nodes, by the way.
> > >
> > > The old value of 500 nodes max has not been true for a very long time.
> > > RabbitMQ and the DB still tend to be the bottleneck to scaling beyond
> > > 1500 nodes, though, outside of the operational overhead.
> > >
> > > >
> > > > Cheers,
> > > >
> > >
> > >
> > >
> > >
>
>
>