[openstack-dev] [nova] is there a way to simulate thousands or millions of compute nodes?

Gareth academicgareth at gmail.com
Mon Dec 1 06:46:17 UTC 2014


@Michael

Okay, focusing on 'thousands' now. I know 'millions' is not a good metaphor
here. I also know the 'cells' functionality is nova's solution for large-scale
deployments. But it also makes sense to find and reproduce large-scale
problems in a relatively small-scale deployment.
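
To make the idea concrete, here is a rough sketch of many fake compute nodes
running on one machine, each sleeping for a random time to mimic the latency of
a real hypervisor call. This is not nova code and not nova's FakeDriver:
DelayedFakeNode, run_fleet and all of their parameters are hypothetical names
made up for illustration, and the exponential delay is only an assumption.

```python
import random
import time
from concurrent.futures import ThreadPoolExecutor


class DelayedFakeNode:
    """Hypothetical stand-in for a compute node (not nova's FakeDriver)."""

    def __init__(self, name, mean_delay=0.01, seed=None):
        self.name = name
        self.mean_delay = mean_delay  # assumed mean latency in seconds, must be > 0
        self.instances = set()
        self._rng = random.Random(seed)

    def spawn(self, instance_id):
        # Sleep for an exponentially distributed time to mimic the
        # latency of a real hypervisor operation.
        time.sleep(self._rng.expovariate(1.0 / self.mean_delay))
        self.instances.add(instance_id)
        return self.name


def run_fleet(num_nodes=1000, spawns=2000, workers=200, mean_delay=0.01):
    """Spread spawn requests round-robin over fake nodes and time the run."""
    nodes = [DelayedFakeNode("node-%d" % i, mean_delay, seed=i)
             for i in range(num_nodes)]
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(nodes[i % num_nodes].spawn, "inst-%d" % i)
                   for i in range(spawns)]
        for future in futures:
            future.result()  # re-raise any error from a worker thread
    elapsed = time.perf_counter() - start
    total = sum(len(node.instances) for node in nodes)
    return total, elapsed
```

Calling run_fleet() returns the number of instances created and the wall-clock
time taken. Threads only model concurrency here; they say nothing about the
message queue or DB load that a real deployment would see.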

@Sandy

> All-in-all, I think you'd be better off load testing each piece
> independently on a fixed hardware platform and faking out all the
> incoming/outgoing services....

I understand, and this is what I want to know. Is anyone doing work like
this? If so, I would like to join :)
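
As a rough illustration of "test the Scheduler with fake API calls and fake
compute nodes", the sketch below times a toy filter-and-weigh placement loop
against an in-memory list of fake hosts of growing size. It is not nova's
scheduler: make_fake_hosts, pick_host, time_scheduler and the host fields are
hypothetical, and a real test would also need to fake the DB and the queue.

```python
import random
import time


def make_fake_hosts(count, seed=0):
    """Build in-memory fake host states (hypothetical fields)."""
    rng = random.Random(seed)
    return [{"name": "host-%d" % i,
             "free_ram_mb": rng.choice([2048, 8192, 32768]),
             "free_vcpus": rng.randint(0, 16)}
            for i in range(count)]


def pick_host(hosts, ram_mb, vcpus):
    """Toy filter-and-weigh: drop hosts that do not fit, prefer most free RAM."""
    candidates = [h for h in hosts
                  if h["free_ram_mb"] >= ram_mb and h["free_vcpus"] >= vcpus]
    if not candidates:
        return None
    best = max(candidates, key=lambda h: h["free_ram_mb"])
    # Claim the resources so later requests see the updated state.
    best["free_ram_mb"] -= ram_mb
    best["free_vcpus"] -= vcpus
    return best["name"]


def time_scheduler(host_count, requests=100):
    """Issue fake create requests and return (placed, elapsed seconds)."""
    hosts = make_fake_hosts(host_count)
    start = time.perf_counter()
    placed = sum(1 for _ in range(requests)
                 if pick_host(hosts, 1024, 1) is not None)
    return placed, time.perf_counter() - start


if __name__ == "__main__":
    for n in (100, 1000, 10000):
        placed, elapsed = time_scheduler(n)
        print("%6d hosts: placed %d requests in %.4f s" % (n, placed, elapsed))
```

Running it shows how placement time grows with the number of fake hosts on a
fixed machine, which is the kind of baseline I am interested in.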



On Fri, Nov 28, 2014 at 8:36 AM, Sandy Walsh <sandy.walsh at rackspace.com>
wrote:

> >From: Michael Still [mikal at stillhq.com] Thursday, November 27, 2014 6:57
> PM
> >To: OpenStack Development Mailing List (not for usage questions)
> >Subject: Re: [openstack-dev] [nova] is there a way to simulate thousands
> or millions of compute nodes?
> >
> >I would say that supporting millions of compute nodes is not a current
> >priority for nova... We are actively working on improving support for
> >thousands of compute nodes, but that is via cells (so each nova deploy
> >except the top is still in the hundreds of nodes).
>
> <ramble on>
>
> Agreed, it wouldn't make much sense to simulate this on a single machine.
>
> That said, if one *was* to simulate this, there are the well known
> bottlenecks:
>
> 1. the API. How much can one node handle with given hardware specs? Which
> operations hit the DB the hardest?
> 2. the Scheduler. There's your API bottleneck and big load on the DB for
> Create operations.
> 3. the Conductor. Shouldn't be too bad, essentially just a proxy.
> 4. child-to-global-cell updates. Assuming a two-cell deployment.
> 5. the virt driver. YMMV.
> ... and that's excluding networking, volumes, etc.
>
> The virt driver should be load tested independently. So FakeDriver would
> be fine (with some delays added for common operations as Gareth suggests).
> Something like Bees-with-MachineGuns could be used to get a baseline metric
> for the API. Then it comes down to DB performance in the scheduler and
> conductor (for a single cell). Finally, inter-cell loads. Who blows out the
> queue first?
>
> All-in-all, I think you'd be better off load testing each piece
> independently on a fixed hardware platform and faking out all the
> incoming/outgoing services. Test the API with fake everything. Test the
> Scheduler with fake API calls and fake compute nodes. Test the conductor
> with fake compute nodes (not FakeDriver). Test the compute node directly.
>
> Probably all going to come down to the DB and I think there is some good
> performance data around that already?
>
> But I'm just spit-ballin' ... and I agree, not something I could see the
> Nova team taking on in the near term ;)
>
> -S



-- 
Gareth

*Cloud Computing, OpenStack, Distributed Storage, Fitness, Basketball*
*OpenStack contributor, kun_huang at freenode*
*My promise: if you find any spelling or grammar mistakes in my email from
Mar 1 2013, notify me *
*and I'll donate $1 or ¥1 to an open organization you specify.*