[openstack-dev] [Heat] convergence rally test results (so far)

Angus Salkeld asalkeld at mirantis.com
Thu Sep 3 06:56:19 UTC 2015


On Thu, Sep 3, 2015 at 3:53 AM Zane Bitter <zbitter at redhat.com> wrote:

> On 02/09/15 04:55, Steven Hardy wrote:
> > On Wed, Sep 02, 2015 at 04:33:36PM +1200, Robert Collins wrote:
> >> On 2 September 2015 at 11:53, Angus Salkeld <asalkeld at mirantis.com> wrote:
> >>
> >>> 1. limit the number of resource actions in parallel (maybe based on the
> >>> number of cores)
> >>
> >> I'm having trouble mapping that back to 'and heat-engine is running on
> >> 3 separate servers'.
> >
> > I think Angus was responding to my test feedback, which was a different
> > setup, one 4-core laptop running heat-engine with 4 worker processes.
> >
> > In that environment, the level of additional concurrency becomes a
> > problem because all heat workers become so busy that creating a large
> > stack DoSes the Heat services, and in my case also the DB.
> >
> > If we had a configurable option, similar to num_engine_workers, which
> > enabled control of the number of resource actions in parallel, I probably
> > could have controlled that explosion in activity to a more manageable
> > series of tasks, e.g. I'd set num_resource_actions to
> > (num_engine_workers*2) or something.
>
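
For concreteness, the kind of cap described above could be sketched roughly as follows. This is only an illustration in plain Python 3 threads, not Heat code: `ActionLimiter`, `run`, and `demo` are hypothetical names, and `num_resource_actions` is the option proposed in this thread, not an existing setting.

```python
import threading
import time


class ActionLimiter:
    """Hypothetical cap on concurrently running resource actions."""

    def __init__(self, num_engine_workers, factor=2):
        # Mirrors the suggested num_resource_actions = num_engine_workers * 2.
        self.limit = num_engine_workers * factor
        self._slots = threading.BoundedSemaphore(self.limit)
        self._lock = threading.Lock()
        self.in_flight = 0
        self.peak = 0  # highest concurrency observed, for inspection only

    def run(self, action, *args):
        # Block until one of the `limit` slots is free, then run the action.
        with self._slots:
            with self._lock:
                self.in_flight += 1
                self.peak = max(self.peak, self.in_flight)
            try:
                return action(*args)
            finally:
                with self._lock:
                    self.in_flight -= 1


def demo(num_actions=30, num_engine_workers=4):
    """Push many short actions through the limiter and return it."""
    limiter = ActionLimiter(num_engine_workers)
    threads = [
        threading.Thread(target=limiter.run, args=(time.sleep, 0.01))
        for _ in range(num_actions)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return limiter
```

With 4 workers the limiter admits at most 8 actions at once; the rest wait their turn instead of all starting together.
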
> I think that's actually the opposite of what we need.
>
> The resource actions are just sent to the worker queue to get processed
> whenever. One day we will get to the point where we are overflowing the
> queue, but I guarantee that we are nowhere near that day. If we are
> DoSing ourselves, it can only be because we're pulling *everything* off
> the queue and starting it in separate greenthreads.
>

The worker does not use a greenthread per job the way service.py does.
The issue is that if your actions are fast, you can hit the DB hard:

QueuePool limit of size 5 overflow 10 reached, connection timed out, timeout 30

It seems like it's not very hard to hit this limit. It comes from simply
loading the resource in the worker:
"/home/angus/work/heat/heat/engine/worker.py", line 276, in check_resource
"/home/angus/work/heat/heat/engine/worker.py", line 145, in _load_resource
"/home/angus/work/heat/heat/engine/resource.py", line 290, in load
resource_objects.Resource.get_obj(context, resource_id)
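
To make the numbers in that error concrete: size 5 plus overflow 10 means at most 15 connections can be checked out at once, and a 16th caller waits up to 30 seconds before failing. Below is a toy model of that behaviour using only the Python 3 standard library. It is not SQLAlchemy's implementation; `ToyPool` and `run_jobs` are made-up names for illustration.

```python
import threading
import time


class ToyPool:
    """Toy stand-in for a bounded connection pool (illustration only)."""

    def __init__(self, pool_size=5, max_overflow=10, timeout=30.0):
        # Total connections that may be checked out at the same time.
        self.capacity = pool_size + max_overflow
        self.timeout = timeout
        self._slots = threading.BoundedSemaphore(self.capacity)

    def checkout(self):
        # Wait up to `timeout` seconds for a free slot, then give up.
        if not self._slots.acquire(timeout=self.timeout):
            raise TimeoutError("pool limit of %d reached" % self.capacity)

    def checkin(self):
        self._slots.release()


def run_jobs(num_jobs, pool, hold=0.5):
    """Run num_jobs threads that each hold one connection for `hold` seconds.

    Returns the number of jobs that timed out waiting for a connection.
    """
    failures = []
    lock = threading.Lock()

    def job():
        try:
            pool.checkout()
        except TimeoutError:
            with lock:
                failures.append(1)
            return
        try:
            time.sleep(hold)  # stands in for loading the resource
        finally:
            pool.checkin()

    threads = [threading.Thread(target=job) for _ in range(num_jobs)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return len(failures)
```

With a short timeout, 15 concurrent jobs all get a connection, while any jobs beyond 15 that arrive during the hold period time out. That matches how a burst of fast actions can exhaust the pool.
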



>
> In an ideal world, we might only ever pull one task off that queue at a
> time. Any time the task is sleeping, we would use that time for processing stuff
> off the engine queue (which needs a quick response, since it is serving
> the ReST API). The trouble is that you need a *huge* number of
> heat-engines to handle stuff in parallel. In the reductio-ad-absurdum
> case of a single engine only processing a single task at a time, we're
> back to creating resources serially. So we probably want a higher number
> than 1. (Phase 2 of convergence will make tasks much smaller, and may
> even get us down to the point where we can pull only a single task at a
> time.)
>
> However, the fewer engines you have, the more greenthreads we'll have to
> allow to get some semblance of parallelism. To the extent that more
> cores means more engines (which assumes all running on one box, but
> still), the number of cores is negatively correlated with the number of
> tasks that we want to allow.
>
> Note that all of the greenthreads run in a single CPU thread, so having
> more cores doesn't help us at all with processing more stuff in parallel.
>

Except, as I said above, we are not creating greenthreads in worker.

-A


>
> cheers,
> Zane.
>