[openstack-dev] [magnum][heat] 2 million requests / sec, 100s of nodes
zbitter at redhat.com
Mon Aug 8 16:17:04 UTC 2016
On 05/08/16 12:01, Hongbin Lu wrote:
> Add [heat] to the title to get more feedback.
> Best regards,
> *From:*Ricardo Rocha [mailto:rocha.porto at gmail.com]
> *Sent:* August-05-16 5:48 AM
> *To:* OpenStack Development Mailing List (not for usage questions)
> *Subject:* Re: [openstack-dev] [magnum] 2 million requests / sec, 100s
> of nodes
> Quick update is 1000 nodes and 7 million reqs/sec :) - and the number of
> requests should be higher but we had some internal issues. We have a
> submission for Barcelona to provide a lot more details.
> But a couple of questions came up during the exercise:
> 1. Do we really need a volume in the VMs? On large clusters this is a
> burden, and local storage only should be enough?
> 2. We observe a significant delay (~10min, which is half the total time
> to deploy the cluster) on heat when it seems to be crunching the
> kube_minions nested stacks. Once it's done, it still adds new stacks
> gradually, so it doesn't look like it precomputed all the info in advance.
> Anyone tried to scale Heat to stacks this size? We end up with a stack with:
> * 1000 nested stacks (depth 2)
> * 22000 resources
> * 47008 events
Wow, that's a big stack :) TripleO has certainly been pushing the
boundaries of how big a stack Heat can handle, but this sounds like
another step up even from there.
> We have already changed most of the timeout/retry values for RPC to get
> this working.
> This delay is already visible in clusters of 512 nodes, but 40% of the
> time in 1000 nodes seems like something we could improve. Any hints on
> Heat configuration optimizations for large stacks very welcome.
Y'all were right to set max_resources_per_stack to -1, because actually
checking the number of resources in a tree of stacks is sloooooow. (Not
as slow as it used to be when it was O(n^2), but still pretty slow.)
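To make the cost concrete, here is a toy sketch of counting resources across a tree of stacks. It is not Heat's code: the Stack class, count_tree() and build_cluster() are made-up names, and the shape (an empty parent with 1000 nested stacks of 22 resources each) is only an assumption chosen to match the totals quoted above. One pass over the tree is linear in the number of stacks; if every stack in the tree repeated that pass for itself, the work would grow quadratically.

```python
class Stack:
    """Toy stand-in for a stack: its own resources plus nested stacks."""

    def __init__(self, n_resources, children=()):
        self.n_resources = n_resources
        self.children = list(children)


def count_tree(root):
    """Return (total resources, stacks visited) in one pass over the tree."""
    total = visited = 0
    pending = [root]
    while pending:
        stack = pending.pop()
        visited += 1
        total += stack.n_resources
        pending.extend(stack.children)
    return total, visited


def build_cluster(n_nested=1000, per_nested=22):
    # Assumed shape: an empty parent with n_nested equally sized children.
    return Stack(0, [Stack(per_nested) for _ in range(n_nested)])


if __name__ == "__main__":
    total, visited = count_tree(build_cluster())
    # If each stack repeated the whole-tree count for itself, the number
    # of visits would be visited * visited rather than visited.
    print(total, visited, visited * visited)
```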
We're actively working to make Heat more horizontally scalable
(even at the cost of some performance penalty) so that if you need to
handle this kind of scale you'll be able to reach it by adding more
heat-engines. Another big step forward on this front is coming with
Newton, as (barring major bugs) the convergence_engine architecture will
be enabled by default.
RPC timeouts are caused by the synchronous work that Heat does before
returning a result to the caller. Most of this is validation of the data
provided by the user. We've talked about trying to reduce the amount of
validation done synchronously to a minimum (just enough to guarantee
that we can store and retrieve the data from the DB) and push the rest
into the asynchronous part of the stack operation alongside the actual
create/update. (FWIW, TripleO typically uses a 600s RPC timeout.)
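For reference, a small sketch that prints the heat.conf fragment these settings would correspond to. The option names are my assumption of where the values live (max_resources_per_stack and oslo.messaging's rpc_response_timeout, both under [DEFAULT]), and render_heat_conf_fragment() is just a helper invented for this example; please check the names against the configuration reference for your release before copying anything.

```python
import configparser
import io


def render_heat_conf_fragment():
    """Render the two settings discussed above as an INI fragment."""
    cfg = configparser.ConfigParser()
    cfg["DEFAULT"] = {
        # -1 disables the per-stack resource limit check.
        "max_resources_per_stack": "-1",
        # Seconds to wait for an RPC reply (assumed option name).
        "rpc_response_timeout": "600",
    }
    buf = io.StringIO()
    cfg.write(buf)
    return buf.getvalue()


if __name__ == "__main__":
    print(render_heat_conf_fragment())
```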
The "QueuePool limit of size ... overflow ... reached" error sounds like
we're pulling messages off the queue even when we don't have threads
available in the pool to pass them to. If you have a fix for this it
would be much appreciated. However, I don't think simply leaving
messages on the queue is guaranteed to avoid deadlocks either.
The problem with very large trees of nested stacks is not so much that
there are a lot of stacks (Heat doesn't have _too_ much trouble with
that) but that they all have to be processed simultaneously. For
example, to validate the top-level stack you also need to validate all
of the lower-level stacks before returning the result. If higher-level
stacks consume all of the threads in the pools then you'll get a
deadlock, as you'll be unable to validate any lower-level stacks. At
this point you'd have maxed out the capacity of your Heat engines to
process stacks simultaneously and you'd need to scale out to more Heat
engines. The solution is probably to try to limit the number of nested
stack validations we send out concurrently.
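Roughly what I mean by limiting concurrency, as a toy asyncio sketch rather than anything resembling Heat's RPC code. Stack, Stats, validate() and run_demo() are all hypothetical names, and asyncio.sleep(0) stands in for the real validation work. Each parent sends out at most `limit` child validations at a time and still waits for all of them before finishing, so the dependency on the children remains; the cap only bounds how much capacity one parent can demand at once.

```python
import asyncio


class Stack:
    """Toy stand-in for a stack with nested child stacks."""

    def __init__(self, name, children=()):
        self.name = name
        self.children = list(children)


class Stats:
    def __init__(self):
        self.in_flight = 0   # validations currently running
        self.peak = 0        # highest in_flight value observed
        self.validated = 0   # validations completed


async def validate(stack, limit, stats):
    """Validate the children (at most `limit` at a time), then the stack."""
    stats.in_flight += 1
    stats.peak = max(stats.peak, stats.in_flight)
    try:
        gate = asyncio.Semaphore(limit)

        async def send(child):
            # Wait for a free slot before sending out the next child.
            async with gate:
                await validate(child, limit, stats)

        await asyncio.gather(*(send(c) for c in stack.children))
        await asyncio.sleep(0)  # stand-in for validating this stack itself
        stats.validated += 1
    finally:
        stats.in_flight -= 1


def run_demo(n_children=50, limit=8):
    root = Stack("cluster", [Stack(f"minion-{i}") for i in range(n_children)])
    stats = Stats()
    asyncio.run(validate(root, limit, stats))
    return stats


if __name__ == "__main__":
    result = run_demo()
    print(result.validated, result.peak)
```

With one parent and a limit of 8, no more than 9 validations (the parent plus 8 children) should be running at any moment, however many children the parent has.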
Improving performance at scale is a priority area of focus for the Heat
team at the moment. That's been mostly driven by TripleO and Sahara, but
we'd be very keen to hear about the kind of loads that Magnum is putting
on Heat and to work with folks across the community to figure out how to
improve things for those use cases.