<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=us-ascii">
</head>
<body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;" class="">
<br class="">
<div>
<blockquote type="cite" class="">
<div class="">On 08 Aug 2016, at 11:51, Ricardo Rocha <<a href="mailto:rocha.porto@gmail.com" class="">rocha.porto@gmail.com</a>> wrote:</div>
<br class="Apple-interchange-newline">
<div class="">
<div class="">Hi.<br class="">
<br class="">
On Mon, Aug 8, 2016 at 1:52 AM, Clint Byrum &lt;<a href="mailto:clint@fewbar.com" class="">clint@fewbar.com</a>&gt; wrote:<br class="">
<blockquote type="cite" class="">Excerpts from Steve Baker's message of 2016-08-08 10:11:29 +1200:<br class="">
<blockquote type="cite" class="">On 05/08/16 21:48, Ricardo Rocha wrote:<br class="">
<blockquote type="cite" class="">Hi.<br class="">
<br class="">
Quick update is 1000 nodes and 7 million reqs/sec :) - and the number<br class="">
of requests should be higher but we had some internal issues. We have<br class="">
a submission for Barcelona to provide a lot more details.<br class="">
<br class="">
But a couple of questions came up during the exercise:<br class="">
<br class="">
1. Do we really need a volume in the VMs? On large clusters this is a<br class="">
burden, and local storage alone should be enough.<br class="">
<br class="">
2. We observe a significant delay (~10min, which is half the total<br class="">
time to deploy the cluster) in Heat when it seems to be crunching the<br class="">
kube_minions nested stacks. Once it's done, it still adds new stacks<br class="">
gradually, so it doesn't look like it precomputed all the info in advance.<br class="">
<br class="">
Anyone tried to scale Heat to stacks this size? We end up with a stack<br class="">
with:<br class="">
* 1000 nested stacks (depth 2)<br class="">
* 22000 resources<br class="">
* 47008 events<br class="">
<br class="">
And we already changed most of the timeout/retry values for RPC to get<br class="">
this working.<br class="">
<br class="">
This delay is already visible in clusters of 512 nodes, but 40% of the<br class="">
time with 1000 nodes seems like something we could improve. Any hints on<br class="">
Heat configuration optimizations for large stacks are very welcome.<br class="">
<br class="">
</blockquote>
Yes, we recommend you set the following in /etc/heat/heat.conf [DEFAULT]:<br class="">
max_resources_per_stack = -1<br class="">
<br class="">
Enforcing this for large stacks has a very high overhead; we make this<br class="">
change in the TripleO undercloud too.<br class="">
<br class="">
</blockquote>
<br class="">
Wouldn't this necessitate having a private Heat just for Magnum? Not<br class="">
having a resource limit per stack would leave your Heat engines<br class="">
vulnerable to being DoS'd by malicious users, since one can create many<br class="">
thousands of resources, and thus Python objects, in just a couple<br class="">
of cleverly crafted templates (which is why I added the setting).<br class="">
<br class="">
This makes perfect sense in the undercloud of TripleO, which is a<br class="">
private, single-tenant OpenStack. But for Magnum, now you're talking<br class="">
about the Heat that users have access to.<br class="">
</blockquote>
<br class="">
We already have it at -1 for these tests. As you say, a malicious user<br class="">
could DoS it, but right now this is manageable in our environment. Maybe<br class="">
it could move to a per-tenant value, or some special policy? The stacks are<br class="">
created under a separate domain for Magnum (for trustees), so we could<br class="">
also use that for separation.<br class="">
<br class="">
</div>
</div>
</blockquote>
<div><br class="">
</div>
If there were a quota system within Heat for items like stacks and resources, this could be</div>
<div>controlled through that.</div>
<div><br class="">
</div>
<div>Looks like <a href="https://blueprints.launchpad.net/heat/+spec/add-quota-api-for-heat" class="">https://blueprints.launchpad.net/heat/+spec/add-quota-api-for-heat</a> did not make it into upstream though.</div>
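<div><br class="">
</div>
<div>A rough sketch of the kind of check such a quota could perform, in Python. The ResourceQuota class, its method names, the project identifiers and the default of 1000 are all hypothetical and only for illustration; this is not an existing Heat API. It reuses the -1 "unlimited" convention of max_resources_per_stack, but applied per project or domain rather than globally:</div>
<pre class="">
```python
from dataclasses import dataclass, field
from typing import Dict

# Same convention as max_resources_per_stack = -1 in heat.conf.
UNLIMITED = -1


@dataclass
class ResourceQuota:
    """Hypothetical per-project resource quota (not an existing Heat API)."""

    # Illustrative default limit applied to projects without an override.
    default_limit: int = 1000
    # Maps a project (or domain) identifier to its own limit.
    overrides: Dict[str, int] = field(default_factory=dict)

    def limit_for(self, project_id: str) -> int:
        """Return the override for this project, else the default limit."""
        return self.overrides.get(project_id, self.default_limit)

    def allows(self, project_id: str, current: int, requested: int) -> bool:
        """Check whether `requested` more resources fit within the limit."""
        limit = self.limit_for(project_id)
        if limit == UNLIMITED:
            return True
        return limit >= current + requested


# Hypothetical setup: only the Magnum trustee domain is unlimited.
quota = ResourceQuota(overrides={"magnum-trustees": UNLIMITED})
print(quota.allows("magnum-trustees", 0, 22000))
print(quota.allows("other-project", 990, 22))
```
</pre>
<div>With something along these lines, the Magnum domain could create the 22000-resource stacks while every other project stays under the default limit, without running a separate Heat.</div>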
<div><br class="">
</div>
<div>Tim</div>
<div><br class="">
<blockquote type="cite" class="">
<div class="">
<div class="">A separate heat instance sounds like an overkill.<br class="">
<br class="">
Cheers,<br class="">
Ricardo<br class="">
<br class="">
<blockquote type="cite" class=""><br class="">
__________________________________________________________________________<br class="">
OpenStack Development Mailing List (not for usage questions)<br class="">
Unsubscribe: <a href="mailto:OpenStack-dev-request@lists.openstack.org" class="">
OpenStack-dev-request@lists.openstack.org</a>?subject:unsubscribe<br class="">
<a href="http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev" class="">http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev</a><br class="">
</blockquote>
<br class="">
</div>
</div>
</blockquote>
</div>
<br class="">
</body>
</html>