<div dir="ltr">I've been thinking about this use case for a DHT-like design. I think I want to do what other people have alluded to here and intercept problematic requests like this one in some sort of "pre-sending to ring segment" stage. In this case the pre-stage could decide to send the request off to a scheduler that has a more complete view of the world. Alternatively, instead of making a single request for 50 instances, just send 50 requests for one instance each. Is that a viable thing to do for this use case?<div>
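<div>A minimal sketch of the "50 requests for one" idea, under loud assumptions: the function name, the scheduler names, and the count of three schedulers are all hypothetical (the thread only says each scheduler has room for 40), and placement constraints between instances are ignored entirely. It only shows how a pre-stage might split one large request and let full schedulers pass single requests along the ring.</div>
<pre>
```python
from itertools import cycle


def place_one_by_one(free_slots, count):
    """Split one request for `count` instances into single-instance requests.

    free_slots maps a (hypothetical) scheduler name to the number of
    instances its ring segment can still hold. Each single request goes to
    the next scheduler in round-robin order; a full scheduler passes it on.
    Returns {scheduler: instances placed}, or None if the cluster as a
    whole does not have room.
    """
    # The pre-stage is assumed to know the cluster-wide total up front.
    if count > sum(free_slots.values()):
        return None

    remaining = dict(free_slots)
    placed = {name: 0 for name in remaining}
    ring = cycle(sorted(remaining))

    for _ in range(count):
        name = next(ring)
        # Total capacity was checked above, so some scheduler still has room.
        while remaining[name] == 0:
            name = next(ring)  # kicked out for retry on another scheduler
        remaining[name] -= 1
        placed[name] += 1
    return placed


if __name__ == "__main__":
    # Three schedulers with room for 40 each, one request for 50 instances.
    print(place_one_by_one({"sched-a": 40, "sched-b": 40, "sched-c": 40}, 50))
```
</pre>
<div>With these assumed numbers no single scheduler could take all 50, but the split request fits. This does not address the constraint problem raised further down the thread, where one-at-a-time decisions can paint later requests into a corner.</div>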

<br></div><div style>-Mike</div></div><div class="gmail_extra"><br><br><div class="gmail_quote">On Tue, Nov 19, 2013 at 7:03 PM, Joshua Harlow <span dir="ltr"><<a href="mailto:harlowja@yahoo-inc.com" target="_blank">harlowja@yahoo-inc.com</a>></span> wrote:<br>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">At Yahoo at least, 50+ simultaneous instances will be the common case (maybe we are<br>

special).<br>

<br>

Think of what happens on <a href="http://www.yahoo.com" target="_blank">www.yahoo.com</a>, say, during the Olympics: <a href="http://news.yahoo.com" target="_blank">news.yahoo.com</a><br>

could need 50+ instances very quickly (especially if, say, a gold medal is won by<br>

some famous person). So I wouldn't discount those being the common case<br>

(may not be common for some, but is common for others). In fact, any<br>

website with spurious/spiky traffic will have the same desire, so it<br>

might be a target use case for website-like companies (or ones that can't<br>

predict spikes up front).<br>

<br>

Overall though, I think what you said about 'don't fill it up' is good<br>

general knowledge. Filling up stuff beyond a certain threshold is<br>

dangerous just in general (one should only push the limits so far before<br>

madness).<br>

<div class="HOEnZb"><div class="h5"><br>

On 11/19/13 4:08 PM, "Clint Byrum" <<a href="mailto:clint@fewbar.com">clint@fewbar.com</a>> wrote:<br>

<br>

>Excerpts from Chris Friesen's message of 2013-11-19 12:18:16 -0800:<br>

>> On 11/19/2013 01:51 PM, Clint Byrum wrote:<br>

>> > Excerpts from Chris Friesen's message of 2013-11-19 11:37:02 -0800:<br>

>> >> On 11/19/2013 12:35 PM, Clint Byrum wrote:<br>

>> >><br>

>> >>> Each scheduler process can own a different set of resources. If they<br>

>> >>> each grab instance requests in a round-robin fashion, then they will<br>

>> >>> fill their resources up in a relatively well balanced way until one<br>

>> >>> scheduler's resources are exhausted. At that time it should bow out of<br>

>> >>> taking new instances. If it can't fit a request in, it should kick the<br>

>> >>> request out for retry on another scheduler.<br>

>> >>><br>

>> >>> In this way, they only need to be in sync in that they need a way to<br>

>> >>> agree on who owns which resources. A distributed hash table that gets<br>

>> >>> refreshed whenever schedulers come and go would be fine for that.<br>

>> >><br>

>> >> That has some potential, but at high occupancy you could end up refusing<br>

>> >> to schedule something because no one scheduler has sufficient resources<br>

>> >> even if the cluster as a whole does.<br>

>> >><br>

>> ><br>

>> > I'm not sure what you mean here. What resource spans multiple compute<br>

>> > hosts?<br>

>><br>

>> Imagine the cluster is running close to full occupancy, each scheduler<br>

>> has room for 40 more instances.  Now I come along and issue a single<br>

>> request to boot 50 instances.  The cluster has room for that, but none<br>

>> of the schedulers do.<br>

>><br>

><br>

>You're assuming that all 50 come in at once. That is only one use case<br>

>and not at all the most common.<br>

><br>

>> >> This gets worse once you start factoring in things like heat and<br>

>> >> instance groups that will want to schedule whole sets of resources<br>

>> >> (instances, IP addresses, network links, cinder volumes, etc.) at once<br>

>> >> with constraints on where they can be placed relative to each other.<br>

>><br>

>> > Actually that is rather simple. Such requests have to be serialized<br>

>> > into a work-flow. So if you say "give me 2 instances in 2 different<br>

>> > locations" then you allocate 1 instance, and then another one with<br>

>> > 'not_in_location(1)' as a condition.<br>

>><br>

>> Actually, you don't want to serialize it, you want to hand the whole set<br>

>> of resource requests and constraints to the scheduler all at once.<br>

>><br>

>> If you do them one at a time, then early decisions made with<br>

>> less-than-complete knowledge can result in later scheduling requests<br>

>> failing due to being unable to meet constraints, even if there are<br>

>> actually sufficient resources in the cluster.<br>

>><br>

>> The "VM ensembles" document at<br>

>><br>

>><a href="https://docs.google.com/document/d/1bAMtkaIFn4ZSMqqsXjs_riXofuRvApa--qo4UTwsmhw/edit?pli=1" target="_blank">https://docs.google.com/document/d/1bAMtkaIFn4ZSMqqsXjs_riXofuRvApa--qo4UTwsmhw/edit?pli=1</a><br>

>> has a good example of how one-at-a-time scheduling can cause spurious<br>

>> failures.<br>

>><br>

>> And if you're handing the whole set of requests to a scheduler all at<br>

>> once, then you want the scheduler to have access to as many resources as<br>

>> possible so that it has the highest likelihood of being able to satisfy<br>

>> the request given the constraints.<br>

><br>

>This use case is real and valid, which is why I think there is room for<br>

>multiple approaches. For instance the situation you describe can also be<br>

>dealt with by just having the cloud stay under-utilized and accepting<br>

>that when you get over a certain percentage utilized spurious failures<br>

>will happen. We have a similar solution in the ext3 filesystem on Linux.<br>

>Don't fill it up, or suffer a huge performance penalty.<br>

><br>

>_______________________________________________<br>

>OpenStack-dev mailing list<br>

><a href="mailto:OpenStack-dev@lists.openstack.org">OpenStack-dev@lists.openstack.org</a><br>

><a href="http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev" target="_blank">http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev</a><br>

<br>

<br>


</div></div></blockquote></div><br></div>