[openstack-dev] [nova] [savanna] Host information for non admin users - from a holistic scheduler

Mike Spreitzer mspreitz at us.ibm.com
Sun Sep 15 05:14:10 UTC 2013


Alex, my understanding is that the motivation for rack-awareness in Hadoop 
is optimizing availability rather than networking.  The good news, for 
those of us who favor a holistic scheduler, is that it can take both sorts 
of things into account when/where desired.

Yes, the case of a public cloud is more difficult than the case of a 
private cloud.  My understanding of Amazon's attitude, for example, is 
that they do not want to give out any bits of information about placement 
--- even though there are known techniques to reverse-engineer it, Amazon 
does not want to help that along at all.  Giving out obscured information 
--- some bits but not all --- is still disfavored.  Let me give a little 
background on how my group deals with placement for availability, then 
discuss options for the public cloud.

Our holistic scheduler takes as input something we call a virtual resource 
topology (VRT); other people use words like pattern, template, 
application, and cluster for such a thing.  It is an assembly of virtual 
resources that one tenant wants to instantiate.  In a VRT the resources 
are arranged into a tree of groups, with the VRT itself as the root.  We use 
the groups for concise statements of various sorts, which I will omit here 
for the sake of simplicity.  As far as direct location constraints are 
concerned, there is just one primitive thing: it is a relationship between 
two virtual resources and it is parameterized by a sense (positive or 
negative) and a level in the physical hierarchy (e.g., physical machine 
(PM), chassis, rack).  Thus: a negative relationship between VM1 and VM2 
at the rack level means that VM1 and VM2 must go on different racks; a 
positive relationship between VM3 and VM4 at the PM level means those two 
VMs must be on the same host.  Additionally, each constraint can be hard 
or soft: a hard constraint must be satisfied while a soft constraint is a 
preference.
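
As an illustration only, the primitive described above could be modeled 
roughly as follows.  This is a minimal sketch, not our scheduler's actual 
data model: the names Level, LocationConstraint, and satisfied are 
hypothetical, and the check assumes each virtual resource's location is 
given as a mapping from hierarchy level to a container identifier.

```python
from dataclasses import dataclass
from enum import Enum


class Level(Enum):
    """Levels of the physical hierarchy (hypothetical names)."""
    PM = "pm"
    CHASSIS = "chassis"
    RACK = "rack"


@dataclass(frozen=True)
class LocationConstraint:
    """One relationship between two virtual resources (illustrative only)."""
    a: str
    b: str
    positive: bool      # True: co-locate; False: anti-co-locate
    level: Level        # level of the physical hierarchy it applies at
    hard: bool = True   # False means the constraint is only a preference


def satisfied(constraint, location):
    """Return True if the placement honors the constraint.

    `location` maps a resource name to {Level: container id}; this input
    shape is an assumption made for the sketch.
    """
    same = (location[constraint.a][constraint.level]
            == location[constraint.b][constraint.level])
    return same == constraint.positive


location = {
    "VM1": {Level.PM: "pm1", Level.RACK: "rack1"},
    "VM2": {Level.PM: "pm7", Level.RACK: "rack2"},
}
# Negative relationship at the rack level: VM1 and VM2 on different racks.
apart = LocationConstraint("VM1", "VM2", positive=False, level=Level.RACK)
print(satisfied(apart, location))
```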

Consider the example of six interchangeable VMs (say VM1, VM2, ... VM6) 
that should be spread across at least two racks with no more than half the 
VMs on any one rack.  How to say that with a collection of location 
primitives?  What we do is establish three rack-level anti-co-location 
constraints: one between VM1 and VM2, one between VM3 and VM4, and one 
between VM5 and VM6.  That is not the most obvious representation.  You 
might have expected this: nine rack-level anti-co-location constraints, 
one for every pair in the Cartesian product of {VM1, VM2, VM3} and {VM4, 
VM5, VM6}.  Now consider what happens if the physical system has three 
racks and room for only two additional VMs on each rack.  With the latter 
set of constraints, there is no acceptable placement.  With the sparser 
set that we use, there are allowed placements.  In short, an obvious set 
of constraints may rule out otherwise acceptable placements.
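
The three-rack example can be checked by brute force.  The sketch below 
is only an illustration under the stated assumptions (three racks, room 
for two more VMs on each); the rack names, the helper 
feasible_placements, and the exhaustive search are mine for this 
example and are not how a real scheduler would work.

```python
from itertools import product

VMS = ["VM1", "VM2", "VM3", "VM4", "VM5", "VM6"]
RACKS = ["R1", "R2", "R3"]   # hypothetical rack names
FREE_SLOTS = 2               # room for two additional VMs on each rack


def feasible_placements(anti_pairs):
    """Enumerate VM-to-rack assignments that respect rack capacity and
    every hard rack-level anti-co-location pair."""
    found = []
    for assignment in product(RACKS, repeat=len(VMS)):
        if any(assignment.count(rack) > FREE_SLOTS for rack in RACKS):
            continue
        placement = dict(zip(VMS, assignment))
        if any(placement[a] == placement[b] for a, b in anti_pairs):
            continue
        found.append(placement)
    return found


# The sparse set: three pairwise constraints.
SPARSE = [("VM1", "VM2"), ("VM3", "VM4"), ("VM5", "VM6")]
# The "obvious" set: every pair between {VM1..VM3} and {VM4..VM6}.
DENSE = [(a, b) for a in VMS[:3] for b in VMS[3:]]

sparse_ok = feasible_placements(SPARSE)
dense_ok = feasible_placements(DENSE)
print(len(SPARSE), "constraints:", len(sparse_ok), "placements")
print(len(DENSE), "constraints:", len(dense_ok), "placements")
```

The nine-constraint set admits no placement because each trio needs at 
least two racks of its own and the two trios may not share a rack, which 
would take four racks.  The three-constraint set still caps every rack at 
three VMs, one from each pair, and leaves placements available.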

I see two ways to give guaranteed-accurate rack awareness to Hadoop: 
constrain the placement so tightly that you know enough to configure 
Hadoop before the placement decision is made, or extract placement 
information after the placement decision is made.  The public cloud 
setting rules out the latter, leaving only the former.  This can be done, 
at a cost of suffering pattern rejections that would not occur if you did 
not have to over-constrain the placement.

One more option is to give up on guaranteed accuracy: prescribe a 
placement with sufficient precision to inform Hadoop, and so inform 
Hadoop, but make that prescription a preference rather than a hard 
constraint.  If the actual placement does not fully meet all the 
preferences, Hadoop is not informed of the differences and so will suffer 
in "non-functional" ways but still get the job done (modulo all those 
non-functional considerations, like tolerating a rack crash).  When your 
preferences are not met, it is because the system is very loaded and your 
only choice is between operating in some degraded way or not at all --- 
you might as well take the degraded operation.
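
To make the preference-based option concrete, here is a small sketch, 
again under assumptions of my own: the prescription is a preferred rack 
per VM, the function best_effort_placement and the free-slot numbers are 
hypothetical, and the exhaustive search simply stands in for whatever a 
real scheduler would do.  Hadoop would be configured from the preferred 
mapping, not from the placement actually chosen.

```python
from itertools import product


def best_effort_placement(preferred, free_slots):
    """Pick a placement that fits the free slots and misses as few
    per-VM rack preferences as possible (soft constraints only).

    Returns (placement, misses), or (None, None) if nothing fits.
    """
    vms = list(preferred)
    racks = list(free_slots)
    best, best_misses = None, None
    for assignment in product(racks, repeat=len(vms)):
        if any(assignment.count(r) > free_slots[r] for r in racks):
            continue
        misses = sum(1 for vm, rack in zip(vms, assignment)
                     if preferred[vm] != rack)
        if best_misses is None or misses < best_misses:
            best, best_misses = dict(zip(vms, assignment)), misses
    return best, best_misses


# Prescribed (and reported to Hadoop): VM1-VM3 on R1, VM4-VM6 on R2.
preferred = {"VM1": "R1", "VM2": "R1", "VM3": "R1",
             "VM4": "R2", "VM5": "R2", "VM6": "R2"}
# A loaded system: R1 has room for only two more VMs.
free_slots = {"R1": 2, "R2": 3, "R3": 2}

placement, misses = best_effort_placement(preferred, free_slots)
print(placement, misses)
```

In this loaded case one VM lands off its preferred rack, so Hadoop's view 
of the topology is slightly wrong, but the cluster is still instantiated 
rather than rejected.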

Regards,
Mike