Open Stack

Sat Apr 26 05:10:47 UTC 2014

Jay Pipes <jaypipes at gmail.com> wrote on 04/25/2014 06:28:38 PM:

> On Fri, 2014-04-25 at 22:00 +0000, Day, Phil wrote:
> > Hi Jay,
> > 
> > I'm going to disagree with you on this one, because:
> 
> No worries, Phil, I expected some dissention and I completely appreciate
> your feedback and perspective :)

I myself sit between the two camps on this one.  I share Jay's unhappiness 
with server groups, as they are today.  However, I see an evolutionary 
path forward from today's server groups to something that makes much more 
sense to me and my colleagues.  I do not see as clear a path forward from 
Jay's proposal, but am willing to think more about that.  I will start by 
outlining where I want to go, and then address the specific points that 
have been raised in this email thread so far.

I would like to see the OpenStack architecture have a place for what I 
have been calling holistic scheduling.  That is, making a simultaneous 
scheduling decision about a whole collection of virtual resources of 
various types (Nova VMs, Cinder storage volumes, network bandwidth, ...), 
taking into account a rich composite policy statement.  This is not just a 
pipe dream; my group has been doing this for years.  What we are 
struggling with is finding an evolutionary path to a place where it can be 
done in an OpenStack context.  One part of the struggle is due to the fact 
that in our previous work the part that is analogous to Heat is not 
optional, while in OpenStack Heat is most definitely optional.  Some of 
the things I have written in the past have not clearly separated 
scheduling and Heat and left Heat optional, but please rest assured that I 
am making no proposal now to violate either principle.  I see scheduling and 
orchestration as distinct functions; the potential for confusion arises 
because (a) holistic scheduling needs input that has some similarity to 
what you see in a Heat template today and (b) making the scheduling 
simultaneous requires moving it from its current place (downstream from 
orchestration) to an earlier place (upstream from orchestration).

The OpenStack community has historically used the word "scheduling" to 
refer to placement problems, always in the time-invariant 
now-and-foreseeable future, and I am following that usage here.  Other 
communities consider "scheduling" to also include interesting variation 
over time, but I am not trying to bring that into this debate.  (Nor am I 
denying its interest and value; I am just trying to keep this discussion 
focused.)

The discussion in this email thread has recognized that scheduler hints 
are applied only at creation time today, but it has already been noted 
(e.g., in http://summit.openstack.org/cfp/details/99) that scheduling 
policy statements should be retained for the lifetime of the virtual 
resource.  That is true regardless of whether the policy statements come 
in through today's server groups, the alternate proposal from Jay Pipes, 
or some other alternative or evolution.

I agree with Jay that groups have no inherent connection to scheduling. My 
colleagues and I have found grouping to be a useful technique to make APIs 
and documents more concise, and we find a top-level group to be the 
natural scope for a simultaneous decision.  We have been working example 
problems with a non-trivial size and amount of structure; when you get 
beyond small simple examples you see the usefulness of grouping more 
clearly.  For a couple of examples, see a 3-tier web application in 
https://docs.google.com/drawings/d/1nridrUUwNaDrHQoGwSJ_KXYC7ik09wUuV3vXw1MyvlY 
and a deployment of an IBM product called "Connections" in 
https://docs.google.com/file/d/0BypF9OutGsW3ZUYwYkNjZGJFejQ (this latter 
example has been shorn of its networking policies, and is a literal 
abstract of something we did using software that could not cope with 
policies applied directly to virtual resources, so some of its groups are 
not well motivated --- but others *are*).  The groups are handy for making 
it possible to draw pictures without too many lines, and write documents 
that are readably concise.  But everything said with groups could be said 
without groups, if we allowed policy statements to be placed on virtual 
resources and on pairs of virtual resources --- it would just take a heck 
of a lot more policy statements.
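
To make the conciseness point concrete, here is a minimal sketch of how one group-level policy statement expands into the equivalent pairwise statements. The function name and the server names are hypothetical and are not taken from any OpenStack API.

```python
from itertools import combinations

def expand_group_policy(policy, members):
    """Rewrite one group-level policy as the equivalent pairwise statements."""
    return [(policy, a, b) for a, b in combinations(sorted(members), 2)]

# Hypothetical group of four web servers that must be kept apart.
web_tier = {"web1", "web2", "web3", "web4"}
pairwise = expand_group_policy("anti-co-location", web_tier)

print(len(pairwise))  # 6 pairwise statements replace 1 group statement
```

The count grows quadratically with group size (n*(n-1)/2 pairs), which is why the group form stays readable on larger examples.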

If you want to make a simultaneous decision about several virtual 
resources, you need a description of all those virtual resources up-front. 
So even in a totally Heat-free environment you find yourself wanting 
something that looks like a document or data structure describing multiple 
virtual resources --- and the policies that apply to them, and thus also 
the groups that allow for concise applications of policies; note also that 
the whole set of virtual resources involved is a group.
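
As an illustration only, such an up-front description might look like the following data structure. The class names, field names and policy names are assumptions made for this sketch; they are not an existing Nova or Heat format.

```python
from dataclasses import dataclass, field

@dataclass
class Resource:
    name: str
    kind: str  # e.g. "vm" or "volume"

@dataclass
class Group:
    name: str
    members: list = field(default_factory=list)   # Resources or nested Groups
    policies: list = field(default_factory=list)  # policy names applied to this group

    def resources(self):
        """Flatten nested groups into the resources to be scheduled together."""
        found = []
        for member in self.members:
            found.extend(member.resources() if isinstance(member, Group) else [member])
        return found

web = Group("web", [Resource("web1", "vm"), Resource("web2", "vm")], ["anti-co-location"])
db = Group("db", [Resource("db1", "vm"), Resource("db1-data", "volume")], ["co-location"])
app = Group("app", [web, db])  # the top-level group is the scope of one decision

names = [r.name for r in app.resources()]
print(names)  # ['web1', 'web2', 'db1', 'db1-data']
```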

When you have an example of non-trivial size and structure, you generally 
do not want to make a change by a collection of atomic edits, each 
individually scheduled.  Rather you want to state the new set of virtual 
resources and policies that you want to move to, allowing a simultaneous 
decision about the new placement solution.
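
A toy sketch of why the simultaneous decision matters: with the made-up capacities below, placing one resource at a time can paint itself into a corner that a joint search avoids. The host names, sizes and the brute-force search are all assumptions for illustration; a real placement engine would use a proper solver rather than enumeration.

```python
from itertools import product

HOSTS = {"hostA": 2, "hostB": 1}   # free capacity units (hypothetical)
DEMAND = {"x": 1, "y": 2}          # capacity each virtual resource needs
SEPARATE = [("x", "y")]            # pairs that must land on different hosts

def feasible(assignment):
    """Check capacity and separation for a (possibly partial) assignment."""
    used = {host: 0 for host in HOSTS}
    for vm, host in assignment.items():
        used[host] += DEMAND[vm]
    if any(used[host] > HOSTS[host] for host in HOSTS):
        return False
    return all(assignment[a] != assignment[b]
               for a, b in SEPARATE if a in assignment and b in assignment)

def place_sequentially(order):
    """First-fit, one resource at a time, never revisiting earlier choices."""
    assignment = {}
    for vm in order:
        for host in HOSTS:
            trial = {**assignment, vm: host}
            if feasible(trial):
                assignment = trial
                break
        else:
            return None  # no host fits this resource given earlier choices
    return assignment

def place_simultaneously():
    """Consider all resources together and return the first feasible plan."""
    vms = list(DEMAND)
    for hosts in product(HOSTS, repeat=len(vms)):
        assignment = dict(zip(vms, hosts))
        if feasible(assignment):
            return assignment
    return None

print(place_sequentially(["x", "y"]))  # None
print(place_simultaneously())          # {'x': 'hostB', 'y': 'hostA'}
```

The sequential result depends on arrival order (placing "y" first happens to work), whereas the joint search does not.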

I can get where I want to go by evolutionary steps forward from 
today's server groups.  There are four fairly independent dimensions in 
which the evolution can proceed.  One is to go from today's sequential 
decision-making to simultaneous decision-making; I am drafting a blueprint 
on that now (
https://blueprints.launchpad.net/nova/+spec/simultaneous-server-group). 
Another is to expand beyond Nova.  Another is to allow nested groups. 
Another dimension is to expand and refine the catalog of policy types.  In 
older documents (in the particulars in 
https://wiki.openstack.org/wiki/Heat/PolicyExtension --- do not be 
distracted by the Heat context, the policy catalog is a scheduling issue 
--- and in the generalities in 
https://docs.google.com/document/d/17OIiBoIavih-1y4zzK0oXyI66529f-7JTCVj-BcXURA) 
you see more kinds of policies and the idea that policies may have 
parameters.  For example, co-location (a more precise formulation of 
"affinity"; it clearly says we are talking about placement rather than 
some other sort of affinity --- such as networking, which has its own 
policy types) takes a parameter indicating the level of the physical 
hierarchy at which the placement should be the same.  You also see the 
idea that a policy statement can be shaded as either a hard requirement or 
a soft preference.
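
A minimal sketch of a parameterized policy with hard/soft shading, assuming a made-up three-level physical hierarchy. The class, the LOCATION table and the "reject"/"penalize" outcomes are illustrative choices and do not come from the documents linked above.

```python
from dataclasses import dataclass

# Hypothetical physical hierarchy: where each host sits at each level.
LOCATION = {
    "h1": {"host": "h1", "rack": "r1", "zone": "z1"},
    "h2": {"host": "h2", "rack": "r1", "zone": "z1"},
    "h3": {"host": "h3", "rack": "r2", "zone": "z1"},
}

@dataclass(frozen=True)
class CoLocation:
    level: str          # level of the hierarchy at which placement must match
    hard: bool = True   # True: requirement; False: preference

    def satisfied(self, host_a, host_b):
        return LOCATION[host_a][self.level] == LOCATION[host_b][self.level]

def evaluate(policy, host_a, host_b):
    """Return how a scheduler might treat this pair under the policy."""
    if policy.satisfied(host_a, host_b):
        return "ok"
    return "reject" if policy.hard else "penalize"

print(evaluate(CoLocation("rack"), "h1", "h2"))              # ok
print(evaluate(CoLocation("host"), "h1", "h2"))              # reject
print(evaluate(CoLocation("rack", hard=False), "h1", "h3"))  # penalize
```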

In my own group's work, and in the joint proposal with folks from Cisco 
and VMware, we stipulated that the groups nest into a tree.  From my own 
group's perspective, that is merely a matter of conservatism --- we think 
it might make some implementations easier, and it has been an acceptable 
restriction for the examples we have worked.  I am not strongly wedded to 
that restriction.  The placement technology that we have in mind 
(transforming to a constrained optimization problem) does not require that 
restriction.  If we lifted that restriction, defining a group by an 
arbitrary predicate (tag match, or whatever) would be an acceptable way of 
expressing grouping.  We could even keep the restriction to a tree of 
groups while defining groups by predicates --- the restriction would be a 
restriction on the predicates.
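
To illustrate the last point, here is a sketch in which groups are defined by tag predicates and the tree restriction becomes a check on the resulting member sets: every pair of groups must be either nested or disjoint. The server names and tags are invented for the example.

```python
# Hypothetical servers and their tags.
SERVERS = {
    "web1": {"tier:web", "app:shop"},
    "web2": {"tier:web", "app:shop"},
    "db1": {"tier:db", "app:shop"},
}

def members(tag):
    """Group defined by a predicate: every server carrying the tag."""
    return frozenset(name for name, tags in SERVERS.items() if tag in tags)

def forms_tree(groups):
    """True if every pair of groups is either nested or disjoint."""
    return all(a <= b or b <= a or not (a & b) for a in groups for b in groups)

nested = [members("app:shop"), members("tier:web"), members("tier:db")]
print(forms_tree(nested))  # True

# A group that straddles two tiers overlaps "tier:web" without nesting.
straddling = nested + [frozenset({"web2", "db1"})]
print(forms_tree(straddling))  # False
```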

If we were to start instead from the proposal of Jay Pipes, the same four 
dimensions of evolution apply.  To get from sequential to simultaneous 
decision-making requires an up-front statement of scope (i.e., member 
descriptions) and policy (for which grouping would help), rather than the 
at-creation-time statements. As mentioned above, the grouping could 
continue to be expressed by predicates rather than explicit statements of 
membership.  If we recognize that we are embarking on a general program of 
expanding and refining the policy language then particular command line 
syntax like "--not-near-tag $TAG" would probably change to something 
generic like "--constraint anti-co-location:level=rack --between 
rsc_or_grp_1 --and rsc_or_grp2" (and the API has a corresponding issue). 
The expansion beyond Nova has issues that seem to me to be pretty 
orthogonal to this debate over the evolutionary starting point.
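
Purely as a sketch of what parsing that generic form could involve, the following uses Python's argparse on the example string above. The option names come from the hypothetical syntax in the previous paragraph, not from any existing nova client, and the program name "boot-sketch" is made up.

```python
import argparse

def parse_constraint(text):
    """Split 'name:key=value,key=value' into a policy name and its parameters."""
    name, _, params = text.partition(":")
    parsed = dict(p.split("=", 1) for p in params.split(",") if p)
    return name, parsed

parser = argparse.ArgumentParser(prog="boot-sketch")
parser.add_argument("--constraint", type=parse_constraint, required=True)
parser.add_argument("--between", dest="first", required=True)
parser.add_argument("--and", dest="second", required=True)  # "and" is a Python keyword

args = parser.parse_args([
    "--constraint", "anti-co-location:level=rack",
    "--between", "rsc_or_grp_1",
    "--and", "rsc_or_grp2",
])
print(args.constraint)          # ('anti-co-location', {'level': 'rack'})
print(args.first, args.second)  # rsc_or_grp_1 rsc_or_grp2
```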

> > i) This is a feature that was discussed in at least one if not two
> Design Summits and went through a long review period, it wasn't one 
> of those changes that merged in 24 hours before people could take a 
> good look at it.
> 
> Completely understood. That still doesn't mean we can't propose to get
> rid of it early instead of letting it sit around when an alternate
> implementation would be better for the user of OpenStack.

I also tend to favor looking ahead to validate that we are headed in a 
good direction.  That can conflict with the focus on quickly making 
incremental improvements --- if we allow ourselves to suffer analysis 
paralysis.  I hope a limited discussion can lead to some consensus on 
direction without seriously delaying the first small steps. 
However, in the case of scheduling, we are already queued up behind the 
scheduler forklift and no-db-scheduler, so there is no danger of imminent 
progress on my evolutionary program.

> >   Whatever you feel about the implementation,  it is now in the 
> API and we should assume that people have started coding against it.

Yes, we should support backwards compatibility in general.  And in this 
particular case, there may be no immediate conflict.  If we decide we 
prefer Jay's way of expressing these concepts, we can retain support for 
the old way of expressing grouping and policy too (hopefully with a 
unified representation underneath).  That only leaves us with another 
general problem in interface evolution: when and how to delete old stuff 
that everybody should eventually stop using.

> ...
> >   I don't think it gives any credibility to Openstack as a 
> platform if we yank features back out just after they've landed.
> 
> Perhaps not, though I think we have less credibility if we don't
> recognize when a feature isn't implemented with users in mind and leave
> it in the code base to the detriment and confusion of users. We
> absolutely must, IMO, as a community, be able to say "this isn't right"
> and have a path for changing or removing something.
> 
> If that path is deprecation vs outright removal, so be it, I'd be cool
> with that. I'd just like to nip this anti-feature in the bud early so
> that it doesn't become the next "feature" like file-injection to persist
> in Nova well after its time has come and passed.

I am no mind reader, but I suspect the designers of server groups had 
users in mind.  But just having them in mind is not really adequate; a 
serious approach would be to involve actual users in evaluating a design 
proposal before it proceeds.  Remember the calls for OpenStack to be more 
user-driven?

> > ii) Server Group - It's a way of defining a group of servers, and 
> the initial thing (only thing right now) you can define for such a 
> group is the affinity or anti-affinity for scheduling.
> 
> We already had ways of defining groups of servers. This new "feature"
> doesn't actually define a group of servers. It defines a policy, which
> is not particularly useful, as it's something that is better specified
> at the time of launching.

As I mentioned above, I want to do stuff (i.e., schedule) with groups 
before the members are actually created/updated.

> >   Maybe in time we'll add other group properties or operations - 
> like "delete all the servers in a group" (I know some QA folks that 
> would love to have that feature).
> 
> We already have the ability to define a group of servers using key=value
> tags. Deleting all servers in a group is a three-line bash script that
> loops over the results of a nova list command and calls nova delete.
> Trust me, I've done group deletes in this way many times.
> 
> >   I don't see why it shouldn't be possible to have a server group 
> that doesn't have a scheduling policy associated to it.
> 
> I don't think the grouping of servers should have *anything* to do with
> scheduling :) That's the point of my proposal. Servers can and should be
> grouped using simple tags or key=value pair tags.
> 
> The grouping of servers together doesn't have anything of substance to
> do with scheduling policies.

Right, it just allows more concise statements of policies AND is a kind of 
scope where you can apply simultaneous decision-making.

> 
> >    I don't see any cognitive dissonance here - I think you're just 
> assuming that the only reason for being able to group servers is for
> scheduling.
> 
> Again, I don't think scheduling and grouping of servers has anything to
> do with each other, thus my proposal to remove the relationship between
> groups of servers and scheduling policies, which is what the existing
> server group API and implementation does.
> 
> > iii) If the issue is that you can't add or remove servers from a 
> group, then why don't we add those operations to the API (you could 
> add a server to a group providing doing so  doesn't break any policy
> that might be associated with the group). 

In fact you see some of this already in the blueprint for server groups 
(https://blueprints.launchpad.net/nova/+spec/instance-group-api-extension) 
--- look all the way through its whiteboard.

> We already have this ability today, thus my proposal to get rid of
> server groups.
> 
> >   Seems like a useful addition to me.
> 
> It's an addition that isn't needed, as we already have this today.
> 
> > iv) Since the user created the group, and chose a name for it that
> is presumably meaningful, then I don't understand why you think "--
> group XXX" isn't going to be meaningful to that same user ?
> 
> See point above about removing the unnecessary relationship between
> grouping of servers and scheduling policies.
> 
> > So I think there are a bunch of API operations missing, but I 
> don't see any advantage in throwing away what's now in place and 
> replacing it with a tag mechanism that basically says "everything 
> with this tag is in a sort of group".
> 
> We already have the tag group mechanism in place, that's kind of what
> I've been saying...

Regards,
Mike

[openstack-dev] [nova] Proposal: remove the server groups feature
