[openstack-dev] [nova] Proposal: remove the server groups feature
Mike Spreitzer
mspreitz at us.ibm.com
Sat Apr 26 05:10:47 UTC 2014
Jay Pipes <jaypipes at gmail.com> wrote on 04/25/2014 06:28:38 PM:
> On Fri, 2014-04-25 at 22:00 +0000, Day, Phil wrote:
> > Hi Jay,
> >
> > I'm going to disagree with you on this one, because:
>
> No worries, Phil, I expected some dissension and I completely appreciate
> your feedback and perspective :)
I myself sit between the two camps on this one. I share Jay's unhappiness
with server groups, as they are today. However, I see an evolutionary
path forward from today's server groups to something that makes much more
sense to me and my colleagues. I do not see as clear a path forward from
Jay's proposal, but am willing to think more about that. I will start by
outlining where I want to go, and then address the specific points that
have been raised in this email thread so far.
I would like to see the OpenStack architecture have a place for what I
have been calling holistic scheduling. That is, making a simultaneous
scheduling decision about a whole collection of virtual resources of
various types (Nova VMs, Cinder storage volumes, network bandwidth, ...),
taking into account a rich composite policy statement. This is not just a
pipe dream; my group has been doing this for years. What we are
struggling with is finding an evolutionary path to a place where it can be
done in an OpenStack context. One part of the struggle is due to the fact
that in our previous work the part that is analogous to Heat is not
optional, while in OpenStack Heat is most definitely optional. Some of
the things I have written in the past did not clearly keep scheduling
separate from Heat or keep Heat optional, but please rest assured that I
am making no proposal now to violate either principle. I see scheduling and
orchestration as distinct functions; the potential for confusion arises
because (a) holistic scheduling needs input that has some similarity to
what you see in a Heat template today and (b) making the scheduling
simultaneous requires moving it from its current place (downstream from
orchestration) to an earlier place (upstream from orchestration).
The OpenStack community has historically used the word "scheduling" to
refer to placement problems, always in the time-invariant
now-and-foreseeable future, and I am following that usage here. Other
communities consider "scheduling" to also include interesting variation
over time, but I am not trying to bring that into this debate. (Nor am I
denying its interest and value; I am just trying to keep this discussion
focused.)
The discussion in this email thread has recognized that scheduler hints
are applied only at creation time today, but it has already been noted
(e.g., in http://summit.openstack.org/cfp/details/99) that scheduling
policy statements should be retained for the lifetime of the virtual
resource. That is true regardless of whether the policy statements come
in through today's server groups, the alternate proposal from Jay Pipes,
or some other alternative or evolution.
I agree with Jay that groups have no inherent connection to scheduling. My
colleagues and I have found grouping to be a useful technique to make APIs
and documents more concise, and we find a top-level group to be the
natural scope for a simultaneous decision. We have been working example
problems with a non-trivial size and amount of structure; when you get
beyond small simple examples you see the usefulness of grouping more
clearly. For a couple of examples, see a 3-tier web application in
https://docs.google.com/drawings/d/1nridrUUwNaDrHQoGwSJ_KXYC7ik09wUuV3vXw1MyvlY
and a deployment of an IBM product called "Connections" in
https://docs.google.com/file/d/0BypF9OutGsW3ZUYwYkNjZGJFejQ (this latter
example has been shorn of its networking policies, and is a literal
abstract of something we did using software that could not cope with
policies applied directly to virtual resources, so some of its groups are
not well motivated --- but others *are*). The groups are handy for making
it possible to draw pictures without too many lines, and write documents
that are readably concise. But everything said with groups could be said
without groups, if we allowed policy statements to be placed on virtual
resources and on pairs of virtual resources --- it would just take a heck
of a lot more policy statements.
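To put a rough number on "a heck of a lot more", here is a minimal sketch in Python. It assumes a symmetric policy such as anti-co-location; the function name and the member names are made up for illustration and are not part of any OpenStack API.

```python
from itertools import combinations


def expand_group_policy(policy, members):
    """Rewrite one group-level policy as equivalent pairwise statements.

    Hypothetical helper: assumes the policy is symmetric, so one statement
    per unordered pair of members is enough.
    """
    return [(policy, a, b) for a, b in combinations(members, 2)]


web_tier = ["web-1", "web-2", "web-3", "web-4"]
pairwise = expand_group_policy("anti-co-location", web_tier)

# One group statement becomes n*(n-1)/2 pairwise statements.
print(len(pairwise))  # 6 for a group of 4 members
```

A group of n members needs n*(n-1)/2 pairwise statements, so a 20-member group would need 190 of them in place of one.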
If you want to make a simultaneous decision about several virtual
resources, you need a description of all those virtual resources up-front.
So even in a totally Heat-free environment you find yourself wanting
something that looks like a document or data structure describing multiple
virtual resources --- and the policies that apply to them, and thus also
the groups that allow for concise applications of policies; note also that
the whole set of virtual resources involved is a group.
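As a rough illustration of such a data structure, here is a minimal Python sketch. The class names, field names, and resource kinds are all hypothetical; nothing here corresponds to an existing Nova or Heat schema, and policies are reduced to bare strings to keep the sketch short.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Union


@dataclass
class Resource:
    """One virtual resource to be placed (hypothetical model)."""
    name: str
    kind: str  # e.g. "nova-vm" or "cinder-volume"


@dataclass
class Group:
    """A named set of resources and/or nested groups, with policies."""
    name: str
    members: List[Union["Group", Resource]] = field(default_factory=list)
    policies: List[str] = field(default_factory=list)


def all_resources(group: Group) -> Iterator[Resource]:
    """Yield every resource in the group, descending into nested groups."""
    for member in group.members:
        if isinstance(member, Group):
            yield from all_resources(member)
        else:
            yield member


web = Group("web-tier",
            [Resource("web-1", "nova-vm"), Resource("web-2", "nova-vm")],
            ["anti-co-location"])
db = Group("db-tier",
           [Resource("db-1", "nova-vm"), Resource("db-vol", "cinder-volume")],
           ["co-location"])
# The top-level group is the scope of one simultaneous placement decision.
app = Group("three-tier-app", [web, db])

names = [r.name for r in all_resources(app)]
print(names)
```

The point of the sketch is only that the whole description exists before anything is created, so a scheduler could look at all four resources at once.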
When you have an example of non-trivial size and structure, you generally
do not want to make a change by a collection of atomic edits, each
individually scheduled. Rather you want to state the new set of virtual
resources and policies that you want to move to, allowing a simultaneous
decision about the new placement solution.
I can get where I want to go by evolutionary steps forward from
today's server groups. There are four fairly independent dimensions in
which the evolution can proceed. One is to go from today's sequential
decision-making to simultaneous decision-making; I am drafting a blueprint
on that now (
https://blueprints.launchpad.net/nova/+spec/simultaneous-server-group).
A second is to expand beyond Nova. A third is to allow nested groups.
The fourth is to expand and refine the catalog of policy types. In
older documents (in the particulars in
https://wiki.openstack.org/wiki/Heat/PolicyExtension --- do not be
distracted by the Heat context, the policy catalog is a scheduling issue
--- and in the generalities in
https://docs.google.com/document/d/17OIiBoIavih-1y4zzK0oXyI66529f-7JTCVj-BcXURA
) you see more kinds of policies and the idea that policies may have
parameters. For example, co-location (a more precise formulation of
"affinity": it clearly says we are talking about placement rather than
some other sort of affinity --- such as networking, which has its own
policy types) takes a parameter indicating the level of the physical
hierarchy at which the placement should be the same. You also see the
idea that a policy statement can be shaded as either a hard requirement or
a soft preference.
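A parameterized policy statement with hard/soft shading might be modeled roughly as follows. This is a sketch under stated assumptions: the names `PolicyStatement`, `Strictness`, and `penalty`, the two-level hierarchy (host, rack), and the penalty values are all invented here for illustration.

```python
import math
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict


class Strictness(Enum):
    HARD = "hard"  # must hold in any acceptable placement
    SOFT = "soft"  # preferred, but may be traded off


@dataclass
class PolicyStatement:
    policy_type: str                      # "co-location" or "anti-co-location"
    params: Dict[str, str] = field(default_factory=dict)
    strictness: Strictness = Strictness.HARD


def violates(stmt: PolicyStatement, loc_a: dict, loc_b: dict) -> bool:
    """Check a pairwise statement at the hierarchy level named in params."""
    level = stmt.params["level"]          # e.g. "host" or "rack"
    same = loc_a[level] == loc_b[level]
    if stmt.policy_type == "co-location":
        return not same
    if stmt.policy_type == "anti-co-location":
        return same
    raise ValueError(f"unknown policy type: {stmt.policy_type}")


def penalty(stmt: PolicyStatement, loc_a: dict, loc_b: dict) -> float:
    """Assumed cost model: hard violations are infeasible, soft ones cost 1."""
    if not violates(stmt, loc_a, loc_b):
        return 0.0
    return math.inf if stmt.strictness is Strictness.HARD else 1.0


# Two VMs on different hosts in the same rack.
loc_a = {"host": "h1", "rack": "r1"}
loc_b = {"host": "h2", "rack": "r1"}

rack_rule = PolicyStatement("anti-co-location", {"level": "rack"})
host_rule = PolicyStatement("anti-co-location", {"level": "host"})
soft_rack_rule = PolicyStatement("anti-co-location", {"level": "rack"},
                                 Strictness.SOFT)

print(violates(rack_rule, loc_a, loc_b))   # True: same rack
print(violates(host_rule, loc_a, loc_b))   # False: different hosts
print(penalty(soft_rack_rule, loc_a, loc_b))  # 1.0
```

The same placement satisfies the host-level rule and violates the rack-level one, which is why the level has to be a parameter of the policy.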
In my own group's work, and in the joint proposal with folks from Cisco
and VMware, we stipulated that the groups nest into a tree. From my own
group's perspective, that is merely a matter of conservatism --- we think
it might make some implementations easier, and it has been an acceptable
restriction for the examples we have worked. I am not strongly wedded to
that restriction. The placement technology that we have in mind
(transforming to a constrained optimization problem) does not require that
restriction. If we lifted that restriction, defining a group by an
arbitrary predicate (tag match, or whatever) would be an acceptable way of
expressing grouping. We could even keep the restriction to a tree of
groups while defining groups by predicates --- the restriction would be a
restriction on the predicates.
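One way to read "a restriction on the predicates" is that, once the predicates are evaluated, any two resulting groups must be either disjoint or nested. The Python sketch below checks that condition; it is my interpretation rather than a settled definition, and the tag names and helper functions are hypothetical.

```python
from itertools import combinations


def members(predicate, resources):
    """Evaluate a group predicate against each resource's tags."""
    return frozenset(name for name, tags in resources.items()
                     if predicate(tags))


def forms_tree(groups):
    """True if every pair of groups is disjoint or one contains the other."""
    return all(a.isdisjoint(b) or a <= b or b <= a
               for a, b in combinations(groups, 2))


resources = {
    "web-1": {"app": "shop", "tier": "web", "zone": "east"},
    "web-2": {"app": "shop", "tier": "web", "zone": "west"},
    "db-1": {"app": "shop", "tier": "db", "zone": "east"},
}

whole_app = members(lambda t: t.get("app") == "shop", resources)
web_tier = members(lambda t: t.get("tier") == "web", resources)
db_tier = members(lambda t: t.get("tier") == "db", resources)
east = members(lambda t: t.get("zone") == "east", resources)

print(forms_tree([whole_app, web_tier, db_tier]))        # True
print(forms_tree([whole_app, web_tier, db_tier, east]))  # False
```

The "east" group cuts across the web tier (it contains web-1 but not web-2), so adding it breaks the tree; under the tree restriction that predicate would be rejected.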
If we were to start instead from the proposal of Jay Pipes, the same four
dimensions of evolution apply. To get from sequential to simultaneous
decision-making requires an up-front statement of scope (i.e., member
descriptions) and policy (for which grouping would help), rather than the
at-creation-time statements. As mentioned above, the grouping could
continue to be expressed by predicates rather than explicit statements of
membership. If we recognize that we are embarking on a general program of
expanding and refining the policy language then particular command line
syntax like "--not-near-tag $TAG" would probably change to something
generic like "--constraint anti-co-location:level=rack --between
rsc_or_grp_1 --and rsc_or_grp_2" (and the API has a corresponding issue).
The expansion beyond Nova has issues that seem to me to be pretty
orthogonal to this debate over the evolutionary starting point.
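To show that a generic constraint string like "anti-co-location:level=rack" stays easy to handle as the policy catalog grows, here is a small Python parsing sketch. No such option exists in the nova client today; the function name is made up, and allowing several comma-separated parameters is my own assumption beyond the single-parameter example above.

```python
def parse_constraint(spec):
    """Split 'type:key=value[,key=value...]' into (type, params).

    Hypothetical syntax: the comma-separated parameter list is an
    assumption made for this sketch.
    """
    policy_type, _, param_text = spec.partition(":")
    if not policy_type:
        raise ValueError(f"missing policy type in {spec!r}")
    params = {}
    if param_text:
        for item in param_text.split(","):
            key, sep, value = item.partition("=")
            if not sep or not key:
                raise ValueError(f"malformed parameter {item!r} in {spec!r}")
            params[key] = value
    return policy_type, params


parsed = parse_constraint("anti-co-location:level=rack")
print(parsed)
```

New policy types and parameters would then need no new command-line options, only new entries in the policy catalog.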
> > i) This is a feature that was discussed in at least one if not two
> > Design Summits and went through a long review period, it wasn't one
> > of those changes that merged in 24 hours before people could take a
> > good look at it.
>
> Completely understood. That still doesn't mean we can't propose to get
> rid of it early instead of letting it sit around when an alternate
> implementation would be better for the user of OpenStack.
I also tend to favor looking ahead to validate that we are headed in a
good direction. That can conflict with the focus on quickly making
incremental improvements --- if we allow ourselves to suffer analysis
paralysis. I hope a limited discussion can lead to some consensus on
direction without seriously delaying the first small steps.
However, in the case of scheduling, we are already queued up behind the
scheduler forklift and no-db-scheduler, so there is no danger of imminent
progress on my evolutionary program.
> > Whatever you feel about the implementation, it is now in the
> > API and we should assume that people have started coding against it.
Yes, we should support backwards compatibility in general. And in this
particular case, there may be no immediate conflict. If we decide we
prefer Jay's way of expressing these concepts, we can retain support for
the old way of expressing grouping and policy too (hopefully with a
unified representation underneath). That only leaves us with another
general problem in interface evolution: when and how to delete old stuff
that everybody should eventually stop using.
> ...
> > I don't think it gives any credibility to OpenStack as a
> > platform if we yank features back out just after they've landed.
>
> Perhaps not, though I think we have less credibility if we don't
> recognize when a feature isn't implemented with users in mind and leave
> it in the code base to the detriment and confusion of users. We
> absolutely must, IMO, as a community, be able to say "this isn't right"
> and have a path for changing or removing something.
>
> If that path is deprecation vs outright removal, so be it, I'd be cool
> with that. I'd just like to nip this anti-feature in the bud early so
> that it doesn't become the next "feature" like file-injection to persist
> in Nova well after its time has come and passed.
I am no mind reader, but I suspect the designers of server groups had
users in mind. But just having them in mind is not really adequate; a
serious approach would be to involve actual users in evaluating a design
proposal before it proceeds. Remember the calls for OpenStack to be more
user-driven?
> > ii) Server Group - It's a way of defining a group of servers, and
> > the initial thing (only thing right now) you can define for such a
> > group is the affinity or anti-affinity for scheduling.
>
> We already had ways of defining groups of servers. This new "feature"
> doesn't actually define a group of servers. It defines a policy, which
> is not particularly useful, as it's something that is better specified
> at the time of launching.
As I mentioned above, I want to do stuff (i.e., schedule) with groups
before the members are actually created/updated.
> > Maybe in time we'll add other group properties or operations -
> > like "delete all the servers in a group" (I know some QA folks that
> > would love to have that feature).
>
> We already have the ability to define a group of servers using key=value
> tags. Deleting all servers in a group is a three-line bash script that
> loops over the results of a nova list command and calls nova delete.
> Trust me, I've done group deletes in this way many times.
>
> > I don't see why it shouldn't be possible to have a server group
> > that doesn't have a scheduling policy associated to it.
>
> I don't think the grouping of servers should have *anything* to do with
> scheduling :) That's the point of my proposal. Servers can and should be
> grouped using simple tags or key=value pair tags.
>
> The grouping of servers together doesn't have anything of substance to
> do with scheduling policies.
Right, it just allows more concise statements of policies AND is a kind of
scope where you can apply simultaneous decision-making.
>
> > I don't see any cognitive dissonance here - I think you're just
> > assuming that the only reason for being able to group servers is for
> > scheduling.
>
> Again, I don't think scheduling and grouping of servers has anything to
> do with each other, thus my proposal to remove the relationship between
> groups of servers and scheduling policies, which is what the existing
> server group API and implementation does.
>
> > iii) If the issue is that you can't add or remove servers from a
> > group, then why don't we add those operations to the API (you could
> > add a server to a group providing doing so doesn't break any policy
> > that might be associated with the group).
In fact you see some of this already in the blueprint for server groups (
https://blueprints.launchpad.net/nova/+spec/instance-group-api-extension)
--- look all the way through its whiteboard.
> We already have this ability today, thus my proposal to get rid of
> server groups.
>
> > Seems like a useful addition to me.
>
> It's an addition that isn't needed, as we already have this today.
>
> > iv) Since the user created the group, and chose a name for it that
> > is presumably meaningful, then I don't understand why you think
> > "--group XXX" isn't going to be meaningful to that same user?
>
> See point above about removing the unnecessary relationship between
> grouping of servers and scheduling policies.
>
> > So I think there are a bunch of API operations missing, but I
> > don't see any advantage in throwing away what's now in place and
> > replacing it with a tag mechanism that basically says "everything
> > with this tag is in a sort of group".
>
> We already have the tag group mechanism in place, that's kind of what
> I've been saying...
Regards,
Mike