<tt><font size=2>Jay Pipes <jaypipes@gmail.com> wrote on 04/25/2014

06:28:38 PM:<br>

<br>

> On Fri, 2014-04-25 at 22:00 +0000, Day, Phil wrote:<br>

> > Hi Jay,<br>

> > <br>

> > I'm going to disagree with you on this one, because:<br>

> <br>

> No worries, Phil, I expected some dissention and I completely appreciate<br>

> your feedback and perspective :)</font></tt>

<br>

<br><tt><font size=2>I myself sit between the two camps on this one.  I

share Jay's unhappiness with server groups, as they are today.  However,

I see an evolutionary path forward from today's server groups to something

that makes much more sense to me and my colleagues.  I do not see

as clear a path forward from Jay's proposal, but am willing to think more

about that.  I will start by outlining where I want to go, and then

address the specific points that have been raised in this email thread

so far.</font></tt>

<br>

<br><tt><font size=2>I would like to see the OpenStack architecture have

a place for what I have been calling holistic scheduling.  That is

making a simultaneous scheduling decision about a whole collection of virtual

resources of various types (Nova VMs, Cinder storage volumes, network bandwidth,

...), taking into account a rich composite policy statement.  This

is not just a pipe dream; my group has been doing this for years.  What

we are struggling with is finding an evolutionary path to a place where

it can be done in an OpenStack context.  One part of the struggle

is due to the fact that in our previous work the part that is analogous

to Heat is not optional, while in OpenStack Heat is most definitely optional.

 Some of the things I have written in the past did not clearly separate

scheduling from Heat or leave Heat optional, but please rest assured that

I am making no proposal now that violates either principle.  I see scheduling

and orchestration as distinct functions; the potential for confusion arises

because (a) holistic scheduling needs input that has some similarity to

what you see in a Heat template today and (b) making the scheduling simultaneous

requires moving it from its current place (downstream from orchestration)

to an earlier place (upstream from orchestration).</font></tt>

<br>

<br><tt><font size=2>The OpenStack community has historically used the

word "scheduling" to refer to placement problems, always in the

time-invariant now-and-foreseeable future, and I am following that usage

here.  Other communities consider "scheduling" to also include

interesting variation over time, but I am not trying to bring that into

this debate.  (Nor am I denying its interest and value; I am just

trying to keep this discussion focused.)</font></tt>

<br>

<br><tt><font size=2>The discussion in this email thread has recognized

that scheduler hints are applied only at creation time today, but it has

already been noted (e.g., in </font></tt><a href=http://summit.openstack.org/cfp/details/99><tt><font size=2>http://summit.openstack.org/cfp/details/99</font></tt></a><tt><font size=2>)

that scheduling policy statements should be retained for the lifetime of

the virtual resource.  That is true regardless of whether the policy

statements come in through today's server groups, the alternate proposal

from Jay Pipes, or some other alternative or evolution.</font></tt>

<br>

<br><tt><font size=2>I agree with Jay that groups have no inherent connection

to scheduling.  My colleagues and I have found grouping to be a useful

technique to make APIs and documents more concise, and we find a top-level

group to be the natural scope for a simultaneous decision.  We have

been working example problems with a non-trivial size and amount of structure;

when you get beyond small simple examples you see the usefulness of grouping

more clearly.  For a couple of examples, see a 3-tier web application

in </font></tt><a href=https://docs.google.com/drawings/d/1nridrUUwNaDrHQoGwSJ_KXYC7ik09wUuV3vXw1MyvlY><tt><font size=2>https://docs.google.com/drawings/d/1nridrUUwNaDrHQoGwSJ_KXYC7ik09wUuV3vXw1MyvlY</font></tt></a><tt><font size=2>

and a deployment of an IBM product called "Connections" in </font></tt><a href=https://docs.google.com/file/d/0BypF9OutGsW3ZUYwYkNjZGJFejQ><tt><font size=2>https://docs.google.com/file/d/0BypF9OutGsW3ZUYwYkNjZGJFejQ</font></tt></a><tt><font size=2>

(this latter example has been shorn of its networking policies, and is

a literal abstract of something we did using software that could not cope

with policies applied directly to virtual resources, so some of its groups

are not well motivated --- but others *are*).  The groups are handy

for making it possible to draw pictures without too many lines, and write

documents that are readably concise.  But everything said with groups

could be said without groups, if we allowed policy statements to be placed

on virtual resources and on pairs of virtual resources --- it would just

take a heck of a lot more policy statements.</font></tt>
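<br>
<br><tt><font size=2>To make the conciseness point concrete, here is a minimal sketch (my own illustration, not any existing OpenStack API) that rewrites one group-level policy as the equivalent pairwise statements.  The function name and the member names are hypothetical.</font></tt>

```python
from itertools import combinations


def expand_group_policy(policy, members):
    """Rewrite one group-level policy as equivalent pairwise statements.

    Hypothetical helper for illustration only; it is not part of Nova.
    """
    return [(policy, a, b) for a, b in combinations(sorted(members), 2)]


# One statement on a four-member group ...
web_tier = {"web1", "web2", "web3", "web4"}
pairwise = expand_group_policy("anti-co-location", web_tier)

# ... becomes n*(n-1)/2 statements once the group is taken away.
print(len(pairwise))
```

<tt><font size=2>The pairwise count grows quadratically with the group size, which is the "heck of a lot more" in question.</font></tt>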

<br>

<br><tt><font size=2>If you want to make a simultaneous decision about

several virtual resources, you need a description of all those virtual

resources up-front.  So even in a totally Heat-free environment you

find yourself wanting something that looks like a document or data structure

describing multiple virtual resources --- and the policies that apply to

them, and thus also the groups that allow for concise applications of policies;

note also that the whole set of virtual resources involved is a group.</font></tt>
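<br>
<br><tt><font size=2>A toy sketch of why the up-front description matters.  It assumes a single capacity number per host and a greedy "most free space first" rule for the sequential case; both are simplifications of mine, not a description of the Nova scheduler.  The simultaneous case is a brute-force search, used here only because it is short.</font></tt>

```python
from itertools import product


def sequential(vms, capacity):
    """Place VMs one at a time, each on the host with the most free space."""
    free = dict(capacity)
    placement = {}
    for vm, size in vms.items():
        host = max(free, key=free.get)
        if free[host] < size:
            return None  # stuck: earlier choices cannot be revisited
        free[host] -= size
        placement[vm] = host
    return placement


def simultaneous(vms, capacity):
    """Consider the whole set at once (brute force, toy sizes only)."""
    names = list(vms)
    for hosts in product(capacity, repeat=len(names)):
        used = {h: 0 for h in capacity}
        for vm, host in zip(names, hosts):
            used[host] += vms[vm]
        if all(used[h] <= capacity[h] for h in capacity):
            return dict(zip(names, hosts))
    return None


vms = {"vm1": 1, "vm2": 1, "vm3": 2}
capacity = {"h1": 2, "h2": 2}

one_at_a_time = sequential(vms, capacity)
all_at_once = simultaneous(vms, capacity)
```

<tt><font size=2>With these numbers the sequential pass spreads vm1 and vm2 across both hosts and then has no room for vm3, while the simultaneous search finds a placement that packs the two small VMs together.</font></tt>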

<br>

<br><tt><font size=2>When you have an example of non-trivial size and structure,

you generally do not want to make a change by a collection of atomic edits,

each individually scheduled.  Rather you want to state the new set

of virtual resources and policies that you want to move to, allowing a

simultaneous decision about the new placement solution.</font></tt>

<br>

<br><tt><font size=2>Getting to where I want to go can be done by evolutionary

steps forward from today's server groups.  There are four fairly independent

dimensions in which the evolution can proceed.  One is to go from

today's sequential decision-making to simultaneous decision-making; I am

drafting a blueprint on that now (</font></tt><a href="https://blueprints.launchpad.net/nova/+spec/simultaneous-server-group"><tt><font size=2>https://blueprints.launchpad.net/nova/+spec/simultaneous-server-group</font></tt></a><tt><font size=2>).

 Another is to expand beyond Nova.  Another is to allow nested

groups.  Another dimension is to expand and refine the catalog of

policy types.  In older documents (in the particulars in </font></tt><a href=https://wiki.openstack.org/wiki/Heat/PolicyExtension><tt><font size=2>https://wiki.openstack.org/wiki/Heat/PolicyExtension</font></tt></a><tt><font size=2>

--- do not be distracted by the Heat context, the policy catalog is a scheduling

issue --- and in the generalities in </font></tt><a href="https://docs.google.com/document/d/17OIiBoIavih-1y4zzK0oXyI66529f-7JTCVj-BcXURA"><tt><font size=2>https://docs.google.com/document/d/17OIiBoIavih-1y4zzK0oXyI66529f-7JTCVj-BcXURA</font></tt></a><tt><font size=2>)

you see more kinds of policies and the idea that policies may have parameters.

 For example, co-location (a more precise formulation of "affinity"

that makes clear we are talking about placement rather than some other sort

of affinity --- such as networking, which has its own policy types) takes

a parameter indicating the level of the physical hierarchy at which the

placement should be the same.  You also see the idea that a policy

statement can be shaded as either a hard requirement or a soft preference.</font></tt>
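<br>
<br><tt><font size=2>As a rough sketch of what such a policy statement might look like as data, assuming only two policy types and a placement given as a per-resource map from hierarchy level to location (the class, field, and function names are hypothetical, not taken from the linked documents):</font></tt>

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PolicyStatement:
    policy_type: str  # "co-location" or "anti-co-location"
    level: str        # level of the physical hierarchy, e.g. "host" or "rack"
    hard: bool        # True = hard requirement, False = soft preference
    subjects: tuple   # names of the virtual resources covered


def satisfied(stmt, placement):
    """Check one statement against a candidate placement."""
    spots = [placement[name][stmt.level] for name in stmt.subjects]
    if stmt.policy_type == "co-location":
        return len(set(spots)) == 1
    if stmt.policy_type == "anti-co-location":
        return len(set(spots)) == len(spots)
    raise ValueError("unknown policy type: " + stmt.policy_type)


def evaluate(statements, placement):
    """Return (feasible, number of violated soft preferences)."""
    feasible, soft_violations = True, 0
    for stmt in statements:
        if not satisfied(stmt, placement):
            if stmt.hard:
                feasible = False
            else:
                soft_violations += 1
    return feasible, soft_violations


placement = {
    "db1": {"host": "h1", "rack": "r1"},
    "db2": {"host": "h2", "rack": "r1"},
}
statements = [
    PolicyStatement("anti-co-location", "host", True, ("db1", "db2")),
    PolicyStatement("anti-co-location", "rack", False, ("db1", "db2")),
]
result = evaluate(statements, placement)
```

<tt><font size=2>Here the hard host-level requirement holds and the soft rack-level preference does not, so the placement is feasible but carries one soft violation that an optimizer could try to remove.</font></tt>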

<br>

<br><tt><font size=2>In my own group's work, and in the joint proposal

with folks from Cisco and VMware, we stipulated that the groups nest into

a tree.  From my own group's perspective, that is merely a matter

of conservatism --- we think it might make some implementations easier,

and it has been an acceptable restriction for the examples we have worked.

 I am not strongly wedded to that restriction.  The placement

technology that we have in mind (transforming to a constrained optimization

problem) does not require that restriction.  If we lifted that restriction,

defining a group by an arbitrary predicate (tag match, or whatever) would

be an acceptable way of expressing grouping.  We could even keep the

restriction to a tree of groups while defining groups by predicates ---

the restriction would be a restriction on the predicates.</font></tt>
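<br>
<br><tt><font size=2>A small sketch of that last point, assuming servers carry key=value tags and a group is simply the set of servers matching a predicate.  "Nests into a tree" is checked as: any two groups are either disjoint or one contains the other.  All names below are made up for illustration.</font></tt>

```python
from itertools import combinations


def select(servers, predicate):
    """Members of a predicate-defined group."""
    return frozenset(name for name, tags in servers.items() if predicate(tags))


def nests_into_tree(groups):
    """True if every pair of groups is nested or disjoint."""
    return all(
        a <= b or b <= a or a.isdisjoint(b)
        for a, b in combinations(groups.values(), 2)
    )


servers = {
    "web1": {"app": "shop", "tier": "web"},
    "web2": {"app": "shop", "tier": "web"},
    "db1": {"app": "shop", "tier": "db"},
    "db2": {"app": "wiki", "tier": "db"},
}

groups = {
    "shop": select(servers, lambda t: t["app"] == "shop"),
    "shop-web": select(
        servers, lambda t: t["app"] == "shop" and t["tier"] == "web"
    ),
}
tree_ok = nests_into_tree(groups)

# A predicate that cuts across applications breaks the tree shape.
groups["all-db"] = select(servers, lambda t: t["tier"] == "db")
still_tree = nests_into_tree(groups)
```

<tt><font size=2>The first two predicates nest; adding the cross-cutting "all-db" predicate overlaps "shop" without being contained in it, which is exactly what the tree restriction would forbid.</font></tt>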

<br>

<br><tt><font size=2>If we were to start instead from the proposal of Jay

Pipes, the same four dimensions of evolution apply.  To get from sequential

to simultaneous decision-making requires an up-front statement of scope

(i.e., member descriptions) and policy (for which grouping would help),

rather than the at-creation-time statements. As mentioned above, the grouping

could continue to be expressed by predicates rather than explicit statements

of membership.  If we recognize that we are embarking on a general

program of expanding and refining the policy language then particular command

line syntax like "--not-near-tag $TAG" would probably change

to something generic like "--constraint anti-co-location:level=rack

--between rsc_or_grp_1 --and rsc_or_grp2" (and the API has a corresponding

issue).  The expansion beyond Nova has issues that seem to me to be

pretty orthogonal to this debate over the evolutionary starting point.</font></tt>
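<br>
<br><tt><font size=2>To show that the generic form is easy to handle mechanically, here is a sketch that parses it with Python's argparse.  The option names are the hypothetical ones from the paragraph above, and the comma-separated "key=value" parameter syntax is a further assumption of mine; none of this is an existing nova client option.</font></tt>

```python
import argparse


def parse_constraint(text):
    """Split 'type:key=value,key=value' into (type, {key: value})."""
    policy_type, _, params = text.partition(":")
    parameters = dict(
        item.split("=", 1) for item in params.split(",") if item
    )
    return policy_type, parameters


parser = argparse.ArgumentParser(prog="constraint-sketch")
parser.add_argument("--constraint", type=parse_constraint, required=True)
# "and" is a Python keyword, so both operands get explicit dest names.
parser.add_argument("--between", dest="first", required=True)
parser.add_argument("--and", dest="second", required=True)

args = parser.parse_args([
    "--constraint", "anti-co-location:level=rack",
    "--between", "rsc_or_grp_1",
    "--and", "rsc_or_grp2",
])
policy_type, parameters = args.constraint
```

<tt><font size=2>The point of the generic shape is that a new policy type or parameter needs no new command-line option, only a new value.</font></tt>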

<br><tt><font size=2><br>

> > i) This is a feature that was discussed in at least one if not

two<br>

> Design Summits and went through a long review period, it wasn't one

<br>

> of those changes that merged in 24 hours before people could take

a <br>

> good look at it.<br>

> <br>

> Completely understood. That still doesn't mean we can't propose to

get<br>

> rid of it early instead of letting it sit around when an alternate<br>

> implementation would be better for the user of OpenStack.</font></tt>

<br>

<br><tt><font size=2>I also tend to favor looking ahead to validate that

we are headed in a good direction.  That can conflict with the focus

on quickly making incremental improvements --- if we allow ourselves to

suffer analysis paralysis.  I hope a limited discussion can lead to

some consensus on direction without seriously delaying the first

small steps.  However, in the case of scheduling, we are already

queued up behind the scheduler forklift and no-db-scheduler, so there is

no danger of imminent progress on my evolutionary program.</font></tt>

<br><tt><font size=2><br>

> >   Whatever you feel about the implementation,  it is

now in the <br>

> API and we should assume that people have started coding against it.</font></tt>

<br>

<br><tt><font size=2>Yes, we should support backwards compatibility in

general.  And in this particular case, there may be no immediate conflict.

 If we decide we prefer Jay's way of expressing these concepts, we

can retain support for the old way of expressing grouping and policy too

(hopefully with a unified representation underneath).  That only leaves

us with another general problem in interface evolution: when and how to

delete old stuff that everybody should eventually stop using.</font></tt>

<br><tt><font size=2><br>

> ...<br>

> >   I don't think it gives any credibility to Openstack as

a <br>

> platform if we yank features back out just after they've landed.<br>

> <br>

> Perhaps not, though I think we have less credibility if we don't<br>

> recognize when a feature isn't implemented with users in mind and

leave<br>

> it in the code base to the detriment and confusion of users. We<br>

> absolutely must, IMO, as a community, be able to say "this isn't

right"<br>

> and have a path for changing or removing something.<br>

> <br>

> If that path is deprecation vs outright removal, so be it, I'd be

cool<br>

> with that. I'd just like to nip this anti-feature in the bud early

so<br>

> that it doesn't become the next "feature" like file-injection

to persist<br>

> in Nova well after its time has come and passed.</font></tt>

<br>

<br><tt><font size=2>I am no mind reader, but I suspect the designers of

server groups had users in mind.  But just having them in mind is

not really adequate; a serious approach would be to involve actual users

in evaluating a design proposal before it proceeds.  Remember the

calls for OpenStack to be more user-driven?</font></tt>

<br><tt><font size=2><br>

> > ii) Server Group - It's a way of defining a group of servers,

and <br>

> the initial thing (only thing right now) you can define for such a

<br>

> group is the affinity or anti-affinity for scheduling.<br>

> <br>

> We already had ways of defining groups of servers. This new "feature"<br>

> doesn't actually define a group of servers. It defines a policy, which<br>

> is not particularly useful, as it's something that is better specified<br>

> at the time of launching.</font></tt>

<br>

<br><tt><font size=2>As I mentioned above, I want to do stuff (i.e., schedule)

with groups before the members are actually created/updated.<br>

<br>

> >   Maybe in time we'll add other group properties or operations

- <br>

> like "delete all the servers in a group" (I know some QA

folks that <br>

> would love to have that feature).<br>

> <br>

> We already have the ability to define a group of servers using key=value<br>

> tags. Deleting all servers in a group is a three-line bash script

that<br>

> loops over the results of a nova list command and calls nova delete.<br>

> Trust me, I've done group deletes in this way many times.<br>

> <br>

> >   I don't see why it shouldn't be possible to have a server

group <br>

> that doesn't have a scheduling policy associated to it.<br>

> <br>

> I don't think the grouping of servers should have *anything* to do

with<br>

> scheduling :) That's the point of my proposal. Servers can and should

be<br>

> grouped using simple tags or key=value pair tags.<br>

> <br>

> The grouping of servers together doesn't have anything of substance

to<br>

> do with scheduling policies.</font></tt>

<br>

<br><tt><font size=2>Right, it just allows more concise statements of policies

AND is a kind of scope where you can apply simultaneous decision-making.</font></tt>

<br><tt><font size=2><br>

> <br>

> >    I don't see any  Cognitive dissonance here

- I think your just <br>

> assuming that the only reason for being able to group servers is for<br>

> scheduling.<br>

> <br>

> Again, I don't think scheduling and grouping of servers has anything

to<br>

> do with each other, thus my proposal to remove the relationship between<br>

> groups of servers and scheduling policies, which is what the existing<br>

> server group API and implementation does.<br>

> <br>

> > iii) If the issue is that you can't add or remove servers from

a <br>

> group, then why don't we add those operations to the API (you could

<br>

> add a server to a group providing doing so  doesn't break any

policy<br>

> that might be associated with the group). </font></tt>

<br>

<br><tt><font size=2>In fact you see some of this already in the blueprint

for server groups (</font></tt><a href="https://blueprints.launchpad.net/nova/+spec/instance-group-api-extension"><tt><font size=2>https://blueprints.launchpad.net/nova/+spec/instance-group-api-extension</font></tt></a><tt><font size=2>)

--- look all the way through its whiteboard.<br>

<br>

> We already have this ability today, thus my proposal to get rid of<br>

> server groups.<br>

> <br>

> >   Seems like a useful addition to me.<br>

> <br>

> It's an addition that isn't needed, as we already have this today.<br>

> <br>

> > iv) Since the user created the group, and chose a name for it

that<br>

> is presumably meaningful, then I don't understand why you think "--<br>

> group XXX" isn't going to be meaningful to that same user ?<br>

> <br>

> See point above about removing the unnecessary relationship between<br>

> grouping of servers and scheduling policies.<br>

> <br>

> > So I think there are a bunch of API operations missing, but I

<br>

> don't see any advantage in throwing away what's now in place and  <br>

> replacing it with a tag mechanism that basically says "everything

<br>

> with this tag is in a sort of group".<br>

> <br>

> We already have the tag group mechanism in place, that's kind of what<br>

> I've been saying...<br>

</font></tt>

<br><tt><font size=2>Regards,</font></tt>

<br><tt><font size=2>Mike</font></tt>