<font size=2 face="sans-serif">I agree that such a thing is useful for

scheduling.  I see a bit of a tension here: for software engineering

reasons we want some independence, but we also want to avoid wasteful duplication.</font>

<br>

<br><font size=2 face="sans-serif">I think we are collectively backing

into the problem of metamodeling for datacenters, and establishing one

or more software thingies that will contain/communicate datacenter models.

 A collection of "nodes" annotated with "tags"

is a metamodel.  You could define a graph-based metamodel without

mandating any particular graph shape.  You could be more prescriptive

and mandate a tree shape as a good compromise between flexibility and making

something that is reasonably easy to process.  We can debate what

the metamodel should be, but that is different from debating whether there

is a metamodel.</font>
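<br>
<br><font size=2 face="sans-serif">To make the tree-shaped option concrete, here is a minimal sketch. The Element class, its methods, and the sample names are all hypothetical and are not part of Tuskar or any OpenStack API. It only shows that a tree with tags on each element is easy to walk: the path to the root reads like a location.</font>
<pre>
```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Element:
    """One element of a tree-shaped datacenter model (hypothetical)."""
    name: str
    tags: Dict[str, str] = field(default_factory=dict)
    # Excluded from repr/compare so parent and child do not recurse forever.
    parent: Optional["Element"] = field(default=None, repr=False, compare=False)
    children: List["Element"] = field(default_factory=list)

    def add(self, child):
        # Tree shape: every element has at most one parent.
        if child.parent is not None:
            raise ValueError(child.name + " already has a parent")
        child.parent = self
        self.children.append(child)
        return child

    def path(self):
        # Walk up to the root and return the names from root to self.
        node, names = self, []
        while node is not None:
            names.append(node.name)
            node = node.parent
        return list(reversed(names))


dc = Element("dc-2")
rack = dc.add(Element("rack-3", tags={"power": "power-45"}))
host = rack.add(Element("host-a", tags={"switch": "switch-23"}))
print(host.path())  # ['dc-2', 'rack-3', 'host-a']
```
</pre>
<font size=2 face="sans-serif">A general graph would drop the single-parent check in add(), at the cost of path() no longer being unique.</font>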

<br>

<br><font size=2 face="sans-serif">Regards,</font>

<br><font size=2 face="sans-serif">Mike</font>

<br>

<br>

<br>

<br><font size=1 color=#5f5f5f face="sans-serif">From:      

 </font><font size=1 face="sans-serif">Tomas Sedovic &lt;tsedovic@redhat.com&gt;</font>

<br><font size=1 color=#5f5f5f face="sans-serif">To:      

 </font><font size=1 face="sans-serif">openstack-dev@lists.openstack.org

</font>

<br><font size=1 color=#5f5f5f face="sans-serif">Date:      

 </font><font size=1 face="sans-serif">09/25/2013 10:37 AM</font>

<br><font size=1 color=#5f5f5f face="sans-serif">Subject:    

   </font><font size=1 face="sans-serif">Re: [openstack-dev]

[TripleO] Generalising racks :- modelling a        datacentre</font>

<br>

<hr noshade>

<br>

<br>

<br><tt><font size=2>On 09/25/2013 05:15 AM, Robert Collins wrote:<br>

> One of the major things Tuskar does is model a datacenter - which

is<br>

> very useful for error correlation, capacity planning and scheduling.<br>

><br>

> Long term I'd like this to be held somewhere where it is accessible<br>

> for schedulers and ceilometer etc. E.g. network topology + switch<br>

> information might be held by neutron where schedulers can rely on

it<br>

> being available, or possibly held by a unified topology db with<br>

> scheduler glued into that, but updated by neutron / nova / cinder.<br>

> Obviously this is a) non-trivial and b) not designed yet.<br>

><br>

> However, the design of Tuskar today needs to accommodate a few things:<br>

>   - multiple reference architectures for clouds (unless there

really is<br>

> one true design)<br>

>   - the fact that today we don't have such an integrated vertical

scheduler.<br>

><br>

> So the current Tuskar model has three constructs that tie together

to<br>

> model the DC:<br>

>   - nodes<br>

>   - resource classes (grouping different types of nodes into

service<br>

> offerings - e.g. nodes that offer swift, or those that offer nova).<br>

>   - 'racks'<br>

><br>

> AIUI the initial concept of Rack was to map to a physical rack, but<br>

> this rapidly got shifted to be 'Logical Rack' rather than physical<br>

> rack, but I think of Rack as really just a special case of a general<br>

> modelling problem.<br>

<br>

Yeah. Eventually, we settled on Logical Rack meaning a set of nodes on

<br>

the same L2 network (in a setup where you would group nodes into <br>

isolated L2 segments). Which kind of suggests we come up with a better

name.<br>

<br>

I agree there's a lot more useful stuff to model than just racks (or <br>

just L2 node groups).<br>

<br>

><br>

> From a deployment perspective, if you have two disconnected<br>

> infrastructures, that's two AZs, and two underclouds: so we know

that<br>

> any one undercloud is fully connected (possibly multiple subnets,

but<br>

> one infrastructure). When would we want to subdivide that?<br>

><br>

> One case is quick fault aggregation: if a physical rack loses power,<br>

> rather than having 16 NOC folk independently investigating the same

16<br>

> down hypervisors, one would prefer to identify that the power to the<br>

> rack has failed (for non-HA powered racks); likewise if a single<br>

> switch fails (for non-HA network topologies) you want to identify

that<br>

> that switch is down rather than investigating all the cascaded errors<br>

> independently.<br>

><br>

> A second case is scheduling: you may want to put nova instances on

the<br>

> same switch as the cinder service delivering their block devices,

when<br>

> possible, or split VMs serving HA tasks apart. (We currently do this<br>

> with host aggregates, but being able to do it directly would be much<br>

> nicer).<br>

><br>

> Lastly, if doing physical operations like power maintenance or moving<br>

> racks around in a datacentre, being able to identify machines in the<br>

> same rack can be super useful for planning, downtime announcements,

or<br>

> host evacuation, and being able to find a specific machine in a DC

is<br>

> also important (e.g. what shelf in the rack, what cartridge in a<br>

> chassis).<br>

<br>

I agree. However, we should take care not to commit ourselves to <br>

building a DCIM just yet.<br>

<br>

><br>

> Back to 'Logical Rack' - you can see then that having a single<br>

> construct to group machines together doesn't really support these

use<br>

> cases in a systematic fashion:- Physical rack modelling supports only

a<br>

> subset of the location/performance/failure use cases, and Logical

rack<br>

> doesn't support them at all: we're missing all the rich data we need<br>

> to aggregate faults rapidly : power, network, air conditioning - and<br>

> these things cover both single machine/groups of machines/racks/rows<br>

> of racks scale (consider a networked PDU with 10 hosts on it - that's

a<br>

> fraction of a rack).<br>

><br>

> So, what I'm suggesting is that we model the failure and performance<br>

> domains directly, and include location (which is the incremental data<br>

> racks add once failure and performance domains are modelled) too.

We<br>

> can separately noodle on exactly what failure domain and performance<br>

> domain modelling looks like - e.g. the scheduler focus group would

be<br>

> a good place to have that discussion.<br>

<br>

Yeah I think it's pretty clear that the current Tuskar concept where <br>

Racks are the first-class objects isn't going to fly. We should switch

<br>

our focus to the individual nodes and their grouping and metadata.<br>

<br>

I'd like to start with something small and simple that we can improve <br>

upon, though. How about just going with freeform tags and key/value <br>

metadata for the nodes?<br>

<br>

We can define some well-known tags and keys to begin with (rack, <br>

l2-network, power, switch, etc.). It would be easy to iterate, and once

<br>

we settle on the things we need, we can solidify them more.<br>

<br>

In the meantime, we have the API flexible enough to handle whatever <br>

architectures we end up supporting and the UI can provide the <br>

appropriate views into the data.<br>

<br>

And this would allow people to add their own criteria that we didn't <br>

consider.<br>
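<br>
As a rough illustration of the tags plus key/value idea, here is a minimal sketch.<br>
The Node class, the select() helper and the sample values are hypothetical;<br>
this is not the Tuskar API, only one way the data could be shaped and filtered.<br>
<pre>
```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class Node:
    """A node with freeform tags and key/value metadata (hypothetical)."""
    name: str
    tags: Set[str] = field(default_factory=set)
    metadata: Dict[str, str] = field(default_factory=dict)


def select(nodes, tag=None, metadata=None):
    """Return nodes that carry `tag` (if given) and match every key/value."""
    wanted = metadata or {}
    result = []
    for node in nodes:
        if tag is not None and tag not in node.tags:
            continue
        if all(node.metadata.get(k) == v for k, v in wanted.items()):
            result.append(node)
    return result


nodes = [
    Node("node-1", {"swift"}, {"rack": "rack-3", "l2-network": "l2-a"}),
    Node("node-2", {"nova"}, {"rack": "rack-3", "l2-network": "l2-b"}),
    Node("node-3", {"nova"}, {"rack": "rack-7", "l2-network": "l2-b"}),
]

# Well-known keys such as "rack" are just conventions on top of free text.
print([n.name for n in select(nodes, metadata={"rack": "rack-3"})])
# ['node-1', 'node-2']
print([n.name for n in select(nodes, tag="nova")])
# ['node-2', 'node-3']
```
</pre>
Keys nobody anticipated work the same way, since select() does not know any key names.<br>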

<br>

><br>

> E.g. for any node I should be able to ask:<br>

> - what failure domains is this in? [e.g. power-45, switch-23, ac-15,<br>

> az-3, region-1]<br>

> - what locality-of-reference features does this have? [e.g. switch-23,<br>

> az-3, region-1]<br>

> - where is it [e.g. DC 2, pod 4, enclosure 2, row 5, rack 3, RU 30,<br>

> cartridge 40].<br>

><br>

> And then we should be able to slice and dice the DC easily by these

aspects:<br>

> - location: what machines are in DC 2, or DC2 pod 4<br>

> - performance: what machines are all in region-1, or az-3, or switch-23.<br>

> - failure: what failure domains do machines X and Y have in common?<br>

> - failure: if we power off switch-23, what machines will be impacted?<br>

><br>

> So, what do you think?<br>

><br>

> -Rob<br>

><br>

<br>
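The failure-domain questions quoted above can be sketched with plain sets.<br>
This is a minimal sketch under assumptions: the node names, the FAILURE_DOMAINS<br>
table and both helper functions are made up for illustration. Only the domain<br>
labels (power-45, switch-23, ac-15, az-3, region-1) come from the examples in the thread.<br>
<pre>
```python
# Hypothetical data: node name mapped to the failure domains it sits in.
FAILURE_DOMAINS = {
    "node-x": {"power-45", "switch-23", "ac-15", "az-3", "region-1"},
    "node-y": {"power-46", "switch-23", "ac-15", "az-3", "region-1"},
    "node-z": {"power-46", "switch-24", "ac-16", "az-3", "region-1"},
}


def common_domains(a, b, domains=FAILURE_DOMAINS):
    """What failure domains do nodes a and b have in common?"""
    return domains[a].intersection(domains[b])


def impacted_by(domain, domains=FAILURE_DOMAINS):
    """If this domain fails, which nodes are impacted?"""
    return sorted(name for name, member_of in domains.items()
                  if domain in member_of)


print(sorted(common_domains("node-x", "node-y")))
# ['ac-15', 'az-3', 'region-1', 'switch-23']
print(impacted_by("switch-23"))
# ['node-x', 'node-y']
```
</pre>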

<br>

_______________________________________________<br>

OpenStack-dev mailing list<br>

OpenStack-dev@lists.openstack.org<br>

</font></tt><a href="http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev"><tt><font size=2>http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev</font></tt></a><tt><font size=2><br>

<br>

</font></tt>

<br>