[openstack-dev] [TripleO] Summit session wrapup

Ladislav Smola lsmola at redhat.com
Thu Nov 28 08:54:47 UTC 2013


Hello,

just a few notes from me:

https://etherpad.openstack.org/p/tripleo-feature-map sounds like a great 
idea; we should go through the items one by one, maybe at a meeting.
We should agree on what is doable for Icehouse without violating the 
OpenStack way in some very ugly way. So do we want to be OpenStack on 
OpenStack, or Almost OpenStack on OpenStack? Or what is the goal here?

So let's take a simple example: a flat network, 2 racks (32 nodes), 2 
controller nodes, 2 Neutron nodes, 14 Nova compute nodes, and 14 storage 
nodes.

I. The manual way: using Heat and the scheduler, we could hard-assign 
every group of nodes to a dedicated flavor, and then the Nova scheduler 
will take care of it.
1. How hard will it be to implement 'assigning specific nodes to a 
flavor'? (Probably by adding a condition on the MAC address? A rough 
sketch follows below this list.)
     Or do you have some other idea how to do this in an almost clean 
way, without reimplementing the Nova scheduler? (Though this is probably 
messing with the scheduler anyway.)
2. How will this be implementable in the UI? Just assigning nodes to 
flavors and uploading a Heat template?
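
To make I.1 concrete, here is a minimal sketch of such a condition as a 
custom scheduler filter. All the names here ('node_group' as flavor 
extra spec and host capability) are assumptions, not existing Nova keys; 
the point is only that pinning can reuse the filter scheduler instead of 
replacing it:

    from nova.scheduler import filters

    class NodeGroupFilter(filters.BaseHostFilter):
        """Pass only hosts whose advertised 'node_group' matches the
        flavor's 'node_group' extra spec. The host-side value could be
        derived from the node's MAC address when it is enrolled."""

        def host_passes(self, host_state, filter_properties):
            instance_type = filter_properties.get('instance_type') or {}
            wanted = instance_type.get('extra_specs', {}).get('node_group')
            if not wanted:
                return True  # flavor does not pin, any host will do
            # 'stats' is the per-host dict reported by the driver; the
            # 'node_group' entry would have to be set by the operator.
            return host_state.stats.get('node_group') == wanted

The flavor side would then just be extra specs, e.g. 
flavor.set_keys({'node_group': 'control'}) with python-novaclient.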

II. Having homogeneous hardware, all nodes will be one flavor, and the 
Nova scheduler will decide where to put what when you tell Heat e.g. 'I 
want to spawn 2 controller images.'
1. How hard is it to set policies, like 'we want to spread those nodes 
over all racks'? (A rough sketch follows below this list.)
2. How will this be implementable in the UI? It is basically building a 
complex Heat template, right? So just uploading a Heat template?
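
As a rough sketch of II.1, a spread policy can be expressed as a 
scheduler hint instead of manual placement. This assumes the legacy 
'group' anti-affinity filter is enabled in the Nova scheduler; real 
rack-awareness would need something richer (e.g. host aggregates per 
rack), so treat it as illustration only:

    from novaclient.v1_1 import client

    # Placeholder credentials -- all values are assumptions.
    nova = client.Client('admin', 'secret', 'undercloud',
                         'http://undercloud:5000/v2.0')

    # Boot 2 controllers and let the scheduler keep them apart. Heat's
    # OS::Nova::Server resources can carry the same scheduler_hints, so
    # a template can state the policy declaratively.
    for i in range(2):
        nova.servers.create(
            name='controller-%d' % i,
            image='<controller-image-id>',   # assumed Glance image id
            flavor='<baremetal-flavor-id>',  # the single shared flavor
            scheduler_hints={'group': 'controllers'},
        )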

III. Having more flavors.
1. We would be able to say in Heat something like 'I want a Nova compute 
node on compute_flavor (Amazon c1, c3) with high priority, or on 
all_purpose_flavor (Amazon m1) with normal priority.' How hard is that? 
(A client-side sketch follows below this list.)
2. How will this be implementable in the UI? Just uploading a Heat 
template?
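
Nova itself has no notion of flavor priority, so III.1 would have to 
live above it. A naive client-side sketch (hypothetical helper; the 
polling interval is arbitrary) of 'try the preferred flavor first, fall 
back otherwise':

    import time

    def boot_with_fallback(nova, name, image, flavor_ids):
        # Try flavors in priority order; if scheduling fails (the server
        # lands in ERROR, e.g. NoValidHost), clean up and try the next.
        for flavor_id in flavor_ids:
            server = nova.servers.create(name=name, image=image,
                                         flavor=flavor_id)
            while server.status == 'BUILD':
                time.sleep(5)
                server = nova.servers.get(server.id)
            if server.status == 'ACTIVE':
                return server
            server.delete()
        raise RuntimeError('no flavor in %s could be scheduled' % flavor_ids)

    # e.g. boot_with_fallback(nova, 'compute-0', '<image-id>',
    #                         ['<compute-flavor>', '<all-purpose-flavor>'])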

IV. The TripleO way
========

1. From the OOO name I infer that we want to use OpenStack, which means 
using Heat, the Nova scheduler, etc.
     From my point of view, having a Heat template for deploying e.g. a 
Wordpress installation seems the same to me as having a Heat template
     to deploy OpenStack; it's just much more complex. Is this a valid 
assumption? If you think it's not, please explain why. (A sketch of the 
analogy follows below.)
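
To illustrate the analogy: from the Heat API's point of view the 
overcloud is just another stack. A minimal sketch with python-heatclient 
(endpoint, token, template file name and parameter name are all 
assumptions):

    from heatclient.client import Client

    heat = Client('1', endpoint='http://undercloud:8004/v1/<tenant-id>',
                  token='<keystone-token>')

    # The same call deploys a Wordpress stack or an overcloud; only the
    # template's contents differ in complexity.
    heat.stacks.create(stack_name='overcloud',
                       template=open('overcloud.yaml').read(),
                       parameters={'ControllerCount': 2})  # hypothetical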


"Radical idea : we could ask (e.g. on -operators) for a few potential 
users who'd be willing to let us interview them."
Yes please!!!

After talking to jcoufal: being able to edit a Heat template in the UI, 
and being able to assign baremetal nodes to flavors (later connected to 
a template catalog), could be all we need. Also, visualizing later what 
will happen when you actually stack-create the template, so that we 
don't go in blindly, would be very much needed.

Kind regards,
Ladislav


On 11/28/2013 06:41 AM, Robert Collins wrote:
> Hey, I realise I've done a sort of point-by-point thing below - sorry.
> Let me say that I'm glad you're focused on what will help users, and
> their needs - I am too. Hopefully we can figure out why we have
> different opinions about what things are key, and/or how we can get
> data to better understand our potential users.
>
>
> On 28 November 2013 02:39, Jaromir Coufal <jcoufal at redhat.com> wrote:
>
>> Important point here is, that we agree on starting with very basics - grow
>> then. Which is great.
>>
>> The whole deployment workflow (not just UI) is all about user experience
>> which is built on top of TripleO's approach. Here I see two important
>> factors:
>> - There are users who are having some needs and expectations.
> Certainly. Do we have Personas for those people? (And have we done any
> validation of them?)
>
>> - There is underlying concept of TripleO, which we are using for
>> implementing features which are satisfying those needs.
> mmm, so the technical aspect of TripleO is about setting up a virtuous
> circle: where improvements in deploying cluster software via OpenStack
> makes deploying OpenStack better, and those of us working on deploying
> OpenStack will make deploying cluster software via OpenStack better in
> general, as part of solving 'deploying OpenStack' in a nice way.
>
>> We are circling around and trying to approach the problem from the wrong
>> end - which is the implementation point of view (how to avoid our own
>> scheduling).
>>
>> Let's try get out of the box and start with thinking about our audience
>> first - what they expect, what they need. Then we go back, put our
>> implementation thinking hat on and find out how we are going to re-use
>> OpenStack components to achieve our goals. In the end we have detailed plan.
> Certainly, +1.
>
>> === Users ===
>>
>> I would like to start with our targeted audience first - without milestones,
>> without implementation details.
>>
>> I think here is the main point where I disagree, and which leads to different
>> approaches. I don't think that the user of TripleO cares only about deploying
>> infrastructure without any knowledge of where the things go. This is the
>> overcloud user's approach - 'I want a VM and I don't care where it runs'.
>> Those are self-service users / cloud users. I know we are OpenStack on
>> OpenStack, but we shouldn't go so far as to expect the same behavior from
>> undercloud users. I can tell you various examples of why the operator will
>> care about where the image goes and what runs on a specific node.
> This may be where we disagree indeed :). Wearing my sysadmin hat (a
> little dusty, but never really goes away :P) - I can tell you I spent
> a lot of time worrying about what went on what machine. But it was
> never actually what I was paid to do.
>
> What I was paid to do was to deliver infrastructure and services to
> the business. Everything that we could automate, that we could
> describe with policy and still get robust, reliable results - we did.
> It's how one runs many hundred machines with an ops team of 2.
>
> Planning around failure domains, for example, is tedious work; it's
> needed at a purchasing level - you need to decide if you're buying
> three datacentres or one datacentre with internal redundancy, but once
> that's decided, the actual mechanics of ensuring that each HA service is
> spread across the three datacentres (or three separate zones in the
> one DC) is not interesting. So - I'm sure that many sysadmins do
> manually assign work to machines to ensure a good result from
> performance or HA concerns, but that's out of necessity, not desire.
>
>> One quick example:
>> I have three racks of homogeneous hardware and I want to design it so
>> that I have one control node in each, 3 storage nodes and the rest compute.
>> With that smart deployment, I'll never know what my rack contains in the
>> end. But if I have control over stuff, I can say that this node is a
>> controller, those three are storage and those are compute - I am happy from
>> the very beginning.
> Why does that layout make you happy? What is it about that setup where
> things will work better for you? Note that in the absence of a
> sophisticated scheduler you'll have some volumes with redundancy of 3
> end up all in one rack: you won't get rack-can-fail safety on the
> delivered cloud workloads (I mention this as one attempt to understand
> why knowing there is a control node / 3 storage /rest compute in each
> rack makes you happy).
>
>> Our targeted audience are sysadmins and operators. They hate 'magic'. They
>> want to have control over the things they are doing. If we put in front of
>> them a workflow where they click one button and get a cloud installed,
>> they will be horrified.
> I don't think this is a good characterisation of the sysadmin /
> operator mindset. They - like anyone - don't like surprises, and they
> often care intensely about delivering services well, with high
> performance and high availability. Tools that help them do that are
> appreciated, tools that are flaky - which a lot of
> abstract-all-the-details tools seem to be - get a bad rap in sysadmin
> circles.
>
>> That's why I am very sure and convinced that we need to give the user
>> the ability to control things - which node has which role. We can be
>> smart, suggest and advise. But we should not hide this functionality
>> from the user. Otherwise, I am afraid that we can fail.
> I think having that degree of control is failure. Our CloudOS team has
> considerable experience now in deploying clouds using a high-touch
> system like you describe - and they are utterly convinced that it
> doesn't scale. Even at 20 nodes it is super tedious, and beyond that
> it's ridiculous.
>
>> Furthermore, if we put lots of restrictions (like homogeneous hardware) in
>> front of users from the very beginning, we are discouraging people from
>> using TripleO-UI. We are a young project trying to hit as broad an audience
>> as possible. If we take a flexible enough approach to get a large audience
>> interested and solve their problems, we will get more feedback, we will get
>> early adopters, we will get more contributors, etc.
> Flexibility comes with a cost. Right now we have a large audience
> interested in what we have, but we're delivering two separate things:
> we have a functional sysadminny interface with command line scripts
> and heat templates - and we have a GUI where we can offer a better
> interface, which the tuskar folk are building up. I agree that
> homogeneous hardware isn't a viable long term constraint. But if we
> insist on fixing that issue first, we sacrifice our ability to learn
> about the usefulness of a simple, straightforward interface. We'll be
> doing a bunch of work - regardless of implementation - to deal with
> heterogeneity, when we could be bringing Swift and Cinder up to
> production readiness - which IMO will get many more folk onboard for
> adoption.
>
>> First, let's help the cloud operator who has some nodes and wants to
>> deploy OpenStack on them. He wants to have control over which node is a
>> controller, which node is compute or storage. Then we can get smarter and guide.
> Folk that want to manually install openstack on a couple of machines
> can already do so : we don't change the game for them by replacing a
> manual system with a manual system. My vision is that we should
> deliver something significantly better!
>
>> === Milestones ===
>>
>> Based on the different user behavior I am talking about, I suggest
>> different milestones:
> ...
>
> So, I have a suggestion. Let's create a set of all the things we want
> in the product eventually.
>
> https://etherpad.openstack.org/p/tripleo-feature-map
>
>  From there we can assess several things for each feature:
> cost - estimated cost of an 'ok'(*) implementation - 0: expensive
> (multiple cycles), 9: cheap
> benefit(us) - estimated benefit to design learning from having a
> functional implementation - 0: learn nothing, 9: learn lots
> benefit(users) - e.g. estimated increase in # of users whose needs
> TripleO will satisfy (as part of a holistic install) - 0:
> minimal increase, 9: huge increase
>
>  From there we can draw a cube: things that are cheap, we learn a lot,
> and users benefit a lot are no brainers :) Things that are expensive,
> we don't learn a lot and users don't benefit much are clearly things
> we don't want to do right now:
>
> cost  b-us  b-users  do-when?
> 0     0     0        never?
> 9     9     9        right now
> 5     5     5        sometime in the middle
>
> but more interesting are combinations like:
>
> 0     9     9        start now as a background task?
> 9     2     2        do if we have nothing better
> 9     0     9        right now
> 9     9     0        also right now
>
> So I dunno if this is a good idea - it's just an attempt to visualise
> the tradeoffs in a way that we can be clear what we're saying is good
> about a specific feature [think of it as a variation on planning
> poker].
>
> (*): I mean an implementation we could live with for a while, vs
> whatever the ideal might be.
>
>> === Implementation ===
>>
>> The above-mentioned approach shouldn't lead to reimplementing the
>> scheduler. We can still use the nova-scheduler, but we can take advantage
>> of extra params (like a unique identifier), so that we specify more
>> concretely what goes where.
> That is reimplementing the scheduler. In this case it's forcing
> sysadmins to be the scheduler, which is a waste of their time.
>
>> More details should follow here - how to achieve the above-mentioned
>> goals, like what should go through heat, what should go through nova,
>> ironic, etc.
>>
>> But first, let's agree on approach and goals.
> Totally agree!
>
> -Rob
>
>
