<html>

  <head>

    <meta content="text/html; charset=ISO-8859-1"

      http-equiv="Content-Type">

  </head>

  <body bgcolor="#FFFFFF" text="#000000">

    <div class="moz-cite-prefix">Hello,<br>

      <br>

      just a few notes from me:<br>

      <br>


      <a href="https://etherpad.openstack.org/p/tripleo-feature-map">https://etherpad.openstack.org/p/tripleo-feature-map</a>

      sounds like a great idea; we should go through the items one by one,
      maybe at a meeting.<br>

      We should agree on what is doable for I (Icehouse) without violating the
      OpenStack way in some very ugly way. So do we want to be OpenStack
      on OpenStack<br>
      or Almost OpenStack on OpenStack? Or what is the goal here?<br>

      <br>

      So let's take a simple example: a flat network, 2 racks (32 nodes), with 2
      controller nodes, 2 Neutron nodes, 14 Nova compute nodes and 14 storage nodes.<br>

      <br>

      I. The manual way, using Heat and the scheduler, could be to assign every
      group of nodes to a special flavor by hand. The Nova scheduler will then
      take care of the rest.<br>
      1. How hard will it be to implement 'assigning specific nodes to a
      flavor'? (Probably by adding a condition on the MAC address?)<br>
      &nbsp;&nbsp;&nbsp; Or do you have some other idea how to do this in an almost
      clean way, without reimplementing the Nova scheduler? (Though this is
      probably messing with the scheduler anyway.)<br>
      2. How would this be implemented in the UI? Just assigning nodes to
      flavors and uploading a Heat template?<br>

      <br>

      II. With homogeneous hardware, everything will be one flavor and the
      Nova scheduler will decide where to put what, e.g. when you tell Heat
      that you want to spawn 2 controller images.<br>
      1. How hard is it to set policies, e.g. that we want to spread those
      nodes over all racks?<br>
      2. How would this be implemented in the UI? It is basically building
      a complex Heat template, right? So just uploading a Heat template?<br>

      <br>
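      A minimal sketch of the "spread over all racks" policy from II.1, using the
      example numbers above (the rack names and the spread_over_racks helper are
      hypothetical, not an existing Heat or Nova feature):<br>
      <pre wrap="">
```python
from itertools import cycle

# Node counts from the 32-node example; rack names are made up.
ROLES = {"controller": 2, "neutron": 2, "compute": 14, "storage": 14}
RACKS = ["rack1", "rack2"]


def spread_over_racks(roles, racks):
    """Assign the nodes of each role to racks in round-robin order."""
    placement = {rack: [] for rack in racks}
    for role, count in roles.items():
        rack_cycle = cycle(racks)  # restart at the first rack for every role
        for _ in range(count):
            placement[next(rack_cycle)].append(role)
    return placement


if __name__ == "__main__":
    for rack, nodes in spread_over_racks(ROLES, RACKS).items():
        print(rack, len(nodes), sorted(set(nodes)))
```
</pre>
      With these counts each rack ends up with 1 controller, 1 Neutron node,
      7 compute nodes and 7 storage nodes.<br>
      <br>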

      III. Having more flavors<br>
      1. We would want to be able to express in Heat something like: I want a
      Nova compute node on compute_flavor (like Amazon c1, c3) with high
      priority, or on all_purpose_flavor (like Amazon m1) with normal
      priority. How hard is that?<br>
      2. How would this be implemented in the UI? Just uploading a Heat
      template?<br>

      <br>

      IV. The TripleO way<br>

      ========<br>

      <br>

      1. From the OOO name I infer that we want to use OpenStack, which means
      using Heat, the Nova scheduler, etc.<br>
      &nbsp;&nbsp;&nbsp; From my point of view, having a Heat template for deploying e.g. a
      WordPress installation is the same as having a Heat template<br>
      &nbsp;&nbsp;&nbsp; to deploy OpenStack; the latter is just much more complex. Is this a
      valid assumption? If you think it's not, please explain why.<br>

      <br>

      <br>

      "Radical idea : we could ask (e.g. on -operators) for a few

      potential

      users who'd be willing to let us interview them."<br>

      Yes please!!!<br>

      <br>

      From talking to jcoufal: being able to edit a Heat template in the UI and
      being able to assign bare-metal nodes to flavors (later connected to a
      template catalog) could be all we need. Later, it would also be very<br>
      useful to visualize what will happen when you actually stack-create the
      template, so that we don't go in blind.<br>

      <br>

      Kind regards,<br>

      Ladislav<br>

      <br>

      <br>

      On 11/28/2013 06:41 AM, Robert Collins wrote:<br>

    </div>

    <blockquote

cite="mid:CAJ3HoZ0rcjyO92-rD+Hsm_VViiRNQZRFwvdvOvpQ34ZaH1jQLA@mail.gmail.com"

      type="cite">

      <pre wrap="">Hey, I realise I've done a sort of point-bypoint thing below - sorry.

Let me say that I'm glad you're focused on what will help users, and

their needs - I am too. Hopefully we can figure out why we have

different opinions about what things are key, and/or how we can get

data to better understand our potential users.

On 28 November 2013 02:39, Jaromir Coufal <a class="moz-txt-link-rfc2396E" href="mailto:jcoufal@redhat.com">&lt;jcoufal@redhat.com&gt;</a> wrote:

</pre>

      <blockquote type="cite">

        <pre wrap="">Important point here is, that we agree on starting with very basics - grow

then. Which is great.

The whole deployment workflow (not just UI) is all about user experience

which is built on top of TripleO's approach. Here I see two important

factors:

- There are users who are having some needs and expectations.

</pre>

      </blockquote>

      <pre wrap="">

Certainly. Do we have Personas for those people? (And have we done any

validation of them?)

</pre>

      <blockquote type="cite">

        <pre wrap="">- There is underlying concept of TripleO, which we are using for

implementing features which are satisfying those needs.

</pre>

      </blockquote>

      <pre wrap="">

mmm, so the technical aspect of TripleO is about setting up a virtuous

circle: where improvements in deploying cluster software via OpenStack

makes deploying OpenStack better, and those of us working on deploying

OpenStack will make deploying cluster software via OpenStack better in

general, as part of solving 'deploying OpenStack' in a nice way.

</pre>

      <blockquote type="cite">

        <pre wrap="">We are circling around and trying to approach the problem from wrong end -

which is implementation point of view (how to avoid own scheduling).

Let's try get out of the box and start with thinking about our audience

first - what they expect, what they need. Then we go back, put our

implementation thinking hat on and find out how we are going to re-use

OpenStack components to achieve our goals. In the end we have detailed plan.

</pre>

      </blockquote>

      <pre wrap="">

Certainly, +1.

</pre>

      <blockquote type="cite">

        <pre wrap="">=== Users ===

I would like to start with our targeted audience first - without milestones,

without implementation details.

I think here is the main point where I disagree and which leads to different

approaches. I don't think, that user of TripleO cares only about deploying

infrastructure without any knowledge where the things go. This is overcloud

user's approach - 'I want VM and I don't care where it runs'. Those are

self-service users / cloud users. I know we are OpenStack on OpenStack, but

we shouldn't go that far that we expect same behavior from undercloud users.

I can tell you various examples of why the operator will care about where

the image goes and what runs on specific node.

</pre>

      </blockquote>

      <pre wrap="">

This may be where we disagree indeed :). Wearing my sysadmin hat (a

little dusty, but never really goes away :P) - I can tell you I spent

a lot of time worrying about what went on what machine. But it was

never actually what I was paid to do.

What I was paid to do was to deliver infrastructure and services to

the business. Everything that we could automate, that we could

describe with policy and still get robust, reliable results - we did.

It's how one runs many hundred machines with an ops team of 2.

Planning around failure domains for example, is tedious work; it's

needed at a purchasing level - you need to decide if you're buying

three datacentres or one datacentre with internal redundancy, but once

that's decided, the actual mechanics of ensuring that each HA service is

spread across the (three datacentres) or (three separate zones in the

one DC) is not interesting. So - I'm sure that many sysadmins do

manually assign work to machines to ensure a good result from

performance or HA concerns, but that's out of necessity, not desire.

</pre>

      <blockquote type="cite">

        <pre wrap="">One quick example:

I have three racks of homogenous hardware and I want to design it the way so

that I have one control node in each, 3 storage nodes and the rest compute.

With that smart deployment, I'll never know what my rack contains in the

end. But if I have control over stuff, I can say that this node is

controller, those three are storage and those are compute - I am happy from

the very beginning.

</pre>

      </blockquote>

      <pre wrap="">

Why does that layout make you happy? What is it about that setup where

things will work better for you? Note that in the absence of a

sophisticated scheduler you'll have some volumes with redundancy of 3

end up all in one rack: you won't get rack-can-fail safety on the

delivered cloud workloads (I mention this as one attempt to understand

why knowing there is a control node / 3 storage / rest compute in each

rack makes you happy).
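
To make the rack-can-fail point concrete, here is a rough sketch. The node
names, rack names and the survives_rack_failure helper are all hypothetical;
it only checks whether the replicas of one volume span more than one rack:

```python
# Hypothetical layout: three racks with three storage nodes each.
RACK_OF = {
    "s1": "rack1", "s2": "rack1", "s3": "rack1",
    "s4": "rack2", "s5": "rack2", "s6": "rack2",
    "s7": "rack3", "s8": "rack3", "s9": "rack3",
}


def survives_rack_failure(replica_nodes, rack_of):
    """Return True if no single rack holds every replica of a volume."""
    racks_used = {rack_of[node] for node in replica_nodes}
    return len(racks_used) >= 2


if __name__ == "__main__":
    # Rack-unaware placement: all three replicas land in rack1.
    print(survives_rack_failure(["s1", "s2", "s3"], RACK_OF))
    # Rack-aware placement: one replica per rack.
    print(survives_rack_failure(["s1", "s4", "s7"], RACK_OF))
```

Having 3 storage nodes in every rack does not by itself give the second
result; something still has to place the replicas with the racks in mind.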

</pre>

      <blockquote type="cite">

        <pre wrap="">Our targeted audience are sysadmins, operators. They hate 'magics'. They

want to have control over things which they are doing. If we put in front of

them workflow, where they click one button and they get cloud installed,

they will get horrified.

</pre>

      </blockquote>

      <pre wrap="">

I don't think this is a good characterisation of the sysadmin /

operator mindset. They, like anyone, don't like surprises, and they

often care intensely about delivering services well, with high

performance and high availability. Tools that help them do that are

appreciated, tools that are flaky - which a lot of

abstract-all-the-details tools seem to be - get a bad rap in sysadmin

circles.

</pre>

      <blockquote type="cite">

        <pre wrap="">That's why I am very sure and convinced that we need to have ability for

user to have control over stuff. What node is having what role. We can be

smart, suggest and advice. But not hiding this functionality from user.

Otherwise, I am afraid that we can fail.

</pre>

      </blockquote>

      <pre wrap="">

I think having that degree of control is failure. Our CloudOS team has

considerable experience now in deploying clouds using a high-touch

system like you describe - and they are utterly convinced that it

doesn't scale. Even at 20 nodes it is super tedious, and beyond that

it's ridiculous.

</pre>

      <blockquote type="cite">

        <pre wrap="">Furthermore, if we put lots of restrictions (like homogenous hardware) in

front of users from the very beginning, we are discouraging people from

using TripleO-UI. We are young project and trying to hit as broad audience

as possible. If we do flexible enough approach to get large audience

interested, solve their problems, we will get more feedback, we will get

early adopters, we will get more contributors, etc.

</pre>

      </blockquote>

      <pre wrap="">

Flexibility comes with a cost. Right now we have a large audience

interested in what we have, but we're delivering two separate things:

we have a functional sysadminny interface with command line scripts

and heat templates, and we have a GUI where we can offer a better

interface which the tuskar folk are building up. I agree that

homogeneous hardware isn't a viable long term constraint. But if we

insist on fixing that issue first, we sacrifice our ability to learn

about the usefulness of a simple, straightforward interface. We'll be

doing a bunch of work - regardless of implementation - to deal with

heterogeneity, when we could be bringing Swift and Cinder up to

production readiness - which IMO will get many more folk onboard for

adoption.

</pre>

      <blockquote type="cite">

        <pre wrap="">First, let's help cloud operator, who is having some nodes and wants to

deploy OpenStack on them. He wants to have control which node is controller,

which node is compute or storage. Then we can get smarter and guide.

</pre>

      </blockquote>

      <pre wrap="">

Folk that want to manually install openstack on a couple of machines

can already do so: we don't change the game for them by replacing a

manual system with a manual system. My vision is that we should

deliver something significantly better!

</pre>

      <blockquote type="cite">

        <pre wrap="">=== Milestones ===

Based on different user behavior I am talking about, I suggest different

milestones:

</pre>

      </blockquote>

      <pre wrap="">...

So, I have a suggestion. Let's create a set of all the things we want

in the product eventually.

<a class="moz-txt-link-freetext" href="https://etherpad.openstack.org/p/tripleo-feature-map">https://etherpad.openstack.org/p/tripleo-feature-map</a>

From there we can assess for each thing several things:

cost - estimated cost of an 'ok'(*) implementation - 0: expensive

(multiple cycles), 9: cheap

benefit(us) - estimated benefit to design learning by having a

functional implementation - 0: learn nothing, 9: learn lots

benefit(users) - e.g. estimated increase in # of users for which

TripleO will satisfy their needs (as part of a holistic install) - 0:

minimal increase, 9: huge increase

From there we can draw a cube: things that are cheap, we learn a lot,

and users benefit a lot are no brainers :) Things that are expensive,

we don't learn a lot and users don't benefit much are clearly things

we don't want to do right now:

cost  b-us  b-users  do-when ?

0     0     0        never?

9     9     9        right now

5     5     5        sometime in the middle

but more interesting are combinations like:

0     9     9        start now as a background task?

9     2     2        Do if we have nothing better

9     0     9        right now

9     9     0        also right now

So I dunno if this is a good idea - it's just an attempt to visualise

the tradeoffs in a way that we can be clear what we're saying is good

about a specific feature [think of it as a variation on planning

poker].

(*): I mean an implementation we could live with for a while, vs

whatever the ideal might be.
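
A rough sketch of how the do-when buckets could be read off the three scores.
The thresholds and the do_when name are assumptions for illustration only;
the table above only gives examples for the scores 0, 2, 5 and 9:

```python
# Assumed cut-offs, not agreed values.
CHEAP = 7  # a cost score at or above this counts as cheap (9 is cheapest)
HIGH = 7   # a benefit score at or above this counts as high
LOW = 2    # a score at or below this counts as expensive / little benefit


def do_when(cost, b_us, b_users):
    """Map the three 0-9 scores to a rough do-when bucket."""
    # One strong benefit (to us or to users) is treated as enough.
    best_benefit = max(b_us, b_users)
    if best_benefit >= HIGH:
        if cost >= CHEAP:
            return "right now"
        if LOW >= cost:
            return "start now as a background task?"
    if LOW >= best_benefit:
        if cost >= CHEAP:
            return "do if we have nothing better"
        return "never?"
    return "sometime in the middle"


if __name__ == "__main__":
    rows = [(0, 0, 0), (9, 9, 9), (5, 5, 5), (0, 9, 9),
            (9, 2, 2), (9, 0, 9), (9, 9, 0)]
    for cost, b_us, b_users in rows:
        print(cost, b_us, b_users, do_when(cost, b_us, b_users))
```

For the seven example rows this gives the same buckets as the do-when column.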

</pre>

      <blockquote type="cite">

        <pre wrap="">

=== Implementation ===

Above mentioned approach shouldn't lead to reimplementing scheduler. We can

still use nova-scheduler, but we can take advantage of extra params (like

unique identifier), so that we specify more concretely what goes where.

</pre>

      </blockquote>

      <pre wrap="">

That is reimplementing the scheduler. In this case it's forcing

sysadmins to be the scheduler, which is a waste of their time.

</pre>

      <blockquote type="cite">

        <pre wrap="">More details should follow here - how to achieve above mentioned goals, like

what should go through heat, what should go through nova, ironic, etc.

But first, let's agree on approach and goals.

</pre>

      </blockquote>

      <pre wrap="">

Totally agree!

-Rob

</pre>

    </blockquote>

    <br>

  </body>

</html>