<html>

  <head>

    <meta content="text/html; charset=UTF-8" http-equiv="Content-Type">

  </head>

  <body bgcolor="#FFFFFF" text="#333333">

    <br>

    <div class="moz-cite-prefix">On 2013/28/11 06:41, Robert Collins

      wrote:<br>

    </div>

    <blockquote

cite="mid:CAJ3HoZ0rcjyO92-rD+Hsm_VViiRNQZRFwvdvOvpQ34ZaH1jQLA@mail.gmail.com"

      type="cite">

      <pre wrap="">

Certainly. Do we have Personas for those people? (And have we done any

validation of them?)</pre>

    </blockquote>

    We have a short paragraph for each, but they have not been verified

    by any survey, so we don't have a very solid basis in this area right

    now, and I believe we are all working from assumptions at the moment.<br>

    <br>

    <blockquote

cite="mid:CAJ3HoZ0rcjyO92-rD+Hsm_VViiRNQZRFwvdvOvpQ34ZaH1jQLA@mail.gmail.com"

      type="cite">

      <pre wrap="">This may be where we disagree indeed :). Wearing my sysadmin hat ( a

little dusty, but never really goes away :P) - I can tell you I spent

a lot of time worrying about what went on what machine. But it was

never actually what I was paid to do.

What I was paid to do was to deliver infrastructure and services to

the business. Everything that we could automate, that we could

describe with policy and still get robust, reliable results - we did.

It's how one runs many hundred machines with an ops team of 2.

Planning around failure domains for example, is tedious work; it's

needed at a purchasing level - you need to decide if you're buying

three datacentres or one datacentre with internal redundancy, but once

thats decided the actual mechanics of ensure that each HA service is

spread across the (three datacentres) or (three separate zones in the

one DC) is not interesting. So - I'm sure that many sysadmins do

manually assign work to machines to ensure a good result from

performance or HA concerns, but thats out of necessity, not desire.</pre>

    </blockquote>

    Well, I think there is one small misunderstanding. I've never said

    that the manual way should be our primary workflow. I agree that we

    should lean toward as much automation and smartness as possible. But

    at the same time, I am adding that we need a manual fallback so the

    user can change that smart decision.<br>

    <br>

    The primary way would be to let TripleO decide where the stuff goes.

    I think we agree here.<br>

    <br>

    But I, as a sysadmin, want to see the distribution of stuff before I

    deploy. And if there is some failure in the automation logic, I need

    to be able to change that. Not from scratch, but by changing the

    suggested distribution. There should always be a way to do that

    manually. Let's imagine that TripleO, by mistake or intentionally,

    distributes nodes across my datacenter wrongly (wrong for me, not

    necessarily for somebody else). What would I do? Would I let TripleO

    deploy it anyway? No. I would not use TripleO. But if there is

    something I need to change and I have a way to do that, I will stay

    with TripleO, because it allows me to satisfy all my needs.<br>

    <br>

    We can be smart, but we can't be the smartest and foresee all the

    reasons of all users.<br>

    <br>

    <blockquote

cite="mid:CAJ3HoZ0rcjyO92-rD+Hsm_VViiRNQZRFwvdvOvpQ34ZaH1jQLA@mail.gmail.com"

      type="cite">

      <pre wrap="">Why does that layout make you happy? What is it about that setup where

things will work better for you? Note that in the absence of a

sophisticated scheduler you'll have some volumes with redundancy of 3

end up all in one rack: you won't get rack-can-fail safety on the

delivered cloud workloads (I mention this as one attempt to understand

why knowing there is a control node / 3 storage /rest compute in each

rack makes you happy).</pre>

    </blockquote>

    It doesn't have to make me happy, but somebody else might have

    strong reasons for that (or for any other setup which we didn't

    cover). We don't have to know those reasons, but why can't we allow

    that user to do this?<br>

    <br>

    One more time, I want to stress this - I am not fighting for the

    absence of a sophisticated scheduler, I am fighting for allowing the

    user to control the stuff if he wants/needs to.<br>

    <br>

    <blockquote

cite="mid:CAJ3HoZ0rcjyO92-rD+Hsm_VViiRNQZRFwvdvOvpQ34ZaH1jQLA@mail.gmail.com"

      type="cite">

      <pre wrap="">I think having that degree of control is failure. Our CloudOS team has

considerable experience now in deploying clouds using a high-touch

system like you describe - and they are utterly convinced that it

doesn't scale. Even at 20 nodes it is super tedious, and beyond that

it's ridiculous.</pre>

    </blockquote>

    Right. And are they convinced that an automated tool will do the best

    job for them? Do they trust it so strongly that they would deploy

    their whole datacenter without checking that the distribution is

    correct? Would they say - OK, I said I want 50 compute, 10 block

    storage, 3 control. As long as it works, I don't care, be smart,

    do it for me?<br>

    <br>

    It all depends on the GUI design. If we design it well enough that

    the user can do quick bulk actions, even manual distribution can be

    easy. Even for 100 nodes... or more.<br>

    (But I don't suggest we do it all manually.)<br>

    <br>

    <blockquote

cite="mid:CAJ3HoZ0rcjyO92-rD+Hsm_VViiRNQZRFwvdvOvpQ34ZaH1jQLA@mail.gmail.com"

      type="cite">

      <pre wrap="">Flexibilty comes with a cost. Right now we have a large audience

interested in what we have, but we're delivering two separate things:

we have a functional sysadminny interface with command line scripts

and heat templates - , and we have a GUI where we can offer a better

interface which the tuskar folk are building up. I agree that

homogeneous hardware isn't a viable long term constraint. But if we

insist on fixing that issue first, we sacrifice our ability to learn

about the usefulness of a simple, straight forward interface. We'll be

doing a bunch of work - regardless of implementation - to deal with

heterogeneity, when we could be bringing Swift and Cinder up to

production readiness - which IMO will get many more folk onboard for

adoption.</pre>

    </blockquote>

    I agree that there will always be some cost. I just think that we

    can reduce it.<br>

    <br>

    <blockquote

cite="mid:CAJ3HoZ0rcjyO92-rD+Hsm_VViiRNQZRFwvdvOvpQ34ZaH1jQLA@mail.gmail.com"

      type="cite">

      <pre wrap="">Folk that want to manually install openstack on a couple of machines

can already do so : we don't change the game for them by replacing a

manual system with a manual system. My vision is that we should

deliver something significantly better!</pre>

    </blockquote>

    We should! And we can. But I think we shouldn't deliver something

    that will discourage people from using TripleO. Especially at the

    beginning, we should be able to say: see, user, we are taking our

    first steps here, the distribution is not perfect and not what you

    wanted, but you can make the change you need. You don't have to go

    away and come back in 6 months when we try to be smarter and address

    your case.<br>

    <br>

    <blockquote

cite="mid:CAJ3HoZ0rcjyO92-rD+Hsm_VViiRNQZRFwvdvOvpQ34ZaH1jQLA@mail.gmail.com"

      type="cite">

      <blockquote type="cite">

        <pre wrap="">=== Milestones ===

</pre>

      </blockquote>

      <pre wrap="">...

So, I have a suggestion. Lets create a set of all the things we want

in the product eventually.

<a class="moz-txt-link-freetext" href="https://etherpad.openstack.org/p/tripleo-feature-map">https://etherpad.openstack.org/p/tripleo-feature-map</a>

From there we can assess for each thing several things:

cost - estimated cost of 'ok'(*) implementation - 0: expensive-

multiple cycles, 9: cheap

benefit(us) - estimated benefit to design learning by having a

functional implementation - 0: learn nothing, 9: learn lots

benefit(users) - e.g. estimated increase in # of users for which

TripleO will satisfy their needs (as part of a holistic install) - 0:

minimal increase, 9: huge increase

From there we can draw a cube: things that are cheap, we learn a lot,

and users benefit a lot are no brainers :) Things that are expensive,

we don't learn a lot and users don't benefit much are clearly things

we don't want to do right now:

cost  b-us   b-users   do-when ?

0        0        0            never?

9        9        9            right now

5        5        5            sometime in the middle

but more interesting are combinations like:

0        9        9            start now as a background task?

9        2        2            Do if we have nothing better

9        0        9            right now

9        9        0            also right now

So I dunno if this is a good idea - it's just an attempt to visualise

the tradeoffs in a way that we can be clear what we're saying is good

about a specific feature [think of it as a variation on planning

poker].

(*): I mean an implementation we could live with for a while, vs

whatever the ideal might be.</pre>

    </blockquote>

    I think it might help.<br>

    <br>
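    To make the quoted scoring idea concrete, here is a rough sketch of

    how features could be ranked once each has its three 0-9 scores. The

    ranking rule (a plain sum) and the feature names in the usage note

    are my own assumptions for illustration; the proposal above does not

    define how the three scores combine.<br>

    <pre wrap="">
```python
def priority(cost, benefit_us, benefit_users):
    # Assumed ranking rule, not taken from the thread: a plain sum of the
    # three 0-9 scores. For cost, 9 means cheap and 0 means expensive;
    # for both benefits, 9 means high.
    return cost + benefit_us + benefit_users


def rank(features):
    """Sort (name, cost, benefit_us, benefit_users) tuples, best first."""
    return sorted(features, key=lambda f: priority(*f[1:]), reverse=True)
```
</pre>

    For example, rank([("feature-a", 0, 0, 0), ("feature-b", 9, 9, 9)])

    puts the hypothetical feature-b first. A sum is only one option; a

    rule that treats "cheap and useful to anybody" as "right now" would

    be closer to the last rows of the quoted table.<br>

    <br>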

    The thing is that I believe we are going in the same direction with

    the same goals (with just some nuances).<br>

    <br>

    For me it is important to have a manual fallback in the Icehouse

    release. If it turns out to be too difficult to implement, we can

    deliver it in v1 instead of v0 (I'll survive :)). Personally I don't

    think it should be that difficult, but I am not the best person to

    evaluate that. But I will strongly fight for this to be in the

    Icehouse release. It shouldn't be the primary way, but I believe it

    needs to exist.<br>

    <br>

    Basically what I am saying is - be smart, do a 'dry-run' of the

    scheduler to show what the distribution will be, and if I am happy,

    I confirm. If I am not happy, allow me to change it.<br>

    <br>
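    A minimal sketch of that dry-run / review / confirm flow, just to

    show the shape of it. Everything here (Plan, dry_run, override,

    deploy, and the naive fill-in-order placement) is hypothetical and

    not an existing TripleO or Tuskar API; a real scheduler would be much

    smarter about placement.<br>

    <pre wrap="">
```python
from dataclasses import dataclass, field


@dataclass
class Plan:
    """A proposed node-to-role assignment that has not been deployed yet."""
    assignment: dict = field(default_factory=dict)  # node name to role name

    def override(self, node, role):
        # Manual fallback: the operator adjusts one suggested placement
        # instead of building the whole distribution from scratch.
        if node not in self.assignment:
            raise KeyError(node)
        self.assignment[node] = role


def dry_run(nodes, wanted):
    """Hypothetical scheduler stand-in: fill roles in order, deploy nothing."""
    roles = [role for role, count in wanted.items() for _ in range(count)]
    if len(roles) > len(nodes):
        raise ValueError("not enough nodes for the requested roles")
    return Plan(dict(zip(nodes, roles)))


def deploy(plan, confirmed):
    # Nothing happens until the operator confirms the reviewed plan.
    if not confirmed:
        return None
    return sorted(plan.assignment.items())
```
</pre>

    Usage would be something like: plan = dry_run(["n1", "n2", "n3"],

    {"control": 1, "compute": 2}), look at plan.assignment, call

    plan.override("n3", "storage") if the suggestion is wrong for me,

    and only then deploy(plan, confirmed=True).<br>

    <br>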

    -- Jarda<br>

  </body>

</html>