[openstack-dev] [TripleO][Tuskar] Icehouse Requirements

Keith Basil kbasil at redhat.com
Thu Dec 12 17:13:05 UTC 2013


On Dec 11, 2013, at 3:42 PM, Robert Collins wrote:

> On 12 December 2013 01:17, Jaromir Coufal <jcoufal at redhat.com> wrote:
>> On 2013/10/12 23:09, Robert Collins wrote:
> 
>>>> The 'easiest' way is to support bigger companies with huge deployments,
>>>> tailored infrastructure, everything connected properly.
>>>> 
>>>> But there are tons of companies/users who are running on old
>>>> heterogeneous hardware. Very likely even more than the number of
>>>> companies with the large deployments already mentioned. And if we give
>>>> them only the option of 'setting up rules' in order to get a service
>>>> onto a node, this type of user is not going to use our deployment
>>>> system.
>>> 
>>> 
>>> That's speculation. We don't know if they will or will not because we
>>> haven't given them a working system to test.
>> 
>> Some part of that is speculation, some part of that is feedback from people
>> who are doing deployments (of course it's just a very limited audience).
>> Anyway, it is not just pure theory.
> 
> Sure. Let me be more precise. There is a hypothesis that lack of
> direct control will be a significant adoption blocker for a primary
> group of users.
> 
> I think it's safe to say that some users in the group 'sysadmins
> having to deploy an OpenStack cloud' will find it a bridge too far and
> not use a system without direct control. Call this group A.
> 
> I think it's also safe to say that some users will not care in the
> slightest, because their deployment is too small for them to be
> particularly worried (e.g. about occasional downtime (but they would
> worry a lot about data loss)). Call this group B.
> 
> I suspect we don't need to consider group C - folk who won't use a
> system if it *has* manual control, but that's only a suspicion. It may
> be that the side effect of adding direct control is to reduce
> usability below the threshold some folk need...
> 
> To assess 'significant adoption blocker' we basically need to find the
> % of users who will care sufficiently that they don't use TripleO.
> 
> How can we do that? We can do questionnaires, and get such folk to
> come talk with us, but that suffers from selection bias - group B can
> use the system with or without direct manual control, so they have little
> motivation to argue vigorously in any particular direction. Group A,
> however, have to argue because they won't use the system at all without
> that feature, and they may want to use the system for other reasons,
> so that becomes a crucial aspect for them.
> 
> A much better way IMO is to test it - to get a bunch of volunteers and
> see who responds positively to a demo *without* direct manual control.
> 
> To do that we need a demoable thing, which might just be mockups that
> show a set of workflows (and include things like Jay's
> shiny-new-hardware use case in the demo).
> 
> I rather suspect we're building that anyway as part of doing UX work,
> so maybe what we do is put a tweet or blog post up asking for
> sysadmins who a) have not yet deployed OpenStack, b) want to, and c)
> are willing to spend 20-30 minutes with us, walk them through a demo
> showing no manual control, and record what questions they ask, and
> whether they would like to have that product, and if not, then
> (a) what use cases they can't address with the mockups and (b) what
> other reasons they have for not using it.
> 
> This is a bunch of work though!
> 
> So, do we need to do that work?
> 
> *If* we can layer manual control on later, then we could defer this
> testing until we are at the point where we can say 'the nova scheduled
> version is ready, now let's decide if we add the manual control'.
> 
> OTOH, if we *cannot* layer manual control on later - if it has
> tentacles through too much of the code base, then we need to decide
> earlier, because it will be significantly harder to add later, and that
> may come too late for the ship dates of vendors shipping on top of TripleO.
> 
> So with that as a prelude, my technical sense is that we can layer
> manual scheduling on later: we provide an advanced screen, show the
> list of N instances we're going to ask for, and allow each instance to
> be directly customised with a node id selected from either the current
> node it's running on or an available node. It's significant work, both
> UI and plumbing, but it's not going to be made harder by the other
> work we're doing AFAICT.
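
For what it's worth, a rough sketch of the plumbing such an advanced screen
could sit on, assuming the undercloud Nova allows the usual admin-only host
targeting via the 'nova:<host>' availability zone form (the credentials,
image/flavor IDs and hostnames below are placeholders, not anything we have
today):

# Sketch only: pin each instance the operator customised in the "advanced"
# screen to the undercloud node they picked, bypassing normal scheduling.
from novaclient import client

nova = client.Client('2', USERNAME, PASSWORD, TENANT, AUTH_URL)  # placeholders

# instance name -> undercloud hypervisor hostname chosen in the UI
node_choices = {
    'overcloud-control-0': 'undercloud-node-1',
    'overcloud-control-1': 'undercloud-node-2',
}

for name, host in node_choices.items():
    nova.servers.create(
        name=name,
        image=OVERCLOUD_IMAGE_ID,    # placeholder
        flavor=BAREMETAL_FLAVOR_ID,  # placeholder
        # admin-only: land the instance on the named host
        availability_zone='nova:%s' % host,
    )
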
> 
> -> My proposal is that we shelve this discussion until we have the
> nova/heat scheduled version in 'and now we polish' mode, and then pick
> it back up and assess user needs.
> 
> An alternative argument is to say that group A is a majority of the
> userbase and that doing an automatic version is entirely unnecessary.
> That's also possible, but I'm extremely skeptical, given the huge cost
> of staff time, and the complete lack of interest my sysadmin friends
> (and my former sysadmin self) have in doing automatable things by
> hand.
> 
>>> Let's break the concern into two halves:
>>> A) Users who could have their needs met, but won't use TripleO because
>>> meeting their needs in this way is too hard/complex/painful.
>>> 
>>> B) Users who have a need we cannot meet with the current approach.
>>> 
>>> For category B users, their needs might be specific HA things - like
>>> the oft discussed failure domains angle, where we need to split up HA
>>> clusters across power bars, aircon, switches etc. Clearly long term we
>>> want to support them, and the undercloud Nova scheduler is entirely
>>> capable of being informed about this, and we can evolve to a holistic
>>> statement over time. Let's get a concrete list of the cases we can
>>> think of today that won't be well supported initially, and we can
>>> figure out where to do the work to support them properly.
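
Purely as a sketch of the 'scheduler can be informed about this' point: host
aggregates in the undercloud Nova tagged per power bar or switch, matched from
per-domain flavors, assuming AggregateInstanceExtraSpecsFilter is enabled (all
names and credentials below are made up, this is not a committed design):

# Sketch only: describe failure domains to the undercloud Nova scheduler by
# grouping hosts into aggregates and requiring a domain from the flavor.
from novaclient import client

nova = client.Client('2', USERNAME, PASSWORD, TENANT, AUTH_URL)  # placeholders

# failure domain (power bar / switch / aircon zone) -> undercloud hosts in it
domains = {
    'power-bar-a': ['undercloud-node-1', 'undercloud-node-2'],
    'power-bar-b': ['undercloud-node-3', 'undercloud-node-4'],
}

for domain, hosts in domains.items():
    agg = nova.aggregates.create(domain, None)
    nova.aggregates.set_metadata(agg, {'failure_domain': domain})
    for host in hosts:
        nova.aggregates.add_host(agg, host)

# One flavor per domain; booting one HA cluster member with each flavor
# spreads the cluster across the domains.
flavor = nova.flavors.create('control-power-bar-a', 4096, 1, 40)
flavor.set_keys({'aggregate_instance_extra_specs:failure_domain': 'power-bar-a'})
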
>> 
>> My question is - can't we help them now? Can we enable users to use our app
>> even when we don't have enough smartness to help them the 'auto' way?
> 
> I understand the question, but I can't answer it until we have *an*
> example that is both real and not deliverable today. At the moment the
> only one we know of is HA, and that's certainly an important feature on
> the nova scheduled side, so doing manual control to deliver a future
> automatic feature doesn't make a lot of sense to me. Crawl, walk, run.

Maybe this is a valid use case?

A cloud operator has several core service nodes of differing configuration
types.

[node1]  <-- balanced mix of disk/cpu/ram for general core services
[node2]  <-- lots of disks for Ceilometer data storage
[node3]  <-- low-end "appliance-like" box for a specialized/custom core service
	     (SIEM box for example)

All nodes [1,2,3] are in the same deployment grouping ("core services").  As such,
this is a heterogeneous deployment grouping.  Heterogeneity in this case is
defined by differing roles and hardware configurations.

This is a real use case.

How do we handle this?
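
One way this might be expressible today, sketched under the assumption that the
undercloud uses the Ironic Nova driver with ComputeCapabilitiesFilter enabled
(node UUIDs, credentials and profile names below are placeholders, not a
committed design):

# Sketch only: route each core-service role to matching hardware by tagging
# nodes with a 'profile' capability and requiring it from a per-role flavor.
from ironicclient import client as ironic_client
from novaclient import client as nova_client

ironic = ironic_client.get_client(1, os_username=USERNAME, os_password=PASSWORD,
                                  os_tenant_name=TENANT, os_auth_url=AUTH_URL)
nova = nova_client.Client('2', USERNAME, PASSWORD, TENANT, AUTH_URL)

# node uuid -> hardware profile, matching the three boxes above
profiles = {
    NODE1_UUID: 'general-core',        # balanced disk/cpu/ram
    NODE2_UUID: 'ceilometer-storage',  # lots of disks
    NODE3_UUID: 'siem-appliance',      # low-end appliance-like box
}

for uuid, profile in profiles.items():
    ironic.node.update(uuid, [{'op': 'add',
                               'path': '/properties/capabilities',
                               'value': 'profile:%s' % profile}])

# One flavor per role; the scheduler only places a role on nodes whose
# capabilities advertise the matching profile.
for profile in set(profiles.values()):
    flavor = nova.flavors.create('baremetal-%s' % profile, 4096, 1, 40)
    flavor.set_keys({'capabilities:profile': profile})

Whether that counts as "enough smartness", or whether operators still want a
manual override in the UI for this grouping, is exactly the question above.
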


	-k



