Open Stack

Mon Jul 10 16:59:21 UTC 2017

On Mon, Jul 10, 2017 at 9:19 AM, Steven Hardy <shardy at redhat.com> wrote:
> On Fri, Jul 7, 2017 at 6:50 PM, James Slagle <james.slagle at gmail.com> wrote:
> Yeah so my idea with (4), and subsequent patches such as [1], is to
> gradually move the deploy steps performed to configure services (on
> baremetal and in containers) to a single ansible playbook.
>
> There's currently still heat orchestration around the host preparation
> (although this is performed via ansible) and iteration over each step
> (where we re-apply the same deploy-steps playbook with an incrementing
> step variable, but this could be replaced by e.g. an ansible or mistral
> loop), but my idea was to enable end-to-end configuration of nodes via
> ansible-playbook, without the need for any special tools (e.g. we
> refactor t-h-t enough that we don't need any special tools, and we
> make deploy-steps-playbook.yaml the only method of deployment for
> baremetal and container cases).
>
> [1] https://review.openstack.org/#/c/462211/
>
>> All of this work has merit as we investigate longer term plans, and
>> it's all at different stages with some being for dev/CI (0), some
>> being used already in production (1 and 2), some just at the
>> experimental stage (3 and 4), and some does not exist other than an
>> idea (5).
>
> I'd like to get the remaining work for (4) done so it's a supportable
> option for minor updates. There's still a bit more t-h-t refactoring
> required to enable it, but I think we're already pretty close to
> being able to run end-to-end ansible for most of the PostDeploy steps
> without any special tooling.

Thanks for this context, I think it helps clarify where we could be
going with these patches. I'll take a closer look at what you've done
so far.

I think I'm missing the point of whether the playbooks would still
run in localhost mode on each node, or whether the idea is that we could
eventually work towards a central ansible "runner" (such as the
undercloud) that could execute all the playbooks.

It sounds like the latter is possibly just an iterative step beyond the
former, so I think I like where this approach is going.
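
To make the two modes concrete, here is a rough sketch (not anything
from the patches) of the step iteration described above: the same
playbook is re-applied with an incrementing "step" variable, either in
local-connection mode on a node or from a central runner against an
inventory. The inventory values and the step count are assumptions
for illustration only; the sketch only builds the command lines and
does not run them.

```python
# Sketch only: build (do not execute) one ansible-playbook command per
# deploy step. The step count and inventory values are assumptions.


def build_step_commands(playbook, inventory, max_step, local=False):
    """Return one ansible-playbook argv list per step, for steps 1..max_step."""
    commands = []
    for step in range(1, max_step + 1):
        # -e passes the incrementing step variable to the playbook
        cmd = ["ansible-playbook", "-i", inventory, "-e", "step=%d" % step]
        if local:
            # localhost mode: the node applies the playbook to itself
            cmd += ["--connection", "local"]
        cmd.append(playbook)
        commands.append(cmd)
    return commands


if __name__ == "__main__":
    # Central "runner" mode against a hypothetical inventory file
    for argv in build_step_commands(
        "deploy-steps-playbook.yaml", "inventory.yaml", max_step=5
    ):
        print(" ".join(argv))
```

For the per-node case the same helper would be called with
local=True and an inventory of "localhost,".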

>> My intent with this mail is to start a discussion around what we've
>> learned from these approaches and start discussing a consolidated plan
>> around Ansible. And I'm not saying that whatever we come up with
>> should only use Ansible a certain way. Just that we ought to look at
>> how users/operators interact with Ansible and TripleO today and try
>> and come up with the best solution(s) going forward.
>>
>> I think that (1) has been pretty successful, and my idea with (5)
>> would use a similar approach once the playbooks were generated.
>> Further, my idea with (5) would give us a fully backwards compatible
>> solution with our existing template interfaces from
>> tripleo-heat-templates. Longer term (or even in parallel for some
>> time), the generated playbooks could stop being generated (and just
>> exist in git), and we could consider moving away from Heat more
>> permanently.
>
> Yeah I think working towards aligning more TripleO configuration with
> the approach taken by tripleo-validations is fine, and we can e.g. add
> more heat generated data about the nodes to the dynamic ansible
> inventory:
>
> https://github.com/openstack/tripleo-validations/blob/master/tripleo_validations/inventory.py
>
> We've been gradually adding data there, which I hope will enable a
> cleaner "split stack", where the nodes are deployed via heat, then
> ansible can do the configuration based on data exposed via stack
> outputs (which again is a pattern that I think has been proven to work
> quite well for tripleo-validations, and is also something I've been
> using locally for dev testing quite successfully).
>
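
As a rough illustration of the "split stack" idea above, a dynamic
inventory could be derived from data exposed via stack outputs along
these lines. This is only a sketch: the shape of the input (a mapping
of role name to hostname/IP pairs) is an assumption, not the actual
heat output format, and it is not the tripleo-validations
inventory.py code. The output follows the standard Ansible dynamic
inventory JSON layout (groups with "hosts", plus "_meta"/"hostvars").

```python
import json

# Sketch only: the input shape {role: {hostname: ip}} is an assumption,
# not the real heat stack output format.


def build_inventory(stack_outputs):
    """Convert {role: {hostname: ip}} into Ansible dynamic-inventory data."""
    inventory = {"_meta": {"hostvars": {}}}
    for role, nodes in sorted(stack_outputs.items()):
        # Each role becomes an inventory group listing its hosts
        inventory[role] = {"hosts": sorted(nodes)}
        for hostname, ip in nodes.items():
            # ansible_host tells Ansible which address to connect to
            inventory["_meta"]["hostvars"][hostname] = {"ansible_host": ip}
    return inventory


if __name__ == "__main__":
    # Hypothetical example data using documentation-range addresses
    example = {
        "Controller": {"ctrl-0": "192.0.2.10"},
        "Compute": {"comp-0": "192.0.2.20", "comp-1": "192.0.2.21"},
    }
    print(json.dumps(build_inventory(example), indent=2))
```
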
>> I recognize that saying "moving away from Heat" may be quite
>> controversial. While it's not 100% the same discussion as what we are
>> doing with Ansible, I think it is a big part of the discussion, as is
>> whether we want to continue with Heat as the primary orchestration
>> tool in TripleO.
>
> Yeah, I think the first step is to focus on a clean "split stack"
> model where the nodes/networks etc are still deployed via heat, then
> ansible handles the configuration of the nodes.
>
> In the long term I could see benefits in a "tripleo lite" model,
> where, say, we only used mistral+Ironic+ansible, but IMO we're not at
> the point yet where that's achievable, primarily because there's
> coupling between the heat parameter interfaces and multiple
> integrations we can't break (e.g. users with environment files,
> tripleo-ui, vendor integrations, etc).
>
> It's a good discussion to kick off regardless though, so personally
> I'd like to focus on these as the first "baby steps":
>
> 1. How to perform end-to-end configuration via ansible (outside of
> heat, but probably still using data and possibly playbooks generated
> by heat)
>
> 2. How to deploy nodes directly via Ironic, with a mistral workflow
> (e.g. no Nova and potentially no Neutron?). I started that in
> https://review.openstack.org/#/c/313048/ but could use some help
> completing it.

Recently, I took a closer look at the os_ironic and os_ironic_node
modules that exist in Ansible today:

http://docs.ansible.com/ansible/os_ironic_module.html
http://docs.ansible.com/ansible/os_ironic_node_module.html

These are fully functional today and are also used by bifrost.

I'd rather see us just use these native Ansible modules and trigger
them with Mistral->Ansible instead of reimplementing the logic in
native Mistral workbooks/flows.

>
>> I've been hearing a lot of feedback from various operators about how
>> difficult the baremetal deployment is with Heat. While feedback about
>> Ironic is generally positive, a lot of the negative feedback is around
>> the Heat->Nova->Ironic interaction. And, if we also move more towards
>> Ansible for the service deployment, I wonder if there is still a long
>> term place for Heat at all.
>
> So while there are plenty of valid complaints, one observation is Heat
> always gets blamed because it's the operator visible interface, but
> quite often the problems are e.g. Nova or some other non-heat issue,
> for example "No valid host found" is often perceived as a heat problem
> by new users when in reality it's not.

Yes, I agree completely with that point.

As you mention, there is a class of feedback that is directly related
to Nova->Ironic interaction. If we moved away from Heat for baremetal,
I don't really see any reason why we would still use Nova either, as
it's just a hindrance at that point.

There is another class of feedback somewhat specific to Heat though.
Things like:
- fault tolerance
- interface complexity (HostnameMap, FixedIP's)

To directly address some of that feedback, we'd need to work on Heat
directly and make it work more like the tools that operators are
asking for (Ansible, etc). I think there are differing opinions on
whether Heat should work certain ways or not, and if it's worth
investing in those types of changes.

> That said, there are valid complaints around the SoftwareDeployment
> approach and operator familiarity vs some more traditional tool such
> as ansible.
>
>> Personally, I'm pretty apprehensive about the approach taken in (3). I
>> feel that it is a lot of complexity that could be done more simply if we
>> took a step back and thought more about a longer term approach. I
>> recognize that it's mostly an experiment/POC at this stage, and I'm
>> not trying to directly knock down the approach. It's just that when I
>> start to see more patches (Kubernetes installation) using the same
>> approach, I figure it's worth discussing more broadly vs trying to
>> have a discussion by -1'ing patch reviews, etc.
>
> I agree, I think the approach in (3) is a stopgap until we can define
> a cleaner approach with fewer layers.
>
> IMO the first step towards that is likely to be a "split stack" which
> outputs heat data, then deployment configuration is performed via
> mistral->ansible just like we already do in (1).
>
>> I'm interested in all feedback of course. And I plan to take a shot at
>> working on the prototype I mentioned in (5) if anyone would like to
>> collaborate around that.
>
> I'm very happy to collaborate, and this is quite closely related to
> the investigations I've been doing around enabling minor updates for
> containers.
>
> Let's sync up about it, but as I mentioned above I'm not yet fully sold
> on a new translation tool, vs just more t-h-t refactoring to enable
> output of data directly consumable via ansible-playbook (which can
> then be run by operators, or heat, or mistral, or whatever).

Indeed, it sounds like our goals are mostly the same,
and we're just investigating different ways to get there. Perhaps it
could be worth starting an etherpad with some of these thoughts and
goals of where we'd like to get to. I can do that and link it from the
PTG etherpad for this proposed session.

-- 
-- James Slagle
--

[openstack-dev] [TripleO] Forming our plans around Ansible