Open Stack

Wed Jul 12 02:18:38 UTC 2017

On Wed, Jul 12, 2017 at 11:47 AM, James Slagle <james.slagle at gmail.com>
wrote:

> On Tue, Jul 11, 2017 at 6:53 PM, Steve Baker <sbaker at redhat.com> wrote:
> >
> >
> > On Tue, Jul 11, 2017 at 6:51 AM, James Slagle <james.slagle at gmail.com>
> > wrote:
> >>
> >> On Mon, Jul 10, 2017 at 11:37 AM, Lars Kellogg-Stedman <lars at redhat.com>
> >> wrote:
> >> > On Fri, Jul 7, 2017 at 1:50 PM, James Slagle <james.slagle at gmail.com>
> >> > wrote:
> >> >>
> >> >> There are also some ideas forming around pulling the Ansible playbooks
> >> >> and vars out of Heat so that they can be rerun (or run initially)
> >> >> independently from the Heat SoftwareDeployment delivery mechanism:
> >> >
> >> >
> >> > I think the closer we can come to "the operator runs ansible-playbook to
> >> > configure the overcloud" the better, but not because I think Ansible is
> >> > inherently a great tool: rather, I think the many layers of indirection in
> >> > our existing model make error reporting and diagnosis much more complicated
> >> > than it needs to be.  Combined with Puppet's "fail as late as possible"
> >> > model, this means that (a) operators waste time waiting for a deployment
> >> > that is ultimately going to fail but hasn't yet, and (b) when it does fail,
> >> > they need relatively intimate knowledge of our deployment tools to backtrack
> >> > through logs and find the root cause of the failure.
> >> >
> >> > If we can offer a deployment mode that reduces the number of layers between
> >> > the operator and the actions being performed on the hosts I think we would
> >> > win on both fronts: faster failures and reporting errors as close as
> >> > possible to the actual problem will result in less frustration across the
> >> > board.
> >> >
> >> > I do like Steve's suggestion of a split model where Heat is responsible for
> >> > instantiating OpenStack resources while Ansible is used to perform host
> >> > configuration tasks.  Despite all the work done on Ansible's OpenStack
> >> > modules, they feel inflexible and frustrating to work with when compared to
> >> > Heat's state-aware, dependency ordered deployments.  A solution that allows
> >> > Heat to output configuration that can subsequently be consumed by Ansible --
> >> > either running manually or perhaps via Mistral for API-driven-deployments --
> >> > seems like an excellent goal.  Using Heat as a "front-end" to the process
> >> > means that we get to keep the parameter validation and documentation that is
> >> > missing in Ansible, while still following the Unix philosophy of giving you
> >> > enough rope to hang yourself if you really want it.
> >>
> >> This is excellent input, thanks for providing it.
> >>
> >> I think it lends itself towards suggesting that we may like to pursue
> >> (again) adding native Ironic resources to Heat. If those were written
> >> in a way that also addressed some of the feedback about TripleO and
> >> the baremetal deployment side, then we could continue to get the
> >> advantages from Heat that you mention.
> >>
> >> My personal opinion to date is that Ansible's os_ironic* modules are
> >> superior in some ways to the Heat->Nova->Ironic model. However, just a
> >> Heat->Ironic model may work in a way that has the advantages of both.
> >
> >
> > I too would dearly like to get nova out of the picture. Our placement needs
> > mean the scheduler is something we need to work around, and it discards
> > basically all context for the operator when ironic can't deploy for some
> > reason.
> >
> > Whether we use a mistral workflow[1], a heat resource, or ansible os_ironic,
> > there will still need to be some python logic to build the config drive ISO
> > that injects the ssh keys and os-collect-config bootstrap.
> >
> > Unfortunately ironic iPXE boot from iSCSI[2] doesn't support config-drive
> > (still?) so the only option to inject ssh keys is the nova ec2-metadata
> > service (or equivalent). I suspect if we can't make every ironic deployment
> > method support config-drive then we're stuck with nova.
> >
> > I don't have a strong preference for a heat resource vs mistral vs ansible
> > os_ironic, but given there is some python logic required anyway, I would
> > lean towards a heat resource. If the resource is general enough we could
> > propose it to heat upstream, otherwise we could carry it in tripleo-common.
> >
> > Alternatively, we can implement a config-drive builder in tripleo-common and
> > invoke that from mistral or ansible.
>
> Ironic's cli node-set-provision-state command has a --config-drive
> option where you just point it at a directory and it will automatically
> bundle that dir into the config drive ISO format.
>
> Ansible's os_ironic_node[1] also supports that via the config_drive
> parameter. Combining that with a couple of template tasks to create
> meta_data.json and user_data files makes for a very easy to use
> interface.
>
>
> [1] http://docs.ansible.com/ansible/os_ironic_node_module.html
>
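
To make the directory-based approach concrete, here is a minimal sketch of
the python side: it writes the two files mentioned above into the usual
config drive layout (openstack/latest/), producing a directory that could
then be handed to --config-drive or config_drive. The function name, the
"default" key name and the exact set of metadata fields are assumptions
for illustration, not what TripleO or Ironic actually generate.

```python
import json
from pathlib import Path


def build_config_drive_dir(root, hostname, ssh_public_key, user_data):
    """Write openstack/latest/{meta_data.json,user_data} under root.

    Hypothetical helper: the field selection below is an assumption,
    kept to the minimum needed to inject a hostname and one ssh key.
    """
    latest = Path(root) / "openstack" / "latest"
    latest.mkdir(parents=True, exist_ok=True)

    meta = {
        "hostname": hostname,
        "name": hostname,
        # "default" is an arbitrary key name chosen for this sketch.
        "public_keys": {"default": ssh_public_key},
    }
    (latest / "meta_data.json").write_text(json.dumps(meta, indent=2))

    # user_data is written verbatim; it would carry whatever bootstrap
    # (e.g. for os-collect-config) the deployment needs.
    (latest / "user_data").write_text(user_data)
    return latest
```

In an Ansible-driven flow the same two files would come from template
tasks instead; the python version is only meant to show how little logic
is involved once the ISO bundling is left to Ironic.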

Oh, that makes it easier. That just leaves the issue of 4 of the 5
scenarios in [2] not supporting config drive. The options I see here are:
a. nova forever
b. not support any boot from volume scenarios in TripleO that don't work
   with config-drive
c. write our own small metadata service (it's basically serving
   machine-specific static HTTP content, so can maybe be done with some
   apache fu)

If b. is acceptable then maybe I can un-abandon [3]?
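
For a sense of how small option c. could be, here is a rough sketch in
python rather than apache: each node gets a directory of static files
named after its IP address, and a request is answered from the directory
matching the client's address. The per-IP directory layout, the function
names and the use of the client IP to identify the node are all
assumptions for illustration; a real service would also have to be
reachable at whatever address the instance's metadata client expects.

```python
import http.server
import threading
from pathlib import Path


def make_handler(root):
    """Build a handler serving files from <root>/<client-ip>/<request-path>."""
    root = Path(root).resolve()

    class MetadataHandler(http.server.BaseHTTPRequestHandler):
        def do_GET(self):
            client_dir = root / self.client_address[0]
            target = (client_dir / self.path.lstrip("/")).resolve()
            # Only serve regular files inside this client's own directory.
            if client_dir not in target.parents or not target.is_file():
                self.send_error(404)
                return
            body = target.read_bytes()
            self.send_response(200)
            self.send_header("Content-Type", "text/plain")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

        def log_message(self, *args):
            pass  # keep the sketch quiet

    return MetadataHandler


def serve(root, host="127.0.0.1", port=0):
    """Start the server in a background thread and return it."""
    server = http.server.ThreadingHTTPServer((host, port), make_handler(root))
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

With a file at <root>/<node-ip>/latest/meta-data/public-keys/0/openssh-key,
a GET for /latest/meta-data/public-keys/0/openssh-key from that node would
return its ssh key, and the same request from any other address would get
a 404.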

[2]
http://specs.openstack.org/openstack/ironic-specs/specs/approved/boot-from-volume-reference-drivers.html
[3] https://review.openstack.org/#/c/400407/

[openstack-dev] [TripleO] Forming our plans around Ansible