[openstack-dev] [tripleo][heat][ironic] Heat Ironic resources and "ready state" orchestration

Jim Rollenhagen jim at jimrollenhagen.com
Mon Sep 15 18:04:38 UTC 2014


On Mon, Sep 15, 2014 at 12:44:24PM +0100, Steven Hardy wrote:
> All,
> 
> Starting this thread as a follow-up to a strongly negative reaction by the
> Ironic PTL to my patches[1] adding initial Heat->Ironic integration, and
> subsequent very detailed justification and discussion of why they may be
> useful in this spec[2].
> 
> Back in Atlanta, I had some discussions with folks interested in making
> "ready state"[3] preparation of bare-metal resources possible when
> deploying bare-metal nodes via TripleO/Heat/Ironic.
> 
> The initial assumption is that there is some discovery step (either
> automatic or static generation of a manifest of nodes) that can be input
> to either Ironic or Heat.

We've discussed this a *lot* within Ironic, and have decided that
auto-discovery (with registration) is out of scope for Ironic. In my
opinion, it's straightforward enough for operators to write small
scripts that take a CSV/JSON/whatever file and register the nodes in
that file with Ironic. This is what we've done at Rackspace, and it's
really not that annoying; the hard part is dealing with incorrect data
from the (vendor|DC team|whatever).

That said, I like the thought of Ironic having a bulk-registration
feature with some sort of specified format (I imagine this would just be
a simple JSON list of node objects).
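
To make the "small script" idea concrete, here's a rough sketch of what
that could look like against python-ironicclient. The manifest layout (a
JSON list of node objects with driver/driver_info/properties keys) is
purely illustrative -- Ironic doesn't specify any such format today --
and the credentials are placeholders:

    # bulk_register.py - hypothetical bulk registration from a JSON manifest.
    import json
    import sys

    from ironicclient import client


    def main(manifest_path):
        ironic = client.get_client(
            1,  # Ironic API version
            os_username='admin',
            os_password='secret',  # placeholder credentials
            os_tenant_name='admin',
            os_auth_url='http://keystone.example.com:5000/v2.0',
        )
        with open(manifest_path) as f:
            nodes = json.load(f)
        for node in nodes:
            # Each manifest entry carries the driver, the credentials the
            # driver needs, and the hardware properties for scheduling.
            created = ironic.node.create(
                driver=node['driver'],            # e.g. 'pxe_ipmitool'
                driver_info=node['driver_info'],  # e.g. IPMI address/user/pass
                properties=node['properties'],    # e.g. cpus, memory_mb
            )
            print('Registered node %s' % created.uuid)


    if __name__ == '__main__':
        main(sys.argv[1])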

We are likely doing a session on discovery in general in Paris. It seems
like the main topic will be about how to interface with external
inventory management systems to coordinate node discovery. Maybe Heat is
a valid tool to integrate with here, maybe not.

> Following discovery, but before an undercloud deploys OpenStack onto the
> nodes, there are a few steps which may be desired to get the hardware into
> a state where it's ready and fully optimized for the subsequent deployment:

These pieces are mostly being done downstream, and are (IMO) in scope
for Ironic in the Kilo cycle. More below.

> - Updating and aligning firmware to meet requirements of qualification or
>   site policy

Rackspace does this today as part of what we call "decommissioning".
There are patches up for review for both ironic-python-agent (IPA) [1] and
Ironic [2] itself. We have support for 1) flashing a BIOS on a node, and
2) writing a set of BIOS settings to a node (these are embedded in the
agent image as a fixed set, not passed through an Ironic API). Both are
implemented in a hardware manager plugin, and so can easily be
vendor-specific.

I expect this to land upstream in the Kilo release.
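
As a rough sketch of the plugin shape (class and method names here are
illustrative, not the actual interface in those patches), a
vendor-specific manager in IPA might look something like:

    # Hypothetical vendor hardware manager for ironic-python-agent;
    # see [1] for the real interface under review.
    from ironic_python_agent import hardware


    def _is_example_vendor_board():
        # Hypothetical probe; a real manager would inspect DMI/SMBIOS data.
        return False


    def _run_vendor_flash_tool(firmware_path):
        # Hypothetical wrapper around a vendor's CLI flashing utility.
        raise NotImplementedError


    class ExampleVendorHardwareManager(hardware.HardwareManager):
        def evaluate_hardware_support(self):
            # Claim this manager only on matching hardware; otherwise
            # the generic manager handles the node.
            if _is_example_vendor_board():
                return hardware.HardwareSupport.SERVICE_PROVIDER
            return hardware.HardwareSupport.NONE

        def flash_bios(self, firmware_path):
            # Illustrative step: the BIOS image and settings ship inside
            # the agent image, not via an Ironic API, as noted above.
            _run_vendor_flash_tool(firmware_path)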

> - Optimization of BIOS configuration to match workloads the node is
>   expected to run

The Ironic team has also discussed this, mostly at the last mid-cycle
meetup. We'll likely have a session on "capabilities", which we think
might be the best way to handle this case. Essentially, a node can be
tagged with arbitrary capabilities, e.g. "hypervisor", which Nova
(flavors?) could use for scheduling, and Ironic drivers could use to do
pre-provisioning work, like setting BIOS settings. This may even tie in
with the next point.

Looks like Jay just ninja'd me a bit on this point. :)
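
To sketch how that tagging might look (the capabilities property and
its key:value format are part of a design under discussion, not a
committed API), reusing the client setup from the registration sketch
above:

    # Hypothetical capability tagging; /properties/capabilities is a
    # proposal, not a committed API. 'ironic' is the client object from
    # the registration sketch above.
    ironic.node.update('NODE_UUID', [
        {'op': 'add',
         'path': '/properties/capabilities',
         'value': 'hypervisor:true'},
    ])

A Nova flavor could then carry a matching extra spec so the scheduler
only places hypervisor workloads on nodes tagged this way.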

> - Management of machine-local storage, e.g. configuring local RAID for
>   optimal resilience or performance.

I don't see why Ironic couldn't do something with this in Kilo. It's
dangerously close to the "inventory management" line; however, I think
it's reasonable for a user to specify that his or her root partition
should be on a RAID array or on a specific disk out of many in the node.
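
If Ironic does grow something here, I'd expect the user-facing input to
be declarative. Entirely hypothetical (there is no RAID API in Ironic
today), but something along these lines:

    # Entirely hypothetical input format; Ironic has no RAID API today.
    # A user declares the logical disks they want, and the driver (or
    # agent) figures out how to build them on the physical disks.
    desired_raid = {
        'logical_disks': [
            {'size_gb': 100, 'raid_level': '1', 'is_root_volume': True},
            {'size_gb': 'MAX', 'raid_level': '5'},
        ],
    }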

> Interfaces to Ironic are landing (or have landed)[4][5][6] which make many
> of these steps possible, but there's no easy way to either encapsulate the
> (currently mostly vendor-specific) data associated with each step, or to
> coordinate sequencing of the steps.

It's important to remember that just because a blueprint/spec exists
doesn't mean it will be approved. :) I don't expect the "DRAC
discovery" blueprint to go through, and the "DRAC RAID" blueprint is
questionable with regard to scope.

> What is required is some tool to take a text definition of the required
> configuration, turn it into a correctly sequenced series of API calls to
> Ironic, expose any data associated with those API calls, and declare
> success or failure on completion.  This is what Heat does.

This is a fair point; however, none of these use cases has code landed
in mainline Ironic, and certainly none has an exposed API, with the
exception of node registration. Is it useful to start writing plumbing
to talk to APIs that don't exist?

All that said, I don't think it's unreasonable for Heat to talk directly
to Ironic, but only if there's a valid use case that Ironic can't (or
won't) provide a solution for.
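
For the record, a minimal contrib resource along those lines might look
roughly like this. It's a sketch, not the code under review: the
property names and client wiring are illustrative, and a real resource
would use Heat's client plugin machinery rather than building a client
inline:

    # Sketch of a contrib Heat resource wrapping Ironic node
    # registration; not the code in [1] above.
    from heat.engine import properties
    from heat.engine import resource


    class IronicNode(resource.Resource):
        properties_schema = {
            'driver': properties.Schema(properties.Schema.STRING,
                                        required=True),
            'driver_info': properties.Schema(properties.Schema.MAP,
                                             default={}),
            'properties': properties.Schema(properties.Schema.MAP,
                                            default={}),
        }

        def _ironic_client(self):
            # Hypothetical helper; a real resource would get this from
            # Heat's clients machinery.
            from ironicclient import client as ironic_client
            return ironic_client.get_client(
                1,
                os_auth_token=self.context.auth_token,
                ironic_url='http://ironic.example.com:6385')

        def handle_create(self):
            node = self._ironic_client().node.create(
                driver=self.properties['driver'],
                driver_info=self.properties['driver_info'],
                properties=self.properties['properties'],
            )
            self.resource_id_set(node.uuid)

        def handle_delete(self):
            if self.resource_id is not None:
                self._ironic_client().node.delete(self.resource_id)


    def resource_mapping():
        return {'OS::Ironic::Node': IronicNode}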

// jim

[1] https://review.openstack.org/104379
[2] https://review.openstack.org/#/q/status:open+project:openstack/ironic+branch:master+topic:decom-nodes,n,z

> So the idea is to create some basic (contrib, disabled by default) Ironic
> heat resources, then explore the idea of orchestrating ready-state
> configuration via Heat.
> 
> Given that Devananda and I have been banging heads over this for some time
> now, I'd like to get broader feedback on the idea, my interpretation of
> "ready state" applied to the tripleo undercloud, and any alternative
> implementation ideas.
> 
> Thanks!
> 
> Steve
> 
> [1] https://review.openstack.org/#/c/104222/
> [2] https://review.openstack.org/#/c/120778/
> [3] http://robhirschfeld.com/2014/04/25/ready-state-infrastructure/
> [4] https://blueprints.launchpad.net/ironic/+spec/drac-management-driver
> [5] https://blueprints.launchpad.net/ironic/+spec/drac-raid-mgmt
> [6] https://blueprints.launchpad.net/ironic/+spec/drac-hw-discovery