[openstack-dev] [tripleo] Nodes management in our shiny new TripleO API

Dmitry Tantsur dtantsur at redhat.com
Fri May 20 13:59:39 UTC 2016

On 05/20/2016 03:42 PM, John Trowbridge wrote:
> On 05/19/2016 09:31 AM, Dmitry Tantsur wrote:
>> Hi all!
>> We started some discussions on https://review.openstack.org/#/c/300200/
>> about the future of node management (registering, configuring and
>> introspecting) in the new API, but I think it's fairer (and more
>> convenient) to move it here. The goal is to fix several long-standing
>> design flaws that affect the logic behind tripleoclient. So fasten your
>> seatbelts, here it goes.
>> If you already understand why we need to change this logic, just scroll
>> down to the "what do you propose?" section.
>> "introspection bulk start" is evil
>> ----------------------------------
>> As many of you obviously know, TripleO used the following command for
>> introspection:
>>  openstack baremetal introspection bulk start
>> Not everyone knows, though, that this command does not come from the
>> ironic-inspector project; it's part of TripleO itself. And the ironic
>> team has some big problems with it.
>> The way it works is
>> 1. Take all nodes in "available" state and move them to "manageable" state
>> 2. Execute introspection for all nodes in "manageable" state
>> 3. Move all nodes with successful introspection to "available" state.
>> Step 3 is pretty controversial, step 1 is just horrible. This is not how
>> the ironic-inspector team designed introspection to work (hence it
>> refuses to run on nodes in "available" state), and that's not how the
>> ironic team expects the ironic state machine to be handled. To explain
>> why, I'll give a brief overview of the ironic state machine.
>> ironic node lifecycle
>> ---------------------
>> With recent versions of the bare metal API (starting with 1.11), nodes
>> begin their life in a state called "enroll". Nodes in this state are not
>> available for deployment, nor for most other actions. Ironic does not
>> touch such nodes in any way.
>> To bring nodes to life, an operator uses the "manage" provisioning action
>> to move them to the "manageable" state. During this transition the power and
>> management credentials (IPMI, SSH, etc) are validated to ensure that
>> nodes in "manageable" state are, well, manageable. This state is still
>> not available for deployment. With nodes in this state an operator can
>> execute various pre-deployment actions, such as introspection, RAID
>> configuration, etc. So to sum it up, nodes in "manageable" state are
>> being configured before exposing them into the cloud.
>> The last step before deployment is to make nodes "available" using
>> the "provide" provisioning action. Such nodes are exposed to nova, and
>> can be deployed to at any moment. No long-running configuration actions
>> should be run in this state. The "manage" action can be used to bring
>> nodes back to "manageable" state for configuration (e.g. reintrospection).
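
A minimal sketch of the lifecycle described above, written as a lookup table of (state, action) pairs. It covers only the three states and two actions named in this mail; real ironic has more states (cleaning, deploying and so on), which are left out on purpose. The function names are made up for illustration, and limiting introspection to "manageable" is a simplification of this sketch.

```python
# Simplified model of the provisioning states discussed in this thread.
TRANSITIONS = {
    ("enroll", "manage"): "manageable",      # credentials are validated here
    ("manageable", "provide"): "available",  # node becomes visible to nova
    ("available", "manage"): "manageable",   # back for reconfiguration
}


def apply_action(state, action):
    """Return the state a node ends up in after a provisioning action."""
    try:
        return TRANSITIONS[(state, action)]
    except KeyError:
        raise ValueError(
            "action %r is not allowed in state %r" % (action, state))


def can_introspect(state):
    """Sketch-level rule: pre-deployment actions need a 'manageable' node."""
    return state == "manageable"


state = apply_action("enroll", "manage")
print(state, can_introspect(state))    # manageable True
state = apply_action(state, "provide")
print(state, can_introspect(state))    # available False
```
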
>> so what's the problem?
>> ----------------------
>> The problem is that TripleO essentially bypasses this logic by keeping
>> all nodes "available" and walking them through provisioning steps
>> automatically. Just a couple of examples of what gets broken:
>> (1) Imagine I have 10 nodes in my overcloud, 10 nodes ready for
>> deployment (including potential autoscaling) and I want to enroll 10
>> more nodes.
>> Both introspection and ready-state operations nowadays will touch both
>> the 10 new nodes AND the 10 nodes which are ready for deployment,
>> potentially making the latter not ready for deployment any more (and
>> definitely moving them out of the pool for some time).
>> Particularly, any manual configuration made by an operator before making
>> nodes "available" may get destroyed.
>> (2) TripleO has to disable automated cleaning. Automated cleaning is a
>> set of steps (currently only wiping the hard drive) that happens in
>> ironic 1) before nodes are available, 2) after an instance is deleted.
>> As TripleO CLI constantly moves nodes back-and-forth from and to
>> "available" state, cleaning kicks in every time. Unless it's disabled.
>> Disabling cleaning might sound like a sufficient workaround, until you
>> need it. And you actually do. Here is a real-life example of how things
>> break without cleaning:
>> a. Deploy an overcloud instance
>> b. Delete it
>> c. Deploy an overcloud instance on a different hard drive
>> d. Boom.
>> As the node didn't go through cleaning, there is still a config drive on
>> the disk used in the first deployment. With 2 config drives present,
>> cloud-init will pick a random one, breaking the deployment.
>> To top it all, TripleO users tend to not use root device hints, so
>> switching root disks may happen randomly between deployments. Have fun
>> debugging.
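
To make example (1) concrete, here is a rough simulation of the three steps of "introspection bulk start" on the 10 + 10 + 10 inventory. It is only a model of the behaviour described in this mail, not the actual tripleoclient code. The node names are invented, deployed nodes are assumed to be in ironic's "active" state, and the 10 new nodes are assumed to be "available" as well, since TripleO currently keeps all nodes in that state.

```python
def bulk_start(nodes):
    """Model 'introspection bulk start'; returns the names of touched nodes.

    `nodes` maps a node name to its provision state and is updated in place.
    """
    # Step 1: every "available" node is moved to "manageable".
    for name, state in nodes.items():
        if state == "available":
            nodes[name] = "manageable"
    # Step 2: introspection runs on everything that is now "manageable".
    touched = {name for name, state in nodes.items() if state == "manageable"}
    # Step 3: introspected nodes go back to "available"
    # (this sketch assumes every introspection succeeds).
    for name in touched:
        nodes[name] = "available"
    return touched


nodes = {}
nodes.update({"deployed-%d" % i: "active" for i in range(10)})
nodes.update({"ready-%d" % i: "available" for i in range(10)})
nodes.update({"new-%d" % i: "available" for i in range(10)})

touched = bulk_start(nodes)
print(len(touched))            # 20
print("ready-0" in touched)    # True: a node that was ready got pulled out
```

The command has no way to tell the ready nodes from the new ones, so both groups are taken out of the pool while introspection runs.
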
>> what do you propose?
>> --------------------
>> I would like the new TripleO mistral workflows to follow the ironic
>> state machine more closely. Imagine the following workflows:
>> 1. register: take JSON, create nodes in "manageable" state. I do believe
>> we can automate the enroll->manageable transition, as it serves the
>> purpose of validation (and discovery, but let's put it aside).
>> 2. provide: take a list of nodes or all "manageable" nodes and move them
>> to "available". By using this workflow an operator will make a
>> *conscious* decision to add some nodes to the cloud.
>> 3. introspect: take a list of "manageable" (!!!) nodes or all
>> "manageable" nodes and move them through introspection. This is an
>> optional step between "register" and "provide".
>> 4. set_node_state: a helper workflow to move nodes between states. The
>> "provide" workflow is essentially set_node_state with verb=provide, but
>> is separate due to its high importance in the node lifecycle.
>> 5. configure: given a couple of parameters (deploy image, local boot
>> flag, etc), update given or all "manageable" nodes with them.
>> Essentially the only addition here is the "provide" action which I hope
>> you already realize should be an explicit step.
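
A sketch of how the register, introspect and provide workflows above could behave, using an in-memory stand-in for ironic. The class and method names are hypothetical and this is not mistral workflow code. It only illustrates the rule that introspect and provide accept "manageable" nodes and nothing else.

```python
class Inventory:
    """In-memory stand-in for ironic; all names here are hypothetical."""

    def __init__(self):
        self.nodes = {}  # node name -> provision state

    def _select(self, names, state):
        # Default to every node in `state`; reject explicit nodes that are not.
        if names is None:
            return [n for n, s in self.nodes.items() if s == state]
        wrong = [n for n in names if self.nodes[n] != state]
        if wrong:
            raise ValueError("not in %r state: %s" % (state, ", ".join(wrong)))
        return list(names)

    def register(self, names):
        # Workflow 1: the enroll -> manageable transition is automated.
        for name in names:
            self.nodes[name] = "manageable"

    def introspect(self, names=None):
        # Workflow 3: "manageable" nodes only; their state does not change.
        return self._select(names, "manageable")

    def provide(self, names=None):
        # Workflow 2: the explicit step that exposes nodes to the cloud.
        selected = self._select(names, "manageable")
        for name in selected:
            self.nodes[name] = "available"
        return selected


inv = Inventory()
inv.register(["a", "b"])
inv.provide(["a"])
inv.register(["c"])
print(inv.introspect())   # ['b', 'c'], node "a" stays untouched
```
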
>> what about tripleoclient
>> ------------------------
>> Of course we want to keep backward compatibility. The existing commands
>>  openstack baremetal import
>>  openstack baremetal configure boot
>>  openstack baremetal introspection bulk start
>> will use some combinations of the workflows above and will be deprecated.
>> New commands (which also avoid hijacking the bare metal namespace) will
>> be provided, strictly matching the workflows (especially in terms of
>> the state machine):
>>  openstack overcloud node import
>>  openstack overcloud node configure
>>  openstack overcloud node introspect
>>  openstack overcloud node provide
>> (I have good reasoning behind each of these names, but if I put it
>> here this mail will be way too long).
>> Now to save a user some typing:
>> 1. the configure command will be optional, as the import command will
>> set the defaults
>> 2. the introspect command will get a --provide flag
>> 3. the import command will get --introspect and --provide flags
>> So the simplest flow for people will be:
>>  openstack overcloud node import --provide instackenv.json
>> this command will use 2 workflows and will result in a bunch of
>> "available" nodes, essentially making it a synonym of the "baremetal
>> import" command.
>> With introspection it becomes:
>>  openstack overcloud node import --introspect --provide instackenv.json
>> this command will use 3 workflows and will result in "available" and
>> introspected nodes.
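
The mapping from import flags to workflows can be sketched as a small function. The function name and the workflow labels are illustrative only; the order follows the "register, optionally introspect, then provide" sequence described above.

```python
def workflows_for_import(introspect=False, provide=False):
    """Return the ordered workflow names one import invocation would run."""
    steps = ["register"]
    if introspect:
        steps.append("introspect")  # optional, between register and provide
    if provide:
        steps.append("provide")
    return steps


print(workflows_for_import(provide=True))
# ['register', 'provide']
print(workflows_for_import(introspect=True, provide=True))
# ['register', 'introspect', 'provide']
```
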
> Thanks for the very detailed write-up. I even learned some new reasons
> to not like the introspection bulk command.
> I like the proposed breakdown of steps. In the workflow section you talk
> about single (arbitrary list too?) nodes and all nodes of a certain
> state, but in the client section it seems all commands will act on the
> latter? Is that the intention, or would we also want the client actions
> to support single nodes (or arbitrary lists of nodes)? A bit of an
> implementation detail, but just wanted to check what you were thinking
> in that regard.

I was thinking about

  openstack overcloud node introspect

and

  openstack overcloud node introspect --nodes uuid1 uuid2

(or something like that). I forgot to put it in the email.
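
A rough sketch of how such a command line could be parsed, using plain argparse for brevity. The --nodes option is still tentative ("or something like that"), the helper names are hypothetical, and the real client would be built on its own command framework rather than a bare ArgumentParser.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser(prog="openstack overcloud node introspect")
    # Tentative option: explicit node UUIDs instead of "all manageable nodes".
    parser.add_argument("--nodes", nargs="+", metavar="UUID", default=None)
    # Proposed shortcut: run the "provide" workflow after introspection.
    parser.add_argument("--provide", action="store_true")
    return parser


def select_nodes(requested, manageable):
    """Pick the nodes to introspect; with no --nodes, take all manageable."""
    if requested is None:
        return list(manageable)
    rejected = [uuid for uuid in requested if uuid not in manageable]
    if rejected:
        raise ValueError("not manageable: %s" % ", ".join(rejected))
    return list(requested)


args = build_parser().parse_args(["--nodes", "uuid1", "uuid2"])
print(select_nodes(args.nodes, ["uuid1", "uuid2", "uuid3"]))
# ['uuid1', 'uuid2']
```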

>> Thanks for reading such a long email (ping me on IRC if you actually
>> read it through just for statistics). I hope it makes sense for you.
>> Dmitry.
>> __________________________________________________________________________
>> OpenStack Development Mailing List (not for usage questions)
>> Unsubscribe: OpenStack-dev-request at lists.openstack.org?subject:unsubscribe
>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev