[openstack-dev] [tripleo] Nodes management in our shiny new TripleO API
Dmitry Tantsur
dtantsur at redhat.com
Thu May 19 13:31:36 UTC 2016
Hi all!
We started some discussions on https://review.openstack.org/#/c/300200/
about the future of node management (registering, configuring and
introspecting) in the new API, but I think it's fairer (and more
convenient) to move it here. The goal is to fix several long-standing
design flaws that affect the logic behind tripleoclient. So fasten your
seatbelts, here it goes.
If you already understand why we need to change this logic, just scroll
down to "what do you propose?" section.
"introspection bulk start" is evil
----------------------------------
As many of you obviously know, TripleO uses the following command for
introspection:
openstack baremetal introspection bulk start
What not everyone knows, though, is that this command does not come
from the ironic-inspector project: it's part of TripleO itself. And the
ironic team has some big problems with it.
The way it works is:
1. Take all nodes in "available" state and move them to "manageable" state
2. Execute introspection for all nodes in "manageable" state
3. Move all nodes with successful introspection to "available" state.
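Spelled out by hand, those three steps boil down to roughly this (a
rough sketch only, with placeholder UUIDs; the exact invocations depend
on your python-ironicclient and python-ironic-inspector-client versions):

# 1. EVERY node currently in "available" state is moved to "manageable"
ironic node-set-provision-state NODE1_UUID manage
ironic node-set-provision-state NODE2_UUID manage
# 2. introspection is started for every node now in "manageable" state
openstack baremetal introspection start NODE1_UUID
openstack baremetal introspection start NODE2_UUID
# 3. every node whose introspection succeeded goes back to "available"
ironic node-set-provision-state NODE1_UUID provide
ironic node-set-provision-state NODE2_UUID provide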
Step 3 is pretty controversial, and step 1 is just horrible. This is not
how the ironic-inspector team designed introspection to work (hence it
refuses to run on nodes in "available" state), and that's not how the
ironic team expects the ironic state machine to be handled. To explain,
here is a brief overview of the ironic state machine.
ironic node lifecycle
---------------------
With recent versions of the bare metal API (starting with 1.11), nodes
begin their life in a state called "enroll". Nodes in this state are not
available for deployment, nor for most other actions. Ironic does not
touch such nodes in any way.
To bring nodes to life an operator uses the "manage" provisioning action
to move them to the "manageable" state. During this transition the power
and management credentials (IPMI, SSH, etc) are validated to ensure that
nodes in the "manageable" state are, well, manageable. Nodes in this
state are still not available for deployment, but an operator can
execute various pre-deployment actions on them, such as introspection,
RAID configuration, etc. To sum it up, nodes in the "manageable" state
are configured before being exposed to the cloud.
The last step before deployment is to make nodes "available" using the
"provide" provisioning action. Such nodes are exposed to nova and can be
deployed to at any moment. No long-running configuration actions should
be run in this state. The "manage" action can be used to bring nodes
back to the "manageable" state for configuration (e.g. reintrospection).
so what's the problem?
----------------------
The problem is that TripleO essentially bypasses this logic by keeping
all nodes "available" and walking them through provisioning steps
automatically. Just a couple of examples of what gets broken:
(1) Imagine I have 10 nodes in my overcloud, 10 nodes ready for
deployment (including potential autoscaling) and I want to enroll 10
more nodes.
Both introspection and ready-state operations nowadays will touch both
the 10 new nodes AND the 10 nodes which are ready for deployment,
potentially making the latter not ready for deployment any more (and
definitely moving them out of the pool for some time).
In particular, any manual configuration made by an operator before
making nodes "available" may get destroyed.
(2) TripleO has to disable automated cleaning. Automated cleaning is a
set of steps (currently only wiping the hard drive) that ironic runs
1) before nodes become available and 2) after an instance is deleted.
As the TripleO CLI constantly moves nodes back and forth to and from the
"available" state, cleaning would kick in every time. Unless it's disabled.
Disabling cleaning might sound like a sufficient workaround, until you
need it. And you actually do. Here is a real-life example of how not
having cleaning breaks you:
a. Deploy an overcloud instance
b. Delete it
c. Deploy an overcloud instance on a different hard drive
d. Boom.
As cleaning never ran, there is still a config drive on the disk used
for the first deployment. With 2 config drives present, cloud-init will
pick one of them essentially at random, breaking the deployment.
To top it all off, TripleO users tend not to use root device hints, so
the root disk may switch randomly between deployments. Have fun
debugging.
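For the record, pinning the root disk is a one-liner per node (a sketch;
which hint key to use, e.g. serial or wwn, depends on what your hardware
reports reliably):

# pin the root disk so deployments stop jumping between disks
ironic node-update NODE_UUID add properties/root_device='{"serial": "<disk-serial>"}'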
what do you propose?
--------------------
I would like the new TripleO mistral workflows to start following the
ironic state machine more closely. Imagine the following workflows:
1. register: take JSON, create nodes in the "manageable" state. I do
believe we can automate the enroll->manageable transition, as it serves
the purpose of validation (and discovery, but let's put that aside).
2. provide: take a list of nodes or all "manageable" nodes and move them
to "available". By using this workflow an operator will make a
*conscious* decision to add some nodes to the cloud.
3. introspect: take a list of "manageable" (!!!) nodes or all
"manageable" nodes and move them through introspection. This is an
optional step between "register" and "provide".
4. set_node_state: a helper workflow to move nodes between states. The
"provide" workflow is essentially set_node_state with verb=provide, but
is separate due to its high importance in the node lifecycle.
5. configure: given a couple of parameters (deploy image, local boot
flag, etc), update the given nodes (or all "manageable" nodes) with them.
Essentially the only addition here is the "provide" action which I hope
you already realize should be an explicit step.
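On the Mistral side I imagine these being triggered through the usual
workflow execution call; the workflow names and inputs below are just
placeholders to illustrate the granularity, not a naming proposal:

openstack workflow execution create tripleo.baremetal.introspect \
    '{"node_uuids": ["NODE1_UUID", "NODE2_UUID"]}'
openstack workflow execution create tripleo.baremetal.provide \
    '{"node_uuids": ["NODE1_UUID", "NODE2_UUID"]}'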
what about tripleoclient?
-------------------------
Of course we want to keep backward compatibility. The existing commands
openstack baremetal import
openstack baremetal configure boot
openstack baremetal introspection bulk start
will use combinations of the workflows above and will be deprecated.
The new commands (which also avoid hijacking the bare metal namespace)
will be provided to strictly match the workflows (especially in terms of
the state machine):
openstack overcloud node import
openstack overcloud node configure
openstack overcloud node introspect
openstack overcloud node provide
(I have good reasons behind each of these names, but if I put them here
this mail will be way too long).
Now to save a user some typing:
1. the configure command will be optional, as the import command will
set the defaults
2. the introspect command will get --provide flag
3. the import command will get --introspect and --provide flags
So the simplest flow for people will be:
openstack overcloud node import --provide instackenv.json
This command will use 2 workflows and will result in a bunch of
"available" nodes, essentially making it a synonym of the existing
"openstack baremetal import" command.
With introspection it becomes:
openstack overcloud node import --introspect --provide instackenv.json
This command will use 3 workflows and will result in nodes that are both
introspected and "available".
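For completeness, the fully explicit spelling of the same flow, passing
nodes by UUID as the workflows above allow (a sketch of the proposed
commands, with placeholder UUIDs):

openstack overcloud node import instackenv.json
openstack overcloud node introspect NODE1_UUID NODE2_UUID
openstack overcloud node provide NODE1_UUID NODE2_UUID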
Thanks for reading such a long email (ping me on IRC if you actually
read it all the way through, just for statistics). I hope it makes sense
to you.
Dmitry.