[openstack-dev] [tripleo] Nodes management in our shiny new TripleO API

Dmitry Tantsur dtantsur at redhat.com
Mon May 23 09:50:25 UTC 2016


On 05/21/2016 08:35 PM, Dan Prince wrote:
> On Fri, 2016-05-20 at 14:06 +0200, Dmitry Tantsur wrote:
>> On 05/20/2016 01:44 PM, Dan Prince wrote:
>>>
>>> On Thu, 2016-05-19 at 15:31 +0200, Dmitry Tantsur wrote:
>>>>
>>>> Hi all!
>>>>
>>>> We started some discussions on https://review.openstack.org/#/c/300200/
>>>> about the future of node management (registering, configuring and
>>>> introspecting) in the new API, but I think it's more fair (and
>>>> convenient) to move it here. The goal is to fix several long-standing
>>>> design flaws that affect the logic behind tripleoclient. So fasten
>>>> your seatbelts, here it goes.
>>>>
>>>> If you already understand why we need to change this logic, just
>>>> scroll down to the "what do you propose?" section.
>>>>
>>>> "introspection bulk start" is evil
>>>> ----------------------------------
>>>>
>>>> As many of you obviously know, TripleO used the following command
>>>> for introspection:
>>>>
>>>>   openstack baremetal introspection bulk start
>>>>
>>>> As not everyone knows, though, this command does not come from the
>>>> ironic-inspector project; it's part of TripleO itself. And the ironic
>>>> team has some big problems with it.
>>>>
>>>> The way it works is:
>>>>
>>>> 1. Take all nodes in "available" state and move them to "manageable"
>>>> state.
>>>> 2. Execute introspection for all nodes in "manageable" state.
>>>> 3. Move all nodes with successful introspection to "available" state.
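
For those less familiar with the plumbing, the three steps above boil down
to roughly the following per-node CLI calls (a sketch only, not the actual
implementation; <uuid> is a placeholder):

  # 1. move an "available" node to "manageable"
  ironic node-set-provision-state <uuid> manage
  # 2. run introspection and poll for its result
  openstack baremetal introspection start <uuid>
  openstack baremetal introspection status <uuid>
  # 3. on success, move the node back to "available"
  ironic node-set-provision-state <uuid> provide
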
>>>>
>>>> Step 3 is pretty controversial, and step 1 is just horrible. This is
>>>> not how the ironic-inspector team designed introspection to work
>>>> (hence it refuses to run on nodes in "available" state), and it's not
>>>> how the ironic team expects the ironic state machine to be handled.
>>>> To explain, I'll give a brief overview of the ironic state machine.
>>>>
>>>> ironic node lifecycle
>>>> ---------------------
>>>>
>>>> With recent versions of the bare metal API (starting with 1.11),
>>>> nodes begin their life in a state called "enroll". Nodes in this
>>>> state are not available for deployment, nor for most other actions.
>>>> Ironic does not touch such nodes in any way.
>>>>
>>>> To bring nodes to life, an operator uses the "manage" provisioning
>>>> action to move them to "manageable" state. During this transition the
>>>> power and management credentials (IPMI, SSH, etc) are validated to
>>>> ensure that nodes in "manageable" state are, well, manageable. This
>>>> state is still not available for deployment. With nodes in this state
>>>> an operator can execute various pre-deployment actions, such as
>>>> introspection, RAID configuration, etc. So to sum it up, nodes in
>>>> "manageable" state are being configured before being exposed to the
>>>> cloud.
>>>>
>>>> The last step before deployment is to make nodes "available" using
>>>> the "provide" provisioning action. Such nodes are exposed to nova and
>>>> can be deployed to at any moment. No long-running configuration
>>>> actions should be run in this state. The "manage" action can be used
>>>> to bring nodes back to "manageable" state for configuration (e.g.
>>>> reintrospection).
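
(Side note for anyone following along: these transitions can also be
driven by hand with the plain ironic CLI. A rough sketch, with <uuid> as a
placeholder:

  ironic node-set-provision-state <uuid> manage   # enroll -> manageable
  ironic node-set-provision-state <uuid> provide  # manageable -> available
  ironic node-set-provision-state <uuid> manage   # back, e.g. for reintrospection

Cleaning, when enabled, runs during the "provide" transition.)
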
>>>>
>>>> so what's the problem?
>>>> ----------------------
>>>>
>>>> The problem is that TripleO essentially bypasses this logic by
>>>> keeping all nodes "available" and walking them through provisioning
>>>> steps automatically. Just a couple of examples of what gets broken:
>>>>
>>>> (1) Imagine I have 10 nodes in my overcloud, 10 nodes ready for
>>>> deployment (including potential autoscaling), and I want to enroll 10
>>>> more nodes.
>>>>
>>>> Both introspection and ready-state operations nowadays will touch the
>>>> 10 new nodes AND the 10 nodes which are ready for deployment,
>>>> potentially making the latter not ready for deployment any more (and
>>>> definitely moving them out of the pool for some time).
>>>>
>>>> In particular, any manual configuration made by an operator before
>>>> making nodes "available" may get destroyed.
>>>>
>>>> (2) TripleO has to disable automated cleaning. Automated cleaning is
>>>> a set of steps (currently only wiping the hard drive) that happen in
>>>> ironic 1) before nodes become available, 2) after an instance is
>>>> deleted. As the TripleO CLI constantly moves nodes back and forth to
>>>> and from the "available" state, cleaning kicks in every time. Unless
>>>> it's disabled.
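
For reference, cleaning on the undercloud is toggled with a single
ironic.conf option. A rough sketch only; the option and service names
below are from memory, so double-check them before relying on this:

  # enable automated cleaning and restart the conductor
  sudo crudini --set /etc/ironic/ironic.conf conductor automated_clean true
  sudo systemctl restart openstack-ironic-conductor
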
>>>>
>>>> Disabling cleaning might sound like a sufficient workaround, until
>>>> you need it. And you actually do. Here is a real-life example of how
>>>> not having cleaning can break you:
>>>>
>>>> a. Deploy an overcloud instance
>>>> b. Delete it
>>>> c. Deploy an overcloud instance on a different hard drive
>>>> d. Boom.
>>> This sounds like an Ironic bug to me. Cleaning (wiping a disk) and
>>> removing state that would break subsequent installations on a
>>> different drive are different things. In TripleO I think the reason we
>>> disable cleaning is largely because of the extra time it takes and the
>>> fact that our baremetal cloud isn't multi-tenant (currently at least).
>> We fix this "bug" by introducing cleaning. This is the process that
>> guarantees each deployment starts with a clean environment. It's hard
>> to know which leftover data can cause which problem (e.g. what about a
>> remaining UEFI partition? any remnants of Ceph? I don't know).
>>
>>>
>>>
>>>>
>>>>
>>>> As we didn't go through cleaning, there is still a config drive on
>>>> the disk used in the first deployment. With two config drives
>>>> present, cloud-init will pick one at random, breaking the deployment.
>>> TripleO isn't using config drives, is it? Until Nova supports config
>>> drives via Ironic I think we are blocked on using them.
>> TripleO does use config drives (btw I'm telling you about a real bug
>> here, not something I made up). Nova does support Ironic config drives;
>> it does not support (and does not want to support) injecting arbitrary
>> data from an Ironic node into them (we wanted to pass data from
>> introspection to the node).
>>
>>>
>>>
>>>>
>>>>
>>>> To top it all off, TripleO users tend not to use root device hints,
>>>> so the root disk may switch randomly between deployments. Have fun
>>>> debugging.
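
For anyone who wants to avoid that particular trap: a root device hint is
just a node property. A rough example with the old ironic CLI (the WWN
value is obviously made up):

  ironic node-update <uuid> add properties/root_device='{"wwn": "0x4000cca77fc4dba1"}'
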
>>>>
>>>> what do you propose?
>>>> --------------------
>>>>
>>>> I would like the new TripleO mistral workflows to start following the
>>>> ironic state machine more closely. Imagine the following workflows:
>>>>
>>>> 1. register: take JSON, create nodes in "manageable" state. I do
>>>> believe we can automate the enroll->manageable transition, as it
>>>> serves the purpose of validation (and discovery, but let's put that
>>>> aside).
>>>>
>>>> 2. provide: take a list of nodes, or all "manageable" nodes, and move
>>>> them to "available". By using this workflow an operator makes a
>>>> *conscious* decision to add some nodes to the cloud.
>>>>
>>>> 3. introspect: take a list of "manageable" (!!!) nodes, or all
>>>> "manageable" nodes, and move them through introspection. This is an
>>>> optional step between "register" and "provide".
>>>>
>>>> 4. set_node_state: a helper workflow to move nodes between states.
>>>> The "provide" workflow is essentially set_node_state with
>>>> verb=provide, but is separate due to its high importance in the node
>>>> lifecycle.
>>>>
>>>> 5. configure: given a couple of parameters (deploy image, local boot
>>>> flag, etc), update the given nodes, or all "manageable" nodes, with
>>>> them.
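
To make the shape of these workflows a bit more concrete, kicking off e.g.
"provide" from the Mistral CLI could look roughly like this (the workflow
name and input here are purely illustrative; the real definitions live in
the workbook under review):

  openstack workflow execution create tripleo.baremetal.provide \
    '{"node_uuids": ["<uuid-1>", "<uuid-2>"]}'
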
>>> I like how you've split things up into the above workflows.
>>> Furthermore, I think we'll actually be able to accomplish most, if not
>>> all, of it by using pure Mistral workflows (very few custom actions
>>> involved).
>>>
>>> One refinement I might suggest: for the workflows that take a list of
>>> UUIDs *or* search for a type of node, we could split them into two
>>> workflows, one of which calls the other.
>> Good idea.
>
> Ryan Brady and I spent some time yesterday implementing the suggested
> workflows (all except for the 'configure' one, I think, which could come
> later).

Fantastic, thank you! Let's continue in Gerrit now.

>
> How does this one look:
>
> https://review.openstack.org/#/c/300200/4/workbooks/baremetal.yaml
>
> We've got some python-tripleoclient patches coming soon too which use
> this updated workflow to do the node registration bits.
>
> Dan
>
>>
>>>
>>>
>>> For example, a 'provide_managed_nodes' workflow would call into the
>>> 'provide' workflow, which takes a list of UUIDs? I think this gives us
>>> the same features we need and exposes the required input parameters
>>> more cleanly to the end user.
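
Right: a thin wrapper that needs no input at all, finds the nodes itself
and feeds their UUIDs into the inner 'provide' workflow (sketched above).
Invoking it would then be as simple as, roughly (again, the workflow name
is illustrative):

  openstack workflow execution create tripleo.baremetal.provide_managed_nodes
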
>>>
>>> So long as we can do the above and still keep the existing
>>> python-tripleoclient calls backwards compatible, I think we should be
>>> in good shape.
>> Awesome!
>>
>>>
>>>
>>> Dan
>>>
>>>
>>>>
>>>>
>>>> Essentially the only addition here is the "provide" action, which I
>>>> hope you already realize should be an explicit step.
>>>>
>>>> what about tripleoclient
>>>> ------------------------
>>>>
>>>> Of course we want to keep backward compatibility. The existing
>>>> commands
>>>>
>>>>   openstack baremetal import
>>>>   openstack baremetal configure boot
>>>>   openstack baremetal introspection bulk start
>>>>
>>>> will use some combination of the workflows above and will be
>>>> deprecated.
>>>>
>>>> The new commands (which also avoid hijacking the bare metal
>>>> namespace) will be provided strictly matching the workflows
>>>> (especially in terms of the state machine):
>>>>
>>>>   openstack overcloud node import
>>>>   openstack overcloud node configure
>>>>   openstack overcloud node introspect
>>>>   openstack overcloud node provide
>>>>
>>>> (I have good reasoning behind each of these names, but if I put it
>>>> here this mail will be way too long).
>>>>
>>>> Now, to save a user some typing:
>>>> 1. the configure command will be optional, as the import command will
>>>> set the defaults
>>>> 2. the introspect command will get a --provide flag
>>>> 3. the import command will get --introspect and --provide flags
>>>>
>>>> So the simplest flow for people will be:
>>>>
>>>>   openstack overcloud node import --provide instackenv.json
>>>>
>>>> This command will use 2 workflows and will result in a bunch of
>>>> "available" nodes, essentially making it a synonym of the "baremetal
>>>> import" command.
>>>>
>>>> With introspection it becomes:
>>>>
>>>>   openstack overcloud node import --introspect --provide instackenv.json
>>>>
>>>> This command will use 3 workflows and will result in "available" and
>>>> introspected nodes.
>>>>
>>>>
>>>> Thanks for reading such a long email (ping me on IRC if you actually
>>>> read it through, just for statistics). I hope it makes sense to you.
>>>>
>>>> Dmitry.
>>>>