[openstack-dev] [ironic] using ironic as a replacement for existing datacenter baremetal provisioning

Lucas Alvares Gomes lucasagomes at gmail.com
Thu Jun 9 17:17:56 UTC 2016


Hi,

Thanks for writing it down, Jim.

> So, I've been thinking about this quite a bit. We've also talked about
> doing a v2 API (as evil as that may be) in Ironic here and there. We've
> had lots of lessons learned from the v1 API, mostly that our API is
> absolutely terrible for humans. I'd love to fix that (whether that
> requires a v2 API or not is unclear, so don't focus on that).
>
> I've noticed that people keep talking about the Nova driver API
> not being public/stable/whatever in this thread - let's ignore that and
> think bigger.
>
> So, there's two large use cases for ironic that we support today:
>
> * Ironic as a backend to nova. Operators still need to interact with the
>   Ironic API for management, troubleshooting, and fixing issues that
>   computers do not handle today.
>
> * Ironic standalone - by this I mean ironic without nova. The primary
>   deployment method here is using Bifrost, and I also call it the
>   "better than cobbler" case. I'm not sure if people are using this
>   without bifrost, or with other non-nova services, today. Users in this
>   model, as I understand things, do not interact with the Ironic API
>   directly (except maybe for troubleshooting).
>
> There's other use cases I would like to support:
>
> * Ironic standalone, without Bifrost. I would love for a deployer to be
>   able to stand up Ironic as an end-user facing API, probably with
>   Keystone, maybe with Neutron/Glance/Swift if needed. This would
>   require a ton of discussion and work (e.g. ironic has no concept of
>   tenants/projects today, we might want a scheduler, a concept of an
>   instance, etc) and would be a very long road. The ideal solution to
>   this is to break out the Compute API and scheduler to be separate from
>   Nova, but that's an even longer road, so let's pretend I didn't say
>   that and not devolve this thread into that conversation (yet).
>
> * Ironic as a backend to other things. Josh pointed out kubernetes
>   somewhere, I'd love to be an official backend there. Heat today goes
>   through Nova to get an ironic instance, it seems reasonable to have
>   heat talk directly to ironic. Things like that. The amount of work
>   here might depend on the application using ironic (e.g. I think k8s
>   has it's own scheduler, heat does not, right?).
>
> So all that said, I think there is one big step we can take in the
> short-term that works for all of these use cases: make our API better.
> Make it simpler. Take a bunch of the logic in the Nova driver, and put
> it in our API instead. spawn() becomes /v1/nodes/foo/deploy or
> something, etc (I won't let us bikeshed those specifics in this thread).
> Just doing that allows us to remove a bunch of code from a number of
> places (nova, bifrost, shade, tempest(?)) and make those simpler. It
> allows direct API users to more easily deploy things, making one API
> call instead of a bunch (we could even create Neutron ports and such for
> them). It allows k8s and friends to write less code. Oh, let's also stop
> directly exposing state machine transitions as API actions, that's
> crazy, kthx.
>
> I think this is what Josh is trying to get at, except maybe with a
> separate API service in between, which doesn't sound very desirable to
> me.
>
> Thoughts on this?
>
> Additionally, in the somewhat-short term, I'd like us to try to
> enumerate the major use cases we're trying to solve, and make those use
> cases ridiculously simple to deploy. Ironic is quickly becoming a
> tangled mess of configuration options and tweaking surrounding services
> (nova, neutron) to deploy it. Once it's figured out, it works very well.
> However, it's incredibly difficult to figure out how to get there.
>
> Ultimately, I'd like someone that wants to deploy ironic in a common use
> case, with off-the-shelf hardware, to be able to get a POC up and
> running in a matter of hours, not days or weeks.
>
> Who's in? :)
>

I agree in general with the idea, but I think it needs a bit more
context. We need to remember that Ironic (ex-Nova Baremetal) was
created to fill a gap in OpenStack that the TripleO project needed
filled to get off the ground. That was the problem being solved, and
it is reflected in the ReST API: it is admin-only, not
"human-friendly" (standalone came later), etc.

> * Ironic as a backend to other things. Josh pointed out kubernetes
>   somewhere, I'd love to be an official backend there. Heat today goes
>   through Nova to get an ironic instance, it seems reasonable to have
>   heat talk directly to ironic. Things like that. The amount of work
>   here might depend on the application using ironic (e.g. I think k8s
>   has it's own scheduler, heat does not, right?).

There was an attempt to do that in Heat before, but it was refused
at the time because it didn't fit the context above [0]. That wasn't
the goal/scope of the project.

Now v1 is (almost) 3 years old, and during this time Ironic has
evolved a _lot_. It covers way more use cases than we ever imagined,
and the ReST API especially is having a hard time coping with that.

I don't think that v2 is evil. In fact, look at how many change
proposals/ideas we have in flight that don't necessarily "fit" very
well in the current API (off the top of my head): driver composition,
portgroups, trunk port representation, the "thing" to manage a
group of nodes (bulk operations), composing nodes on demand,
claiming/reserving resources...

And now there is the idea of moving some of the logic from the nova
ironic virt driver to the API. The main problem I have with this being
done in v1 is that the API will have two (or more) different ways of
doing the same/similar thing, e.g. POST v1/nodes/foo/deploy vs. PUT
{'target': 'active'} v1/nodes/foo/states/provision.
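
To make the overlap concrete, here is a rough sketch that only builds
the two requests side by side and sends nothing. The helper names are
made up for illustration, and the /deploy endpoint and its empty body
are assumptions: it is just the idea floated in this thread, not
something that exists in the API today.

```python
import json

API_PREFIX = "/v1"


def provision_state_request(node, target="active"):
    """Current v1 style: the client asks for a provision state target."""
    path = f"{API_PREFIX}/nodes/{node}/states/provision"
    return ("PUT", path, json.dumps({"target": target}))


def deploy_request(node):
    """Hypothetical action-style endpoint, as floated in this thread.

    Neither the path nor the (empty) body is an agreed design.
    """
    path = f"{API_PREFIX}/nodes/{node}/deploy"
    return ("POST", path, json.dumps({}))


if __name__ == "__main__":
    # Two different calls a client could use to ask for the same outcome.
    for method, path, body in (
        provision_state_request("foo"),
        deploy_request("foo"),
    ):
        print(method, path, body)
```

If both lived in v1, clients and docs would have to deal with both
forms for what is essentially the same operation.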

I really think we should take all these ideas, and the things we
learned in (almost) 3 years of v1's existence, put them all on the
table and start solving those problems _by design_. I even believe
that designing a v2 that addresses such problems from the beginning
will be much faster to land than trying to get them into v1. Take
driver composition as an example: it's been around for 1.5 years and
the spec is not approved yet; in fact, we have already had 3 design
sessions at 3 different summits about it.

Also, I would like to apologize for focusing on v2 in my reply even
though you explicitly said to "not focus on that". But I think that
it's the way forward.

[0] https://review.openstack.org/#/q/topic:bp/ironic-resource,n,z

Cheers,
Lucas


