[openstack-dev] [Openstack-operators] [ironic] [nova] [tripleo] Deprecation of Nova's integration with Ironic Capabilities and ComputeCapabilitiesFilter
Julia Kreger
juliaashleykreger at gmail.com
Tue Oct 2 00:31:34 UTC 2018
On Mon, Oct 1, 2018 at 3:37 PM Jay Pipes <jaypipes at gmail.com> wrote:
> On 10/01/2018 06:04 PM, Julia Kreger wrote:
> > On Mon, Oct 1, 2018 at 2:41 PM Eric Fried <openstack at fried.cc> wrote:
> >
> >
> > > So say the user requests a node that supports UEFI because their
> > > image needs UEFI. Which workflow would you want here?
> > >
> > > 1) The operator (or ironic?) has already configured the node to
> > > boot in UEFI mode. Only pre-configured nodes advertise the
> > > "supports UEFI" trait.
> > >
> > > 2) Any node that supports UEFI mode advertises the trait. Ironic
> > > ensures that UEFI mode is enabled before provisioning the machine.
> > >
> > > I imagine doing #2 by passing the traits which were specifically
> > > requested by the user, from Nova to Ironic, so that Ironic can do
> > > the right thing for the user.
> > >
> > > Your proposal suggests that the user request the "supports UEFI"
> > > trait, and *also* pass some glance UUID which the user understands
> > > will make sure the node actually boots in UEFI mode. Something like:
> > >
> > > openstack server create --flavor METAL_12CPU_128G --trait SUPPORTS_UEFI
> > > --config-data $TURN_ON_UEFI_UUID
> > >
> > > Note that I pass --trait because I hope that will one day be
> > > supported and we can slow down the flavor explosion.
> >
> > IMO --trait would be making things worse (but see below). I think
> > UEFI with Jay's model would be more like:
> >
> > openstack server create --flavor METAL_12CPU_128G --config-data $UEFI
> >
> > where the UEFI profile would be pretty trivial, consisting of
> > placement.traits.required = ["BOOT_MODE_UEFI"] and object.boot_mode =
> > "uefi".
> >
> > I agree that this seems kind of heavy, and that it would be nice to
> > be able to say "boot mode is UEFI" just once. OTOH I get Jay's point
> > that we need to separate the placement decision from the instance
> > configuration.
> >
> > That said, what if it was:
> >
> > openstack config-profile create --name BOOT_MODE_UEFI --json -
> > {
> >   "type": "boot_mode_scheme",
> >   "version": 123,
> >   "object": {
> >     "boot_mode": "uefi"
> >   },
> >   "placement": {
> >     "traits": {
> >       "required": [
> >         "BOOT_MODE_UEFI"
> >       ]
> >     }
> >   }
> > }
> > ^D
> >
> > And now you could in fact say
> >
> > openstack server create --flavor foo --config-profile BOOT_MODE_UEFI
> >
> > using the profile name, which happens to be the same as the trait
> > name because you made it so. Does that satisfy the yen for saying it
> > once? (I mean, despite the fact that you first had to say it three
> > times to get it set up.)
> >
> > ========
> >
> > I do want to zoom out a bit and point out that we're talking about
> > implementing a new framework of substantial size and impact when the
> > original proposal - using the trait for both - would just work out of
> > the box today with no changes in either API. Is it really worth it?
> >
> >
> > +1000. Reading both of these threads, it feels like we're basically
> > trying to make something perfect. I think that is a fine goal, except it
> > is unrealistic because the enemy of good is perfection.
> >
> > ========
> >
> > By the way, with Jim's --trait suggestion, this:
> >
> > > ...dozens of flavors that look like this:
> > > - 12CPU_128G_RAID10_DRIVE_LAYOUT_X
> > > - 12CPU_128G_RAID5_DRIVE_LAYOUT_X
> > > - 12CPU_128G_RAID01_DRIVE_LAYOUT_X
> > > - 12CPU_128G_RAID10_DRIVE_LAYOUT_Y
> > > - 12CPU_128G_RAID5_DRIVE_LAYOUT_Y
> > > - 12CPU_128G_RAID01_DRIVE_LAYOUT_Y
> >
> > ...could actually become:
> >
> > openstack server create --flavor 12CPU_128G --trait $WHICH_RAID
> > --trait $WHICH_LAYOUT
> >
> > No flavor explosion.
> >
> >
> > ++ I believe this was where this discussion kind of ended up in..
> > ?Dublin?
> >
> > The desire and discussion that led us into complex configuration
> > templates and profiles being submitted were for highly complex scenarios
> > where users wanted to assert detailed raid configurations to disk.
> > Naturally, there are many issues there. The ability to provide such
> > detail would be awesome for those 10% of operators that need such
> > functionality. Of course, if that is the only path forward, then we
> > delay the 90% from getting the minimum viable feature they need.
> >
> >
> > (Maybe if we called it something other than --trait, like maybe
> > --config-option, it would let us pretend we're not really
> > overloading a trait to do config - it's just a coincidence that the
> > config option has the same name as the trait it causes to be
> > required.)
> >
> >
> > I feel like it might be confusing, but totally +1 to matching required
> > trait name being a thing. That way scheduling is completely decoupled
> > and if everything was correct then the request should already be
> > scheduled properly.
>
> I guess I'll just drop the idea of doing this properly then. It's true
> that the placement traits concept can be hacked up and the virt driver
> can just pass a list of trait strings to the Ironic API and that's the
> most expedient way to get what the 90% of people apparently want. It's
> also true that it will add a bunch of unmaintainable tribal knowledge
> into the interface between Nova and Ironic, but that has been the case
> for multiple years.
>
A few things need to be unpacked in this statement. But first, please
don't stop. You're bringing a different perspective, and we need to find
common ground. My HUGE issue right now is just how frustrated people are at
this moment, when this feels like the third time we've looked at pivoting on
this. Does that mean we should stop? Absolutely not!

In terms of "hacked up": this is the way the virt driver behaves today[1]. In
a sense, that contract has already been established. We absolutely have to
be aware of what the scheduling was in order to turn off UEFI. Additionally,
there is logic[2] to prevent deployment in case something goes wrong and a
trait on the requested instance does not match what the baremetal node
offers as a trait. So it seems reasonable to continue to do so regardless.
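
To make the kind of guard in [2] concrete, here is a minimal Python sketch.
It is not the Ironic code itself: the names check_requested_traits and
TraitMismatch are hypothetical, and it assumes the requested traits and the
node's traits are both plain collections of strings.

```python
class TraitMismatch(Exception):
    """Hypothetical error: a requested trait is not offered by the node."""


def check_requested_traits(requested, node_traits):
    """Raise TraitMismatch if any requested trait is missing from the node.

    `requested` holds the trait names on the instance request and
    `node_traits` the trait names the baremetal node advertises.
    Both parameter names are assumptions made for this sketch.
    """
    missing = sorted(set(requested) - set(node_traits))
    if missing:
        raise TraitMismatch(
            "node does not offer traits: " + ", ".join(missing))


# Passes silently: the node offers everything that was requested.
check_requested_traits(
    ["BOOT_MODE_UEFI"], {"BOOT_MODE_UEFI", "STORAGE_HARDWARE_RAID"})
```

A request carrying a trait the node lacks would raise instead, which is the
"prevent deployment" behaviour described above.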

I don't agree it is unmaintainable, but we do fall down on documenting the
interaction for a whole slew of reasons. If we want to discuss that, we
should get lots of tea. :)
I'm totally open to improving that non-existent nova/ironic interaction
documentation, and even open to improving the interaction moving forward,
but I would only ask that we take one deliberate step at a time.
[1] http://git.openstack.org/cgit/openstack/nova/tree/nova/virt/ironic/patcher.py?h=stable/rocky#n118
[2] http://git.openstack.org/cgit/openstack/ironic/tree/ironic/conductor/utils.py?h=stable/rocky#n935
>
> The flavor explosion problem will continue to get worse for those of us
> who deal with its pain (Oath in particular feels this) because the
> interface between nova flavors and Ironic instance capabilities will
> continue to be super-tightly-coupled.
>
So what would alleviate some of that explosion pain? If something like
"--required-trait" or even "--trait" is a possibility, which is where I
thought things were going to go in nova based on past discussions... that
seems like it would help tremendously. Granted, I'm not looking at that
list of flavors, but there must surely be some common patterns that we can
begin to identify. If it is purely raid, then let's
build a mechanism to help drive that forward first, since we already have a
field in our API for it. Of course, with all of this back and forth, I'm
getting the feeling "--required-trait HW_CPU_X86_VMX" or "--trait
HW_CPU_X86_VMX" will just not happen because of the impasse that exists.
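
For a rough sense of the arithmetic, here is a small Python sketch using the
six flavor names quoted earlier in the thread. The RAID and layout strings
are used as if they were trait names purely for illustration; they are not
real placement traits.

```python
from itertools import product

raid_levels = ["RAID10", "RAID5", "RAID01"]
layouts = ["DRIVE_LAYOUT_X", "DRIVE_LAYOUT_Y"]

# Today: one flavor per (layout, RAID level) combination.
per_combination_flavors = [
    "12CPU_128G_%s_%s" % (raid, layout)
    for layout, raid in product(layouts, raid_levels)
]

# With per-request traits: one base flavor plus one trait per option.
base_flavors = ["12CPU_128G"]
selectable_traits = raid_levels + layouts

print(len(per_combination_flavors))               # 6
print(len(base_flavors), len(selectable_traits))  # 1 5
```

The flavor count grows as a product of the option lists, while the trait
count grows only as their sum.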
>
>
> For the record, I would have been happier if someone had proposed
> separating the instance configuration data in the flavor extra-specs
> from the notion of required placement constraints (i.e. traits). You
> could call the extra_spec "deploy_template_id" if you wanted and that
> extra spec value could have been passed to Ironic during node
> provisioning instead of the list of placement constraints (traits).
>
But wouldn't this completely change every nova user's interaction with nova?
>
> So, you'd have a list of actual placement traits for an instance that
> looked like this:
>
> required=BOOT_MODE_UEFI,STORAGE_HARDWARE_RAID
>
> and you'd have a flavor extra spec called "deploy_template_id" with a
> value of the deploy template configuration data you wanted to
> communicate to Ironic. The Ironic virt driver could then just look for
> the "deploy_template_id" extra spec and pass the value of that to the
> Ironic API instead of passing a list of traits.
>
Is the underlying desire, rather than allowing defaults or the desired state
of the hardware to be expressed, a desire for an override facility? If so,
that would kind of make sense, and I suspect it would help reduce the flavor
explosion. I can't imagine telling someone "You need to create this
template file and put it in glance in order to set 323MB of RAM." Without
override facilities, I'm not sure how the actual flavor explosion would be
scaled back (I guess we need data to actually understand the causes). For
what it is worth, this seems like a reasonable path to start with as an
override facility. I guess the only thing that we would need to do is maybe
consider it a default node object field in ironic and prevent overwrites in
the case of rebuild events. (Of course, someone will want to rebuild those
machines. :\)
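
As a rough illustration of the split being described, here is a Python
sketch that separates required traits from a "deploy_template_id" extra
spec. The helper split_extra_specs and the template id value are
hypothetical, and "deploy_template_id" is only the name floated in this
thread, not an existing extra spec. The "trait:NAME" = "required" form is
how flavors express required traits today.

```python
def split_extra_specs(extra_specs):
    """Separate placement constraints from configuration data.

    Returns (required_traits, deploy_template_id). The second value is
    None when the flavor carries no "deploy_template_id" extra spec.
    """
    prefix = "trait:"
    required_traits = sorted(
        key[len(prefix):]
        for key, value in extra_specs.items()
        if key.startswith(prefix) and value == "required"
    )
    return required_traits, extra_specs.get("deploy_template_id")


extra_specs = {
    "trait:BOOT_MODE_UEFI": "required",
    "trait:STORAGE_HARDWARE_RAID": "required",
    "deploy_template_id": "example-template-id",  # placeholder value
}
traits, template_id = split_extra_specs(extra_specs)
print(traits)       # ['BOOT_MODE_UEFI', 'STORAGE_HARDWARE_RAID']
print(template_id)  # example-template-id
```

In this shape the traits would go only to scheduling, and only the template
id would be handed to the Ironic API.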
>
> That would have at least satisfied my desire to separate configuration
> data from placement constraints.
>
> Anyway, I'm done trying to please my own desires for a clean solution to
> this.
>
> Best,
> -jay