<div dir="ltr"><div dir="ltr"><div dir="ltr"><br><br><div class="gmail_quote"><div dir="ltr">On Mon, Oct 1, 2018 at 3:37 PM Jay Pipes <<a href="mailto:jaypipes@gmail.com">jaypipes@gmail.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">On 10/01/2018 06:04 PM, Julia Kreger wrote:<br>

> On Mon, Oct 1, 2018 at 2:41 PM Eric Fried <openstack@fried.cc> wrote:<br>

> <br>

> <br>

>      > So say the user requests a node that supports UEFI because their<br>

>     image<br>

>      > needs UEFI. Which workflow would you want here?<br>

>      ><br>

>      > 1) The operator (or ironic?) has already configured the node to<br>

>     boot in<br>

>      > UEFI mode. Only pre-configured nodes advertise the "supports<br>

>     UEFI" trait.<br>

>      ><br>

>      > 2) Any node that supports UEFI mode advertises the trait. Ironic<br>

>     ensures<br>

>      > that UEFI mode is enabled before provisioning the machine.<br>

>      ><br>

>      > I imagine doing #2 by passing the traits which were specifically<br>

>      > requested by the user, from Nova to Ironic, so that Ironic can do the<br>

>      > right thing for the user.<br>

>      ><br>

>      > Your proposal suggests that the user request the "supports UEFI"<br>

>     trait,<br>

>      > and *also* pass some glance UUID which the user understands will make<br>

>      > sure the node actually boots in UEFI mode. Something like:<br>

>      ><br>

>      > openstack server create --flavor METAL_12CPU_128G --trait<br>

>     SUPPORTS_UEFI<br>

>      > --config-data $TURN_ON_UEFI_UUID<br>

>      ><br>

>      > Note that I pass --trait because I hope that will one day be<br>

>     supported<br>

>      > and we can slow down the flavor explosion.<br>

> <br>

>     IMO --trait would be making things worse (but see below). I think UEFI<br>

>     with Jay's model would be more like:<br>

> <br>

>        openstack server create --flavor METAL_12CPU_128G --config-data $UEFI<br>

> <br>

>     where the UEFI profile would be pretty trivial, consisting of<br>

>     placement.traits.required = ["BOOT_MODE_UEFI"] and object.boot_mode =<br>

>     "uefi".<br>

> <br>

>     I agree that this seems kind of heavy, and that it would be nice to be<br>

>     able to say "boot mode is UEFI" just once. OTOH I get Jay's point that<br>

>     we need to separate the placement decision from the instance<br>

>     configuration.<br>

> <br>

>     That said, what if it was:<br>

> <br>

>       openstack config-profile create --name BOOT_MODE_UEFI --json -<br>

>       {<br>

>        "type": "boot_mode_scheme",<br>

>        "version": 123,<br>

>        "object": {<br>

>            "boot_mode": "uefi"<br>

>        },<br>

>        "placement": {<br>

>         "traits": {<br>

>          "required": [<br>

>           "BOOT_MODE_UEFI"<br>

>          ]<br>

>         }<br>

>        }<br>

>       }<br>

>       ^D<br>

> <br>

>     And now you could in fact say<br>

> <br>

>       openstack server create --flavor foo --config-profile BOOT_MODE_UEFI<br>

> <br>

>     using the profile name, which happens to be the same as the trait name<br>

>     because you made it so. Does that satisfy the yen for saying it once? (I<br>

>     mean, despite the fact that you first had to say it three times to get<br>

>     it set up.)<br>

> <br>

>     ========<br>

> <br>

>     I do want to zoom out a bit and point out that we're talking about<br>

>     implementing a new framework of substantial size and impact when the<br>

>     original proposal - using the trait for both - would just work out of<br>

>     the box today with no changes in either API. Is it really worth it?<br>

> <br>

> <br>

> +1000. Reading both of these threads, it feels like we're basically <br>

> trying to make something perfect. I think that is a fine goal, except it <br>

> is unrealistic because the enemy of good is perfection.<br>

> <br>

>     ========<br>

> <br>

>     By the way, with Jim's --trait suggestion, this:<br>

> <br>

>      > ...dozens of flavors that look like this:<br>

>      > - 12CPU_128G_RAID10_DRIVE_LAYOUT_X<br>

>      > - 12CPU_128G_RAID5_DRIVE_LAYOUT_X<br>

>      > - 12CPU_128G_RAID01_DRIVE_LAYOUT_X<br>

>      > - 12CPU_128G_RAID10_DRIVE_LAYOUT_Y<br>

>      > - 12CPU_128G_RAID5_DRIVE_LAYOUT_Y<br>

>      > - 12CPU_128G_RAID01_DRIVE_LAYOUT_Y<br>

> <br>

>     ...could actually become:<br>

> <br>

>       openstack server create --flavor 12CPU_128G --trait $WHICH_RAID<br>

>     --trait<br>

>     $WHICH_LAYOUT<br>

> <br>

>     No flavor explosion.<br>

> <br>

> <br>

> ++ I believe this was where this discussion kind of ended up in.. ?Dublin?<br>

> <br>

> The desire and discussion that led us into complex configuration <br>

> templates and profiles being submitted were for highly complex scenarios <br>

> where users wanted to assert detailed raid configurations to disk. <br>

> Naturally, there are many issues there. The ability to provide such <br>

> detail would be awesome for those 10% of operators that need such <br>

> functionality. Of course, if that is the only path forward, then we <br>

> delay the 90% from getting the minimum viable feature they need.<br>

> <br>

> <br>

>     (Maybe if we called it something other than --trait, like maybe<br>

>     --config-option, it would let us pretend we're not really overloading a<br>

>     trait to do config - it's just a coincidence that the config option has<br>

>     the same name as the trait it causes to be required.)<br>

> <br>

> <br>

> I feel like it might be confusing, but totally +1 to matching required <br>

> trait name being a thing. That way scheduling is completely decoupled <br>

> and if everything was correct then the request should already be <br>

> scheduled properly.<br>

<br>

I guess I'll just drop the idea of doing this properly then. It's true <br>

that the placement traits concept can be hacked up and the virt driver <br>

can just pass a list of trait strings to the Ironic API and that's the <br>

most expedient way to get what the 90% of people apparently want. It's <br>

also true that it will add a bunch of unmaintainable tribal knowledge <br>

into the interface between Nova and Ironic, but that has been the case <br>

for multiple years.<br></blockquote><div><br></div><div>There are a few things that need to be unpacked in this statement. But first, please don't stop. You're bringing a different perspective, and we need to find common ground. My HUGE issue right now is just how frustrated people are at this moment, when this feels like the third time we've looked at pivoting on this. Does that mean we should stop? Absolutely not!</div><div><br></div><div>In terms of "hacked up": this is the way the virt driver behaves today[1]. In a sense, that contract has already been established. We absolutely have to be aware of what the scheduling was in order to turn off UEFI. Additionally, there is logic[2] to prevent deployment in case something goes wrong and the requested instance carries a trait that does not match what the baremetal node offers. So it seems reasonable to continue doing this regardless.</div><div><br></div><div>I don't agree it is unmaintainable, but we do fall down on documenting the interaction for a whole slew of reasons. If we want to discuss that, we should get lots of tea.</div><div> :) I'm totally open to improving that non-existent nova/ironic interaction documentation, and even open to improving the interaction moving forward, but I would only ask that we take one deliberate step at a time.<br></div><div><br></div><div>[1] <a href="http://git.openstack.org/cgit/openstack/nova/tree/nova/virt/ironic/patcher.py?h=stable/rocky#n118">http://git.openstack.org/cgit/openstack/nova/tree/nova/virt/ironic/patcher.py?h=stable/rocky#n118</a></div><div>[2] <a href="http://git.openstack.org/cgit/openstack/ironic/tree/ironic/conductor/utils.py?h=stable/rocky#n935">http://git.openstack.org/cgit/openstack/ironic/tree/ironic/conductor/utils.py?h=stable/rocky#n935</a><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">

<br>

The flavor explosion problem will continue to get worse for those of us <br>

who deal with its pain (Oath in particular feels this) because the <br>

interface between nova flavors and Ironic instance capabilities will <br>

continue to be super-tightly-coupled.<br></blockquote><div><br></div><div>So what would alleviate some of that explosion pain? If something like "--required-trait" or even "--trait" is a possibility, which is where I thought things were going to go in nova based on past discussions... that seems like it would help tremendously. Granted, I'm not looking at that list of flavors, but surely there are some commonalities that we can begin to identify. If it is purely RAID, then let's build a mechanism to help drive that forward first, since we already have a field in our API for it. Of course, with all of this back and forth, I'm getting the feeling that "--required-trait HW_CPU_X86_VMX" or "--trait HW_CPU_X86_VMX" will just not happen because of the impasse that exists.<br></div> <blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">

<br>

For the record, I would have been happier if someone had proposed <br>

separating the instance configuration data in the flavor extra-specs <br>

from the notion of required placement constraints (i.e. traits). You <br>

could call the extra_spec "deploy_template_id" if you wanted and that <br>

extra spec value could have been passed to Ironic during node <br>

provisioning instead of the list of placement constraints (traits).<br></blockquote><div><br></div><div>But wouldn't this completely change all nova users' interaction with nova?<br></div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">

<br>

So, you'd have a list of actual placement traits for an instance that <br>

looked like this:<br>

<br>

required=BOOT_MODE_UEFI,STORAGE_HARDWARE_RAID<br>

<br>

and you'd have a flavor extra spec called "deploy_template_id" with a <br>

value of the deploy template configuration data you wanted to <br>

communicate to Ironic. The Ironic virt driver could then just look for <br>

the "deploy_template_id" extra spec and pass the value of that to the <br>

Ironic API instead of passing a list of traits.<br></blockquote></div><div class="gmail_quote"><br></div><div class="gmail_quote">Is the underlying desire, rather than allowing defaults or the desired state of the hardware to be expressed, a desire for an override facility? If so, that would kind of make sense, and I suspect it would help reduce the flavor explosion. I can't imagine telling someone "You need to create this template file and put it in glance in order to set 323MB of RAM." Without override facilities, I'm not sure how the actual flavor explosion would be scaled back (I guess we need data to actually understand the causes). For what it is worth, this seems like a reasonable path to start with as an override facility. I guess the only thing that we would need to do is maybe consider it a default node object field in ironic and prevent overwrites in the case of rebuild events. (Of course, someone will want to rebuild those machines. :\)<br></div><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">

<br>

That would have at least satisfied my desire to separate configuration <br>

data from placement constraints.<br>

<br>

Anyway, I'm done trying to please my own desires for a clean solution to <br>

this.<br>

<br>

Best,<br>

-jay<br>

<br>

__________________________________________________________________________<br>

OpenStack Development Mailing List (not for usage questions)<br>

Unsubscribe: <a href="http://OpenStack-dev-request@lists.openstack.org?subject:unsubscribe" rel="noreferrer" target="_blank">OpenStack-dev-request@lists.openstack.org?subject:unsubscribe</a><br>

<a href="http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev" rel="noreferrer" target="_blank">http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev</a><br>

</blockquote></div></div></div></div>