<div dir="ltr"><br><div class="gmail_extra"><br><div class="gmail_quote">On Mon, Oct 23, 2017 at 2:54 PM, Eric Fried <span dir="ltr"><<a href="mailto:openstack@fried.cc" target="_blank">openstack@fried.cc</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">I agree with Sean. In general terms:<br>
<br>
* A resource provider should be marked with a trait if that feature<br>
* Can be turned on or off (whether it's currently on or not); or<br>
* Is always on and can't ever be turned off.<br></blockquote><div><br></div><div>No, traits are not boolean. If a resource provider stops providing a capability, then the existing related trait should just be removed, that's it.</div><div>If you see a trait, that just means that the related capability for the Resource Provider is supported, that's it too.</div><div><br></div><div>MHO.</div><div><br></div><div>-Sylvain</div><div><br></div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
* A consumer wanting that feature present (doesn't matter whether it's<br>
on or off) should specify it as a required *trait*.<br>
* A consumer wanting that feature present and turned on should<br>
* Specify it as a required trait; AND<br>
* Indicate that it be turned on via some other mechanism (e.g. a<br>
separate extra_spec).<br>
<br>
I believe this satisfies Dmitry's (Ironic's) needs, but also Jay's drive<br>
for placement purity.<br>
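<br>
As a rough sketch of that split (the trait and extra spec names here are made-up placeholders, and the trait:&lt;name&gt;=required form just follows the extra-spec style being discussed for placement-aware scheduling, not a settled API):<br>
<br>
# Illustrative only; CUSTOM_FOO_CAPABLE and "foo:enable" are hypothetical names.<br>
flavor_extra_specs = {<br>
    # scheduling: only land on hosts that *can* do the feature<br>
    "trait:CUSTOM_FOO_CAPABLE": "required",<br>
    # configuration: separately ask for the feature to be turned on<br>
    "foo:enable": "true",<br>
}<br>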
<br>
Please invite me to the hangout or whatever.<br>
<br>
Thanks,<br>
Eric<br>
<br>
On 10/23/2017 07:22 AM, Mooney, Sean K wrote:<br>
> <br>
><br>
> <br>
><br>
> *From:*Jay Pipes [mailto:<a href="mailto:jaypipes@gmail.com">jaypipes@gmail.com</a>]<br>
> *Sent:* Monday, October 23, 2017 12:20 PM<br>
> *To:* OpenStack Development Mailing List <<a href="mailto:openstack-dev@lists.openstack.org">openstack-dev@lists.<wbr>openstack.org</a>><br>
> *Subject:* Re: [openstack-dev] [ironic] ironic and traits<br>
<span class="">><br>
> <br>
><br>
> Writing from my phone... May I ask that before you proceed with any plan<br>
> that uses traits for state information we have a hangout or<br>
> videoconference to discuss this? Unfortunately today and tomorrow I'm<br>
> not able to do a hangout but I can do one on Wednesday any time of the day.<br>
><br>
> <br>
><br>
</span>> */[Mooney, Sean K] on the uefi boot topic I did bring up at the ptg that<br>
> we wanted to standardize traits for “verified boot” /*<br>
><br>
> */that included a trait for uefi secure boot enabled and one to indicate a<br>
> hardware root of trust, e.g. intel boot guard or similar/*<br>
><br>
> */we distinctly wanted to be able to tag nova compute hosts with those<br>
> new traits so we could require that vms that request/*<br>
><br>
> */a host with uefi secure boot enabled and a hardware root of trust are<br>
> scheduled only to those nodes. /*<br>
><br>
> */ /*<br>
><br>
> */There are many other examples that affect both vms and bare metal such<br>
> as ecc/interleaved memory, cluster on die, /*<br>
><br>
> */l3 cache code and data prioritization, vt-d/vt-c, HPET, Hyper<br>
<span class="">> threading, power states … all of these feature may be present on the<br>
</span>> platform/*<br>
><br>
> */but I also need to know if they are turned on. Ruling out state in<br>
<span class="">> traits means all of this logic will eventually get pushed to scheduler<br>
</span>> filters/*<br>
><br>
> */which will be suboptimal long term as more state is tracked. Software<br>
> defined infrastructure may be the future but hardware defined software/*<br>
><br>
> */is sadly the present…/*<br>
><br>
> */ /*<br>
><br>
> */I do however think there should be a separation between asking for a<br>
> host that provides x with a trait and asking for x to be configured via/*<br>
><br>
> */a trait. The trait secure_boot_enabled should never result in the<br>
> feature being enabled. It should just find a host with it on. If you want/*<br>
><br>
> */to request it to be turned on you would request a host with<br>
<span class="">> secure_boot_capable as a trait and have a flavor extra spec or image<br>
</span>> property to request/*<br>
><br>
> */Ironic to enable it. These are two very different requests and should<br>
> not be treated the same. /*<br>
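><br>
> Purely as an illustrative sketch of those two requests (both trait names and the "secure_boot:enable" key are placeholders following this proposal, not agreed conventions):<br>
><br>
> # Request 1: find a host where the feature is already on.<br>
> find_host_with_it_already_on = {<br>
>     "trait:CUSTOM_SECURE_BOOT_ENABLED": "required",<br>
> }<br>
> # Request 2: find a capable host and separately ask ironic to enable it.<br>
> find_capable_host_and_turn_it_on = {<br>
>     "trait:CUSTOM_SECURE_BOOT_CAPABLE": "required",<br>
>     "secure_boot:enable": "true",  # hypothetical flavor extra spec or image property<br>
> }<br>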
<span class="">><br>
> <br>
><br>
> <br>
><br>
> Lemme know!<br>
><br>
> -jay<br>
><br>
> <br>
><br>
> On Oct 23, 2017 5:01 AM, "Dmitry Tantsur" <<a href="mailto:dtantsur@redhat.com">dtantsur@redhat.com</a><br>
</span><span class="">> <mailto:<a href="mailto:dtantsur@redhat.com">dtantsur@redhat.com</a>>> wrote:<br>
><br>
> Hi Jay!<br>
><br>
> I appreciate your comments, but I think you're approaching the<br>
> problem from a purely VM point of view. Things simply don't work the<br>
> same way in bare metal, at least not if we want to provide the same<br>
> user experience.<br>
><br>
> <br>
><br>
> On Sun, Oct 22, 2017 at 2:25 PM, Jay Pipes <<a href="mailto:jaypipes@gmail.com">jaypipes@gmail.com</a><br>
</span><div><div class="h5">> <mailto:<a href="mailto:jaypipes@gmail.com">jaypipes@gmail.com</a>>> wrote:<br>
><br>
> Sorry for delay, took a week off before starting a new job.<br>
> Comments inline.<br>
><br>
> On 10/16/2017 12:24 PM, Dmitry Tantsur wrote:<br>
><br>
> Hi all,<br>
><br>
> I promised John to dump my thoughts on traits to the ML, so<br>
> here we go :)<br>
><br>
> I see two roles of traits (or kinds of traits) for bare metal:<br>
> 1. traits that say what the node can do already (e.g. "the<br>
> node is<br>
> doing UEFI boot")<br>
> 2. traits that say what the node can be *configured* to do<br>
> (e.g. "the node can<br>
> boot in UEFI mode")<br>
><br>
><br>
> There's only one role for traits. #2 above. #1 is state<br>
> information. Traits are not for state information. Traits are<br>
> only for communicating capabilities of a resource provider<br>
> (baremetal node).<br>
><br>
> <br>
><br>
> These are not different, that's what I'm talking about here. No<br>
> users care about the difference between "this node was put in UEFI<br>
> mode by an operator in advance", "this node was put in UEFI mode by<br>
> an ironic driver on demand" and "this node is always in UEFI mode,<br>
> because it's AARCH64 and it does not have BIOS". These situations<br>
> produce the same result (the node is booted in UEFI mode), and thus<br>
> it's up to ironic to hide this difference.<br>
><br>
> <br>
><br>
> My suggestion with traits is one way to do it, I'm not sure what you<br>
> suggest though.<br>
><br>
> <br>
><br>
><br>
> For example, let's say we add the following to the os-traits<br>
> library [1]<br>
><br>
> * STORAGE_RAID_0<br>
> * STORAGE_RAID_1<br>
> * STORAGE_RAID_5<br>
> * STORAGE_RAID_6<br>
> * STORAGE_RAID_10<br>
><br>
> The Ironic administrator would add all RAID-related traits to<br>
> the baremetal nodes that had the *capability* of supporting that<br>
> particular RAID setup [2]<br>
><br>
> When provisioned, the baremetal node would either have RAID<br>
> configured in a certain level or not configured at all.<br>
><br>
><br>
> A very important note: the Placement API and Nova scheduler (or<br>
> future Ironic scheduler) don't care about this. At all. I know<br>
> it sounds like I'm being callous, but I'm not. Placement and<br>
> scheduling doesn't care about the state of things. It only cares<br>
> about the capabilities of target destinations. That's it.<br>
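><br>
> As a tiny sketch of that point (plain Python, not the actual placement code; the node names and traits are made up): the decision only checks whether a provider's trait set covers the requested traits, i.e. what it is *capable* of, never its current state.<br>
><br>
> # Not placement's real implementation, just the shape of the decision.<br>
> def capable_providers(providers, required_traits):<br>
>     """providers: dict mapping provider name -> set of trait strings."""<br>
>     return [name for name, traits in providers.items()<br>
>             if required_traits <= traits]<br>
> nodes = {"node-1": {"CUSTOM_RAID_5", "CUSTOM_UEFI"},<br>
>          "node-2": {"CUSTOM_RAID_5"}}<br>
> print(capable_providers(nodes, {"CUSTOM_UEFI"}))  # -> ['node-1']<br>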
><br>
> <br>
><br>
> Yes, because VMs always start with a clean state, and the hypervisor is<br>
> there to ensure that. We don't have this luxury in ironic :) E.g.<br>
> our SNMP driver is not even aware of boot modes (or RAID, or BIOS<br>
> configuration), which does not mean that a node using it cannot be<br>
> in UEFI mode (have a RAID or BIOS pre-configured, etc, etc).<br>
><br>
> <br>
><br>
> <br>
><br>
> This seems confusing, but it's actually very useful. Say, I<br>
> have a flavor that<br>
> requests UEFI boot via a trait. It will match both the nodes<br>
> that are already in<br>
> UEFI mode, as well as nodes that can be put in UEFI mode.<br>
><br>
><br>
> No :) It will only match nodes that have the UEFI capability.<br>
> The set of providers that have the ability to be booted via UEFI<br>
> is *always* a superset of the set of providers that *have been<br>
> booted via UEFI*. Placement and scheduling decisions only care<br>
> about that superset -- the providers with a particular capability.<br>
><br>
> <br>
><br>
> Well, no, it will. Again, you're basing this purely on the VM idea, where<br>
> a VM is always *put* in UEFI mode, no matter what the hypervisor<br>
> looks like. It is simply not the case for us. You have to care what<br>
> state the node is in, because many drivers cannot change this state.<br>
><br>
> <br>
><br>
> <br>
><br>
> This idea goes further with deploy templates (a new concept<br>
> we've been thinking<br>
> about). A flavor can request something like CUSTOM_RAID_5,<br>
> and it will match the<br>
> nodes that already have RAID 5, or, more interestingly, the<br>
> nodes on which we<br>
> can build RAID 5 before deployment. The UEFI example above<br>
> can be treated in a<br>
> similar way.<br>
><br>
> This ends up with two sources of knowledge about traits in<br>
> ironic:<br>
> 1. Operators setting something they know about hardware<br>
> ("this node is in UEFI<br>
> mode"),<br>
> 2. Ironic drivers reporting something they<br>
> 2.1. know about hardware ("this node is in UEFI mode" -<br>
> again)<br>
> 2.2. can do about hardware ("I can put this node in UEFI<br>
> mode")<br>
><br>
><br>
> You're correct that both pieces of information are important.<br>
> However, only the "can do about hardware" part is relevant to<br>
> Placement and Nova.<br>
><br>
> For case #1 we are planning on a new CRUD API to set/unset<br>
> traits for a node.<br>
><br>
><br>
> I would *strongly* advise against this. Traits are not for state<br>
> information.<br>
><br>
> Instead, consider having a DB (or JSON) schema that lists state<br>
> information in fields that are explicitly for that state<br>
> information.<br>
><br>
> For example, a schema that looks like this:<br>
><br>
> {<br>
> "boot": {<br>
> "mode": <one of 'bios' or 'uefi'>,<br>
> "params": <dict><br>
> },<br>
> "disk": {<br>
> "raid": {<br>
> "level": <int>,<br>
> "controller": <one of 'sw' or 'hw'>,<br>
> "driver": <string>,<br>
> "params": <dict><br>
> }, ...<br>
> },<br>
> "network": {<br>
> ...<br>
> }<br>
> }<br>
><br>
> etc, etc.<br>
><br>
> Don't use trait strings to represent state information.<br>
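><br>
> In code form, the separation being argued for might look like this (field names follow the example schema above; nothing here is an agreed Ironic API):<br>
><br>
> # Capabilities go to placement as a flat set of trait strings; state and<br>
> # configuration live in their own structured document, never in traits.<br>
> node_traits = {"CUSTOM_UEFI_CAPABLE", "STORAGE_RAID_5"}<br>
> node_state = {<br>
>     "boot": {"mode": "uefi", "params": {}},<br>
>     "disk": {"raid": {"level": 5, "controller": "hw",<br>
>                       "driver": "example", "params": {}}},<br>
> }<br>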
><br>
> <br>
><br>
> I don't see an alternative proposal that will satisfy what we have<br>
> to solve.<br>
><br>
> <br>
><br>
><br>
> Best,<br>
> -jay<br>
><br>
> Case #2 is more interesting. We have two options, I think:<br>
><br>
> a) Operators still set traits on nodes, drivers are simply<br>
> validating them. E.g.<br>
> an operator sets CUSTOM_RAID_5, and the node's RAID<br>
> interface checks if it is<br>
> possible to do. The downside is obvious - with a lot of<br>
> deploy templates<br>
> available it can be a lot of manual work.<br>
><br>
> b) Drivers report the traits, and they get somehow added to<br>
> the traits provided<br>
> by an operator. Technically, there are sub-cases again:<br>
> b.1) The new traits API returns a union of<br>
> operator-provided and<br>
> driver-provided traits<br>
> b.2) The new traits API returns only operator-provided<br>
> traits; driver-provided<br>
> traits are returned e.g. via a new field<br>
> (node.driver_traits). Then nova will<br>
> have to merge the lists itself.<br>
><br>
> My personal favorite is the last option: I'd like a clear<br>
> distinction between<br>
> different "sources" of traits, but I'd also like to reduce<br>
> manual work for<br>
> operators.<br>
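><br>
> A minimal sketch of what b.2 would mean for the consumer (names like driver_traits come from the proposal above, not an existing field):<br>
><br>
> # Option b.2: the API keeps the two sources separate and the caller<br>
> # (e.g. nova) merges them itself before reporting to placement.<br>
> operator_traits = {"CUSTOM_RAID_5"}       # set via the new traits CRUD API<br>
> driver_traits = {"CUSTOM_UEFI_CAPABLE"}   # reported by the ironic driver<br>
> traits_for_placement = operator_traits | driver_traits<br>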
><br>
> A valid counter-argument is: what if an operator wants to<br>
> override a<br>
> driver-provided trait? E.g. a node can do RAID 5, but I<br>
> don't want this<br>
> particular node to do it for any reason. I'm not sure if<br>
> it's a valid case, and<br>
> what to do about it.<br>
><br>
> Let me know what you think.<br>
><br>
> Dmitry<br>
><br>
><br>
> [1] <a href="http://git.openstack.org/cgit/openstack/os-traits/tree/" rel="noreferrer" target="_blank">http://git.openstack.org/cgit/<wbr>openstack/os-traits/tree/</a><br>
> [2] Based on how many attached disks the node had, the presence<br>
> and abilities of a hardware RAID controller, etc<br>
><br>
><br>
><br>
> ______________________________<wbr>______________________________<wbr>______________<br>
> OpenStack Development Mailing List (not for usage questions)<br>
> Unsubscribe:<br>
> <a href="http://OpenStack-dev-request@lists.openstack.org?subject:unsubscribe" rel="noreferrer" target="_blank">OpenStack-dev-request@lists.<wbr>openstack.org?subject:<wbr>unsubscribe</a><br>
</div></div>> <<a href="http://OpenStack-dev-request@lists.openstack.org?subject:unsubscribe" rel="noreferrer" target="_blank">http://OpenStack-dev-request@<wbr>lists.openstack.org?subject:<wbr>unsubscribe</a>><br>
<span class="">> <a href="http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev" rel="noreferrer" target="_blank">http://lists.openstack.org/<wbr>cgi-bin/mailman/listinfo/<wbr>openstack-dev</a><br>
><br>
> <br>
><br>
><br>
> ______________________________<wbr>______________________________<wbr>______________<br>
> OpenStack Development Mailing List (not for usage questions)<br>
> Unsubscribe:<br>
> <a href="http://OpenStack-dev-request@lists.openstack.org?subject:unsubscribe" rel="noreferrer" target="_blank">OpenStack-dev-request@lists.<wbr>openstack.org?subject:<wbr>unsubscribe</a><br>
</span>> <<a href="http://OpenStack-dev-request@lists.openstack.org?subject:unsubscribe" rel="noreferrer" target="_blank">http://OpenStack-dev-request@<wbr>lists.openstack.org?subject:<wbr>unsubscribe</a>><br>
<div class="HOEnZb"><div class="h5">> <a href="http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev" rel="noreferrer" target="_blank">http://lists.openstack.org/<wbr>cgi-bin/mailman/listinfo/<wbr>openstack-dev</a><br>
><br>
><br>
><br>
> ______________________________<wbr>______________________________<wbr>______________<br>
> OpenStack Development Mailing List (not for usage questions)<br>
> Unsubscribe: <a href="http://OpenStack-dev-request@lists.openstack.org?subject:unsubscribe" rel="noreferrer" target="_blank">OpenStack-dev-request@lists.<wbr>openstack.org?subject:<wbr>unsubscribe</a><br>
> <a href="http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev" rel="noreferrer" target="_blank">http://lists.openstack.org/<wbr>cgi-bin/mailman/listinfo/<wbr>openstack-dev</a><br>
><br>
<br>
______________________________<wbr>______________________________<wbr>______________<br>
OpenStack Development Mailing List (not for usage questions)<br>
Unsubscribe: <a href="http://OpenStack-dev-request@lists.openstack.org?subject:unsubscribe" rel="noreferrer" target="_blank">OpenStack-dev-request@lists.<wbr>openstack.org?subject:<wbr>unsubscribe</a><br>
<a href="http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev" rel="noreferrer" target="_blank">http://lists.openstack.org/<wbr>cgi-bin/mailman/listinfo/<wbr>openstack-dev</a><br>
</div></div></blockquote></div><br></div></div>