[openstack-dev] [ironic] ironic and traits
Dmitry Tantsur
dtantsur at redhat.com
Mon Oct 23 15:05:25 UTC 2017
Your suggestion requires transparent passing of extra_specs to ironic,
which is something the nova team has objected to for quite some time.
On Mon, Oct 23, 2017 at 4:09 PM, Eric Fried <openstack at fried.cc> wrote:
> We discussed this a little bit further in IRC [1]. We're all in
> agreement, but it's worth being precise on a couple of points:
>
> * We're distinguishing between a "feature" and the "trait" that
> represents it in placement. For the sake of this discussion, a
> "feature" can (maybe) be switched on or off, but a "trait" can either be
> present or absent on a RP.
> * It matters *who* can turn a feature on/off.
> * If it can be done by virt at spawn time, then it makes sense to have
> the trait on the RP, and you can switch the feature on/off via a
> separate extra_spec.
> * But if it's e.g. an admin action, and spawn has no control, then the
> trait needs to be *added* whenever the feature is *on*, and *removed*
> whenever the feature is *off*.
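
The rule in the bullets above can be sketched in a few lines of Python. This is only an illustration, not nova or placement code: the Feature class, the spawn_can_toggle flag and the trait names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Feature:
    trait: str              # trait that represents the feature in placement
    enabled: bool           # current on/off state of the feature
    spawn_can_toggle: bool  # hypothetical: virt can switch it at spawn time


def traits_to_report(features):
    """Return the set of traits a resource provider should expose."""
    reported = set()
    for feature in features:
        # If spawn can toggle the feature, the trait is always present.
        # Otherwise the trait follows the current state of the feature.
        if feature.spawn_can_toggle or feature.enabled:
            reported.add(feature.trait)
    return reported


features = [
    Feature("CUSTOM_UEFI_BOOT", enabled=False, spawn_can_toggle=True),
    Feature("CUSTOM_SECURE_BOOT", enabled=False, spawn_can_toggle=False),
    Feature("CUSTOM_HYPERTHREADING", enabled=True, spawn_can_toggle=False),
]
print(sorted(traits_to_report(features)))
```

In this sketch the admin-controlled feature that is currently off (CUSTOM_SECURE_BOOT) is the only one whose trait is withheld.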
>
> [1] http://eavesdrop.openstack.org/irclogs/%23openstack-nova/%23openstack-nova.2017-10-23.log.html#t2017-10-23T13:12:13
>
> On 10/23/2017 08:15 AM, Sylvain Bauza wrote:
> >
> >
> > On Mon, Oct 23, 2017 at 2:54 PM, Eric Fried <openstack at fried.cc> wrote:
> >
> > I agree with Sean. In general terms:
> >
> > * A resource provider should be marked with a trait if that feature
> > * Can be turned on or off (whether it's currently on or not); or
> > * Is always on and can't ever be turned off.
> >
> >
> > No, traits are not boolean. If a resource provider stops providing a
> > capability, then the existing related trait should just be removed,
> > that's it.
> > If you see a trait, that just means that the related capability for
> > the Resource Provider is supported, that's it too.
> >
> > MHO.
> >
> > -Sylvain
> >
> >
> >
> > * A consumer wanting that feature present (doesn't matter whether it's
> > on or off) should specify it as a required *trait*.
> > * A consumer wanting that feature present and turned on should
> > * Specify it as a required trait; AND
> > * Indicate that it be turned on via some other mechanism (e.g. a
> > separate extra_spec).
> >
> > I believe this satisfies Dmitry's (Ironic's) needs, but also Jay's drive
> > for placement purity.
> >
> > Please invite me to the hangout or whatever.
> >
> > Thanks,
> > Eric
> >
> > On 10/23/2017 07:22 AM, Mooney, Sean K wrote:
> > >
> > >
> > >
> > >
> > > *From:* Jay Pipes [mailto:jaypipes at gmail.com]
> > > *Sent:* Monday, October 23, 2017 12:20 PM
> > > *To:* OpenStack Development Mailing List <openstack-dev at lists.openstack.org>
> > > *Subject:* Re: [openstack-dev] [ironic] ironic and traits
> > >
> > >
> > >
> > > Writing from my phone... May I ask that before you proceed with any plan
> > > that uses traits for state information that we have a hangout or
> > > videoconference to discuss this? Unfortunately today and tomorrow I'm
> > > not able to do a hangout but I can do one on Wednesday any time of the day.
> > >
> > >
> > >
> > > [Mooney, Sean K] On the UEFI boot topic, I did bring up at the PTG that
> > > we wanted to standardize traits for “verified boot”. That included a
> > > trait for UEFI secure boot enabled and one to indicate a hardware root
> > > of trust, e.g. Intel Boot Guard or similar. We distinctly wanted to be
> > > able to tag nova compute hosts with those new traits so we could require
> > > that VMs that request a host with UEFI secure boot enabled and a
> > > hardware root of trust are scheduled only to those nodes.
> > >
> > > There are many other examples that affect both VMs and bare metal, such
> > > as ECC/interleaved memory, cluster on die, L3 cache code and data
> > > prioritization, VT-d/VT-c, HPET, hyper-threading, power states … All of
> > > these features may be present on the platform, but I also need to know
> > > if they are turned on. Ruling out state in traits means all of this
> > > logic will eventually get pushed to scheduler filters, which will be
> > > suboptimal long term as more state is tracked. Software defined
> > > infrastructure may be the future, but hardware defined software is
> > > sadly the present…
> > >
> > > I do however think there should be a separation between asking for a
> > > host that provides x with a trait and asking for x to be configured via
> > > a trait. The trait secure_boot_enabled should never result in the
> > > feature being enabled. It should just find a host with it on. If you
> > > want to request it to be turned on, you would request a host with
> > > secure_boot_capable as a trait and have a flavor extra spec or image
> > > property to request Ironic to enable it. These are two very different
> > > requests and should not be treated the same.
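
A minimal sketch of the two kinds of request described above, assuming hypothetical trait names (CUSTOM_SECURE_BOOT_ENABLED, CUSTOM_SECURE_BOOT_CAPABLE) and a hypothetical "secure_boot" extra spec key. None of these exist in nova or ironic as far as this thread states; they only illustrate the split between finding a host and configuring it.

```python
def build_request(already_enabled):
    """Return (required_traits, extra_specs) for a secure boot request.

    already_enabled=True  -> only find a host where the feature is on.
    already_enabled=False -> find a capable host and ask for it to be
                             turned on through a separate extra spec.
    """
    if already_enabled:
        return {"CUSTOM_SECURE_BOOT_ENABLED"}, {}
    return {"CUSTOM_SECURE_BOOT_CAPABLE"}, {"secure_boot": "on"}


def matching_hosts(hosts, required_traits):
    """Hosts whose trait set contains every required trait."""
    return sorted(
        name for name, traits in hosts.items() if required_traits <= traits
    )


# Hypothetical inventory: host name -> traits reported for it.
hosts = {
    "node-a": {"CUSTOM_SECURE_BOOT_CAPABLE", "CUSTOM_SECURE_BOOT_ENABLED"},
    "node-b": {"CUSTOM_SECURE_BOOT_CAPABLE"},
    "node-c": set(),
}

for already_enabled in (True, False):
    required, extra_specs = build_request(already_enabled)
    print(matching_hosts(hosts, required), extra_specs)
```

Note that the trait alone never changes anything on the host in this sketch; only the extra spec carries the request to configure it.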
> > >
> > >
> > >
> > >
> > >
> > > Lemme know!
> > >
> > > -jay
> > >
> > >
> > >
> > > On Oct 23, 2017 5:01 AM, "Dmitry Tantsur" <dtantsur at redhat.com> wrote:
> > >
> > > Hi Jay!
> > >
> > > I appreciate your comments, but I think you're approaching the
> > > problem from a purely VM point of view. Things simply don't work the
> > > same way in bare metal, at least not if we want to provide the same
> > > user experience.
> > >
> > >
> > >
> > > On Sun, Oct 22, 2017 at 2:25 PM, Jay Pipes <jaypipes at gmail.com> wrote:
> > >
> > > Sorry for delay, took a week off before starting a new job.
> > > Comments inline.
> > >
> > > On 10/16/2017 12:24 PM, Dmitry Tantsur wrote:
> > >
> > > Hi all,
> > >
> > > I promised John to dump my thoughts on traits to the ML, so
> > > here we go :)
> > >
> > > I see two roles of traits (or kinds of traits) for bare metal:
> > > 1. traits that say what the node can do already (e.g. "the node is
> > > doing UEFI boot")
> > > 2. traits that say what the node can be *configured* to do
> > > (e.g. "the node can boot in UEFI mode")
> > >
> > >
> > > There's only one role for traits. #2 above. #1 is state
> > > information. Traits are not for state information. Traits are
> > > only for communicating capabilities of a resource provider
> > > (baremetal node).
> > >
> > >
> > >
> > > These are not different, that's what I'm talking about here. No
> > > users care about the difference between "this node was put in UEFI
> > > mode by an operator in advance", "this node was put in UEFI mode by
> > > an ironic driver on demand" and "this node is always in UEFI mode,
> > > because it's AARCH64 and it does not have BIOS". These situations
> > > produce the same result (the node is booted in UEFI mode), and thus
> > > it's up to ironic to hide this difference.
> > >
> > >
> > >
> > > My suggestion with traits is one way to do it, I'm not sure what you
> > > suggest though.
> > >
> > >
> > >
> > >
> > > For example, let's say we add the following to the os-traits
> > > library [1]
> > >
> > > * STORAGE_RAID_0
> > > * STORAGE_RAID_1
> > > * STORAGE_RAID_5
> > > * STORAGE_RAID_6
> > > * STORAGE_RAID_10
> > >
> > > The Ironic administrator would add all RAID-related traits to
> > > the baremetal nodes that had the *capability* of supporting that
> > > particular RAID setup [2]
> > >
> > > When provisioned, the baremetal node would either have RAID
> > > configured in a certain level or not configured at all.
> > >
> > >
> > > A very important note: the Placement API and Nova scheduler (or
> > > future Ironic scheduler) don't care about this. At all. I know
> > > it sounds like I'm being callous, but I'm not. Placement and
> > > scheduling don't care about the state of things. They only care
> > > about the capabilities of target destinations. That's it.
> > >
> > >
> > >
> > > Yes, because VMs always start with a clean state, and the hypervisor
> > > is there to ensure that. We don't have this luxury in ironic :) E.g.
> > > our SNMP driver is not even aware of boot modes (or RAID, or BIOS
> > > configuration), which does not mean that a node using it cannot be
> > > in UEFI mode (have a RAID or BIOS pre-configured, etc, etc).
> > >
> > >
> > >
> > >
> > >
> > > This seems confusing, but it's actually very useful. Say, I
> > > have a flavor that requests UEFI boot via a trait. It will match
> > > both the nodes that are already in UEFI mode, as well as nodes
> > > that can be put in UEFI mode.
> > >
> > >
> > > No :) It will only match nodes that have the UEFI capability.
> > > The set of providers that have the ability to be booted via UEFI
> > > is *always* a superset of the set of providers that *have been
> > > booted via UEFI*. Placement and scheduling decisions only care
> > > about that superset -- the providers with a particular capability.
> > >
> > >
> > >
> > > Well, no, it will. Again, you're basing this purely on the VM idea,
> > > where a VM is always *put* in UEFI mode, no matter what the hypervisor
> > > looks like. It is simply not the case for us. You have to care what
> > > state the node is in, because many drivers cannot change this state.
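
A minimal sketch of the matching rule argued for here: a node satisfies a boot mode request if it is already in that mode or if its driver can put it there. The Node class and the driver_can_switch flag are hypothetical and only illustrate the point; they are not ironic data structures.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    name: str
    boot_mode: str           # current state: "bios" or "uefi"
    driver_can_switch: bool  # hypothetical: driver can change the boot mode


def matches_boot_mode(node, wanted):
    """True if the node is already in the wanted mode or can be put in it."""
    return node.boot_mode == wanted or node.driver_can_switch


nodes = [
    Node("always-uefi", "uefi", driver_can_switch=False),
    Node("fixed-bios", "bios", driver_can_switch=False),
    Node("switchable", "bios", driver_can_switch=True),
]
print([n.name for n in nodes if matches_boot_mode(n, "uefi")])
```

The "fixed-bios" node is the case that needs state: its driver cannot change the mode, so matching on a capability alone would hand the user a node that ends up booted in the wrong mode.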
> > >
> > >
> > >
> > >
> > >
> > > This idea goes further with deploy templates (new concept
> > > we've been thinking about). A flavor can request something like
> > > CUSTOM_RAID_5, and it will match the nodes that already have RAID 5,
> > > or, more interestingly, the nodes on which we can build RAID 5
> > > before deployment. The UEFI example above can be treated in a
> > > similar way.
> > >
> > > This ends up with two sources of knowledge about traits in ironic:
> > > 1. Operators setting something they know about hardware
> > > ("this node is in UEFI mode"),
> > > 2. Ironic drivers reporting something they
> > > 2.1. know about hardware ("this node is in UEFI mode" - again)
> > > 2.2. can do about hardware ("I can put this node in UEFI mode")
> > >
> > >
> > > You're correct that both pieces of information are important.
> > > However, only the "can do about hardware" part is relevant to
> > > Placement and Nova.
> > >
> > > For case #1 we are planning on a new CRUD API to set/unset
> > > traits for a node.
> > >
> > >
> > > I would *strongly* advise against this. Traits are not for state
> > > information.
> > >
> > > Instead, consider having a DB (or JSON) schema that lists state
> > > information in fields that are explicitly for that state
> > > information.
> > >
> > > For example, a schema that looks like this:
> > >
> > > {
> > > "boot": {
> > > "mode": <one of 'bios' or 'uefi'>,
> > > "params": <dict>
> > > },
> > > "disk": {
> > > "raid": {
> > > "level": <int>,
> > > "controller": <one of 'sw' or 'hw'>,
> > > "driver": <string>,
> > > "params": <dict>
> > > }, ...
> > > },
> > > "network": {
> > > ...
> > > }
> > > }
> > >
> > > etc, etc.
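
For illustration, the schema outlined above could be mirrored by typed Python classes with basic validation. This is only a sketch under assumptions: the class names and the node_state_from_dict helper are hypothetical, and only the "boot" and "disk.raid" parts of the schema are covered.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional


@dataclass
class BootState:
    mode: str  # one of "bios" or "uefi"
    params: Dict[str, Any] = field(default_factory=dict)

    def __post_init__(self):
        if self.mode not in ("bios", "uefi"):
            raise ValueError("unknown boot mode: %r" % (self.mode,))


@dataclass
class RaidState:
    level: int
    controller: str  # one of "sw" or "hw"
    driver: str
    params: Dict[str, Any] = field(default_factory=dict)

    def __post_init__(self):
        if self.controller not in ("sw", "hw"):
            raise ValueError("unknown controller: %r" % (self.controller,))


@dataclass
class NodeState:
    boot: BootState
    raid: Optional[RaidState] = None  # None means RAID is not configured


def node_state_from_dict(doc):
    """Build a NodeState from a dict shaped like the schema above."""
    boot = BootState(**doc["boot"])
    raid_doc = doc.get("disk", {}).get("raid")
    raid = RaidState(**raid_doc) if raid_doc else None
    return NodeState(boot=boot, raid=raid)


state = node_state_from_dict({
    "boot": {"mode": "uefi", "params": {}},
    "disk": {"raid": {"level": 5, "controller": "hw",
                      "driver": "example", "params": {}}},
})
print(state.boot.mode, state.raid.level)
```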
> > >
> > > Don't use trait strings to represent state information.
> > >
> > >
> > >
> > > I don't see an alternative proposal that will satisfy what we have
> > > to solve.
> > >
> > >
> > >
> > >
> > > Best,
> > > -jay
> > >
> > > Case #2 is more interesting. We have two options, I think:
> > >
> > > a) Operators still set traits on nodes, drivers are simply
> > > validating them. E.g. an operator sets CUSTOM_RAID_5, and the node's
> > > RAID interface checks if it is possible to do. The downside is
> > > obvious - with a lot of deploy templates available it can be a lot
> > > of manual work.
> > >
> > > b) Drivers report the traits, and they get somehow added to
> > > the traits provided by an operator. Technically, there are
> > > sub-cases again:
> > > b.1) The new traits API returns a union of operator-provided and
> > > driver-provided traits
> > > b.2) The new traits API returns only operator-provided traits;
> > > driver-provided traits are returned e.g. via a new field
> > > (node.driver_traits). Then nova will have to merge the lists itself.
> > >
> > > My personal favorite is the last option: I'd like a clear
> > > distinction between different "sources" of traits, but I'd also
> > > like to reduce manual work for operators.
> > >
> > > A valid counter-argument is: what if an operator wants to
> > > override a driver-provided trait? E.g. a node can do RAID 5, but I
> > > don't want this particular node to do it for any reason. I'm not
> > > sure if it's a valid case, and what to do about it.
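
The sub-cases b.1 and b.2, plus one possible answer to the override question, can be sketched as follows. This is an assumption-laden illustration, not a proposed ironic API: the function names and the operator_excluded parameter are hypothetical, and only the node.driver_traits field name comes from the text above.

```python
def traits_b1(operator_traits, driver_traits):
    """b.1: the API returns a single union of both sources."""
    return set(operator_traits) | set(driver_traits)


def traits_b2(operator_traits, driver_traits):
    """b.2: sources stay separate; the caller (e.g. nova) merges them."""
    return {
        "traits": sorted(operator_traits),
        "driver_traits": sorted(driver_traits),
    }


def effective_traits(operator_traits, driver_traits, operator_excluded=()):
    """Hypothetical override: the operator masks driver-reported traits."""
    allowed_driver_traits = set(driver_traits) - set(operator_excluded)
    return set(operator_traits) | allowed_driver_traits


operator = {"CUSTOM_UEFI"}
driver = {"CUSTOM_RAID_1", "CUSTOM_RAID_5"}

print(sorted(traits_b1(operator, driver)))
print(traits_b2(operator, driver))
print(sorted(effective_traits(operator, driver,
                              operator_excluded={"CUSTOM_RAID_5"})))
```

An exclusion list like this only works if the two sources are kept apart somewhere, which is one argument for b.2 over b.1.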
> > >
> > > Let me know what you think.
> > >
> > > Dmitry
> > >
> > >
> > > [1] http://git.openstack.org/cgit/openstack/os-traits/tree/
> > > [2] Based on how many attached disks the node had, the presence
> > > and abilities of a hardware RAID controller, etc
> > >
> > >
> > >
> > >
> > > __________________________________________________________________________
> > > OpenStack Development Mailing List (not for usage questions)
> > > Unsubscribe: OpenStack-dev-request at lists.openstack.org?subject:unsubscribe
> > > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
> > >
> > >
> > >
> > >