[openstack-dev] [placement] The "intended purpose" of traits

Chris Dent cdent+os at anticdent.org
Fri Sep 28 17:19:31 UTC 2018


On Fri, 28 Sep 2018, Jay Pipes wrote:

> On 09/28/2018 09:25 AM, Eric Fried wrote:
>> It's time somebody said this.

Yes, a useful topic, I think.

>> Every time we turn a corner or look under a rug, we find another use
>> case for provider traits in placement. But every time we have to have
>> the argument about whether that use case satisfies the original
>> "intended purpose" of traits.
>> 
>> That's the only reason I've ever been able to glean: that it (whatever "it"
>> is) wasn't what the architects had in mind when they came up with the
>> idea of traits.
>
> Don't pussyfoot around things. It's me you're talking about, Eric. You could 
> just ask me instead of passive-aggressively posting to the list like this.

It's not just you. Ed and I have also expressed some fairly strong
statements about how traits are "supposed" to be used, and I would
guess that from Eric's perspective all three of us (amongst others)
have some form of architectural influence. Since it takes a village
and all that.

> They aren't arbitrary. They are there for a reason: a trait is a boolean 
> capability. It describes something that either a provider is capable of 
> supporting or it isn't.

This is somewhat (maybe even only slightly) different from what I
think the definition of a trait is, and that nuance may be relevant.

I describe a trait as a "quality that a resource provider has" (the
car is blue). This contrasts with a resource class which is a
"quantity that a resource provider has" (the car has 4 doors).

Our implementation is pretty much exactly that ^. We allow
clients to ask "give me things that have qualities x, y, z, not
qualities a, b, c, and quantities of G of 5 and H of 7".
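
As a rough illustration of that request shape, here is a minimal sketch
of building the query string for `GET /allocation_candidates`. The
trait and resource class names (CUSTOM_X, CUSTOM_G, ...) and the helper
function are made up for the example; the leading "!" for forbidden
traits assumes a placement microversion that supports forbidden traits
(1.22 or later, as far as I know).

```python
from urllib.parse import urlencode


def allocation_candidates_query(resources, required=(), forbidden=()):
    """Build a query string for GET /allocation_candidates.

    resources: mapping of resource class name -> amount
    required / forbidden: iterables of trait names
    (hypothetical helper, not part of any placement client library)
    """
    # Forbidden traits share the "required" parameter with the required
    # ones and are marked with a leading "!".
    traits = list(required) + ["!" + t for t in forbidden]
    params = {
        "resources": ",".join(
            f"{rc}:{amount}" for rc, amount in resources.items()
        ),
    }
    if traits:
        params["required"] = ",".join(traits)
    # Keep ":", "," and "!" readable instead of percent-encoding them.
    return urlencode(params, safe=":,!")


query = allocation_candidates_query(
    {"CUSTOM_G": 5, "CUSTOM_H": 7},
    required=["CUSTOM_X", "CUSTOM_Y", "CUSTOM_Z"],
    forbidden=["CUSTOM_A", "CUSTOM_B", "CUSTOM_C"],
)
print("/allocation_candidates?" + query)
```

The point of the sketch is only that the whole request is expressed as
names and amounts: nothing in it asks placement to interpret what a
trait name means.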

Add in aggregates and we have exactly what you say:

> * Does the provider have *capacity* for the requested resources?
> * Does the provider have the required (or forbidden) *capabilities*?
> * Does the provider belong to some group?

The nuance of difference is that your description of *capabilities*
seems more narrow than my description of *qualities* (aka
characteristics). You've got something fairly specific in mind, as a
way of constraining the profusion of noise that has happened with
how various kinds of information about resources of all sorts is
managed in OpenStack, as you describe in your message.

I do not think it should be placement's job to control that noise.
It should be placement's job to provide a very strict contract about
what you can do with a trait:

* create it, if necessary
* assign it to one or more resource providers
* ask for providers that either have it
* ... or do not have it

That's all. Placement _code_ should _never_ be aware of the value of
a trait (except for the magical MISC_SHARES...). It should never
become possible to regex on traits or do comparisons
(required=<CUSTOM_TEMP_85). Just "yes" or "no" to presence of quality.
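
The contract listed above can be sketched as a toy in-memory model.
This is not placement's code or schema; the class, method, provider,
and trait names are all invented for illustration. What it tries to
show is that matching is pure set membership, so the trait string
stays opaque to the service.

```python
class TraitStore:
    """Toy stand-in for the trait contract; not the real placement code."""

    def __init__(self):
        self.traits = set()     # every trait name that exists
        self.providers = {}     # provider name -> set of trait names

    def create_trait(self, name):
        # "Create it, if necessary": re-creating a trait is a no-op.
        self.traits.add(name)

    def assign(self, provider, *names):
        # Only known traits may be assigned to a provider.
        for name in names:
            if name not in self.traits:
                raise KeyError(f"unknown trait: {name}")
        self.providers.setdefault(provider, set()).update(names)

    def find(self, required=(), forbidden=()):
        # Presence or absence only: no regex, no ordering, no parsing
        # of the trait name.
        required, forbidden = set(required), set(forbidden)
        return sorted(
            name for name, have in self.providers.items()
            if required <= have and not (forbidden & have)
        )


store = TraitStore()
for trait in ("CUSTOM_BLUE", "CUSTOM_TEMP_85"):
    store.create_trait(trait)
store.assign("car1", "CUSTOM_BLUE")
store.assign("car2", "CUSTOM_BLUE", "CUSTOM_TEMP_85")

print(store.find(required=["CUSTOM_BLUE"]))
print(store.find(required=["CUSTOM_BLUE"], forbidden=["CUSTOM_TEMP_85"]))
```

In this model CUSTOM_TEMP_85 is just a label: a query can ask for
providers with or without it, but there is deliberately no way to ask
for "less than 85".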

> If we want to add further constraints to the placement allocation candidates 
> request that ask things like:
>
> * Does the provider have version 1.22.61821 of BIOS firmware from Marvell 
> installed on it?

That's a quality of the provider in a moment.

> * Does the provider support an FPGA that has had an OVS program flashed to it 
> in the last 20 days?

If you squint, so is this.

> * Does the provider belong to physical network "corpnet" and also support 
> creation of virtual NICs of type either "DIRECT" or "NORMAL"?

And these.

But at least some of them are dynamic rather than some kind of
platonic ideal associated with the resource provider.

I don't think placement should be concerned about temporal aspects
of traits. If we can't write a web service that can handle setting
lots of traits every second of every day, we should go home. If
clients of placement want to set weird traits, more power to them.

However, if clients of placement (such as nova) that act as the
orchestrator of resource providers manipulated by multiple systems
(neutron, cinder, ironic, cyborg, etc.) wish to set some constraints
on what traits can do and mean, then that is up to them.

nova-scheduler is the thing that is doing `GET
/allocation_candidates` for those multiple systems. It presumably
should have some say in what traits it is willing to express and
use.

But the placement service doesn't and shouldn't care.

> Then we should add a data model that allows providers to be decorated with 
> key/value (or more complex than key/value) information where we can query for 
> those kinds of constraints without needing to encode all sorts of non-binary 
> bits of information into a capability string.

Let's never do this, please. The three capabilities (ha!) of
placement that you listed above ("Does the...") are very powerful as
is and have a conceptual integrity that's really quite awesome. I
think keeping it contained and constrained in very "simple" concepts
like that was a stroke of genius you (Jay) made and I'd hope we can
keep it clean like that.

If we weren't a multiple-service oriented system, and instead had
some kind of k8s-like etcd-like
keeper-of-all-the-info-about-everything, then sure, having what we
currently model as resource providers be a giant blob of metadata
(with quantities, qualities, and key-values) that is an authority
for the entire system might make some kind of sense.

But we don't. If we wanted to migrate to having something like that,
using placement as the trojan horse for such a change, either with
intent or by accident, would be unfortunate.

> Propose such a thing and I'll gladly support it. But I won't support 
> bastardizing the simple concept of a boolean capability just because we don't 
> want to change the API or database schema.

For me, it is not a matter of not wanting to change the API or the
database schema. It's about not wanting to expand the concepts, and
thus the purpose, of the system. It's about wanting to keep focus
and functionality narrow so we can have a target which is "maturity"
and know when we're there.

My summary: Traits are symbols of up to 255 characters that are
associated with a resource provider. It's possible to query for
resource providers that have or do not have a specific trait. This
has the effect of making the meaning of a trait a descriptor of the
resource provider. What the descriptor signifies is up to the thing
creating and using the resource provider, not placement. We need to
harden that contract and stick to it. Placement is like a common
carrier: it doesn't care what's in the box.

/me cues brad pitt

-- 
Chris Dent                       ٩◔̯◔۶           https://anticdent.org/
freenode: cdent                                         tw: @anticdent

