[openstack-dev] [nova][placement] update_provider_tree design updates

Eric Fried openstack at fried.cc
Thu Mar 15 21:08:52 UTC 2018

Excellent and astute questions, both of which came up in the discussion,
but which I neglected to mention.  (I had to miss *something*, right?)

See inline.

On 03/15/2018 02:29 PM, Chris Dent wrote:
> On Thu, 15 Mar 2018, Eric Fried wrote:
>> One of the takeaways from the Queens retrospective [1] was that we
>> should be summarizing discussions that happen in person/hangout/IRC/etc.
>> to the appropriate mailing list for the benefit of those who weren't
>> present (or paying attention :P ).  This is such a summary.
> Thank you _very_ much for doing this. I've got two questions within.
>> ...which we discussed earlier this week in IRC [4][5].  We concluded:
>> - Compute is the source of truth for any and all traits it could ever
>> assign, which will be a subset of what's in os-traits, plus whatever
>> CUSTOM_ traits it stakes a claim to.  If an outside agent sets a trait
>> that's in that list, compute can legitimately remove it.  If an outside
>> agent removes a trait that's in that list, compute can reassert it.
> Where does that list come from? Or more directly how does Compute
> stake the claim for "mine"?

One piece of the list should come from the traits associated with the
compute driver capabilities [2].  Likewise anything else in the future
that's within compute but outside of virt.  In other words, we're
declaring that it doesn't make sense for an operator to e.g. set the
"has_imagecache" trait on a compute if the compute doesn't do that
itself.  The message being that you can't turn on a capability by
setting a trait.
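
The ownership rule itself reduces to set arithmetic.  A minimal sketch,
assuming the driver can produce both the full set of traits it owns and
the subset it wants asserted right now; reconcile_traits and the sample
trait names are made up for illustration and are not part of nova:

```python
def reconcile_traits(current, owned, desired):
    """Return the trait set compute should leave on the provider.

    current: traits on the provider right now (any agent may have set them)
    owned:   every trait compute could ever assign
    desired: the owned traits compute wants set at this moment
    """
    if not desired <= owned:
        raise ValueError("desired traits must be a subset of owned traits")
    # Traits outside compute's list are left alone; owned traits are
    # replaced wholesale by what compute currently reports.
    return (current - owned) | desired


# Hypothetical trait names, for illustration only.
owned = {"CUSTOM_LIVE_RESIZE_CAPABLE", "CUSTOM_HAS_IMAGECACHE"}
desired = {"CUSTOM_LIVE_RESIZE_CAPABLE"}
# An outside agent has set one owned trait and one trait of its own.
current = {"CUSTOM_HAS_IMAGECACHE", "CUSTOM_OPERATOR_GOLD"}
result = reconcile_traits(current, owned, desired)
```

Here the operator's own trait survives, the owned trait they set by hand
is removed, and the owned trait compute wants is (re)asserted.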

Beyond that, each virt driver is going to be responsible for figuring
out its own list.  Thinking this through with my PowerVM hat on, it
won't actually be as hard as it initially sounded - though it will
require more careful accounting.  Essentially, the driver is going to
ask the platform questions and get responses in its own language; then
map those responses to trait names.  So we'll be writing blocks like:

 if sys_caps.can_modify_io:
     provider_tree.add_trait(nodename, "CUSTOM_LIVE_RESIZE_CAPABLE")
 else:
     provider_tree.remove_trait(nodename, "CUSTOM_LIVE_RESIZE_CAPABLE")

And, for some subset of the "owned" traits, we should be able to
maintain a dict such that this works:

 for feature in trait_map:
     if feature in sys_features:
         provider_tree.add_trait(nodename, trait_map[feature])
     else:
         provider_tree.remove_trait(nodename, trait_map[feature])
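
To make the dict-driven loop concrete, here is a self-contained sketch.
FakeProviderTree is a hypothetical stand-in that only mimics the
add_trait/remove_trait calls used in this mail (it is not nova's
ProviderTree), and the feature and trait names are invented:

```python
class FakeProviderTree:
    """Hypothetical stand-in that only tracks traits per provider name."""

    def __init__(self):
        self.traits = {}

    def add_trait(self, name, trait):
        self.traits.setdefault(name, set()).add(trait)

    def remove_trait(self, name, trait):
        # Removing a trait that is not set is a no-op.
        self.traits.setdefault(name, set()).discard(trait)


# Platform feature name -> trait name (both invented for this sketch).
TRAIT_MAP = {
    "modify_io": "CUSTOM_LIVE_RESIZE_CAPABLE",
    "active_memory_sharing": "CUSTOM_ACTIVE_MEMORY_SHARING",
}


def sync_owned_traits(provider_tree, nodename, sys_features):
    """Assert each mapped trait the platform reports; retract the rest."""
    for feature, trait in TRAIT_MAP.items():
        if feature in sys_features:
            provider_tree.add_trait(nodename, trait)
        else:
            provider_tree.remove_trait(nodename, trait)


tree = FakeProviderTree()
tree.add_trait("node1", "CUSTOM_ACTIVE_MEMORY_SHARING")  # stale owned trait
tree.add_trait("node1", "CUSTOM_OPERATOR_GOLD")  # set by an outside agent
sync_owned_traits(tree, "node1", {"modify_io"})
```

The keys of the map double as the "owned" list for this subset, which is
what makes the remove branch possible.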

BUT what about *dynamic* features?  If I have code like (don't kill me):

 vendor_id_trait = 'CUSTOM_DEV_VENDORID_' + slugify(io_device.vendor_id)
 provider_tree.add_trait(io_dev_rp, vendor_id_trait)

...then there's no way I can know ahead of time what all those might be
(in particular, if I want to support new devices without updating my
code).  I.e. I *can't* write the corresponding
provider_tree.remove_trait(...) condition.  Maybe that never becomes a
real problem because we'll never need to remove a dynamic trait.  Or
maybe we can tolerate "leakage".  Or maybe we do something
clever-but-ugly with namespacing (if
trait.startswith('CUSTOM_DEV_VENDORID_')...).  We're consciously kicking
this can down the road.
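
For what it's worth, the namespacing option might look something like
the sketch below.  It assumes compute claims the entire
CUSTOM_DEV_VENDORID_ prefix, so anything an outside agent sets inside
that namespace gets clobbered.  The slugify here is a hypothetical
helper, and sync_vendor_traits is an invented name:

```python
import re

VENDOR_PREFIX = "CUSTOM_DEV_VENDORID_"


def slugify(value):
    """Hypothetical helper: uppercase, then map anything outside A-Z0-9 to _."""
    return re.sub(r"[^A-Z0-9]", "_", str(value).upper())


def sync_vendor_traits(current_traits, vendor_ids):
    """Return the traits to keep, given the vendor IDs present right now."""
    wanted = {VENDOR_PREFIX + slugify(v) for v in vendor_ids}
    # Keep everything outside the namespace untouched; inside it, keep
    # only traits backed by a device that is currently present.
    kept = {t for t in current_traits if not t.startswith(VENDOR_PREFIX)}
    return kept | wanted


current = {"CUSTOM_DEV_VENDORID_10DE", "CUSTOM_OPERATOR_GOLD"}
result = sync_vendor_traits(current, ["8086", "15b3"])
```

The trait for the device that went away is dropped without the driver
ever having known its name in advance.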

And note that this "dynamic" problem is likely to be a much larger
portion (possibly all) of the domain when we're talking about aggregates.

Then there's ironic, which is currently set up to get its traits blindly
from Inspector.  So Inspector not only needs to maintain the "owned
traits" list (with all the same difficulties as above), but it must also
either a) communicate that list to ironic virt so the latter can manage
the add/remove logic; or b) own the add/remove logic and communicate the
individual traits with a +/- on them so virt knows whether to add or
remove them.
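
Option b) could be as simple as the sketch below.  The "+TRAIT" /
"-TRAIT" string format and the function name are assumptions made up
here; nothing about the actual Inspector-to-virt interface has been
decided:

```python
def apply_signed_traits(current_traits, signed_traits):
    """Apply '+TRAIT' / '-TRAIT' entries to a set of traits.

    Traits that are not mentioned in signed_traits are left untouched.
    """
    result = set(current_traits)
    for item in signed_traits:
        if item[:1] == "+":
            result.add(item[1:])
        elif item[:1] == "-":
            # Retracting a trait that is not set is a no-op.
            result.discard(item[1:])
        else:
            raise ValueError("expected a leading + or -: %r" % (item,))
    return result


# Hypothetical trait names, for illustration only.
current = {"CUSTOM_RAID", "CUSTOM_OPERATOR_GOLD"}
result = apply_signed_traits(current, ["+CUSTOM_GPU", "-CUSTOM_RAID"])
```

With this shape, virt never needs the full "owned" list; it only acts on
what Inspector explicitly asserts or retracts.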

> How does an outside agent know what Compute has claimed? Presumably
> they want to know that so they can avoid wastefully doing something
> that's going to get clobbered?

Yup [11].  It was deemed that we don't need an API/CLI to discover those
lists (assuming that would even be possible).  The reasoning was:
- We'll document that there are traits "owned" by nova and attempts to
set/unset them will be frustrated.  You can't find out which ones they
are except when a manually-set/-unset trait magically dis-/re-appears.
- It probably won't be an issue because outside agents will be setting
traits based on some specific thing they want to do, and the
documentation for that thing will specify traits that are known not to
interfere with those in nova's wheelhouse.

> [2] https://review.openstack.org/#/c/538498/
