On Wed, 15 May 2019, Eric Fried wrote:
(NB: I'm explicitly rendering "no opinion" on several items below so you know I didn't miss/ignore them.)
I'm responding in this thread so that it's clear I'm not ignoring it. I don't have a strong opinion. I agree that availability of a trait in os-traits is not the same as nova reporting that trait when creating resource providers representing compute nodes. However, having something in os-traits that nobody is going to use is not without cost: Once something is in os-traits it must stay there forever. So if there's no pressing use case for these additions, maybe we just wait. Bit more within...
However, I'll state again for the record that vendor-specific "positive" traits (indicating "has mitigation", "not vulnerable", etc.) are nigh worthless for the Nova scheduling use case of "land me on a non-vulnerable host" because, until you can say required=in:HW_CPU_X86_INTEL_FIX,HW_CPU_X86_AMD_FIX, you would have to pick your CPU vendor ahead of time.
There's a spec for this, but it is currently on hold as there are neither immediate use cases demanding to be satisfied nor anyone to do the work. https://review.opendev.org/649992
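To make the any-of point concrete, here is a rough sketch in plain Python. It is not placement code: the provider names and the `match_all`/`match_any` helpers are made up for illustration, and the two trait names are the placeholder ones from the quoted message, not real os-traits entries.

```python
# Hypothetical inventory: provider name -> set of traits it reports.
# The trait names are the placeholders from the message above.
PROVIDERS = {
    "intel-host": {"HW_CPU_X86_INTEL_FIX"},
    "amd-host": {"HW_CPU_X86_AMD_FIX"},
    "old-host": set(),
}


def match_all(providers, required):
    """All-of semantics: a provider must carry every listed trait."""
    return sorted(name for name, traits in providers.items() if required <= traits)


def match_any(providers, any_of):
    """Any-of semantics (the 'in:' form): one listed trait is enough."""
    return sorted(name for name, traits in providers.items() if any_of & traits)


FIXES = {"HW_CPU_X86_INTEL_FIX", "HW_CPU_X86_AMD_FIX"}

all_result = match_all(PROVIDERS, FIXES)  # no host carries both vendor traits
any_result = match_any(PROVIDERS, FIXES)  # either vendor's fixed hosts qualify
print(all_result)
print(any_result)
```

With only all-of matching, asking for both vendor traits matches nothing, so the requester has to pick one vendor's trait up front; any-of matching returns the fixed hosts from both vendors.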
(Disclaimer: I'm a card-carrying "trait libertarian": freedom to do what makes sense with traits, as long as you're not hurting anyone and it's not costing the taxpayers.)
I guess that makes me a "trait anarcho communitarian". People should have the freedom to do what they like with traits and they aren't hurting anybody, but blessing a trait as official (by putting it in os-traits) is a strong signifier and has system-wide impacts that should be debated in ad-hoc committees endlessly until a consensus emerges which avoids anyone facepalming or rage quitting.
From a placement-the-service standpoint, it cares naught. It doesn't know what traits mean and cannot distinguish between official and custom traits when filtering candidates. It's important that placement be able to work easily with thousands or hundreds of thousands of traits. We very definitely do not want to make authorization decisions based on the value of a trait and the status of the requestor.
As said elsewhere by several folk: It's how the other services use them that matters.
I'm agnostic on nova reporting all the cpu flags/features/capabilities as traits. If it is going to do that, then having _those_ traits as members of os-traits is the right thing to do.
I'm less agnostic on users ever needing or wanting to be aware of specific cpu features in order to get a satisfactory workload placement. I want to be able to request high performance without knowing the required underlying features. Flavors + traits (which I don't have to understand) gets us that, so ... cool.
If we want to make scheduling decisions based on vulnerabilities, it needs to be under the exclusive control of the admin.
Others have said this (at least Dan): This seems like something where something other than nova ought to handle it. A host which shouldn't be scheduled to should be disabled (as a service).
-=-=-
This thread and several other conversations about traits and resource classes have made it pretty clear that the knowledge and experience required to make good decisions about what names should be in os-traits and os-resource-classes (and the form the names should take) is not exactly overlapping with what's required to be a core on the placement service.
How do people feel about the idea of forming a core group for those two repos that includes placement cores but has additions from nova (Dan, Kashyap and Sean would make good candidates) and other projects that consume them?
Having that group wouldn't remove the need for these extended conversations but would help make sure the right people were aware of changes and participating.
--
Chris Dent ٩◔̯◔۶ https://anticdent.org/ freenode: cdent