(NB: I'm explicitly rendering "no opinion" on several items below so you know I didn't miss/ignore them.)
The other day I casually noticed that the above file is missing some important CPU flags
I think this is noteworthy. These traits are being proposed because you casually noticed they were missing, not because someone asked for them. We can invent use cases, but without demand we may just be spinning our wheels.
So, theoretically there is scope for "exploiting" (but non-trivial) it is trivial all you would have to do is
I'm not a security guy, but I'm pretty sure it doesn't matter whether it's trivial; if it's possible at all, that's bad. That being the case, you don't even have to be able to target a vulnerable host for it to be a security problem. If my cloud is set up so that Joe Hacker is able to land his instance on a vulnerable host even by randomly trying, I done effed up already.
There's no consensus here. Some think that we should _not_ allow those CPU flags as traits which can 'allow' you to target vulnerable hosts.
For what it's worth, I'm in this camp and have said so in other places where we have been discussing it.
Yep, noted.
My position is that it's not harmful to add them to os-traits; it's whether/how they're used in nova that needs some thought.
Does the Security Team have any strong opinions?
Still hoping someone speaks up in this capacity...
If there is consensus on dropping those CPU-flags-as-traits that let you target vulnerable hosts, drop them. And add only those CPU flags as traits that provide either 'features' (what's the definition?) or those that reduce performance degradation.
My vote is for only adding traits for CPU features.
Noted; I'd like to hear other opinions. (And note that the word "feature" can get fuzzy in this context, I'll assume we're using it somewhat loosely to include things that help with reducing perf degradation, etc.)
I abstain. Once again, presence in os-traits is harmless; use by nova is subject to further discussion. But we also don't have any demand (that I'm aware of). However, I'll state again for the record that vendor-specific "positive" traits (indicating "has mitigation", "not vulnerable", etc.) are nigh worthless for the Nova scheduling use case of "land me on a non-vulnerable host" because, until you can say required=in:HW_CPU_X86_INTEL_FIX,HW_CPU_X86_AMD_FIX, you would have to pick your CPU vendor ahead of time.
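To make that last point concrete, here's a toy model in plain Python (sets, not placement code; the host names are made up and the trait names are the same hypothetical ones as above). It assumes a required trait list is ANDed, which is why listing both vendors' "fix" traits matches nothing:

```python
# Toy model of trait matching; this is not placement code.
# Host names and trait names are hypothetical placeholders.
hosts = {
    "intel-patched": {"HW_CPU_X86_INTEL_FIX"},
    "amd-patched": {"HW_CPU_X86_AMD_FIX"},
    "intel-unpatched": set(),
}


def candidates(hosts, required=frozenset()):
    """Return the hosts that have *every* required trait (AND semantics)."""
    return sorted(
        name for name, traits in hosts.items() if set(required) <= traits
    )


# Requiring both vendors' "fix" traits: no single host has both.
both = candidates(hosts, {"HW_CPU_X86_INTEL_FIX", "HW_CPU_X86_AMD_FIX"})

# Requiring one vendor's trait works, but you had to pick the vendor up front.
intel_only = candidates(hosts, {"HW_CPU_X86_INTEL_FIX"})
```

Here `both` comes back empty and `intel_only` contains just the patched Intel host.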
PCID is a CPU feature that was designed as a performance optimisation
I'm staying well away from the what-is-a-feature discussion, mainly out of ignorance.
Some think this is not "Nova's business", because: "just like how you don't want to stop based on CPU fan speed or temperature or firmware patch levels ...".
IMO this (cpu flags/features/attributes, and even possibly firmware patch levels, though probably not fan speed or temperature) is a perfectly suitable use of traits. Not all traits have to feed into Nova scheduling decisions; they could also be used by e.g. external orchestrators. os-traits needs to have that more global not-just-Nova perspective. (Disclaimer: I'm a card-carrying "trait libertarian": freedom to do what makes sense with traits, as long as you're not hurting anyone and it's not costing the taxpayers.)
Okay, "stopping" / "refusing to launch" is too strict and unreasonable; scratch that.
I agree with this, for all the reasons stated.
we can potentially make Nova check the 'sysfs' directory for vulnerabilities.
IMO this is still a good idea, but rather than warning / refusing to boot, we could expose a roll-up trait, subject to the strawman design below.

To summarize my position on the os-traits side of things:

- We can merge the feature-ish traits (assuming folks can agree on which ones those are).
- We can merge the vulnerability traits as long as they come with nice comments explaining the potential security pitfalls around using them.
- Or for all I care we can merge nothing, since we don't actually seem to have a demand for it.

==========================

I'm going to dive into Nova-land now.

The below would need a blueprint and a spec. And an owner. And it would be nice if it also had demand.

If we want to make scheduling decisions based on vulnerabilities, it needs to be under the exclusive control of the admin. As mentioned above, exposing the traits and allowing untrusted/untrustworthy users to target vulnerable hosts is only marginally worse than having those vulnerable hosts available to said untrusted users at all. So if we are going to have virt drivers expose a VULNERABLE trait in any form, it should come with:

1) a config option in the spirit of:

       [scheduler]
       allow_scheduling_to_vulnerable_hosts = $bool (default: False)

   which, when False, causes the scheduler to add trait:VULNERABLE=forbidden to *all* GET /a_c requests. But we should generalize this to:

   (a) Maintain a hardcoded list of traits that represent vulnerabilities or other undesirables
   (b) Have the conf option be [scheduler]evil_trait_whitelist
   (c) Add [trait:$X=forbidden for $X in {(a) - (b)}]

2) a hard check to disallow trait:$X=required from *anywhere* (flavor, image, etc.) regardless of the conf option. Either reject the boot request or explicitly strip that out.

For completeness, note that these traits need to be "negative" (i.e. "has vulnerability") so that we can forbid them in a list in the GET /a_c request. That's because required=!INTEL_VULNERABLE,!AMD_VULNERABLE will correctly avoid vulnerable hosts from either vendor, but required=INTEL_FIXED,AMD_FIXED won't land anywhere, and we don't have required=in:INTEL_FIXED,AMD_FIXED yet.

efried
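
P.S. A rough sketch of 1) and 2) above, in plain Python rather than real Nova code. Everything in it is made up for illustration: EVIL_TRAITS, the function names, and the idea of passing the flavor/image extra specs around as a flat dict. The trait names are the placeholders used in this mail, and I've picked the "strip it out" variant of 2) rather than rejecting the request.

```python
# Hypothetical sketch only; none of these names exist in Nova.

# The hardcoded list of traits that represent vulnerabilities or other
# undesirables (placeholder trait names).
EVIL_TRAITS = frozenset({"INTEL_VULNERABLE", "AMD_VULNERABLE"})


def forbidden_traits(whitelist):
    """Traits to forbid on every request: the hardcoded list minus
    anything the admin explicitly whitelisted."""
    return EVIL_TRAITS - frozenset(whitelist)


def strip_required_evil(extra_specs):
    """Drop trait:$X=required for any evil $X, regardless of the whitelist."""
    prefix = "trait:"
    return {
        key: value
        for key, value in extra_specs.items()
        if not (
            key.startswith(prefix)
            and key[len(prefix):] in EVIL_TRAITS
            and value == "required"
        )
    }


def apply_policy(extra_specs, whitelist=()):
    """Return a copy of extra_specs with both rules applied."""
    specs = strip_required_evil(extra_specs)
    for trait in sorted(forbidden_traits(whitelist)):
        specs["trait:" + trait] = "forbidden"
    return specs


# A flavor that tries to target a vulnerable host gets that request
# stripped, and (with an empty whitelist) both traits forbidden instead.
default_specs = apply_policy(
    {"trait:INTEL_VULNERABLE": "required", "hw:cpu_policy": "dedicated"}
)

# An admin who whitelists one trait stops forbidding just that one.
whitelisted_specs = apply_policy({}, whitelist=["AMD_VULNERABLE"])
```

Note that whitelisting a trait only stops it from being forbidden; asking for it as required is still stripped, which is the "regardless of the conf option" part.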