On Wed, May 15, 2019 at 11:49:03AM +0100, Sean Mooney wrote:
On Wed, 2019-05-15 at 11:24 +0200, Kashyap Chamarthy wrote:
[...]
Contention / unsolved question
------------------------------
Whether we should expose CPU flags (e.g. "SSBD", or "STIBP") that provide mitigation from CPU flaws as traits or not? It is a "policy" decision, and the 'traits' are "forever" (well, you can soft-deprecate them with a comment) once they're added, hence all the belaboring.
There's no consensus here. Some think that we should _not_ allow those CPU flags as traits which can 'allow' you to target vulnerable hosts.
For what it's worth, I'm in this camp and have said so in other places where we have been discussing it.
Yep, noted.
Some think it is okay to add these as granular CPU traits. (Have a gander at the discussion on this[2] change.)
Does the Security Team have any strong opinions?
[...]
Next steps
----------
If there is consensus on dropping those CPU-flags-as-traits that let you target vulnerable hosts, drop them. And add only those CPU flags as traits that provide either 'features' (what's the definition?) or those that reduce performance degradation.
My vote is for only adding traits for CPU features.
Noted; I'd like to hear other opinions. (And note that the word "feature" can get fuzzy in this context, I'll assume we're using it somewhat loosely to include things that help with reducing perf degradation, etc.)
PCID is a CPU feature that was designed as a performance optimisation
... except that "feature" was a 'no-op' and it wasn't even _used_, until Linux 4.14 enabled it (in November 2017) for Meltdown mitigation. So the presence of PCID in the hardware didn't matter one whit all these decades. (Source: http://archive.is/ma8Iw.)
and several generations later was also found to be useful in reducing the performance impact of the Spectre mitigation
Nit: Not Spectre, but Meltdown. [...]
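To make the two camps concrete, here is a minimal sketch of the "policy" being argued about: a filter that decides which CPU flags get turned into traits. Everything in it is an assumption for illustration. The two flag sets, the `flags_to_traits` helper, and the `expose_mitigations` switch are hypothetical, and the trait names merely follow the `HW_CPU_X86_` naming style rather than claiming to exist in os-traits.

```python
# Hypothetical policy sets. Which flag belongs in which set is exactly
# the open question in this thread; these are placeholders.
MITIGATION_ONLY_FLAGS = {"ssbd", "stibp"}
PERF_RELEVANT_FLAGS = {"pcid"}


def flags_to_traits(cpu_flags, expose_mitigations=False):
    """Map CPU flag names to illustrative trait names.

    Flags that only provide mitigation are skipped unless
    expose_mitigations is True. Flags in neither set are ignored.
    """
    traits = set()
    for flag in cpu_flags:
        flag = flag.lower()
        if flag in MITIGATION_ONLY_FLAGS and not expose_mitigations:
            continue
        if flag in MITIGATION_ONLY_FLAGS or flag in PERF_RELEVANT_FLAGS:
            # Illustrative trait name, not a confirmed os-traits constant.
            traits.add("HW_CPU_X86_" + flag.upper())
    return sorted(traits)
```

With `expose_mitigations=False` only the PCID-style traits come out; flipping it to `True` corresponds to the "granular CPU traits" position.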
Some think this is not "Nova's business", because: "just like how you don't want to stop based on CPU fan speed or temperature or firmware patch levels ...".
i think it applies perfectly.
It's a matter of scope. To be clear — I'm not "insisting" that it be done in Nova. Just thinking out loud. [...]
From a product perspective, vendors should ensure that they provide tooling and software updates that are secure by default.
"Product perspective" is irrelevant here. Of course, it's obvious that vendors "should" provide the relevant tooling and software updates.
But that argument doesn't quite apply, as CPU fan/speed are very different, and are not seen by the guest. If you take security seriously, it _is_ fair game, IMHO, to make Nova warn (then stop) launching instances on Compute hosts with vulnerable
Correcting myself: Okay, "stopping" / "refusing to launch" is too strict and unreasonable; scratch that. (Because, as discussed before, there _are_ valid cases to be made that certain admins/operators will intentionally run on vulnerable hypervisors, e.g. because their CPUs are too old to receive microcode updates. Or they may deliberately tolerate this risk, as they know their risk policy. Or they're running staging envs, or any number of other reasons.)
hypervisors.
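A rough sketch of what "warn, but don't refuse" could look like on a Compute host, assuming a Linux kernel that exposes `/sys/devices/system/cpu/vulnerabilities/`. This is not Nova code: the function names are made up for illustration, and treating only entries whose status starts with "Vulnerable" as unmitigated is a deliberate simplification.

```python
import logging
from pathlib import Path

LOG = logging.getLogger(__name__)

# sysfs directory where the kernel reports per-flaw mitigation status.
VULN_DIR = Path("/sys/devices/system/cpu/vulnerabilities")


def is_vulnerable(status):
    """Simplified check: the status line starts with 'Vulnerable'."""
    return status.strip().lower().startswith("vulnerable")


def find_unmitigated(vuln_dir=VULN_DIR):
    """Return {flaw_name: status} for entries reported as vulnerable."""
    if not vuln_dir.is_dir():
        # Older kernel or non-Linux host: nothing to report.
        return {}
    result = {}
    for entry in sorted(vuln_dir.iterdir()):
        try:
            status = entry.read_text().strip()
        except OSError:
            continue
        if is_vulnerable(status):
            result[entry.name] = status
    return result


def warn_if_vulnerable(vuln_dir=VULN_DIR):
    """Log a warning per unmitigated flaw; never refuse anything."""
    unmitigated = find_unmitigated(vuln_dir)
    for name, status in unmitigated.items():
        LOG.warning("Host CPU flaw %s is not mitigated: %s", name, status)
    return unmitigated
```

The point of returning the dict instead of raising is that the operator stays in charge: the tool reports, the admin decides.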
The same argument could be applied to QEMU or libvirt.
No, that argument does not apply to QEMU or libvirt. Why? QEMU and libvirt are low-level primitives. They explicitly state that they don't, and will not, make such "policy" decisions. But Nova, as a management tool, _does_ make some policy decisions (e.g. how we generate a libvirt guest XML based on certain criteria, and others). And in this case, Nova _can_ take a stance that "orchestration tools" should do that; that's perfectly acceptable.
[...]
-- 
/kashyap