Some operators have turned to attestation and system measurement to try to combat these sorts of vectors. However, if the TPM can't measure the firmware outside the in-band firmware channel, i.e. by reading the flash directly rather than trusting whatever malicious byte code might reply, then it is difficult to trust that mechanism. The positive is that this mainly leaves things like drives as the items at risk at this point. That is not exactly comforting, as the first firmware POC I can think of that spoofed the firmware check was against a SATA disk.
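To make the measurement point concrete, here is a minimal sketch of the comparison itself. It assumes the firmware bytes were obtained by reading the flash directly; how you get those bytes is the hard part and is not shown. The function names and the choice of SHA-256 are my own assumptions for illustration, not any vendor's or TPM's interface.

```python
import hashlib
import hmac


def measure(image_bytes):
    """Return the SHA-256 digest (hex) of a firmware image."""
    return hashlib.sha256(image_bytes).hexdigest()


def matches_known_good(image_bytes, expected_hex):
    """Compare a measured image against a digest recorded from a trusted copy.

    The result is only as trustworthy as the channel that produced
    image_bytes: if the device's own firmware supplied them, a malicious
    image can simply hand back the bytes of the original.
    """
    return hmac.compare_digest(measure(image_bytes), expected_hex.lower())
```

The check is trivial; the trust problem lies entirely in where `image_bytes` comes from.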
We thought about that too - potential firmware corruption of NVMe drives, or changes to the configuration of drives that support NVMe namespaces, and undoing this upon reprovisioning of the server. Lots of things to think about. I'm not 100% sure how the firmware signature checks work, but it seems that they are done within the firmware itself, and not by a separate management processor inside the device. So we then have to deal with the potential flash of an older firmware version that did not have digital signature checks, which would open a channel to install anything the attacker wanted on that device.
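The downgrade concern can be sketched as a simple policy check. This assumes dotted numeric version strings and a known "first release that enforces signatures" per device model; both are assumptions for illustration, and real vendor version strings are often messier. Note that a check like this only helps if it runs somewhere the attacker cannot bypass, which is the open question above.

```python
def parse_version(text):
    """Turn a dotted numeric version such as '2.10.3' into (2, 10, 3)."""
    return tuple(int(part) for part in text.split("."))


def flash_allowed(candidate, minimum_signed):
    """Reject any image older than the first signature-enforcing release."""
    a = parse_version(candidate)
    b = parse_version(minimum_signed)
    # Pad the shorter tuple with zeros so '2.4' and '2.4.0' compare equal.
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b
```

For example, with a hypothetical minimum of "2.4.0", an image labeled "2.3.9" would be refused and "2.10.0" accepted.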
I know some operators have been pushing their vendors to provide an out-of-band mechanism to check and assert these things. In the meantime, they are performing in-band flashing upon each cleaning in the hope of scrubbing any malicious firmware a user may have installed. A number of operators have publicly stated they've taken this approach; however, it requires creating your own custom hardware manager to align with the hardware you have and the firmware versions you want/expect.
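As a rough illustration of the bookkeeping such a custom hardware manager needs, here is a standalone sketch. It is not the ironic-python-agent API; the Device type, the model names, and the image file names are all hypothetical. It reflashes every known device unconditionally, since a compromised device cannot be trusted to report its own version honestly.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Device:
    name: str   # e.g. "nvme0"
    model: str  # model string as discovered during inventory


# Hypothetical mapping from device model to the image the operator trusts.
TRUSTED_IMAGES = {
    "example-nvme-a": "nvme-a-1.4.2.bin",
    "example-sata-b": "sata-b-7.0.bin",
}


def plan_reflash(devices, trusted_images):
    """Return (to_flash, unknown) for one cleaning run.

    to_flash lists (device name, image) pairs to flash regardless of the
    version the device reports; unknown lists devices with no trusted
    image, which need operator attention before the node is reused.
    """
    to_flash = []
    unknown = []
    for dev in devices:
        image = trusted_images.get(dev.model)
        if image is None:
            unknown.append(dev.name)
        else:
            to_flash.append((dev.name, image))
    return to_flash, unknown
```

The mapping is the part that has to be maintained per environment, which is where most of the effort goes.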
Exactly - so quite an effort, and labor intensive.
I think this is a good topic for the baremetal SIG to discuss and push forward because, as Jay said, there is no silver bullet, and most of these patterns are highly customized to your environment, your hardware, and the attack vectors you're concerned about.
I think the answer is to keep the hardware as simple as possible - meaning no internal drives or other cards that could be modified. It would actually be nice if machines had a "loadable BIOS firmware" from external media, where every time the machine booted, the BIOS firmware would load from a trusted source (a locally attached drive, connected directly to the BIOS chip) - and maybe the same for BMC firmware. BIOS firmware already loads a shadow copy of itself into memory - why not load it from external media instead somehow, somewhat like what UEFI firmware provides for BIOS configuration data? This strategy leaves the hardware in a "bare" state with no software, so resetting the device would always return it to a clean state. I'll have to look for the baremetal SIG and participate. Thanks for pointing it out! Eric