On Wed, Jan 20, 2021 at 9:56 AM Dmitry Tantsur <dtantsur@redhat.com> wrote:
Hi all,
Now that we've gained some experience with using Redfish virtual media I'd like to reopen the discussion about $subj. For the context, the idrac-redfish-virtual-media boot interface appeared because Dell machines need an additional action [1] to boot from virtual media. The initial position on hardware interfaces was that anything requiring OEM actions must go into a vendor hardware interface. I would like to propose relaxing this (likely unwritten) rule.
You see, this distinction causes a lot of confusion. Ironic supports Redfish, ironic supports iDRAC, iDRAC supports Redfish, ironic supports virtual media, Redfish supports virtual media, iDRAC supports virtual media. BUT! You cannot use redfish-virtual-media with iDRAC. Just today I had to explain the cause of it to a few people. It required diving into how exactly Redfish works and how exactly ironic uses it, which is something we want to protect our users from.
Wow! Now I’m confused, too. AFAIU, the people you had to help decided to use the redfish driver, instead of the idrac driver. It is puzzling that they decided to do that considering the ironic driver composition reform [1] of a couple years ago. Recall that reform allows “having one vendor driver with options configurable per node instead of many drivers for every vendor” and had the following goals. “- Make vendors in charge of defining a set of supported interface implementations in priority order. - Allow vendors to guarantee that unsupported interface implementations will not be used with hardware types they define. This is done by having a hardware type list all interfaces it supports.” The idrac driver is Dell Technologies’ vendor driver for systems with an iDRAC. It offers a one-stop shop for using ironic to manage its systems. Users can select among the hardware interfaces it supports. Each interface uses a single management protocol -- Redfish, WS-Man, and soon IPMI [2] -- to communicate with the BMC. While it supports the idrac-redfish-virtual-media boot interface, it does not support redfish-virtual-media. One cannot configure a node with the idrac driver to use redfish-virtual-media.
We already have a precedent [2] of adding vendor-specific handling to a generic driver.
That change was introduced about a month ago in the community’s vendor-independent ipmi driver. That was very understandable, since IPMI is a very mature management protocol and was introduced over 22 years ago. I cannot remember what I was doing back then :) As one would expect, the ipmi driver has experienced very little change over the past two-plus years. I count roughly two (2) substantive changes over that period. By contrast, the Redfish protocol is just over five (5) years old. Its vendor-independent driver, redfish, has been fertile ground for adding new, advanced features, such as BIOS settings configuration, firmware update, and RAID configuration, and fixing bugs. It fosters lots of change, too many for me to count.
I have proposed a patch [3] to block using redfish- virtual-media for Dell hardware, but I grew to dislike this approach. It does not have precedents in the ironic code base and it won't scale well if we have to handle vendor differences for vendors that don't have ironic drivers.
Dell understands and is on board with ironic’s desire that vendors support the full functionality offered by the vendor-independent redfish driver. If the iDRAC is broken with regards to redfish-virtual-media, then we have a vested interest in fixing it. While that is worked, an alternative approach could be for our community to strengthen its promotion of the goals of the driver composition reform. That would leverage ironic’s long-standing ability to ensure people only use hardware interfaces which the vendor and its driver support.
Based on all this I suggest relaxing the rule to the following: if a feature supported by a generic hardware interface requires additional actions or has a minor deviation from the standard, allow handling it in the generic hardware interface. Meaning, redfish-virtual-media starts handling the Dell case by checking the System manufacturer (via the recently added detect_vendor call) and loading the OEM code if it matches "Dell". After this idrac-redfish-virtual-media will stay empty (for future enhancements and to make the patch backportable).
That would cause the vendor-independent redfish driver to become dependent on sushy-oem-idrac, which is not under ironic governance. It is worth pointing out the sushy-oem-idrac library is necessary to get virtual media to work with Dell systems. It was first created for that purpose. It is not a workaround like those in sushy, which accommodate common, minor standards interpretation and implementation differences across vendors by sprinkling a bit of code here and there within the library, unbeknownst to ironic proper. We at Dell Technologies are concerned that the proposed rule change would result in a greater code review load on the ironic community. Since vendor-specific code would be in the generic hardware interface, much more care, eyes, and integration testing against physical hardware would be needed to ensure it does not break others. And our community is already concerned about its limited available review bandwidth [3]. Generally speaking, the vendor third-party CIs do not cover all drivers. Rather, each vendor only tests its own driver, and, in some cases, sushy. Therefore, changes to the vendor-independent redfish driver may introduce regressions in what has been working with various hardware and not be detected by automated testing before being merged. Can we afford this additional review load, prospective slowing down of innovation with Redfish, and likely undetected regressions? Would that be best for our users when we could fix the problem in other ways, such as the one suggested above? Also consider that feedback to the DMTF to drive vendor consistency is critical, but the DMTF needs feedback on what is broken in order to push others to address a problem. Remember the one-time boot debacle when three vendors broke at the same time? Once folks went screaming to the DMTF about the issue, it quickly explained it to member companies, clarified the standard, and created a test case for that condition. Changing the driver model to accommodate everyone's variations will reduce that communication back to the DMTF, meaning the standard stalls and interoperability does not gain traction.
Thoughts?
TL;DR, we strongly recommend ironic not make this rule change. Clearly communicating users should use the vendor driver should simplify their experience and eliminate the confusion. The code as-is is factored well as a result of the 21st century approach the community has taken to date. Vendors can implement the driver OEM changes they need to accommodate their unique hardware and BMC requirements, with reduced concern about the risk of breaking other drivers or ironic itself. Ironic’s driver composition reform, sushy, and sushy’s OEM extension mechanism support that modern approach. Our goal is to continue to improve the iDRAC Redfish service’s compliance with the standard and eliminate the kind of OEM code Dmitry identified. Beware of unintended consequences, including - reduced quality, - slowed feature and bug fix velocity, - stalled DMTF Redfish standard, - lost Redfish interoperability traction, and - increased code review load.
Dmitry
[1] https://opendev.org/openstack/ironic/src/commit/6ea73bdfbb53486cf9 905d21024d16cbf5829b2c/ironic/drivers/modules/drac/boot.py#L130 [2] https://review.opendev.org/c/openstack/ironic/+/757198/ [3] https://review.opendev.org/c/openstack/ironic/+/771619
-- Red Hat GmbH, https://de.redhat.com/ , Registered seat: Grasbrunn, Commercial register: Amtsgericht Muenchen, HRB 153243, Managing Directors: Charles Cachera, Brian Klemm, Laurie Krebs, Michael O'Neill
I welcome your feedback. Rick [1] https://opendev.org/openstack/ironic-specs/src/branch/master/specs/approved/... [2] https://storyboard.openstack.org/#!/story/2008528 [3] https://etherpad.opendev.org/p/ironic-wallaby-midcycle