[openstack-dev] [ironic] Summary of ironic sessions from Sydney

Michael Still mikal at stillhq.com
Thu Nov 23 01:45:05 UTC 2017

Thanks for this summary. I'd say the cinder-booted IPA is definitely of
interest to the operators I've met. Building new IPAs, especially when
trying to iterate on what drivers are needed, is a pain, so being able
to iterate faster would be very useful. That said, I guess this implies
booting more than one machine off a volume at once?


On Wed, Nov 15, 2017 at 3:18 AM, Julia Kreger <juliaashleykreger at gmail.com>
wrote:
> Greetings ironic folk!
> Like many other teams, we had very few ironic contributors make it to
> Sydney. As such, I wanted to go ahead and write up a summary that
> covers takeaways, questions, and obvious action items for the
> community that were raised by operators and users present during the
> sessions, so that we can use this as feedback to help guide our next
> steps and feature planning.
> Much of this is from my memory combined with notes on the various
> etherpads. I would like to explicitly thank NobodyCam for reading
> through this in advance to see if I was missing anything at a high
> level since he was present in the vast majority of these sessions, and
> dtantsur for sanity checking the content and asking for some
> elaboration in some cases.
> -Julia
> Ironic Project Update
> =====================
> Questions largely arose around use of boot from volume, including some
> scenarios we anticipated that would arise, as well as new scenarios
> that we had not considered.
> Boot nodes booting from the same volume
> ---------------------------------------
> From a technical standpoint, when BFV is used with iPXE chain loading,
> the chain loader reads the boot loader and related data from the
> cinder volume (or, realistically, any iSCSI volume). This means that a
> skilled operator is able to craft a specific volume that may just turn
> around and unpack a ramdisk and operate the machine solely from RAM,
> or that utilizes an NFS root.
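To make the chain-loading step concrete: the node ultimately receives a small iPXE script that sanboots the iSCSI target. Below is a rough sketch rendered as a hypothetical Python template helper; the function name and parameters are illustrative only, not ironic's actual template code.

```python
# Hypothetical sketch: build the kind of iPXE script that a chain
# loader could fetch for a boot-from-volume node.

def build_ipxe_bfv_script(target_iqn, portal_ip, portal_port=3260, lun=0):
    """Render an iPXE script that sanboots an iSCSI volume.

    iPXE's iSCSI URI format is:
      iscsi:<server>:<protocol>:<port>:<lun>:<target-iqn>
    (protocol is left empty for the TCP default).
    """
    uri = "iscsi:{ip}::{port}:{lun}:{iqn}".format(
        ip=portal_ip, port=portal_port, lun=lun, iqn=target_iqn)
    return "\n".join([
        "#!ipxe",
        # Boot from the SAN device; the volume itself carries the boot
        # loader, which may unpack a ramdisk or mount an NFS root.
        "sanboot {uri} || goto boot_fail".format(uri=uri),
        ":boot_fail",
        "echo Boot from SAN failed",
        "shutdown",
    ])
```

A crafted volume, as described above, would then carry whatever boot loader the operator placed on it.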
> This sort of technical configuration would not be something an average
> user would employ, but there are actual use cases among some
> large-scale deployment operators that would provide them value.
> Additionally, this topic and the desire for this capability also came
> up during the “Building a bare metal cloud is hard” talk Q&A.
> Action Item: Check the data model to see if we prohibit using the
> same volume across nodes, and if so, consider removing that
> prohibition.
> Cinder-less BFV support
> -----------------------
> Some operators are curious about booting ironic-managed nodes without
> cinder in a BFV context. This is something we anticipated, and we
> built the API and CLI interfaces to support it. Realistically, we just
> need to offer the ability for the data to be read and utilized.
> Action Item: Review code and ensure that we have some sort of no-op
> driver or method that allows cinder-less node booting. For existing
> drivers, it would be the shipment of the information to the BMC or the
> write-out of iPXE templates as necessary.
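As a sketch of what such a no-op driver could look like: it performs no cinder calls and simply exposes volume target records that the operator supplied out of band via the API. The class and method names here are hypothetical, loosely modeled on ironic's interface pattern, not actual ironic code.

```python
# Hedged sketch of a cinder-less ("external") storage interface.

class ExternalStorageInterface:
    """Storage interface for cinder-less boot-from-volume."""

    def __init__(self, volume_targets):
        # volume_targets: list of dicts the operator pre-populated,
        # e.g. iSCSI portal/IQN/LUN details.
        self.volume_targets = volume_targets

    def should_write_image(self, task=None):
        # Nothing to write; the volume already holds the image.
        return False

    def attach_volumes(self, task=None):
        # No cinder attachment to perform; just hand back the
        # pre-recorded connection info for boot template generation.
        return list(self.volume_targets)

    def detach_volumes(self, task=None):
        # Nothing to detach in the cinder-less case.
        pass
```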
> Boot IPA from a cinder volume
> -----------------------------
> With larger IPA images, specifically in cases where the image contains
> a substantial amount of utilities or tooling to perform cleaning,
> providing a mechanism to point the deployment ramdisk to a cinder
> volume would allow more efficient IO access.
> Action Item: Discuss further - specifically how we could support this,
> as we would need to better understand how some of the operators might
> use such functionality.
> Dedicated Storage Fabric support
> --------------------------------
> A question of dedicated storage fabric/networking support arose. Users
> of FibreChannel generally have a dedicated storage fabric by the very
> nature of separate infrastructure. However, with ethernet networking
> where iSCSI software initiators are used, or even possibly converged
> network adapters, things get a little more complex.
> Presently, with the iPXE boot from volume support, we boot using the
> same interface details for the neutron VIF that the node is attached
> with.
> Moving forward, with BFV, the concept was to support the use of
> explicitly defined interfaces as storage interfaces, which can be
> denoted in ironic as "volume connectors" with a type of "mac". In
> theory, we begin to get functionality along these lines once
> https://review.openstack.org/#/c/468353/ lands, as the user could
> define two networks, and the storage network should then fall to the
> explicit volume connector interface(s). The operator would just need
> to ensure that the settings being used on that storage network are
> such that the node can boot and reach the iSCSI endpoint, and that a
> default route is not provided.
> The question then may be: does ironic do this quietly for the user
> requesting the instance or not, and how do we document the use such
> that operators can conceptualize it? How do we make this work at a
> larger scale? How could this fit or not fit into multi-site
> deployments?
> In order to determine if there is more to do, we need to have more
> discussions with operators.
> Action items:
> * Determine overall needs for operators, since this is implementation
> architecture centric.
> * Plan the path forward from there, if it makes sense.
> Note: This may require more information to be stored or leveraged in
> terms of structural or location based data.
> Migration questions from classic drivers to Hardware types
> ----------------------------------------------------------
> One explicit question from the operator community was if we intended
> to perform a migration from classic drivers to hardware types. In a
> sense, there are two issues here: the first is the perceived amount of
> work involved; the second is whether there is a good way to cleanly
> identify and transform classic drivers during upgrade.
> Action item:
> * For whatever reason, the ironic community felt it was unnecessary to
> facilitate a migration for users from classic drivers to hardware
> types, even though we have direct analogs. The ironic community should
> re-evaluate and consider implementing migration logic to ease user
> migration.
> * In order to proceed, ironic does need to understand if operators
> would be okay with the upgrade process failing when the pre-upgrade
> checks detect that the configuration is incompatible for a migration
> to be successful. This would allow an operator to correct their
> configuration file and re-execute the upgrade attempt.
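To illustrate the second point, a pre-upgrade check could boil down to a mapping of classic driver names to their hardware type analogs, failing fast on anything it cannot map. A sketch follows; the mapping shown is a partial example for illustration, not the official or complete list.

```python
# Illustrative sketch of a classic-driver-to-hardware-type upgrade
# check. The mapping covers a few well-known analogs only.

CLASSIC_TO_HARDWARE_TYPE = {
    'pxe_ipmitool': 'ipmi',
    'agent_ipmitool': 'ipmi',
    'pxe_ilo': 'ilo',
    'pxe_drac': 'idrac',
}

def check_driver_migration(node_drivers):
    """Return (migratable, unknown) for a list of node driver names.

    migratable maps each recognized classic driver to its hardware
    type; unknown collects drivers the check cannot translate, which
    an upgrade tool could use to abort with a clear message so the
    operator can fix the configuration and retry.
    """
    migratable, unknown = {}, []
    for driver in node_drivers:
        if driver in CLASSIC_TO_HARDWARE_TYPE:
            migratable[driver] = CLASSIC_TO_HARDWARE_TYPE[driver]
        else:
            unknown.append(driver)
    return migratable, unknown
```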
> Ironic User Feedback Session
> ============================
> https://etherpad.openstack.org/p/SYD-forum-ironic-feedback
> The feedback session felt particularly productive because developers
> were far outnumbered by operators.
> Current Troubles/Not Working for Operators
> ------------------------------------------
> * The current RAID deployment process, where we generally apply RAID
> configuration during the cleaning step, prior to deploy.
> ** One of the proposed solutions was the marriage of traits, deploy
> templates, and the application of deployment templates upon
> deployment.
> ** The concern is that this will lead to an explosion of flavors, and
> some operators' environments are already extremely flavor-full. “I
> presently run `nova flavor-list`, and go get a coffee.”
> ** The mitigating factor will be the ability for the user initiating
> the deployment to propose additional traits on the command line at
> boot time. This was mentioned by Sam Betts, and one of the nova cores
> present indicated that it was part of their plan.
> * UEFI iPXE boot - specifically, some operators are encountering
> issues with some vendors' hardware that “should” be compatible but
> does not actually work except in specific scenarios.
> ** This is not an ironic bug.
> ** In the specific case that an operator reported, they were forced
> into use of a vendor driver and specific settings, which seemed like
> something they would have preferred to avoid.
> ** The community members, along with the users and operators present,
> agreed that a good solution would be to propose documentation updates
> to our repository that detail when drivers _do not_ work, or when
> there are weird compatibility issues that are not quite visible.
> ** It may be worth considering some sort of matrix to raise visibility
> of driver compatibility/interoperability moving forward. The ironic
> team would not push back if an operator wishes to begin updating our
> Admin documentation with such information.
> Action Items:
> * The community should encourage operators to submit documentation
> changes when they become aware of such issues.
> * The community should also encourage vendor driver maintainers to
> explicitly document their known-good/tested scenarios, as some
> hardware within the same family can vary.
> What Operators are indicating that they need
> --------------------------------------------
> Firmware Updates
> ~~~~~~~~~~~~~~~~
> Our present firmware update model is dependent upon a hardware manager
> driving the process during cleaning, which presently requires the
> hardware manager to be built inside the ramdisk image. This is
> problematic as it requires operators to craft and build hardware
> managers that fit their needs, and then ensure those are running on
> the specific hosts to upgrade their firmware.
> While this may seem easy and reasonable for a small deployment, there
> is an operations disconnect in many organizations between who blesses
> new firmware versions, and who controls the hardware. In some cases, a
> team may be in charge of certifying and testing new firmware, while
> another team entirely operates the cloud. These process and
> operational constraints also prevent hardware managers from being
> shared in the open, because they could potentially reveal the security
> state of a deployment. Simply put, operators need something easier,
> especially when they may receive twenty different chassis in a single
> year.
> While we discussed this as a group, we did seem to begin to reach an
> understanding of what would be useful.
> Several operators made it clear that they feel that Ironic is in a
> position to help drive standardization across vendors.
> What operators are looking for:
> * A framework or scaffolding to facilitate centrally managed firmware
> updates where the current state information is published upward, and
> the system replies with the firmware to be applied.
> ** Depending on the deployment, an operator may choose to assert
> firmware upon every cleaning, but they need to be able to identify the
> hardware, current firmware, and necessary versions by some sort of
> policy.
> ** Any version policy may vary across the infrastructure, based on
> either resource class, or hardware ownership concepts.
> ** This may, in itself, just be a hardware manager that calls out to
> an external service and executes based upon what the service tells it
> to do.
> * Ironic to work with vendors to encourage some sort of standardized
> packaging and/or installation process such that the firmware updating
> code can be as generic as possible.
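The "hardware manager calling out to an external service" idea above could be as small as the following sketch. The service URL, payload shape, and reply format are all assumptions for illustration; nothing here is an existing ironic or IPA API.

```python
# Sketch: publish current firmware state upward, apply whatever the
# policy service says. The transport is injectable so the decision
# logic can be exercised without a real service.

import json
import urllib.request

POLICY_URL = 'http://firmware-policy.example.com/v1/decide'  # hypothetical

def decide_firmware_action(inventory, fetch=None):
    """Ask a firmware policy service what (if anything) to apply.

    inventory: dict of {component: current_version}.
    fetch: injectable transport for testing; defaults to HTTP POST.
    The reply is expected to map components to target firmware,
    e.g. {'bmc': {'version': '2.61', 'image_url': '...'}}.
    """
    if fetch is None:
        def fetch(url, payload):
            req = urllib.request.Request(
                url, data=json.dumps(payload).encode(),
                headers={'Content-Type': 'application/json'})
            with urllib.request.urlopen(req) as resp:
                return json.load(resp)
    return fetch(POLICY_URL, {'inventory': inventory})
```

A hardware manager clean step could then call this during every cleaning and flash only the components the service names, matching the "assert firmware upon every cleaning" policy idea above.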
> One other note worth mentioning: some operators spoke of stress
> testing their hardware during cleaning processes. It seems like a
> logical thing to do; however, this would best be something for a few
> operators to explicitly propose - what they wish to test, and how they
> do it presently - so we as a community can gain a better understanding.
> Action Items:
> * Poll hardware vendors during the next weekly meeting and attempt to
> build an understanding and consensus.
> * With feedback, we will then have to take the next step of trying to
> determine how to fit such a service into ironic, along with what
> ironic's expectations are for drivers regarding firmware upgrades.
> TPM Key Assertion
> ~~~~~~~~~~~~~~~~~
> Some operators utilize their TPMs for secure key storage, and need a
> mechanism to update/assert a new key to overwrite existing key data.
> The key data in the TPM is used by the running system, and we have no
> operational need to store or utilize the data. Presently some
> operators perform this manually, and replacing keys on systems running
> elsewhere in the world is a human-intensive process.
> The consensus in the room was that this might be a good out of band
> management interface feature which could be in the management
> interfaces for the vendor drivers. We presently minimally use the
> management interface.
> From a security standpoint, this is also something we shouldn’t store
> locally; ironic should only be a clean pass-through conduit for the
> data, which makes explicitly out-of-band usage even more appealing
> with vendor drivers.
> Action Item: Poll hardware vendors during the next weekly meeting or
> two in order to begin discussion of viability/capability to support.
> This could be passthru functionality in the driver, but if two drivers
> can eventually support such functionality, we should standardize this
> upfront.
> Reversing the communications flow
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> One of the often discussed items in security conscious environments is
> the need to be able to have the conductor initiate communication to
> the agent. While this model will not work for all deployments, we've
> had a consensus in the past that this is something we should consider
> doing as an optional setting.
> In other words, IPA could have a mode of operation where it no longer
> heartbeats to the API; instead, the conductor would look up the
> address from Neutron and proceed to poll it until the node came
> online. The conductor would then poll that address on a regular basis,
> much like heart-beating works today. We should keep in mind that this
> polling operation will have an increased impact on conductor load.
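The conductor-side polling described above might look roughly like the following sketch. Names and intervals are illustrative; the injectable callable stands in for an HTTP check against the agent's address obtained from Neutron.

```python
# Rough sketch of the reversed flow: the conductor polls the agent
# until the ramdisk answers, instead of the agent heartbeating in.

import time

def poll_agent(check_alive, interval=5, timeout=300, sleep=time.sleep):
    """Poll until check_alive() returns True or timeout expires.

    check_alive: callable hitting e.g. GET http://<agent-ip>:<port>/
    Returns True if the agent came online, False on timeout.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if check_alive():
            return True
        sleep(interval)  # each sleeping poller adds conductor load
    return False
```

Note how the per-node polling loop makes the load concern above visible: every node awaiting its agent ties up a recurring check on the conductor.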
> Several operators present in the session expressed interest, while
> others indicated this would be a breaking change for their
> environment's security model, and as such any movement in this
> direction must be optional.
> Action Item: Someone should write a specification and poll the larger
> operators that we know in the community for thoughts, in order to see
> if it meets their needs.
> Documentation on known issues and fixes or incompatibilities
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> Operators would like to see more information on known driver issues or
> incompatibilities explicitly stated in the documentation. Additionally
> operators would like a single location for "how to fix this issue"
> that is not a bug tracking system.
> There seemed to be consensus that these are good things to document,
> and the community does not disagree. That being said, operators are
> the ones most likely to become aware of such issues first, and thus
> best placed to make us aware of them.
> The best way for the operator community to help the developers is to
> propose documentation patches to the ironic repository to raise
> awareness and visibility within our own documentation. We must keep
> in mind that we must curate this information as well, since some of
> these things are not necessarily “bugs”, much like the UEFI boot
> issues noted earlier.
> Automatic BIOS Configuration
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> tl;dr: We are working on it.
> Use of System UUID in addition to MAC addresses
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> Some operators would like to see changes in order to allow booting
> based upon a recorded system UUID. In these cases, the operator may be
> using the "noop" network interface driver, or another custom driver
> that does not involve neutron.
> The reasons to support UUID-based booting are extensive:
> * iPXE can attempt the system UUID first, so support for utilizing the
> UUID could remove several possible transactions.
> * The first ethernet device to iPXE may not be the one known to
> ironic, and may still obtain an IP address based upon the environment
> configuration/operation. This will largely be the case in an
> environment where DHCP is not managed by neutron. Presently, operators
> have to wait for the unknown interfaces to fail completely, and
> eventually reach a known network interface.
> ** Possibly worse, the order may not be consistent depending on the
> hardware boot order / switch configuration / cabling.
> ** Operators indicated that swapped cabling with Link Aggregates is a
> semi-common problem they encounter.
> * MAC addresses of nodes may just not be known, and evolving to
> support hard-coded UUIDs does provide us greater flexibility in terms
> of being able to boot a node where we do not know nor control the IP
> addressing assigned.
> In addition to UUIDs, an operator expressed interest in having the
> same boot behavior, but keyed on the allocated IP address as opposed
> to the UUID. This also deserves consideration as it may be useful for
> some operators.
> Action Items:
> * We should determine a method of storing the UUID, whether discovered
> or already known. Some operators may already know it.
> ** Suggestion from an operator: maybe just allow setting of the UUID
> when a node is created, like we do with ports, so that an operator or
> inspector could set the node UUID to be the same as the system's UUID,
> thus eliminating the need for another field.
> ** Ironic contributor: Alternatively, we should just add a boolean
> that writes it out and offers it as an initial step, and then falls
> back to the MAC address based attempt.
> * Update template generation to support writing a symlink with the
> UUID and / or MAC addresses.
> * Explore possibility of doing the same with IP addresses.
> Diskless boot
> ~~~~~~~~~~~~~
> This is a repeat theme that has arisen before, and in many cases could
> be solved via the BFV iPXE functionality; however, other operators
> have expressed need in the past for more generic boot options in order
> to boot their nodes. There have been some prior specifications on
> making generic PXE interfaces available for things such as network
> switches. As such, we should likely re-evaluate those specifications.
> Action Item: Ironic should re-evaluate current position, and review
> related specifications.
> Physical Location/Hardware Ownership
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> This one didn't quite make the notes, but a couple of attendees seem
> to remember it, and it is worth mentioning.
> Presently, there is no ability to have a geographically diverse ironic
> installation where a single pane of glass is provided by the API. To
> add further complexity, ironic may be in a situation where it is
> managing hardware that the operator might not explicitly own, that
> needs to be in a delineated pool. We presently have no way to
> represent or control scheduling in these cases.
> This quickly leads to tenant awareness, as in an operator may have a
> tenant that owns hardware in the datacenter. Naturally, this can get
> complex quite quickly, but it seems logical for many users, as they
> may have trusted users who wish to manually deploy hardware, and in
> many of those cases, it is desired that the hardware be used by no
> other tenant. This concept may also be extended to a concept of
> "authorized users" who have temporary implied rights to interact with
> ironic based upon permissions and current ownership.
> To keep this short, the impacts of this are _massive_, as they are
> intertwined fundamental changes to how ironic represents data at the
> API level as well as to how ironic executes requests. The end goal
> would be to provide the granularity to say "these two conductors are
> for x environment" or "you're only allowed to see x nodes". As a
> result of all of this, it would be a huge API change. The current
> concept is to just build upon the existing 1.0 API as a 2.0 API.
> Action Item: TheJulia and Nobodycam volunteer as tributes... to start a
> spec.
> Ironic On-boarding Session
> ==========================
> The Sydney summit was the first attempt by the ironic team to execute
> an on-boarding session for the community. As such, the intent was to
> take a free form approach as an attempt to answer questions that
> anyone had for community members, which would provide feedback into
> what new contributors might be interested in moving forward.
> By and large, the questions that were asked boiled down to the
> following:
> Where do I find the code?
> This was largely a question of what repository contains what pieces.
> How do I setup a test environment?
> This was very much a question of getting started, which led into the
> next logical question.
> How do I test without real physical servers?
> The answer became Devstack or Bifrost, depending on your use case and
> desire to perform full-stack development or lightweight work with or
> alongside ironic.
> Can I test using remote VMs?
> Overall the answer was yes, but you need to handle the networking
> yourself to bridge it through, and have some mechanism to control
> power. Ironic-staging-drivers was brought up as a repository that
> might have useful drivers in these cases. Ironic should look at
> improving some of our docs to highlight the possibilities.
> What alternatives to devstack are there?
> Bifrost was raised as an example. We failed to mention kolla as an option.
> :(
> How do we see community priorities?
> This was very easy for us, but for a new contributor coming into the
> community, it is not as clear. Ironic should consider improving
> documentation to make it very clear where to find this information.
> Action Items:
> * Some of Ironic's documentation for new contributors may need
> revision to provide some of these contextual details upfront, or we
> might need to consider a Q&A portion of the documentation.
> * The ironic community should ensure that the above questions are
> largely answered in whatever material is presented as part of the next
> on-boarding session at the Vancouver summit.
> Mogan/Nova/Ironic Session
> =========================
> https://etherpad.openstack.org/p/SYD-forum-baremetal-ironic-mogan-nova
> The purpose of this session was to help compare, contrast, and provide
> background to the community as to the purpose behind the Mogan
> project, which was to create a baremetal-centric, user-facing compute
> API allowing non-administrator users to provision and directly control
> baremetal. The baremetal in Mogan's context could be baremetal that
> already exists, or baremetal that is created from some sort of
> composable infrastructure.
> The Mogan PTL started with an overview to provide context to the
> community, and then the community shifted to asking questions to gain
> context, including polling operators for interest and concerns.
> Primarily, operator concern was over creating divergence and user
> confusion in the community.
> Once we had some operator input, we attempted to identify the
> differences and shortcomings in Ironic and Nova that primarily drove
> the effort. What we were able to identify from a work-in-progress
> comparison document was largely additional insight into aggregates,
> which was partly due to affinity/anti-affinity needs. Additional
> functionality exists in Mogan to list servers available for management
> and then directly act upon them, although the extent of what
> additional actions can be taken upon a baremetal node had not been
> identified.
> As the discussions went on, the ironic team members that were present
> were able to express our concerns over communication. It largely
> seemed to be a surprise that some of our hardware teams were working
> in the direction of composable hardware, and that the use model Mogan
> sought could fit into our scope and workflow for composable hardware.
> Largely, for composable hardware, we would need some way to represent
> a node that a user wishes to perform an action upon. In some cases
> now, that is performed with placeholder records representing possible
> capacity.
> Naturally for ironic, making it user-facing would be a very large
> change to ironic's API; however, based on other sessions, these are
> changes that ironic may wish to explore given stated operator needs.
> The discussion for both Ironic and Nova was more of a “how do we best
> navigate” question instead of an “if we should navigate” question,
> which in itself is positive.
> Some of these items included improving the view of available physical
> baremetal, regional/availability zoning, tenant utilization of the
> API, and possibly hardware ownership concepts. Many of these items, as
> touched on in the feedback session, are intertwined.
> Overall, the session was good in that we were able to gain consensus
> that the core issues which spurred the creation of Mogan are
> addressable by the present Ironic and Nova contributors. Complete
> gap/feature comparison remains as an outstanding item, which may still
> influence the discussion going forward.
> Baremetal Scheduling session
> ============================
> https://etherpad.openstack.org/p/SYD-forum-baremetal-scheduling
> We were originally hoping to cancel this session and redirect everyone
> into the nova placement status update, but we soon found out that
> there were some lingering questions as well as concerns of operators
> that needed to be discussed.
> We started out in discussion and came to the realization that there
> could very well be a trait explosion in the attempt to support
> affinity and anti-affinity efforts. While for baremetal it could be a
> side-effect, it does not line up with the nova model. Conceptually, we
> could quickly end up with trait lists that could look something like:
>     CUSTOM_Charlotte_DC3
>     CUSTOM_Charlotte_DC3_ROW2_CAB4
>     NET1GB
>     NET2GB
>     NET10GB
>     NET10GB_DUAL
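One practical wrinkle with lists like the above: placement requires custom traits to match `CUSTOM_[A-Z0-9_]+`, so mixed-case names would need upper-casing and unprefixed ones a `CUSTOM_` prefix (unless a standard os-traits name already covers them). A small normalization sketch, assuming no standard trait applies:

```python
# Sketch: normalize raw operator-supplied names into valid custom
# trait names for placement.

import re

CUSTOM_TRAIT_RE = re.compile(r'^CUSTOM_[A-Z0-9_]+$')

def normalize_custom_trait(name):
    """Upper-case, sanitize, and prefix a raw name into a custom trait."""
    trait = re.sub(r'[^A-Z0-9_]', '_', name.upper())
    if not trait.startswith('CUSTOM_'):
        trait = 'CUSTOM_' + trait
    if not CUSTOM_TRAIT_RE.match(trait):
        raise ValueError('cannot normalize %r into a custom trait' % name)
    return trait
```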
> At some point, someone remarked, “It seems like there is just no
> solution that is going to work for everyone by default.” The remark
> covered not just resource class determination and trait
> identification, but also scheduling affinity and anti-affinity, which
> repeatedly came up in discussions over the week.
> This quickly raised an operator desire for the ironic community to
> solve for what would fit 80% of use cases, and then iterate moving
> forward. The example brought up in discussion was to give operators an
> explicit configuration parameter that they could use to assert
> resource_class, or possibly even static trait lists until they can
> populate the node with what should be there for their deployment, or
> for that individual hardware installation in their environment.
> While the ironic community's solution is "write introspection rules",
> it seems operators just want something simpler that is a standing
> default, like an explicit default in the configuration file.
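What such a standing default could look like in practice, as a minimal sketch; the function, option name, and default value are assumptions, not an existing ironic configuration option:

```python
# Sketch: apply a configured default resource_class to any node that
# does not already set one, instead of requiring introspection rules.

def apply_default_resource_class(nodes, default='baremetal'):
    """Fill in resource_class on nodes that lack one; return the list."""
    for node in nodes:
        if not node.get('resource_class'):
            node['resource_class'] = default
    return nodes
```

Operators who can reconcile differences through their own processes, as noted below, would then only override the default where their environment actually diverges.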
> Some operators pointed out that with their processes, they would
> largely know or be able to reconcile the differences in their
> environment and make those in ironic as-needed.
> Eventually, the discussion shifted to affinity/anti-affinity, which
> could partially make use of tags, although, as previously detailed,
> that would quickly result in a tag explosion depending on how an
> operator implements and chooses to manage their environment.
> Action items:
> * Ironic needs to discuss as a group what the impact of this
> discussion means. Many of the themes beyond providing configurable
> defaults to meet the 80% of users have repeatedly come up, and really
> drive towards some of the things detailed as part of the feedback
> session.
> * For "resource_class" defaults, Dmitry was kind enough to create
> https://bugs.launchpad.net/ironic/+bug/1732190 as this does seem like
> a quick and easy thing for us to address.