[openstack-dev] [Ironic] Proposal to add a new repository

John Trowbridge trown at redhat.com
Mon Jun 22 12:37:00 UTC 2015


This is a proposal to add a new repository governed by the ironic
inspector subteam. The current repository is named ahc-tools[1]; however,
we are not attached to this name. "ironic-inspector-extra" would seem
to fit if this is moved under the Ironic umbrella.

What is AHC?
------------
* AHC as a term comes from the enovance edeploy installation method[2].
* The general concept is that we want to have a very granular picture of
the physical hardware being used in a deployment, so that we can match
specific hardware to specific roles and find poorly performing outliers
before we attempt to deploy.
* For example: As a cloud operator, I want to make sure all logical
disks have random read IOPS within 15% variance of each other.
* The huge benefit of this tooling over current inspection is the number
of facts collected (~1000 depending on the hardware), all of which can
be used for matching.
* Another example: As an end user, I would like to request a bare metal
machine with a specific model GPU.
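
As a rough sketch of the cloud operator example above (this is not code
from ahc-tools): the check below flags logical disks whose random read
IOPS deviate by more than 15% from the median of the group. The function
name, the input dict, the sample values, and the choice of the median as
the reference point are all assumptions made for illustration.

```python
from statistics import median


def find_iops_outliers(iops_by_disk, tolerance=0.15):
    """Return disks whose IOPS differ from the group median by more than
    `tolerance` (a fraction, so 0.15 means 15%).

    Hypothetical helper: the name, the input shape, and the use of the
    median as the reference are assumptions, not part of ahc-tools.
    """
    if not iops_by_disk:
        return {}
    reference = median(iops_by_disk.values())
    if reference == 0:
        # A relative deviation is undefined here, so report every
        # disk that measured anything other than zero.
        return {d: v for d, v in iops_by_disk.items() if v != 0}
    return {
        disk: iops
        for disk, iops in iops_by_disk.items()
        if abs(iops - reference) / reference > tolerance
    }


# Made-up sample measurements, keyed by logical disk name.
sample = {"sda": 1000, "sdb": 1040, "sdc": 980, "sdd": 700}
outliers = find_iops_outliers(sample)
print(outliers)
```

With these sample values only "sdd" falls outside the 15% band. Comparing
against the median rather than the mean keeps a single very slow disk
from dragging the reference point toward itself.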

What is ahc-tools?
------------------
* We first tried to place all of this logic into a plugin in
inspector[3] (discoverd at the time). [4]
* This worked fine for collecting some of the simple facts; however, it
coupled booting a ramdisk to matching against the collected data.
* ahc-tools started as a way to uncouple these two steps[5].
* We also added a wrapper around the enovance report tooling[6], as it
already had the ability to generate reports based on the collected data,
but was designed to read in the data from the filesystem.
* The report tool has two functions:
  * First, it can group the systems by category (NICs, Firmware,
    Processors, etc.).
  * Second, it can use statistical analysis to find performance outliers.
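
To illustrate the first function of the report tool, here is a minimal
sketch of grouping systems that share identical facts for one category.
It does not use the real report code or its data format; the function
name, the dict layout, and the fact strings are assumptions.

```python
from collections import defaultdict


def group_by_category(systems, category):
    """Group system names by their facts for one category.

    Hypothetical helper: `systems` maps a system name to a dict of
    category -> list of fact strings. Facts are sorted so that ordering
    differences do not split otherwise identical systems.
    """
    groups = defaultdict(list)
    for name, facts in systems.items():
        key = tuple(sorted(facts.get(category, ())))
        groups[key].append(name)
    return dict(groups)


# Made-up facts for three systems.
systems = {
    "node-1": {"nics": ["eth0:10G", "eth1:10G"], "firmware": ["bios:1.2"]},
    "node-2": {"nics": ["eth1:10G", "eth0:10G"], "firmware": ["bios:1.3"]},
    "node-3": {"nics": ["eth0:1G"], "firmware": ["bios:1.2"]},
}
nic_groups = group_by_category(systems, "nics")
firmware_groups = group_by_category(systems, "firmware")
```

Here node-1 and node-2 land in the same NIC group but in different
firmware groups, which is the kind of difference the grouping is meant
to surface.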

Why is ahc-tools useful to Ironic?
----------------------------------
* If we run benchmarks on hardware whenever it is turned back in by a
tenant, we can easily put nodes into maintenance if the hardware is
performing below some set threshold. This would allow us to have better
certainty that the end user is getting what we promised them.
* The advanced matching could also prove very useful. For VMs, I think
the pets vs cattle analogy holds up very well. However, many use cases
for cloud-based bare metal involve access to specific hardware
capabilities. I think advanced matching could help bridge this gap.
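
Both ideas above can be sketched in a few lines. This is only an
illustration under assumed data shapes: the fact keys ("gpu.model",
"disk.read_iops"), the node names, and both function names are
hypothetical, and actually putting a node into maintenance in Ironic is
left out.

```python
def match_nodes(nodes, required):
    """Return names of nodes whose facts equal every required key/value."""
    return sorted(
        name
        for name, facts in nodes.items()
        if all(facts.get(key) == value for key, value in required.items())
    )


def nodes_below_threshold(results, minimums):
    """Return names of nodes with any benchmark result under its minimum.

    Metrics that a node did not report are ignored rather than failed.
    """
    return sorted(
        name
        for name, metrics in results.items()
        if any(
            metric in metrics and metrics[metric] < floor
            for metric, floor in minimums.items()
        )
    )


# Made-up facts: find nodes carrying a specific GPU model.
nodes = {
    "node-1": {"gpu.model": "example-gpu-a", "cpu.cores": 16},
    "node-2": {"gpu.model": "example-gpu-b", "cpu.cores": 16},
    "node-3": {"cpu.cores": 32},
}
gpu_nodes = match_nodes(nodes, {"gpu.model": "example-gpu-a"})

# Made-up benchmark results: find candidates for maintenance.
results = {
    "node-1": {"disk.read_iops": 950, "mem.bandwidth_mbps": 9000},
    "node-2": {"disk.read_iops": 400, "mem.bandwidth_mbps": 9100},
}
to_maintenance = nodes_below_threshold(results, {"disk.read_iops": 800})
```

With this sample data, only node-1 matches the GPU request and only
node-2 falls below the benchmark minimum.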

Why not just put this code directly into inspector?
---------------------------------------------------
* Clearly this code is 100% dependent on inspector. However, inspector
is quite stable, and works great without any of this extra tooling.
* ahc-tools is very immature, and will need many breaking changes to
reach the same level of stability as inspector.

Why aren't you following the downstream->stackforge->openstack path?
--------------------------------------------------------------------
* This was the initial plan[7]; however, we were told that under the new
"big tent", the openstack namespace is no longer meant to signify the
maturity of a project.
* Instead, we were told we should propose the project directly to
Ironic, or make a new separate project.

What is the plan to make ahc-tools better?
------------------------------------------
* The first major overhaul we would like to do is to put the reporting
and matching functionality behind a REST API.
* Reporting in particular will require significant work, as the current
wrapper script wraps code that was never designed to be a library (its
output is just a series of print statements). One option is to improve
the library[8] to be more library-like, and the other is to reimplement
the logic itself. Personally, while reimplementing the library is a
large amount of work, I think it is probably worth the effort.
* We would also like to add an API endpoint to coordinate distributed
checks. For instance, we may want to confirm that there is physical
network connectivity between a set of nodes, or to confirm the
bandwidth of those connections.
* The distributed checks and REST API will hopefully be completed in the
Liberty timeframe.
* Overhaul of the reporting will likely be an M target, unless there is
interest from new contributors in working on this feature.
* We are planning a talk for Tokyo on inspector that will also include
details about this project.

Thank you very much for your consideration.

Respectfully,
John Trowbridge

[1] https://github.com/rdo-management/ahc-tools
[2] https://github.com/enovance/edeploy/blob/master/docs/AHC.rst
[3]
https://github.com/openstack/ironic-inspector/commit/22a0e24efbef149377ea1e020f2d81968c10b58c
[4] We can have out-of-tree plugins for the inspector, so some of this
code might become a plugin again, but within the new repository tree.
[5]
https://github.com/openstack/ironic-inspector/commit/eaad7e09b99ab498e080e6e0ab71e69d00275422
[6]
https://github.com/rdo-management/ahc-tools/blob/master/ahc_tools/report.py
[7] https://review.openstack.org/#/c/193392/
[8] https://github.com/enovance/hardware


