[Openstack-operators] [scientific] Ironic Summit recap - ops experiences

Stig Telfer stig.openstack at telfer.org
Thu May 12 04:22:21 UTC 2016

Hi All - 

Jim Rollenhagen from the Ironic project has just posted a great summit report of Ironic team activities on the openstack-dev mailing list [1]. It includes the item below, which will be of interest to Scientific WG members looking to work on bare metal activities this cycle:

> # Making ops less worse
> [Etherpad](https://etherpad.openstack.org/p/ironic-newton-summit-ops)
> We discussed some common failure cases that operators see, and how we
> can solve them in code.
> We discussed flaky BMCs, which end with the node in maintenance mode,
> and if Ironic can get them out of that mode automagically. We identified
> the need to distinguish between maintenance set by ironic and set by
> operators, and do things like attempt to connect to the BMC on a power
> state request, and turn off maintenance mode if successful. JayF is
> going to write a spec for this differentiation.
> Folks also expressed the desire to be able to reset the BMC via APIs. We
> have a BMC reset function in the vendor interface for the ipmitool
> driver; dtantsur volunteered to write a spec to promote that method to
> an official ManagementInterface method.
> We also talked for a while about stuck states. This has been mostly
> solved in code, but is still a problem for some deployers. We decided
> that we should not have a "reset-state" API like nova does, but rather a
> command line tool to handle this. lintan has volunteered to write a
> proposal for this; I have also posted some [straw man
> code](https://review.openstack.org/#/c/311273/) that someone is welcome
> to take over or use.
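
For anyone less familiar with the maintenance-mode item, here is a rough Python sketch of the idea as I read it from the recap: record who set maintenance, and on a power state request clear it only if the BMC answers and the service itself had set it. This is not Ironic code. The `Node` class, the `maintenance_set_by` field, the `SET_BY_*` markers and the `bmc_reachable` callback are all hypothetical names made up for illustration, and the eventual spec may well look different.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Hypothetical markers for who put the node into maintenance.
SET_BY_IRONIC = "ironic"
SET_BY_OPERATOR = "operator"


@dataclass
class Node:
    """Toy stand-in for a bare metal node record, not Ironic's real model."""
    name: str
    maintenance: bool = False
    maintenance_set_by: Optional[str] = None


def enter_maintenance(node: Node, set_by: str) -> None:
    """Put the node into maintenance and remember who did it."""
    node.maintenance = True
    node.maintenance_set_by = set_by


def on_power_state_request(node: Node,
                           bmc_reachable: Callable[[Node], bool]) -> bool:
    """Probe the BMC and adjust maintenance accordingly.

    Returns True if the BMC answered. Maintenance is cleared only when
    the service itself set it; an operator's flag is left untouched.
    """
    if not bmc_reachable(node):
        # Flaky or dead BMC: mark the node, unless it is already marked.
        if not node.maintenance:
            enter_maintenance(node, SET_BY_IRONIC)
        return False
    if node.maintenance and node.maintenance_set_by == SET_BY_IRONIC:
        node.maintenance = False
        node.maintenance_set_by = None
    return True
```

The point of the extra field is that a node an operator has deliberately taken out of service stays out of service, even when its BMC is perfectly healthy.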

The operator issues already identified cover some things we’ve hit at Cambridge. Please do scan through the list and contribute if there is anything it has not covered.

Best wishes,

[1] http://lists.openstack.org/pipermail/openstack-dev/2016-May/094658.html 

