[openstack-dev] [Ironic] Fuel agent proposal

Fox, Kevin M Kevin.Fox at pnnl.gov
Tue Dec 9 17:41:45 UTC 2014

We've been interested in Ironic as a replacement for Cobbler for some of our systems and have been kicking the tires a bit recently.

While initially I thought this thread was probably another "Fuel not playing well with the community" kind of thing, I'm not thinking that any more. It's deeper than that.

Cloud provisioning is great. I really REALLY like it. But one of the things that makes it great is the nice, pretty, cute, uniform, standard "hardware" the vm gives the user. Ideally, the physical hardware would behave the same. But, “No Battle Plan Survives Contact With the Enemy”. The sad reality is that hardware varies wildly from one machine to the next. Different drivers, different firmware, different different different.

One way the cloud enables this isolation is by forcing the cloud admins to install things and deal with the grungy hardware to make the interface nice and clean for the user. For example, if you want greater mean time between failures of nova compute nodes, you probably use a raid 1. Sure, it's kind of a pet kind of thing to do, but it's up to the cloud admin to decide what's "better": buying more hardware, or paying for more admin/user time. Extra hard drives are dirt cheap...

So, in reality Ironic is playing in a space somewhere between "I want to use cloud tools to deploy hardware, yay!" and "ewww... physical hardware's nasty. You have to know all these extra things and do all these extra things that you don't have to do with a vm"... I believe Ironic's going to need to be able to deal with this messiness in as clean a way as possible. But that's my opinion. If the team feels it's not a valid use case, then we'll just have to use something else for our needs. I really really want to be able to use heat to deploy whole physical distributed systems though.

Today, we're using software raid over two disks to deploy our nova compute. Why? We have some very old disks we recovered for one of our clouds and they fail often. nova-compute is pet enough to benefit somewhat from being able to swap out a disk without much effort. If we were to use Ironic to provision the compute nodes, we would need a way to do the same.

We're looking into ways of building an image that has a software raid set up in advance and expands it on boot. This requires each image to be customized for this case though. I can see Fuel not wanting to provide two different sets of images, "hardware raid" and "software raid", that have the same contents in them, with just different partitioning layouts... If we want users to not have to care about partition layout, this is also not ideal...
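
For concreteness, here is a rough sketch of the one step that has to happen somewhere (in the image's first boot or in an agent) for our two-disk case. The helper name and the device paths are made up for illustration. Only the mdadm options themselves are real, and the function just builds the command line without running it.

```python
from typing import List


def mdadm_raid1_command(md_device: str, members: List[str]) -> List[str]:
    """Build (but do not run) an mdadm command that mirrors the given partitions.

    Hypothetical helper: the device names passed in are assumptions about
    the target machine, not something Ironic provides today.
    """
    if len(members) < 2:
        raise ValueError("RAID 1 needs at least two member devices")
    return [
        "mdadm", "--create", md_device,
        "--level=1",
        f"--raid-devices={len(members)}",
        *members,
    ]


# Example: mirror the first partition of two (assumed) disks.
cmd = mdadm_raid1_command("/dev/md0", ["/dev/sda1", "/dev/sdb1"])
print(" ".join(cmd))
```

Whoever runs this has to know which disks exist on the node, which is exactly the kind of per-hardware knowledge that is awkward to bake into a generic image.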

Assuming Ironic can be convinced that these features really would be needed, perhaps the solution is a middle ground between the pxe driver and the agent?

* Associate partition information at the flavor level. The admin can decide the best partitioning layout for a given hardware... The user doesn't have to care any more. Two flavors for the same hardware could be "4 9's" or "5 9's" or something like that.
* Modify the agent to support a pxe style image in addition to full layout, and have the agent partition/setup raid and lay down the image into it.
* Modify the agent to support running grub2 at the end of deployment.

Or at least make the agent pluggable to support adding these options.
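
To make that a bit less hand-wavy, here is a minimal sketch of what I have in mind. Everything in it is an assumption for illustration: the flavor names, the Layout fields, and the step registry are hypothetical and are not part of Ironic or either agent. It only produces a plan of steps and never touches a disk.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Layout:
    """Disk layout an admin attaches to a flavor (hypothetical schema)."""
    disks: List[str]
    raid_level: Optional[int] = None   # None means no software RAID
    run_grub2: bool = False


# Hypothetical flavors: same hardware, different layouts.
FLAVOR_LAYOUTS: Dict[str, Layout] = {
    "bm.4nines": Layout(disks=["/dev/sda"]),
    "bm.5nines": Layout(disks=["/dev/sda", "/dev/sdb"], raid_level=1,
                        run_grub2=True),
}

# Registry that lets extra deploy steps be plugged in by name.
STEPS: Dict[str, Callable[[Layout], str]] = {}


def step(name: str):
    """Decorator that registers a deploy step under the given name."""
    def register(func: Callable[[Layout], str]) -> Callable[[Layout], str]:
        STEPS[name] = func
        return func
    return register


@step("partition")
def partition(layout: Layout) -> str:
    return "partition " + ", ".join(layout.disks)


@step("raid")
def raid(layout: Layout) -> str:
    return f"build raid{layout.raid_level} over " + ", ".join(layout.disks)


@step("write_image")
def write_image(layout: Layout) -> str:
    target = "/dev/md0" if layout.raid_level is not None else layout.disks[0]
    return f"write partition image to {target}"


@step("grub2")
def grub2(layout: Layout) -> str:
    return "install grub2 on " + ", ".join(layout.disks)


def plan_deploy(flavor: str) -> List[str]:
    """Return the ordered step descriptions for a flavor; nothing is executed."""
    layout = FLAVOR_LAYOUTS[flavor]
    names = ["partition"]
    if layout.raid_level is not None:
        names.append("raid")
    names.append("write_image")
    if layout.run_grub2:
        names.append("grub2")
    return [STEPS[n](layout) for n in names]


for line in plan_deploy("bm.5nines"):
    print(line)
```

The point of the registry is that the user only ever picks a flavor. The admin owns the layout, and a site that needs something unusual registers one more step instead of forking the agent.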

This does seem a bit backwards from the way the agent has been going. The pxe driver was kind of linux specific. The agent is not... So maybe that does imply a 3rd driver may be beneficial... But it would be nice to have one driver, the agent, in the end that supports everything.

Anyway, some things to think over.

From: Jim Rollenhagen [jim at jimrollenhagen.com]
Sent: Tuesday, December 09, 2014 7:00 AM
To: OpenStack Development Mailing List (not for usage questions)
Subject: Re: [openstack-dev] [Ironic] Fuel agent proposal

On Tue, Dec 09, 2014 at 04:01:07PM +0400, Vladimir Kozhukalov wrote:
> Just a short explanation of Fuel use case.
> Fuel use case is not a cloud. Fuel is a deployment tool. We install OS on
> bare metal servers and on VMs
> and then configure this OS using Puppet. We have been using Cobbler as our
> OS provisioning tool since the beginning of Fuel.
> However, Cobbler assumes using native OS installers (Anaconda and
> Debian-installer). For some reasons we decided to
> switch to image based approach for installing OS.
> One of Fuel features is the ability to provide advanced partitioning
> schemes (including software RAIDs, LVM).
> Native installers are quite difficult to customize in the field of
> partitioning
> (that was one of the reasons to switch to image based approach). Moreover,
> we'd like to implement even more
> flexible user experience. We'd like to allow user to choose which hard
> drives to use for root FS, for
> allocating DB. We'd like user to be able to put root FS over LV or MD
> device (including stripe, mirror, multipath).
> We'd like user to be able to choose which hard drives are bootable (if
> any), which options to use for mounting file systems.
> Many many various cases are possible. If you ask why we'd like to support
> all those cases, the answer is simple:
> because our users want us to support all those cases.
> Obviously, many of those cases can not be implemented as image internals,
> some cases can not be also implemented on
> configuration stage (placing root fs on lvm device).
> Since those use cases were rejected for implementation in IPA,
> we implemented the so-called Fuel Agent.

This is *precisely* why I disagree with adding this driver.

Nearly every feature that is listed here has been talked about before,
within the Ironic community. Software RAID, LVM, user choosing the
partition layout. These were rejected from IPA because they do not fit in
*Ironic*, not because they don't fit in IPA.

If the Fuel team can convince enough people that Ironic should be
managing pets, then I'm almost okay with adding this driver (though I
still think adding those features to IPA is the right thing to do).

// jim

> Important Fuel Agent features are:
> * It does not have REST API
> * It has executable entry point[s]
> * It uses local json file as its input
> * It is planned to implement ability to download input data via HTTP (kind
> of metadata service)
> * It is designed to be agnostic to input data format, not only Fuel format
> (data drivers)
> * It is designed to be agnostic to image format (tar images, file system
> images, disk images, currently fs images)
> * It is designed to be agnostic to image compression algorithm (currently
> gzip)
> * It is designed to be agnostic to image downloading protocol (currently
> local file and HTTP link)
> So, it is clear that being motivated by Fuel, Fuel Agent is quite
> independent and generic. And we are open for
> new use cases.
> As for Fuel itself, our nearest plan is to get rid of Cobbler because
> in the case of image based approach it is huge overhead. The question is
> which tool we can use instead of Cobbler. We need power management,
> we need TFTP management, we need DHCP management. That is
> exactly what Ironic is able to do. Frankly, we can implement power/TFTP/DHCP
> management tool independently, but as Devananda said, we're all working on
> the same problems,
> so let's do it together.  Power/TFTP/DHCP management is where we are
> working on the same problems,
> but IPA and Fuel Agent are about different use cases. This case is not just
> Fuel, any mature
> deployment case requires advanced partition/fs management. However, for me
> it is OK, if it is easily possible
> to use Ironic with external drivers (not merged to Ironic and not tested on
> Ironic CI).
> AFAIU, this spec https://review.openstack.org/#/c/138115/ does not assume
> changing Ironic API and core.
> Jim asked about how Fuel Agent will know about advanced disk partitioning
> scheme if API is not
> changed. The answer is simple: Ironic is supposed to send a link to
> metadata service (http or local file)
> where Fuel Agent can download input json data.
> As Roman said, we try to be pragmatic and suggest something which does not
> break anything. All changes
> are supposed to be encapsulated into a driver. No API and core changes. We
> have resources to support, test
> and improve this driver. This spec is just a zero step. Further steps are
> supposed to improve driver
> so as to make it closer to Ironic abstractions.
> For Ironic that means widening use cases and user community. But, as I
> already said,
> we are OK if Ironic does not need this feature.
> Vladimir Kozhukalov
> On Tue, Dec 9, 2014 at 1:09 PM, Roman Prykhodchenko <
> rprikhodchenko at mirantis.com> wrote:
> > It is true that IPA and FuelAgent share a lot of functionality in common.
> > However there is a major difference between them which is that they are
> > intended to be used to solve a different problem.
> >
> > IPA is a solution for provision-use-destroy-use_by_different_user use-case
> > and is really great for using it for providing BM nodes for other OS
> > services or in services like Rackspace OnMetal. FuelAgent itself serves for
> > provision-use-use-…-use use-case like Fuel or TripleO have.
> >
> > Those two use-cases require concentration on different details in first
> > place. For instance for IPA proper decommissioning is more important than
> > advanced disk management, but for FuelAgent priorities are opposite because
> > of obvious reasons.
> >
> > Putting all functionality to a single driver and a single agent may cause
> > conflicts in priorities and make a lot of mess inside both the driver and
> > the agent. Actually previously changes to IPA were blocked right because of
> > this conflict of priorities. Therefore replacing FuelAgent with IPA where
> > FuelAgent is used currently does not seem like a good option because some
> > people (and I’m not talking about Mirantis) might lose required features
> > because of different priorities.
> >
> > Having two separate drivers along with two separate agents for those
> > different use-cases will allow to have two independent teams that are
> > concentrated on what’s really important for a specific use-case. I don’t
> > see any problem in overlapping functionality if it’s used differently.
> >
> >
> > P. S.
> > I realise that people may be also confused by the fact that FuelAgent is
> > actually called like that and is used only in Fuel atm. Our point is to
> > make it a simple, powerful and what’s more important a generic tool for
> > provisioning. It is not bound to Fuel or Mirantis and if it will cause
> > confusion in the future we will even be happy to give it a different and
> > less confusing name.
> >
> > P. P. S.
> > Some of the points of this integration do not look generic enough or nice
> > enough. We look pragmatic on the stuff and are trying to implement what’s
> > possible to implement as the first step. For sure this is going to have a
> > lot more steps to make it better and more generic.
> >
> >
> > On 09 Dec 2014, at 01:46, Jim Rollenhagen <jim at jimrollenhagen.com> wrote:
> >
> >
> >
> > On December 8, 2014 2:23:58 PM PST, Devananda van der Veen <
> > devananda.vdv at gmail.com> wrote:
> >
> > I'd like to raise this topic for a wider discussion outside of the
> > hallway
> > track and code reviews, where it has thus far mostly remained.
> >
> > In previous discussions, my understanding has been that the Fuel team
> > sought to use Ironic to manage "pets" rather than "cattle" - and doing
> > so
> > required extending the API and the project's functionality in ways that
> > no
> > one else on the core team agreed with. Perhaps that understanding was
> > wrong
> > (or perhaps not), but in any case, there is now a proposal to add a
> > FuelAgent driver to Ironic. The proposal claims this would meet that
> > team's
> > needs without requiring changes to the core of Ironic.
> >
> > https://review.openstack.org/#/c/138115/
> >
> >
> > I think it's clear from the review that I share the opinions expressed in
> > this email.
> >
> > That said (and hopefully without derailing the thread too much), I'm
> > curious how this driver could do software RAID or LVM without modifying
> > Ironic's API or data model. How would the agent know how these should be
> > built? How would an operator or user tell Ironic what the
> > disk/partition/volume layout would look like?
> >
> > And before it's said - no, I don't think vendor passthru API calls are an
> > appropriate answer here.
> >
> > // jim
> >
> >
> > The Problem Description section calls out four things, which have all
> > been
> > discussed previously (some are here [0]). I would like to address each
> > one,
> > invite discussion on whether or not these are, in fact, problems facing
> > Ironic (not whether they are problems for someone, somewhere), and then
> > ask
> > why these necessitate a new driver be added to the project.
> >
> >
> > They are, for reference:
> >
> > 1. limited partition support
> >
> > 2. no software RAID support
> >
> > 3. no LVM support
> >
> > 4. no support for hardware that lacks a BMC
> >
> > #1.
> >
> > When deploying a partition image (eg, QCOW format), Ironic's PXE deploy
> > driver performs only the minimal partitioning necessary to fulfill its
> > mission as an OpenStack service: respect the user's request for root,
> > swap,
> > and ephemeral partition sizes. When deploying a whole-disk image,
> > Ironic
> > does not perform any partitioning -- such is left up to the operator
> > who
> > created the disk image.
> >
> > Support for arbitrarily complex partition layouts is not required by,
> > nor
> > does it facilitate, the goal of provisioning physical servers via a
> > common
> > cloud API. Additionally, as with #3 below, nothing prevents a user from
> > creating more partitions in unallocated disk space once they have
> > access to
> > their instance. Therefore, I don't see how Ironic's minimal support for
> > partitioning is a problem for the project.
> >
> > #2.
> >
> > There is no support for defining a RAID in Ironic today, at all,
> > whether
> > software or hardware. Several proposals were floated last cycle; one is
> > under review right now for DRAC support [1], and there are multiple
> > call
> > outs for RAID building in the state machine mega-spec [2]. Any such
> > support
> > for hardware RAID will necessarily be abstract enough to support
> > multiple
> > hardware vendors' driver implementations and both in-band creation (via
> > IPA) and out-of-band creation (via vendor tools).
> >
> > Given the above, it may become possible to add software RAID support to
> > IPA
> > in the future, under the same abstraction. This would closely tie the
> > deploy agent to the images it deploys (the latter image's kernel would
> > be
> > dependent upon a software RAID built by the former), but this would
> > necessarily be true for the proposed FuelAgent as well.
> >
> > I don't see this as a compelling reason to add a new driver to the
> > project.
> > Instead, we should (plan to) add support for software RAID to the
> > deploy
> > agent which is already part of the project.
> >
> > #3.
> >
> > LVM volumes can easily be added by a user (after provisioning) within
> > unallocated disk space for non-root partitions. I have not yet seen a
> > compelling argument for doing this within the provisioning phase.
> >
> > #4.
> >
> > There are already in-tree drivers [3] [4] [5] which do not require a
> > BMC.
> > One of these uses SSH to connect and run pre-determined commands. Like
> > the
> > spec proposal, which states at line 122, "Control via SSH access
> > feature
> > intended only for experiments in non-production environment," the
> > current
> > SSHPowerDriver is only meant for testing environments. We could
> > probably
> > extend this driver to do what the FuelAgent spec proposes, as far as
> > remote
> > power control for cheap always-on hardware in testing environments with
> > a
> > pre-shared key.
> >
> > (And if anyone wonders about a use case for Ironic without external
> > power
> > control ... I can only think of one situation where I would rationally
> > ever
> > want to have a control-plane agent running inside a user-instance: I am
> > both the operator and the only user of the cloud.)
> >
> >
> > ----------------
> >
> > In summary, as far as I can tell, all of the problem statements upon
> > which
> > the FuelAgent proposal are based are solvable through incremental
> > changes
> > in existing drivers, or out of scope for the project entirely. As
> > another
> > software-based deploy agent, FuelAgent would duplicate the majority of
> > the
> > functionality which ironic-python-agent has today.
> >
> > Ironic's driver ecosystem benefits from a diversity of
> > hardware-enablement
> > drivers. Today, we have two divergent software deployment drivers which
> > approach image deployment differently: "agent" drivers use a local
> > agent to
> > prepare a system and download the image; "pxe" drivers use a remote
> > agent
> > and copy the image over iSCSI. I don't understand how a second driver
> > which
> > duplicates the functionality we already have, and shares the same goals
> > as
> > the drivers we already have, is beneficial to the project.
> >
> > Doing the same thing twice just increases the burden on the team; we're
> > all
> > working on the same problems, so let's do it together.
> >
> > -Devananda
> >
> >
> > [0]
> > https://blueprints.launchpad.net/ironic/+spec/ironic-python-agent-partition
> >
> > [1] https://review.openstack.org/#/c/107981/
> >
> > [2]
> >
> > https://review.openstack.org/#/c/133828/11/specs/kilo/new-ironic-state-machine.rst
> >
> >
> > [3]
> >
> > http://git.openstack.org/cgit/openstack/ironic/tree/ironic/drivers/modules/snmp.py
> >
> > [4]
> >
> > http://git.openstack.org/cgit/openstack/ironic/tree/ironic/drivers/modules/iboot.py
> >
> > [5]
> >
> > http://git.openstack.org/cgit/openstack/ironic/tree/ironic/drivers/modules/ssh.py
> >
> >
> > ------------------------------------------------------------------------
> >
> > _______________________________________________
> > OpenStack-dev mailing list
> > OpenStack-dev at lists.openstack.org
> > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
> >
