[openstack-dev] [Ironic] Fuel agent proposal
Vladimir Kozhukalov
vkozhukalov at mirantis.com
Tue Dec 9 14:40:06 UTC 2014
On Tue, Dec 9, 2014 at 3:51 PM, Dmitry Tantsur <dtantsur at redhat.com> wrote:
> Hi folks,
>
> Thank you for additional explanation, it does clarify things a bit. I'd
> like to note, however, that you talk a lot about how _different_ Fuel Agent
> is from what Ironic does now. I'd like actually to know how well it's going
> to fit into what Ironic does (in addition to your specific use cases).
> Hence my comments inline:
>
> On 12/09/2014 01:01 PM, Vladimir Kozhukalov wrote:
>
>> Just a short explanation of Fuel use case.
>>
>> The Fuel use case is not a cloud. Fuel is a deployment tool. We install
>> an OS on bare metal servers and on VMs and then configure this OS using
>> Puppet. We have been using Cobbler as our OS provisioning tool since the
>> beginning of Fuel. However, Cobbler assumes using native OS installers
>> (Anaconda and Debian-installer). For several reasons we decided to
>> switch to an image-based approach for installing the OS.
>>
>> One of Fuel's features is the ability to provide advanced partitioning
>> schemes (including software RAIDs, LVM). Native installers are quite
>> difficult to customize when it comes to partitioning (that was one of
>> the reasons to switch to the image-based approach). Moreover, we'd like
>> to implement an even more flexible user experience. We'd like to allow
>> the user to choose which hard drives to use for the root FS and for
>> allocating the DB. We'd like the user to be able to put the root FS over
>> an LV or MD device (including stripe, mirror, multipath). We'd like the
>> user to be able to choose which hard drives are bootable (if any) and
>> which options to use for mounting file systems. Many other cases are
>> possible. If you ask why we'd like to support all those cases, the
>> answer is simple: because our users want us to support all those cases.
>> Obviously, many of those cases cannot be implemented as image internals,
>> and some cannot be implemented at the configuration stage either
>> (placing the root FS on an LVM device).
>>
>> Since implementing those use cases in terms of IPA was rejected, we
>> implemented the so-called Fuel Agent.
>> Important Fuel Agent features are:
>>
>> * It does not have REST API
>>
> I would not call it a feature :-P
>
> Speaking seriously, if your agent is a long-running thing and it gets its
> configuration from e.g. JSON file, how can Ironic notify it of any changes?
>
Fuel Agent is not a long-running service. Currently there is no need for a
REST API. If we deal with keep-alive or inventory/discovery functionality,
then we will probably add an API. Frankly, the IPA REST API is not REST at
all. However, that is not a reason to refuse to call it a feature and throw
it away. It is a reason to work on it and improve it. That is how I try to
look at things (pragmatically).

Fuel Agent has executable entry point[s] like /usr/bin/provision. You can
run this entry point with options (oslo.config) and point it to the input
JSON data. The idea is that Ironic will use an ssh connection (currently in
Fuel we use mcollective), run the entry point and wait for its exit code.
If the exit code is 0, provisioning is done. Extremely simple.
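
To make the "run over ssh and check the exit code" idea concrete, here is a
minimal sketch. It is only an illustration, not the actual Ironic driver
code: the --input_data_file option name, the root user and the helper names
are assumptions made for the example.

```python
import shlex
import subprocess


def build_provision_command(host, input_path, user="root"):
    """Build an ssh command line that runs the Fuel Agent entry point.

    The option name --input_data_file is an assumption for this sketch;
    check the agent's oslo.config options for the real name.
    """
    remote = "/usr/bin/provision --input_data_file " + shlex.quote(input_path)
    return ["ssh", f"{user}@{host}", remote]


def provision(host, input_path, run=subprocess.call):
    """Run the entry point remotely; True means the exit code was 0."""
    return run(build_provision_command(host, input_path)) == 0
```

The run parameter exists only so that the exit-code handling can be
exercised without a real node, for example provision("node-1",
"/tmp/provision.json", run=lambda cmd: 0).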
>> * It has executable entry point[s]
>> * It uses a local JSON file as its input
>> * It is planned to implement ability to download input data via HTTP
>> (kind of metadata service)
>> * It is designed to be agnostic to input data format, not only Fuel
>> format (data drivers)
>> * It is designed to be agnostic to image format (tar images, file system
>> images, disk images, currently fs images)
>> * It is designed to be agnostic to image compression algorithm
>> (currently gzip)
>> * It is designed to be agnostic to image downloading protocol (currently
>> local file and HTTP link)
>>
> Does it support Glance? I understand it's HTTP, but it requires
> authentication.
>
>
>> So, it is clear that, although motivated by Fuel, Fuel Agent is quite
>> independent and generic. And we are open to new use cases.
>>
> My favorite use case is hardware introspection (aka getting data required
> for scheduling from a node automatically). Any ideas on this? (It's not a
> priority for this discussion, just curious).
That is exactly what we do in Fuel. Currently we use a so-called 'Default'
pxelinux config, and all nodes being powered on are supposed to boot with a
so-called 'Bootstrap' ramdisk where an Ohai-based agent (not Fuel Agent)
runs periodically and sends a hardware report to the Fuel master node.
The user is then able to look at CPU, hard drive and network info and choose
which nodes to use for controllers, which for computes, etc. That is what
nova scheduler is supposed to do (look at hardware info and choose a
suitable node).
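
As a rough illustration of "look at hardware info and choose a suitable
node", the selection step could be sketched like this. The report fields
(name, cpus, disks_gb) are hypothetical and do not reflect the real format
of the Ohai-based report or of the nova scheduler filters.

```python
def pick_nodes(reports, min_cpus, min_disk_gb):
    """Return the names of nodes that meet the minimums, sorted by name.

    Each report is a dict with hypothetical keys:
    name (str), cpus (int) and disks_gb (list of disk sizes in GB).
    """
    return sorted(
        report["name"]
        for report in reports
        if report["cpus"] >= min_cpus
        and sum(report["disks_gb"]) >= min_disk_gb
    )
```

In Fuel the user makes this choice by hand; a scheduler would apply the same
kind of filter automatically.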
As for the future, we are planning to re-implement the inventory/discovery
functionality in terms of Fuel Agent (currently it is implemented as an
independent Ohai-based script). The estimate for that is March 2015.
>
>
>
>> As for Fuel itself, our nearest plan is to get rid of Cobbler because
>> with the image-based approach it is a huge overhead. The question is
>> which tool we can use instead of Cobbler. We need power management,
>> we need TFTP management, we need DHCP management. That is
>> exactly what Ironic is able to do. Frankly, we can implement a
>> power/TFTP/DHCP management tool independently, but as Devananda said,
>> we're all working on the same problems, so let's do it together.
>> Power/TFTP/DHCP management is where we are working on the same
>> problems, but IPA and Fuel Agent are about different use cases. This
>> case is not just Fuel; any mature deployment case requires advanced
>> partition/fs management.
>>
> Taking into consideration that you're doing a generic OS installation
> tool... yeah, it starts to make some sense. For cloud advanced partition is
> definitely a "pet" case.
Generic image based OS installation tool.
>
>> However, for me it is OK if it is easily possible
>> to use Ironic with external drivers (not merged to Ironic and not tested
>> on Ironic CI).
>>
>> AFAIU, this spec https://review.openstack.org/#/c/138115/ does not
>> assume changing Ironic API and core.
>> Jim asked about how Fuel Agent will know about advanced disk
>> partitioning scheme if API is not
>> changed. The answer is simple: Ironic is supposed to send a link to
>> metadata service (http or local file)
>> where Fuel Agent can download input json data.
>>
> That's not about not changing Ironic. Changing Ironic is ok for reasonable
> use cases - we do a huge change right now to accommodate zapping, hardware
> introspection and RAID configuration.
>
We propose minimal changes because we don't want to break anything. It is
clear how difficult it is to convince people to accept even minimal changes.
Again, it is just a pragmatic approach. We want to do things iteratively so
as not to break Ironic or Fuel. We just cannot change everything at once.
> I actually have problems with this particular statement. It does not sound
> like Fuel Agent will integrate enough with Ironic. This JSON file: who is
> going to generate it? In the most popular use case we're driven by Nova.
> Will Nova generate this file?
>
> If the answer is "generate it manually for every node", it's too much a
> "pet" case for me personally.
>
This is what the provision data looks like right now:
https://github.com/stackforge/fuel-web/blob/master/fuel_agent_ci/samples/provision.json
Do you still think it is written manually? Currently Fuel Agent works as a
part of the Fuel ecosystem. We have a service which serializes provision
data into JSON for us. Fuel Agent is agnostic to the data format (data
drivers). If someone wants to use another format, they are welcome to
implement a driver.

We assume the next step will be to put provision data (disk partition
scheme, maybe other data) into driver_info, make the Fuel Agent driver able
to serialize those data (in a special format), and implement a
corresponding data driver in Fuel Agent for this format. Again, very
simple. Maybe it is time to think of having an Ironic metadata service
(just maybe).
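
The data driver idea can be sketched roughly as follows. This is not Fuel
Agent's real driver interface: the registry, the "simple" format and its
keys are hypothetical and only show how one agent could accept several
input formats.

```python
import json

# Registry of data drivers: format name -> parser function (hypothetical).
DATA_DRIVERS = {}


def data_driver(name):
    """Decorator that registers a parser under the given format name."""
    def register(func):
        DATA_DRIVERS[name] = func
        return func
    return register


@data_driver("simple")
def parse_simple(raw):
    """Parse a made-up format into (device, mount point, size in MB) tuples."""
    data = json.loads(raw)
    return [(p["device"], p["mount"], p["size_mb"]) for p in data["partitions"]]


def load_scheme(driver_name, raw):
    """Pick the driver by name and turn raw input into a partition scheme."""
    if driver_name not in DATA_DRIVERS:
        raise ValueError("unknown data driver: %s" % driver_name)
    return DATA_DRIVERS[driver_name](raw)
```

Supporting another format (for example, data serialized from driver_info)
would then mean registering one more parser, with no change to the rest of
the agent.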
Another point is that currently Fuel stores hardware info in its own
database, but when it is possible to get those data from Ironic (when the
inventory functionality is implemented) we will be glad to use the Ironic
API for that. That is what I mean when I say 'to make Fuel stuff closer to
Ironic abstractions'.
>
>> As Roman said, we try to be pragmatic and suggest something which does
>> not break anything. All changes
>> are supposed to be encapsulated into a driver. No API and core changes.
>> We have resources to support, test
>> and improve this driver. This spec is just a zero step. Further steps
>> are supposed to improve driver
>> so as to make it closer to Ironic abstractions.
>>
> Honestly I think you should at least write a roadmap for it - see my
> comments above.
>
Honestly, I think writing a roadmap right now is not very rational, since
I am not even sure people are interested in widening Ironic use cases. Some
of the comments were not even constructive, like "I don't understand what
your use case is, please use IPA".
>
> About testing and support: are you providing a 3rdparty CI for it? It
> would be a big plus as to me: we already have troubles with drivers broken
> accidentally.
We are flexible here but I'm not ready to answer this question right now.
We will try to fit Ironic requirements wherever it is possible.
>
>
>> For Ironic that means widening use cases and user community. But, as I
>> already said,
>> we are OK if Ironic does not need this feature.
>>
> I don't think we should throw away your hardware provision use case, but
> I personally would like to see how well Fuel Agent is going to play with
> how Ironic and Nova operate.
>
Nova is not our case. Fuel is entirely about deployment. There is something
in common, though. As I already explained, currently we need Ironic's
power/TFTP/DHCP management capabilities. Again, it is not a problem to
implement this independently, as happened with Fuel Agent (because this use
case was rejected several months ago). Our suggestion is not about "let's
compete with IPA"; it is entirely about "let's work on the same problems
together".
>
>> Vladimir Kozhukalov
>>
>> On Tue, Dec 9, 2014 at 1:09 PM, Roman Prykhodchenko
>> <rprikhodchenko at mirantis.com> wrote:
>>
>> It is true that IPA and FuelAgent share a lot of functionality in
>> common. However there is a major difference between them which is
>> that they are intended to be used to solve a different problem.
>>
>> IPA is a solution for provision-use-destroy-use_by_different_user
>> use-case and is really great for using it for providing BM nodes for
>> other OS services or in services like Rackspace OnMetal. FuelAgent
>> itself serves for provision-use-use-…-use use-case like Fuel or
>> TripleO have.
>>
>> Those two use-cases require concentration on different details in
>> first place. For instance for IPA proper decommissioning is more
>> important than advanced disk management, but for FuelAgent
>> priorities are opposite because of obvious reasons.
>>
>> Putting all functionality to a single driver and a single agent may
>> cause conflicts in priorities and make a lot of mess inside both the
>> driver and the agent. Actually previously changes to IPA were
>> blocked right because of this conflict of priorities. Therefore
>> replacing FuelAgent with IPA where FuelAgent is currently used does
>> not seem like a good option because some people (and I’m not talking
>> about Mirantis) might lose required features because of different
>> priorities.
>>
>> Having two separate drivers along with two separate agents for those
>> different use-cases will allow to have two independent teams that
>> are concentrated on what’s really important for a specific use-case.
>> I don’t see any problem in overlapping functionality if it’s used
>> differently.
>>
>>
>> P. S.
>> I realise that people may be also confused by the fact that
>> FuelAgent is actually called like that and is used only in Fuel atm.
>> Our point is to make it a simple, powerful and what’s more important
>> a generic tool for provisioning. It is not bound to Fuel or Mirantis
>> and if it will cause confusion in the future we will even be happy
>> to give it a different and less confusing name.
>>
>> P. P. S.
>> Some of the points of this integration do not look generic enough or
>> nice enough. We look pragmatic on the stuff and are trying to
>> implement what’s possible to implement as the first step. For sure
>> this is going to have a lot more steps to make it better and more
>> generic.
>>
>>
>> On 09 Dec 2014, at 01:46, Jim Rollenhagen <jim at jimrollenhagen.com> wrote:
>>>
>>>
>>>
>>> On December 8, 2014 2:23:58 PM PST, Devananda van der Veen
>>> <devananda.vdv at gmail.com> wrote:
>>>
>>>> I'd like to raise this topic for a wider discussion outside of the
>>>> hallway track and code reviews, where it has thus far mostly remained.
>>>>
>>>> In previous discussions, my understanding has been that the Fuel team
>>>> sought to use Ironic to manage "pets" rather than "cattle" - and doing
>>>> so required extending the API and the project's functionality in ways
>>>> that no one else on the core team agreed with. Perhaps that
>>>> understanding was wrong (or perhaps not), but in any case, there is now
>>>> a proposal to add a FuelAgent driver to Ironic. The proposal claims this
>>>> would meet that team's needs without requiring changes to the core of
>>>> Ironic.
>>>>
>>>> https://review.openstack.org/#/c/138115/
>>>>
>>>
>>> I think it's clear from the review that I share the opinions
>>> expressed in this email.
>>>
>>> That said (and hopefully without derailing the thread too much),
>>> I'm curious how this driver could do software RAID or LVM without
>>> modifying Ironic's API or data model. How would the agent know how
>>> these should be built? How would an operator or user tell Ironic
>>> what the disk/partition/volume layout would look like?
>>>
>>> And before it's said - no, I don't think vendor passthru API calls
>>> are an appropriate answer here.
>>>
>>> // jim
>>>
>>>
>>>> The Problem Description section calls out four things, which have all
>>>> been discussed previously (some are here [0]). I would like to address
>>>> each one, invite discussion on whether or not these are, in fact,
>>>> problems facing Ironic (not whether they are problems for someone,
>>>> somewhere), and then ask why these necessitate a new driver be added to
>>>> the project.
>>>>
>>>>
>>>> They are, for reference:
>>>>
>>>> 1. limited partition support
>>>>
>>>> 2. no software RAID support
>>>>
>>>> 3. no LVM support
>>>>
>>>> 4. no support for hardware that lacks a BMC
>>>>
>>>> #1.
>>>>
>>>> When deploying a partition image (eg, QCOW format), Ironic's PXE deploy
>>>> driver performs only the minimal partitioning necessary to fulfill its
>>>> mission as an OpenStack service: respect the user's request for root,
>>>> swap, and ephemeral partition sizes. When deploying a whole-disk image,
>>>> Ironic does not perform any partitioning -- such is left up to the
>>>> operator who created the disk image.
>>>>
>>>> Support for arbitrarily complex partition layouts is not required by,
>>>> nor does it facilitate, the goal of provisioning physical servers via a
>>>> common cloud API. Additionally, as with #3 below, nothing prevents a
>>>> user from creating more partitions in unallocated disk space once they
>>>> have access to their instance. Therefore, I don't see how Ironic's
>>>> minimal support for partitioning is a problem for the project.
>>>>
>>>> #2.
>>>>
>>>> There is no support for defining a RAID in Ironic today, at all, whether
>>>> software or hardware. Several proposals were floated last cycle; one is
>>>> under review right now for DRAC support [1], and there are multiple
>>>> call outs for RAID building in the state machine mega-spec [2]. Any such
>>>> support for hardware RAID will necessarily be abstract enough to support
>>>> multiple hardware vendors' driver implementations and both in-band
>>>> creation (via IPA) and out-of-band creation (via vendor tools).
>>>>
>>>> Given the above, it may become possible to add software RAID support to
>>>> IPA in the future, under the same abstraction. This would closely tie
>>>> the deploy agent to the images it deploys (the latter image's kernel
>>>> would be dependent upon a software RAID built by the former), but this
>>>> would necessarily be true for the proposed FuelAgent as well.
>>>>
>>>> I don't see this as a compelling reason to add a new driver to the
>>>> project. Instead, we should (plan to) add support for software RAID to
>>>> the deploy agent which is already part of the project.
>>>>
>>>> #3.
>>>>
>>>> LVM volumes can easily be added by a user (after provisioning) within
>>>> unallocated disk space for non-root partitions. I have not yet seen a
>>>> compelling argument for doing this within the provisioning phase.
>>>>
>>>> #4.
>>>>
>>>> There are already in-tree drivers [3] [4] [5] which do not require a
>>>> BMC. One of these uses SSH to connect and run pre-determined commands.
>>>> Like the spec proposal, which states at line 122, "Control via SSH
>>>> access feature intended only for experiments in non-production
>>>> environment," the current SSHPowerDriver is only meant for testing
>>>> environments. We could probably extend this driver to do what the
>>>> FuelAgent spec proposes, as far as remote power control for cheap
>>>> always-on hardware in testing environments with a pre-shared key.
>>>>
>>>> (And if anyone wonders about a use case for Ironic without external
>>>> power control ... I can only think of one situation where I would
>>>> rationally ever want to have a control-plane agent running inside a
>>>> user-instance: I am both the operator and the only user of the cloud.)
>>>>
>>>>
>>>> ----------------
>>>>
>>>> In summary, as far as I can tell, all of the problem statements upon
>>>> which the FuelAgent proposal is based are solvable through incremental
>>>> changes in existing drivers, or out of scope for the project entirely.
>>>> As another software-based deploy agent, FuelAgent would duplicate the
>>>> majority of the functionality which ironic-python-agent has today.
>>>>
>>>> Ironic's driver ecosystem benefits from a diversity of
>>>> hardware-enablement drivers. Today, we have two divergent software
>>>> deployment drivers which approach image deployment differently: "agent"
>>>> drivers use a local agent to prepare a system and download the image;
>>>> "pxe" drivers use a remote agent and copy the image over iSCSI. I don't
>>>> understand how a second driver which duplicates the functionality we
>>>> already have, and shares the same goals as the drivers we already have,
>>>> is beneficial to the project.
>>>>
>>>> Doing the same thing twice just increases the burden on the team; we're
>>>> all working on the same problems, so let's do it together.
>>>>
>>>> -Devananda
>>>>
>>>>
>>>> [0] https://blueprints.launchpad.net/ironic/+spec/ironic-python-agent-partition
>>>>
>>>> [1] https://review.openstack.org/#/c/107981/
>>>>
>>>> [2] https://review.openstack.org/#/c/133828/11/specs/kilo/new-ironic-state-machine.rst
>>>>
>>>> [3] http://git.openstack.org/cgit/openstack/ironic/tree/ironic/drivers/modules/snmp.py
>>>>
>>>> [4] http://git.openstack.org/cgit/openstack/ironic/tree/ironic/drivers/modules/iboot.py
>>>>
>>>> [5] http://git.openstack.org/cgit/openstack/ironic/tree/ironic/drivers/modules/ssh.py
>>>>
>>>>
>>>> ------------------------------------------------------------------------
>>>>
>>>> _______________________________________________
>>>> OpenStack-dev mailing list
>>>> OpenStack-dev at lists.openstack.org
>>>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>>>>