Open Stack

Thu Jan 16 15:23:34 UTC 2014

On 15.1.2014 14:07, James Slagle wrote:
> I'll start by laying out how I see editing or updating nodes working
> in TripleO without Tuskar:
>
> To do my initial deployment:
> 1.  I build a set of images for my deployment for different roles. The
> images are different based on their role, and only contain the needed
> software components to accomplish the role they intend to be deployed.
> 2.  I load the images into glance
> 3.  I create the Heat template for my deployment, likely from
> fragments that are already available. Set quantities, indicate which
> images (via image uuid) are for which resources in heat.
> 4.  heat stack-create with my template to do the deployment
>
> To update my deployment:
> 1.  If I need to edit a role (or create a new one), I create a new image.
> 2.  I load the new image(s) into glance
> 3.  I edit my Heat template, update any quantities, update any image uuids, etc.
> 4.  heat stack-update my deployment
>
> In both cases above, I see the role of Tuskar being around steps 3 and 4.

+1. Although it's worth noting that if we want zero-downtime updates,
we'll probably need the ability to migrate content off the machines being
updated - that would be a pre-3 step. (And for that we need spare
capacity equal to the number of nodes being updated, so we'll probably
want to do updating in chunks in the future, not the whole overcloud at
once.)
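
A minimal sketch of the chunking idea, assuming the only constraint is
that a batch can never be larger than the spare capacity we have to
migrate workloads onto. The function name and the node names are made up
for illustration; nothing like this exists in Tuskar or Heat today.

```python
def plan_update_batches(nodes, spare_capacity):
    """Split nodes into update batches no larger than the spare capacity."""
    if spare_capacity < 1:
        # With no spare nodes there is nowhere to migrate content to.
        raise ValueError("zero-downtime update needs at least one spare node")
    return [nodes[i:i + spare_capacity]
            for i in range(0, len(nodes), spare_capacity)]


# Hypothetical overcloud with five compute nodes and two spare nodes.
compute_nodes = ["compute-%d" % n for n in range(5)]
for batch in plan_update_batches(compute_nodes, 2):
    print("migrate off, update, migrate back:", batch)
```

Each batch would be migrated off, updated, and returned to service before
the next one starts.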

>
> I may be misinterpreting, but let me say that I don't think Tuskar
> should be building images. There's been a fair amount of discussion
> around a Nova native image building service [1][2]. I'm actually not
> sure what the status/consensus on that is, but maybe longer term,
> Tuskar might call an API to kick off an image build.

Yeah, I don't think image building should be driven through the Tuskar API
(and probably not even the Tuskar UI?). Tuskar should just fetch images from
Glance imho. However, we should be aware that image building *is* our
concern, as it's an important prerequisite for deployment. We should
at least provide directions on how to easily build images for use with
Tuskar, not leave users in doubt.

<snip>

>> "We will have to store image metadata in tuskar probably, that would map to
>> glance, once the image is generated. I would say we need to store the list
>> of the elements and probably the commit hashes (because elements can
>> change). Also it should be versioned, as the images in glance will be also
>> versioned.
>
> I'm not sure why this image metadata would be in Tuskar. I definitely
> like the idea of knowing the versions/commit hashes of the software
> components in your images, but that should probably be in Glance.

+1

>
>> We probably can't store it in Glance, because we will first store the
>> metadata, then generate the image. Right?
>
> I'm not sure I follow this point. But, mainly, I don't think Tuskar
> should be automatically generating images.

+1

>
>> Then we could see whether image was created from the metadata and whether
>> that image was used in the heat-template. With versions we could also see
>> what has changed.
>
> We'll be able to tell what image was used in the heat template, and
> thus the deployment, based on its UUID.
>
> I love the idea of seeing differences between images, especially
> installed software versions, but I'm not sure that belongs in Tuskar.
> That sort of utility functionality seems like it could apply to any
> image you might want to launch in OpenStack, not just to do a
> deployment.  So, I think it makes sense to have that as Glance
> metadata or in Glance somehow. For instance, if I wanted to launch an
> image that had a specific version of apache, it'd be nice to be able
> to see that when I'm choosing an image to launch.

Yes. We might want to show the data to the user, but I don't see a need
to run this through the Tuskar API. Tuskar UI could query Glance directly
and display the metadata to the user. (When using the CLI, one could use
the Glance CLI directly. We're not adding any special logic on top.)
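
To illustrate how little logic this needs: if the software versions were
stored as image properties in Glance, showing what changed between two
images is a plain dictionary comparison on the client side. A sketch,
assuming the properties have already been fetched as dicts; the property
names and values below are invented and not an existing Glance convention.

```python
def diff_image_properties(old, new):
    """Return {key: (old_value, new_value)} for each property that differs."""
    changed = {}
    for key in sorted(set(old) | set(new)):
        if old.get(key) != new.get(key):
            # A missing property shows up as None on that side.
            changed[key] = (old.get(key), new.get(key))
    return changed


# Hypothetical properties of two versions of a compute image.
old_image = {"element:nova-compute": "a1b2c3", "element:apache": "2.2.22"}
new_image = {"element:nova-compute": "d4e5f6", "element:apache": "2.2.22"}
for name, (before, after) in diff_image_properties(old_image, new_image).items():
    print("%s: %s -> %s" % (name, before, after))
```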

>
>> But there was also idea that there will be some generic image, containing
>> all services, we would just configure which services to start. In that case
>> we would need to version also this.
>
> -1 to this.  I think we should stick with specialized images per role.
> I replied on the wireframes thread, but I don't see how
> enabling/disabling services in a prebuilt image should work. Plus, I
> don't really think it fits with the TripleO model of having an image
> created based on its specific "role" (I hate to use that term and
> muddy the water... I mean in the generic sense here).
>
>
>> = New Comments =
>>
>> My comments on this train of thought:
>>
>> - I'm afraid of the idea of applying changes immediately for the same
>> reasons I'm worried about a few other things. Very little of what we do will
>> actually finish executing immediately and will instead be long running
>> operations. If I edit a few roles in a row, we're looking at a lot of
>> outstanding operations executing against other OpenStack pieces (namely
>> Heat).
>>
>> The idea of immediately also suffers from a sort of "Oh shit, that's not
>> what I meant" when hitting save. There's no way for the user to review what
>> the larger picture is before deciding to make it so.
>
> +1

Yeah, we probably can't immediately update everything. Apart from "that's
not what I meant", this would probably not work when attempting a
zero-downtime update.

On the other hand, I'd say we aim at keeping all machines in sync as
much as possible. So it would be nice to have machines somehow displayed
as "needs update", and the user could then say "ok, update these machines".

>> We need some sort of task tracking that prevents overlapping operations from
>> executing at the same time. Tuskar needs to know what's happening instead of
>> simply having the UI fire off into other OpenStack components when the user
>> presses a button.
>>
>> To rehash an earlier argument, this is why I advocate for having the
>> business logic in the API itself instead of at the UI. Even if it's just a
>> queue to make sure they don't execute concurrently (that's not enough IMO,
>> but for example), the server is where that sort of orchestration should take
>> place and be able to understand the differences between the configured state
>> in Tuskar and the actual deployed state.

Just to clarify: I think the prevailing opinion on the list was that
there should be no significant business logic in the UI. It should
either live in a library (e.g. as a separate part of tuskarclient), or in
the API. For details there's a long openstack-dev thread [5].

Even though I'm still not 100% convinced which path is right, I see that
having logic in the API will let us do long-running tasks and track their
progress (e.g. the chunked updates I mentioned earlier, if we want to do
them in the future). However, we should be careful not to mirror data in
our DB that belongs elsewhere.
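
As a rough sketch of the simplest form of server-side tracking discussed
above (a queue so that operations on one stack never run concurrently),
something like the following could work. The class, its method names and
the operation labels are hypothetical, and it keeps state in memory only;
a real service would need persistence and would more likely build on
something like TaskFlow.

```python
import collections


class OperationTracker:
    """Allow one running operation per stack and queue the rest in order."""

    def __init__(self):
        self._queues = collections.defaultdict(collections.deque)
        self._running = {}

    def submit(self, stack, operation):
        """Start the operation if the stack is idle, otherwise queue it."""
        if stack in self._running:
            self._queues[stack].append(operation)
            return "queued"
        self._running[stack] = operation
        return "running"

    def finish(self, stack):
        """Complete the running operation and return the next one, if any."""
        self._running.pop(stack, None)
        if self._queues[stack]:
            self._running[stack] = self._queues[stack].popleft()
        return self._running.get(stack)


tracker = OperationTracker()
print(tracker.submit("overcloud", "update-compute"))   # starts immediately
print(tracker.submit("overcloud", "update-control"))   # waits for the first
print(tracker.finish("overcloud"))                     # promotes the queued one
```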

>>
>> I'm off topic a bit though. Rather than talk about how we pull it off, I'd
>> like to come to an agreement on what the actual policy should be. My
>> concerns focus around the time to create the image and get it into Glance
>> where it's available to actually be deployed. When do we bite that time off
>> and how do we let the user know it is or isn't ready yet?
>
> I think this becomes simpler if you're not worried about building
> images. Even so, some task tracking will likely be needed. TaskFlow[3]
> and Mistral[4] may be relevant.
>
>
>> - Editing a node is going to run us into versioning complications. So far,
>> all we've entertained are ways to map a node back to the resource category
>> it was created under. If the configuration of that category changes, we have
>> no way of indicating that the node is out of sync.
>>
>> We could store versioned resource categories in the Tuskar DB and have the
>> version information also find its way to the nodes (note: the idea is to use
>> the metadata field on a Heat resource to store the res-cat information, so
>> including version is possible). I'm less concerned with eventual reaping of
>> old versions here since it's just DB data, though we still hit the question
>> of when to delete old images.
>
> Is resource category the same as role?  Sorry :), I probably need to
> go back and re-read the terminology thread. If so, I think versioning
> them in the Tuskar db makes sense. That way you know what's been
> deployed and what hasn't, as well as any differences.

I'm wondering if we can fly without Resource Category versioning and
linking the versions to nodes, but maybe we can't. If I update a Resource
Category config, but the image stays the same, I don't have a means of
learning which nodes run the updated config, do I? So maybe we'll need
to keep track of this ourselves. However, I'd like to keep such node
metadata in Ironic, not in Tuskar.
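
A sketch of what that tracking could boil down to, assuming each node
carries the name and version of the resource category it was deployed
from, wherever that metadata ends up being stored. The data layout and
names below are invented for illustration.

```python
def nodes_needing_update(category_versions, node_metadata):
    """Return the nodes whose recorded version differs from the current one."""
    stale = []
    for name, meta in sorted(node_metadata.items()):
        current = category_versions[meta["category"]]
        if meta["version"] != current:
            stale.append(name)
    return stale


# Hypothetical state: the compute category was edited after deployment.
category_versions = {"compute": 3, "controller": 1}
node_metadata = {
    "node-1": {"category": "compute", "version": 2},
    "node-2": {"category": "compute", "version": 3},
    "node-3": {"category": "controller", "version": 1},
}
print(nodes_needing_update(category_versions, node_metadata))
```

Nodes returned by such a check are the ones the UI could display as
"needs update".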

This is a wider-reaching issue - we should try to avoid creating Node
and Image resources in Tuskar if possible - I think these should belong
to Ironic and Glance, respectively.

>
>> - For the comment on a generic image with service configuration, the first
>> thing that came to mind was the thread on creating images from packages [1].
>> It's not the exact same problem, but see Clint Byrum's comments in there
>> about drift. My gut feeling is that having specific images for each res-cat
>> will be easier to manage than trying to edit what services are running on a
>> node.
>
> +1.

+1. Image per role will mean greater flexibility. I imagine that in 
advanced deployments users might want to deploy resource categories 
which do not run OpenStack services at all (e.g. RDO Foreman has such 
parts that help with high availability / load balancing setups).

>
> [1] http://lists.openstack.org/pipermail/openstack-dev/2013-August/013122.html
> [2] https://wiki.openstack.org/wiki/NovaImageBuilding
> [3] https://wiki.openstack.org/wiki/TaskFlow
> [4] https://wiki.openstack.org/wiki/Mistral
>

[5] http://lists.openstack.org/pipermail/openstack-dev/2013-December/021919.html

[openstack-dev] [TripleO][Tuskar] Editing Nodes