[openstack-dev] [TripleO] containerized undercloud in Queens
Dan Prince
dprince at redhat.com
Tue Oct 3 17:12:57 UTC 2017
On Mon, 2017-10-02 at 15:20 -0600, Alex Schultz wrote:
> Hey Dan,
>
> Thanks for sending out a note about this. I have a few questions
> inline.
>
> On Mon, Oct 2, 2017 at 6:02 AM, Dan Prince <dprince at redhat.com>
> wrote:
> > One of the things the TripleO containers team is planning on
> > tackling
> > in Queens is fully containerizing the undercloud. At the PTG we
> > created
> > an etherpad [1] that contains a list of features that need to be
> > implemented to fully replace instack-undercloud.
> >
>
> I know we talked about this at the PTG and I was skeptical that this
> would land in Queens. With the exception of the Containers team
> wanting this, I'm not sure there is an actual end user who is looking
> for the feature, so I want to make sure we're not just doing more work
> because we as developers think it's a good idea.
I've heard from several operators that they were actually surprised we
implemented containers in the overcloud first. Validating a new
deployment framework on a single-node undercloud (for operators) before
applying it to their entire cloud deployment has a lot of merit to it,
IMO. When you share the same deployment architecture across the
overcloud and undercloud, it puts us in a better position to decide
where to expose new features to operators first (when creating the
undercloud or overcloud, for example).
Also, if you read my email again, I've explicitly listed the
"Containers" benefit last. While I think moving the undercloud to
containers is a great benefit all by itself, this is more of a
"framework alignment" in TripleO and gets us out of maintaining huge
amounts of technical debt. Re-using the same framework for the
undercloud and overcloud has a lot of merit. It effectively streamlines
the development process for service developers and 3rd parties wishing
to integrate some of their components on a single node. Why be forced
to create a multi-node dev environment if you don't have to (if you
aren't using HA, for example)?
Let's be honest. While instack-undercloud helped solve the old "seed"
VM issue, it was outdated the day it landed upstream. The entire
premise of the tool is that it uses old-style "elements" to create the
undercloud, and we moved away from those as the primary means of
driving the creation of the overcloud years ago at this point. The new
'undercloud_deploy' installer gets us back to our roots by once again
sharing the same architecture to create the overcloud and undercloud.
A demo from long ago expands on this idea a bit:
https://www.youtube.com/watch?v=y1qMDLAf26Q&t=5s
In short, we aren't just doing more work because developers think it is
a good idea. This has the potential to be one of the most useful
architectural changes we've made in TripleO in years. It could
significantly decrease our CI resources if we use it to replace the
existing scenario jobs, which take multiple VMs per job. It is a
building block we could use for other features like an HA undercloud.
And yes, it also has a huge impact on developer velocity, in that many
of us already prefer to use the tool as a means of streamlining our
dev/test cycles to minutes instead of hours. Why spend hours running
quickstart Ansible scripts when in many cases you can just doit.sh?
https://github.com/dprince/undercloud_containers/blob/master/doit.sh
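For the curious, a doit.sh-style wrapper is little more than a couple of
environment variables and a single deploy command. Here's a minimal
sketch of the idea; the --templates/--local-ip flags and the
192.168.24.1 address are assumptions based on the Pike-era experimental
installer, not the actual contents of doit.sh, so adjust to your
environment:

```shell
#!/bin/bash
# Sketch of a doit.sh-style single-node dev loop (illustrative only).
set -eu

# Build (but do not run) the single-node deploy command for a given
# local tripleo-heat-templates checkout.
undercloud_deploy_cmd() {
    local tht=$1
    echo "openstack undercloud deploy --templates $tht --local-ip 192.168.24.1"
}

# One-command iteration: edit templates in your checkout, then rerun, e.g.:
#   $(undercloud_deploy_cmd "$HOME/tripleo-heat-templates")
# Here we just print the command that would be run.
undercloud_deploy_cmd "${THT:-$HOME/tripleo-heat-templates}"
```

The point is the shape of the loop: change a template locally, rerun one
command, and you have a redeployed single-node environment in minutes.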
Lastly, this isn't just a containers team thing. We've been using the
undercloud_deploy architecture across many teams to help develop for
almost an entire cycle now, with huge benefits. I would go as far as to
say that undercloud_deploy was *the* biggest feature in Pike that
enabled us to bang out a majority of the docker/service templates in
tripleo-heat-templates.
> Given that etherpad
> appears to contain a pretty big list of features, are we going to be
> able to land all of them by M2? Would it be beneficial to craft a
> basic spec related to this to ensure we are not missing additional
> things?
I'm not sure there is a lot of value in creating a spec at this point.
We've already got an approved blueprint for the feature in Pike here:
https://blueprints.launchpad.net/tripleo/+spec/containerized-undercloud
I think we might get more velocity out of grooming the etherpad and
perhaps dividing this work among the appropriate teams.
>
> > Benefits of this work:
> >
> > -Alignment: aligning the undercloud and overcloud installers gets
> > rid
> > of dual maintenance of services.
> >
>
> I like reusing existing stuff. +1
>
> > -Composability: tripleo-heat-templates and our new Ansible
> > architecture around it are composable. This means any set of
> > services can be used to build up your own undercloud. In other
> > words, the framework here isn't just useful for "underclouds"; it
> > is really the ability to deploy TripleO on a single node with no
> > external dependencies. A single-node TripleO installer. The
> > containers team has already been leveraging the existing
> > (experimental) undercloud_deploy installer to develop services for
> > Pike.
> >
>
> Is this something that is actually being asked for or is this just an
> added bonus because it allows developers to reduce what is actually
> being deployed for testing?
There is an implied ask for this feature when a new developer starts to
use TripleO. Right now the resource bar is quite high for TripleO: you
have to have a multi-node development environment at the very least
(one undercloud node and one overcloud node). The ideas we are talking
about here short-circuit this in many cases... if you aren't testing HA
services or Ironic, you could simply use undercloud_deploy to test
tripleo-heat-templates changes on a single VM. Fewer resources, and
much less time spent learning and waiting.
>
> > -Development: The containerized undercloud is a great development
> > tool. It utilizes the same framework as the full overcloud
> > deployment
> > but takes about 20 minutes to deploy. This means faster
> > iterations,
> > less waiting, and more testing. Having this be a first class
> > citizen
> > in the ecosystem will ensure this platform is functioning for
> > developers to use all the time.
> >
>
> Seems to go with the previous question about the re-usability for
> people who are not developers. Has everyone (including non-container
> folks) tried this out and attested that it's a better workflow for
> them? Are there use cases that are made worse by switching?
I would let others chime in, but the feedback I've gotten has mostly
been that it improves the dev/test cycle greatly.
>
> > -CI resources: better use of CI resources. At the PTG we received
> > feedback from the OpenStack infrastructure team that our upstream
> > CI resource usage is quite high at times (even as high as 50% of
> > the total). Because of the shared framework and single-node
> > capabilities, we can re-architect much of our upstream CI matrix
> > around single node. We no longer require multinode jobs to be able
> > to test many of the services in tripleo-heat-templates... we can
> > just use a single cloud VM instead. We'll still want multinode
> > undercloud -> overcloud jobs for testing things like HA and
> > baremetal provisioning. But we can cover a large set of the
> > services (in particular many of the new scenario jobs we added in
> > Pike) with single-node CI test runs in much less time.
> >
>
> I like this idea but would like to see more details around this.
> Since this is a new feature we need to make sure that we are properly
> covering the containerized undercloud with CI as well. I think we
> need 3 jobs to properly cover this feature before marking it done. I
> added them to the etherpad but I think we need to ensure the
> following
> 3 jobs are defined and voting by M2 to consider actually switching
> from the current instack-undercloud installation to the containerized
> version.
>
> 1) undercloud-containers - a containerized install, should be voting
> by M1
> 2) undercloud-containers-update - minor updates run on containerized
> underclouds, should be voting by M2
> 3) undercloud-containers-upgrade - major upgrade from
> non-containerized to containerized undercloud, should be voting by
> M2.
>
> If we have these jobs, is there anything we can drop or mark as
> covered that is currently being covered by an overcloud job?
>
> > -Containers: There are no plans to containerize the existing
> > instack-undercloud work. By moving our undercloud installer to a
> > tripleo-heat-templates and Ansible architecture we can leverage
> > containers. Interestingly, the same installer also supports
> > baremetal (package) installation at this point. Like the overcloud,
> > however, I think making containers our undercloud default would
> > better align the TripleO tooling.
> >
> > We are actively working through a few issues with the deployment
> > framework Ansible effort to fully integrate that into the
> > undercloud
> > installer. We are also reaching out to other teams like the UI and
> > Security folks to coordinate the efforts around those components.
> > If
> > there are any questions about the effort or you'd like to be
> > involved
> > in the implementation let us know. Stay tuned for more specific
> > updates
> > as we organize to get as much of this in M1 and M2 as possible.
> >
>
> I would like to see weekly updates on this effort during the IRC
> meeting. As previously mentioned around squad status, I'll be asking
> for them during the meeting, so it would be nice to get an update on
> this weekly so we can make sure that we'll be OK to cut over.
>
> Also, what does the cutover plan look like? This is something that
> might be beneficial to have in a spec. IMHO, I'm OK to continue
> pushing the container effort using the openstack undercloud deploy
> method for now. Once we have voting CI jobs and the feature list has
> been covered, we can evaluate whether we've made the M2 time frame to
> switch openstack undercloud deploy to be the new undercloud install.
> I want to make sure we don't introduce regressions and are doing
> things in a user-friendly fashion, since the undercloud is the first
> intro an end user gets to TripleO. It would be a good idea to review
> what the new install process looks like and make sure it "just works"
> given that the current process [0] (with all its flaws) is fairly
> trivial to perform.
>
> Thanks,
> -Alex
>
> [0] https://docs.openstack.org/tripleo-docs/latest/install/installation/installation.html#installing-the-undercloud
>
> > On behalf of the containers team,
> >
> > Dan
> >
> > [1] https://etherpad.openstack.org/p/tripleo-queens-undercloud-containers
> >
> > __________________________________________________________________________
> > OpenStack Development Mailing List (not for usage questions)
> > Unsubscribe: OpenStack-dev-request at lists.openstack.org?subject:unsubscribe
> > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>