[OpenStack-Infra] Magnum Gate nodepool

Clark Boylan cboylan at sapwetik.org
Thu Sep 3 16:29:18 UTC 2015


On Wed, Sep 2, 2015, at 02:54 PM, Adrian Otto wrote:
> Infra Team,
> 
> OpenStack Magnum seeks to expand the scope of the functional tests we
> run. Our tests involve the creation of magnum bays, which are composed of
> nova instances that we create through heat. We will need to create and
> delete a dozen or so virtual machines (nested within the devstack gate
> test environment) through the course of the functional tests. We are
> concerned because creating this many nova instances will take time; we
> don’t want to hog the test resources for long periods, and we want our
> tests to run in a timely fashion.
> 
> Rackspace has offered to allocate a pool of several OnMetal servers for
> use by Magnum, above and beyond the resources that are in the main
> resource pool. Ideally, Magnum would have the ability to tag our tests
> so that they are run on this resource pool. This would speed up
> execution by eliminating the nested virtualization needed to create the
> nova instances that make up the bays. The benefit to other OpenStack
> projects is that it would free up the resources we are currently
> consuming, which could reduce contention, and allow other jobs to run
> sooner.
> 
> My questions:
> 
> 1) Is this sort of a setup possible and practical?
My understanding of how OnMetal works is that you talk to Rackspace's
Nova API and instruct it to boot special flavors using specific images.
This is mostly how Nodepool works, so for the most part it should just
work.
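
To make that concrete, here is a rough python-novaclient sketch of the
sort of boot call Nodepool ends up issuing against a provider. The
credentials, flavor name, image ID, and keypair are placeholders I made
up, not real OnMetal values:

    # Rough illustration only -- not nodepool's actual code. Boot a
    # server through the provider's Nova API, which is the basic
    # operation nodepool performs for every test node.
    from novaclient import client as nova_client

    # Placeholder credentials; substitute real provider values.
    nova = nova_client.Client('2', 'username', 'api-key', 'project-id',
                              'https://identity.example.com/v2.0')

    # Hypothetical flavor name and image UUID.
    flavor = nova.flavors.find(name='onmetal-io1')
    server = nova.servers.create(name='magnum-test-node',
                                 image='IMAGE-UUID-PLACEHOLDER',
                                 flavor=flavor.id,
                                 key_name='nodepool')
    print('booted %s (%s)' % (server.name, server.id))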

The one place I expect trouble is that Nodepool expects to build the
images it boots, either by booting nodes in the cloud and snapshotting
them or by doing a local diskimage-builder (DIB) build that is uploaded
to Glance. Does OnMetal support taking snapshots? Or will we have to
figure out what is special about the OnMetal images and ensure that
special sauce is baked into the DIB images we build for OnMetal?
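
For reference, the snapshot half of that in the same rough
python-novaclient terms (the names are again placeholders, and whether
an OnMetal server will honor this create_image call is exactly the open
question):

    # Illustrative only. Snapshot a previously booted node so the
    # result can be reused as a boot image -- the first of nodepool's
    # two image-build paths.
    from novaclient import client as nova_client

    nova = nova_client.Client('2', 'username', 'api-key', 'project-id',
                              'https://identity.example.com/v2.0')

    # 'magnum-test-node' is the hypothetical server booted in the
    # earlier sketch; create_image returns the new snapshot image's ID.
    server = nova.servers.find(name='magnum-test-node')
    image_id = nova.servers.create_image(server, 'magnum-onmetal-snap')
    print('snapshot image: %s' % image_id)
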
> 
> 2) How long would it (probably) take to get this set up, assuming
> adequate participation on our side?
It is really hard to give a number without knowing what the image
building situation is. If our only choice is DIB images, this could be a
very long time (we have spent roughly the last year trying to get image
uploads to Rackspace working; we are close to having it work reliably
now, but it has been a long, long road).
> 
> 3) Who would we want to work with to get this done properly?
You will most likely end up working with the openstack-infra/nodepool,
openstack-infra/system-config, and openstack-infra/project-config cores,
as they are the groups responsible for getting images built and booted
for testing.

I do have a few concerns with this in general. Have we pushed changes to
expand the scope of the testing to see how bad running those tests on
the existing slave machines really is? I worry that we are jumping to
OnMetal as a solution before we have described the problem. Can we
concretely describe the issues, with real information, by pushing the
required changes to Magnum and seeing how they break? Have we shown that
using OnMetal addresses the problems that arise in those tests?

In the past we have set up special clouds to test specific projects (the
tripleo clouds), and when they work it's fine, but when they break it
can be a large cost to the infra team. For a while we were constantly
enabling and disabling clouds, images, and everything else as the teams
responsible debugged the problems. Pretty sure the VM images are stale
there as well. In general these special cases tend to need a lot of
love, so the project(s) involved should be prepared to jump in as
necessary.

Finally, we have had requests for baremetal servers from other projects
in the past. Would you expect these resources to be dedicated to Magnum
only, or would the greater OpenStack community have access to them
within reason? I am not sure that we would require these to be shared
resources (still thinking about how that would work), but this could
create unexpected tension in the project if we don't consider it up
front.

Hope this helps,
Clark


