[OpenStack-Infra] Plan for testing nova baremetal and TripleO

Robert Collins robertc at robertcollins.net
Mon Sep 23 09:48:04 UTC 2013


Hey, so after repeated attempts, it's clear that getting virtualised
functional baremetal testing up inside the public Rackspace/HP clouds
isn't going to fly with any real fidelity.
LXC doesn't help TripleO, but we thought it would help nova-bm. However,
iscsid doesn't work within the container, which makes deploys break, and
TripleO needs to test actual boot of nodes, so we'd hit the ultra-slow
story that is qemu in qemu in kvm *anyway*.

We have concluded that the requirements for doing it 'in a cloud' are roughly:
 - complete control over an L2 network: controllable anti-spoofing [so
we can do virtual IPs for HA], and the ability to run our own DHCP and
routers
 - VMs set to PXE boot
 - VMs with no images (start with blank disks)

And possibly more. We can work on this as a long term play, but we
need something sooner than seems likely if we depend on all our public
cloud providers adopting the above features *after they are written*
and then *exposing them to users*.

So at the TripleO sprint we put together this:
https://etherpad.openstack.org/tripleo-test-cluster

We have a rack dedicated to community testing which we'd like to use
for this; it's in an HP datacentre, and anyone on Monty's team at HP
can open tickets for the DC ops folk to repair/fix up hardware and
change routing etc. If we can get it reliable enough to be a gate, we
can - and will - go seek additional clusters to be set up identically
as test clouds.

The etherpad has the architecture, but some additional prose won't hurt :).

The plan is to run up a TripleO undercloud and give openstack-infra
API keys on that (so you can fix things yourself if desired). Within
the undercloud we'll deploy a small scale (5 hypervisors or so) kvm
overcloud (infra also get keys on that) to host:
 - jenkins slaves
 - a test environment broker

On the rest of the undercloud machines we deploy a bunch of
 - test environment host instances

A 'test environment host' is just a physical machine with a bunch of
dedicated libvirt VMs + openvswitch bridges, set up per the devtest
use case we have in tripleo-incubator: ideal small scale testbeds for
TripleO. Being in the same rack as the Jenkins slaves avoids
bandwidth issues copying disk images around: we don't want to do that
over the internet without a good reason.
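
To make the 'libvirt VM + openvswitch bridge' idea concrete, here is a
rough sketch of generating a libvirt domain definition for one such test
VM: set to PXE boot, with a disk that carries no image, attached to an
openvswitch bridge. The VM name, disk path, bridge name and sizes are
illustrative assumptions, not taken from the etherpad or from the devtest
scripts, and creating the empty disk file itself is not covered.

```python
import xml.etree.ElementTree as ET


def blank_pxe_domain(name, disk_path, bridge, memory_mib=2048, vcpus=1):
    """Return libvirt domain XML for a VM that PXE boots from a blank disk.

    All argument values are chosen by the caller; nothing here is the
    actual devtest configuration.
    """
    domain = ET.Element("domain", type="kvm")
    ET.SubElement(domain, "name").text = name
    ET.SubElement(domain, "memory", unit="MiB").text = str(memory_mib)
    ET.SubElement(domain, "vcpu").text = str(vcpus)

    # Boot from the network only, so the node has to be deployed over PXE.
    os_el = ET.SubElement(domain, "os")
    ET.SubElement(os_el, "type", arch="x86_64").text = "hvm"
    ET.SubElement(os_el, "boot", dev="network")

    devices = ET.SubElement(domain, "devices")

    # The disk is expected to be an empty image created beforehand.
    disk = ET.SubElement(devices, "disk", type="file", device="disk")
    ET.SubElement(disk, "driver", name="qemu", type="qcow2")
    ET.SubElement(disk, "source", file=disk_path)
    ET.SubElement(disk, "target", dev="vda", bus="virtio")

    # Attach the NIC to an openvswitch bridge on the host.
    nic = ET.SubElement(devices, "interface", type="bridge")
    ET.SubElement(nic, "source", bridge=bridge)
    ET.SubElement(nic, "virtualport", type="openvswitch")
    ET.SubElement(nic, "model", type="virtio")

    return ET.tostring(domain, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    print(blank_pxe_domain(
        "baremetal_0",
        "/var/lib/libvirt/images/baremetal_0.qcow2",
        "brbm",
    ))
```

The resulting XML could then be handed to libvirt in the usual way (for
example with 'virsh define'); that step is left out of the sketch.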

The broker will hand out test environments to Jenkins slaves, and will
be gearman based so that dead slaves are detected automatically.
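
As a rough illustration of what the broker has to do, here is a toy
in-memory model of leasing environments and reclaiming them from dead
slaves. It does not use gearman at all: the heartbeat timeout below is a
stand-in assumption for whatever dead-worker detection gearman provides,
and the class, method and environment names are all made up for the
sketch.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class Broker:
    """Toy broker: leases test environments to slaves, reclaims stale ones."""

    free: List[str]
    lease_timeout: float = 30.0
    clock: Callable[[], float] = time.monotonic
    # environment name -> (slave name, time the slave was last heard from)
    leases: Dict[str, Tuple[str, float]] = field(default_factory=dict)

    def acquire(self, slave: str) -> Optional[str]:
        """Lease a free environment to `slave`, or return None if none is free."""
        self.reap()
        if not self.free:
            return None
        env = self.free.pop(0)
        self.leases[env] = (slave, self.clock())
        return env

    def heartbeat(self, slave: str) -> None:
        """Record that `slave` is still alive for every lease it holds."""
        now = self.clock()
        for env, (owner, _) in list(self.leases.items()):
            if owner == slave:
                self.leases[env] = (owner, now)

    def release(self, env: str) -> None:
        """Return a leased environment to the free pool."""
        if self.leases.pop(env, None) is not None:
            self.free.append(env)

    def reap(self) -> List[str]:
        """Reclaim environments whose slave has been silent for too long."""
        now = self.clock()
        dead = [env for env, (_, seen) in self.leases.items()
                if now - seen > self.lease_timeout]
        for env in dead:
            self.release(env)
        return dead
```

A slave would call acquire() before a test run, heartbeat() while it
runs, and release() at the end; if the slave dies mid-run, the next
acquire() or reap() after the timeout returns its environment to the
pool.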

I realise that having a SPOF in the infrastructure is undesirable, but
I hope we can get this up and running and scale up to having 2 such
testbeds once the tech is in place to run them efficiently: asking
e.g. Rackspace for a rack to test baremetal on will be a lot easier
once we can point them at a running example, IMO.

-Rob

-- 
Robert Collins <rbtcollins at hp.com>
Distinguished Technologist
HP Converged Cloud


