[openstack-dev] [TripleO] test environment requirements
Clint Byrum
clint at fewbar.com
Mon Mar 24 00:57:22 UTC 2014
Excerpts from Ben Nemec's message of 2014-03-21 09:38:00 -0700:
> On 2014-03-21 10:57, Derek Higgins wrote:
> > On 14/03/14 20:16, Ben Nemec wrote:
> >> On 2014-03-13 11:12, James Slagle wrote:
> >>> On Thu, Mar 13, 2014 at 2:51 AM, Robert Collins
> >>> <robertc at robertcollins.net> wrote:
> >>>> So we already have pretty high requirements - it's basically a 16G
> >>>> workstation as minimum.
> >>>>
> >>>> Specifically to test the full story:
> >>>> - a seed VM
> >>>> - an undercloud VM (bm deploy infra)
> >>>> - 1 overcloud control VM
> >>>> - 2 overcloud hypervisor VMs
> >>>> ====
> >>>> 5 VMs with 2+G RAM each.
> >>>>
> >>>> To test the overcloud alone against the seed we save 1 VM, to skip the
> >>>> overcloud we save 3.
> >>>>
> >>>> However, as HA matures we're about to add 4 more VMs: we need an HA
> >>>> control plane for both the under and overclouds:
> >>>> - a seed VM
> >>>> - 3 undercloud VMs (HA bm deploy infra)
> >>>> - 3 overcloud control VMs (HA)
> >>>> - 2 overcloud hypervisor VMs
> >>>> ====
> >>>> 9 VMs with 2+G RAM each == 18GB
> >>>>
> >>>> What should we do about this?
> >>>>
> >>>> A few thoughts to kick start discussion:
> >>>> - use Ironic to test across multiple machines (involves tunnelling
> >>>> brbm across machines, fairly easy)
> >>>> - shrink the VM sizes (causes thrashing)
> >>>> - tell folk to toughen up and get bigger machines (ahahahahaha, no)
> >>>> - make the default configuration inline the hypervisors on the
> >>>>   overcloud with the control plane:
> >>>>     - a seed VM
> >>>>     - 3 undercloud VMs (HA bm deploy infra)
> >>>>     - 3 overcloud all-in-one VMs (HA)
> >>>>     ====
> >>>>     7 VMs with 2+G RAM each == 14GB
> >>>>
> >>>>
> >>>> I think it's important that developers regularly exercise features like
> >>>> HA and live migration, so I'm quite keen to have a fairly
> >>>> solid systematic answer that will let us catch things like bad
> >>>> firewall rules on the control node preventing network tunnelling
> >>>> etc... e.g. we benefit the more things are split out like scale
> >>>> deployments are. OTOH testing the micro-cloud that folk may start with
> >>>> is also a really good idea....
> >>>
> >>>
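For reference, the arithmetic behind the three layouts above can be sketched
as follows. This is only a tally of the VM counts quoted in the thread,
treating 2G as the per-VM floor; the layout names and the min_ram_gb helper
are made up for illustration and are not part of devtest.

```python
# Minimum RAM per VM quoted in the thread ("2+G"); real VMs may need more.
GB_PER_VM = 2

# VM counts per role, copied from the three layouts listed above.
# The layout names are illustrative labels only.
LAYOUTS = {
    "current": {
        "seed": 1,
        "undercloud": 1,
        "overcloud control": 1,
        "overcloud hypervisor": 2,
    },
    "full HA": {
        "seed": 1,
        "undercloud": 3,
        "overcloud control": 3,
        "overcloud hypervisor": 2,
    },
    "HA all-in-one": {
        "seed": 1,
        "undercloud": 3,
        "overcloud all-in-one": 3,
    },
}


def min_ram_gb(layout):
    """Return the lower bound on RAM (in GB) for a layout of VM counts."""
    return sum(layout.values()) * GB_PER_VM


if __name__ == "__main__":
    for name, layout in LAYOUTS.items():
        vms = sum(layout.values())
        print(f"{name}: {vms} VMs, at least {min_ram_gb(layout)}GB")
```

These are lower bounds for the VMs alone; the host's own memory use comes on
top of them.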
> >>> The idea I was thinking was to make a testenv host available to
> >>> tripleo atc's. Or, perhaps make it a bit more locked down and only
> >>> available to a new group of tripleo folk, existing somewhere between
> >>> the privileges of tripleo atc's and tripleo-cd-admins. We could
> >>> document how you use the cloud (Red Hat's or HP's) rack to start up an
> >>> instance to run devtest on one of the compute hosts, request and lock
> >>> yourself a testenv environment on one of the testenv hosts, etc.
> >>> Basically, how our CI works. Although I think we'd want different
> >>> testenv hosts for development vs what runs the CI, and would need to
> >>> make sure everything was locked down appropriately security-wise.
> >>>
> >>> Some other ideas:
> >>>
> >>> - Allow an option to get rid of the seed VM, or make it so that you
> >>> can shut it down after the Undercloud is up. This only really gets rid
> >>> of 1 VM though, so it doesn't buy you much nor solve any long term
> >>> problem.
> >>>
> >>> - Make it easier to see how you'd use virsh against any libvirt host
> >>> you might have lying around. We already have the setting exposed, but
> >>> make it a bit more public and call it out more in the docs. I've
> >>> actually never tried it myself, but have been meaning to.
> >>>
> >>> - I'm really reaching now, and this may be entirely unrealistic :),
> >>> but....somehow use the fake baremetal driver and expose a mechanism to
> >>> let the developer specify the already setup undercloud/overcloud
> >>> environment ahead of time.
> >>> For example:
> >>> * Build your undercloud images with the vm element since you won't be
> >>> PXE booting it
> >>> * Upload your images to a public cloud, and boot instances for them.
> >>> * Use this new mechanism when you run devtest (presumably running from
> >>> another instance in the same cloud) to say "I'm using the fake
> >>> baremetal driver, and here are the IPs of the undercloud instances".
> >>> * Repeat steps for the overcloud (e.g., configure undercloud to use
> >>> fake baremetal driver, etc).
> >>> * Maybe it's not the fake baremetal driver, and instead a new driver
> >>> that is a noop for the pxe stuff, and the power_on implementation
> >>> powers on the cloud instances.
> >>> * Obviously if your aim is to test the pxe and disk deploy process
> >>> itself, this wouldn't work for you.
> >>> * Presumably said public cloud is OpenStack, so we've also achieved
> >>> another layer of "On OpenStack".
> >>
> >> I actually spent quite a while looking into something like this last
> >> option when I first started on TripleO, because I had only one big
> >> server locally and it was running my OpenStack installation. I was
> >> hoping to use it for my TripleO instances, and even went so far as to
> >> add support for OpenStack to the virtual power driver in baremetal. I
> >> was never completely successful, but I did work through a number of
> >> problems:
> >>
> >> 1. Neutron didn't like allowing the DHCP/PXE traffic to let my seed
> >> serve to the undercloud. I was able to get around this by using flat
> >> networking with a local bridge on the OpenStack system, but I'm not sure
> >> if that's going to be possible on most public cloud providers. There
> >> may very well be a less invasive way to configure Neutron to allow that,
> >> but I don't know how to do it.
> >>
> >> 2. Last time I checked, Nova doesn't support PXE booting instances so I
> >> had to use iPXE images to do the booting. This doesn't work since we
> >> PXE boot every time an instance reboots and the iPXE image gets
> >> overwritten by the image deploy, so the instance doesn't boot properly
> >> after deployment. This is where I stopped my investigation because I
> >> didn't want to start hacking up my OpenStack installation to get around
> >> the problem, but if we decided to go in this direction I don't think it
> >> would be terribly difficult to get support for this into Nova. I know
> >> there have been proposed patches for it before, but I don't think there
> >> was ever much push behind them because it isn't something most people
> >> need to do.
> >>
> >> So if we can work through a couple of problems then in theory it should
> >> be possible to use OpenStack instances for TripleO development, which
> >> would let us do the cloudy thing and have someone else worry about the
> >> hardware. The idea certainly has some appeal to me.
> >
> > I'm tempted to say if we could pull this off it would be great, but I'm
> > worried it would differ too much from our target deployment method. We
> > would be spreading ourselves too thin trying to support this for
> > developers along with our traditional deployment method. Also if using
> > it is adopted by too many people the only thing exercising our target
> > deployment method would be CI. But I'm interested in what other people
> > think.
>
> To clarify my suggestion, I want to take James's idea a step further and
> do the full PXE deploy to Nova instances. So the only difference from
> our workflow now would be that instead of configuring your VMs with
> virsh, you would do it with Nova and Neutron.
>
Seems like this would be the most scalable long term plan. Would it be
as easy as enabling PXE bios for instances and allowing users to set
the appropriate DHCP options for the tftp server on these ports?
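
As a rough illustration of the DHCP half of that question, here is a minimal
sketch of the request body one might send to update a Neutron port, assuming
the cloud exposes Neutron's extra DHCP options extension (extra_dhcp_opts).
The helper name, the 192.0.2.1 address and the pxelinux.0 boot file are
placeholders, and this does nothing about the PXE BIOS side of the problem.

```python
import json


def pxe_port_update_body(tftp_server_ip, bootfile="pxelinux.0"):
    """Build a port-update body pointing a port's DHCP at a TFTP server.

    Hypothetical helper: it only builds the JSON structure and sends nothing.
    It assumes the extra_dhcp_opts extension is enabled on the target cloud.
    """
    return {
        "port": {
            "extra_dhcp_opts": [
                {"opt_name": "tftp-server", "opt_value": tftp_server_ip},
                {"opt_name": "server-ip-address", "opt_value": tftp_server_ip},
                {"opt_name": "bootfile-name", "opt_value": bootfile},
            ]
        }
    }


if __name__ == "__main__":
    # 192.0.2.1 is a documentation-range address standing in for the seed
    # (or undercloud) host that serves TFTP.
    print(json.dumps(pxe_port_update_body("192.0.2.1"), indent=2))
```

Whether a public cloud would let tenants set these options, and whether its
instances would honour them at boot, is exactly the open question.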