[OpenStack-Infra] Launch node and the new bridge server

Clark Boylan cboylan at sapwetik.org
Mon Aug 27 23:21:34 UTC 2018


Hello everyone,

I've been trying to help mnaser bootstrap a new cloud region in nodepool. Part of this requires launching a new mirror node in that cloud region, and I have discovered some interesting things related to our migration to launch node and how they affect our ability to boot new instances. I'm going to try to summarize what I have learned, as you may find yourself needing to launch nodes too. There is also plenty to do to make this better if you are interested :)

The first thing I ran into was that /etc/openstack/clouds.yaml is not managed by ansible on bridge.openstack.org. This meant the new region info was not automatically added to that file. I made a copy of the clouds.yaml file in my homedir, updated it with the new region, tested that it worked, then manually applied the same changes to the global file. Pabelanger has volunteered to work on getting this managed by ansible, and https://review.openstack.org/#/c/593029/ seems to be where Monty had started.
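For anyone repeating the "test a private copy first" step, here is a minimal sketch. It assumes openstackclient is already installed and relies on the OS_CLIENT_CONFIG_FILE environment variable, which os-client-config checks before the default locations. The cloud name, region name and file path in the usage comment are placeholders, not the real values.

```python
import os
import subprocess


def build_check(config_path, cloud, region):
    """Return (env, argv) for a server listing that uses a private clouds.yaml copy."""
    env = dict(os.environ)
    # Point os-client-config at the copy instead of /etc/openstack/clouds.yaml.
    env["OS_CLIENT_CONFIG_FILE"] = config_path
    argv = [
        "openstack",
        "--os-cloud", cloud,
        "--os-region-name", region,
        "server", "list",
    ]
    return env, argv


def run_check(config_path, cloud, region):
    """Run the listing; raises CalledProcessError if the command fails."""
    env, argv = build_check(config_path, cloud, region)
    subprocess.run(argv, env=env, check=True)


# Usage (hypothetical names):
#   run_check(os.path.expanduser("~/clouds.yaml"), "examplecloud", "exampleregion")
```

If the listing succeeds against the copy, the same changes can then be applied to the global file.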

While testing the clouds.yaml changes I found that openstackclient is not installed on bridge, which taught me that virtualenv is weird on Bionic with python3. `python3 /usr/lib/python3/dist-packages/virtualenv.py --python=python3 venv` is what I ended up using to create a virtualenv to install openstackclient into.
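A rough sketch of those two steps as a script, in case it saves someone some typing. The virtualenv.py path is the one quoted above; the pip package name python-openstackclient and the venv directory are assumptions on my part, so adjust as needed.

```python
import subprocess

# Path to the distro-packaged virtualenv module on Bionic, as quoted above.
VIRTUALENV_PY = "/usr/lib/python3/dist-packages/virtualenv.py"


def venv_commands(venv_dir):
    """Return the commands that create the venv and install openstackclient."""
    return [
        ["python3", VIRTUALENV_PY, "--python=python3", venv_dir],
        # Assumption: openstackclient is installed from PyPI under this name.
        [venv_dir + "/bin/pip", "install", "python-openstackclient"],
    ]


def create_venv(venv_dir):
    """Run each command in order, stopping at the first failure."""
    for argv in venv_commands(venv_dir):
        subprocess.run(argv, check=True)


# Usage: create_venv("venv")
```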

With that done I ran launch-node.py and ran into a couple of issues. First, we need volume size specification support for boot from volume hosts (added in https://review.openstack.org/554313). I also updated the docs to be clear that we run under python3 in https://review.openstack.org/#/c/596859/1 and fixed a bug with unlinking ansible caches at https://review.openstack.org/#/c/596873/1. Reviews much appreciated.

This got the instance booting happily, but then ssh wasn't working. That was the result of cloud launcher not running on bridge.o.o yet (and it's not running on the old puppet master either). I went ahead and manually triggered cloud launcher against both new accounts in this cloud region to update ssh root keys and security groups. We will need to get this running automatically on bridge.o.o.

This resulted in launch-node.py exiting successfully, but then the puppet cron failed to run puppet on the host to turn it into a new mirror node because there is no python2 installed on Xenial by default. https://review.openstack.org/#/c/596913/ is a workaround for this (explicitly install python2 on xenial for now; this will help transition us from trusty to xenial too). This also resulted in some other attempts to address the problem at https://review.openstack.org/#/c/596905/, https://review.openstack.org/#/c/596911/1, and https://review.openstack.org/#/c/596894/. Once again, reviews welcome.

The last thing to note out of this is that Trusty comes with python3.4, centos7 doesn't have python3, and we can't run ansible under python3 on these platforms. This means launch-node.py cannot currently boot trusty or centos nodes. If we end up needing to boot a trusty or centos node, we will have to force it to use python2 on the remote instead.
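As an illustration of what "force python2 on the remote" could look like, here is a small sketch that picks a value for Ansible's ansible_python_interpreter variable per platform. This is not what launch-node.py does today. The platform labels, interpreter paths and host name are assumptions for the example.

```python
# Platforms where the remote side would need python2, per the paragraph above.
PYTHON2_ONLY = {"trusty", "centos7"}


def remote_interpreter(platform):
    """Pick the interpreter path Ansible should use on the remote host."""
    if platform.lower() in PYTHON2_ONLY:
        return "/usr/bin/python2"  # assumed path
    return "/usr/bin/python3"  # assumed path


def inventory_line(host, platform):
    """Render an INI-style inventory entry that pins the remote interpreter."""
    return "{} ansible_python_interpreter={}".format(host, remote_interpreter(platform))


# Usage (hypothetical host name):
#   inventory_line("mirror01.example.org", "trusty")
```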

TL;DR We need to:
* Manage clouds.yaml with ansible
* Fix a couple bugs in launch-node.py
* Run cloud launcher on bridge.openstack.org
* Figure out our python2 to python3 transition plan for Ansible

Hopefully this helps you out if you end up needing to launch a new node while we transition to bridge.openstack.org.

Clark


