[openstack-dev] [tripleo] glance backend: replace swift by file in CI
emilien at redhat.com
Mon Jun 27 19:30:58 UTC 2016
Today we're re-investigating a CI failure that we have hit multiple times:
Swift memory usage grows until it is OOM-killed.
The scope of this thread is our CI, not production environments.
Indeed, our CI runs with limited resources, while production
environments should not hit this problem.
After some investigation on #tripleo, we found out that this scenario has
been happening almost every time recently:
* undercloud is deployed, glance and swift are running. Glance is
configured with Swift backend to store images.
* tripleo CI uploads the overcloud image into Glance; the image is successfully uploaded.
* when overcloud starts deploying, some nodes randomly fail to deploy
because the undercloud OOM-kills swift-proxy-server that is still
sending the overcloud image requested by Glance API. Swift fails,
Glance fails, overcloud deployment fails with a "No valid hosts
It's likely due to performance issues in our CI, and there is nothing
we can do except add more resources or reduce the number of
environments, which we won't do at this time because of the recent
improvements to our CI (more RAM, SSDs, etc.).
As a first iteration, I propose that we stop using Swift as a
backend for Glance. Indeed, our undercloud is currently single-node, so I
see zero value in using Swift to store the overcloud image.
If there is value in it, then we can add an option to choose whether or not
to use it (and set it to False in our CI to use the file backend, which
won't lead to OOM).
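
To make the idea concrete, here is a rough sketch of how such an option could
map to Glance settings. This is only an illustration, not the actual tripleo
implementation: the name use_swift_backend and both helper functions are
hypothetical, the image directory is an assumed path, and the Swift
credentials a real deployment needs are left out. The stores, default_store
and filesystem_store_datadir keys are the usual [glance_store] options.

```python
def glance_store_settings(use_swift_backend: bool) -> dict:
    """Return [glance_store] options for the undercloud's glance-api.conf.

    `use_swift_backend` is a hypothetical name for the proposed option.
    """
    if use_swift_backend:
        # Swift auth and endpoint options are omitted from this sketch.
        return {"stores": "swift,http", "default_store": "swift"}
    return {
        "stores": "file,http",
        "default_store": "file",
        # Assumed path; adjust to wherever the undercloud keeps images.
        "filesystem_store_datadir": "/var/lib/glance/images/",
    }


def render_section(settings: dict) -> str:
    """Render the settings as an ini-style [glance_store] section."""
    lines = ["[glance_store]"]
    lines += [f"{key} = {value}" for key, value in settings.items()]
    return "\n".join(lines)


if __name__ == "__main__":
    # CI case: option set to False, so Glance stores images on local disk.
    print(render_section(glance_store_settings(use_swift_backend=False)))
```

With the option set to False, Glance reads the overcloud image straight from
local disk, so swift-proxy-server is no longer involved in serving it.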
Note: on the overcloud, we currently support file, swift and rbd
backends, which you can easily select during your deployment.