[openstack-dev] "bad" default values in conf files
Clint Byrum
clint at fewbar.com
Thu Feb 13 15:01:05 UTC 2014
Excerpts from David Kranz's message of 2014-02-13 06:38:52 -0800:
> I was recently bitten by a case where some defaults in keystone.conf
> were not appropriate for real deployment, and our puppet modules were
> not providing better values
> https://bugzilla.redhat.com/show_bug.cgi?id=1064061.
Just taking a look at that issue, Keystone's PKI and revocation are
causing all kinds of issues with performance that are being tackled with
a bit of a redesign. I doubt we can find a cache timeout setting that
will work generically for everyone, but if we make detecting revocation
scale, we won't have to.
The default probably is too low, but raising it too high will cause
concern with those who want revoked tokens to take effect immediately
and are willing to scale the backend to get that result.
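To make that trade-off concrete, here is a minimal sketch of a revocation list cached with a fixed timeout. This is not Keystone's code: the CachedRevocationList class, its fetch callable, and the 300-second timeout are all assumptions made up for illustration. It only shows that a revoked token can keep being accepted for up to one full timeout.

```python
class CachedRevocationList:
    """Hypothetical helper: caches the set of revoked token IDs for `ttl` seconds."""

    def __init__(self, fetch, ttl):
        self._fetch = fetch          # callable returning the current revoked IDs
        self._ttl = ttl              # cache timeout in seconds (assumed value)
        self._cached = frozenset()
        self._fetched_at = None

    def is_revoked(self, token_id, now):
        # Go back to the backend only when the cached copy is older than the TTL.
        if self._fetched_at is None or now - self._fetched_at >= self._ttl:
            self._cached = frozenset(self._fetch())
            self._fetched_at = now
        return token_id in self._cached


backend_revoked = set()              # stands in for the real revocation backend
cache = CachedRevocationList(lambda: backend_revoked, ttl=300)

cache.is_revoked("tok-1", now=0)     # primes the cache at t=0
backend_revoked.add("tok-1")         # the token is revoked at t=1

stale = cache.is_revoked("tok-1", now=299)   # still served from the old copy
fresh = cache.is_revoked("tok-1", now=300)   # cache expired, backend consulted

print(stale, fresh)
```

A lower ttl shrinks the window in which the revoked token is still accepted, at the cost of more frequent trips to the backend, which is why no single value suits everyone.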
> Since there are
> hundreds (thousands?) of options across all the services, I am wondering
> whether there are other similar issues lurking and if we have done what
> we can to flush them out.
>
> Defaults in conf files seem to be one of the following:
>
> - Generic, appropriate for most situations
> - Appropriate for devstack
> - Appropriate for small, distro-based deployment
> - Appropriate for large deployment
>
> Upstream, I don't think there is a shared view of how defaults should be
> chosen.
>
I don't know that we have been clear enough about this, but nobody has
ever challenged the assertion we've been making for a while in TripleO
which is that OpenStack _must_ have production defaults. We don't make
OpenStack for devstack.
In TripleO, we consider it a bug when we can't run with the default value
of an option that isn't directly related to whatever makes that cloud unique. So
the virt driver: meh, that's a choice, but leaving file injection on is
really not appropriate for 99% of users in production. Also you'll see
quite a few commits from me in the keystone SQL token driver trying to
speed it up because the old default token backend was KVS (in-memory),
which was fast, but REALLY not useful in production. We found these
things by running the defaults in a long-running cloud and noticing where
the performance problems are, and we intend to keep doing that.
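One way to apply that rule is to list every option a deployment had to change from its default, then ask of each one whether it is a site-specific choice or a bad default. The sketch below is only an illustration under stated assumptions: find_overrides, the [example] section, and the option names and values are hypothetical and are not real OpenStack options or defaults.

```python
import configparser


def find_overrides(defaults, deployed_text):
    """Return (section, option, default, deployed) for every changed option."""
    # Treat [DEFAULT] as an ordinary section so its values are not
    # silently inherited by every other section.
    parser = configparser.ConfigParser(
        default_section="__unused__", interpolation=None
    )
    parser.read_string(deployed_text)

    overrides = []
    for section in parser.sections():
        for option, value in parser.items(section):
            default = defaults.get(section, {}).get(option)
            if default is not None and value != default:
                overrides.append((section, option, default, value))
    return overrides


# Hypothetical shipped defaults (placeholders, not real option names).
DEFAULTS = {
    "example": {
        "token_backend": "kvs",
        "file_injection": "true",
        "virt_driver": "libvirt",
    },
}

# Hypothetical config from a running cloud.
DEPLOYED = """
[example]
token_backend = sql
file_injection = false
virt_driver = libvirt
"""

overrides = find_overrides(DEFAULTS, DEPLOYED)
for section, option, default, value in overrides:
    print(f"[{section}] {option}: default={default!r} deployed={value!r}")
```

Each reported line is a candidate bug against the default unless the override reflects something unique to that cloud.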
So perhaps we should encode this assertion in
https://wiki.openstack.org/wiki/ReviewChecklist
> Keeping bad defaults can have a huge impact on performance and on when a
> system falls over, but the problems may not be visible until some time
> after a system gets into real use. Have the folks creating our puppet
> modules and install recommendations taken a close look at all the
> options and determined
> that the defaults are appropriate for deploying RHEL OSP in the
> configurations we are recommending?
>
TripleO is the official "deployment" program. We are taking the approach
described above. We're standing up several smallish (<50 nodes) clouds
with the intention of testing the defaults on real hardware in the gate
of OpenStack eventually.