[openstack-dev] Ops-Dev communication (was: /var/lib/nova/instances fs filled up corrupting my Linux instances)

Michael Still mikal at stillhq.com
Fri Mar 22 03:57:27 UTC 2013


On Fri, Mar 22, 2013 at 1:33 PM, Lorin Hochstein
<lorin at nimbisservices.com> wrote:
> Davanum:
>
> I don't think the problem is necessarily the existing mechanism for
> reporting bugs or feature requests (I do think that ops aren't reporting
> usability issues as bugs, even though they should, but put that aside for a
> moment).
>
> My worry is about the disconnect between how developers believe operators
> use OpenStack, and how operators actually use OpenStack, and the problems
> caused by that disconnect. What initially prompted this was the assumption
> that devs could introduce a new feature that was disabled by default, and
> then after a couple of releases they could enable it by default, the
> assumption being that operators would have tested this experimental feature
> in the initial releases. But operators don't test against features like
> that, so this was an incorrect assumption: introducing a new feature that is
> disabled by default doesn't necessarily lead to operators testing it.

I must admit I found this frustrating. I'm not sure how else we can
responsibly add features. I guess we could just enable them and use
the RC process to shake out bugs, but I'm not convinced that many
operators are deploying RCs in test environments either.

> Another example is how operators write scripts that do things like poke
> directly at the database in order to work around missing features in the
> tools. Here is a case where they should be reporting usability issues. But
> they don't. And so these scripts use the equivalent of an undocumented,
> internal interface that could break in a future release. I worry that the
> ops are not communicating back to the devs when they have to poke at
> internals to work around problems. And, honestly, I don't have a good
> suggestion here (unless we could "embed" some OpenStack devs into
> environments with production deployments and have them watch what happens,
> which would be great, but probably not a viable solution).

Yes, this worries me a lot. First off, because it means the actual
product isn't meeting their needs, but additionally because it's
fragile. Some of the base image cleanup scripts I've seen are flat-out
dangerous, and I had to help a bunch of people with the new fixed IP
quotas who had edited the database directly to add quotas but had
gotten it subtly wrong.

I don't have good answers here. I had hoped this mailing list would
offer a solution through better communication between the two groups,
but I'm not sure that's supported by the last six months of traffic here.

One of the things I'm going to try to do a better job of in Havana is
blogging about major features as I add them. Maybe people will come out
of the woodwork with comments if they see a post.

I'm very open to other suggestions.

Michael


