[openstack-dev] [grenade] upgrades vs rootwrap

Clint Byrum clint at fewbar.com
Wed Jul 6 16:44:14 UTC 2016


Excerpts from Matthew Treinish's message of 2016-07-06 11:55:53 -0400:
> On Wed, Jul 06, 2016 at 10:34:49AM -0500, Matt Riedemann wrote:
> > On 6/27/2016 6:24 AM, Sean Dague wrote:
> > > On 06/26/2016 10:02 PM, Angus Lees wrote:
> > > > On Fri, 24 Jun 2016 at 20:48 Sean Dague <sean at dague.net> wrote:
> > > > 
> > > >     On 06/24/2016 05:12 AM, Thierry Carrez wrote:
> > > >     > I'm adding Possibility (0): change Grenade so that rootwrap filters
> > > >     > from N+1 are put in place before you upgrade.
> > > > 
> > > >     If you do that as general course what you are saying is that every
> > > >     installer and install process includes overwriting all of rootwrap
> > > >     before every upgrade. Keep in mind we do upstream upgrade as offline,
> > > >     which means that we've fully shut down the cloud. This would remove the
> > > >     testing requirement that rootwrap configs were even compatible between N
> > > >     and N+1. And if you think this is theoretical, you should see the patches
> > > >     I've gotten over the years to grenade because people didn't see an issue
> > > >     with that at all. :)
> > > > 
> > > >     I do get that people don't like the constraints we've self imposed, but
> > > >     we've done that for very good reasons. The #1 complaint from operators,
> > > >     forever, has been the pain and danger of upgrading. That's why we are
> > > >     still trademarking new Juno clouds. When you upgrade Apache, you don't
> > > >     have to change your config files.
> > > > 
> > > > 
> > > > In case it got lost, I'm 100% on board with making upgrades safe and
> > > > straightforward, and I understand that grenade is merely a tool to help
> > > > us test ourselves against our process and not an enemy to be worked
> > > > around.  I'm an ops guy proud and true and hate you all for making
> > > > openstack hard to upgrade in the first place :P
> > > > 
> > > > Rootwrap configs need to be updated in line with new rootwrap-using code
> > > > - that's just the way the rootwrap security mechanism works, since the
> > > > security "trust" flows from the root-installed rootwrap config files.
> > > > 
> > > > I would like to clarify what our self-imposed upgrade rules are so that
> > > > I can design code within those constraints, and no-one is answering my
> > > > question so I'm just getting more confused as this thread progresses...
> > > > 
> > > > ***
> > > > What are we trying to impose on ourselves for upgrades for the present
> > > > and near future (ie: while rootwrap is still a thing)?
> > > > ***
> > > > 
> > > > A. Sean says above that we do "offline" upgrades, by which I _think_ he
> > > > means a host-by-host (or even global?) "turn everything (on the same
> > > > host/container) off, upgrade all files on disk for that host/container,
> > > > turn it all back on again".  If this is the model, then we can trivially
> > > > update rootwrap files during the "upgrade" step, and I don't see any
> > > > reason why we need to discuss anything further - except how we implement
> > > > this in grenade.
> > > > 
> > > > B. We need to support a mix of old + new code running on the same
> > > > host/container, running against the same config files (presumably
> > > > because we're updating service-by-service, or want to minimise the
> > > > service-unavailability during upgrades to literally just a process
> > > > restart).  So we need to think about how and when we stage config vs
> > > > code updates, and make sure that any overlap is appropriately allowed
> > > > for (expand-contract, etc).
> > > > 
> > > > C. We would like to just never upgrade rootwrap (or other config) files
> > > > ever again (implying a freeze in as_root command lines, effective ~a
> > > > year ago).  Any config update is an exception dealt with through
> > > > case-by-case process and release notes.
> > > > 
> > > > 
> > > > I feel like the grenade check currently implements (B) with a 6 month
> > > > lead time on config changes, but the "theory of upgrade" doc and our
> > > > verbal policy might actually be (C) (see this thread, eg), and Sean
> > > > above introduced the phrase "offline" which threw me completely into
> > > > thinking maybe we're aiming for (A).  You can see why I'm looking for
> > > > clarification  ;)
> > > 
> > > Ok, there is theory of what we are striving for, and there is what is
> > > viable to test consistently.
> > > 
> > > The thing we are shooting for is making the code Continuously
> > > Deployable. Which means the upgrade process should be "pip install -U
> > > $foo && $foo-manage db-sync" on the API surfaces and "pip install -U
> > > $foo; service restart" on everything else.
> > > 
> > > Logic we can put into the python install process is common logic shared
> > > by all deployment tools, and we can encode it in there. So all
> > > installers just get it.
> > > 
> > > The challenge is there is no facility for config file management in
> > > python native packaging. Which means that software which *depends* on
> > > config files for new or even working features now moves from the camp of
> > > CDable to manual upgrade needed. What you need to do is in releasenotes,
> > > not in code that's shipped with your software. Release notes are not
> > > scriptable.
> > > 
> > > So, we've said, doing that has to be the exception and not the rule.
> > > It's also the same reasoning behind our deprecation phase for all config
> > > options. Things move from working (in N), to working with warnings (in
> > > N+1), to not working (in N+2). Which allows people to CD across this
> > > boundary, and do config file fixing in their Config Management tools
> > > *post* upgrade.
> > 
> > rootwrap filters aren't config options, but I get the feeling we're
> > shoe-horning grenade to treat them as such.
> > 
> > I get why grenade tests how it does so we give a window for configuration
> > option deprecation. That's great and useful.
> > 
> > What I'm struggling with, and assuming others on this thread are, is the
> > difference with rootwrap filters, which are going to be required to be in
> > place for the code that relies on them to work.
> > 
> > That's not the same for config options, i.e. my nova.conf from mitaka
> > doesn't need new options from newton for my newton code to work, because if
> > the option isn't in nova.conf explicitly, my newton code gets the defaults
> > from oslo.config because it's in the code.
> > 
> > That doesn't work for rootwrap filters. So it really seems that putting the
> > newton rootwrap filters in place before running the newton code makes the
> > most sense, at least to me.
> > 
> > The problem I could see us running into is if in newton we dropped some no
> > longer used code which also was the last/only thing using a given rootwrap
> > filter, and we dropped that too. But maybe something that wasn't upgraded
> > (so mitaka code) on that same host is still relying on that rootwrap filter.
> > Maybe that's not possible though since the only down-level thing in nova
> > that we support is computes, and those would be separate nodes. If it was
> > single-node, deploying the controller code would also update the rootwrap
> > filters I'd think (if we went that route).
> > 
> > Am I missing something else here?
> 
> Well, for better or worse rootwrap filters are put in /etc and treated like a
> config file. What you're essentially saying is that it shouldn't be config and
> just be in code. I completely agree with that being what we want eventually, but
> it's not how we advertise it today. Privsep sounds like it's our way of making
> this migration. But, it doesn't change the status quo where it's this hybrid
> config/code thing today, like policy was in nova before:
> 
> http://specs.openstack.org/openstack/nova-specs/specs/newton/approved/policy-in-code.html
> 
> (which has come up before as another tension point in the past during upgrades)
> I don't think we should break what we're currently enforcing today because we
> don't like the model we've built. We need to handle the migration to the new
> better thing gracefully so we don't break people who are relying on our current
> guarantees, regardless of how bad they are.
> 

What if we just made rootwrap fall back to a path inside the python
package that the filters pertain to? So if you're running nova-rootwrap,
you'd look in os.path.dirname(nova.__file__) for a rootwrap.d.
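
To make the idea concrete, here is a minimal sketch of that lookup. The
names filter_dirs and etc_root are hypothetical, made up for illustration;
this is not how oslo.rootwrap resolves its filter directories today, only
one way the fallback could look.

```python
import importlib
import os


def filter_dirs(package_name, etc_root="/etc"):
    """Return existing rootwrap.d directories, operator-managed one first.

    Hypothetical helper for illustration; not part of oslo.rootwrap.
    """
    candidates = [os.path.join(etc_root, package_name, "rootwrap.d")]

    # Fallback: filters shipped inside the installed package, so they are
    # replaced together with the code on "pip install -U".
    package = importlib.import_module(package_name)
    package_file = getattr(package, "__file__", None)
    if package_file:  # namespace packages have no usable __file__
        candidates.append(
            os.path.join(os.path.dirname(package_file), "rootwrap.d"))

    # Only report directories that actually exist on this host.
    return [d for d in candidates if os.path.isdir(d)]
```

With this, filter_dirs("nova") would return /etc/nova/rootwrap.d if it
exists, followed by the rootwrap.d next to nova/__init__.py if the package
ships one. How the two sets get merged, and which one wins, is a policy
decision the sketch leaves open. The in-package directory would also have
to be root-owned and not writable by the service user, since that is where
the trust comes from.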

There are plenty of python packages that ship data in their modules. I
don't see why we couldn't do it. And this would also eliminate the
need for shipping config files in pip packages, which I know has been
something that went around and around on rootwrap, among other issues.

I'm all for a migration to privsep, but I think that takes time.
Meanwhile, we are going to slam into this wall over and over for _years_.


