[openstack-dev] [grenade] upgrades vs rootwrap

Angus Lees gus at inodes.org
Tue Jun 28 05:46:39 UTC 2016


Ok, thanks for the in-depth explanation.

My takeaway is that, for now, we need to handle any rootwrap updates as
exceptions (so release notes and grenade scripts).

 - Gus

On Mon, 27 Jun 2016 at 21:25 Sean Dague <sean at dague.net> wrote:

> On 06/26/2016 10:02 PM, Angus Lees wrote:
> > On Fri, 24 Jun 2016 at 20:48 Sean Dague <sean at dague.net> wrote:
> >
> >     On 06/24/2016 05:12 AM, Thierry Carrez wrote:
> >     > I'm adding Possibility (0): change Grenade so that rootwrap
> >     > filters from N+1 are put in place before you upgrade.
> >
> >     If you do that as a general course, what you are saying is that every
> >     installer and install process includes overwriting all of rootwrap
> >     before every upgrade. Keep in mind we do upstream upgrade as offline,
> >     which means that we've fully shut down the cloud. This would remove
> >     the testing requirement that rootwrap configs were even compatible
> >     between N and N+1. And if you think this is theoretical, you should
> >     see the patches I've gotten over the years to grenade because people
> >     didn't see an issue with that at all. :)
> >
> >     I do get that people don't like the constraints we've self-imposed,
> >     but we've done that for very good reasons. The #1 complaint from
> >     operators, forever, has been the pain and danger of upgrading. That's
> >     why we are still trademarking new Juno clouds. When you upgrade
> >     Apache, you don't have to change your config files.
> >
> >
> > In case it got lost, I'm 100% on board with making upgrades safe and
> > straightforward, and I understand that grenade is merely a tool to help
> > us test ourselves against our process and not an enemy to be worked
> > around.  I'm an ops guy proud and true and hate you all for making
> > openstack hard to upgrade in the first place :P
> >
> > Rootwrap configs need to be updated in line with new rootwrap-using code
> > - that's just the way the rootwrap security mechanism works, since the
> > security "trust" flows from the root-installed rootwrap config files.
> >
> > I would like to clarify what our self-imposed upgrade rules are so that
> > I can design code within those constraints, and no-one is answering my
> > question so I'm just getting more confused as this thread progresses...
> >
> > ***
> > What are we trying to impose on ourselves for upgrades for the present
> > and near future (ie: while rootwrap is still a thing)?
> > ***
> >
> > A. Sean says above that we do "offline" upgrades, by which I _think_ he
> > means a host-by-host (or even global?) "turn everything (on the same
> > host/container) off, upgrade all files on disk for that host/container,
> > turn it all back on again".  If this is the model, then we can trivially
> > update rootwrap files during the "upgrade" step, and I don't see any
> > reason why we need to discuss anything further - except how we implement
> > this in grenade.
> >
> > B. We need to support a mix of old + new code running on the same
> > host/container, running against the same config files (presumably
> > because we're updating service-by-service, or want to minimise the
> > service-unavailability during upgrades to literally just a process
> > restart).  So we need to think about how and when we stage config vs
> > code updates, and make sure that any overlap is appropriately allowed
> > for (expand-contract, etc).
> >
> > C. We would like to just never upgrade rootwrap (or other config) files
> > ever again (implying a freeze in as_root command lines, effective ~a
> > year ago).  Any config update is an exception dealt with through
> > case-by-case process and release notes.
> >
> >
> > I feel like the grenade check currently implements (B) with a 6 month
> > lead time on config changes, but the "theory of upgrade" doc and our
> > verbal policy might actually be (C) (see this thread, eg), and Sean
> > above introduced the phrase "offline" which threw me completely into
> > thinking maybe we're aiming for (A).  You can see why I'm looking for
> > clarification  ;)
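
The overlap concern in (B) can be pictured with a small sketch. It is not
part of grenade or oslo.rootwrap; the function names and the sample filter
entries are hypothetical, and it makes the simplifying assumption that each
entry in the [Filters] section has the form "name: FilterClass, command, ..."
with the command as the first argument after the filter class. It lists the
commands that the N+1 filters allow but the N filters do not, i.e. the ones
N+1 code could not run if only the old filter files were still on disk.

```python
import configparser


def allowed_commands(filters_text):
    """Collect command names from the [Filters] section of a filter file.

    Assumption: each entry looks like "name: FilterClass, command, ...".
    """
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep filter names case-sensitive
    parser.read_string(filters_text)
    commands = set()
    if parser.has_section("Filters"):
        for _name, definition in parser.items("Filters"):
            fields = [field.strip() for field in definition.split(",")]
            if len(fields) >= 2:
                commands.add(fields[1])
    return commands


def missing_from_old(old_text, new_text):
    """Commands the new filters allow that the old filters do not."""
    return sorted(allowed_commands(new_text) - allowed_commands(old_text))


# Hypothetical filter files for releases N and N+1.
OLD_FILTERS = """[Filters]
kpartx: CommandFilter, kpartx, root
mount: CommandFilter, mount, root
"""

NEW_FILTERS = """[Filters]
kpartx: CommandFilter, kpartx, root
mount: CommandFilter, mount, root
blkid: CommandFilter, blkid, root
"""

if __name__ == "__main__":
    print(missing_from_old(OLD_FILTERS, NEW_FILTERS))
```

An empty result would mean the N+1 code needs nothing beyond what the N
filters already grant; anything listed is a candidate for staging ahead of
the code change or for a release note.
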
>
> Ok, there is the theory of what we are striving for, and there is what is
> viable to test consistently.
>
> The thing we are shooting for is making the code Continuously
> Deployable. Which means the upgrade process should be "pip install -U
> $foo && $foo-manage db-sync" on the API surfaces and "pip install -U
> $foo; service restart" on everything else.
>
> Logic we can put into the python install process is common logic shared
> by all deployment tools, and we can encode it in there. So all
> installers just get it.
>
> The challenge is there is no facility for config file management in
> python native packaging. Which means that software which *depends* on
> config files for new or even working features now moves from the camp of
> CDable to manual upgrade needed. What you need to do is in release notes,
> not in code that's shipped with your software. Release notes are not
> scriptable.
>
> So, we've said, doing that has to be the exception and not the rule.
> It's also the same reasoning behind our deprecation phase for all config
> options. Things move from working (in N), to working with warnings (in
> N+1), to not working (in N+2). Which allows people to CD across this
> boundary, and do config file fixing in their Config Management tools
> *post* upgrade.
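
A minimal sketch of that N / N+1 / N+2 cycle, assuming a plain dict as the
config and hypothetical option names. It is not how oslo.config implements
deprecation, only an illustration of the three phases: the old name works in
N, works with a warning in N+1, and is rejected in N+2.

```python
import warnings


def lookup(config, old_name, new_name, release):
    """Resolve an option that is being renamed from old_name to new_name.

    release is one of "N", "N+1", "N+2" (hypothetical labels).
    """
    if new_name in config:
        return config[new_name]
    if old_name not in config:
        raise KeyError(new_name)
    if release == "N":
        # Phase 1: the old name simply works.
        return config[old_name]
    if release == "N+1":
        # Phase 2: the old name still works, but the operator is warned.
        warnings.warn(
            f"{old_name} is deprecated; use {new_name}",
            DeprecationWarning,
            stacklevel=2,
        )
        return config[old_name]
    # Phase 3 (N+2 and later): the old name is no longer accepted.
    raise KeyError(f"{old_name} was removed in {release}; use {new_name}")


if __name__ == "__main__":
    cfg = {"old_opt": "value"}  # config file not yet updated by the operator
    print(lookup(cfg, "old_opt", "new_opt", "N"))
```

With this shape, an operator who upgrades from N to N+1 without touching the
config keeps a working service and has one full release to move to the new
name.
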
>
> Our testing, like all testing, is a trade off for what we could do
> consistently, and feel confident of the results. That's grenade. We need
> to operate on an all in one node, because that's what we have. We're
> using system level installs, because > 50% of our user base does. This
> does mean all of everything is getting upgraded all at once in the
> normal pip install -U flow, because the moment you start replacing
> system level libraries, bets are kind of off for services that are still
> running.
>
> But, if we exploit every weakness of the testing to figure out exactly
> the minimum we need to make the testing pass, we stop trying to do the
> thing we set out to do: painless upgrades.
>
> The theory that rootwrap rules have to be inspected manually and
> adjusted by every deployer during upgrade seems... odd. It's like if you
> tried to upgrade firefox, and it wouldn't start until you adjusted your
> profile manually.
>
> So we are not aiming for A, we're actually aiming much higher. But
> testing, consistently, that much higher bar is a thing we can't easily
> do. So the structure of the testing for our offline upgrades, with the
> policy rules about what we should not change, is our check and balance
> for getting to properly seamless fully online upgrades.
>
>         -Sean
>
> --
> Sean Dague
> http://dague.net
>
> __________________________________________________________________________
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: OpenStack-dev-request at lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev