[goals][upgrade-checkers] Retrospective

Mohammed Naser mnaser at vexxhost.com
Sat Apr 27 10:53:13 UTC 2019

On Fri, Apr 26, 2019 at 5:03 AM Mark Goddard <mark at stackhpc.com> wrote:
> On Thu, 25 Apr 2019 at 23:50, Matt Riedemann <mriedemos at gmail.com> wrote:
>> On 4/24/2019 8:21 AM, Mark Goddard wrote:
>> > I put together a patch for kolla-ansible with support for upgrade checks
>> > for some projects: https://review.opendev.org/644528. It's on the
>> > backburner at the moment but I plan to return to it during the Train
>> > cycle. Perhaps you could clarify a few things about expected usage.
>> Cool. I'd probably try to pick one service (nova?) to start with before
>> trying to bite off all of these in a single change (that review is kind
>> of daunting).
>> Also, as part of the community wide goal I wrote up reference docs in
>> the nova tree [1] which might answer your questions with links for more
>> details.
>> >
>> > 1. Should the tool be run using the new code? I would assume so.
>> Depends on what you mean by "new code". When nova introduced this in
>> Ocata it was meant to be run in a venv or container after upgrading the
>> newton schema and data migrations to ocata, but before restarting the
>> services with the ocata code and that's how grenade uses it. But the
>> checks should also be idempotent and can be run as a
>> post-install/upgrade verify step, which is how OSA uses it (and is
>> described in the nova install docs [2]).
> In kolla land, I mean: should I use the container image for the current release or the target release to execute the nova-status command? It sounds like the latter, which also implies we're using the target version of kolla/kolla-ansible. I hadn't twigged that we'd need to perform the schema upgrade and online migrations first.
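The ordering Matt describes (schema migrations, then online data migrations, then the status check, all with the target release's code, before restarting services) could be sketched like this. This is a minimal illustration, not kolla-ansible's actual implementation; the step commands are nova's, and the runner hook is invented for testability:

```python
import subprocess

# Hypothetical sketch of the ordering described above: run the schema
# and online data migrations with the *target* release's code, then run
# the upgrade check, and only then restart services on the new code.
UPGRADE_STEPS = [
    ["nova-manage", "db", "sync"],                    # schema migrations
    ["nova-manage", "db", "online_data_migrations"],  # data migrations
    ["nova-status", "upgrade", "check"],              # pre-restart check
]

def run_step(cmd, runner=subprocess.run):
    """Run one upgrade step and return its exit code (0 = success).

    The runner parameter is an invented hook so the ordering logic can
    be exercised without real nova commands present."""
    return runner(cmd).returncode
```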
>> > 2. How would you expect this to be run with multiple projects? I was
>> > thinking of adding a new command that performs upgrade checks for all
>> > projects that would be read-only, then also performing the check again
>> > as part of the upgrade procedure.
>> Hmm, good question. This probably depends on each deployment tool and
>> how they roll through services to do the upgrade. Obviously you'd want
>> to run each project's checks as part of upgrading that service, but I
>> guess you're looking for some kind of "should we even start this whole
>> damn upgrade if we can detect early that there are going to be issues?".
>> If the early run is read-only though - and I'm assuming by read-only you
>> mean they won't cause a failure - how are you going to expose that there
>> is a problem without failing? Would you make that configurable?
>> Otherwise the checks themselves are supposed to be read-only and not
>> change your data (they aren't the same thing as an online data migration
>> routine for example).
> If the schema upgrade and migrations need to have been run before the upgrade check, I think that reduces the usefulness of a separate check operation. I was thinking we might be able to run the checks against the system prior to making any upgrade changes, but it seems not. I guess a separate check after the upgrade might still be useful for diagnosing upgrade issues from warnings.
>> > 3. For the warnings, would you recommend a -Werror style argument that
>> > optionally flags up warnings as errors? Reporting non-fatal errors is
>> > quite difficult in Ansible.
>> OSA fails on any return codes that aren't 0 (success) or 1 (warning).
>> It's hard to say when a warning should be considered an error, really. When
>> writing these checks I think of warning as a case where you might be OK
>> but we don't really know for sure, so it can aid in debugging
>> upgrade-related issues after the fact but might not necessarily mean you
>> shouldn't upgrade. mnaser has brought up the idea in the past of making
>> the output more machine readable so tooling could pick and choose which
>> things it considers to be a failure (assuming the return code was 1).
>> That's an interesting idea but one I haven't put a lot of thought into.
>> It might be as simple as outputting a unique code per check per project,
>> sort of like the error code concept in the API guidelines [3] which the
>> placement project is using [4].
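Tying the two threads together: nova-status documents exit codes 0 (all checks passed), 1 (warnings only) and 2 (at least one failure), and OSA's policy above is to fail on anything other than 0 or 1. A -Werror-style knob like the one Mark asks about might look like this (the warnings_as_errors flag is hypothetical, not an existing option):

```python
# nova-status upgrade check exit codes, per the nova docs:
#   0 = all checks passed, 1 = warnings only, 2 = at least one failure
SUCCESS, WARNING, FAILURE = 0, 1, 2

def upgrade_blocked(returncode, warnings_as_errors=False):
    """Decide whether to abort the upgrade based on the check's exit code.

    Default mirrors OSA's policy (fail on anything other than 0 or 1);
    warnings_as_errors=True gives the stricter -Werror-style behaviour
    Mark describes for Ansible, where non-fatal errors are awkward."""
    if returncode == SUCCESS:
        return False
    if returncode == WARNING:
        return warnings_as_errors
    return True
```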
> Machine readable would be nice. Perhaps there's something we could do to generate a report of the combined results.

Interesting that you bring that up; I made an attempt a while back but didn't
have the resources to drive it through.
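For the combined report Mark suggests, a consumer of machine-readable results might look something like this. The JSON shape is invented purely for illustration; the checkers emit no such format today:

```python
import json

# Invented result format: one record per check, per project, with the
# same 0/1/2 code semantics as the nova-status exit codes.
SAMPLE = json.loads("""
[
  {"project": "nova",   "check": "Cells v2",      "code": 0},
  {"project": "nova",   "check": "Placement API", "code": 1},
  {"project": "cinder", "check": "Policy files",  "code": 2}
]
""")

def combined_report(results, failure_codes=(2,)):
    """Summarise all projects' checks into one report.

    failure_codes lets tooling pick and choose what it treats as fatal,
    in the spirit of the machine-readable idea discussed above."""
    worst = max((r["code"] for r in results), default=0)
    failed = [r for r in results if r["code"] in failure_codes]
    lines = ["%s: %s -> code %d" % (r["project"], r["check"], r["code"])
             for r in results]
    return {"failed": bool(failed), "worst": worst, "report": lines}
```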


>> [1] https://docs.openstack.org/nova/latest/reference/upgrade-checks.html
>> [2] https://docs.openstack.org/nova/latest/install/verify.html
>> [3] https://specs.openstack.org/openstack/api-wg/guidelines/errors.html
>> [4]
>> https://opendev.org/openstack/placement/src/branch/master/placement/errors.py
>> --
>> Thanks,
>> Matt

Mohammed Naser — vexxhost
D. 514-316-8872
D. 800-910-1726 ext. 200
E. mnaser at vexxhost.com
W. http://vexxhost.com
