Open Stack

Sat Apr 27 10:53:13 UTC 2019

On Fri, Apr 26, 2019 at 5:03 AM Mark Goddard <mark at stackhpc.com> wrote:
>
>
>
> On Thu, 25 Apr 2019 at 23:50, Matt Riedemann <mriedemos at gmail.com> wrote:
>>
>> On 4/24/2019 8:21 AM, Mark Goddard wrote:
>> > I put together a patch for kolla-ansible with support for upgrade checks
>> > for some projects: https://review.opendev.org/644528. It's on the
>> > backburner at the moment but I plan to return to it during the Train
>> > cycle. Perhaps you could clarify a few things about expected usage.
>>
>> Cool. I'd probably try to pick one service (nova?) to start with before
>> trying to bite off all of these in a single change (that review is kind
>> of daunting).
>>
>> Also, as part of the community wide goal I wrote up reference docs in
>> the nova tree [1] which might answer your questions with links for more
>> details.
>>
>> >
>> > 1. Should the tool be run using the new code? I would assume so.
>>
>> Depends on what you mean by "new code". When nova introduced this in
>> Ocata it was meant to be run in a venv or container after upgrading the
>> newton schema and data migrations to ocata, but before restarting the
>> services with the ocata code and that's how grenade uses it. But the
>> checks should also be idempotent and can be run as a
>> post-install/upgrade verify step, which is how OSA uses it (and is
>> described in the nova install docs [2]).
>>
>
> In kolla land, I mean: should I use the container image for the current release or the target release to execute the nova-status command? It sounds like it's the latter, which also implies we're using the target version of kolla/kolla-ansible. I hadn't twigged that we'd need to perform the schema upgrade and online migrations first.
>
>>
>> > 2. How would you expect this to be run with multiple projects? I was
>> > thinking of adding a new command that performs upgrade checks for all
>> > projects that would be read-only, then also performing the check again
>> > as part of the upgrade procedure.
>>
>> Hmm, good question. This probably depends on each deployment tool and
>> how they roll through services to do the upgrade. Obviously you'd want
>> to run each project's checks as part of upgrading that service, but I
>> guess you're looking for some kind of "should we even start this whole
>> damn upgrade if we can detect early that there are going to be issues?".
>> If the early run is read-only though - and I'm assuming by read-only you
>> mean they won't cause a failure - how are you going to expose that there
>> is a problem without failing? Would you make that configurable?
>> Otherwise the checks themselves are supposed to be read-only and not
>> change your data (they aren't the same thing as an online data migration
>> routine for example).
>>
>
> If we need to have run the schema upgrade and migrations before the upgrade check, I think that reduces the usefulness of a separate check operation. I was thinking you might be able to run the checks against the system prior to making any upgrade changes, but it seems not. I guess a separate check after the upgrade might still be useful for diagnosing upgrade issues from warnings.
>
>>
>> > 3. For the warnings, would you recommend a -Werror style argument that
>> > optionally flags up warnings as errors? Reporting non-fatal errors is
>> > quite difficult in Ansible.
>>
>> OSA fails on any return codes that aren't 0 (success) or 1 (warning).
>> It's hard to say when warning should be considered an error really. When
>> writing these checks I think of warning as a case where you might be OK
>> but we don't really know for sure, so it can aid in debugging
>> upgrade-related issues after the fact but might not necessarily mean you
>> shouldn't upgrade. mnaser has brought up the idea in the past of making
>> the output more machine readable so tooling could pick and choose which
>> things it considers to be a failure (assuming the return code was 1).
>> That's an interesting idea but one I haven't put a lot of thought into.
>> It might be as simple as outputting a unique code per check per project,
>> sort of like the error code concept in the API guidelines [3] which the
>> placement project is using [4].
>>
>
> Machine readable would be nice. Perhaps there's something we could do to generate a report of the combined results.

Interesting that you bring that up. I made an attempt a while back but didn't
have the resources to drive it through.

https://review.opendev.org/#/c/576944/
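
To make the return-code discussion above concrete, here is a minimal Python
sketch of how a deployment tool might apply a -Werror style policy and build a
combined, machine-readable report. It is not taken from the review above or
from any project's code. Codes 0 and 1 are as described in the thread; treating
2 as failure follows the nova-status documentation, and any other code is
assumed here to be an unexpected error. The names CheckResult, is_fatal and
combined_report are hypothetical.

```python
import json
from dataclasses import dataclass

# Return codes: 0 and 1 as described in the thread; 2 per the nova-status docs.
STATUS_NAMES = {0: "success", 1: "warning", 2: "failure"}


@dataclass
class CheckResult:
    """Outcome of running one project's upgrade check (hypothetical type)."""
    project: str
    return_code: int


def is_fatal(return_code, warnings_as_errors=False):
    """Decide whether a return code should stop the upgrade.

    Success never stops it, a warning stops it only under the -Werror
    style option, and anything else is always treated as fatal.
    """
    if return_code == 0:
        return False
    if return_code == 1:
        return warnings_as_errors
    return True


def combined_report(results, warnings_as_errors=False):
    """Merge per-project results into one JSON-serialisable dict."""
    projects = {}
    for result in results:
        projects[result.project] = {
            "return_code": result.return_code,
            # Unknown codes are assumed to mean an unexpected error.
            "status": STATUS_NAMES.get(result.return_code, "error"),
            "fatal": is_fatal(result.return_code, warnings_as_errors),
        }
    ok = not any(entry["fatal"] for entry in projects.values())
    return {"ok": ok, "projects": projects}


if __name__ == "__main__":
    sample = [CheckResult("nova", 0), CheckResult("cinder", 1)]
    print(json.dumps(combined_report(sample, warnings_as_errors=True), indent=2))
```

With warnings_as_errors left at its default, the sample above would report
"ok": true while still recording the cinder warning, so tooling can surface it
without failing the run.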

>> [1] https://docs.openstack.org/nova/latest/reference/upgrade-checks.html
>> [2] https://docs.openstack.org/nova/latest/install/verify.html
>> [3] https://specs.openstack.org/openstack/api-wg/guidelines/errors.html
>> [4] https://opendev.org/openstack/placement/src/branch/master/placement/errors.py
>>
>> --
>>
>> Thanks,
>>
>> Matt

-- 
Mohammed Naser — vexxhost
-----------------------------------------------------
D. 514-316-8872
D. 800-910-1726 ext. 200
E. mnaser at vexxhost.com
W. http://vexxhost.com

[goals][upgrade-checkers] Retrospective