[tripleo][update][blueprint] Update refactor: more feedback, more control, more speed.

Sofer Athlan-Guyot sathlang at redhat.com
Thu Jul 2 11:20:48 UTC 2020


Hi,

hope you liked the title, I find it catchy.

Update is mainly an afterthought that needs to work.  So we mostly fix
"stuff" there.  No major change has happened there in a long time.

Following the PTG, I'm proposing a new blueprint and a bug:

 1. Refactor tripleo update to offer the user more feedback and
    control[1].
 2. Registering nodes and repos can happen after some modules check for
    packages[2].

I'm pretty new to this so I would need feedback about the form and
content.  For instance, point 2. could be a blueprint instead of a bug;
tell me what you think.

In more detail:

 1. refactor the update steps to load a per-step playbook instead of
    looping over the steps:
    - this will speed up the update (no more skipped tasks)
    - this will offer a point of recovery when the update fails
      (by doing something like what was done in named debug[3] for
      deployment)

 2. refactor/fix? host-prep-tasks to include two steps:
    - step0 to add pre-update in-flight validation to the update
      process and rhosp registration;
    - step1 for all other tasks;
    - make sure it runs in parallel on all nodes

Point 1. would catch up with what deployment already does.  It offers a
speed improvement as we wouldn't skip tasks anymore.  We could tell the
user what we are doing: "I'm removing the node from the cluster" instead
of "step1".  It would give the user a hook to restart a failed update
from any step.  Overall a big win, I think.
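
To make the difference concrete, here is a toy Python sketch.  It is
not TripleO code: the task names, the step numbers and the number of
steps are all made up for illustration.  It only shows why looping over
every step evaluates (and skips) most tasks, while a per-step task list
runs only what is needed and gives a natural place to resume from.

```python
# Toy model only: task names and step numbers are invented,
# they are not the real TripleO update tasks.
TASKS = [
    {"name": "remove node from cluster", "step": 1},
    {"name": "stop services", "step": 2},
    {"name": "update packages", "step": 3},
    {"name": "rejoin cluster", "step": 5},
]
MAX_STEP = 6  # hypothetical number of update steps (0..5)


def run_by_looping(tasks, max_step=MAX_STEP):
    """Looping style: every step walks every task and skips the others."""
    ran, skipped = [], 0
    for step in range(max_step):
        for task in tasks:
            if task["step"] == step:
                ran.append(task["name"])
            else:
                skipped += 1
    return ran, skipped


def split_by_step(tasks):
    """Per-step style: build one task list ("playbook") per step up front."""
    playbooks = {}
    for task in tasks:
        playbooks.setdefault(task["step"], []).append(task["name"])
    return playbooks


def run_by_playbook(tasks, start_step=0):
    """Run only the per-step lists, optionally resuming from start_step."""
    ran = []
    playbooks = split_by_step(tasks)
    for step in sorted(playbooks):
        if step >= start_step:
            ran.extend(playbooks[step])
    return ran
```

With these four made-up tasks, the looping version runs 4 tasks and
skips 20, while the per-step version skips nothing, and
run_by_playbook(TASKS, start_step=3) models restarting a failed update
from step 3.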

Point 2. is newer.  I filed it as a bug because I bumped into it as an
issue when trying to add validation for subscription.  It opens some
possibilities for the update:

 - in-flight validation at the beginning of the update process that
   would be skipped during deployment using tags

 - using tags we could also run specific day 2 actions outside of the
   update window:

   openstack overcloud update run --tags 'pre-update-validation' (with
   pre-update-validation in host-prep-tasks step0)

   openstack overcloud update run --tags 'rhsm-subscription'
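
The tag idea can be sketched the same way.  This is again a toy Python
model and not how TripleO or Ansible implement it: the tag names come
from the examples above, but the task names and the select() helper are
hypothetical.

```python
# Hypothetical tasks: the tags come from the examples above,
# the task names are invented for illustration.
HOST_PREP_STEP0 = [
    {"name": "check subscription is valid", "tags": {"pre-update-validation"}},
    {"name": "register node and repos", "tags": {"rhsm-subscription"}},
]
HOST_PREP_STEP1 = [
    {"name": "other host preparation", "tags": set()},
]
ALL_TASKS = HOST_PREP_STEP0 + HOST_PREP_STEP1


def select(tasks, only_tags=None, skip_tags=None):
    """Return the names of the tasks that would run for the given tags."""
    selected = []
    for task in tasks:
        # Drop any task carrying a tag we were asked to skip.
        if skip_tags and task["tags"] & set(skip_tags):
            continue
        # When specific tags are requested, keep only matching tasks.
        if only_tags and not (task["tags"] & set(only_tags)):
            continue
        selected.append(task["name"])
    return selected
```

Here select(ALL_TASKS, only_tags=["pre-update-validation"]) models the
day 2 run of the validation alone, and
select(ALL_TASKS, skip_tags=["pre-update-validation"]) models a
deployment that skips it.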

Well, it looked promising to me.

Now, tell me what you think, but please, be nice, I'm old and
susceptible.

I have more coming, sorted by the amount of thought I put into it,
starting with the ones I thought about more:

 - Check if we need a reboot of the server and notify the user.
 - Gain some more speed and clarity by having a
   running-on-all-host-in-parallel-host-update-prep-tasks new step.  For
   instance all HA image tagging magic could go in there.
 - Investigate converge and check whether we could further optimize it
   for update.

I would like to gain more experience with the process before I file
those new blueprints.

I'm going to draft a spec for the proposed blueprint and then I'll push
some WIP code.

Thanks,

[1] https://blueprints.launchpad.net/tripleo/+spec/tripleo-update-smart-steps
[2] https://bugs.launchpad.net/tripleo/+bug/1886028
[3] https://review.opendev.org/#/c/636731/
-- 
Sofer Athlan-Guyot
chem on #irc
DFG:Upgrades



