(OpenStack-Upgrade)

Dmitriy Rabotyagov noonedeadpunk at gmail.com
Thu Mar 2 19:24:26 UTC 2023


Oh, well, that explains your attitude to upgrades then. But basically
it's all about collecting and sorting out technical debt, which IS
accumulated by avoiding upgrades for as long as possible. In my
experience, a team of 3-4 engineers is capable of maintaining and
regularly upgrading OpenStack. Maybe not once every 6 months, but
once a year for sure. And performing 2 sequential upgrades is not that
big of a deal: it's roughly 20 hours per year per region, and that is
only if you don't have the time or knowledge to deal with the small
hacks needed for skipping 1 release (which is usually not a big deal).

Based on that, my advice would be to avoid accumulating technical
debt: while it might feel cheaper not to invest time in maintenance,
dealing with the debt later is always more expensive. So do not be
afraid of upgrades if they're done in a timely manner. Using
maintained and supported versions of software is always better than
running legacy and EOLed ones.

We were also discussing the upgrade process with OpenStack-Ansible,
which is the deployment tool in use here and which does simplify the
upgrade process. I bet kolla-ansible also does a damn good job with
its upgrades. But I do understand how much of a PITA heterogeneous
deployments can be.

And yeah, I meant 5. Regarding 4 - I kind of agree - each deployment
becomes unique, especially over time. And it's really true that in
production you will see issues you never saw in CI or DEV
environments, but such issues will mostly be related to load or to
dev envs not having exactly the same configuration. I'd say a good
example of that might be the OVS or L3 agents, which take way more
time to start up in production than in a sandbox, where you won't spot
any downtime or issues.

Thu, 2 Mar 2023 at 18:03, Albert Braden <ozzzo at yahoo.com>:
>
> 1. Of course you should upgrade every 6 months. I've never seen or heard of anyone doing that, but if you have the resources, I agree, that would be great. And yes, if you're upgrading a few versions, you may need to do one or more operating system upgrades along the way.
>
> 2. I've never seen an easy, smooth process. That being said, I've never done a single-version upgrade. If you upgrade every 6 months, then maybe it would be smooth and easy. The standard situation I saw during my contracting years is that a company has gotten itself into a bind because it has a small team (or maybe 1 guy) running OpenStack and hasn't upgraded for a long time, so it hires me to clean up the mess.
>
> 4 (I think you meant 5 here?). I've never lost resources during an upgrade, but I would never promise customers that there is a 0 percent chance of loss. I always recommend that the customer be resilient against loss, for example by duplicating their application in multiple clusters and by maintaining backups of important data, and I strengthen that recommendation during upgrades.
> On Thursday, March 2, 2023, 10:56:47 AM EST, Dmitriy Rabotyagov <noonedeadpunk at gmail.com> wrote:
>
>
> These are very weird statements and I cannot agree with most of them.
>
> 1. You should upgrade in time. All the problems come when you try to
> avoid upgrades at any cost - then you're indeed in a situation where
> upgrades are painful, as you're running obsolete stuff that is no
> longer supported and no longer provided by your distro (or the distro
> itself is no longer supported).
> With SLURP releases you will be able to do upgrades yearly starting
> with Antelope. Before that, upgrades should basically be done every 6
> months. Skipping 1 release was not supported before, but it is
> doable given some preparation and small hacks. Skipping more
> than 1 release will almost certainly guarantee you pain. Upgrades to
> the next release are well tested both by individual projects and by
> OpenStack-Ansible, so provided you've looked through the release notes
> and adjusted your configuration, it should be just fine.
>
> 2. It's quite an easy and relatively smooth process as of today. Yes,
> you will have small API interruptions during the upgrade, and when
> services restart they drop connections. But we control HAProxy
> backends to minimize the effect of this. In many cases the upgrade can
> be performed just by running scripts/run_upgrade.sh - it will work
> given it's run against a healthy cluster (meaning that you don't have
> a dead galera or rabbit node in your cluster). At the moment we spend
> around a working day on upgrading a region, but we are planning to
> automate this process soonish to perform upgrades of production
> environments using Zuul. We also never had to roll back, as a rollback
> is indeed a painful process that is hard to carry out. So I won't
> suggest rolling back a production environment unless it's absolutely
> needed.
>
> 3. This is something I will agree with. You can take a look at our MNAIO
> [1], which can help you spawn a virtual sandbox with multiple nodes
> in it, where you can play with upgrades. I'd also suggest running
> tempest or rally tests regularly. They are indeed helpful.
>
> 4. I'm not sure what's meant here at all. I can hardly imagine how you
> can fail an OpenStack upgrade in a way that loses customer
> data. I can recall such failures with Ceph though, but that was
> somewhere around the Hammer release (0.94 or something), and it has
> not been the case for quite a while.
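
As a rough illustration of the "healthy cluster" precondition from
point 2, here is a minimal sketch that inspects Galera status values
before an upgrade is started. The function name `galera_problems` and
the sample values are hypothetical. The wsrep_* names are standard
Galera status variables, but how you collect them (for example with
SHOW GLOBAL STATUS LIKE 'wsrep_%') is deliberately left out.

```python
def galera_problems(status: dict, expected_nodes: int) -> list:
    """List reasons why a Galera node's wsrep status looks unhealthy.

    `status` maps wsrep status variable names to their string values.
    An empty list means no problem was detected by these three checks.
    """
    problems = []
    size = int(status.get("wsrep_cluster_size", 0))
    if size != expected_nodes:
        problems.append(f"cluster size is {size}, expected {expected_nodes}")
    if status.get("wsrep_cluster_status") != "Primary":
        problems.append("node is not in the primary component")
    if status.get("wsrep_local_state_comment") != "Synced":
        problems.append("node is not synced")
    return problems


# Hypothetical sample: a 3-node cluster in which one node is down.
sample = {
    "wsrep_cluster_size": "2",
    "wsrep_cluster_status": "Primary",
    "wsrep_local_state_comment": "Synced",
}
for problem in galera_problems(sample, expected_nodes=3):
    print("do not upgrade yet:", problem)
```

A similar check for rabbit nodes would be needed as well; this sketch
only covers the galera half of the precondition.
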
>
>
> [1] https://opendev.org/openstack/openstack-ansible-ops/src/branch/master/multi-node-aio
>
> Thu, 2 Mar 2023 at 15:15, Albert Braden <ozzzo at yahoo.com>:
> >
> > Having done a few upgrades, I can give you some general advice:
> >
> > 1. If you can avoid upgrading, do it! If you are lucky enough to have customers who are willing (or can be forced) to accept a "refresh" strategy whereby you build a new cluster and move them to it, that is substantially easier and safer.
> >
> > 2. If you must upgrade, go into it with the understanding that it is a difficult and dangerous process, and that avoiding failure will require meticulous preparation. Try to duplicate all of the weird things that your customers are doing, in your lab environment, then upgrade and roll it back repeatedly, documenting the steps in great detail (ideally automating them as much as possible) until you can roll forward and back in your sleep.
> >
> > 3. Develop a comprehensive test procedure (ideally automated) that tests standard, edge and corner cases before and after the upgrade/rollback.
> >
> > 4. Expect different clusters to behave differently during the upgrade, and to present unique problems, even though as far as you know they are set up identically. Expect to see issues in your prod clusters that you didn't see in lab/dev/QA, and budget extra downtime to solve those issues.
> >
> > 5. Recommend to your customers that they back up their data and configurations, so that they can recover if an upgrade fails and their resources are lost. Set the expectation that there is a non-zero probability of failure.
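
One way to read point 3 in practice is to record test outcomes before
the upgrade and compare them afterwards. This is a minimal sketch,
assuming each run (tempest, rally or your own checks) can be reduced
to a mapping of test name to pass/fail. The function name and the test
names are made up for illustration.

```python
def regressions(before: dict, after: dict) -> list:
    """Return tests that passed before the upgrade but not after it.

    A test missing from `after` is treated as failing, so a test that
    silently stopped running is reported too.
    """
    return sorted(
        name
        for name, passed in before.items()
        if passed and not after.get(name, False)
    )


# Hypothetical results from a pre-upgrade and a post-upgrade run.
before = {"boot_server": True, "attach_volume": True, "create_router": False}
after = {"boot_server": True, "attach_volume": False}

for name in regressions(before, after):
    print("regressed after upgrade:", name)
```

Tests that were already failing before the upgrade are ignored on
purpose, so the report only contains what the upgrade itself changed.
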
> > On Wednesday, March 1, 2023, 07:54:30 AM EST, Dmitriy Rabotyagov <noonedeadpunk at gmail.com> wrote:
> >
> >
> > Hey,
> >
> > Regarding rollback of an upgrade in OSA, we indeed don't have any good
> > established/documented process for that. At the same time it should be
> > completely possible, with some caveats. It also depends on what exactly
> > you want to roll back - roles, OpenStack services or both - as OSA roles
> > can actually install any OpenStack service version.
> >
> > We keep all virtualenvs from the previous version, so during an upgrade
> > we just build new virtualenvs and reconfigure systemd units to point
> > there. So the fastest way would likely be to just edit the systemd unit
> > files to point at the old venv version, reload the systemd daemon and
> > the service, and of course restore the DB from backup.
> > You can also set <service>_venv_tag (e.g. `glance_venv_tag`) to the
> > old OSA version you were running and execute openstack-ansible
> > os-<service>-install.yml --tags systemd-service,uwsgi - in most
> > cases that will be enough to just edit the systemd units for the
> > service and start the old version of it. BUT running without tags will
> > result in having new packages in the old venv, which is something you
> > totally want to avoid.
> > To prevent that you can also define <service>_git_install_branch and
> > requirements_git_install_branch in /etc/openstack_deploy/group_vars
> > (it's important to use group vars if you want to roll back only one
> > service) and take the values from
> > https://opendev.org/openstack/openstack-ansible/src/tag/26.0.1/playbooks/defaults/repo_packages/openstack_services.yml
> > (of course, pick your old version!)
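
The quickest option above (pointing the systemd units back at the
previous venv) can be sketched roughly as follows. This is only an
illustration: the /openstack/venvs/<service>-<tag> layout, the sample
unit text and the 25.2.0 tag are assumptions, so check the real paths
in your own unit files first.

```python
import re

# Assumed layout: one virtualenv per service and release under
# /openstack/venvs/<service>-<tag>. Verify this on your hosts.
VENV_ROOT = "/openstack/venvs"


def point_unit_at_venv(unit_text: str, service: str, old_tag: str) -> str:
    """Return unit_text with every <service> venv path switched to old_tag.

    Paths that belong to other services are left untouched.
    """
    pattern = re.compile(re.escape(f"{VENV_ROOT}/{service}-") + r"[^/\s]+")
    replacement = f"{VENV_ROOT}/{service}-{old_tag}"
    # A function is used as the replacement so that the tag is inserted
    # literally, without regex escape processing.
    return pattern.sub(lambda _match: replacement, unit_text)


# Hypothetical unit file content, for illustration only.
unit = (
    "[Service]\n"
    "ExecStart=/openstack/venvs/glance-26.0.1/bin/uwsgi "
    "--ini /etc/uwsgi/glance-api.ini\n"
)
print(point_unit_at_venv(unit, "glance", "25.2.0"))
```

After writing the changed unit back, systemd still has to be reloaded
and the service restarted, and the DB restored from backup, as
described above.
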
> >
> > For a full rollback rather than in-place workarounds, I think it should go like this:
> > * check out the previous OSA version
> > * re-execute scripts/bootstrap-ansible.sh
> > * you should still keep the current versions of mariadb and rabbitmq and
> > define them in user_variables (galera_major_version,
> > galera_minor_version, rabbitmq_package_version,
> > rabbitmq_erlang_version_spec) - downgrading these almost never
> > ends well.
> > * Restore the DB backup
> > * Re-run setup-openstack.yml
> >
> > It's quite a rough summary of how I see this process, but to be
> > frank I never had to execute a full downgrade - at most I had to
> > downgrade 1 service after an upgrade.
> >
> > Hope that helps!
> >
> > Wed, 1 Mar 2023 at 12:06, Adivya Singh <adivya1.singh at gmail.com>:
> >
> > >
> > > Hi Alvaro,
> > >
> > > I installed using OpenStack-Ansible. The upgrade procedure is consistent,
> > >
> > > but what I am looking for is the rollback procedure.
> > >
> > > Regards
> > > Adivya Singh
> > >
> > > On Wed, Mar 1, 2023 at 12:46 PM Alvaro Soto <alsotoes at gmail.com> wrote:
> > >>
> > >> That will depend on how you installed your environment: OSA, TripleO, etc.
> > >>
> > >> Can you provide more information?
> > >>
> > >> ---
> > >> Alvaro Soto.
> > >>
> > >> Note: My work hours may not be your work hours. Please do not feel the need to respond during a time that is not convenient for you.
> > >> ----------------------------------------------------------
> > >> Great people talk about ideas,
> > >> ordinary people talk about things,
> > >> small people talk... about other people.
> > >>
> > >> On Tue, Feb 28, 2023, 11:46 PM Adivya Singh <adivya1.singh at gmail.com> wrote:
> > >>>
> > >>> Hi Team,
> > >>>
> > >>> I am planning to upgrade my current environment. The upgrade procedure is available on the OpenStack site and forums.
> > >>>
> > >>> But I am looking for a rollback plan, other than having a local backup copy of the Galera database.
> > >>>
> > >>> Regards
> > >>> Adivya Singh
> >
>



More information about the openstack-discuss mailing list