[openstack-dev] [tripleo] Progress on overcloud upgrade / update jobs

Marios Andreou marios at redhat.com
Mon Aug 8 07:42:41 UTC 2016


On 06/08/16 01:45, Emilien Macchi wrote:
> On Fri, Aug 5, 2016 at 4:19 PM, Wesley Hayutin <whayutin at redhat.com> wrote:
>>
>>
>> On Fri, Aug 5, 2016 at 4:08 PM, Emilien Macchi <emilien at redhat.com> wrote:
>>>
>>> On Fri, Aug 5, 2016 at 1:58 PM, Steven Hardy <shardy at redhat.com> wrote:
>>>> On Thu, Aug 04, 2016 at 09:46:20PM -0400, Emilien Macchi wrote:
>>>>> Hi,
>>>>>
>>>>> I'm currently working iteratively on a new upstream job that tests
>>>>> upgrades and updates.
>>>>> So far, I've been taking baby steps. I bootstrapped the work to upgrade
>>>>> undercloud, see https://review.openstack.org/#/c/346995/ for details
>>>>> (it's almost working, but hitting a packaging issue now).
>>>>>
>>>>> Now I am interested in having 2 overcloud jobs:
>>>>>
>>>>> - update: Newton -> Newton: basically, we already have it with
>>>>> gate-tripleo-ci-centos-7-ovb-upgrades - but my proposal is to use
>>>>> multinode work that James started.
>>>>> I have a PoC (2 lines of code):
>>>>> https://review.openstack.org/#/c/351330/1 that works: it deploys an
>>>>> overcloud using packaging, applies the patch in THT and runs overcloud
>>>>> update. I tested it and it works fine (I tried to break Keystone).
>>>>> Right now the job name is
>>>>> gate-tripleo-ci-centos-7-nonha-multinode-upgrades-nv because I followed
>>>>> the example of the existing ovb job that does the exact same thing.
>>>>> I propose to rename it to
>>>>> gate-tripleo-ci-centos-7-nonha-multinode-updates-nv. What do you
>>>>> think?
>>>>
>>>> This sounds good, and it seems to be a valid replacement for the old
>>>> "upgrades" job - it won't catch all kinds of update bugs (in particular
>>>> it obviously won't run any package-based updates at all), but it will
>>>> catch the most serious template regressions, which will be useful
>>>> coverage to maintain I think.
>>>>
>>>>> - upgrade: Mitaka -> Newton: I haven't started anything yet but the
>>>>> idea is to test the upgrade from stable to master, using multinode job
>>>>> now (not ovb).
>>>>> I can prototype something but I would like to hear from our community
>>>>> first.
>>>>
>>>> I think getting this coverage in place is very important; we're
>>>> experiencing a lot of post-release pain due to the lack of this
>>>> coverage, so +1 on any steps we can take to get some coverage here.
>>>> I'd say go ahead and do the prototype if you have time to do it.
>>>
>>> ok, /me working on it.
>>>
>>>> You may want to chat with weshay, as I know there are some RDO upgrade
>>>> tests which were planned to be run as third-party jobs to get some
>>>> upgrade coverage - I'm not sure if there is any scope for reuse here,
>>>> or if it will be easier to just wire in the upgrade via our current
>>>> scripts (obviously some form of reuse would be good if possible).
>>>
>>> ack
>>>
>>>>> Please give some feedback if you are interested in this work and I
>>>>> will spend some time during the next weeks on $topic.
>>>>>
>>>>> Note: please also look at my thread about undercloud upgrade job, I need
>>>>> your feedback too.
>>>>
>>>> My only question about undercloud upgrades is whether we might combine
>>>> the overcloud upgrade job with this, e.g. upgrade undercloud, then
>>>> upgrade overcloud. Probably the blocker here will be the gate timeout
>>>> I guess, even if we're using pre-cached images etc.
>>>
>>> Yes, my final goal was to have a job like:
>>> 1) deploy Mitaka undercloud
>>> 2) deploy Mitaka overcloud
>>> 3) run pingtest
>>> 4) upgrade undercloud to Newton
>>> 5) upgrade overcloud to Newton
>>> 6) re-run pingtest
>>
>>
>> FYI.. Mathieu wrote up https://review.openstack.org/#/c/323750/
>>
>> Emilien feel free to take it over, just sync up w/ Mathieu when he returns
>> from PTO on Monday.
>> Thanks
>>
> 
> Ok so I didn't modify his code, though I took over to add more bits.
> 
> Also, I prepared everything to start tests in upstream CI:
> 
> 1) Rename upgrades to updates jobs:
> Rename it in openstack-infra/project-config https://review.openstack.org/351914
> Rename it in tripleo-ci: https://review.openstack.org/#/c/351937
> Once it's done, we'll have 2 experimental jobs for upgrading
> overcloud: updates and upgrades, as we agreed in this thread.
> 
> 2) Undercloud upgrade job was rebased: https://review.openstack.org/#/c/346995/
> It contains some workarounds. Now that the Undercloud Upgrade blueprint
> has been merged, people involved in upgrades should help me in 346995
> (by review) to discuss where we put the code that we need to
> upgrade.
> 
> 3) Overcloud update job was renamed and rebased:
> https://review.openstack.org/#/c/351330/
> It is passing CI, please review it and once it's merged, I'll propose
> to move it to check queue eventually, since we don't run the OVB
> updates job for all TripleO patches at this time, but only on periodic
> and experimental pipelines.
> Having gate-tripleo-ci-centos-7-nonha-multinode-updates-nv in the
> check queue will help us have an update job in place again for
> free. This job will be useful until we get an upgrade job working.
> 
> 4) Overcloud upgrade job: https://review.openstack.org/#/c/323750/
> This is very much a work in progress, but Mathieu Bultel and I will work
> together to get it working. We'll use the experimental pipeline for that.
> On a side note, I took the initiative to rebase it, add some bits,
> create a mini TODO in the commit message and also rebase
> https://review.openstack.org/#/c/321027/ useful for upgrades.
> 
> Any question or feedback is welcome!
> 

Thanks very much for looking into and organising this, Emilien. Just to
reiterate what others have said, getting this into place will help us
avoid some pain from changes that break upgrades.

The timeout really is a concern. Perhaps we can start with a single-node
(non-HA) controller, but really we want to be testing the HA scenario. I
haven't looked at Mitaka to Newton for a while, but the Liberty to Mitaka
*controllers* upgrade on a 3-controller, 1-compute virt setup was taking
almost an hour very recently (32GB, octa-core virt host). To emphasize:
that is *just* for the controllers upgrade step.

I'll help out as I can at least with reviews/anything else that comes up,

thanks, marios


