[openstack-dev] [tripleo] Adding new roles after upgrade is broken.

Jiří Stránský jistr at redhat.com
Fri Aug 18 16:13:55 UTC 2017


On 18.8.2017 17:25, Marios Andreou wrote:
> On Fri, Aug 18, 2017 at 4:22 PM, Jiří Stránský <jistr at redhat.com> wrote:
> 
>> On 18.8.2017 13:18, Sofer Athlan-Guyot wrote:
>>
>>> Hi,
>>>
>>> We may have missing packages when a user adds a new role to their
>>> roles_data file and the base image comes from a previous version.
>>>
>>> The workflow would be this one:
>>>    - install newton
>>>    - upgrade to ocata
>>>    - add collectd to roles_data and redeploy the stack
>>>
>>> For instance if one is adding
>>> OS::TripleO::Services::Collectdservices::collectd in an ocata env coming
>>> from an upgraded newton env, he/she won't have the necessary packages
>>> (for instance collectd-disk).  The puppet manifest will fail as the
>>> package is missing and puppet doesn't install packages.  The upgrade
>>> task[1] is useless as the new role wasn't added during the upgrade but
>>> after.
>>>
>>
>> Right, but the package could be added during the upgrade. The
>> upgrade_tasks could/should make the set of installed overcloud RPMs on par
>> with the overcloud-full image of the respective release, ideally. So you'd
>> have collectd RPMs installed always, both on freshly deployed and upgraded
>> envs, regardless if you actually use collectd or not. We already did some
>> package installs/uninstalls as part of upgrades and updates, but probably
>> didn't have 100% coverage.
>>
>>
> yeah +1 to this except where would those upgrade_tasks go? Taking the given
> example, upgrade_tasks in collectd-disk.yaml won't be executed because
> during the upgrade, the operator didn't have that enabled as a service (it
> is default off, and new so they couldn't deploy it before).
> 
> So we may have to use some 'central place' (this is what Sofer was
> advocating earlier on irc) like in the tripleo-packages.yaml and have tasks
> there.

Yes exactly, tripleo-packages.yaml is the right place IMO.

> The problem then becomes however that we don't _know_ which services
> we need to download packages for? Oh.. no you're saying we can use the
> current release package list (e.g. do you mean from something in
> https://github.com/openstack/tripleo-puppet-elements pkg-maps? ).

Yeah, I meant the diff between e.g. Newton and Ocata t-p-e:

https://github.com/openstack/tripleo-puppet-elements/compare/stable/newton...stable/ocata
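
A rough sketch of how that comparison could be turned into a package list. It assumes each element's pkg-map is JSON in the diskimage-builder layout (sections such as "default" and "family", mapping logical names to real package names, possibly space-separated), and that the file contents from both branches have already been read in. The function names are made up for illustration, they are not part of any TripleO tooling:

```python
import json


def packages_from_pkg_map(text, family="redhat"):
    """Return the package names one pkg-map resolves to for a distro family.

    Assumption: values are strings, and an empty string means
    "install nothing" for that logical name.
    """
    data = json.loads(text)
    sections = (
        data.get("default", {}),
        data.get("family", {}).get(family, {}),
    )
    names = set()
    for section in sections:
        for value in section.values():
            names.update(value.split())
    return names


def added_packages(old_maps, new_maps, family="redhat"):
    """Packages referenced by the new release's pkg-maps but not the old one's."""
    old = set()
    for text in old_maps:
        old |= packages_from_pkg_map(text, family)
    new = set()
    for text in new_maps:
        new |= packages_from_pkg_map(text, family)
    return sorted(new - old)
```

This only catches packages that go through pkg-map files; anything installed by other means in the elements would still need a manual look.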

Or we could download overcloud-full.qcow2 for both Newton and Ocata, and 
inspect them using libguestfs-tools for the list of installed RPMs, 
which could give us the exact diff. That might be the easiest way to
do this, perhaps?
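
Once the two package lists exist, the diff itself is simple. A minimal sketch, assuming each listing is plain text with one package name per line (for example saved from `rpm -qa --qf '%{NAME}\n'`; how the lists get extracted from the images is left out here, and the function names are hypothetical):

```python
def read_names(listing):
    """Parse a one-name-per-line listing into a set, skipping blank lines."""
    return {line.strip() for line in listing.splitlines() if line.strip()}


def image_diff(old_listing, new_listing):
    """Compare the package names of two images.

    Returns what an upgraded node would need to install, and what it
    could remove, to match the newer image.
    """
    old = read_names(old_listing)
    new = read_names(new_listing)
    return {
        "to_install": sorted(new - old),
        "to_remove": sorted(old - new),
    }
```

Comparing names only (without versions) keeps the result focused on packages that are missing outright, which is the case that breaks the puppet run.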


Have a good day folks,

Jirka

> 
> 
>>
>>> I don't see any easy way to solve this.  Basically we need a way to keep
>>> the base image in sync between releases without using the upgrade_tasks,
>>> maybe in the tripleo-packages one?
>>>
>>
>> Given that released code is affected, we may treat it as a bug that
>> requires a minor update, and in addition to upgrade_tasks, we can add all
>> the necessary package installs into minor update code (yum_update.sh) too.
>> Again this shouldn't depend on what services are actually enabled, just
>> unconditionally sync with latest content of overcloud-full image of the
>> respective release.
>>
>> I guess the time consuming part will be preparing the envs that will allow
>> comparing a fresh deploy vs. an upgraded one to get the `rpm -qa | sort`
>> difference. Or we could try a shortcut and see what changes went into
>> tripleo-puppet-elements in each release.
>>
>>
> yeah based on what you said, am thinking
> https://github.com/openstack/tripleo-puppet-elements/blob/master/elements/overcloud-controller/pkg-map
> for example, or some parsing of the combined element pkg maps... :/ still
> likely need some tooling to do that though
> 
> thanks, marios
> 
> 
>>
>>> This shouldn't be a problem with container, but everything before pike
>>> is affected.
>>>
>>
>> Indeed. There will still be some basic baremetal host content management
>> as long as we're not using Atomic, but the room for potential problems will
>> be much smaller.
>>
>> Jirka
>>
>>
>>> Originally seen here [2]
>>>
>>> [1] https://github.com/openstack/tripleo-heat-templates/blob/stable/ocata/puppet/services/metrics/collectd.yaml#L130..L134
>>> [2] https://bugzilla.redhat.com/show_bug.cgi?id=1455065
>>>
>>>
>>
>> __________________________________________________________________________
>> OpenStack Development Mailing List (not for usage questions)
>> Unsubscribe: OpenStack-dev-request at lists.openstack.org?subject:unsubscribe
>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>>
> 
> 
> 