[openstack-dev] [Oslo] Improving oslo-incubator update.py

Doug Hellmann doug.hellmann at dreamhost.com
Mon Jan 13 17:07:53 UTC 2014

[resurrecting an old thread]

On Wed, Nov 27, 2013 at 6:26 AM, Flavio Percoco <flavio at redhat.com> wrote:

> On 27/11/13 10:59 +0000, Mark McLoughlin wrote:
>> On Wed, 2013-11-27 at 11:50 +0100, Flavio Percoco wrote:
>>> On 26/11/13 22:54 +0000, Mark McLoughlin wrote:
>>> >On Fri, 2013-11-22 at 12:39 -0500, Doug Hellmann wrote:
>>> >> On Fri, Nov 22, 2013 at 4:11 AM, Flavio Percoco <flavio at redhat.com> wrote:
>>> >> >    1) Store the commit sha from which the module was copied.
>>> >> >    Every project using oslo currently keeps the list of modules it
>>> >> >    is using in `openstack-common.conf` in a `module` parameter. We
>>> >> >    could store, along with the module name, the sha of the commit it
>>> >> >    was last synced from:
>>> >> >
>>> >> >        module=log,commit
>>> >> >
>>> >> >        or
>>> >> >
>>> >> >        module=log
>>> >> >        log=commit
>>> >> >
>>> >>
>>> >> The second form will be easier to manage. Humans edit the module field
>>> >> and the script will edit the others.
>>> >
>>> >How about adding it as a comment at the end of the python files
>>> >themselves and leaving openstack-common.conf for human editing?
>>> I think having the commit sha will give us a starting point from which
>>> we could start updating that module.
>> Sure, my only point was about where the commit sha comes from - i.e.
>> whether it's from a comment at the end of the python module itself or in
>> openstack-common.conf
> And, indeed you said 'at the end of the python files'. Don't ask me
> how the heck I misread that.
> The benefit I see from having them in the openstack-common.conf is
> that we can register a `StrOpt` for each object dynamically and get
> the sha using oslo.config. If we put it as a comment at the end of the
> python file, we'll have to read it and 'parse' it, I guess.
>>> It will mostly help with
>>> getting a diff for that module and the short commit messages where it
>>> was modified.
>>> Here's a pseudo-buggy-algorithm for the update process:
>>>     (1) Get current sha for $module
>>>     (2) Get list of new commits for $module
>>>     (3) for each commit of $module:
>>>         (3.1) for each modified_module in $commit
>>>             (3.1.1) Update those modules up to $commit (1)(modified_module)
>>>         (3.2) Copy the new file
>>>         (3.3) Update openstack-common with the latest sha
>>> This trusts the granularity and isolation of the patches proposed in
>>> oslo-incubator. However, in cases like 'remove vim mode lines' it'll
>>> fail assuming that updating every module is necessary - which is true
>>> from a git stand point.
>> This is another variant of the kind of inter-module dependency smarts
>> that update.py already has ... I'd be inclined to just omit those smarts
>> and just require the caller to explicitly list the modules they want to
>> include.
>> Maybe update.py could include some reporting to help with that choice
>> like "module foo depends on modules bar and blaa, maybe you want to
>> include them too" and "commit XXX modified module foo, but also module
>> bar and blaa, maybe you want to include them too".
> But, if we get to the point of suggesting that the user update module
> foo because it was modified in commit XXX, we'd have everything needed
> to make it recursive and update those modules as well.
> I agree with you on making it explicit, though. What about making it
> interactive then? update.py could ask users if they want to update
> module foo because it was modified in commit XXX and do it right away,
> which is not very different from updating module foo, printing a report
> and letting the user choose afterwards.
> (/me feels like Gollum now)
> I prefer the interactive way though, at least it doesn't require the
> user to run update several times for each module. We could also add a
> `--no-stop` flag that does exactly what you suggested.

I spent some time trying to think through how we could improve the update
script for [1], and I'm stumped on how to figure out *accurately* what
state the project repositories are in today.

We can't just compute the hash of the modules in the project receiving
copies, and then look for them in the oslo-incubator repo, because we
modify the files as we copy them out (to update the import statements and
replace "oslo" with the receiving project name in some places like config
option defaults).

We could undo those changes before computing the hash, but the problem is
further complicated because modules are not all synced together. The
common code in a project doesn't move forward in step with
the oslo-incubator repository as a whole. For example, sometimes only the
openstack/common/log.py module is copied and not all of openstack/common.
So log.py might be newer than a lot of the rest of the oslo code. The
problem is even worse for something like rpc, where it's possible that
modules within the rpc package might not all be updated together.
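
To make the "undo those changes before computing the hash" idea concrete,
here is a rough sketch. It assumes the only rewrite applied during the copy
is turning "openstack.common" into "<project>.openstack.common"; the other
substitutions (such as config option defaults) are not handled, and the
function name is made up for illustration. It is not what update.py does
today.

```python
import hashlib
import re


def normalized_digest(source, project):
    """Hash module text after reverting the project-specific import rewrite."""
    # Assumption: the copy step only changed "openstack.common" into
    # "<project>.openstack.common"; any other substitution is ignored here.
    pattern = re.compile(r'\b%s\.openstack\.common\b' % re.escape(project))
    reverted = pattern.sub('openstack.common', source)
    return hashlib.sha1(reverted.encode('utf-8')).hexdigest()


# A copied file and its incubator original hash the same once normalized.
incubator = "from openstack.common import log\n"
copied = "from nova.openstack.common import log\n"
assert normalized_digest(copied, 'nova') == normalized_digest(incubator, 'nova')
```

Even with this, a matching digest only identifies one file at one point in
history, which is why the per-module drift described above is still a
problem.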

We could probably spend a lot of effort building a tool to tell us exactly
what state each common file is in for each project, to figure out
what needs to be synced. I would much rather spend that effort on turning
the common code into libraries, though.

So, here's an alternative:

1. Projects accept a full sync of Oslo soon, including adding a value in
their openstack-common.conf indicating which commit in oslo-incubator is
reflected in the sync. We'll try to make those commit messages as detailed
as possible.

2. We modify update.py to remove the option to update individual modules
when copying from oslo-incubator. The new version would always apply all
changes since the last synced commit, as a series of commits, to the
receiving project. So if nova is out of step by 3 commits, then 3 new
commits would be created in the branch by the person doing the update, each
with the commit log message from the change in oslo-incubator. (This
lock-step approach is necessary to have any hope of figuring out which
commits are actually being synced, so the log messages are accurate.)

3. The person proposing the merge into the project can decide whether to
squash the commits, or leave them as separate reviews.
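
As a rough illustration of steps 1 and 2, the sketch below reads a recorded
commit from openstack-common.conf and lists the oslo-incubator commits made
since then, oldest first, so each could be replayed as its own commit. The
key name "last_sync_commit" and both function names are hypothetical; nothing
like them exists in update.py yet.

```python
import subprocess


def read_conf(text):
    """Parse openstack-common.conf text; 'module' may repeat, so keep lists."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines, comments, and section headers like [DEFAULT].
        if not line or line.startswith(('#', '[')):
            continue
        key, _, value = line.partition('=')
        values.setdefault(key.strip(), []).append(value.strip())
    return values


def pending_commits(incubator_dir, last_sync):
    """Return incubator commit ids made after last_sync, oldest first."""
    out = subprocess.check_output(
        ['git', 'log', '--reverse', '--format=%H', '%s..HEAD' % last_sync],
        cwd=incubator_dir)
    return out.decode('utf-8').split()


# "last_sync_commit" is an assumed key name, used only for this example.
conf = read_conf("[DEFAULT]\nmodule=log\nbase=nova\nlast_sync_commit=abc123\n")
assert conf['last_sync_commit'] == ['abc123']
```

If nova were 3 commits behind, pending_commits() would return 3 ids, and the
update would create one commit per id, reusing the oslo-incubator log message
for each.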

I'm not entirely certain I like this approach myself, but it's the best
I've been able to come up with. It essentially gives us the current
process, while removing the ability to potentially take a version of a
module without taking its dependencies (allowing us to step forward, and
track the commit messages accurately). It will also produce results similar
to what we will have when all of this oslo code moves into separate
libraries, where the changes to the library will be seen by the projects
without any action at all on their part.

OTOH, it will also require spending time on update.py, instead of releasing
a library from the incubator. And it doesn't really buy us that much in
terms of making the sync happen more easily, other than a reliable way of
having entirely accurate commit messages.

I would love to have someone else offer an alternative that's less effort
to change and provides the desired detailed log messages accurately.


[1] https://blueprints.launchpad.net/oslo/+spec/improve-update-script