Open Stack

Wed Aug 6 22:12:44 UTC 2014

I'll start using pictures now, so let's assume M is the latest commit on
the master.

On Wed, Aug 6, 2014 at 9:31 PM, Zane Bitter <zbitter at redhat.com> wrote:

> On 04/08/14 19:18, Yuriy Taraday wrote:
>
>> Hello, git-review users!
>>
>> I'd like to gather feedback on a feature I want to implement that might
>> turn out useful for you.
>>
>> I like using Git for development. It allows me to keep track of current
>> development process, it remembers everything I ever did with the code
>> (and more).
>>
>
> _CVS_ allowed you to remember everything you ever did; Git is _much_ more
> useful.
>
>
>  I also really like using Gerrit for code review. It provides clean
>> interfaces, forces clean histories (who needs to know that I changed one
>> line of code in 3am on Monday?) and allows productive collaboration.
>>
>
> +1
>
>
>  What I really hate is having to throw away my (local, precious for me)
>> history for all change requests because I need to upload a change to
>> Gerrit.
>>
>
> IMO Ben is 100% correct and, to be blunt, the problem here is your
> workflow.
>

Well... That's the workflow that was born with Git. Keeping track of all
changes, do extremely cheap merges, and all that.

Don't get me wrong, I sympathise - really, I do. Nobody likes to change
> their workflow. I *hate* it when I have to change mine. However what you're
> proposing is to modify the tools to make it easy for other people - perhaps
> new developers - to use a bad workflow instead of to learn a good one from
> the beginning, and that would be a colossal mistake. All of the things you
> want to be made easy are currently hard because doing them makes the world
> a worse place.
>

And when OpenStack switched to Gerrit I was really glad. Instead of ugly

master: ...-M-.....................-o-o-...
             \                       /
              a1-b1-a2-a3-b2-c1-b3-c2

where a[1-3], b[1-3] and c[1-2] are iterations over the same piece of the
feature, we can have pretty

master: ...-M-.....o-.-o-...
             \    /   /
              A^-B^-C^

where A^, B^ and C^ are the perfect self-contained, independently
reviewable and mergeable pieces of the feature.

And this looked splendid and my workflow seemed clear. Suppose I have smth
like:

master: ...-M
             \
              A3-B2-C1

and I need to update B to B3 and C to C2. So I go:
$ git rebase -i M  # and add edit action to B commit
$ vim # do some changes, test them, etc
$ git rebase --continue
now I have

master: ...-M
             \
              A3-B2-C1
                \
                 B3-C1'

Then I fix C commit, amend it and get:

master: ...-M
             \
              A3-B2-C1
                \
                 B3-C1'
                   \
                    С2

Now everything's cool, isn't it? But world isn't fair. And C2 fails a test
that I didn't expect to fail. Or the test suite failed to fail earlier. I'd
like to see if I broke it just now or were it broken after rebase. How do I
do it? With your workflow - I don't. I play smart and guess where the
problem was or dig into reflog to find C1' (or C1), etc. Let's see what
else I can't find. After full iteration over this feature (as in the first
picture) I end up with total history like this:

master: ...-M
            |\
            | A1-B1
            |\
            | A2-B1'
             \
              A3-B1''
               |\
               | B2-C1
                \
                 B3-C1'
                   \
                    С2

With only A3, B3 and C2 available, the rest are practically unreachable.
Now you find out that something that you were sure was working in B1 is
broken (you'll tell me "hey, you're supposed to have tests with
everything!" - I'll answer: "what if you've found a problem in the test
suite that gave false success?"). You can do absolutely nothing to localize
the issue now. Just go and dig into your B code (which might've been
written months ago).
Or you slap your head understanding that the function you thought is not
needed in B2 is actually needed. Well, you can hope you did upload B2 to
Gerrit and you'll find it there. Or you didn't because you decided to make
that change the minute after you committed C1, created B3 and B2 never
existed now...

Now imagine you could somehow link together all As, Bs and Cs. Let's draw
vertical edges between them. And let's transpose the picture, shall we?

master: ...-M
             \
              A1-A2--A3--------
                \  \   \    \  \
                 B1-B1'-B1''-B2-B3----
                               \  \   \
                                C1-C1'-C2

Note that all commits here are absolutely the same as in previous picture.
They just have additional parents (and consequently differen IDs). No
changes to any code in them happen. No harm done.

So now it looks way better. I can just do:
$ git checkout B3
$ git diff HEAD~
and find my lost function!

Now let's be honest and admit that As, Bs and Cs are essentially branches -
labels your commits have that shift with relevant modifications to the new
ones. And let's drop direct links that lead to commits that are already
indirect parents.

master: ...-M
             \
          A:  A1-A2--A3
                \  \   \
             B:  B1-B1'-B1''-B2-B3
                               \  \
                            C:  C1-C1'-C2

And we can find out that this history represents _exactly_ what you've been
doing with those commits!

Oh, wait, there's still an issue with flaky test...
$ git checkout B
$ git bisect start HEAD A1 A2 A3  # As are definitely not to blame
$ git bisect run new_test
And wow - we just (after a cup of tea may be) got the very commit (or
merge) that broke that new test! Git is magical as long as you don't cut
down its data!

Note that all As, Bs and Cs still have the very same contents here as in
original version of history. What's missing is a way to easily push them
with proper parents to Gerrit. That's what I'm proposing.

I was going to reply this email bit by bit but couldn't help but provide an
illustration.

The big advantage of Git is that you no longer have to choose between
> having no version control while you work or having a history full of broken
> commits, like you did back in the bad old days of CVS. The fact that local
> history is editable is incredibly powerful, and you should start making use
> of it.
>

Oh yes, it's very powerful. And very dangerous. It lets you forget things.

A history of small, incremental, *working* changes is the artifact we want
> to produce.

Exactly. But we can keep history of this history carving process while it's
not over yet.

For me, trying to reconstruct that from a set of broken changes, or changes
> that only worked with now-obsolete versions of master, is highly
> counterproductive.

Absolutely agree. And I'd be wrong and a fool if I'd been proposing smth
like this.

I work with patches in the form that I intend to submit them (which changes
> as I work) because that means I don't have to maintain that form in my
> head, instead Git can do it for me.

The same here. I don't like to remember things by myself. I prefer to use
Git to store all history of all changes I've made. And it requires this
enhancement to extract final results to be posted on Gerrit.

Of course while I'm working I might need to retain a small amount of
> history - the most basic level of that just the working copy, but it often
> extends to some temporary patches that will later be squashed with others
> or dropped altogether. Don't forget about "git add -p" either - that makes
> it easy to split changes in a single file into separate commits, which is
> *much* more effective than using a purely time-based history.
>

Yes, and "git add -p" will be required to properly split commits at all
times.

> When master changes I rebase my work, because I need to test all of the
> patches that I will propose in a series against the current master into
> which they will be submitted. Retaining a change that did once work in a
> world that no longer exists is rarely interesting to me, but on those
> occasions where it is I would just create a local branch to hold on to it.
>

And with this improvement you won't have to create additional branches -
only the ones you actually need.

> You are struggling because you think of "history" as a linear set of
> temporal changes.

Nope. I think of history as of wibbly-wobbly directed-acyclic ... graph.

If you accept that history can instead contain an arbitrarily-ordered set
> of logical changes then your life will be much easier, since that
> corresponds exactly to what you need to deliver for review. As soon as you
> stop fighting against Gerrit and learn how to work with it, all of your
> problems evaporate. Tools that make it easier to fight it will ultimately
> only add complexity without solving the underlying problems.
>

Oh, I like Gerrit. I don't fight it. I'm event ready to enhance my tooling
just to let it be as good as it is.

> (BTW a tool I use to help with this is Stacked Git. It makes things like
> editing an early patch in a series much easier than rebase -i... I just do
> `stg refresh -p <patch_name>` and the right patch gets the changes. For
> dead-end ideas I just do `stg pop` and the patch stays around for reference
> but isn't part of the history. I usually don't recommend this tool to the
> community, however, because StGit doesn't run the commit hook that is
> needed to insert the ChangeId for Gerrit. I'd be happy to send you my
> patches that fix it if you want to try it out though.)

Does it record history of such changes?

>
>
>  That's why I want to propose making git-review to support the workflow
>> that will make me happy. Imagine you could do smth like this:
>>
>> 0. create new local branch;
>>
>> master: M--....
>>           \
>> feature:  *
>>
>> 1. start hacking, doing small local meaningful (to you) commits;
>>
>> master: M--....
>>           \
>> feature:  A-B-...-C
>>
>> 2. since hacking takes tremendous amount of time (you're doing a Cool
>> Feature (tm), nothing less) you need to update some code from master, so
>> you're just merging master in to your branch (i.e. using Git as you'd
>> use it normally);
>>
>
> This is not how I'd use Git normally.

Well, as per Git author, that's how you should do with not-CVS. You have
cheap merges - use them instead of erasing parts of history.

>
>  master: M--....-N-O-...
>>           \    \    \
>> feature:  A-B-...-C-D-...
>>
>> 3. and now you get the first version that deserves to be seen by
>> community, so you run 'git review', it asks you for desired commit
>> message, and <poof, magic-magic> all changes from your branch is
>> uploaded to Gerrit as _one_ change request;
>>
>
> You just said that this was a "Cool Feature (tm)" taking a "tremendous
> amount of time". Yet your solution is to squash everything into a single
> patch.
>

My solution is to provide developer a tool to not get lost in it. And allow
one to break out pieces of code as needed.

> The cases where feature development is arduous enough to supposedly
> require this workflow, yet the feature not so big that it needs multiple
> patches must be exceedingly rare.
>

But even small features tend to get lost in review queues. And even for
small pieces of code one would want to remember how one got around them.

Meanwhile, under your proposal the entire subset of developers who have
> enabled this option will always have all of their commits squashed into
> one. I don't see how you can credibly claim to be in favour of this and
> also of not "squashing together commits that should belong to separate
> change requests", because that is an inevitable result.

No, I'm not in favor of squashing everything around. I'm in favor of
keeping history as long as it's needed.

>
>  master: M--....-N-O-...
>>           \    \    \----E* <= uploaded
>> feature:  A-B-...-C-D-...-E
>>
>> 4. you repeat steps 1 and 2 as much as you like;
>> 5. and all consecutive calls to 'git review' will show you last commit
>> message you used for upload and use it to upload new state of your local
>> branch to Gerrit, as one change request.
>>
>> Note that during this process git-review will never run rebase or merge
>> operations. All such operations are done by user in local branch instead.
>>
>> Now, to the dirty implementations details.
>>
>
> I'm certain that nothing I have said above will have convinced you, but
> luckily there is nothing stopping you from just implementing a tiny script
> that does this for you. So just write it and use it locally instead of git
> review and you can keep your workflow. But please don't encourage others to
> adopt it.

As you can see from this thread, others are already using roughly the same
workflow and happy about it. I don't think you'll find someone complaining
that they do.

>
>  - Since suggested feature changes default behavior of git-review, it'll
>> have to be explicitly turned on in config (review.shadow_branches?
>> review.local_branches?). It should also be implicitly disabled on master
>> branch (or whatever is in .gitreview config).
>> - Last uploaded commit for branch <branch-name> will be kept in
>> refs/review-branches/<branch-name>.
>> - For every call of 'git review' it will find latest commit in
>> gerrit/master (or remote and branch from .gitreview), create a new one
>> that will have that commit as its parent and a tree of current commit
>> from local branch as its tree.
>> - While creating new commit, it'll open an editor to fix commit message
>> for that new commit taking it's initial contents from
>> refs/review-branches/<branch-name> if it exists.
>> - Creating this new commit might involve generating a temporary bare
>> repo (maybe even with shared objects dir) to prevent changes to current
>> index and HEAD while using bare 'git commit' to do most of the work
>> instead of loads of plumbing commands.
>>
>
> When you say "prevent changes to current index and HEAD", I hear "I want
> to automatically submit for review a patch that is different to what I have
> tested locally".

I've tested and committed latest version of patch. I want to let the world
know about it. So the very same code (down to the last bit) will get pushed
to Gerrit.

 Note that such approach won't work for uploading multiple change request
>> without some complex tweaks, but I imagine later we can improve it and
>> support uploading several interdependent change requests from several
>> local branches. We can resolve dependencies between them by tracking
>> latest merges (if branch myfeature-a has been merged to myfeature-b then
>> change request from myfeature-b will depend on change request from
>> myfeature-a):
>>
>> master:    M--....-N-O-...
>>              \    \    \---------E*
>> myfeature-a: A-B-...-C-D-...-E   \
>>                        \       \   J*<= uploaded
>> myfeature-b:           F-...-G-I-J
>>
>
> So now every patch in a series would be a separate branch???

Yeah. See above.

>  This improvement would be implemented later if needed.
>>
>> I hope such feature seams to be useful not just for me and I'm looking
>> forward to some comments on it.
>>
>
> -1, sorry.
>
> cheers,
> Zane.

-- 

Kind regards, Yuriy.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.openstack.org/pipermail/openstack-dev/attachments/20140807/8ced6dd2/attachment.html>

Open Stack

[openstack-dev] [git-review] Supporting development in local branches

OpenStack

Community

Documentation

Branding & Legal