The Hackerlab at regexps.com

Development Branches -- The star-merge Style of Cooperation

WARNING: This feature isn't present in the first release of tla but will be in the near future.

In earlier chapters, we developed an extended example out of the hello-world project.

Alice and Bob, the primary programmers on the project, started one archive and created some revisions there.

Candice, a user of the project, created her own archive, started a branch of the hello-world project, and began maintaining her own local modifications.

In this chapter, we'll begin to consider a situation that is more typical of free software projects in the real world. Here, we'll consider Alice and Bob to be the maintainers of a public project, and Candice as a major remote contributor to the project. We'll identify the new revision control needs that arise from that arrangement, and look at some arch commands that help to satisfy those needs.

Promoting an Elementary Branch to a Development Branch

So far, if you've been following the examples, Candice has an elementary branch. She made a branch from the mainline, made some local changes, and has kept her branch up-to-date with Alice and Bob's mainline.

We're supposing, at this point, that Alice and Bob want to merge Candice's changes into the mainline.

Well, that merging work has already been done. Candice's latest revision is exactly the tree that Alice and Bob want. They can incorporate that merge into their mainline very simply, by committing Candice's latest revision to their own mainline:


        % tla get -A candice@candice.net--2003-candice \
                    hello-world--candice--0.1 \
                    hw-C
        [...]


        % cd hw-C

        % tla set-tree-version -A lord@emf.net--2003-example \
                    hello-world--mainline--0.1

        % tla make-log
        ++log.hello-world--mainline--0.1--lord@emf.net--2003-example

        [... edit log file (consider `tla log-for-merge') ... ]

        % cat ++log.hello-world--mainline--0.1--lord@emf.net--2003-example
        Summary: merge from Candice's Branch
        Keywords: 

        Patches applied:

          * candice@candice.net--2003-candice/hello-world--candice--0.1--patch-2
             merge from mainline sources

          * candice@candice.net--2003-candice/hello-world--candice--0.1--patch-1
             Punctuated the output correctly

          * candice@candice.net--2003-candice/hello-world--candice--0.1--base-0
             tag of 
              lord@emf.net--2003-example/hello-world--mainline--0.1--patch-1

        % tla commit
        [....]


Read Carefully Note: Pay close attention to the trick we just used. Candice's latest revision was exactly what Alice and Bob wanted -- they combined get with set-tree-version to turn Candice's tree into one they could easily commit to their own mainline.

Simple Development Branches

Let's consider what happens as development proceeds on both branches. For this purpose, we'll introduce something new: a way of diagramming branches and the merges between them.

After the examples so far, we have this situation:


      mainline-0.1                    candice-0.1
      ------------                    -----------
        base-0             -----------> base-0 (a tag)
        patch-1  ---------'             patch-1
        patch-2             ----------> patch-2
        patch-3  ----------'  --------'
        patch-4  <-----------'


which tells us that the candice branch is a tag of patch-1 from the mainline; that at patch-2 of the candice branch, there was a merge of everything up to patch-3 of the mainline; and finally that patch-4 of the mainline merges in everything up to patch-2 from the candice branch.

Whenever we have such a diagram in which none of the merge lines cross, that is a simple development branch.

The significance of a simple development branch is that it's a model for how two development efforts can work asynchronously on one project. Within each effort -- on each branch -- programmers use the "update/commit" style of cooperation (see The update/commit Style of Cooperation). However, changes on one branch have no effect on the other until the two branches are merged.
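The "no crossing merge lines" condition can be checked mechanically. What follows is a minimal sketch in Python, not part of arch: it assumes each merge line in a diagram is recorded as a hypothetical pair (mainline patch level, candice patch level), with base-0 counted as level 0, and it treats two lines as crossing when they are ordered one way on the mainline side and the opposite way on the candice side.

```python
from itertools import combinations

def is_simple_development_branch(merge_lines):
    """Return True if no two merge lines cross.

    merge_lines is a list of (mainline_level, candice_level) pairs, one
    per arrow in the diagram (an assumed encoding, not an arch format).
    """
    for (m1, c1), (m2, c2) in combinations(merge_lines, 2):
        # Opposite ordering on the two sides means the lines cross.
        if (m1 - m2) * (c1 - c2) < 0:
            return False
    return True

# The diagram above: the tag of mainline patch-1 at candice base-0, the
# merge of mainline patch-3 into candice patch-2, and the merge of
# candice patch-2 into mainline patch-4.
hello_world = [(1, 0), (3, 2), (4, 2)]
print(is_simple_development_branch(hello_world))

# A hypothetical history in which the merge lines do cross.
crossing = [(2, 3), (4, 1)]
print(is_simple_development_branch(crossing))
```

Under this encoding the hello-world history passes the check, while the second, invented history does not.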

Introducing The Development Branch Merging Problem

Let's suppose that more work happens on both the mainline and candice branches, leaving us with:


      mainline-0.1                    candice-0.1
      ------------                    -----------
        base-0             -----------> base-0 (a tag)
        patch-1  ---------'             patch-1
        patch-2             ----------> patch-2
        patch-3  ----------'  --------' patch-3
        patch-4  <-----------'          patch-4
        patch-5
        patch-6


        % tla revisions --summary -A candice@candice.net--2003-candice \
                          hello-world--candice--0.1
        base-0
            tag of 
            lord@emf.net--2003-example/hello-world--mainline--0.1--patch-1
        patch-1
            Punctuated the output correctly
        patch-2
            merge from mainline sources
        patch-3
            added a period to output string
        patch-4
            capitalized the output string



        % tla revisions --summary -A lord@emf.net--2003-example \
                          hello-world--mainline--0.1
        base-0
            initial import
        patch-1
            Fix bugs in the "hello world" string
        patch-2
            commented return from main
        patch-3
            added copywrong statements
        patch-4
            merge from Candice's Branch
        patch-5
            fixed the copyrwrong for hw.c
        patch-6
            fixed the copyrwrong for main.c





Let's consider a scenario in which our goal is to merge the new work on the mainline branch into the candice branch. In other words, we want to wind up with:


      mainline-0.1                    candice-0.1
      ------------                    -----------
        base-0             -----------> base-0 (a tag)
        patch-1  ---------'             patch-1
        patch-2             ----------> patch-2
        patch-3  ----------'  --------' patch-3
        patch-4  <-----------'          patch-4
        patch-5               --------> patch-5
        patch-6  ------------'


How can we perform that merge? Let's start with the latest pre-merge candice revision (patch-4):


        % tla get -A candice@candice.net--2003-candice \
                       hello-world--candice--0.1--patch-4 \
                       hw-C-4
        [....]

        % cd hw-C-4


Here are two techniques that don't work:

replay Does Not Solve the Development Branch Merge Problem

replay will try to apply all "missing" changes from the mainline into the candice tree. The list of changesets it will apply is given by:


        % tla whats-missing --summary \
                              -A lord@emf.net--2003-example \
                              hello-world--mainline--0.1
        patch-4
            merge from Candice's Branch
        patch-5
            fixed the copyrwrong for hw.c
        patch-6
            fixed the copyrwrong for main.c


Problematic in that list is patch-4. It's a merge that includes all of the changes from the candice branch up to its patch-2 level. Yet those changes are already present in the patch-4 revision of the candice branch, so replay would apply them redundantly and cause patch conflicts.

Note of Warning: The replay command will not prevent you from running further replays even though the source tree is not in a consistent state. TLA in its current incarnation does not merge reject files. This leaves open the possibility that patch rejects will be lost if a second replay is performed before the rejects from the first replay are resolved. (Some day TLA may be able to merge multiple rejects into a combined reject.)

Advanced User Note: The replay command has options that would allow us to skip the patch-4 revision from the mainline. That sort of solves the problem, but it has some drawbacks. First, it means that patch-4 will continue to appear in the whats-missing output of the candice branch. Second, nothing guarantees that the patch-4 changeset contains only merges from the candice branch. If Alice and Bob made other changes in patch-4, and we skip that changeset, those other changes will be lost.
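To see why patch-4 is redundant, it can help to model the situation with sets. The sketch below is a toy model in Python, not a description of how arch is implemented: each tree is represented by the set of original changes it contains, using made-up labels (m2 for the change first committed as mainline patch-2, c1 for the change first committed as candice patch-1, and so on). All names in it are assumptions made for illustration.

```python
# Toy model: a tree is the set of original changes it contains.
# Labels are hypothetical: "m2" is the change first committed as
# mainline patch-2, "c1" the change first committed as candice patch-1.
mainline_changesets = {
    "patch-4": {"c1"},   # merge from Candice's branch (up to her patch-2)
    "patch-5": {"m5"},
    "patch-6": {"m6"},
}

# candice patch-4: the tag of mainline patch-1, her merge of mainline
# patch-2 and patch-3, and her own changes.
candice_tree = {"m0", "m1", "m2", "m3", "c1", "c3", "c4"}

def replay(tree, changesets):
    """Apply each changeset in order; report changes already present."""
    result = set(tree)
    redundant = {}
    for name, changes in changesets.items():
        overlap = changes & result
        if overlap:
            # In a real tree these would show up as patch conflicts.
            redundant[name] = overlap
        result |= changes
    return result, redundant

merged, redundant = replay(candice_tree, mainline_changesets)
print(redundant)
```

In this model the only changeset that overlaps with the candice tree is mainline patch-4, which matches the argument above.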

update Does Not Solve the Development Branch Merge Problem

Suppose we try to update from the mainline branch. Recall that update will compute a changeset from the youngest mainline ancestor of the project tree to the tree itself, then apply that changeset to the latest mainline revision.

We have a notation for this. A changeset from X to Y is written:

        delta(X, Y)

In this case, update will start by computing a changeset from the mainline patch-3 revision to our project tree:

        delta(mainline--0.1--patch-3, hw-C-4) 

The tree that results from applying a changeset from X to Y to a tree Z is written:

        delta(X, Y) [ Z ]

In other words, the result of update in our example can be described as:

        delta(mainline--0.1--patch-3, hw-C-4) [mainline--0.1--patch-6]

Here's the problem, though. The patch-3 revision of mainline was not previously merged with the candice branch. Thus, the changeset

        delta(mainline--0.1--patch-3, hw-C-4)

will include, among other changes, the changes from patch-1 and patch-2 of the candice branch.

Unfortunately, the tree we'll be applying that changeset to, mainline--0.1--patch-6, has already been merged with base-0...patch-2 of the candice branch.

As with replay, update will cause merge conflicts by making redundant changes.
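The delta notation can be played with in a few lines of Python. This is only a sketch under simplifying assumptions: a tree is modelled as the set of original changes it contains (the labels m0...m6 and c1...c4 are invented for this illustration), and a changeset is a pair of sets. Real changesets are file patches, so the model only shows where redundant changes come from, not what the resulting conflicts look like.

```python
def delta(x, y):
    """Changeset from tree x to tree y: (changes added, changes removed)."""
    return (y - x, x - y)

def apply_changeset(changeset, z):
    """Compute delta(X, Y)[Z]; also return the redundant additions."""
    added, removed = changeset
    redundant = added & z   # already present in z: spurious conflicts
    return (z - removed) | added, redundant

# Hypothetical labels: "m3" is the change first committed as mainline
# patch-3, "c1" the change first committed as candice patch-1.
mainline_patch_3 = {"m0", "m1", "m2", "m3"}
mainline_patch_6 = {"m0", "m1", "m2", "m3", "c1", "m5", "m6"}
hw_C_4 = {"m0", "m1", "m2", "m3", "c1", "c3", "c4"}  # candice patch-4, no local edits

# update: delta(mainline--0.1--patch-3, hw-C-4) [mainline--0.1--patch-6]
tree, redundant = apply_changeset(delta(mainline_patch_3, hw_C_4),
                                  mainline_patch_6)
print(sorted(redundant))
```

The redundant set is not empty: the change from candice patch-1 reached mainline--0.1--patch-6 through the earlier merge, and the update changeset tries to add it again.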

Solving One Instance of the Development Branch Merging Problem

Using just our delta notation, let's look at solving this merge problem cleanly.

One possibility is to give highest priority to the mainline branch, next priority to the changes merged in from candice, and lowest priority to any local changes that exist only in the project tree. A solution that accomplishes that cleanly, without spurious conflicts from redundant changes, is:


   tmp = delta(X, Y)[Z]
     where X = candice@candice.net--2003-candice/hello-world--candice--0.1--patch-2
           Y = candice@candice.net--2003-candice/hello-world--candice--0.1--patch-4
           Z = lord@emf.net--2003-example/hello-world--mainline--0.1--patch-6
   

   answer = delta(L,M)[tmp]

     where L = candice@candice.net--2003-candice/hello-world--candice--0.1--patch-4
           M = /usr/lord/examples/wd/hw-C-4


Usage Note: The concept of "priority" in merging has to do with how (non-spurious) conflicts are handled. By giving higher priority to the mainline, we ensure that conflicting changes from the other two trees involved will be the ones that appear in .rej files. That can be important if, for example, we want to minimize changes made relative to the mainline as conflicts are resolved. (See Inexact Patching -- How Conflicts are Handled.)

A similar looking but different solution arises if we want to give higher priority to the candice branch than the mainline:



   tmp = delta(X, Y)[Z]
     where X = candice@candice.net--2003-candice/hello-world--candice--0.1--patch-2
           Y = lord@emf.net--2003-example/hello-world--mainline--0.1--patch-6
           Z = candice@candice.net--2003-candice/hello-world--candice--0.1--patch-4
   

   answer = delta(L,M)[tmp]

     where L = candice@candice.net--2003-candice/hello-world--candice--0.1--patch-4
           M = /usr/lord/examples/wd/hw-C-4
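The solution that gives priority to the candice branch can be checked against a toy model. The Python sketch below is illustrative only and rests on assumptions: trees are modelled as sets of invented change labels (m5 for the change first committed as mainline patch-5, and so on), archive names are omitted, and the "local" label stands for a hypothetical uncommitted edit in hw-C-4.

```python
def delta(x, y):
    """Changeset from tree x to tree y: (changes added, changes removed)."""
    return (y - x, x - y)

def apply_changeset(changeset, z):
    """Compute delta(X, Y)[Z]; also return the redundant additions."""
    added, removed = changeset
    redundant = added & z   # already present in z: spurious conflicts
    return (z - removed) | added, redundant

candice_patch_2 = {"m0", "m1", "m2", "m3", "c1"}
candice_patch_4 = candice_patch_2 | {"c3", "c4"}
mainline_patch_6 = candice_patch_2 | {"m5", "m6"}
hw_C_4 = candice_patch_4 | {"local"}   # hypothetical local edit

# tmp = delta(X, Y)[Z] with X = candice patch-2, Y = mainline patch-6,
# Z = candice patch-4
tmp, redundant_1 = apply_changeset(delta(candice_patch_2, mainline_patch_6),
                                   candice_patch_4)

# answer = delta(L, M)[tmp] with L = candice patch-4, M = hw-C-4
answer, redundant_2 = apply_changeset(delta(candice_patch_4, hw_C_4), tmp)

print(redundant_1, redundant_2)
print(sorted(answer))
```

Both redundant sets come out empty in this model, and the answer contains every change from both branches plus the local edit, which is the property the two-step formula is meant to guarantee.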



We could, in fact, implement those solutions in a "brute force" way: studying the merge histories of the two branches, coming up with those solutions, checking out the necessary trees, using mkpatch to compute changesets, and dopatch to apply them. As we'll soon see, however, arch has built-in support for this kind of merge.

star-merge -- Solving the Development Branch Merging Problem in General

It's a bit beyond the scope of this tutorial to explain the complete solution to the development branch merging problem in general. The two solutions shown above illustrate two cases, but slightly different solutions are sometimes necessary.

What you should know is that when you have simple development branches (see Simple Development Branches), the command star-merge knows how to merge between them without causing spurious merge conflicts.

star-merge (with the --in-place option, as illustrated here) expects two arguments: a fully qualified version or revision, and the project tree that the merge should be performed upon, which should start with './'.

For example, the merge from the previous section can be performed with either of these two command sequences:

        
        
        % tla get -A candice@candice.net--2003-candice \
                hello-world--candice--0.1--patch-4 \
                merge-temp
        % tla star-merge lord@emf.net--2003-example/hello-world--mainline--0.1 \
                merge-temp

                                        - or -
        % cd hw-C-4
        % tla undo 
        % tla star-merge lord@emf.net--2003-example/hello-world--mainline--0.1 .


Usage Note: Notice that the merge solutions we found earlier both involved the application of two changesets. In general, it's possible that either or both of those can result in patch conflicts. If conflicts occur after the first merge, star-merge stops and gives you a chance to resolve the conflicts. The --finish option to star-merge can then be used to apply the second changeset (see tla star-merge --help).

Usage Note: Recall that before we used star-merge, we did the first merge back from candice to mainline using a get/set-tree-version/commit trick (see Promoting an Elementary Branch to a Development Branch). In general, the first merge from an elementary branch back to its parent branch should always be done in that way. (Users sometimes request that star-merge should have that trick "built in" to its operation. Perhaps some day it will.)

arch Meets hello-world: A Tutorial Introduction to The arch Revision Control System