The Hackerlab at regexps.com

Inventory Tags for Source


Caution: Steep Learning Curve: As in the previous chapter, the concepts and commands introduced here are likely to be unfamiliar to you, even if you have used other revision control systems. Once you "get it", though, this will seem quite natural. Best of all, this is the last tricky step before we can start storing project trees in an archive.

Two Names for Every File

In the arch world, every source file (and directory) in your project tree has two names: a file path and an inventory tag.

The file path of a file always begins with the string ./ and continues with a relative path name. It describes where within a source tree a file is located.

The inventory tag of a file is a (mostly) arbitrary string that is unique to the file within the tree.

Ordinarily, when a file is moved, its file path changes, but its inventory tag remains the same.

For example, let's suppose that we have a tree with the files:


        file path               inventory tag
        ---------               -------------

        ./hw.c                  i_tag_hw
        ./main.c                i_tag_main


but then we rename hw.c to hello.c. Afterwards, we'll have:

        file path               inventory tag
        ---------               -------------

        ./hello.c               i_tag_hw
        ./main.c                i_tag_main


This chapter will teach you how tags are assigned to files, how tags are managed, and how they are used by arch.

Why is it Like This -- The Purpose of Inventory Tags

As you'll see in later chapters, arch is good at managing changes made to source trees and the files they contain, and good at telling you about the history of trees and files.

As an example, let's suppose that Alice and Bob are both working on the hello-world project. In her tree, Alice makes some changes to hw.c. In his tree, Bob renames hw.c to hello.c.

At some point it is necessary to "sync-up" Alice and Bob. Bob should wind up with the changes Alice has been making. Alice should wind up with the same file renaming that Bob has done.

arch provides many mechanisms for that syncing up -- it's one of the most important things that arch can do -- but nearly all of them boil down to computing and applying changesets.

Alice can ask arch to create a changeset describing the work she's done, and that changeset will describe the changes she made within hw.c. Bob can create a changeset, and that changeset will describe the file renaming he did.

If Alice applies Bob's changeset to her tree, her copy of hw.c should be renamed hello.c. But a trickier case is this: What happens if Bob applies Alice's changeset to his tree?

Alice changed a file named ./hw.c, but in Bob's tree, those same changes should be made to a file named ./hello.c. Fortunately, both files have the same inventory tag:


        file path               inventory tag
        ---------               -------------

                 Alice's tree:
        ./hw.c                  i_tag_hw

                 Bob's tree:
        ./hello.c               i_tag_hw


In Alice's changeset, the changes Alice made are described as being made to the file whose tag is i_tag_hw.

Therefore, when applying that changeset to Bob's tree, arch knows to apply the changes to the file with that same tag; it knows to apply the changes to his ./hello.c.

That example illustrates what inventory tags are for: they allow arch to describe the changes made to a tree in terms of the logical identity of files rather than their physical location. There are many more complicated examples of how inventory tags come into play, but now you've seen at least the basic point.
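The Alice and Bob example can be sketched in a few lines of Python. This is only a toy model, not how arch stores trees or changesets: the dictionaries and the apply_changes function are hypothetical names chosen for illustration. It shows the one idea that matters here, namely that changes are looked up by tag rather than by path.

```python
# Toy model (an assumption for illustration, not arch's real data format):
# a tree maps inventory tag -> file path, and contents maps path -> text.

def apply_changes(tree, contents, changes):
    """Apply {tag: new_text} to whichever path carries each tag in this tree."""
    for tag, new_text in changes.items():
        path = tree[tag]            # find the file by its logical identity
        contents[path] = new_text   # then modify it at its current location


# Alice's changeset describes her edit in terms of the tag, not "./hw.c".
alice_changes = {"i_tag_hw": "/* Alice's edited hello code */\n"}

# Bob has already renamed hw.c to hello.c; the tag is unchanged.
bob_tree = {"i_tag_hw": "./hello.c", "i_tag_main": "./main.c"}
bob_contents = {"./hello.c": "/* original */\n", "./main.c": "/* main */\n"}

apply_changes(bob_tree, bob_contents, alice_changes)
print(bob_contents["./hello.c"])
```

Because the lookup goes through the tag, Alice's edit lands in Bob's ./hello.c even though her changeset was made against a file called ./hw.c.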

Choices about Tags -- Introducing tagging-method

So, how are files given inventory tags? Filesystems do not, as a general rule, have any such concept as inventory tags built-in. Operating systems don't have inventory tags built-in. Rather, users have to take explicit action to assign tags to files and to maintain that assignment.

arch gives you a choice. There are three possible ways you can assign tags to files. When you first create a project, you choose one of these three techniques.

The possibilities are briefly introduced here, then described in more detail below.

The names Method of Tagging

You are free to just ignore tags -- to not use them at all. This is called the names method of tagging. In this method, the inventory tag of every file is essentially the same as the file path of that file. As a result, if you rename a file, its tag changes:


           The names Method of Tagging


        file path               inventory tag
        ---------               -------------

                Before a rename
        ./hw.c                  ?./hw.c

                 After a rename

        ./hello.c               ?./hello.c


If you don't care about synchronizing changes between trees in which files have been renamed, the names method is the easiest to use.
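As a rough illustration of why renames are a problem under this method, the sketch below derives a tag from a path the way the table above suggests (the path with a leading "?"). The function name names_tag is hypothetical, and the exact derivation arch uses is an assumption based on that table.

```python
def names_tag(file_path):
    # Assumption: the tag is just the file path with a leading "?",
    # matching the "?./hw.c" entries in the table above.
    return "?" + file_path


before = names_tag("./hw.c")
after = names_tag("./hello.c")

# The two tags differ, so nothing links the renamed file to the old one.
print(before, after, before == after)
```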

The explicit Method of Tagging

In the explicit method, inventory tags are stored separately from files, in subdirectories called .arch-ids. Using this method most resembles using older revision control systems such as CVS: if you add a new source file, you must use the command tla add-tag to assign it a tag; if you remove a file, you must use tla delete-tag to remove its tag; if you rename a file, you must use tla move-tag to move the corresponding tag file. This is a tried-and-true technique, but some users find the necessity of issuing commands like add-tag, delete-tag, and move-tag to be inconvenient.

The tagline Method of Tagging

The tagline method is a superset of both the explicit method and the names method. You may, for a particular file, define no tag -- in which case it works like the names method. You may define a tag using tla add-tag -- in which case it works like the explicit method. However, the real convenience of the tagline method comes from this: you may define the tag for a file by adding a specially formatted string near the top or bottom of that file (e.g., in a comment for source files).

I strongly recommend the tagline method, even though it is the least familiar to new users. As you'll see below, its convenience far outweighs its drawbacks.

Setting the Tagging Method

To set or check the tagging method for a tree, use the tagging-method command:

        % cd ~/wd/hello-world

        # set the tagging method
        # 
        % tla tagging-method tagline


        # check the tagging method
        # 
        % tla tagging-method
        tagline


Possible values when setting a tagging method are names, explicit, and tagline. (A now deprecated but still supported method called implicit, which is similar to tagline, is also permitted.)

Using the names Tagging Method

If you want to use the names method of tagging, use this command:

        % tla tagging-method names

Thereafter, every file will have a tag that is derived from its path relative to the tree root.

(Again, the names tagging method has the disadvantage that changesets do not work cleanly if files have been renamed.)

Using the explicit Tagging Method

If you want to use the explicit method of tagging, use this command:

        % tla tagging-method explicit

When using the explicit method, it is (ordinarily) necessary to use tla add-tag to tag every file and directory:

        % tla add-tag FILE

If FILE is a directory, that will create FILE/.arch-ids/=id. If it is a regular file or symbolic link, it will create (in the same directory) .arch-ids/FILE.id. In either case, the file created will contain an automatically generated tag for the file.
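The placement rule for explicit tag files can be written down as a small path computation. This is a sketch only: id_file_for is a hypothetical helper, not part of tla, and the subdirectory name is assumed to be .arch-ids as in the description of the explicit method earlier in this chapter.

```python
import posixpath

ID_DIR = ".arch-ids"  # assumed name of the per-directory tag subdirectory


def id_file_for(path, is_directory):
    """Return where the explicit tag for `path` would be stored."""
    if is_directory:
        # A directory keeps its own tag inside itself, in a file named "=id".
        return posixpath.join(path, ID_DIR, "=id")
    # A file or symlink keeps its tag next to it, in NAME.id.
    parent, base = posixpath.split(path)
    return posixpath.join(parent, ID_DIR, base + ".id")


print(id_file_for("./src", True))
print(id_file_for("./src/hw.c", False))
```

Seen this way, it is clear why a directory's tag travels with the directory when it is renamed, while a file's tag has to be moved separately.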

If you remove a regular file or symbolic link, you must use the command:

        % tla delete-tag FILE

That won't remove FILE itself, but it will remove the inventory tag for FILE.

In order to remove a directory, you must remove its .arch-ids subdirectory yourself. That will also implicitly remove the inventory tags of any files that arch thinks are stored in that directory. (For example, using rm -rf to remove a directory also removes its tag.)

If you rename a regular file or symbolic link, you can use the command:

        % tla move-tag OLD-NAME NEW-NAME

to move the inventory tag for that file.

Usage Note: tla move-tag does not work like the unix command mv. In particular, NEW-NAME must be the new name of the file -- not the name of a directory into which the file is being moved.

If you rename a directory, its inventory tag (and the tags for all files and subdirectories it contains) move with it automatically (because the .arch-ids subdirectory has moved).

When you run tla inventory in a working directory using explicit tagging, only explicitly designated source files are listed. If you would rather see a list of all files passing the naming conventions for source files, use:

        % tla inventory --source --names

You should also read about tree-lint (see Keeping Things Neat and Tidy).

Using a tagline Inventory

To use tagline tagging, use the following command in your project tree:

        % tla tagging-method tagline

With this method: any file with no tag is assigned a tag based on its location -- just as in the names method. Any file assigned a tag with tla add-tag uses that tag -- just as in the explicit method. Finally, though, you can tag a file by adding a specially formatted string near the top or bottom of the file.

A tag within a file is a single line that occurs within 1024 bytes of the start or end of the file and has the form:

        <punct>arch-tag:<spaces><tag>

For example:


        /* arch-tag: `main' for the hello world program
         */


Note: Leading and trailing spaces around an inventory tag are not considered part of the tag. Within a tag, every non-graphical character is replaced by _. For example, if you write the tag:

        `main' for the hello    world program

the actual inventory tag is:

        `main'_for_the_hello____world_program
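The tagline rules above can be approximated in Python as follows. This is a simplified sketch, not tla's actual scanner: the names find_tagline and normalize_tag are hypothetical, "graphical" is assumed to mean printable non-whitespace ASCII, the leading punctuation is not verified, and the start of the file is assumed to be checked before the end.

```python
import re
import string

WINDOW = 1024  # bytes searched at each end of the file

# Assumption: "graphical" means printable ASCII other than whitespace.
GRAPHICAL = set(string.ascii_letters + string.digits + string.punctuation)

# Simplification: match "arch-tag:" anywhere on a line and take the rest
# of that line as the raw tag; the leading <punct> is not checked.
TAGLINE = re.compile(r"arch-tag:[ \t]*([^\n]*)")


def normalize_tag(raw):
    """Strip surrounding spaces, then replace non-graphical characters by _."""
    return "".join(c if c in GRAPHICAL else "_" for c in raw.strip())


def find_tagline(data):
    """Return the normalized tag found near either end of `data`, or None."""
    for chunk in (data[:WINDOW], data[-WINDOW:]):
        match = TAGLINE.search(chunk.decode("latin-1"))
        if match:
            return normalize_tag(match.group(1))
    return None


source = (b"int main(void) { return 0; }\n\n"
          b"/* arch-tag: `main' for the hello    world program\n */\n")
print(find_tagline(source))
```

Run on the sample file contents, this prints the same tag as the example above, with each of the four spaces between "hello" and "world" turned into its own underscore.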

Keeping Things Neat and Tidy

The command:

        % tla tree-lint

is useful for keeping things neat and tidy.

If you use explicit tagging, it will tell you of any tags for which the corresponding file does not exist. It will tell you of any files that pass the naming conventions, but for which no explicit tag exists.

If you use tagline tagging, it will tell you of any files for which no tag can be found -- either explicit or tagline. It will tell you of any explicit tags for which the corresponding file does not exist.

In either case, or if you are using the names tagging method, tree-lint will tell you of any files that don't fit the naming conventions at all.

Finally, if you use explicit or tagline tagging, tree-lint will check for cases where multiple files use the same tag. If any two files do have the same tag, you must correct that, either by editing the tag (if it is in the file itself) or by using delete-tag and add-tag to replace a duplicated explicit tag.
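The duplicate check is easy to picture as grouping files by tag. The sketch below is only an illustration of that idea and says nothing about how tree-lint is implemented; duplicate_tags and the sample inventory are hypothetical.

```python
from collections import defaultdict


def duplicate_tags(inventory):
    """inventory: iterable of (file_path, tag). Return {tag: [paths]} for shared tags."""
    by_tag = defaultdict(list)
    for path, tag in inventory:
        by_tag[tag].append(path)
    # Keep only the tags claimed by more than one file.
    return {tag: sorted(paths) for tag, paths in by_tag.items() if len(paths) > 1}


inventory = [
    ("./hw.c", "i_tag_hw"),
    ("./hello.c", "i_tag_hw"),   # e.g. a copied file whose tagline was not edited
    ("./main.c", "i_tag_main"),
]
print(duplicate_tags(inventory))
```

A common way to end up in this state with tagline tagging is to copy a source file and forget to change the arch-tag line in the copy.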

Tagging the hello-world Project

To continue the example started in earlier chapters: let's use the tagline tagging method for the hello-world project:

        % cd ~/wd/hello-world

        % tla tagging-method tagline

So, what does tree-lint tell us?

        % tla tree-lint
        These apparent source files lack inventory tags:
 
        ./hw.c
        ./main.c



If we edit the source files to add tags:

        % tail -3 main.c
        
        /* arch-tag: main module of the hello-world project
         */
        % tail -3 hw.c
        
        /* arch-tag: hello_world module of the hello-world project
         */

Then the next run of tree-lint omits the warning.

Usage Note 1: Notice that we didn't tag any of the files in {arch}. arch implicitly provides tags for all of its control files.

Usage Note 2: Our little sample project doesn't have any subdirectories other than arch control directories. If it did, we might want to tag those with tla add-tag.

Other Ways to Tag Files

In some situations, it isn't convenient to explicitly tag every file or to add a tagline tag to every file.

For every file in a directory that lacks an explicit tag, you can supply a default tag with the command:

        % tla explicit-default TAG-PREFIX

After that, every file in that directory which lacks an explicit tag will have the tag:

        TAG-PREFIX__BASENAME

where BASENAME is the basename of the file. Default tags created in this way take precedence over tagline tags embedded in files. You can find out the default tag for a directory with:

        % tla explicit-default
        TAG-PREFIX

and remove the default with:

        % tla explicit-default --delete

You can also specify a default tag which has lower precedence than tagline tags:

        % tla explicit-default --weak TAG-PREFIX

and view that default:

        % tla explicit-default --weak

or delete it:

        % tla explicit-default --weak --delete
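Putting the precedence rules of this section together, the order in which a tag is chosen under tagline tagging can be sketched as below. This is an interpretation of the text, not tla's code: resolve_tag is a hypothetical function, the weak default is assumed to use the same TAG-PREFIX__BASENAME form as the ordinary default, and the final fallback is assumed to be the names-style "?" tag.

```python
import posixpath


def resolve_tag(path, explicit=None, strong_default=None,
                tagline=None, weak_default=None):
    """Pick a tag for `path` using the precedence described in this chapter."""
    base = posixpath.basename(path)
    if explicit is not None:          # tag assigned with add-tag
        return explicit
    if strong_default is not None:    # explicit-default TAG-PREFIX
        return strong_default + "__" + base
    if tagline is not None:           # arch-tag: line inside the file
        return tagline
    if weak_default is not None:      # explicit-default --weak TAG-PREFIX
        return weak_default + "__" + base
    return "?" + path                 # no tag at all: fall back to the name


print(resolve_tag("./doc/a.txt", strong_default="docs", tagline="intro-text"))
print(resolve_tag("./doc/a.txt", tagline="intro-text", weak_default="docs"))
```

The two calls differ only in whether the directory default is weak: the ordinary default overrides the tagline, while the weak default gives way to it.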

Telling tree-lint to Shut Up

When using tagline tags, you may sometimes have a directory with many files that have no tag (either explicit or tagline), but not want those files to appear in a report of untagged files generated by tree-lint. There are two ways to tell tree-lint to shut up about such files:

One is to provide a default explicit tag or weak default explicit tag using tla explicit-default, as described above.

The second method is to label the directory as a "don't care" directory -- which means that tree-lint shouldn't complain about untagged files. You can do that with:

        % tla explicit-default --dont-care set

or remove the "don't care" flag with:

        % tla explicit-default --delete --dont-care

You can find out whether the "don't care" flag is set in a given directory with:

        % tla explicit-default --dont-care

arch Meets hello-world: A Tutorial Introduction to The arch Revision Control System