-.- --. .-. --..

Intro to Git's pathspec

Ever wanted to exclude some files when running git log? This post is what you want!

If you’ve used a glob pattern to run a particular git command on a specified list of files, you’ve used a simple version of pathspec. If you’ve never used it, then this post is for you. Skip to the advanced usage section if you’re familiar with the glob pattern but aren’t familiar with the optional flags that the pathspec accepts.

Some of the commonly-used Git commands that accept a pathspec are:

add
log
checkout
clean
diff
grep
ls-files
rm

The pathspec comes last on the command line, after the command’s options and any revisions, and is conventionally separated from them by a double dash (--).

Basic Usage

The following command adds all the Ruby files directly under the lib/ directory:

git add -- lib/*.rb
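One detail worth knowing here: an unquoted glob is expanded by the shell before Git sees it, while a quoted glob is matched by Git itself, and Git's * can also match across directory separators. The sketch below shows the difference in a throwaway repository. The file names (lib/app.rb, lib/util/helper.rb) are made up for illustration.

```shell
# Scratch repository; all paths here are hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q
mkdir -p lib/util
touch lib/app.rb lib/util/helper.rb

# Unquoted: the shell expands the glob, so Git only receives lib/app.rb.
git add -- lib/*.rb
staged_unquoted=$(git ls-files)
echo "$staged_unquoted"

# Quoted: Git matches the pattern itself, including files in subdirectories.
git add -- 'lib/*.rb'
staged_quoted=$(git ls-files)
echo "$staged_quoted"
```

If only the files directly under lib/ should be affected, leaving the glob unquoted is the simpler choice.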

This command discards the uncommitted changes made to the Ruby files under the lib/ directory:

git checkout -- lib/*.rb

I use this technique a lot to clean up the untracked temporary files lying around inside a repository. A pathspec filter can be used to clean specific files without removing all untracked files. For example, to delete all the untracked Ruby files under the lib directory:

git clean -f -- lib/*.rb

Note that the -f flag makes the command actually delete the matching files. This can be dangerous when run by mistake, or when a file that needs to be committed has not been added to the index yet. To avoid this mistake, first run the clean command with the -n flag (a dry run) to see which files would be removed. See also: Tutorial on Git’s clean command, Git’s clean command documentation.
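A minimal sketch of that preview-then-delete routine, run in a throwaway repository. The file names are hypothetical, and the commit identity passed with -c is only there so the example does not depend on a global Git configuration.

```shell
# Scratch repository; all paths here are hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q
mkdir lib
echo "puts 1" > lib/app.rb
git add -- lib/app.rb
git -c user.name=Example -c user.email=example@example.com commit -q -m "Add app"

# Two untracked files; only one of them matches the pathspec.
touch lib/tmp.rb lib/notes.txt

# Dry run: report what would be deleted without deleting anything.
preview=$(git clean -n -- 'lib/*.rb')
echo "$preview"

# After checking the preview, run the real deletion.
git clean -f -q -- 'lib/*.rb'
remaining=$(ls lib)
echo "$remaining"
```

The tracked lib/app.rb and the untracked lib/notes.txt both survive; only the untracked Ruby file is removed.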

Another handy use case is viewing the diff of a particular file across different branches. Consider this branch structure:

     A---B---C topic
    /
D---E---A'---F master

Let’s say both branches have a README.md which has different content on topic and master branches, and we’re currently at the commit C. It’s a common need to find out the changes made to the file between the branches. One way to achieve this is by the following command:

git diff topic..master -- README.md

Note: We could’ve also used HEAD instead of topic if we’re on the topic branch. Another really interesting example of usage of pathspec and the git log command can be found in this HashRocket TIL post.

  1. topic..master is the revision range. For git diff, it means “compare the head commit of the topic branch with the head commit of the master branch”, the same as writing git diff topic master.
  2. The revision range and the file name (the pathspec we want here) have to be separated by --. By adding the file name, we’ve filtered the diff output to include changes only from that file.
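To try the branch comparison without touching a real project, here is a small sketch in a scratch repository. The second file (other.txt), the file contents, and the commit identity are assumptions made for the example.

```shell
# Scratch repository; file contents and other.txt are hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q
commit() { git -c user.name=Example -c user.email=example@example.com commit -q -m "$1"; }

echo "base" > README.md
echo "base" > other.txt
git add -- README.md other.txt
commit "Base commit"
git branch -M master            # make sure the first branch is called master

git checkout -q -b topic
echo "topic text" > README.md
echo "topic text" > other.txt
git add -- README.md other.txt
commit "Change both files on topic"

# Without a pathspec, both files show up as different between the branches.
all_changed=$(git diff --name-only topic..master)
echo "$all_changed"

# With the pathspec, the output is limited to README.md.
readme_only=$(git diff --name-only topic..master -- README.md)
echo "$readme_only"
```

Dropping --name-only prints the actual patch for README.md.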

Advanced Usage

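Apart from the simple glob patterns, the pathspec can also take optional flags that change how the pattern is matched. These flags, which the Git documentation calls magic words, are written in the form :(token) at the start of the pathspec. Since parentheses are special to most shells, such a pathspec should be quoted. The tokens covered here are: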

  • top
  • exclude
  • icase
  • literal

The tokens can also be combined, separated by commas:

  • :(top,exclude)
  • :(top,icase)
  • :(icase,literal)
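As a quick illustration of combining tokens, the sketch below runs git ls-files from a subdirectory of a scratch repository. The file names are made up, and --full-name is used only so that the output is printed relative to the repository root.

```shell
# Scratch repository; all paths here are hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q
mkdir css
touch about.html css/main.css
git add -- about.html css/main.css
cd css

# A plain pattern is resolved relative to css/ and is case-sensitive: no match.
plain=$(git ls-files --full-name -- 'ABOUT.HTML')
echo "plain: $plain"

# top anchors the pattern at the repository root, icase ignores letter case.
mixed=$(git ls-files --full-name -- ':(top,icase)ABOUT.HTML')
echo "mixed: $mixed"
```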

The documentation for each of these switches is available in the git glossary. For the sake of an example, let’s look at two of these switches: top and exclude.

The top switch makes Git match the pattern from the root of the repository instead of the current directory, even when the command is run from a subdirectory.

exclude removes the matching files from the set of files selected by the other pathspecs on the command line.

Consider the following directory structure:

~/test master(dirty) $ tree
.
├── about.html
├── css
│   ├── main.css
│   └── styles.css
└── index.html

1 directory, 4 files

Assume index.html is untracked, that is, it has not been added to the Git index yet.

If we were to run git log -p on the master branch from the root of the repo, but wanted to avoid paging through changes in about.html, we could do something like:

~/test master(dirty) $ git log -p -- . ":(exclude)about.html"
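The effect of the exclusion is easy to check in a scratch repository. In this sketch the commit messages, file contents, and commit identity are assumptions; --format=%s prints only the commit subjects so the difference is easy to see.

```shell
# Scratch repository; commit messages and contents are hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q
commit() { git -c user.name=Example -c user.email=example@example.com commit -q -m "$1"; }

mkdir css
echo "<h1>About</h1>" > about.html
git add -- about.html
commit "Add about page"

echo "body { margin: 0; }" > css/main.css
git add -- css/main.css
commit "Add stylesheet"

# Full history: both commits, newest first.
all=$(git log --format=%s)
echo "$all"

# Everything under the current directory except about.html.
filtered=$(git log --format=%s -- . ':(exclude)about.html')
echo "$filtered"
```

The commit that only touched about.html drops out of the filtered history.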

Now, if we are in the css directory and want to remove the untracked HTML files in the repo while leaving about.html alone, the usual way is to change directory one level up and then remove index.html with the rm command or some other way. An easier way would be:

~/test/css master(dirty) $ git clean -nf -- ../ ":(top,exclude)about.html"
Would remove ../index.html

Here, we use the ../ pathspec to tell the git command to include all the files from the top directory. The magic words in :(top,exclude) tell the command to exclude whatever matches the pattern that follows, and to match that pattern from the root of the git repo rather than from the css directory. The pattern being excluded is about.html.
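To see why top matters when working from a subdirectory, the sketch below compares the two forms in a scratch repository. Unlike the tree shown earlier, it deliberately leaves about.html untracked as well, so that the exclusion has a visible effect. Only dry runs (-n) are used, and the commit identity is an assumption for the example.

```shell
# Scratch repository mirroring the layout above, but with about.html untracked.
repo=$(mktemp -d)
cd "$repo"
git init -q
mkdir css
touch css/main.css css/styles.css
git add -- css
git -c user.name=Example -c user.email=example@example.com commit -q -m "Add stylesheets"
touch about.html index.html     # both untracked
cd css

# Without top, the exclusion is resolved relative to css/ (css/about.html),
# so the about.html at the root is still listed for removal.
without_top=$(git clean -n -- ../ ':(exclude)about.html')
echo "$without_top"

# With top, the exclusion is matched from the repository root.
with_top=$(git clean -n -- ../ ':(top,exclude)about.html')
echo "$with_top"
```

Once the dry-run output looks right, replacing -n with -f performs the deletion.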

These switches are pretty handy whenever you want to exclude files matching a certain pattern from a git command.