In this post, I am going to present some simple tips for using grep.

Default Grep Command

I use a custom command, grr, whenever I would otherwise run plain grep. It is defined as follows:

alias grr="grep --recursive --line-number \
    --exclude-dir=.git --exclude-dir=.svn \
    --exclude='*.o' --exclude='*.pyc' --exclude=.ctags"

The command is called grr for “GRep Recursive”. The --recursive switch is self-explanatory. The --line-number option makes grep prefix each match with the number of the line where it was found.

The --exclude switches can be adapted to your actual needs. There are tools dedicated specifically to searching source code folders. However, the above alias can be used on any machine, not only within your personal development environment.
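To see what the exclude switches do, here is a small sketch that builds a throwaway directory and searches it with the same options. The directory layout and file contents are made up for illustration, and the options are wrapped in a function rather than the alias only so that the sketch also runs as a script.

```shell
# Same options as the grr alias, wrapped in a function so the sketch
# also works when run non-interactively.
grr() {
    grep --recursive --line-number \
        --exclude-dir=.git --exclude-dir=.svn \
        --exclude='*.o' --exclude='*.pyc' --exclude=.ctags "$@"
}

# Hypothetical demo tree: one source file and one file inside .git.
demo=$(mktemp -d)
mkdir -p "$demo/src" "$demo/.git"
printf 'int main(void)\n{\n    return needle;\n}\n' > "$demo/src/main.c"
printf 'needle in repository metadata\n' > "$demo/.git/config"

# Only the match in src/main.c (line 3) is reported;
# the match in .git/config is skipped.
(cd "$demo" && grr needle .)
```

Adding further --exclude or --exclude-dir switches to the function works the same way as for the alias.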

A word of caution: Do not set an alias for grep without changing the name, i.e. don’t do alias grep="grep --recursive ...". This may bite you wherever code “blindly” calls grep, for example in scripts you source or in shell functions defined in your interactive shell, because your custom grep is called instead of the plain one.
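If you write such scripts or functions yourself, one defensive option is to call grep through the command builtin, which skips aliases and shell functions of the same name. The sketch below uses a function as a hypothetical stand-in for a customized grep, because bash does not expand aliases in non-interactive scripts by default:

```shell
# Hypothetical stand-in for a customized grep that changes the output format.
grep() {
    command grep --line-number "$@"
}

# Goes through the override: the match is prefixed with its line number.
printf 'foo\nbar\n' | grep bar

# Bypasses the override: plain grep output.
printf 'foo\nbar\n' | command grep bar
```

This only protects code you control; it does not help with third-party code that calls grep directly, which is why a distinct name like grr remains the safer choice.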

Use Colors And Pipes

I have already written about G as a better alias for grep when piping. Defining the zsh global alias …

alias -g G='| grep'

… allows you to run:

some_command G mypattern

An improved version of this alias keeps the color codes in the output even when it is piped further:

alias -g G='| grep --color=always'

This is really handy when piping grep into grep like so:

$ cat input.txt
aa
ab
ac
bb
bc
cc

$ cat input.txt | grep a | grep -v b
aa
ac

With --color=always set, grep emits its color codes even when its output goes into a pipe, and the next grep passes them through on the lines it keeps. I really like to chain greps when it is easy to spell out the search as multiple patterns (“search for lines containing this but not that”). Without that option, only the last grep would colorize, whatever the last pattern then was.
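The difference can be made visible by sending the result through cat -v, which prints escape characters in a readable form. This is a minimal sketch; the sample function is a hypothetical stand-in for input.txt, and the exact escape sequences depend on your grep and its color settings.

```shell
# Hypothetical stand-in for input.txt from the example above.
sample() { printf 'aa\nab\nac\nbb\nbc\ncc\n'; }

# --color=auto: the first grep writes into a pipe, so it emits no color codes.
sample | grep --color=auto a | grep -v b | cat -v

# --color=always: the first grep emits color codes into the pipe,
# and the second grep passes them through on the lines it keeps.
sample | grep --color=always a | grep -v b | cat -v
```

One caveat: the color codes become part of the text that later greps see. A later pattern that spans a previously highlighted match (for example, searching for ab after highlighting a) may therefore no longer match.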

[Image: Cat Grep Color]

A Grep Wrapper

At work, there is a database dump containing lots of customer information. The file roughly looks like this:

customer1|somedata|otherdata|whateverdata|...
customer1|evenmoredata|otherdata|dododata|...
customer2|somereallystrangedata|otherdata...
customer2|evenmoredata|otherdata|...
customer3|somedata|otherdata|...
customer3|areyoustillreadingthis?|otherdata|...
...

I often find myself searching for the data of a specific customer. For various reasons it is actually easier to grep for the customer in the dump file than to query the database. To this end I have defined a dedicated shell function:

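grc () { # GRep Customer
    # Keep the path local so it does not leak into the calling shell.
    local file="/path/to/dump.txt"
    # Show size and modification time as a hint to the age of the dump.
    ls -hl "$file"
    # Quote "$@" so patterns containing spaces or wildcards arrive intact.
    grep --color "$@" "$file"
}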

This function is obviously just a wrapper around grep with preset options. I call it like so: grc "customer1". The ls invocation prints the modification time of the file, which serves as a hint to the age of the dump.
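Note that a plain grc "customer1" also matches customer10 and any line that mentions customer1 in a later field. If that gets in the way, a variant that anchors the pattern to the first field could look like the sketch below. The name grc_exact and the GRC_DUMP override are assumptions made for this illustration, and the sketch assumes the customer name contains no regex metacharacters.

```shell
# Hypothetical variant of grc: match the customer only in the first field.
grc_exact() {
    # GRC_DUMP is an assumed override so the sketch can run on sample data.
    local file="${GRC_DUMP:-/path/to/dump.txt}"
    # In a basic regular expression, | is a literal character, so "^name|"
    # matches lines whose first field is exactly the given name.
    grep --color "^$1|" "$file"
}

# Sample data in the same layout as the dump (made up for this sketch).
GRC_DUMP=$(mktemp)
printf 'customer1|somedata\ncustomer10|otherdata\ncustomer2|customer1|x\n' > "$GRC_DUMP"

# Reports only the customer1|somedata line.
grc_exact customer1
```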

This could be achieved in other ways, too. The point is that it is very handy to type and always accessible from the command line (where I spend my day). This is not specific to grep but rather a common pattern: A couple of commands belong together? A couple of command line arguments are always used in conjunction? Make a shell function or alias and automate the tedious work away.