A better directory tree

Rob Griffiths has a great column in Macworld, but he often glosses over details, and in one recent case he admitted to not understanding exactly what he was proposing we use. One of his recent columns showed a single command line that displays a nicely indented folder tree in your terminal:

find . -type d | sed -e 1d -e 's/[^-][^\/]*\//--/g' -e 's/^/   /' -e 's/-/|-/'

Griffiths then admits that he does not understand the ‘sed’ command. So let’s have a close look at the entire thing. Understanding this stuff is actually more important than you might think: real power in manipulating text, even in your word processor, comes from understanding regular expressions.

The ‘find’ command is easily understood. ‘find’ requires a directory and optionally one or more filters to specify the files you want listed. In this case we are asking for all files of type ‘directory’.
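To see what ‘find’ hands over to ‘sed’, here is a small sketch that builds a throwaway folder tree and lists its directories. The folder names (‘docs’, ‘src/lib’) are made up for illustration, and the output goes through ‘sort’ only because ‘find’ does not promise any particular order.

```shell
# Build a scratch tree; the directory names are invented for this example.
tree_root=$(mktemp -d)
mkdir -p "$tree_root/docs" "$tree_root/src/lib"
touch "$tree_root/src/main.c"    # a plain file, which -type d should skip

# List directories only, relative to the scratch root.
dir_list=$(cd "$tree_root" && find . -type d | LC_ALL=C sort)
printf '%s\n' "$dir_list"

rm -rf "$tree_root"
```

The plain file never shows up, and the first line is the lone ‘.’ for the starting directory itself.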

Now onto the ‘sed’ command. ‘sed’ is the Unix stream editor. It takes an input text stream and carries out a number of commands on each line before sending it to standard out. In our line you will see a ‘-e’ followed by some stuff. Each of the things following a ‘-e’ is a command. The first, ‘1d’, says that any line numbered 1 should have the ‘d’ command done to it, and ‘d’ is the delete command, so we lose the first line (the lone ‘.’ that ‘find’ prints for the starting directory).
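A quick way to convince yourself about ‘1d’ is to feed ‘sed’ a few lines by hand. The three input lines below are invented stand-ins for what ‘find’ would print.

```shell
# Three made-up input lines; '1d' deletes line 1 and passes the rest through.
without_first=$(printf '%s\n' '.' './docs' './src' | sed -e 1d)
printf '%s\n' "$without_first"
```

Only ‘./docs’ and ‘./src’ come out the other end.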

Now we get down to some regular expression fu. If you want to read the manual on the regular expressions used by sed, a quick ‘man re_format’ (short for regular expression format) will give you an explanation. It is a fairly opaque explanation, though, so my advice is to pick up a copy of “Mastering Regular Expressions.”

Our command line has three substitute commands (‘s’ for substitute) one after the other. The standard substitute command finds a regular expression and replaces it with a string – s/expression/replacement/[options], though we can replace the ‘/’ character with anything we want so long as we use the same character all three times.
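Here is a minimal sketch of the delimiter swap, using a made-up path and a made-up replacement character (‘:’). Both commands do the same job; the second just avoids having to escape the ‘/’ we are looking for.

```shell
sample='./src/lib'   # invented path, just for the demonstration

# Default delimiter: the '/' we want to match must be written as '\/'.
with_slash=$(printf '%s\n' "$sample" | sed -e 's/\//:/')

# '#' as the delimiter: '/' is now an ordinary character in the expression.
with_hash=$(printf '%s\n' "$sample" | sed -e 's#/#:#')

printf '%s\n%s\n' "$with_slash" "$with_hash"
```

With no options after the last delimiter, each command changes only the first ‘/’ on the line, so both print ‘.:src/lib’.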

The first one is needlessly complex: we could change it to 's/[^/]*\//--/g' or even 's#[^/]*/#--#g' and it would do the same thing. So let’s go through it step by step. The brackets define a set of characters: we could say [abc] and that would match any one of ‘a’, ‘b’ or ‘c’. A ‘^’ straight after the opening bracket means ‘not’, so our first bracket means any character except a ‘-’. That is why it’s unnecessary: we haven’t added the ‘-’ characters yet, so there is no need to skip them. The second bracket pair means “not a ‘/’ character”. Note that in the original someone thought that because the command uses ‘/’ as the expression delimiter, the ‘/’ inside the brackets had to be written ‘\/’ to remove the special status of the character. In fact a ‘/’ inside brackets does not need escaping, at least in the BSD sed that ships with OS X and in GNU sed. The star means “repeat the previous item zero or more times”. So our regular expression means “match any run of characters up to and including a ‘/’”. Instead of having to write ‘\/’ at the end of our regular expression we can change the expression delimiter to another character; I’ve chosen ‘#’. We then replace each match with the contents of the second half, in our case a pair of minus signs. The ‘g’ at the end is short for ‘global’, which means the substitution is repeated along the entire line rather than applied only to the first match.
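As a sanity check, the sketch below runs the original expression and the two simplified spellings over one made-up path of the kind ‘find’ prints. It assumes a BSD or GNU sed, which accept a bare ‘/’ inside brackets.

```shell
sample='./src/lib'   # invented path of the kind 'find' prints

# The original expression, with its extra [^-] and escaped slashes.
original=$(printf '%s\n' "$sample" | sed -e 's/[^-][^\/]*\//--/g')

# Without the first bracket, still using '/' as the delimiter.
escaped=$(printf '%s\n' "$sample" | sed -e 's/[^/]*\//--/g')

# With '#' as the delimiter, so no backslash is needed at all.
hashed=$(printf '%s\n' "$sample" | sed -e 's#[^/]*/#--#g')

printf '%s\n' "$original" "$escaped" "$hashed"
```

Each ‘something/’ chunk collapses to ‘--’, so all three print ‘----lib’. One small caveat: for a folder whose name itself starts with ‘-’, the original’s [^-] leaves that leading hyphen in place, so the outputs would differ by one character there.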

Now our second substitution. Here’s why we love regular expressions: in our first substitution the ‘^’ meant ‘not’, but in this case it is outside a pair of brackets, so it indicates the start of the line. So we are finding the start of the line and inserting three spaces, basically so it looks prettier.

Our final substitution finds ‘-’ and replaces it with ‘|-’ so we get a vertical bar down the left of our list. Notice that since there is no ‘g’ at the end it only hits the first ‘-’ character. You might also notice that the last two substitutions both add something to the start of the output line, so we might be able to combine them. Yes, we can: right at the start of the line insert three spaces and a bar, and that should work.
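To check that the combined version really matches the original pair, we can run both on a line as it looks after the first substitution. The input ‘--src’ is a made-up example of such a line.

```shell
line='--src'   # what the first substitution leaves for a folder './src'

# Original pair: indent by three spaces, then turn the first '-' into '|-'.
two_step=$(printf '%s\n' "$line" | sed -e 's/^/   /' -e 's/-/|-/')

# Combined: insert three spaces and a bar at the start of the line.
one_step=$(printf '%s\n' "$line" | sed -e 's/^/   |/')

printf '%s\n%s\n' "$two_step" "$one_step"
```

Both print three spaces followed by ‘|--src’, so nothing is lost by merging the two commands.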

So we actually have a much shorter expression that we understand. Here it is:

find . -type d | sed -e 1d -e 's#[^/]*/#--#g' -e 's/^/   |/'
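If you would like to try the shortened command without pointing it at your real files, here is a sketch that runs it over a scratch tree. The folder names are invented, and the ‘sort’ step is my addition so the listing comes out in a predictable order (and so the ‘.’ line is reliably first for ‘1d’ to remove).

```shell
# Scratch tree with made-up folder names.
demo_root=$(mktemp -d)
mkdir -p "$demo_root/docs" "$demo_root/src/lib"

# The shortened pipeline, with a sort added for stable ordering.
tree_view=$(cd "$demo_root" && find . -type d | LC_ALL=C sort |
  sed -e 1d -e 's#[^/]*/#--#g' -e 's/^/   |/')
printf '%s\n' "$tree_view"

rm -rf "$demo_root"
```

Each level of nesting adds another ‘--’, so ‘lib’ inside ‘src’ is pushed further right than its parent.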
