Wednesday, June 20, 2007

Counting the number of files in all subdirectories with a one-line Bash script

I tend to forget the scripts I only use from time to time, so I decided to back them up somewhere easily accessible, where they might also be useful to someone who needs the same thing.

Enough talking already. This time I needed to count the PDF files in all my directories, listed directory by directory. I'm currently working under Linux, so a one-line Bash script seemed the ideal solution. Here's what I came up with:

for x in `ls -d */ . | tr ' ' '*' ` ; do x=`echo "$x" | tr '*' ' '` ; find "$x" | grep -i "\.pdf$" | wc -l | xargs -i/// echo /// files in $x ; done



Here is some sample output from running it:
1069 files in .
4 files in App Help/
1 files in Books/
73 files in Lectures/
12 files in Personal/
979 files in Work/


Some explanations:
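ls -d */ . - lists all subdirectories of your current directory, plus the current directory itself (the "."). Hidden directories are skipped; if you want to include them, add ".*/" (as found in the comments of this post). Be aware that, depending on your shell and its settings, ".*/" may also match "./" and "../".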

tr ' ' '*' - replaces all spaces with * so that the "for" loop does not split directory names containing spaces into separate words. The name is restored by the second "tr" inside the loop. I used the symbol * because it is very rare for someone to put it in the name of a directory, but who knows: a name that really contains * would come back with a space in its place.
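As a possible alternative to the tr trick, here is a minimal sketch that lets the shell glob hand the names to the loop directly. It assumes Bash; the function name list_dirs is made up for illustration and is not part of the original one-liner.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print each directory the loop would visit.
list_dirs() {
    local dir
    # The glob expands to one word per directory, so names containing
    # spaces (or a literal '*') stay intact as long as "$dir" is quoted.
    for dir in . */; do
        [ -d "$dir" ] || continue   # skips the literal '*/' when nothing matches
        printf '%s\n' "$dir"
    done
}

list_dirs
```

Because no text is substituted and restored, there is no placeholder character that could clash with a real directory name.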

find "$x" - prints the path of every file and directory under the given directory to the standard output, which is piped into:

grep -i "\.pdf$" - which keeps only the paths ending with .pdf (case insensitive).
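To see what this filter lets through, here is a small sketch run against a few made-up paths. The function sample_paths and the file names in it are purely illustrative stand-ins for the output of find.

```bash
#!/usr/bin/env bash
# Hypothetical sample data standing in for the output of find.
sample_paths() {
    printf '%s\n' \
        'Work/report.pdf' \
        'Work/SLIDES.PDF' \
        'Work/pdf-notes.txt' \
        'Work/draft.pdf.bak'
}

# Keep only the paths whose last four characters are ".pdf", in any case.
sample_paths | grep -i '\.pdf$'
```

Only the first two paths are printed: the backslash makes the dot literal and the $ anchors the match to the end of the line, so "pdf-notes.txt" and "draft.pdf.bak" are dropped.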

wc -l - counts the number of lines (i.e. the number of files); the result is passed to xargs, which is used to format the output.

xargs -i/// echo /// files in $x - the "-i" option specifies the string that will be replaced with the text read from the standard input (i.e. the number of files). I'm using ///, a string that cannot appear in a file name, so it cannot clash with the directory name in $x and we get nicely formatted output. Note that the GNU xargs documentation marks "-i" as deprecated in favour of "-I".
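For comparison, here is a rough sketch of the same count written without ls, tr, grep or xargs. It is not the one-liner above but a variation under some assumptions: Bash, a find that supports -iname (GNU and BSD find do), and no newlines in file names. The function name count_pdfs is invented for this example.

```bash
#!/usr/bin/env bash
# Hypothetical helper: count the PDF files under the current directory
# and under each of its non-hidden subdirectories.
count_pdfs() {
    local dir count
    for dir in . */; do
        [ -d "$dir" ] || continue   # skips the literal '*/' when nothing matches
        # -type f ignores directories; -iname matches .pdf, .PDF, and so on.
        # wc -l counts one line per path, so names with newlines would miscount.
        count=$(find "$dir" -type f -iname '*.pdf' | wc -l)
        printf '%d files in %s\n' "$count" "$dir"
    done
}

count_pdfs
```

The output has the same "N files in DIR" shape as the sample above. Unlike the grep version, this one only counts regular files, so a directory that happens to be named something.pdf is not included.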


OK, that's it for now. I hope someone finds this one useful.

Alex
