Walking a directory tree with bash

I'm doing the hard-drive shuffle thing. I have a lot of data and I'm paranoid about losing it. I've been let down once or twice by bad copies so I thought I should take checksums before copying. I had a lot of fun arguing with wildcards and string escaping so I thought I'd share my adventure, as I've already worked out how to do this and forgotten at least once.

My first attempt. It dies when fed directories.

[code lang=bash]md5sum * > md5sums.txt
[/code]

I try again, and it occurs to me to use tee in append mode so I get output on the screen as well.

[code lang=bash]for foo in `ls -R`; do
md5sum $foo | tee -a md5sums.txt
done[/code]

This doesn't work; md5sum complains that it can't find the files. I have a look at the output of ls -R:

Quote:
some.file
another.file

Directory1/
Directory2/

Directory1:
inDirectory1.file
inDirectory1again.file

Directory2:
inDirectory2.file
inDirectory2again.file

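The files inside each directory are listed by bare name, without the directory in front, so md5sum has no way of finding them. I RTFM and can't find an option to list full paths. I consider using find.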

[code lang=bash]for foo in `find -type f`; do
md5sum $foo | tee -a md5sums.txt
done[/code]

This fails when anything has a space in the filename, because the shell splits the output of find on whitespace. I RTFM on find. find -ls does the escaping I need but also prints all the extra information that ls -l spits out. That's not too handy, and I don't feel like resorting to sed or awk. Further through the fine manual I find:

man find wrote:
-exec command ;
Execute command; true if 0 status is returned. All following arguments to find are taken to be
arguments to the command until an argument consisting of ‘;’ is encountered. The string ‘{}’ is
replaced by the current file name being processed everywhere it occurs in the arguments to the
command

To cut a long story short, that's not quite the whole truth, as the shell works its magic on {} and ; before find ever sees them, so we need to add quotes and a backslash.

[code lang=bash]find -type f -exec md5sum '{}' \; | tee md5sums.txt[/code]
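
For future me, here is a minimal sketch of the same command wrapped in a function, plus a variant that ends -exec with + instead of \; so that find hands many files to each md5sum run. The function names checksum_tree and checksum_tree_batched are made up for illustration, and the sketch assumes md5sum from GNU coreutils.

```shell
#!/usr/bin/env bash
# Hypothetical helpers, not part of the original one-liner.

# Checksum every regular file under a directory (default: current one).
# '{}' and \; are protected from the shell, so find receives them intact.
# find passes each path to md5sum as a single argument, so spaces in
# file names cause no word splitting.
checksum_tree() {
    local dir="${1:-.}"
    find "$dir" -type f -exec md5sum '{}' \;
}

# Variant: '+' instead of \; lets find pass many paths to each md5sum
# invocation, so far fewer processes are started on a large tree.
checksum_tree_batched() {
    local dir="${1:-.}"
    find "$dir" -type f -exec md5sum '{}' +
}
```

Used as checksum_tree . | tee md5sums.txt it behaves like the one-liner above. One thing to watch: if md5sums.txt lives inside the tree being walked, find may pick it up while it is still being written, so it is probably safer to write the list somewhere else.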

Of course, as I watch the damn thing run, I start wondering how to perform the md5sums in parallel to get it done faster. I start the same command on the copy of the files and notice that find lists the files in a different order, so I'm going to need to apply sort. Whilst I'm at it, I should probably use the list to strip out any of the inevitable duplicate files I have kicking around.
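
A rough sketch of the sort-and-compare idea, leaving the parallel part alone. The function names manifest, same_tree and duplicates are invented for this example, and the -w and -D options to uniq are GNU extensions, so this assumes GNU coreutils. File names containing newlines or backslashes get escaped by md5sum and would need extra care.

```shell
#!/usr/bin/env bash
# Hypothetical helpers, not from the original post.

# Print "checksum  ./relative/path" for every file under $1, sorted by path.
# Running find from inside the tree keeps the paths relative, so the
# original and the copy produce comparable lists.
manifest() {
    ( cd "$1" && find . -type f -exec md5sum '{}' + | LC_ALL=C sort -k 2 )
}

# Succeed only if both trees hold the same paths with the same checksums.
same_tree() {
    [ "$(manifest "$1")" = "$(manifest "$2")" ]
}

# Read a manifest on stdin and print every line whose checksum occurs
# more than once. An MD5 digest is 32 hex characters, so uniq compares
# only that prefix (-w 32) and prints all members of each group (-D).
duplicates() {
    LC_ALL=C sort | uniq -w 32 -D
}
```

Something like same_tree /path/to/original /path/to/copy && echo "copy is good" would then check the copy, and manifest /path/to/original | duplicates would list the candidates for deletion. Both paths here are placeholders.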

Or I could have just installed md5deep.

Creative Commons License
Except where otherwise noted, work is licensed under a Creative Commons Licence and is the work and opinion of the credited author(s).
