20100509

findDuplicates.pl

I have a big collection of wallpapers. My siblings each have a copy of it and keep adding new ones to the mix. The problem is that they sometimes move files around into their own folders, which makes the copies really hard to sync.

So I was looking around on the internet and found hardlinkpy. It finds duplicate files on your file system and lets you hardlink them together. While nifty, it wasn't quite what I wanted, but it inspired me. It looked easier to write my own than to try to modify that huge Python script (I don't know much Python), so the end result is findDuplicates. It goes through a set of directories, compares files of identical size to see whether their contents are the same, and then either lists the duplicates or deletes them for you.
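For anyone curious about the general idea, here is a minimal sketch of the "same size first, then compare contents" approach. It is not findDuplicates.pl itself: it is written in Python purely for illustration, and the function names `find_duplicates` and `delete_duplicates` are made up for this example. It assumes the first copy encountered is the one to keep, which may not match what the real script does.

```python
import filecmp
import os
from collections import defaultdict


def find_duplicates(directories):
    """Return (kept, duplicate) path pairs for byte-identical files.

    Illustrative sketch only; the "keep the first copy seen" rule is an
    assumption, not necessarily how findDuplicates.pl behaves.
    """
    # Step 1: bucket every regular file by its size. Files of different
    # sizes can never be identical, so they are never compared.
    by_size = defaultdict(list)
    seen = set()  # guards against listing the same file twice
    for top in directories:
        for dirpath, dirnames, filenames in os.walk(top):
            dirnames.sort()  # walk subdirectories in a stable order
            for name in sorted(filenames):
                path = os.path.join(dirpath, name)
                real = os.path.realpath(path)
                if os.path.islink(path) or not os.path.isfile(path):
                    continue
                if real in seen:
                    continue
                seen.add(real)
                by_size[os.path.getsize(path)].append(path)

    # Step 2: inside each size bucket, compare actual file contents.
    pairs = []
    for paths in by_size.values():
        unique = []  # one representative per distinct content
        for path in paths:
            for kept in unique:
                # shallow=False forces a byte-by-byte comparison
                if filecmp.cmp(kept, path, shallow=False):
                    pairs.append((kept, path))
                    break
            else:
                unique.append(path)
    return pairs


def delete_duplicates(pairs):
    """Delete the second file of every (kept, duplicate) pair."""
    for _kept, duplicate in pairs:
        os.remove(duplicate)
```

Calling `find_duplicates(["mine", "theirs"])` (two hypothetical folder names) only lists the pairs, so you can look them over before passing the result to `delete_duplicates`. Checking sizes first keeps the expensive content comparison limited to files that could actually match.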

Now I can copy someone else's entire folder tree, remove from my own copy whatever it duplicates, and then merge the two without worrying about ending up with the same file twice. (I was going to write a merge mode, but that part proved easier to handle by hand.)

For anyone who's got a similar need, enjoy!
