removing duplicate MP3s

Tony Baldwin photodharma at gmail.com
Thu Jun 11 01:28:26 UTC 2009


David Fox wrote:
> On Wed, Jun 10, 2009 at 2:45 PM, admin2<admin2 at enabled.com> wrote:
>> Hi list,
>>
>> I am looking for an automated application that can look at MP3s and
>> remove the duplicate file names that could spread across multiple
>> directories.  Is there such an animal?
> 
> Not sure. fdupes has been mentioned, but to really be effective said
> tool would have to be able to look at the mp3 metadata and derive the
> song that way, and keep track of what it finds in a buffer.
> 
> For instance, you might have:
> 
> 01-ReallyCoolSong.mp3
> 
> and
> 
> ReallyCoolSong.mp3
> 
> From a filesystem standpoint those two files are different, but they
> could possibly be the same song from the same ReallyCoolAlbum (tm).
> They might even be encoded at different bit rates, so going by file
> sizes won't be accurate. But the tags metadata would reveal the song
> title and show one of the files as a duplicate of the other.
> 

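This will find all duplicate files (files with identical content, going
by md5sum, whatever they are named), and will output a script with a
list of them all for removal. It won't catch the same song encoded at
different bit rates, as David describes.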

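#!/bin/bash
# Collect files with the same MD5 sum under the directories given as
# arguments (GNU find defaults to the current dir) into a removal script.
OUTF=~/rem-dupes.sh
echo "#! /bin/sh" > "$OUTF"
find "$@" -type f -exec md5sum {} + |
     sort | uniq -w 32 --all-repeated=separate |
     sed -r 's/^[0-9a-f]*( )*//;s/([^a-zA-Z0-9./_-])/\\\1/g;s/(.+)/#rm \1/' >> "$OUTF"
chmod a+x "$OUTF"; ls -l "$OUTF"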

The thing is, it will list ALL copies in each set of duplicates,
including the one you want to keep. So the script comes out with every
line commented out, like:

#! /bin/sh

#rm /path/to/somesong.mp3
#rm /path/to/otherdir/somesong.01.mp3
# (same file, different dir and slightly different name...but the same song)

So you just go through, uncomment the ones you want to remove, and run
the script.  Of course, you could instead strip the # from every rm
line, then go in and delete or re-comment the ones you don't want
removed.  Just don't run it with everything uncommented, or no copy
survives.
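
If the list is long, hand-editing gets tedious. As a rough sketch only
(keep_first is a name I made up, and it assumes the groups in
rem-dupes.sh are separated by blank lines, which is what
uniq --all-repeated=separate gives you), something like this could leave
the first file of each group commented and uncomment the rest:

```shell
#!/bin/bash
# Hypothetical helper, not part of the script above.
# Prints a copy of the generated script in which the first "#rm" line of
# each blank-line-separated group stays commented (that file is kept) and
# every later "#rm" line in the group is uncommented (that file is removed).
keep_first() {
    awk '
        /^$/    { seen = 0; print; next }              # blank line: a new group follows
        /^#rm / { if (seen++) sub(/^#/, ""); print; next }
                { print }                              # shebang and anything else
    ' "$1"
}
```

Usage would be along the lines of
keep_first ~/rem-dupes.sh > ~/rem-dupes-auto.sh
Which copy counts as "first" is just sort order, so still read the
result through before running it.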

cd to the dir you want to work in first, or pass the directories as
arguments to the script.
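
Note that the script above matches on content, not on names. If what
you are after is the same file name turning up in several directories
(which is what the original question asked), a sketch along these lines
might do. It also strips a leading track number such as "01-" so that
David's two example names are treated alike. The function name
same_names and the normalisation rule are my own assumptions, and it
only lists candidates; it removes nothing.

```shell
#!/bin/bash
# Hypothetical helper: list .mp3 files whose base names match after
# lower-casing and dropping a leading track number ("01-", "01_", "01 ").
# Assumes file names contain no newlines.
same_names() {
    find "${@:-.}" -type f -iname '*.mp3' -print |
    awk -F/ '
        {
            name = tolower($NF)              # base name, case-insensitive
            sub(/^[0-9]+[-_ ]+/, "", name)   # drop a leading track number
            count[name]++
            paths[name] = paths[name] "\n  " $0
        }
        END {
            # print only the names seen more than once, with their paths
            for (n in count)
                if (count[n] > 1) print n ":" paths[n]
        }
    '
}
```

For example, same_names ~/Music would print each shared name followed
by the indented paths that carry it. A name match is only a hint that
two files are the same song, so check before deleting anything.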

Alternatively, you could use fslint if you want a GUI tool.
It can find duplicate files and remove them, find bad links and other
"file system lint", and clean it all up.
Run apt-cache show fslint for more info.

/tony

-- 
http://www.baldwinsoftware.com
free/open source software
tcl yer os with a feather...



