Weeding out Duplicate files in Hardy Heron 8.04
Rafiq Hajat
ipi.malawi at gmail.com
Wed Jan 20 14:18:27 UTC 2010
If your files are in a directory /home/dir/mm, you can try the
following:
find /home/dir/mm -type f -print0 | xargs -0 md5sum > /tmp/files
(all of it has to be typed on a single line) and then
sort /tmp/files | uniq -w 32 -D > /tmp/dupes
(all of it has to be typed on a single line). /tmp/dupes now holds
the list of duplicate files, each prefixed with its MD5 sum. You
can look at the list using
less /tmp/dupes
(type q to quit). If the duplicate list is not too big, you can
remove all but one of each "duplicate" by hand (that way you can also
visually check whether the names look like real duplicates). If you
want something more automatic, you can try the following:
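# /tmp/dupes is sorted, so identical checksums sit on adjacent lines
sum=z; while read -r md name;
do
  if [ "$md" = "$sum" ]; then
    rm -- "$name"    # same checksum as the previous line: delete this copy
  else
    sum="$md"        # first file with this checksum: keep it
  fi
done < /tmp/dupes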
Warning: this erases files automatically, so it may be dangerous.
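Given that warning, a dry run first may be worthwhile. The following is only a sketch: list_removals is a made-up helper name, not a standard command, and it assumes the listing was produced by the sort/uniq step above, so that equal checksums are on adjacent lines. It deletes nothing; it prints what the loop would remove and uses cmp to confirm that each candidate really is byte-for-byte identical to the copy that would be kept. As with the loop, file names that begin or end with spaces are not handled.

```shell
# Dry run: report what would be removed, without removing anything.
# list_removals is a hypothetical helper; $1 is a sorted listing
# in md5sum format, such as /tmp/dupes.
list_removals() {
    prev=z      # checksum seen on the previous line
    keep=       # first file seen with that checksum (the one to keep)
    while read -r md name; do
        if [ "$md" = "$prev" ]; then
            # Same checksum as the kept file: compare the contents too
            if cmp -s -- "$keep" "$name"; then
                printf 'would remove: %s (same as %s)\n' "$name" "$keep"
            else
                printf 'checksum match but contents differ: %s\n' "$name"
            fi
        else
            prev=$md
            keep=$name
        fi
    done < "$1"
}
```

Calling it as list_removals /tmp/dupes and reading through the output before running the real loop gives a chance to spot anything unexpected.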
Hope this helps,
Loïc
Wow....this is pretty far out! I'm gonna give it a shot and see how it
turns out.
Thanks Loic
Rafiq Hajat
Executive Director
The Institute for Policy Interaction (IPI)
P. O. Box E14,
Post Dot Net - Blantyre
Malawi
Mobile: +265 999 968800
www.ipimalawi.org