Converting all files to UTF-8 ?
Klaus Alexander Seistrup
kseistrup at gmail.com
Tue Dec 28 21:12:46 UTC 2004
On Tue, 28 Dec 2004 21:22:10 +0100, Vincent Trouilliez
<vincent.trouilliez at wanadoo.fr> wrote:
> Okay.
>
> So, now, how can I use iconv to convert all my file names ?
>
> 1) it doesn't look like iconv has a 'recursive' option to process
> automatically sub-folders.
Use find(1) to traverse current directory (.) recursively:
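    find . -type f -print \
    | while IFS= read -r oldName
      do
        # convert only the file's own name, not the directories leading to it
        dir="$(dirname "${oldName}")"
        base="$(basename "${oldName}")"
        # printf passes the name through unmodified (no stray quotes or escapes)
        newName="${dir}/$(printf '%s' "${base}" | iconv -f iso-8859-1 -t utf-8)"
        # pure-ASCII names come out unchanged, so there is nothing to rename
        [ "${oldName}" = "${newName}" ] || mv -- "${oldName}" "${newName}"
      done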
The expression above assumes that the input charset is iso-8859-1 and
that the output charset is utf-8.
However...,
> 2) it needs to know the original encoding format... how the hell do I
> know that ? Files are many thousands, over 10 year old and come from
> CDs, internet, MS-DOS, Win95, Win XP etc, they probably don't use the
> same format ! :-/
... the input format seems to be a problem?!
I assume some of the filenames needn't be converted. Those containing
ASCII chars only can be left as they are, since ASCII is encoded
identically in iso-8859-1 and utf-8. At least this saves some manual
work. You can use
    printf '%s' "${oldName}" \
    | iconv -f iso-8859-1 -t us-ascii >/dev/null 2>&1 \
    || echo "${oldName}"
to test whether a filename needs conversion, and
    printf '%s' "${oldName}" \
    | iconv -f iso-8859-1 -t us-ascii >/dev/null 2>&1 \
    && echo "${oldName}"
to test whether a filename needn't be converted.
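To get an overview before renaming anything, the two tests above could be wrapped up as a dry run along these lines. This is only a sketch: the function names needs_conversion and list_candidates are made up for illustration, and it still assumes that every non-ASCII name is a candidate, whatever its real encoding turns out to be.

```sh
# Hypothetical helper: succeeds (exit 0) when the name contains
# at least one non-ASCII byte, i.e. iconv cannot map it to us-ascii.
needs_conversion() {
    ! printf '%s' "$1" | iconv -f iso-8859-1 -t us-ascii >/dev/null 2>&1
}

# Dry run: print the files below a directory (default: .) whose
# names would need a closer look. Nothing is renamed.
list_candidates() {
    find "${1:-.}" -type f -print \
    | while IFS= read -r oldName
      do
        if needs_conversion "${oldName}"; then
            printf '%s\n' "${oldName}"
        fi
      done
}
```

Running list_candidates . > candidates.txt would give a list to inspect by hand before any mv is attempted.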
Or perhaps I just make the whole thing much more complicated than it is... ;-)
> Sure must be a solution, if UTF-8 really becomes the norm, then millions
> of people are facing the same problem as I am now. So there has to be a
> solution...
Another thing is . . . you wrote earlier that your filenames look like
"brÃ»lÃ©e" when they should look like "brûlée", right? But filenames
like "brÃ»lÃ©e" are "utf-8 encoded and viewed in iso-8859-1", i.e. the
names themselves may already be UTF-8 and only the display is wrong.
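If that is the case, it may be worth sorting the names into three piles before converting anything. The sketch below is one way to do it; classify_name is a made-up name, and "legacy" is only a label of mine for "neither ASCII nor valid UTF-8". It relies on iconv refusing input that is not valid in the declared source charset. Note that a short legacy-encoded name can happen to be valid UTF-8 by coincidence, so treat the result as a hint rather than proof.

```sh
# Hypothetical helper: guess how a file name is encoded.
# Prints one of: ascii, utf-8, legacy
classify_name() {
    if printf '%s' "$1" | iconv -f iso-8859-1 -t us-ascii >/dev/null 2>&1; then
        # only 7-bit characters: identical in all the charsets involved
        echo ascii
    elif printf '%s' "$1" | iconv -f utf-8 -t utf-8 >/dev/null 2>&1; then
        # decodes cleanly as UTF-8: most likely needs no conversion
        echo utf-8
    else
        # some 8-bit encoding (iso-8859-1, cp850, cp1252, ...)
        echo legacy
    fi
}
```

Only the names reported as legacy would then need the iconv treatment, and for those the source charset still has to be guessed.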
--
Klaus Alexander Seistrup
SubZeroNet · Copenhagen · Denmark