How to format text for normal reading
user1
bqz69 at telia.com
Sat Nov 6 08:58:43 UTC 2010
I tried to do this:
for file in *.html; do html2text -o "${file%.*}.txt" "$file" ; done
I found it here:
http://commandline.org.uk/command-line/converting-html-to-text/
That works fine, but when I then cat all the individual text files into
one big text file, I still need to format that big file to make it readable.
So my problem is not really an HTML problem, but how to make any badly
formatted text file readable.
That is, I want each paragraph to stand out, with full lines ending in a
full stop, and with strange characters and character sequences removed.
Here are 3 examples of characters I want removed:
� "
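
To make the goal concrete, here is a rough sketch of the kind of cleanup I
have in mind. It is not a tested solution. It assumes the big file is meant
to be UTF-8, that the stray character is the Unicode replacement character
(U+FFFD), and that 72 columns is an acceptable line width. The function name
clean_text is made up for illustration.

```shell
#!/bin/sh
# clean_text: hypothetical helper (the name is an assumption) that reads
# text on stdin and writes a cleaned, refilled version to stdout.
clean_text() {
    # U+FFFD, the replacement character, written as its three UTF-8 bytes.
    bad=$(printf '\357\277\275')

    # 1. iconv -c drops byte sequences that are not valid UTF-8.
    # 2. sed deletes every literal replacement character.
    # 3. fmt refills each blank-line-separated paragraph to 72 columns.
    iconv -c -f UTF-8 -t UTF-8 |
        LC_ALL=C sed "s/$bad//g" |
        fmt -w 72
}
```

It could be used as: clean_text < big.txt > big-clean.txt. Note that fmt
only joins and rewraps lines within a paragraph. It does not add missing
full stops, and other unwanted character sequences would each need their own
sed expression.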