[ubuntu-us-nc] PDF conversion

Jeff Lane jeffrey.lane at canonical.com
Tue Jun 22 16:06:36 BST 2010


On Tue, 2010-06-22 at 00:05 -0400, J Mark Cox wrote:
> On Mon, 2010-06-21 at 19:42 -0400, Jeff Lane wrote:
> > Instead of all these fancy pointy-clickey ways, try one of the tools
> > from poppler-utils (should be installed by default in 10.04, or at least
> > I don't remember ever installing them).
> > 
> > For example:
> > 
> > pdftotext - converts pdf files to text files
> > pdftohtml - converts pdf files to HTML files
> > pdftops - converts pdf to PostScript
> > pdftoabw - converts pdf to AbiWord format
> > 
> > http://poppler.freedesktop.org/
> > 
> > And it's all shell, so you can script it to run against all the PDFs you
> > have...
> > 
> > 
> Awesome! Now I have to go edit those tax form pdf files I have been
> wanting to "slightly" modify. Oh my, coffee...
> 
> Make checks payable to:
> Send checks to: 
> 
> Just kidding, but looks like some intriguing possibilities
> nonetheless.

Yeah, they are neat little tools.  I've had them for a while and use
them on occasion, but I don't know if they're part of the default Lucid
install or not.  I always thought they were part of Xpdf, until the
other day, to be honest :)  I never installed poppler myself, so I am
left to guess that it was either a default package, or was a dependency
of something else I installed.

In any case, they work pretty well, though not always.  Some PDFs have
mangled data or weird fonts or characters that don't get converted
properly, so it's still a good idea to actually look over the converted
files before pushing them anywhere important... just like when using
OCR...
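
If you do script it against a pile of PDFs, a minimal sketch along these
lines would flag the files worth a second look.  It assumes pdftotext is
on your PATH and is called as "pdftotext input.pdf output.txt"; the
function name convert_pdfs and the OK/EMPTY/FAILED labels are made up
for illustration and are not anything poppler-utils provides.

```shell
#!/bin/sh
# Sketch: convert every PDF in a directory to text and report which
# results need a manual look. convert_pdfs is a hypothetical helper name.
convert_pdfs() {
    dir=${1:-.}                          # default to the current directory
    for pdf in "$dir"/*.pdf; do
        [ -e "$pdf" ] || continue        # glob matched nothing
        txt="${pdf%.pdf}.txt"            # foo.pdf -> foo.txt
        if ! pdftotext "$pdf" "$txt"; then
            echo "FAILED: $pdf"          # converter returned an error
        elif ! grep -q '[^[:space:]]' "$txt"; then
            echo "EMPTY: $pdf"           # only whitespace or form feeds came out
        else
            echo "OK: $pdf"              # still worth skimming by eye
        fi
    done
}
```

After sourcing that, something like "convert_pdfs ~/pdfs" prints one line
per PDF.  An OK only means some text came out, not that the text is
right, so the files still need reading over.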

Cheers
Jeff

BTW: I've been meaning to ask this for a while, you aren't the Mark Cox
of Red Hat fame, are you?

-- 
Jeff Lane <jeffrey at canonical.com> 
Ubuntu Ham: W4KDH
IRC: bladernr or bladernr_
gpg: 1024D/3A14B2DD 8C88 B076 0DD7 B404 1417  C466 4ABD 3635 3A14 B2DD