[Coco] Rainbow archives in DjVu

Jeff Teunissen deek at d2dc.net
Wed Mar 25 17:06:15 EDT 2009


Bill wrote:
> What would the qualifier be for converting pdf to djvu? 
> 
> I found a converter, and this is what I did: I copied a pdf file to the
> converter directory, and ran the converter. 
> 
> This is what I ended up with: 500.pdf=4,020,113   500.djvu=9,770,640
> 
> I don't know if there is a command in the exe file to shrink, if there is, I
> didn't see it.
> 
> And the converter I used  was pdf2djvu
> (http://www.softpedia.com/get/Office-tools/PDF/PDF2DjVu.shtml). I gotta tell
> ya, it was VERY SLOW.

pdf2djvu is not good for scanned documents, and neither is DjVuDigital.

pdf2djvu is designed to convert NORMAL PDF files to DjVu. That is, it takes
the "ASCII" text that is already in digital form out of the PDF and turns it
into a foreground image, then puts all of the graphic data into the new DjVu
file's background layer and shrinks that layer down to 50 dots per inch. With
regular PDF files, this works great. The problem comes when you try to use it
on a PDF of a scan.

Since a scanned document is all graphic data already and has no "ASCII" text,
all pdf2djvu can do is shrink the image and compress it. Because it has no way
to simplify the contents of the page, the resulting files both look poor and
are very large.
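
A rough way to see why an unsimplified scan stays large is to compare how well
a generic compressor handles a clean two-level page and the same page with
scanner-style noise added. This is only a sketch under loud assumptions: zlib
stands in for DjVu's real coders (which work differently), and the synthetic
page, the noise range, and all function names are made up for illustration.

```python
import random
import zlib


def make_clean_page(width=200, height=200):
    """Synthetic page: white paper (255) with black horizontal 'text' bars (0)."""
    page = []
    for y in range(height):
        ink_row = (y % 10) < 3  # 3 rows of "ink" out of every 10
        row = [0 if ink_row and 20 <= x < width - 20 else 255
               for x in range(width)]
        page.append(row)
    return page


def add_scanner_noise(page, amount=20, seed=1):
    """Jitter every pixel by up to +/- amount, clamped to 0..255 (hypothetical noise model)."""
    rng = random.Random(seed)
    return [[min(255, max(0, p + rng.randint(-amount, amount))) for p in row]
            for row in page]


def compressed_size(page):
    """Size in bytes of the page after zlib compression (a stand-in coder)."""
    raw = bytes(p for row in page for p in row)
    return len(zlib.compress(raw, 9))


clean = make_clean_page()
noisy = add_scanner_noise(clean)
print("clean page:", compressed_size(clean), "bytes")
print("noisy scan:", compressed_size(noisy), "bytes")
```

The noisy version should come out many times larger, even though both pages
carry the same printed content.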

Getting good quality and small files means finding a way to separate what was
printed on the page from everything else. That's what all my filters do, and
why my files are tiny and look pretty good (though it can probably be done
better).
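
As a minimal illustration of separating the printed marks from everything
else, the sketch below splits a grayscale page into a two-level ink mask and a
smoothed background using Otsu's threshold. It is NOT the filter chain
described above, whose details are not given here; the function names, the
tiny sample page, and the choice of Otsu's method are all assumptions made for
the example.

```python
def otsu_threshold(pixels):
    """Return the gray level t that best splits pixels into dark (<= t) and light (> t)."""
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_t, best_var = 0, -1.0
    w_dark, sum_dark = 0, 0
    for t in range(256):
        w_dark += hist[t]
        sum_dark += t * hist[t]
        if w_dark == 0:
            continue            # no dark pixels yet
        w_light = total - w_dark
        if w_light == 0:
            break               # everything is already in the dark class
        mean_dark = sum_dark / w_dark
        mean_light = (sum_all - sum_dark) / w_light
        between_var = w_dark * w_light * (mean_dark - mean_light) ** 2
        if between_var > best_var:
            best_var, best_t = between_var, t
    return best_t


def split_layers(page):
    """Split a grayscale page into (ink mask, background with the ink painted out)."""
    t = otsu_threshold([p for row in page for p in row])
    mask = [[1 if p <= t else 0 for p in row] for row in page]
    paper = [p for row in page for p in row if p > t]
    fill = sum(paper) // len(paper) if paper else 255  # average paper tone
    background = [[fill if m else p for p, m in zip(row, mrow)]
                  for row, mrow in zip(page, mask)]
    return mask, background


# Hypothetical 4x4 scan: light paper with a dark 2x2 blob of "ink".
page = [
    [230, 228, 231, 229],
    [232,  20,  25, 230],
    [229,  18,  22, 231],
    [231, 230, 229, 228],
]
mask, background = split_layers(page)
for row in mask:
    print(row)
```

The mask holds only the printed marks, which is the kind of two-level data
that compresses very well, while the background keeps just the paper tone and
can be stored at much lower resolution.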


