A Cheap Way To Create PDFs.

Started by Galfraxas, March 02, 2002, 06:29:03 AM

Galfraxas

Okay, some of you guys may have seen this on UncleBear.com, but there's this rather cool utility out there that allows people to create PDF files for free, and it's aptly named FreePDF. It's available at http://www.webxd.com/zipguy/freepdf.htm, and in my opinion, it rocks. Sure, it does require two other little bits of software (one is 1.7MB and the other is something like 8MB, not too shabby), and it takes a while to set up, but it works great: you can use any document creation program to instantly create a PDF file.

I've tested it on several programs, including Character Generators for several mainstream games, and it's only failed to provide adequate results on two occasions. First, I have yet to figure out how to get it to work with Notepad. Second, if you try to convert an HTML file with it, it will, because it uses a redirected printer port, add the date/time/title onto the top and bottom of the document, just like printing from IE does.

I've just been goofing around with it, and I've converted two of my games to PDF with it, and it seems to do great. I don't think I'll be needing to buy Acrobat now that I have this. Hope this helps somebody. (Oh, and if you're wondering, I am not the guy who made the program. I just like it so much I had to share.)

Tim Boser (AKA Galfraxas)
Imagination is the key to inner peace. Do you know which door it lies behind?

James V. West

Another program that creates PDFs is HTMLdoc from Easy Software. It's also free, and is super fast to get and set up. Unfortunately, I have never had good results with it. The only reason I mention it here is because I'm not a grandmaster at this stuff, so I might just be suffering from a lack of expertise. The program might be great.

You should find it here:

http://www.easysw.com

Thanks for mentioning FreePDF. And anyone who quotes The Great Weirdo in his signature must be onto something fantastic.

Zak Arntson

For an example of HTMLDoc output, you can check out my Adventures in Space .pdf file over at www.anvilwerks.com (it's in the Gaming Bookshelf)

I found it's frustrating to do any fancy layout, but tables work fine.

Matt Machell

HTMLdoc seems to produce quite good output, and it is amazingly easy to install and use. It took me hardly any time to put together a PDF for Lost Gods.

Cheers for the recommendation.


Matt

EvilIdler

If FreePDF can take Postscript as input, you can
do this neat trick:

1. Set up a printer driver for one of the better HP
LaserJet printers (the higher the DPI, the better
the result).
2. Load document into Word.
3. Print to file through this fake printer. Name the
file something.ps.

This is the way I get perfect conversion from .doc
files before I process it further with Linux tools,
anyway.
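
A minimal sketch of the "process it further" step, assuming the Linux tool in question is Ghostscript's ps2pdf wrapper (it comes up later in this thread) and that it is on the PATH. The function names are made up for illustration.

```python
import shutil
import subprocess
from pathlib import Path


def ps2pdf_command(ps_path: str) -> list[str]:
    """Build the ps2pdf call that turns something.ps into something.pdf."""
    src = Path(ps_path)
    return ["ps2pdf", str(src), str(src.with_suffix(".pdf"))]


def convert(ps_path: str) -> None:
    """Run the conversion, but only if ps2pdf is actually installed."""
    if shutil.which("ps2pdf") is None:
        raise RuntimeError("ps2pdf not found; install Ghostscript first")
    subprocess.run(ps2pdf_command(ps_path), check=True)
```

Calling convert("something.ps") on the file printed in step 3 should leave something.pdf next to it.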
O- EvilIdler

Clay

Both of these solutions sound pretty good for getting a quick PDF out.  I'm concerned though that you're using PDF like a printed page.  That might be sufficient for your needs, but we've discussed in other threads the fact that PDF can do a lot more than the printed page.  Hyperlinking, the PDF table of contents, and thumbnails all come to mind.

What free solutions are out there to do this?  I've mentioned LaTeX in the past, which is great but doesn't lend itself to visual composition.  Have people tried SGML processors like Jade?  Some word processors, especially WordPerfect, will output SGML that could be run through Jade.

One of the benefits of the SGML tools that I saw was the ability to produce multiple output formats from the same source. In addition to PDF, you could generate RTF, HTML, and PostScript.
Clay Dowling
RPG-Campaign.com - Online Campaign Planning and Management

Reimer Behrends

Quote from: Galfraxas
Okay, some of you guys may have seen this on UncleBear.com, but there's this rather cool utility out there that allows people to create PDF files for free, and it's aptly named FreePDF.

It should be noted that the underlying engine of FreePDF is Ghostscript. Ghostscript is an extremely mature PostScript processor, but there are a couple of bumps in the road when you're trying to generate PDF from it. Most importantly, depending on the PostScript output, fonts tend to get embedded multiple times, increasing the size of the PDF significantly (check File/Document Info/Fonts => List All Fonts to see if it actually happens). Also, there's a problem with the so-called makepattern primitive in PostScript, which leads to much larger PDFs than when using Distiller (happens when you have areas filled with bitmaps).

But if you know how to massage PostScript to bypass those problems, Ghostscript is a pretty good solution for generating PDF output.
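
For anyone without Acrobat's font list at hand, here is a rough sketch of the same duplicate-font check in Python. It assumes the PDF stores its font descriptors as plain, uncompressed objects (newer PDFs that pack objects into compressed streams would hide them), and the function names are made up for illustration.

```python
import re
from collections import Counter

# Subset fonts carry a six-letter tag and a plus sign, e.g. /ABCDEF+Times-Roman.
# The optional group strips that tag so copies of one font compare equal.
FONT_NAME = re.compile(rb"/FontName\s*/(?:[A-Z]{6}\+)?([^\s/<>\[\]()]+)")


def embedded_font_counts(pdf_bytes: bytes) -> Counter:
    """Count how often each base font name appears in font descriptors."""
    return Counter(
        m.group(1).decode("latin-1") for m in FONT_NAME.finditer(pdf_bytes)
    )


def duplicated_fonts(pdf_bytes: bytes) -> dict:
    """Return only the fonts that show up more than once."""
    counts = embedded_font_counts(pdf_bytes)
    return {name: n for name, n in counts.items() if n > 1}
```

Read the file in binary mode (open(path, "rb").read()) and pass the bytes in. A font that is legitimately embedded in several styles under one name would also be flagged, so treat the result as a hint rather than proof.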

-- Reimer Behrends

EvilIdler

ps2pdf (seems to be part of Debian Linux' version
of Ghostscript et al, or possibly the teTeX/LaTeX
packages) seems to do a good job of filtering out
duplicate fonts. I haven't figured out if there is any
support for internal PDF compression in those, but
the usual archive formats do a better job.

Compression should make multiple versions of the
same font less harmful, in any case.
O- EvilIdler

Reimer Behrends

Quote from: Clay
Both of these solutions sound pretty good for getting a quick PDF out.  I'm concerned though that you're using PDF like a printed page.  That might be sufficient for your needs, but we've discussed in other threads the fact that PDF can do a lot more than the printed page.  Hyperlinking, the PDF table of contents, and thumbnails all come to mind.

Okay, a quick primer on PostScript/PDF.

PostScript is essentially a programming language to describe a page. It has commands to display lines, curves, areas, text, bitmaps, but also stuff like conditionals, loops, subroutines. When generating PostScript output, a word processor/DTP program writes a PostScript program that, when run, displays the page as you see it on the screen (well, ideally).

PDF is more or less PostScript without the programming language features (loops and subroutines) but with an extended set of graphics primitives. That's actually a bit of an oversimplification, but it works for now. (If you generate uncompressed PDF, you can actually view it as a text file.)

Distiller works by interpreting a PostScript program like a PostScript printer does, but instead of rendering the graphics on a printed page, it writes out the limited set of graphics primitives as a PDF file.

Now, PDF has some features (such as hyperlinks) that aren't present in PostScript as such. But you can insert a fake command in PostScript that a PostScript printer will just ignore. Distiller, on the other hand, will understand it and create hyperlinks, etc. This command is called 'pdfmark'. A word processor or DTP program that wishes to generate clickable links has to know about 'pdfmark' commands and to actively generate them. If it can, good, then most PDF generators (including Distiller, but also Ghostscript, PStill, etc.) should give you what you want.

But if your word processor isn't clever enough to do this, you still can do it -- it just requires a bit more effort. Check Thomas Merz's pdfmark primer for how to do it in a normal program. Just be warned that it is fairly technical.
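
To give a feel for what such a command looks like, here is a small Python sketch that builds a link annotation in the form described in Adobe's pdfmark reference. The helper names and the rectangle coordinates are made up for illustration; the text it produces would have to be pasted into the page's PostScript at the right spot.

```python
def ps_string(text: str) -> str:
    """Escape backslashes and parentheses for a PostScript (string) literal."""
    for ch in ("\\", "(", ")"):
        text = text.replace(ch, "\\" + ch)
    return "(" + text + ")"


# Commonly used guard: on an interpreter that does not know pdfmark,
# define it as cleartomark so the command is silently discarded.
PDFMARK_GUARD = (
    "/pdfmark where {pop} {userdict /pdfmark /cleartomark load put} ifelse"
)


def link_pdfmark(rect, url: str) -> str:
    """Build a clickable-link annotation for the rectangle (x1, y1, x2, y2).

    Coordinates are in PostScript points, measured from the lower left
    corner of the page.
    """
    x1, y1, x2, y2 = rect
    return (
        f"[ /Rect [{x1} {y1} {x2} {y2}] /Border [0 0 0] "
        f"/Action << /Subtype /URI /URI {ps_string(url)} >> "
        "/Subtype /Link /ANN pdfmark"
    )
```

For example, print(PDFMARK_GUARD) followed by print(link_pdfmark((72, 700, 200, 715), "http://www.easysw.com")) gives two lines of PostScript: the first keeps ordinary printers happy, the second is what a PDF generator turns into a link.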

-- Reimer Behrends

Matt Machell

Quote from: Clay
 Hyperlinking, the PDF table of contents, and thumbnails all come to mind.

HTMLdoc does hyperlinks if the input is an HTML page, and it says it does a table of contents (though I have yet to test it).

Matt

Reimer Behrends

Quote from: EvilIdler
ps2pdf (seems to be part of Debian Linux' version
of Ghostscript et al, or possibly the teTeX/LaTeX
packages) seems to do a good job of filtering out
duplicate fonts. I haven't figured out if there is any
support for internal PDF compression in those, but
the usual archive formats do a better job.

Two things: First of all, most PDF generators usually compress their output with the /FlateDecode filter, which is the same algorithm used by zip. So, compressing them further is usually only going to net you a few percentage points of compression at the very best (for the small parts of the PDF that aren't compressed).
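
The effect is easy to try with Python's zlib module, which implements the same deflate algorithm. This is only a sketch: the word list and sample text are made up, and a real PDF mixes compressed streams with uncompressed structure, so actual numbers will differ.

```python
import random
import zlib

WORDS = [
    "postscript", "pdf", "font", "glyph", "page", "printer", "driver",
    "distiller", "ghostscript", "bitmap", "table", "layout", "link",
    "thumbnail", "stream", "encoding",
]


def sample_text(n_words: int = 5000, seed: int = 0) -> bytes:
    """Build repeatable pseudo-random filler text from the word list."""
    rng = random.Random(seed)
    return " ".join(rng.choice(WORDS) for _ in range(n_words)).encode("ascii")


def compress_twice(data: bytes):
    """Return (raw size, size after one deflate pass, size after two)."""
    once = zlib.compress(data, 9)
    twice = zlib.compress(once, 9)
    return len(data), len(once), len(twice)


if __name__ == "__main__":
    raw, once, twice = compress_twice(sample_text())
    print(f"raw: {raw}  compressed once: {once}  compressed twice: {twice}")
```

The first pass shrinks the text a lot; the second pass has almost nothing left to work with, which is why zipping an already compressed PDF rarely helps much.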

Second, duplicate fonts are actually a fairly complicated topic. To begin with, ps2pdf subsets fonts: that is, only those glyphs (characters) actually used are embedded, which cuts down on the problem far more than compression. Also, duplicate fonts only occur under certain conditions (warning, this is going to be technical).

The first of them is the 'save'/'restore' pair of PostScript operators. Save stores the state of the PostScript language interpreter, restore restores it. Many PostScript generators surround each page of output with these operators to have a "clean slate" for the next page. Since Ghostscript evolved from being a "mere" PostScript interpreter, it (unlike Distiller) doesn't remember fonts globally, but as part of the interpreter state, so save/restore is killing its memory of that (see this link for the gory details).

The second condition is the 'definefont' operator. Most of the time, it's not actually used for defining fonts, but rather to 'reencode' a font. Each PostScript font has a standard mapping of 8-bit characters to the actual graphical 'glyphs' that are drawn. Now, if you're using the Windows 1252 codepage or writing text using the European Latin-1/Latin-9 character sets, this mapping has to change. Unfortunately, Ghostscript seems to think on occasion that changing the mapping means changing the entire font, and will promptly label it as distinct.
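
Why the mapping has to change can be seen at the character-set level with a few lines of Python. This is only an illustration using Python's codecs, not PostScript encoding vectors, and the helper name is made up.

```python
def glyph_for(code: int, encoding: str) -> str:
    """Return the character that one 8-bit code stands for in an encoding."""
    return bytes([code]).decode(encoding)


if __name__ == "__main__":
    # The same byte value selects different characters in different sets.
    for encoding in ("latin-1", "iso8859-15", "cp1252"):
        print(encoding, hex(0xA4), repr(glyph_for(0xA4, encoding)))
    print("cp1252", hex(0x80), repr(glyph_for(0x80, "cp1252")))
```

Byte 0xA4 is the generic currency sign in Latin-1 but the euro sign in Latin-9, while Windows 1252 puts the euro at 0x80. A PostScript generator has to reencode the font so that each byte draws the glyph the document's character set expects.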

So, it will all depend on what the PostScript generated by your program looks like -- some output works just fine, some doesn't. And if you're PostScript savvy, you can usually massage the output to make it work, anyway (for instance, I have scripts to fix these problems for the two systems I use most frequently).

-- Reimer Behrends

EvilIdler

Quote from: Reimer Behrends
So, compressing them further is usually only
going to net you a few percentage points of
compression at the very best (for the small
parts of the PDF that aren't compressed).

I've had varying results on the compression,
but there have been cases where I could save
fifty percent. Of course, that was mainly text.

>Also, duplicate fonts only occur under certain
>conditions (warning, this is going to be
>technical).
I'm a programmer. I can handle it ;)

>Since Ghostscript evolved from being a
>"mere" PostScript interpreter, it (unlike
>Distiller) doesn't remember fonts globally, but
>as part of the interpreter state, so
>save/restore is killing its memory of that
Messy!

>Unfortunately, Ghostscript seems to think on
>occasion that changing the mapping means
>changing the entire font, and will promptly
>label it as distinct.
Ah. I've seen similar things happen elsewhere.
Word has a version of that - use a font once,
and it goes into the document's list whether
you end up displaying with it or not.

>you're PostScript savvy, you can usually
>massage the output to make it work, anyway
>(for instance, I have scripts to fix these
>problems for the two systems I use most
>frequently).
Do you have these wondrous scripts publicly
available? Those sound like things that belong
on Freshmeat (if they aren't already) :)
O- EvilIdler