### 10 PostScript and PDF

PostScript is a page description language. It was introduced by Adobe Systems in the mid-eighties and has become the standard device-independent file format for printing graphics files. This means that PostScript describes a graphics image without reference to specific device features (e.g. printer resolution), so that the same description (PostScript file) can be used on any PostScript-compatible printer.

An Encapsulated PostScript File (EPSF or EPS) is a PostScript file structured so that it can be incorporated or included into another PostScript file (so that for example a diagram created with a graphics application can be inserted into a text document created with a word processor).
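As a rough illustration of that structure, the sketch below writes a minimal EPS file from the shell. The function name write_minimal_eps and the drawing itself are hypothetical; the first line and the %%BoundingBox comment are the structuring comments that tell an including application that the file is EPS and how much space the figure occupies.

```shell
# write_minimal_eps: hypothetical helper that prints a tiny EPS file
# (one diagonal line inside a 100 x 100 point bounding box).
write_minimal_eps() {
    cat <<'EOF'
%!PS-Adobe-3.0 EPSF-3.0
%%BoundingBox: 0 0 100 100
%%EndComments
newpath 10 10 moveto 90 90 lineto stroke
showpage
%%EOF
EOF
}
```

Typing write_minimal_eps > figure.eps would save the result to a file (figure.eps is just an example name) that can then be opened in a PostScript viewer.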

PDF is another page description language introduced by Adobe to replace PostScript; however, it isn’t yet in as widespread use as PostScript. For instance, it’s quite hard to find a printer that has a PDF interpreter implemented in hardware, i.e. you cannot send a PDF file directly to the printer but must first convert it to PostScript using display software such as Adobe Acrobat.

#### 10.1 Ghostscript

The Ghostscript software suite is an interpreter for the PostScript language, with the ability to convert PostScript language files to many other formats, display them, and print them on printers that don’t have PostScript language capability built in. Ghostscript also functions as an interpreter for Portable Document Format (PDF) files, with much the same capabilities. Finally, the suite contains a C subroutine library (the Ghostscript library) that implements the graphics capabilities that appear as primitive operations in the PostScript language.

There are actually two different versions of Ghostscript: the Aladdin and GNU distributions. The main difference between them seems to be the licensing terms: GNU Ghostscript is distributed under the GPL, while the Aladdin version is distributed under the Aladdin Free Public Licence. The only difference in the licensing terms appears to be that the Aladdin licence does not allow commercial distribution. Because of this, if you are using Linux you almost certainly have GNU Ghostscript installed.

Further information on Ghostscript can be found at http://www.cs.wisc.edu/~ghost/.

#### 10.2 GV and Ghostview

Ghostview is a full-function X Windows interface for Ghostscript, the PostScript interpreter. Ghostview and Ghostscript function as two cooperating programs: Ghostview creates the viewing window and Ghostscript draws in it. The GUI is fairly self-explanatory; however, the application ships with an extensive manual page (type man ghostview at the UNIX prompt).

GV is a version of Ghostview that was modified for VMS, given some enhancements, and then modified to run again under Unix. It is now replacing Ghostview as the standard desktop tool for viewing PostScript files, and is in fact the default viewer in most Linux distributions (i.e. if you type ghostview at a Linux prompt you’ll probably start the GV program instead). An example of GV in action can be seen in Figure 45. Further information on GV and Ghostview can be found at http://www.cs.wisc.edu/~ghost/.

#### 10.3 Acrobat

Adobe Acrobat Reader allows you to view and print PDF files. While the viewer is free, the tools needed to create PDF content are not. More information is available at http://www.adobe.com/products/acrobat/readermain.html. The Acrobat reader is distributed as part of the Starlink base set software, and can be started by typing acroread.

#### 10.4 psmerge

psmerge is a utility program for merging one or more Encapsulated PostScript Files into a single PostScript file. The input files can be individually rotated, scaled and shifted. The output file can either be Encapsulated PostScript or “normal” PostScript suitable for sending to a printer. The psmerge utility is covered in detail in SUN/164.

#### 10.5 epsutil

epsutil is a utility for manipulating Encapsulated PostScript files. For more information see the manual at http://www.math.utah.edu/~beebe/software/epsutil/epsutil.html.

#### 10.6 prescript and pstotext

prescript extracts text from a PostScript file, storing it either as plain ASCII text, or as HTML according to the mandatory first command-line argument. Usage is:

prescript [ html | plain ] [ input.ps ]

The output file will be given the same base name as the input file, with its file extension set to one of .html or .txt, according to the first command-line argument.

prescript uses a PostScript interpreter, normally gs, to execute the PostScript program, so that even text that is generated programmatically, rather than being explicitly present in PostScript strings, can be extracted. Particular attention is paid to heuristic recognition of word breaks, to reconstruction of words hyphenated at line breaks, to preservation of paragraph breaks, and to recognition of TeX ligatures.

The prescript program can be downloaded from http://www.nzdl.org/html/prescript.html.

A possible substitute for prescript is the pstotext utility. More information can be found at http://www.research.digital.com/SRC/virtualpaper/pstotext.html.

#### 10.7 PostScript to PDF

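PDF files can be generated easily with the gs utility, using the following command. The -dBATCH option makes gs exit once the input has been processed, rather than waiting at its interactive prompt.

gs -q -dSAFER -dNOPAUSE -dBATCH -sPAPERSIZE=a4 -sDEVICE=pdfwrite \
   -sOutputFile=output.pdf input.ps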
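When several files need converting, the same gs invocation can be wrapped in a small shell function. The following is only a sketch: the names pdf_name, ps_to_pdf and convert_all are hypothetical, and it assumes that each output file should sit next to its input with the .ps extension replaced by .pdf. The -dBATCH option is included so that gs exits when each conversion finishes.

```shell
# pdf_name FILE: hypothetical helper that swaps a trailing .ps for .pdf.
pdf_name() {
    printf '%s\n' "${1%.ps}.pdf"
}

# ps_to_pdf FILE: convert one PostScript file to PDF with Ghostscript.
ps_to_pdf() {
    gs -q -dSAFER -dNOPAUSE -dBATCH -sPAPERSIZE=a4 -sDEVICE=pdfwrite \
       -sOutputFile="$(pdf_name "$1")" "$1"
}

# convert_all FILE...: convert each named file, skipping names that
# do not refer to an existing regular file.
convert_all() {
    for f in "$@"; do
        [ -f "$f" ] || continue
        ps_to_pdf "$f"
    done
}
```

With these definitions loaded, convert_all ./*.ps would convert every PostScript file in the current directory.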

#### 10.8 PS Utils

PSUtils, written by Angus Duggan, is a collection of useful utilities for manipulating PostScript documents. The programs included are psnup, for placing several logical pages on a single sheet of paper; psselect, for selecting pages from a document; pstops, for general imposition; psbook, for signature generation for booklet printing; and psresize, for adjusting page sizes.

• psbook

The psbook program rearranges pages from a PostScript document into “signatures” for printing books or booklets, creating a new PostScript file.

Usage is:

psbook [ -q ] [ -ssignature ] [ infile [ outfile ] ]

Where -q suppresses printing of page numbers below the pages being rearranged (by default page numbers are printed), and -ssignature selects the size of signature which will be used. The signature size is the number of sides which will be folded and bound together; the number given should be a multiple of four. The default is to use one signature for the whole file. Extra blank sides will be added if the file does not contain a multiple of four pages.
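To make the idea of a signature more concrete, the sketch below prints the order in which the sides of a single signature would be emitted, so that they read correctly once printed two to a side and folded. It illustrates the ordering only and is not taken from psbook itself; the function name signature_order is hypothetical, and the argument is assumed to be a multiple of four.

```shell
# signature_order N: print the 1-based page order for one signature of
# N sides (N is assumed to be a multiple of four).
signature_order() {
    n=$1
    i=0
    out=""
    while [ "$i" -lt "$n" ]; do
        case $((i % 4)) in
            0|3) p=$((n - 1 - i / 2)) ;;   # taken from the back of the signature
            *)   p=$((i / 2)) ;;           # taken from the front of the signature
        esac
        out="$out${out:+ }$((p + 1))"
        i=$((i + 1))
    done
    printf '%s\n' "$out"
}

signature_order 4    # prints: 4 1 2 3
signature_order 8    # prints: 8 1 2 7 6 3 4 5
```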

• psnup

The psnup program puts multiple logical pages onto each physical sheet of paper. This utility has many uses, one of which is in conjunction with psbook. For example, using groff to create a PostScript document and lpr as the UNIX print spooler, a typical command line might look like this:

% groff -Tps -ms file | psbook | psnup -2 | lpr

Where file is a four-page document, this command produces a two-page document with two pages of file on each output page. psbook rearranges the page order so that pages 4 and 1 of the input appear on the first output page, and pages 2 and 3 of the input on the second output page.

Usage is:

psnup [ -wwidth ] [ -hheight ] [ -ppaper ] [ -Wwidth ] [ -Hheight ]
[ -Ppaper ] [ -l ] [ -r ] [ -f ] [ -c ] [ -mmargin ]
[ -bborder ] [ -dlwidth ] [ -sscale ] [ -nup ] [ -q ]
[ infile [ outfile ] ]

The -w option gives the paper width, and the -h option gives the paper height, normally specified with a ‘cm’ or ‘in’ suffix so that the value is converted from centimetres or inches to PostScript’s points (1/72 of an inch). The -p option can be used as an alternative, to set the paper size to A3, A4, A5, B5, letter, legal, tabloid, statement, executive, folio, quarto or 10x14. The default paper size is A4.
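The relationship between these units is simple arithmetic: one inch is 72 points and 2.54 cm. A small sketch using awk, with the hypothetical function name cm_to_pt, shows the conversion for the two dimensions of an A4 sheet (21 cm by 29.7 cm):

```shell
# cm_to_pt CM: convert centimetres to PostScript points (72 points per
# inch, 2.54 cm per inch), rounded to one decimal place.
cm_to_pt() {
    awk -v cm="$1" 'BEGIN { printf "%.1f\n", cm * 72 / 2.54 }'
}

cm_to_pt 21      # prints: 595.3
cm_to_pt 29.7    # prints: 841.9
```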

The -W, -H, and -P options set the input paper size, if it is different from the output size. This makes it easy to impose pages of one size on a different size of paper.

The -l option should be used for pages which are in landscape orientation (rotated 90 degrees anticlockwise). The -r option should be used for pages which are in seascape orientation (rotated 90 degrees clockwise), and the -f option should be used for pages which have the width and height interchanged, but are not rotated.

psnup normally uses “row-major” layout, where adjacent pages are placed in rows across the paper. The -c option changes the order to “column-major”, where successive pages are placed in columns down the paper.

A margin to leave around the whole page can be specified with the -m option. This is useful for sheets of “thumbnail” pages, because the normal page margins are reduced by putting multiple pages on a single sheet.

The -b option is used to specify an additional margin around each page on a sheet.

The -d option draws a line around the border of each page, of the specified width. If the lwidth parameter is omitted, a default linewidth of 1 point is assumed. The linewidth is relative to the original page dimensions, i.e. it is scaled down with the rest of the page.

The scale chosen by psnup can be overridden with the -s option. This is useful to merge pages which are already reduced.

The -nup option selects the number of logical pages to put on each sheet of paper. This can be any whole number; psnup tries to optimise the layout so that the minimum amount of space is wasted. If psnup cannot find a layout within its tolerance limit, it will abort with an error message. The alternative form -n nup can also be used, for compatibility with other n-up programs. psnup normally prints the page numbers of the pages re-arranged; the -q option suppresses this feature.

• psselect

The psselect program selects pages from a PostScript document, creating a new PostScript file. Usage is:

psselect [ -q ] [ -e ] [ -o ] [ -r ] [ -ppages ] [ pages ]
[ infile [ outfile ] ]

The -e option selects all of the even pages; it may be used in conjunction with the other page selection options to select the even pages from a range of pages. Similarly, the -o option selects all of the odd pages, and may also be used in conjunction with the other page selection options.

The -ppages option specifies the pages which are to be selected. Pages is a comma-separated list of page ranges, each of which may be a page number, or a page range of the form first-last. If first is omitted, the first page is assumed, and if last is omitted, the last page is assumed. The prefix character “_” indicates that the page number is relative to the end of the document, counting backwards. If just this character with no page number is used, a blank page will be inserted in the output.

The -r option causes psselect to output the selected pages in reverse order.

psselect normally prints the page numbers of the pages rearranged; the -q option suppresses this. If any of the -r, -e, or -o options are specified, the page range must be given with the -p option.
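The page-list syntax described above can be illustrated with a short sketch that expands a list into explicit page numbers for a document of a given length. This is not psselect's own code: the function names resolve_page and expand_pages are hypothetical, only ascending ranges are handled, and a bare “_” is shown as the word blank.

```shell
# resolve_page TOKEN TOTAL: turn "7" or "_2" into an absolute page number;
# "_N" counts backwards from the end, so "_1" is the last page.
resolve_page() {
    case $1 in
        _*) echo $(( $2 + 1 - ${1#_} )) ;;
        *)  echo "$1" ;;
    esac
}

# expand_pages LIST TOTAL: expand a comma-separated page list for a
# document of TOTAL pages (ascending ranges only in this sketch).
expand_pages() {
    spec=$1
    total=$2
    result=""
    old_ifs=$IFS
    IFS=,
    set -- $spec            # split the list on commas
    IFS=$old_ifs
    for range in "$@"; do
        if [ "$range" = "_" ]; then
            result="$result${result:+ }blank"
            continue
        fi
        case $range in
            *-*) first=${range%%-*}; last=${range#*-} ;;
            *)   first=$range; last=$range ;;
        esac
        # An omitted first page means page 1; an omitted last page means the end.
        first=$(resolve_page "${first:-1}" "$total")
        last=$(resolve_page "${last:-$total}" "$total")
        p=$first
        while [ "$p" -le "$last" ]; do
            result="$result${result:+ }$p"
            p=$((p + 1))
        done
    done
    printf '%s\n' "$result"
}

expand_pages "1-3,_1" 10     # prints: 1 2 3 10
expand_pages "-2,_,9-" 10    # prints: 1 2 blank 9 10
```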

• pstops

The pstops program performs general page rearrangement and selection, creating a new PostScript file. pstops can be used to perform a large number of arbitrary re-arrangements of documents, including arranging for printing 2-up, 4-up, booklets, reversing, selecting front or back sides of documents, scaling, etc.

Usage is:

pstops [ -q ] [ -b ] [ -wwidth ] [ -hheight ] [ -ppaper ] [ -dlwidth ]
pagespecs [ infile [ outfile ] ]

pagespecs = [modulo:]specs
specs = spec[+specs][,specs]
spec = [-]pageno[L][R][U][@scale][(xoff,yoff)]

modulo is the number of pages in each block. The value of modulo should be greater than 0; the default value is 1. specs are the page specifications for the pages in each block. The value of the pageno in each spec should be between 0 (for the first page in the block) and modulo-1 (for the last page in each block) inclusive. The optional dimensions xoff and yoff shift the page by the specified amount. xoff and yoff are in PostScript’s points, but may be followed by the units ‘cm’ or ‘in’ to convert to centimetres or inches, or the flags ‘w’ or ‘h’ to specify as a multiple of the width or height. The optional flags L, R, and U rotate the page left, right, or upside-down. The optional scale parameter scales the page by the fraction specified. If the optional minus sign is specified, the page is relative to the end of the document, instead of the start. If page specs are separated by ‘+’ the pages will be merged into one page; if they are separated by ‘,’ they will be on separate pages. If there is only one page specification, with pageno zero, it may be omitted.

The shift, rotation, and scaling are performed in that order regardless of which order they appear on the command line.

The -w option gives the width which is used by the ‘w’ dimension specifier, and the -h option gives the height which is used by the ‘h’ dimension specifier. These dimensions are also used (after scaling) to set the clipping path for each page. The -p option can be used as an alternative, to set the paper size to A3, A4, A5, B5, letter, legal, tabloid, statement, executive, folio, quarto or 10x14. The default paper size is A4.

The -b option prevents any bind operators in the PostScript prolog from binding. This may be needed in cases where complex multi-page re-arrangements are being done.

The -d option draws a line around the border of each page, of the specified width. If the lwidth parameter is omitted, a default linewidth of 1 point is assumed. The linewidth is relative to the original page dimensions, i.e. it is scaled up or down with the rest of the page.

pstops normally prints the page numbers of the pages re-arranged; the -q option suppresses this feature.
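The way modulo and pageno interact is easiest to see with the transformations left out. The sketch below, which is an illustration rather than pstops code, lists the input pages that a specification of the form modulo:pageno,pageno,... would send to successive output pages. The function name pstops_order is hypothetical; rotation, scaling, offsets, merging with ‘+’ and the minus sign are all ignored, and a page beyond the end of the document is shown as blank.

```shell
# pstops_order PAGESPECS TOTAL: list the 1-based input pages chosen by a
# simple "modulo:pageno,pageno,..." specification for TOTAL input pages.
pstops_order() {
    modulo=${1%%:*}
    specs=${1#*:}
    total=$2
    result=""
    old_ifs=$IFS
    IFS=,
    set -- $specs           # split the page numbers on commas
    IFS=$old_ifs
    block=0
    while [ $((block * modulo)) -lt "$total" ]; do
        for pageno in "$@"; do
            # pageno is 0-based within the block; output is 1-based.
            page=$((block * modulo + pageno + 1))
            [ "$page" -le "$total" ] || page=blank
            result="$result${result:+ }$page"
        done
        block=$((block + 1))
    done
    printf '%s\n' "$result"
}

pstops_order "2:0" 5        # odd pages only; prints: 1 3 5
pstops_order "2:1,0" 6      # swap each pair; prints: 2 1 4 3 6 5
pstops_order "4:3,0,1,2" 4  # prints: 4 1 2 3
```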

• psresize

The psresize program rescales and centres a document on a different size of paper. Usage is:

psresize [ -wwidth ] [ -hheight ] [ -ppaper ] [ -Wwidth ] [ -Hheight ]
[ -Ppaper ] [ -q ] [ infile [ outfile ] ]

The -w option gives the output paper width, and the -h option gives the output paper height, normally specified with a ‘cm’ or ‘in’ suffix so that the value is converted from centimetres or inches to PostScript’s points (1/72 of an inch). The -p option can be used as an alternative, to set the output paper size to A3, A4, A5, B5, letter, legal, tabloid, statement, executive, folio, quarto or 10x14. The default output paper size is A4. The -W option gives the input paper width, and the -H option gives the input paper height. The -P option can be used as an alternative, to set the input paper size. psresize normally prints the page numbers of the pages output; the -q option suppresses this feature.
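As a worked example of the rescaling involved, the sketch below computes the largest uniform scale factor at which a page of one size fits onto paper of another size. It assumes that the page is scaled equally in both directions and not rotated, which may not match everything psresize does; the function name fit_scale is hypothetical. The sizes used are A4 (595 by 842 points) and letter (612 by 792 points).

```shell
# fit_scale IN_W IN_H OUT_W OUT_H: largest uniform scale at which an
# IN_W x IN_H page fits on OUT_W x OUT_H paper (all sizes in points).
fit_scale() {
    awk -v iw="$1" -v ih="$2" -v ow="$3" -v oh="$4" 'BEGIN {
        sx = ow / iw
        sy = oh / ih
        printf "%.3f\n", (sx < sy) ? sx : sy
    }'
}

fit_scale 595 842 612 792    # A4 onto letter; prints: 0.941
fit_scale 612 792 595 842    # letter onto A4; prints: 0.972
```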

#### 10.9 Generating PostScript Output

A common task is to take an image, for instance a GIF or JPEG, and generate a PS or EPS output figure for publication. Depending on which package is used for this task, there is a surprising difference in the size of the final PostScript file. Of the packages available, ImageMagick seems to produce the smallest PostScript output files, due to its use of vectorised PostScript rather than the bitmaps that other packages (such as xv) use. In extreme cases this can mean the difference between a 2 MB and a 50 kB final PostScript file. All the PostScript images in this cookbook were generated from the original GIF files using ImageMagick.