Text processing. Programming Language Design and Implementation (4th Edition) by T. Pratt and M. Zelkowitz, Prentice Hall, 2001, Section 12.1



Text processing

Programming Language Design and Implementation (4th Edition)

by T. Pratt and M. Zelkowitz

Prentice Hall, 2001

Section 12.1


Desktop publishing

Traditional publication systems:
    WYSIWYG - What you see is what you get
    Typewriters: examples of early WYSIWYG systems
    More complex today - multiple fonts, colors, embedded graphics
    Need for embedded commands to describe layout of document

Three approaches to desktop publishing:
    WYSIWYG
    Page description languages
    Document compiling


WYSIWYG

A common approach in the PC world

Tools like Microsoft Word and Corel WordPerfect

Embedded commands in document to control layout (fonts, colors, font size, location of objects)

Rich Text Format (RTF) - An ASCII language for describing such layout. It can be used to pass documents among different word processors.


LaTeX

TeX: Document processing system developed by Donald Knuth
    A macro processing system for creation of string text (i.e., documents)
    Arcane syntax

LaTeX: Macros for TeX
    A set of macros developed for TeX by Leslie Lamport
    Creates a series of environments and control structures similar to programming language structures
    For lack of a better term, we often refer to the compiling of the book as various chapters are processed by the TeX program

This book was developed using LaTeX


LaTeX execution

Executes much like a traditional compiler:

First pass:
    Read in text and create output format.
    Create symbol table for all internal references (section numbers, page numbers, figure numbers).
    Create table of contents and index, if desired.

Second pass:
    Read in text and create output format.
    This time, internal references are correct because of the symbol table created during pass 1.

Third pass:
    If no changes were made to the symbol table by pass 2, same as pass 2; otherwise repeat pass 2 until no further changes are made to the symbol table.

[Why more than 2 passes? - Think of putting a table of contents at beginning of a report.]
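The pass structure above amounts to repeating a pass until the symbol table stops changing. The following Python sketch illustrates that idea only; it is not how TeX is implemented. The document model (a list of section titles with lengths in pages), the rule that the table of contents holds 40 entries per page, and the names `run_pass` and `compile_document` are all assumptions made for the example.

```python
import math

def run_pass(sections, old_table, toc_entries_per_page=40):
    """One pass: lay out the document using the previous pass's symbol table.

    The table of contents is printed from old_table, so its length (and
    therefore every later page number) depends on the previous pass.
    """
    toc_pages = math.ceil(len(old_table) / toc_entries_per_page)
    page = toc_pages + 1          # first page after the table of contents
    new_table = {}
    for title, length_in_pages in sections:
        new_table[title] = page   # record where this section starts
        page += length_in_pages
    return new_table

def compile_document(sections, max_passes=10):
    """Repeat passes until the symbol table stops changing."""
    table = {}
    for pass_number in range(1, max_passes + 1):
        new_table = run_pass(sections, table)
        if new_table == table:
            return table, pass_number
        table = new_table
    raise RuntimeError("references did not stabilize")

# Hypothetical three-section report.
sections = [("Introduction", 2), ("Design", 3), ("Summary", 1)]
table, passes = compile_document(sections)
```

With this input the first pass has no table of contents to print, the second pass inserts one and shifts every section by a page, and the third pass confirms that nothing moved, so three passes are needed.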


LaTeX features

LaTeX creates environments that make TeX easier to use. These behave much like C or Pascal scope rules

For example, one can begin and end a list of items:

Numbered:

    \begin{enumerate}
    \item text          [Prints as number 1]
    \item text          [Prints as number 2]
    \end{enumerate}     [End of list]

Bulleted (“itemized”)

Named (“description”)

Starting new sections or subsections automatically adjusts the appropriate section numbers. LaTeX has a syntax similar to the block-structured style of a programming language.
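The scope-like behavior of environments can be sketched with a stack, much as a compiler tracks nested blocks. The Python below is an illustration under simplifying assumptions, not LaTeX's actual mechanism: it accepts one command per line, knows only `\begin`, `\end` and `\item`, and the function name `render_lists` is hypothetical.

```python
import re

ITEM = r"\item"

def render_lists(source):
    """Render item lines, numbering items per enclosing environment."""
    stack = []      # open environments, innermost last: [name, item_counter]
    output = []
    for line in source.strip().splitlines():
        line = line.strip()
        begin = re.fullmatch(r"\\begin\{(\w+)\}", line)
        end = re.fullmatch(r"\\end\{(\w+)\}", line)
        if begin:
            stack.append([begin.group(1), 0])   # entering a block: fresh counter
        elif end:
            if not stack or stack[-1][0] != end.group(1):
                raise ValueError("unmatched \\end{" + end.group(1) + "}")
            stack.pop()                         # leaving the block discards its counter
        elif line.startswith(ITEM):
            if not stack:
                raise ValueError("item outside any list environment")
            env = stack[-1]
            env[1] += 1
            label = str(env[1]) + "." if env[0] == "enumerate" else "*"
            indent = "  " * (len(stack) - 1)
            text = line[len(ITEM):].strip()
            output.append(indent + label + " " + text)
    if stack:
        raise ValueError("missing \\end{" + stack[-1][0] + "}")
    return output

sample = r"""
\begin{enumerate}
\item first
\begin{itemize}
\item nested
\end{itemize}
\item second
\end{enumerate}
"""
rendered = render_lists(sample)
```

The inner list does not disturb the outer numbering because each environment keeps its own counter, which disappears when the matching `\end` is reached.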


LaTeX structure


LaTeX execution

By invoking LaTeX, the latex.tex macros are read into TeX to create commands for chapters, sections, subsections, figures, tables, lists, and the numerous other structures needed to write simple documents.

The documentstyle command (in LaTeX) allows the user to add other style features.

The required article parameter causes article.sty to be read in to tailor latex.tex with commands needed for an article. For example, there are no chapters in articles, but for style book (i.e., book.sty), chapters are defined.

11pt defines the size of the text font (11-point type), and art11.sty is read giving additional information on line and character spacing for 11-point type. The TeX program along with article.sty and art11.sty form the standard way to process a LaTeX article.

mystyle.sty defines additional macros a user can add to tailor LaTeX for a specific document.
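The file selection described on this slide can be traced with a small Python sketch. The command string `\documentstyle[11pt,mystyle]{article}` is an assumed example (the slide's figure is not reproduced in this transcript), `files_read` is a hypothetical helper, and only the article case is covered because it is the only size-file name the text gives.

```python
import re

# Prefix of the size-specific file; only the article case appears in the text.
SIZE_FILE_PREFIX = {"article": "art"}

def files_read(command):
    """List the style files a documentstyle command would cause to be read."""
    match = re.fullmatch(r"\\documentstyle\[([^\]]*)\]\{(\w+)\}", command.strip())
    if match is None:
        raise ValueError("not a documentstyle command")
    options = [opt.strip() for opt in match.group(1).split(",") if opt.strip()]
    style = match.group(2)
    files = [style + ".sty"]                    # the required style parameter
    for opt in options:
        size = re.fullmatch(r"(\d+)pt", opt)
        if size:
            # A type-size option selects a size-specific file, e.g. art11.sty.
            files.append(SIZE_FILE_PREFIX[style] + size.group(1) + ".sty")
        else:
            files.append(opt + ".sty")          # user-supplied style file
    return files

files = files_read(r"\documentstyle[11pt,mystyle]{article}")
```

A style other than article combined with a size option raises `KeyError` here, since the sketch does not guess file names the text does not state.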


Page description languages

Postscript consists of five components:

1. An interpreter for performing calculations. A simple postfix execution stack is the basic model.

2. A language syntax. This is based on Forth.

3. Painting extensions. An extension to Forth with painting commands for managing the process of painting text and pictures on a sheet of paper.

4. A virtual machine for drawing information (text and graphics) on a page. The showpage operator causes the described page to be displayed.

5. Conventions. A series of conventions, not part of the formal Postscript language, that various printers use for consistency in presentation. Use of these conventions makes it easier to transport Postscript documents from one system to another.


Postscript execution model

A Postscript program consists of a sequence of commands that represent the postfix of the algorithm necessary to paint the document.
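To make the postfix model concrete, here is a minimal Python sketch of an operand stack evaluating postfix arithmetic. It is an illustration, not a Postscript interpreter: only the arithmetic operators `add`, `sub` and `mul` are modeled, every other token is assumed to be a number, and the function name `eval_postfix` is hypothetical.

```python
def eval_postfix(program):
    """Evaluate a whitespace-separated postfix program on an operand stack."""
    operators = {
        "add": lambda a, b: a + b,
        "sub": lambda a, b: a - b,
        "mul": lambda a, b: a * b,
    }
    stack = []
    for token in program.split():
        if token in operators:
            b = stack.pop()     # top of stack is the second operand
            a = stack.pop()
            stack.append(operators[token](a, b))
        else:
            stack.append(float(token))   # operands are simply stacked
    return stack

result = eval_postfix("3 4 add 2 mul")   # postfix form of (3 + 4) * 2
```

Operands are pushed as they are read, and each operator consumes its arguments from the top of the stack and pushes its result, so no parentheses or precedence rules are needed.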

Postscript execution begins with two entries initially on the stack, which the program may not remove:

Systemdict is the system dictionary, which represents the initial binding of Postscript objects to their internal representation.

Userdict is the user dictionary, which represents the new definitions included within this execution of a Postscript program. This may include redefinition of primitive objects already defined in systemdict.


Sample Postscript command

Each argument is stacked on the Postscript stack:

    /box {newpath 0 0 moveto 0 1 lineto 3 1 lineto 3 0 lineto closepath} def

/box: Add name box to stack. / marks box as a literal name, so it is not evaluated, only moved to the stack (like quote in LISP)

newpath: Start a new path

moveto: Take top two stack arguments and move cursor to that (X,Y) location

lineto: Draw line from current cursor to the (X,Y) address, which is the top two stack numbers

closepath: Draw line back to the starting point of the current path (here, the moveto location)

def: Everything within { ... } is defined to be command box

[Note that the command box now describes a rectangle from (0,0) to (0,1) to (3,1) to (3,0) and back to (0,0); a following stroke or fill operator paints it]
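The way the box definition is stacked and later executed can be traced with a small Python sketch. It handles only the tokens used on this slide and makes several simplifying assumptions that are not Postscript's real behavior: the path is a plain list of points, `closepath` repeats the first point, and user definitions cannot override the built-in operators. The function name `run` is hypothetical.

```python
def run(program):
    """Interpret a tiny subset of Postscript and return the current path."""
    tokens = program.replace("{", " { ").replace("}", " } ").split()
    operands = []       # operand stack
    userdict = {}       # names bound by def
    path = []           # points of the current path

    def execute(token_list):
        nonlocal path
        i = 0
        while i < len(token_list):
            tok = token_list[i]
            if tok == "{":
                # Collect the procedure body unevaluated, honoring nesting.
                depth, j = 1, i + 1
                while depth:
                    depth += {"{": 1, "}": -1}.get(token_list[j], 0)
                    j += 1
                operands.append(token_list[i + 1:j - 1])
                i = j
                continue
            if tok.startswith("/"):
                operands.append(tok)            # literal name: push, do not evaluate
            elif tok == "def":
                value = operands.pop()
                name = operands.pop()
                userdict[name[1:]] = value      # bind name (without the slash)
            elif tok == "newpath":
                path = []
            elif tok in ("moveto", "lineto"):
                y = operands.pop()              # top two stack entries are X and Y
                x = operands.pop()
                path.append((x, y))
            elif tok == "closepath":
                path.append(path[0])            # back to the start of the path
            elif tok in userdict:
                execute(userdict[tok])          # run a user-defined procedure
            else:
                operands.append(float(tok))
            i += 1

    execute(tokens)
    return path

definition = "/box {newpath 0 0 moveto 0 1 lineto 3 1 lineto 3 0 lineto closepath} def"
outline = run(definition + " box")
```

Running the definition alone leaves the path empty, because `def` only binds the procedure body to the name; the points appear once `box` itself is executed.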


Summary

Note differences between models:

LaTeX and MS Word - define the layout of the final document

Postscript - defines a program which computes the final layout. A Postscript printer contains an interpreter that executes the Postscript program to produce the final printed document


Postscript execution stacks

1. The operand stack contains the operands as they are stacked, executed, and unstacked.

2. The dictionary stack contains only dictionary objects. This stack defines the scope and context of each definition.

3. The execution stack contains executable objects. For the most part, these are functions in intermediate stages of execution.

4. The graphics state stack manages the context for painting objects on the page

A Postscript program is a sequence of ASCII characters. As each token is read, its definition is looked up in the dictionary stack (by first looking in userdict and then systemdict) and executed by an appropriate action.
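The lookup order just described can be sketched in Python as a search from the top of the dictionary stack downward. The dictionary contents below are placeholder strings rather than real operator implementations, and the function name `lookup` is hypothetical.

```python
def lookup(name, dict_stack):
    """Search the dictionary stack from the top (most recent) downward."""
    for dictionary in reversed(dict_stack):
        if name in dictionary:
            return dictionary[name]
    raise KeyError("undefined: " + name)

# Placeholder bindings; a real interpreter binds names to executable objects.
systemdict = {"add": "built-in add", "showpage": "built-in showpage"}
userdict = {}
dict_stack = [systemdict, userdict]    # userdict is on top, so it is searched first

before = lookup("showpage", dict_stack)
userdict["showpage"] = "user-defined showpage"   # redefinition shadows systemdict
after = lookup("showpage", dict_stack)
```

Because userdict sits above systemdict, a user definition hides the system one without changing it, which is how a program can redefine a primitive for its own use.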


Document conventions

Conventions built into all Postscript interpreters:
    The leading comment should be %!PS. That informs the interpreter that the file is a Postscript program.
    Each page of a document is usually bracketed by a save and a restore command to isolate that page from the effects of other pages.

%%DocumentFonts: a list of fonts used in the document

%%Title: an arbitrary string, the title of the document

%%Creator: the name of the program that created the file

%%CreationDate: the date and time of creation

%%Pages: the number of pages in the document

%%BoundingBox: the four values that represent the lower left and upper right corners of the page that are actually painted by the program. This allows the pages to be inserted into other documents.
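Because these conventions are ordinary comments, a tool can read them without running the program. The Python sketch below collects the header comments from the start of a file. The sample file, its creator name and the function name `parse_header_comments` are made up for illustration, and the sketch assumes the bounding box is given as four integers in the header.

```python
def parse_header_comments(text):
    """Collect %%Key: value comments from the start of a Postscript file."""
    lines = text.splitlines()
    if not lines or not lines[0].startswith("%!PS"):
        raise ValueError("not a Postscript file: missing %!PS leading comment")
    header = {}
    for line in lines[1:]:
        if not line.startswith("%%"):
            break                       # header comments end at the first other line
        key, sep, value = line[2:].partition(":")
        if sep:
            header[key.strip()] = value.strip()
    if "BoundingBox" in header:
        # Four numbers: lower-left x, y and upper-right x, y (assumed integers).
        header["BoundingBox"] = tuple(int(n) for n in header["BoundingBox"].split())
    return header

# Hypothetical file contents.
sample = """%!PS
%%Title: Sample report
%%Creator: hypothetical-writer 1.0
%%Pages: 2
%%BoundingBox: 0 0 612 792
/box {newpath 0 0 moveto} def
"""
header = parse_header_comments(sample)
```

A program that embeds this page in another document would need only the `BoundingBox` entry to know how much space to reserve.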


Postscript summary

Postscript was developed to be a virtual machine architecture that can be used to create printable documents. The Postscript for a document is not meant to be read by a programmer. However, the syntax is quite simple and easily understood.

Postscript has been developed further by Adobe with the creation of their Portable Document Format (PDF). PDF is a form of compressed Postscript. PDF readers are freely available over the Internet, and most Web browsers can display PDF files. PDF has become ubiquitous for the transmission and display of formatted documents.

Giving away PDF display programs was a shrewd move for Adobe because they sell the Acrobat program needed to create PDF documents.