

Texts 2: Markup languages, software for manipulating text

László Kálmán¹ Csaba Oravecz¹ Péter Szigetvári²

¹ Research Institute for Linguistics, Hungarian Academy of Sciences

² Department of English Linguistics, Eötvös Loránd University

Lecture 4 / 3 Oct 2007

Kálmán, Oravecz, Szigetvári Texts 2: Markup languages, software


abstract

this lecture tells you about

• ways of formatting electronic text

• important software for creating and manipulating electronic text

• the features and functions of such software


markup languages 1: ∗MLs

SGML (Standard Generalized Markup Language; ISO 8879)

a metalanguage used to define specific markup schemes (a system of tags)

HTML (Hypertext Markup Language)

an implementation of SGML, used for web documents

XML (Extensible Markup Language)

a simplified subset of SGML

XHTML (Extensible Hypertext Markup Language)

an implementation of XML, used for web documents (HTML : SGML = XHTML : XML)
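The practical difference between the SGML-based and the XML-based family can be tried out directly. The sketch below is an illustration in Python, not part of the lecture: it feeds the same kind of fragment to a lenient HTML reader and to a strict XML parser. The class TagCollector, the helper is_well_formed_xml and both sample strings are made up for this example.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser


class TagCollector(HTMLParser):
    """Lenient HTML reader: records start tags and never complains."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def is_well_formed_xml(text):
    """Return True if an XML parser accepts the text."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Hypothetical samples: the first omits end tags, the second closes everything.
html_style = "<p>first paragraph<br><p>second paragraph"
xhtml_style = "<div><p>first paragraph<br/></p><p>second paragraph</p></div>"

collector = TagCollector()
collector.feed(html_style)
print(collector.tags)                    # start tags seen by the lenient reader
print(is_well_formed_xml(html_style))    # unclosed elements are an XML error
print(is_well_formed_xml(xhtml_style))   # every element closed: accepted
```

HTML inherited from SGML the freedom to leave out some end tags, whereas XML (and hence XHTML) requires every element to be closed, which is why only the second string passes the XML parser.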


a chunk of SGML code

Figure: part of the source for the entry for quiz in Ország–Magay’s English–Hungarian dictionary


the entry for quiz printed

Figure: the printed entry for quiz in Ország–Magay’s English–Hungarian dictionary


example HTML code. . .

Figure: a sample HTML source file. . .


. . . shown in Firefox and Opera


. . . shown in w3m

Figure: . . . and its output in w3m (a CLI browser)


markup languages 2: lightweight

• less elaborate systems used for specific purposes, e.g.,

BBCode (Bulletin Board Code)

used on bulletin boards, like the SEAS Forum (btw. have you joined yet?); contains only some formatting (italics, boldface, colour, size), hyperlink tags, and emoticons (smilies)

Wikitext

used on Wiki sites; some formatting, links to other Wiki pages, external links, pictures, maps

• Wikitext is copiously documented in the relevant Wiki pages; BBCode is also usually explained in forum FAQs


editing BBCode

Figure: editing BBCode


the code itself. . .

The [b]sole[/b] aim of [i]this[/i] [u]message[/u]
is to [color=orange]exemplify[/color]
[code]BBCode[/code] for students of
[url=http://budling.nytud.hu/itcourse]this
course[/url].
[list=a][*]first item[*]second item[/list]
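How a forum might turn such code into what the browser displays can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the code of any actual bulletin board: the function bbcode_to_html and the table SIMPLE_TAGS are hypothetical, only the tags b, i, u, code, color and url are handled, and lists and emoticons are left out.

```python
import html
import re

# Hypothetical tag table for this sketch; real boards support more tags.
SIMPLE_TAGS = {"b": "b", "i": "i", "u": "u", "code": "code"}


def bbcode_to_html(text):
    """Translate a small BBCode subset into HTML."""
    # Escape &, <, > and quotes first so user text cannot smuggle in raw HTML.
    out = html.escape(text)
    for bb, tag in SIMPLE_TAGS.items():
        out = re.sub(rf"\[{bb}\](.*?)\[/{bb}\]", rf"<{tag}>\1</{tag}>", out, flags=re.S)
    out = re.sub(r"\[color=([#\w]+)\](.*?)\[/color\]",
                 r'<span style="color:\1">\2</span>', out, flags=re.S)
    out = re.sub(r"\[url=([^\]\s]+)\](.*?)\[/url\]",
                 r'<a href="\1">\2</a>', out, flags=re.S)
    return out


print(bbcode_to_html("The [b]sole[/b] aim of [i]this[/i] [u]message[/u]"))
print(bbcode_to_html("[url=http://budling.nytud.hu/itcourse]this course[/url]"))
```

The square-bracket tags are simply rewritten as the corresponding angle-bracket tags, which shows how thin the layer between BBCode and HTML is.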


forum post

Figure: . . . and the result


Wikipedia: Polcz Alaine

Figure: Wikipedia’s entry for Polcz Alaine


Wikipedia: editing Polcz Alaine

Figure: editing Wikipedia’s entry for Polcz Alaine


markup languages 3: TeX & co.

used for professional typesetting

TeX

a typesetting system created by Donald Knuth to typeset the second edition of the second volume of his book The Art of Computer Programming

LaTeX

a set of macros built upon the above to ease the user’s life

other TeXes

there are many other types of TeXes, e.g., AMS-TeX


TeX source . . .

Figure: plain TeX source file


. . . and the result

left margin
centre of line
right margin
one-third of line width

1) item 1
2) item 2
2/a) subitem within item 2
3) item 3

Figure: output of the above TeX source file


LaTeX source . . .

Figure: LaTeX source file (of the above output)


machine-generated text formats

RTF (Rich Text Format)

Microsoft’s proprietary platform-independent document format; human readable, but rarely edited directly

PostScript

a page description and programming language, the de facto standard for printing; human readable, editable

PDF (Portable Document Format)

Adobe’s proprietary document format, based on PostScript, encoding the exact look of the document; the most widespread format for publishing heavily formatted documents on the web; usually non-human-readable, compressed


RTF source and output

the RTF source. . .

{\rtf1\ansi{\fonttbl\f0\fswiss Helvetica;}\f0
Hello!\par
This is some {\b bold} text.\par
}

. . . outputs the following

Hello!
This is some bold text.
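RTF is called machine-generated because programs, not people, normally write it. A minimal Python sketch of such a generator follows; it is only an illustration, and the functions rtf_escape and make_rtf are hypothetical names. It assumes plain ASCII paragraphs and reuses the header of the sample above; bold text, other fonts and non-ASCII characters are not handled.

```python
def rtf_escape(text):
    """Escape the three characters that have a special meaning in RTF."""
    return text.replace("\\", "\\\\").replace("{", "\\{").replace("}", "\\}")


def make_rtf(paragraphs):
    """Wrap plain ASCII paragraphs in a minimal RTF document."""
    header = r"{\rtf1\ansi{\fonttbl\f0\fswiss Helvetica;}\f0" + "\n"
    # Each paragraph is closed with the \par control word.
    body = "".join(rtf_escape(p) + "\\par\n" for p in paragraphs)
    return header + body + "}"


print(make_rtf(["Hello!", "Braces {like these} must be escaped."]))
```

Saving the returned string in a file with the extension .rtf should give a document that a word processor can open.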


PostScript fragment

Figure: the first few of about 2M lines from the PostScript version of the present slide show


Portable Document Format fragment

Figure: the first few lines of the PDF version of the present slide show


types of markup

procedural markup

uses explicit instructions, like

• set this in italics: \textit{this}

• skip a line, set the text in larger boldface font, skip a line again: <br><p><font size=+1><b>Section title</b></font></p><br>

logical/descriptive/semantic/generic markup

• emphasize: \emph{this}

• typeset a section heading: <h1>Title of section</h1>


comparison of markup types

logical

• depends heavily on later interpretation (esp. in web documents)

• interpretation of markup has to be customized

• flexible on format: e.g., \emph{} produces italics in a roman context, and roman in an italic context

• style easily modifiable later

procedural

• firmer control over output

• less customization necessary

• (often) premature stance on format

• style modifiable by extensive replacement of markup
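Why logical markup makes the style "easily modifiable later" can be illustrated with a small Python sketch. Everything in it is hypothetical and chosen for the example: the two style tables WEB_STYLE and PLAIN_STYLE, the sample DOCUMENT and the function render. The document only names the role of each piece of text; the look is decided by whichever style table is plugged in.

```python
# Two hypothetical style sheets: each maps a logical role to a rendering rule.
WEB_STYLE = {
    "heading": lambda s: f"<h1>{s}</h1>\n",
    "emph": lambda s: f"<i>{s}</i>",
}
PLAIN_STYLE = {
    "heading": lambda s: s.upper() + "\n",
    "emph": lambda s: f"*{s}*",
}

# A document in logical markup: (role, text) pairs; role None means ordinary text.
DOCUMENT = [
    ("heading", "Title of section"),
    (None, "Read "),
    ("emph", "this"),
    (None, " first."),
]


def render(document, style):
    """Render every piece with the rule its role has in the given style."""
    return "".join(style[role](text) if role else text for role, text in document)


print(render(DOCUMENT, WEB_STYLE))
print(render(DOCUMENT, PLAIN_STYLE))
```

Changing the appearance means editing one table, whereas a procedurally marked-up document would need every formatting instruction replaced.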


types of editors

editors

• hex editor (for experts): shows “character” codes

• line editor (spartan, obsolete): can edit only one line of the text at a time, e.g., ed (Unix/Linux), Edlin (MS-DOS, Windows)

• text editor (formatting by markup), e.g., vi, Emacs, Notepad, Simple Text

• word processor (usually WYSIWYG), e.g., Microsoft Word, AbiWord, Open Office Writer


types of editors: a hex editor

Figure: Screenshot of Hex Editor (HHD Software) (from Wikipedia)


types of editors: a line editor

Figure: Screenshot of a GNU ed session


types of editors: a text editor

Figure: Screenshot of Emacs


types of editors: a word processor

Figure: Screenshot of Open Office Writer


borderline case 1

Figure: “WYSIWYG” in Emacs, a text editor


borderline case 2

Figure: markup in Open Office Writer, a WYSIWYG word processor


syntax highlighting 1

Figure: syntax highlighting for HTML in Emacs


syntax highlighting 2

Figure: syntax highlighting for shell scripts in Emacs


syntax highlighting 3

Figure: syntax highlighting for Perl in Emacs


text formatters

what’s that?

text formatters are programmes that feed on marked-up text, and produce formatted output from it, e.g.,

HTML INPUT (EATEN BY BROWSERS)      TeX INPUT            OUTPUT
<em>vis-&agrave;-vis</em>           {\it vis-\`a-vis}    vis-à-vis (in italics)
2<sup>2</sup>=<strong>4</strong>    $2^2={\bf 4}$        2² = 4 (with 4 in boldface)

some common text formatters

• RUNOFF (1964), nroff, troff, groff

• TeX, LaTeX

• web browsers contain an HTML formatter to be able to display HTML source files


ambiguous terminology

N.B. words like TeX are used ambiguously: both for the markup language and for the text formatting programme; this ambiguity does not normally cause any misunderstanding


a “definition” of word processor

a word processor

is a text editor and formatter in one

using a word processor as a text editor only

editing a file and saving it as plain text (.txt)

using a word processor as a text formatter only

opening a file and saving it as PDF (or PostScript)


concepts 1a

• opening/reading/retrieving a file: copying (part of) a file into the memory (this part of the memory will be called the buffer), and usually displaying (part of) it on the screen, so that it can be read or modified by the user

• opening a new file: presenting an empty buffer so that a file can be created from scratch

• saving/writing a file: writing the contents of the buffer to the disk (this usually destroys the original file, but see VERSION CONTROL in week 13)

• saving a file as: writing the contents of the buffer to the disk with a file name different from the original, or in a format different from the original (or from what the editor defaults to)
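The relation between a file on the disk and its buffer in memory can be made concrete with a short Python sketch. The class Buffer and its methods are hypothetical names for this illustration; real editors are far more elaborate (they may read only part of a file, keep backups, and so on).

```python
from pathlib import Path


class Buffer:
    """In-memory copy of a file; edits touch the disk only when saved."""

    def __init__(self, path=None):
        self.path = Path(path) if path is not None else None
        # Opening an existing file copies it into memory; a new file starts empty.
        if self.path is not None and self.path.exists():
            self.text = self.path.read_text(encoding="utf-8")
        else:
            self.text = ""
        self.modified = False

    def insert(self, text):
        """Append text to the buffer; the file on the disk is untouched."""
        self.text += text
        self.modified = True

    def save(self):
        """Write the buffer to the disk, replacing the original file."""
        if self.path is None:
            raise ValueError("no file name yet: use save_as")
        self.path.write_text(self.text, encoding="utf-8")
        self.modified = False

    def save_as(self, new_path):
        """Write the buffer under a different name and keep editing that file."""
        self.path = Path(new_path)
        self.save()
```

Until save or save_as is called, everything typed exists only in the buffer, which is why a crash before saving loses the changes.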


concepts 1b

• auto(matic) saving: regular automatic saving of the contents of the buffer to minimize data loss in case of power failure

• recovering a file: restoring the contents of an unsaved file from the automatically saved version

• quitting: in some editors quitting leaves the original files intact, i.e., all the changes you made in the session are lost (except if there was autosaving during the session)

• exiting: modern editors usually ask if unsaved buffers ought to be written to the disk; sometimes this does not happen if you shut down the computer: to be on the safe side you had always better save buffers manually and perhaps close the editor before shutting down the computer


concepts 2

• cursor: an underscore (_), vertical line (|), or rectangular box (█), which indicates the point where text will be entered if you begin to type; it may blink; often its shape is different depending on input mode (insert or overwrite)

• insert mode: typed text will be inserted, pushing out characters to the right (left in right-to-left scripts)

• overwrite mode: typed text will overwrite characters to the right (left in right-to-left scripts)

• mark: another point in the text; the region between the cursor and the mark is selected for some operation
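The two input modes and the notion of a region can be sketched in Python as operations on a string. The functions type_text and region are hypothetical helpers for this illustration, and the sketch assumes a left-to-right script with the cursor given as a character offset.

```python
def type_text(buffer, cursor, typed, overwrite=False):
    """Return (new_buffer, new_cursor) after typing `typed` at `cursor`."""
    if overwrite:
        # Typed characters replace the ones to the right of the cursor.
        new_buffer = buffer[:cursor] + typed + buffer[cursor + len(typed):]
    else:
        # Typed characters push the rest of the text to the right.
        new_buffer = buffer[:cursor] + typed + buffer[cursor:]
    return new_buffer, cursor + len(typed)


def region(buffer, cursor, mark):
    """Return the text between the cursor and the mark, in either order."""
    start, end = sorted((cursor, mark))
    return buffer[start:end]


print(type_text("markup", 4, "XY"))                  # insert mode
print(type_text("markup", 4, "XY", overwrite=True))  # overwrite mode
print(region("markup languages", 7, 0))              # mark before cursor
```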


concepts 2: cursor in insert mode

Figure: cursor in Open Office Writer in insert mode


concepts 2: cursor in overwrite mode

Figure: cursor in Open Office Writer in overwrite mode


concepts 2: selected text in Open Office Writer

Figure: selected text in Open Office Writer

concepts 2: selected text in Emacs

Figure: selected text in Emacs: mark is in line 895, column 0, cursor in line 896, column 11


concepts 3

• cutting/killing¹: removing the selected region from the text and putting it on the clipboard/kill ring

• copying: copying the selected region to the clipboard/kill ring

• pasting/yanking: copying the contents of the clipboard/kill ring into the buffer

¹ in each pair on this page, the first term is MS dialect, the second Emacs dialect
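The three operations can be sketched in Python with a small ring of saved regions. The class KillRing and its size of 3 are assumptions made for this illustration; with size 1 it behaves like a plain clipboard that remembers only the last item.

```python
from collections import deque


class KillRing:
    """Ring of saved regions; the newest entry is at index 0."""

    def __init__(self, size=3):
        # When the ring is full, the oldest entry is dropped automatically.
        self.ring = deque(maxlen=size)

    def copy(self, text, start, end):
        """Store the region without changing the text."""
        self.ring.appendleft(text[start:end])
        return text

    def kill(self, text, start, end):
        """Store the region and remove it from the text (cut)."""
        self.ring.appendleft(text[start:end])
        return text[:start] + text[end:]

    def yank(self, text, cursor):
        """Insert the most recently stored region at the cursor (paste)."""
        return text[:cursor] + self.ring[0] + text[cursor:]


ring = KillRing()
shorter = ring.kill("hello world", 5, 11)   # cut " world"
print(shorter)
print(ring.yank(shorter, 0))                # paste it at the beginning
```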

concepts 4

• find/search: looking for a given pattern in the buffer

• overwrapped search: continuing to look for occurrences of the pattern from the beginning of the file after the end of the file has been reached (or from the end in the case of reverse/backward searching)

• incremental search: looking for a given pattern on the fly, i.e., while the pattern is still being typed

• replace: removing the first given pattern from the buffer and inserting the second given pattern in its place

regular expressions

offer a very powerful tool for replacing patterns (more on them in week 10)
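Searching with wrap-around and replacing with a regular expression can be sketched in Python as follows. The functions wrapped_find and replace_all and the sample sentence are hypothetical; the sketch covers forward search only and uses Python's re module for the regular expression.

```python
import re


def wrapped_find(text, pattern, start=0):
    """Search forward from `start`; if nothing is found, wrap to the beginning."""
    pos = text.find(pattern, start)
    if pos == -1 and start > 0:
        pos = text.find(pattern, 0)
    return pos  # -1 means the pattern does not occur at all


def replace_all(text, pattern, replacement):
    """Replace every match of a regular expression."""
    return re.sub(pattern, replacement, text)


sample = "colour and color; colours too"
print(wrapped_find(sample, "colour", 5))    # next occurrence after offset 5
print(wrapped_find(sample, "colour", 20))   # nothing ahead, so the search wraps
print(replace_all(sample, r"colou?r", "COLOR"))
```

The pattern colou?r matches both spellings at once, which a plain find-and-replace would need two passes for.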


text justification

types of justification

centred

flush left, ragged right

flush right, ragged left

justified
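In a monospaced setting the four types can be reproduced with a few lines of Python. The first three use the built-in string methods ljust, rjust and center; justify is a hypothetical helper written for this sketch, and it assumes that the words fit into the given width.

```python
def justify(words, width):
    """Fully justify one line by spreading extra spaces between words."""
    if len(words) == 1:
        return words[0].ljust(width)
    gaps = len(words) - 1
    spaces = width - sum(len(w) for w in words)
    base, extra = divmod(spaces, gaps)
    # The leftmost `extra` gaps get one more space than the others.
    parts = [w + " " * (base + (1 if i < extra else 0)) for i, w in enumerate(words[:-1])]
    return "".join(parts) + words[-1]


line = "flush left"
print(repr(line.ljust(20)))     # flush left, ragged right
print(repr(line.rjust(20)))     # flush right, ragged left
print(repr(line.center(20)))    # centred
print(repr(justify("a fully justified line".split(), 30)))
```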


justification of monospace text

When a monospaced font is used, there is a way to justify text without inserting
extra spaces. Careful word choice allows the author to write with exactly eighty
characters per line, creating a visual effect of justification. Since many words
in English mean the same thing but are different lengths, it is just a matter of
trial and error to find the proper line length. For extra points, you should end
the last line after eighty characters as well, creating an invincible paragraph.


comparison of markup and WYSIWYG

markup

• daunting at first sight

• powerful (e.g., “put this one-third of the way between the two margins”)

• persuades user more effectively to use logical markup

• both on CLIs and GUIs

• uses fewer computer resources

• user sees everything in the file

WYSIWYG

• intuitive, easy at first sight

• “what you see is all you get”

• allows user to use primitive formatting techniques

• possible only on GUIs

• uses huge computer resources

• data in the file are hidden from user


a horrendous example

Figure: hanging and normal indentation: never do it this way!


hyphenation

hyphenation: the WYSIWYG way

points of hyphenation are calculated at the end of each line, they do not change later on, occasionally yielding very loose lines; paragraph-based hyphenation would burden the system too much, and would result in constantly flickering characters while text is entered

hyphenation: the TeX way

points of hyphenation are calculated for a whole paragraph, and recalculated several times, until the optimal solution is achieved (this is possible, because the calculation does not take place during the editing of the text)
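The contrast between deciding line by line and deciding for a whole paragraph can be sketched in Python. This is a heavily simplified illustration, not TeX's actual algorithm: it breaks only between words (no hyphenation), works with character counts, and uses an assumed cost, the squared number of unused columns on every line but the last. The function names greedy_breaks, optimal_breaks and badness are hypothetical.

```python
def greedy_breaks(words, width):
    """Fill each line as far as possible, never revisiting earlier lines."""
    lines, current = [], []
    for word in words:
        if current and len(" ".join(current + [word])) > width:
            lines.append(current)
            current = []
        current.append(word)
    if current:
        lines.append(current)
    return lines


def badness(lines, width):
    """Sum of squared unused space on every line except the last."""
    return sum((width - len(" ".join(line))) ** 2 for line in lines[:-1])


def optimal_breaks(words, width):
    """Choose breaks for the whole paragraph that minimise total badness."""
    n = len(words)
    best = [0.0] * (n + 1)   # best[i]: minimal badness for words[i:]
    nxt = [n] * (n + 1)      # nxt[i]: index where the line starting at i ends
    for i in range(n - 1, -1, -1):
        best[i] = float("inf")
        length = -1
        for j in range(i + 1, n + 1):
            length += len(words[j - 1]) + 1
            if length > width and j > i + 1:
                break        # line too long (a single long word is tolerated)
            cost = 0 if j == n else (width - length) ** 2
            if cost + best[j] < best[i]:
                best[i], nxt[i] = cost + best[j], j
    lines, i = [], 0
    while i < n:
        lines.append(words[i:nxt[i]])
        i = nxt[i]
    return lines


words = "aaa bb cc ddddd".split()
for breaker in (greedy_breaks, optimal_breaks):
    lines = breaker(words, 6)
    print([" ".join(line) for line in lines], badness(lines, 6))
```

In this toy example the greedy method leaves one very loose line, while the paragraph-wide method accepts a slightly loose first line to get a better overall result.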


data potentially hidden in a WYSIWYG document

• data about the owner of the word processor

• data about previous edits in the document

you can lose your job if you are unwary, e.g.,

Wrong version

Dobos Gabriella, head of the press department of the Budapest Chief Prosecutor's Office, has been dismissed and also relieved of her post as spokesperson. [. . . ]
At the same time, a mistake was made for which someone is responsible, since shortening the indictment and preparing the shortened version that does not violate personal data was the task of department head Dobos Gabriella. As the prosecution's inquiry found, Dobos did indeed shorten the indictment, but she only marked the parts to be deleted, and sent the shortened version to the Office of the Prosecutor General in such a way that the data had not been permanently deleted from it.
Thus, for an hour, anyone could complete the version posted on the internet (by pressing certain keys) with the parts cut from the full version. In this way anyone could also access the personal data, which violates the data protection act.

Source: http://index.hu/politika/belfold/kendh050830/
