Chapter 6: Text and Multimedia Languages and Properties

Introduction
- A document has a given syntax and structure
- It also has semantics
- It may have a presentation style associated with it
- Figure 6.1 depicts all these relationships
- A document can also have information about itself, called metadata

- The syntax of a document can express different elements, such as
  - structure
  - presentation style
  - semantics
- One or more of these elements may be given together
  - e.g. a structural element (such as a section) can have a fixed formatting style

- The syntax of a document can be
  - implicit in its content
  - expressed in a declarative language or a programming language (PL)
- The current trend is to use languages that
  - provide information on document structure, format, and semantics
  - are readable by both humans and computers
- SGML is one such language

Metadata
- Metadata is data about data
- Metadata associated with a text includes
  - author
  - date of publication
  - source of publication
  - document length (in pages, words, or bytes)
  - document genre (book, article, memo)
- The Machine Readable Cataloging Record (MARC) is the most widely used format for library records

- On the Web, metadata is used for many purposes
  - cataloging
  - content rating (e.g. to protect children from reading certain types of document)
  - intellectual property rights
  - digital signatures (for authentication)
  - privacy levels (who should or should not have access to a document)
  - applications to electronic commerce (EC), etc.

- A new standard for Web metadata is the Resource Description Framework (RDF)
- RDF allows the description of Web resources
  - a description consists of nodes and attached attribute/value pairs
  - nodes can be any Web resource, i.e. anything with a URI (URLs are one kind of URI)
  - attributes are properties of nodes, and their values are text strings or other nodes

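The node and attribute/value model can be sketched in a few lines of plain Python. This is only an illustration of the data model described on the slide, not RDF's actual serialization syntax and not any RDF library; the URIs, attribute names, and the helper functions `describe` and `is_node` are hypothetical.

```python
# Each statement is (node, attribute, value). A value is either a text
# string or another node; both are written here as plain Python strings.
# All URIs and attribute names below are made-up placeholders.
triples = [
    ("http://example.org/doc1", "title", "Chapter 6"),
    ("http://example.org/doc1", "creator", "http://example.org/people/a"),
    ("http://example.org/people/a", "name", "A. Author"),
]


def describe(node, statements):
    """Collect the attribute/value pairs attached to one node."""
    return {attr: value for subj, attr, value in statements if subj == node}


def is_node(value, statements):
    """Treat a value as a node if it is itself described somewhere."""
    return any(subj == value for subj, _, _ in statements)


doc = describe("http://example.org/doc1", triples)
print(doc)
print(is_node(doc["creator"], triples))  # the creator value is another node
print(is_node(doc["title"], triples))    # the title value is a text string
```

Because a value may itself be a node, descriptions chain together: following `creator` from the document leads to a second node with its own attributes.
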
Text
- With the advent of computers, it became necessary to code text in binary digits
- The first coding schemes were EBCDIC and ASCII
- For the internationalization of Oriental languages such as Chinese or Japanese (kanji), there is Unicode (ISO/IEC 10646), originally designed as a 16-bit code

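A small sketch of the difference between the coding schemes, using Python's built-in codecs. The sample strings are chosen only for illustration, and UTF-16 is used here as one concrete 16-bit-based Unicode encoding; the slide does not name a specific encoding form.

```python
text_ascii = "text"
text_kanji = "\u6f22\u5b57"  # the word "kanji" written in kanji

# ASCII uses one byte per character.
ascii_bytes = text_ascii.encode("ascii")
print(len(ascii_bytes))  # 4

# Kanji fall outside the ASCII range, so encoding them as ASCII fails.
try:
    text_kanji.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False
print(ascii_ok)  # False

# UTF-16 (big-endian, no byte-order mark) stores each of these two
# characters in a single 16-bit unit, i.e. two bytes per character.
utf16_bytes = text_kanji.encode("utf-16-be")
print(len(utf16_bytes))  # 4
print([hex(ord(ch)) for ch in text_kanji])  # ['0x6f22', '0x5b57']
```
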
Text Formats
- There is no single format for text documents
- In the past, IR systems would convert a document to an internal format
  - drawback: the content of the document can then no longer be changed
- Current IR systems have filters to handle the most popular document formats, in particular Word, WordPerfect, and FrameMaker

- Other text formats for document interchange
  - Rich Text Format (RTF): used by word processors; has ASCII syntax
  - Portable Document Format (PDF): developed for displaying and printing documents
  - Multipurpose Internet Mail Extensions (MIME): used to encode electronic mail

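As a rough illustration of how MIME encodes mail, the sketch below builds a message with Python's standard `email` package. The addresses, subject, file name, and attachment bytes are placeholders invented for the example; it shows one way a MIME message can be assembled, not how any particular mail system does it.

```python
from email.message import EmailMessage

msg = EmailMessage()
msg["From"] = "sender@example.org"   # placeholder address
msg["To"] = "reader@example.org"     # placeholder address
msg["Subject"] = "Chapter 6 notes"
msg.set_content("Plain-text body of the message.")

# Attach a small binary payload. MIME labels it with a content type and
# encodes it (base64 here) so that it can travel as text.
msg.add_attachment(
    b"%PDF-1.4 placeholder bytes",
    maintype="application",
    subtype="pdf",
    filename="notes.pdf",
)

print(msg.get_content_type())  # multipart/mixed
part_types = [part.get_content_type() for part in msg.iter_parts()]
print(part_types)  # ['text/plain', 'application/pdf']
```

Each part carries its own headers (content type, transfer encoding), which is what lets a single mail message mix text with other document formats.
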