Document Object Model (DOM) Cheng-Chia Chen

Post on 20-Dec-2015


Document Object Model(DOM)

Cheng-Chia Chen


What is DOM?

• DOM (Document Object Model)
• A tree-view data model of XML documents
• An API for XML document processing
– works across multiple languages
– language neutral
– defined in terms of CORBA IDL
– language-specific bindings supplied for ECMAScript, Java, ….


DOM (Document Object Model)

What is the Document Object Model of the following document?

<?xml version="1.0" encoding="UTF-8" ?>
<TABLE>
  <TBODY>
    <TR> <TD>紅樓夢 </TD> <TD>曹雪芹 </TD> </TR>
    <TR> <TD>三國演義 </TD> <TD>羅貫中 </TD> </TR>
  </TBODY>
</TABLE>


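Tree view (DOM view) of an XML Document

[Figure: tree view of the document above. The document node is the root, the element nodes (TABLE, TBODY, TR, TD) lie below it, and the text nodes (紅樓夢, 曹雪芹, 三國演義, 羅貫中) are the leaves.]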
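The tree view can also be printed by a short program. The following is a minimal sketch, assuming the JAXP parser bundled with the JDK; the class name TreeViewDemo, its helper methods, and the ASCII cell contents (used instead of the Chinese titles to avoid source-encoding issues) are illustrative and not part of the slides.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class TreeViewDemo {
    /** Returns an indented outline of the subtree rooted at the given node. */
    static String outline(Node node, String indent) {
        StringBuilder sb = new StringBuilder();
        // Only text nodes carry a value worth printing in this sketch.
        String value = node.getNodeType() == Node.TEXT_NODE
                ? " \"" + node.getNodeValue().trim() + "\"" : "";
        sb.append(indent).append(node.getNodeName()).append(value).append('\n');
        for (Node c = node.getFirstChild(); c != null; c = c.getNextSibling()) {
            sb.append(outline(c, indent + "  "));
        }
        return sb.toString();
    }

    /** Parses an XML string into a DOM Document (returned as a Node). */
    static Node parse(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            return builder.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String xml = "<TABLE><TBODY><TR><TD>Title</TD><TD>Author</TD></TR></TBODY></TABLE>";
        System.out.print(outline(parse(xml), ""));
    }
}
```

The first line printed is the document node (#document), followed by the element nodes and, at the deepest level, the #text nodes.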


Class/interface hierarchy of the DOM (Core) Level 1 and 2 specifications

Node
    Attr
    CharacterData
        Comment
        Text
            CDATASection
    Document
    DocumentFragment
    DocumentType
    Element
    Entity (general)
    EntityReference
    Notation
    ProcessingInstruction

Not derived from Node: DOMImplementation, NamedNodeMap, NodeList, DOMException


Possible children of different kinds of nodes

• Document
– Element (≤ 1), DocumentType (≤ 1), ProcessingInstruction, Comment
• Element, DocumentFragment, EntityReference, Entity
– Element, ProcessingInstruction, Comment, Text, CDATASection, EntityReference
• Attr
– Text, EntityReference
• Text, CDATASection, Comment, Notation, ProcessingInstruction, DocumentType
– are leaves (no children)

Notes:
1. An Attr is not a child of any element.
2. Entities and Notations defined in the DTD can be accessed via getEntities() and getNotations() of DocumentType.
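Two of these rules can be checked directly. This is a minimal sketch, assuming the JDK's built-in DOM implementation; the class and method names are illustrative. It shows that an attribute is reached through getAttributes() rather than getChildNodes(), and that a Document rejects a second element child.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Attr;
import org.w3c.dom.DOMException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class ChildRulesDemo {
    static Document newDocument() {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    /** An attribute belongs to its element but is not one of its children. */
    static boolean attrIsNotAChild() {
        Document doc = newDocument();
        Element td = doc.createElement("TD");
        doc.appendChild(td);
        td.setAttribute("lang", "zh");
        td.appendChild(doc.createTextNode("some text"));
        Attr lang = td.getAttributeNode("lang");
        return td.getChildNodes().getLength() == 1      // only the Text node
                && td.getAttributes().getLength() == 1  // the attribute lives here
                && lang.getParentNode() == null
                && lang.getOwnerElement() == td;
    }

    /** Returns the DOMException code raised when a second root element is added. */
    static short codeForSecondRootElement() {
        Document doc = newDocument();
        doc.appendChild(doc.createElement("first"));
        try {
            doc.appendChild(doc.createElement("second"));
            return 0; // not expected: a Document allows at most one Element child
        } catch (DOMException e) {
            return e.code;
        }
    }

    public static void main(String[] args) {
        System.out.println("Attr is not a child: " + attrIsNotAChild());
        System.out.println("Second root rejected: "
                + (codeForSecondRootElement() == DOMException.HIERARCHY_REQUEST_ERR));
    }
}
```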


Node and NodeType constants

public interface Node {
    // NodeType: there are 12 kinds of nodes
    public static final short ELEMENT_NODE                = 1;
    public static final short ATTRIBUTE_NODE              = 2;
    public static final short TEXT_NODE                   = 3;
    public static final short CDATA_SECTION_NODE          = 4;
    public static final short ENTITY_REFERENCE_NODE       = 5;
    public static final short ENTITY_NODE                 = 6;
    public static final short PROCESSING_INSTRUCTION_NODE = 7;
    public static final short COMMENT_NODE                = 8;
    public static final short DOCUMENT_NODE               = 9;
    public static final short DOCUMENT_TYPE_NODE          = 10;
    public static final short DOCUMENT_FRAGMENT_NODE      = 11;
    public static final short NOTATION_NODE               = 12;
    // attributes and methods are listed on the following slides
}


IDL2Java: mapping of IDL attributes

// syntax of IDL attributes:
[readonly] attribute <type> <attrName>;   [// raises(<exception>) ]*

// which we abbreviate as
<type>[R]:<attrName>

Each attribute is translated into one or two Java methods:
• public <type> get<AttrName>() [throws <exceptions>];
  if it is readable, and
• public void set<AttrName>(<type> <newAttrValue>) [throws <exceptions>];
  if it is writable.


Example:
• The following attributes of the Node interface:
    readonly attribute DOMString nodeName;
    attribute DOMString nodeValue;
        // raises(DOMException) on setting
        // raises(DOMException) on retrieval
    readonly attribute Node parentNode;
  are abbreviated as
    String[R]:nodeName,
    String:nodeValue,
    Node[R]:parentNode,
  respectively, and are mapped to 4 Java methods:
    public String getNodeName();
    public String getNodeValue() throws DOMException;
    public void setNodeValue(String nodeValue) throws DOMException;
    public Node getParentNode();


Node attributes

// nodeName, nodeType and nodeValue

• String[R] : nodeName;

• short[R] : nodeType;

• String : nodeValue;

// raises(DOMException) on get/set

// namespace support: DOM2 only

• String[R] : namespaceURI;

• String[R] : localName;

• String : prefix;

// node owner:

• Document[R]: ownerDocument;


Values of nodeName, nodeValue and attributes in a Node

Interface              nodeName                   nodeValue                 attributes
Attr                   name of attribute          value of attribute        null
CDATASection           #cdata-section             content                   null
Comment                #comment                   content                   null
Document               #document                  null                      null
DocumentFragment       #document-fragment         null                      null
DocumentType           document type name         null                      null
Element                tag name                   null                      NamedNodeMap
Entity                 entity name                null                      null
EntityReference        name of entity referenced  null                      null
Notation               notation name              null                      null
ProcessingInstruction  target                     content excluding target  null
Text                   #text                      content of the text node  null
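Several rows of this table can be reproduced with a few factory calls. The sketch below assumes the JDK's built-in DOM implementation; the class name NodeNameValueDemo and the sample names and values ("TD", "id", "xml-stylesheet", and so on) are illustrative only.

```java
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

class NodeNameValueDemo {
    static String describe(Node n) {
        return n.getNodeName() + " = " + n.getNodeValue();
    }

    /** Builds one node of several kinds and reports "nodeName = nodeValue" for each. */
    static List<String> rows() {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
        Element td = doc.createElement("TD");
        td.setAttribute("id", "42");

        List<String> rows = new ArrayList<>();
        rows.add(describe(doc));                                  // Document
        rows.add(describe(td));                                   // Element
        rows.add(describe(td.getAttributeNode("id")));            // Attr
        rows.add(describe(doc.createTextNode("hello")));          // Text
        rows.add(describe(doc.createComment("note")));            // Comment
        rows.add(describe(doc.createCDATASection("raw")));        // CDATASection
        rows.add(describe(doc.createProcessingInstruction(
                "xml-stylesheet", "href=\"a.css\"")));            // ProcessingInstruction
        rows.add(describe(doc.createDocumentFragment()));         // DocumentFragment
        return rows;
    }

    public static void main(String[] args) {
        rows().forEach(System.out::println);
    }
}
```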


Node attributes (cont’d)

// node relatives
• Node[R] : parentNode, firstChild, lastChild;
• Node[R] : previousSibling, nextSibling;
• NodeList[R] : childNodes;
• NamedNodeMap[R] : attributes;

[Figure: a node ("this") with its parentNode above it, previousSibling and nextSibling on either side, and firstChild, lastChild and the childNodes list below it.]
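The relatives can be used to walk a tree without a NodeList. A minimal sketch, assuming the JDK's bundled parser; SiblingWalkDemo and its methods are illustrative names.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class SiblingWalkDemo {
    /** Parses an XML string and returns its document element. */
    static Element parseRoot(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    /** Lists the node names of all children, following firstChild/nextSibling links. */
    static String childNames(Node parent) {
        StringBuilder sb = new StringBuilder();
        for (Node c = parent.getFirstChild(); c != null; c = c.getNextSibling()) {
            if (sb.length() > 0) sb.append(',');
            sb.append(c.getNodeName());
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Element root = parseRoot("<r><a/><b/><c/></r>");
        System.out.println(childNames(root));
        Node last = root.getLastChild();
        System.out.println(last.getPreviousSibling().getNodeName());
        System.out.println(last.getParentNode().getNodeName());
    }
}
```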


Node manipulation and testing methods

public Node insertBefore(Node newChild, Node refChild)
public Node replaceChild(Node newChild, Node oldChild)
public Node removeChild(Node oldChild)
public Node appendChild(Node newChild)
// all 4 methods above throw DOMException
public boolean hasChildNodes();
public Node cloneNode(boolean deep);
// Introduced in DOM Level 2:
public boolean hasAttributes();   // true if this node is an element and has attributes
public void normalize();          // merge adjacent Text nodes in the subtree into one
public boolean isSupported(String feature, String version);
    // same as hasFeature(feature, version) in DOMImplementation
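The editing methods can be tried on a small tree. This is a sketch under the assumption that the JDK's built-in DOM implementation is used; NodeEditDemo and the element names are illustrative.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

class NodeEditDemo {
    static Document newDocument() {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Comma-separated node names of the children of parent. */
    static String names(Node parent) {
        NodeList kids = parent.getChildNodes();
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < kids.getLength(); i++) {
            if (i > 0) sb.append(',');
            sb.append(kids.item(i).getNodeName());
        }
        return sb.toString();
    }

    /** Applies append, insertBefore, replaceChild and removeChild, recording each state. */
    static String editTrace() {
        Document doc = newDocument();
        Element list = doc.createElement("list");
        doc.appendChild(list);
        StringBuilder trace = new StringBuilder();

        Element b = doc.createElement("b");
        list.appendChild(b);
        trace.append(names(list));                        // b
        list.insertBefore(doc.createElement("a"), b);
        trace.append('|').append(names(list));            // a,b
        list.replaceChild(doc.createElement("c"), b);
        trace.append('|').append(names(list));            // a,c
        list.removeChild(list.getFirstChild());
        trace.append('|').append(names(list));            // c
        return trace.toString();
    }

    /** Two adjacent Text nodes become one after normalize(). */
    static int textNodesAfterNormalize() {
        Document doc = newDocument();
        Element p = doc.createElement("p");
        doc.appendChild(p);
        p.appendChild(doc.createTextNode("x"));
        p.appendChild(doc.createTextNode("y"));
        p.normalize();
        return p.getChildNodes().getLength();
    }

    public static void main(String[] args) {
        System.out.println(editTrace());
        System.out.println(textNodesAfterNormalize());
    }
}
```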


NodeList and NamedNodeMap

public interface NodeList {   // access node collection by index
    public Node item(int index);   // zero-based
    public int getLength();
}

public interface NamedNodeMap {
    public Node getNamedItem(String name);   // by nodeName
    public Node setNamedItem(Node arg) throws DOMException;
        // insert/replace node with nodeName = arg.getNodeName()
    public Node removeNamedItem(String name) throws DOMException;
    public Node item(int index);
    public int getLength();
    // Introduced in DOM Level 2:
    public Node getNamedItemNS(String namespaceURI, String localName);
    public Node setNamedItemNS(Node arg) throws DOMException;
    public Node removeNamedItemNS(String namespaceURI, String localName) throws DOMException;
}
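Both collections are accessed by index, and NamedNodeMap additionally by name. The sketch below assumes the JDK's bundled parser; NodeCollectionsDemo and the sample markup are illustrative. Because the item order of a NamedNodeMap is not specified, the attributes are copied into a sorted map before printing. The second method relies on the NodeList returned by getElementsByTagName being live.

```java
import java.io.StringReader;
import java.util.TreeMap;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class NodeCollectionsDemo {
    static Element sample() {
        String xml = "<TR id=\"r1\" class=\"odd\"><TD>a</TD><TD>b</TD></TR>";
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    /** Reads all attributes by index and returns them sorted by name. */
    static String attributesSorted(Element e) {
        NamedNodeMap map = e.getAttributes();
        TreeMap<String, String> sorted = new TreeMap<>();
        for (int i = 0; i < map.getLength(); i++) {
            Node attr = map.item(i);
            sorted.put(attr.getNodeName(), attr.getNodeValue());
        }
        return sorted.toString();
    }

    /** The list obtained before the change also reflects the element added afterwards. */
    static int liveCount() {
        Element tr = sample();
        NodeList cells = tr.getElementsByTagName("TD");
        tr.appendChild(tr.getOwnerDocument().createElement("TD"));
        return cells.getLength();
    }

    public static void main(String[] args) {
        Element tr = sample();
        System.out.println(attributesSorted(tr));
        System.out.println(tr.getAttributes().getNamedItem("id").getNodeValue());
        System.out.println(liveCount());
    }
}
```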


Element

public interface Element extends Node {
    public String getTagName();   // String[R]:tagName; same value as getNodeName()
    public String getAttribute(String name);   // returns the attribute value
    // set/replace attr; value is not parsed; for a value with entity references,
    // use setAttributeNode instead
    public void setAttribute(String name, String value) throws DOMException;
    public void removeAttribute(String name) throws DOMException;
    public Attr getAttributeNode(String name);
    public Attr setAttributeNode(Attr newAttr)   // add/replace newAttr;
        throws DOMException;                     // returns the replaced attr or null
    public Attr removeAttributeNode(Attr oldAttr) throws DOMException;
    public NodeList getElementsByTagName(String name);
    // and additional DOM2 methods …


Additional Element methods in DOM2

    // Introduced in DOM Level 2:
    String getAttributeNS(String namespaceURI, String localName);
    void setAttributeNS(String namespaceURI, String qualifiedName, String value)
        throws DOMException;   // set/replace attribute; value not parsed
    void removeAttributeNS(String namespaceURI, String localName) throws DOMException;
    Attr getAttributeNodeNS(String namespaceURI, String localName);
    Attr setAttributeNodeNS(Attr newAttr) throws DOMException;
    NodeList getElementsByTagNameNS(String namespaceURI, String localName);
    boolean hasAttribute(String name);
    boolean hasAttributeNS(String namespaceURI, String localName);
}
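The namespace-aware methods take a namespace URI plus a qualified name when setting and a namespace URI plus a local name when reading. A minimal sketch, assuming the JDK's built-in DOM implementation; NamespaceDemo, the prefixes, and the URI http://example.org/meta are hypothetical choices for illustration.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class NamespaceDemo {
    static final String XHTML = "http://www.w3.org/1999/xhtml";
    static final String META = "http://example.org/meta"; // hypothetical namespace

    /** Creates <h:table m:source="library"/> with both names bound to namespaces. */
    static Element build() {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
        Element table = doc.createElementNS(XHTML, "h:table");   // qualified name
        doc.appendChild(table);
        table.setAttributeNS(META, "m:source", "library");       // qualified name
        return table;
    }

    public static void main(String[] args) {
        Element table = build();
        System.out.println(table.getPrefix() + " / " + table.getLocalName()
                + " / " + table.getNamespaceURI());
        // Lookups use the local name, not the qualified name.
        System.out.println(table.getAttributeNS(META, "source"));
        System.out.println(table.hasAttributeNS(META, "source"));
        System.out.println(table.getAttributeNodeNS(META, "source").getName());
    }
}
```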


The Document node

public interface Document extends Node {
    // 3 attributes:
    DocumentType[R]: doctype;
    DOMImplementation[R]: implementation;
    Element[R]: documentElement;

    // factory methods: <nodetype> create<nodetype>(data);
    Element createElement(String tagName) throws DOMException;
    DocumentFragment createDocumentFragment();
    Text createTextNode(String data);
    Comment createComment(String data);
    CDATASection createCDATASection(String data) throws DOMException;
    ProcessingInstruction createProcessingInstruction(String target, String data)
        throws DOMException;


the Document node (cont’d)

    Attr createAttribute(String name) throws DOMException;
    EntityReference createEntityReference(String name) throws DOMException;
    // end of factory methods
    NodeList getElementsByTagName(String tagname);
    // DOM 2
    Node importNode(Node importedNode, boolean deep) throws DOMException;
    Element createElementNS(String namespaceURI, String qualifiedName) throws DOMException;
    Attr createAttributeNS(String namespaceURI, String qualifiedName) throws DOMException;
    NodeList getElementsByTagNameNS(String namespaceURI, String localName);
    public Element getElementById(String elementId);
}
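Nodes are always created by, and owned by, a particular Document; importNode makes a copy that belongs to another one. The following sketch assumes the JDK's built-in DOM implementation; DocumentFactoryDemo and the sample content are illustrative.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

class DocumentFactoryDemo {
    static Document newDocument() {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Builds TABLE > (comment, TR > TD > text) with the factory methods. */
    static Document buildTable() {
        Document doc = newDocument();
        Element table = doc.createElement("TABLE");
        doc.appendChild(table);
        table.appendChild(doc.createComment("one row"));
        Element tr = doc.createElement("TR");
        table.appendChild(tr);
        Element td = doc.createElement("TD");
        tr.appendChild(td);
        td.appendChild(doc.createTextNode("cell"));
        return doc;
    }

    /** Deep-copies the document element of source into a fresh document. */
    static Document copyOf(Document source) {
        Document target = newDocument();
        Node copy = target.importNode(source.getDocumentElement(), true);
        target.appendChild(copy);
        return target;
    }

    public static void main(String[] args) {
        Document original = buildTable();
        Document copy = copyOf(original);
        System.out.println(copy.getDocumentElement().getOwnerDocument() == copy);
        System.out.println(copy.getElementsByTagName("TD").item(0).getFirstChild().getNodeValue());
        // The source tree is left untouched by importNode.
        System.out.println(original.getDocumentElement().getNodeName());
    }
}
```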


CharacterData

public interface CharacterData extends Node {
    public String getData() throws DOMException;
    public void setData(String data) throws DOMException;
    public int getLength();
    public String substringData(int offset, int count) throws DOMException;
    public void appendData(String arg) throws DOMException;
    public void insertData(int offset, String arg) throws DOMException;
    public void deleteData(int offset, int count) throws DOMException;
    public void replaceData(int offset, int count, String arg) throws DOMException;
}
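The editing methods of CharacterData work on character offsets. A short sketch, assuming the JDK's built-in DOM implementation; CharacterDataDemo and the sample string are illustrative.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Text;

class CharacterDataDemo {
    /** Applies each CharacterData method in turn and records the intermediate data. */
    static String trace() {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
        Text text = doc.createTextNode("Hello World");   // Text extends CharacterData
        StringBuilder trace = new StringBuilder();

        trace.append(text.substringData(0, 5));            // Hello
        text.appendData("!");
        trace.append('|').append(text.getData());          // Hello World!
        text.insertData(5, ",");
        trace.append('|').append(text.getData());          // Hello, World!
        text.deleteData(0, 7);
        trace.append('|').append(text.getData());          // World!
        text.replaceData(0, 5, "DOM");
        trace.append('|').append(text.getData());          // DOM!
        trace.append('|').append(text.getLength());        // 4
        return trace.toString();
    }

    public static void main(String[] args) {
        System.out.println(trace());
    }
}
```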


Attr, Text and Comment

public interface Attr extends Node {
    public String getName();
    public boolean getSpecified();
    public String getValue();
    public void setValue(String value) throws DOMException;
    public Element getOwnerElement();   // DOM2
}

public interface Text extends CharacterData {
    public Text splitText(int offset) throws DOMException;
}

public interface Comment extends CharacterData { }
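splitText cuts a Text node at an offset and inserts the tail as the next sibling. The sketch below, which assumes the JDK's built-in DOM implementation and uses illustrative names, also reads back the Attr properties listed above.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Attr;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Text;

class SplitTextDemo {
    static Document newDocument() {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Splits "HelloWorld" at offset 5 and reports head, tail, child count and linkage. */
    static String split() {
        Document doc = newDocument();
        Element p = doc.createElement("p");
        doc.appendChild(p);
        Text whole = doc.createTextNode("HelloWorld");
        p.appendChild(whole);
        Text tail = whole.splitText(5);   // whole keeps "Hello", tail holds "World"
        return whole.getData() + "|" + tail.getData() + "|"
                + p.getChildNodes().getLength() + "|" + (whole.getNextSibling() == tail);
    }

    /** Reports name, value, specified flag and owner of an explicitly set attribute. */
    static String attrInfo() {
        Document doc = newDocument();
        Element td = doc.createElement("TD");
        td.setAttribute("id", "42");
        Attr id = td.getAttributeNode("id");
        return id.getName() + "|" + id.getValue() + "|" + id.getSpecified()
                + "|" + (id.getOwnerElement() == td);
    }

    public static void main(String[] args) {
        System.out.println(split());
        System.out.println(attrInfo());
    }
}
```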


CDATASection, DocumentType and Notation

public interface CDATASection extends Text { }

public interface DocumentType extends Node {
    String getName();
    NamedNodeMap getEntities();    // general entities (internal/external) only;
                                   // parameter entities excluded
    NamedNodeMap getNotations();
    // DOM2 only methods
    String getPublicId();          // publicId and
    String getSystemId();          // systemId of the external subset, if any
    String getInternalSubset();    // internal subset as a string
}

public interface Notation extends Node {
    public String getPublicId();
    public String getSystemId();
}


Entity, EntityReference and ProcessingInstruction

public interface Entity extends Node {   // for general or unparsed entities only
    public String getPublicId();
    public String getSystemId();
    public String getNotationName();
}
// An Entity's replacement text is stored as its childNodes, if available.

public interface EntityReference extends Node { }

public interface ProcessingInstruction extends Node {
    public String getTarget();
    public String getData();
    public void setData(String data) throws DOMException;
}


DOMException

public abstract class DOMException extends RuntimeException {

public DOMException(short code, String message) {

super(message); this.code = code; }

public short code;

// ExceptionCode

public static final short INDEX_SIZE_ERR = 1;

public static final short DOMSTRING_SIZE_ERR = 2;

public static final short HIERARCHY_REQUEST_ERR = 3;

public static final short WRONG_DOCUMENT_ERR = 4;

public static final short INVALID_CHARACTER_ERR = 5;

public static final short NO_DATA_ALLOWED_ERR = 6;

public static final short NO_MODIFICATION_ALLOWED_ERR = 7;

public static final short NOT_FOUND_ERR = 8;

public static final short NOT_SUPPORTED_ERR = 9;

public static final short INUSE_ATTRIBUTE_ERR = 10;


DOMException

// DOM2 only DOMException code

public static final short INVALID_STATE_ERR = 11;

public static final short SYNTAX_ERR = 12;

public static final short INVALID_MODIFICATION_ERR = 13;

public static final short NAMESPACE_ERR = 14;

public static final short INVALID_ACCESS_ERR = 15;}
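Since DOMException is a RuntimeException, callers usually catch it and inspect the public code field. The sketch below provokes four of the codes above; it assumes the JDK's built-in DOM implementation with its default error checking, and DomExceptionDemo with its helper methods is an illustrative name.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.DOMException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Text;

class DomExceptionDemo {
    static Document newDocument() {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Runs the action and returns the DOMException code it raises, or 0 if none. */
    static short codeOf(Runnable action) {
        try {
            action.run();
            return 0;
        } catch (DOMException e) {
            return e.code;
        }
    }

    /** Appending a node created by another document without importNode. */
    static short wrongDocument() {
        Element host = newDocument().createElement("host");
        Element foreign = newDocument().createElement("foreign");
        return codeOf(() -> host.appendChild(foreign));
    }

    /** Removing a node that is not a child of the given parent. */
    static short notFound() {
        Document doc = newDocument();
        Element parent = doc.createElement("parent");
        Element orphan = doc.createElement("orphan");
        return codeOf(() -> parent.removeChild(orphan));
    }

    /** Creating an element whose name is not a legal XML name. */
    static short invalidCharacter() {
        Document doc = newDocument();
        return codeOf(() -> doc.createElement("1bad"));
    }

    /** Asking for a substring beyond the end of the data. */
    static short indexSize() {
        Text text = newDocument().createTextNode("short");
        return codeOf(() -> text.substringData(99, 1));
    }

    public static void main(String[] args) {
        System.out.println(wrongDocument() == DOMException.WRONG_DOCUMENT_ERR);
        System.out.println(notFound() == DOMException.NOT_FOUND_ERR);
        System.out.println(invalidCharacter() == DOMException.INVALID_CHARACTER_ERR);
        System.out.println(indexSize() == DOMException.INDEX_SIZE_ERR);
    }
}
```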


DOMImplementation and DocumentFragment

public interface DOMImplementation {
    public boolean hasFeature(String feature, String version);
    public DocumentType createDocumentType(String qName, String publicId, String systemId)
        throws DOMException;
    public Document createDocument(
        String namespaceURI,    // namespace URI of the document element
        String qName,           // QName of the document element
        DocumentType doctype) throws DOMException;
}

public interface DocumentFragment extends Node { }
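A DOMImplementation can create a document together with its document element and doctype in one step. This is a minimal sketch that obtains the implementation through JAXP's DocumentBuilder; DomImplementationDemo and the system identifier "table.dtd" are hypothetical.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.DOMImplementation;
import org.w3c.dom.Document;
import org.w3c.dom.DocumentType;

class DomImplementationDemo {
    /** Creates a document whose root element is TABLE, with a matching doctype. */
    static Document create() {
        DOMImplementation impl;
        try {
            impl = DocumentBuilderFactory.newInstance().newDocumentBuilder().getDOMImplementation();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
        // No public identifier; "table.dtd" is a made-up system identifier.
        DocumentType doctype = impl.createDocumentType("TABLE", null, "table.dtd");
        return impl.createDocument(null, "TABLE", doctype);
    }

    public static void main(String[] args) {
        Document doc = create();
        System.out.println(doc.getDocumentElement().getTagName());
        System.out.println(doc.getDoctype().getName() + " " + doc.getDoctype().getSystemId());
        System.out.println(doc.getImplementation().hasFeature("XML", "1.0"));
    }
}
```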


Legal feature strings

Module                                            Feature String
XML                                               XML
HTML                                              HTML
Views                                             Views
StyleSheets                                       StyleSheets
CSS                                               CSS
CSS (extended interfaces)                         CSS2
Events                                            Events
User Interface Events (UIEvent interface)         UIEvents
Mouse Events (MouseEvent interface)               MouseEvents
Mutation Events (MutationEvent interface)         MutationEvents
HTML Events                                       HTMLEvents
Traversal                                         Traversal
Range                                             Range


Module dependence

Module Implies

Views XML or HTML

StyleSheets StyleSheets and XML or HTML

CSS StyleSheets, Views and XML or HTML

CSS2 CSS, StyleSheets, Views and XML or HTML

Events XML or HTML

UIEvents Views, Events and XML or HTML

MouseEvents UIEvents, Views, Events and XML or HTML

MutationEvents Events and XML or HTML

HTMLEvents Events and XML or HTML


DOMParsers and DOMImplementations

Problems:

• How to get a DOM object from an XML document?
– use a DOMParser
• How to construct DOM objects directly by program?
– get a DOMImplementation
• How to get a DOM object from an XML document and modify it by program?
– use a DOMParser, then get the DOMImplementation from the DOM object.

[Figure: XML Document → DOMParser → DOM Document]


Use Apache's Xerces for DOM

• XML2DOM:
// the DOM parser implementation class is org.apache.xerces.parsers.DOMParser
DOMParser parser = new DOMParser();
parser.setFeature("http://xml.org/sax/features/validation", true);
parser.setFeature("http://xml.org/sax/features/namespaces", true);
…
parser.parse(url_or_inputSource);
Document doc = parser.getDocument();
DOMImplementation impl = doc.getImplementation();

• Construct DOM from scratch:
// the DOMImplementation class is org.apache.xerces.dom.DOMImplementationImpl
DOMImplementation dm = new DOMImplementationImpl();
// or dm = DOMImplementationImpl.getDOMImplementation(); // non-DOM (Xerces-specific) method
Document doc = dm.createDocument(…);
Element e = doc.createElement(…);
Attr attr = doc.createAttributeNS(…);
Text txt = doc.createTextNode("…");


JAXP (Java API for XML Processing) 1.1

• Sun's Java API for XML Processing
• three modules:
– for DOM processing
– for SAX processing
– for transformation
• 5 packages
1. javax.xml.parsers
– Provides classes allowing the processing of XML documents.
– Two types of pluggable parsers are supported:
– SAX (Simple API for XML)
– DOM (Document Object Model)
2. javax.xml.transform (+ javax.xml.transform.dom, javax.xml.transform.sax, javax.xml.transform.stream)
– APIs for processing transformation instructions and performing a transformation from source to result.


JAXP's DOM pluggability mechanism


JAXP API for DOM

• javax.xml.parsers.DocumentBuilder
– Defines the API to obtain DOM Document instances from an XML document. Using this class, an application programmer can obtain a Document from XML.
• javax.xml.parsers.DocumentBuilderFactory
– Defines a factory API that enables applications to obtain a parser that produces DOM object trees from XML documents.
– abstract class
– A concrete subclass can be obtained by the static method DocumentBuilderFactory.newInstance().
– The desired capabilities of the parser can be specified by setting the various properties of the obtained factory instance.


Example Code

import java.io.IOException;
import javax.xml.parsers.*;
import org.w3c.dom.Document;
import org.xml.sax.SAXException;

DocumentBuilder builder;
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
factory.setNamespaceAware(true);
factory.setValidating(true);
String location = "http://myserver/mycontent.xml";
try {
    builder = factory.newDocumentBuilder();
    Document doc1 = builder.parse(location);   // parse the document at the given URI
    Document doc2 = builder.newDocument();     // empty document
} catch (SAXException se) {
    // handle error
} catch (IOException ioe) {
    // handle error
} catch (ParserConfigurationException pce) {
    // handle error
}


javax.xml.parsers.DocumentBuilder

• abstract DOMImplementation getDOMImplementation()
– Obtain an instance of a DOMImplementation object.
• abstract Document newDocument()
– Obtain a new instance of a DOM Document object to build a DOM tree with.
• abstract boolean isNamespaceAware()
– Indicates whether or not this parser is configured to understand namespaces.
• abstract boolean isValidating()
– Indicates whether or not this parser is configured to validate XML documents.
• Document parse(File | InputSource | InputStream [, systemId] | uriString)
– Parse the content of the given source as an XML document and return a new DOM Document object.
• abstract void setEntityResolver(EntityResolver er)
– Specify the EntityResolver to be used to resolve entities present in the XML document to be parsed.
• abstract void setErrorHandler(ErrorHandler eh)
– Specify the ErrorHandler to be used to report errors present in the XML document to be parsed.


javax.xml.parsers.DocumentBuilderFactory

• Object getAttribute(String name)
• void setAttribute(String name, Object value)
– Allow the user to get/set specific attributes on the underlying implementation.
• boolean isIgnoringComments(), void setIgnoringComments(boolean)
– Indicates (or sets) whether the factory is configured to produce parsers that ignore comments.
• Other properties:
– IgnoringElementContentWhitespace; ExpandEntityReferences;
– Coalescing; // merge adjacent Text and CDATA nodes into one Text node
– NamespaceAware; Validating;
• abstract DocumentBuilder newDocumentBuilder()
– Creates a new instance of a DocumentBuilder using the currently configured parameters.
• static DocumentBuilderFactory newInstance()
– Obtain a new instance of a DocumentBuilderFactory.
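The effect of two of these properties can be observed by counting the children of a small element. The sketch below assumes the parser bundled with the JDK; FactoryOptionsDemo and the sample input are illustrative, and other JAXP implementations may split text differently when Coalescing is off.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class FactoryOptionsDemo {
    // Element content: text, a CDATA section, more text, and a comment.
    static final String XML = "<r>a<![CDATA[b]]>c<!--x--></r>";

    /** Parses XML with the given factory settings and counts the children of <r>. */
    static int childCount(boolean coalescing, boolean ignoringComments) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setCoalescing(coalescing);
            factory.setIgnoringComments(ignoringComments);
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(XML)));
            return doc.getDocumentElement().getChildNodes().getLength();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Defaults: Text, CDATASection, Text and Comment are kept apart.
        System.out.println(childCount(false, false));
        // Coalescing merges the CDATA section into the text; the comment is dropped.
        System.out.println(childCount(true, true));
    }
}
```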


How DocumentBuilderFactory finds its instance

DocumentBuilderFactory.newInstance() tries the following in order:

1. Use the javax.xml.parsers.DocumentBuilderFactory system property.
2. Use the same property in the file "lib/jaxp.properties" in the JRE directory.
3. Look for the class name in the file META-INF/services/javax.xml.parsers.DocumentBuilderFactory in jars available to the runtime.
4. Fall back to the platform default DocumentBuilderFactory instance, which is "org.apache.crimson.jaxp.DocumentBuilderFactoryImpl" for JAXP 1.1 and Crimson 1.1.


Bootstrap DOM (Level 3 Core)

• Problem: how to get a DOMImplementation?
– implementation dependent prior to Level 3
– Xerces => org.apache.xerces.dom.DOMImplementationImpl
– Crimson => org.apache.crimson.tree.DOMImplementationImpl
• two supporting classes/interfaces:
– DOMImplementationRegistry
– DOMImplementationSource

public interface DOMImplementationSource {
    DOMImplementation getDOMImplementation(String features);
}


DOMImplementationRegistry

import java.util.StringTokenizer;
import java.util.Vector;

public class DOMImplementationRegistry {
    // The system property to specify the DOMImplementationSource class names.
    public static String PROPERTY = "org.w3c.dom.DOMImplementationSourceList";

    private static Vector sources = new Vector();
    private static boolean initialized = false;

    private static void initialize()
            throws ClassNotFoundException, InstantiationException, IllegalAccessException {
        initialized = true;
        String p = System.getProperty(PROPERTY);
        if (p == null) { return; }
        StringTokenizer st = new StringTokenizer(p);
        while (st.hasMoreTokens()) {
            Object source = Class.forName(st.nextToken()).newInstance();
            sources.addElement(source);
        }
    }

    public static DOMImplementation getDOMImplementation(String features)
            throws ClassNotFoundException, InstantiationException, IllegalAccessException {
        if (!initialized) { initialize(); }
        int len = sources.size();
        for (int i = 0; i < len; i++) {
            DOMImplementationSource source = (DOMImplementationSource) sources.elementAt(i);
            DOMImplementation impl = source.getDOMImplementation(features);
            if (impl != null) { return impl; }
        }
        return null;
    }

    /* Register an implementation. */
    public static void addSource(DOMImplementationSource s)
            throws ClassNotFoundException, InstantiationException, IllegalAccessException {
        if (!initialized) { initialize(); }
        sources.addElement(s);
        // update system property accordingly;
        // the property may be unset, so guard against null
        String current = System.getProperty(PROPERTY);
        StringBuffer b = new StringBuffer(current == null ? "" : current + " ");
        b.append(s.getClass().getName());
        System.setProperty(PROPERTY, b.toString());
    }
}


Get your DOMImplementation via DOMImplementationRegistry

1. Add all known DOMImplementationSource classes or class names to your JVM, either by
A. putting all class names (space separated) into the system property "org.w3c.dom.DOMImplementationSourceList":
    System.setProperty(PROPERTY, classnames);
or by
B. registering a source directly:
    DOMImplementationRegistry.addSource(domImplementationSource);

2. Query the DOMImplementationRegistry:
    DOMImplementation impl = DOMImplementationRegistry.getDOMImplementation("XML 1.0");
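The registry that later shipped with Java differs from the draft listed on these slides: it lives in the package org.w3c.dom.bootstrap, is obtained with newInstance(), and getDOMImplementation is an instance method. The following is a minimal sketch against that shipped API, assuming a JDK whose default DOMImplementationSource supports the "XML 1.0" feature; RegistryLookupDemo is an illustrative name.

```java
import org.w3c.dom.DOMImplementation;
import org.w3c.dom.bootstrap.DOMImplementationRegistry;

class RegistryLookupDemo {
    /** Returns an implementation supporting the given features, or null if none does. */
    static DOMImplementation lookup(String features) {
        try {
            // Reads org.w3c.dom.DOMImplementationSourceList, or uses the platform default.
            DOMImplementationRegistry registry = DOMImplementationRegistry.newInstance();
            return registry.getDOMImplementation(features);
        } catch (ReflectiveOperationException | ClassCastException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        DOMImplementation impl = lookup("XML 1.0");
        System.out.println(impl != null && impl.hasFeature("XML", "1.0"));
    }
}
```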


Example: XDXTest

import java.io.File;
import org.w3c.dom.Document;
import org.apache.xerces.parsers.DOMParser;

public class XDXTest {
    public void test(String xmlDocument, String outputFilename) throws Exception {
        File outputFile = new File(outputFilename);
        DOMParser parser = new DOMParser();

        // Get the DOM tree as a Document object
        parser.parse(xmlDocument);
        Document doc = parser.getDocument();

        // Serialize
        DOM2XML d2x = new DOM2XML();
        d2x.toXML(doc, outputFile);
    }


Example: XDXTest (continued)

public static void main(String[] args) {

if (args.length != 2) {

System.out.println(

"Usage: java XDXTest " +

"[XML document to read] " +

"[filename to write out to]");

System.exit(0); }

try {

XDXTest tester = new XDXTest();

tester.test(args[0], args[1]); // input file, output file name

} catch (Exception e) {

e.printStackTrace();

}

}

}


DOMSerializer

import java.io.*;
import org.w3c.dom.*;

public class DOM2XML {
    private String indent;          // Indentation to use
    private String lineSeparator;   // Line separator to use

    public DOM2XML() { indent = ""; lineSeparator = "\n"; }

    public void setIndent(String indent) { this.indent = indent; }
    public void setLineSeparator(String lineSeparator) { … }

    public void toXML(Document doc, OutputStream out) throws IOException {
        Writer writer = new OutputStreamWriter(out);
        toXML(doc, writer);   // delegate to the Writer-based overload
    }
    public void toXML(Document doc, File file) throws IOException { … }
    public void toXML(Document doc, Writer writer) throws IOException {
        // Start serialization recursion with no indenting
        serializeNode(doc, writer, "");
        writer.flush();
    }

    public void serializeNode(Node node, Writer writer, String indentLevel)
            throws IOException {
        // Determine action based on node type
        switch (node.getNodeType()) {
        case Node.DOCUMENT_NODE:
            writer.write("<?xml version=\"1.0\"?>");
            writer.write(lineSeparator);
            // recurse on each child
            NodeList nodes = node.getChildNodes();
            if (nodes != null) {
                for (int i = 0; i < nodes.getLength(); i++) {
                    serializeNode(nodes.item(i), writer, "");
                }
            }
            break;

        case Node.ELEMENT_NODE:
            String name = node.getNodeName();
            writer.write(indentLevel + "<" + name);
            NamedNodeMap attributes = node.getAttributes();
            for (int i = 0; i < attributes.getLength(); i++) {
                Node current = attributes.item(i);
                writer.write(" " + current.getNodeName() + "=\""
                        + current.getNodeValue() + "\"");
            }
            writer.write(">");   // end of start tag
            NodeList children = node.getChildNodes();
            if (children != null) {
                // start a new line if the first child is an element
                if ((children.item(0) != null)
                        && (children.item(0).getNodeType() == Node.ELEMENT_NODE)) {
                    writer.write(lineSeparator);
                }
                for (int i = 0; i < children.getLength(); i++) {
                    serializeNode(children.item(i), writer, indentLevel + indent);
                }
                // indent the end tag if the last child is an element
                if ((children.item(0) != null)
                        && (children.item(children.getLength() - 1).getNodeType()
                                == Node.ELEMENT_NODE)) {
                    writer.write(indentLevel);
                }
            }
            writer.write("</" + name + ">");
            writer.write(lineSeparator);
            break;

        case Node.TEXT_NODE:
            writer.write(node.getNodeValue());
            break;

        case Node.CDATA_SECTION_NODE:
            writer.write("<![CDATA[" + node.getNodeValue() + "]]>");
            break;

        case Node.COMMENT_NODE:
            writer.write(indentLevel + "<!-- " + node.getNodeValue() + " -->");
            writer.write(lineSeparator);
            break;

        case Node.PROCESSING_INSTRUCTION_NODE:
            writer.write("<?" + node.getNodeName() + " " + node.getNodeValue() + "?>");
            writer.write(lineSeparator);
            break;

        case Node.ENTITY_REFERENCE_NODE:
            writer.write("&" + node.getNodeName() + ";");
            break;

        case Node.DOCUMENT_TYPE_NODE:
            DocumentType docType = (DocumentType) node;
            writer.write("<!DOCTYPE " + docType.getName());
            if (docType.getPublicId() != null) {
                writer.write(" PUBLIC \"" + docType.getPublicId() + "\" ");
            } else {
                writer.write(" SYSTEM ");
            }
            writer.write("\"" + docType.getSystemId() + "\">");
            writer.write(lineSeparator);
            break;
        }
    }
}
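As an alternative to a hand-written serializer, the JAXP transformation module mentioned earlier can write a DOM tree with an identity transformation. This is a sketch of that different technique, not the DOM2XML class above; it assumes the transformer bundled with the JDK, and TransformerSerializeDemo with its sample tree is an illustrative name.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

class TransformerSerializeDemo {
    /** Serializes a DOM node to a string using an identity Transformer. */
    static String toXml(Node node) {
        try {
            // newTransformer() without a stylesheet copies the source to the result.
            Transformer identity = TransformerFactory.newInstance().newTransformer();
            identity.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            identity.transform(new DOMSource(node), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Builds TABLE > TD > "x" from scratch. */
    static Document sample() {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
            Element table = doc.createElement("TABLE");
            doc.appendChild(table);
            Element td = doc.createElement("TD");
            table.appendChild(td);
            td.appendChild(doc.createTextNode("x"));
            return doc;
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(toXml(sample()));
    }
}
```

Unlike the serializer above, this approach also escapes special characters in text and attribute values.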