The XML Document Object Model (DOM) is a programming interface that provides a standard way to access properties of an XML document via any programming language - JS, VBS, PHP, Perl, Java, C++, VB, and others. Using the DOM, a programmer can navigate a representation of the XML document to retrieve, add, modify, or delete its contents.
The XML DOM represents each property of an XML document as a separate "node": the document itself, each element, each attribute, each piece of text, and each comment is a node of its own type.
The XML DOM arranges the nodes into a "node tree" structure reflecting the hierarchical relationship of the document properties. Beneath the top-level "document node", the node tree begins with the root element of the XML document. Beneath this are element nodes for each element nested directly within the root element. The tree continues down through attribute nodes, text nodes, and other elements, ending with the text nodes at the lowest level.
The terms "parent", "child", "sibling" and "leaf" are used to describe the hierarchical relationship between nodes. A parent node may contain one or more nested child nodes. Siblings are nodes that have the same parent, and leaf nodes have no children.
In the XML DOM a node tree represents each property of an XML document hierarchically.
In the XML document listed below, tree.xml, the first line contains the XML declaration. The document as a whole is represented by the "document node" within the XML DOM. Beneath this the node tree starts at the XML <catalog> element, which is represented in the DOM by the root "element node".
<?xml version="1.0" encoding="UTF-8" ?>
<catalog>
<book id="1">
<!-- Describing the book. -->
<title>XML in easy steps</title>
<author>Mike McGrath</author>
</book>
</catalog>
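The node tree for this listing can be sketched with plain JavaScript objects. This is only an illustration: the makeNode, siblingsOf and isLeaf helpers and the object layout are hypothetical stand-ins for real DOM nodes, chosen so the sketch runs without a browser. The whitespace between elements and the id attribute are left out for brevity.

```javascript
// Hypothetical stand-in for a DOM node: a type, a name, an optional value, and children.
function makeNode(nodeType, nodeName, nodeValue, children) {
  const node = { nodeType, nodeName, nodeValue, parentNode: null, childNodes: children || [] };
  node.childNodes.forEach(function (child) { child.parentNode = node; });
  return node;
}

// tree.xml, ignoring whitespace: 9 = document, 1 = element, 8 = comment, 3 = text.
const title = makeNode(1, "title", null, [makeNode(3, "#text", "XML in easy steps")]);
const author = makeNode(1, "author", null, [makeNode(3, "#text", "Mike McGrath")]);
const comment = makeNode(8, "#comment", " Describing the book. ");
const book = makeNode(1, "book", null, [comment, title, author]);
const catalog = makeNode(1, "catalog", null, [book]);
const doc = makeNode(9, "#document", null, [catalog]);

// Siblings share a parent; a leaf has no children.
function siblingsOf(node) {
  return node.parentNode.childNodes.filter(function (other) { return other !== node; });
}
function isLeaf(node) {
  return node.childNodes.length === 0;
}

console.log(title.parentNode.nodeName);                                  // book
console.log(siblingsOf(title).map(function (n) { return n.nodeName; })); // [ '#comment', 'author' ]
console.log(isLeaf(title), isLeaf(title.childNodes[0]));                 // false true
```

Here the <title> element is not a leaf because it contains a text node, while that text node is.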
In order to access the data in an XML document via the XML DOM, it must first be loaded into your computer's memory. IE uses an ActiveXObject called "Microsoft.XMLDOM" to achieve this. Create an HTML document in the same directory as the tree.xml file, listed above, then follow the numbered steps in the listing below to load its data.
Unfortunately the Microsoft.XMLDOM ActiveXObject is only available in IE, so additional code is needed to give other browsers access to XML data via the XML DOM. The listing below includes both branches.
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>Loading XML Data</title>
<meta http-equiv = "Content-Type" content = "text/html;charset=UTF-8" />
<!-- 1. Add a JavaScript code area. -->
<script type="text/javascript">
/* <![CDATA[ */
// MIKE'S NOTE: the CDATA tags in these XHTML documents are
// enclosed within /* */ to prevent browser detection
// while still allowing the XHTML to be validated.
// 2. Declare a global variable to store XML representation.
var xmlDoc;
// 3. Create a function to load XML data.
function loadXmlDoc()
{
// 14. Provide the ability to fail gracefully.
try
{
// 4. Conditional test for ActiveX with Internet Explorer.
if (window.ActiveXObject)
{
// 5. Statements to load XML document.
xmlDoc = new ActiveXObject("Microsoft.XMLDOM");
xmlDoc.async=false;
xmlDoc.load("tree.xml");
// 9. Call the function to display data.
getXmlData();
}
// 10. Conditional test for document.implementation with Firefox.
else if (document.implementation && document.implementation.createDocument)
{
// 11. Statements to load XML document.
xmlDoc = document.implementation.createDocument("","",null);
// 12. Call the function to display data once the XML has loaded.
xmlDoc.onload=getXmlData;
xmlDoc.load("tree.xml");
}
}
catch(err)
{
// 13. Default message to display if XML data cannot be loaded.
document.getElementById("box").innerHTML = "XML data is unavailable";
}
}
// 8. Function to display XML data.
function getXmlData()
{
// Get the root element node name.
var node = xmlDoc.documentElement.nodeName;
// Add the root node name to a string.
var data = "Root Element Name is: " + node;
// Display the string.
document.getElementById("box").innerHTML = data;
}
// 6. Call the XML loader function when the HTML document has loaded.
window.onload=loadXmlDoc;
/* ]]> */
</script>
</head>
<body>
<!-- 7. A div area where data can be displayed. -->
<div id = "box" style= "background:yellow;width:250px;padding:5px"> </div>
</body>
</html>
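The two conditional tests in steps 4 and 10 are an instance of feature detection, which can be isolated as a small function. The sketch below is an illustration under stated assumptions: chooseXmlLoader is a hypothetical name, and the win and doc parameters are fake stand-ins for the browser's window and document objects so that the logic can be run outside a browser.

```javascript
// Decide which loading branch applies, given a window-like and a document-like object.
// Both parameters are hypothetical stand-ins for the real browser objects.
function chooseXmlLoader(win, doc) {
  if (win.ActiveXObject) {
    return "activex";          // IE branch: new ActiveXObject("Microsoft.XMLDOM")
  }
  if (doc.implementation && doc.implementation.createDocument) {
    return "createDocument";   // Firefox branch: document.implementation.createDocument
  }
  return "unavailable";        // neither feature is present
}

// Fake environments, for illustration only.
const fakeIeWindow = { ActiveXObject: function () {} };
const fakeFirefoxDocument = { implementation: { createDocument: function () {} } };

console.log(chooseXmlLoader(fakeIeWindow, {}));        // activex
console.log(chooseXmlLoader({}, fakeFirefoxDocument)); // createDocument
console.log(chooseXmlLoader({}, {}));                  // unavailable
```

The third case is worth noting: in the listing, the message in the catch block appears only when one of the branches throws an error, so a browser that matches neither test leaves the box empty.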
The simplest way to reference all XML elements of a particular name is with the XML DOM's getElementsByTagName method. This can easily find one or more elements of the name specified as its argument - regardless of the document structure.
The results are returned as a zero-indexed nodeList in which each element node can be addressed using its index number. For instance, getElementsByTagName("title") returns a nodeList of all <title> element nodes. The first <title> element node can be addressed as nodeList[0], the second as nodeList[1], and so on.
A nodeList also has a useful length property that contains the numeric total of elements in the nodeList. Typically a loop is used to iterate through a nodeList and its length property specifies when the loop should terminate.
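To show why the document structure does not matter to a search by tag name, here is a minimal sketch of such a search over plain objects. It is not how any browser implements getElementsByTagName: the el and elementsByTagName helpers are hypothetical, the element names in the sample tree are invented, and the result is an ordinary array rather than a real nodeList.

```javascript
// Hypothetical stand-in for an element node: a name and a list of child nodes.
function el(nodeName, children) {
  return { nodeType: 1, nodeName: nodeName, childNodes: children || [] };
}

// Collect every descendant element whose name matches, in document order.
function elementsByTagName(root, name) {
  const found = [];
  root.childNodes.forEach(function (child) {
    if (child.nodeType === 1) {
      if (child.nodeName === name) { found.push(child); }
      // Descend whether or not this child matched, so nesting depth does not matter.
      elementsByTagName(child, name).forEach(function (n) { found.push(n); });
    }
  });
  return found;
}

// Illustrative structure only; these element names are invented for the sketch.
const root = el("catalog", [
  el("book", [el("title"), el("author")]),
  el("shelf", [el("book", [el("title")])])
]);

// Iterate with the length property, as described above.
const nodeList = elementsByTagName(root, "title");
for (let i = 0; i < nodeList.length; i++) {
  console.log(i + ": " + nodeList[i].nodeName);
}
console.log(nodeList.length); // 2
```

Both <title> elements are found even though they sit at different depths in the sample tree.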
To illustrate the getElementsByTagName method in action with a nodeList, the example below loads books.xml, which replaces the data in the earlier tree.xml document with three different titles.
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>Addressing Elements By Tag Name</title>
<meta http-equiv = "Content-Type" content = "text/html;charset=UTF-8" />
<script type="text/javascript">
/* <![CDATA[ */
// MIKE'S NOTE: the CDATA tags in these XHTML documents are
// enclosed within /* */ to prevent browser detection
// while still allowing the XHTML to be validated.
var xmlDoc;
function loadXmlDoc()
{
try
{
if (window.ActiveXObject)
{
xmlDoc = new ActiveXObject("Microsoft.XMLDOM");
xmlDoc.async=false;
xmlDoc.load("books.xml");
getXmlData();
}
else if (document.implementation && document.implementation.createDocument)
{
xmlDoc = document.implementation.createDocument("","",null);
xmlDoc.onload=getXmlData;
xmlDoc.load("books.xml");
}
}
catch(err)
{
document.getElementById("box").innerHTML = "XML data is unavailable";
}
}
function getXmlData()
{
// Get all <title> element nodes.
var nodeList = xmlDoc.getElementsByTagName("title");
// Add the node list length to a string.
var data = "No. of Titles: " + nodeList.length;
// Append each <title> text node to the string.
for ( var i = 0; i < nodeList.length; i++ )
{
data += "<br>" + nodeList[i].firstChild.nodeValue;
}
// Display the string.
document.getElementById("box").innerHTML = data;
}
window.onload=loadXmlDoc;
/* ]]> */
</script>
</head>
<body>
<div id = "box" style= "background:yellow;width:250px;padding:5px"> </div>
</body>
</html>
The childNodes property of any element referenced in the XML DOM returns a zero-indexed nodeList of child nodes representing the items nested directly below that element. Each child node can be addressed using its index number, and the length property returns the number of nodes in the list - as in the previous example.
The size of a childNodes nodeList is, sadly, not consistent between browsers as their XML DOM treatment varies. This is evident from the numeric value returned for the length property.
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>Addressing Child Nodes</title>
<meta http-equiv = "Content-Type" content = "text/html;charset=UTF-8" />
<script type="text/javascript">
/* <![CDATA[ */
// MIKE'S NOTE: the CDATA tags in these XHTML documents are
// enclosed within /* */ to prevent browser detection
// while still allowing the XHTML to be validated.
var xmlDoc;
function loadXmlDoc()
{
try
{
if (window.ActiveXObject)
{
xmlDoc = new ActiveXObject("Microsoft.XMLDOM");
xmlDoc.async=false;
xmlDoc.load("books.xml");
getXmlData();
}
else if (document.implementation && document.implementation.createDocument)
{
xmlDoc = document.implementation.createDocument("","",null);
xmlDoc.onload=getXmlData;
xmlDoc.load("books.xml");
}
}
catch(err)
{
document.getElementById("box").innerHTML = "XML data is unavailable";
}
}
function getXmlData()
{
// Get all child nodes under the root element.
var nodeList = xmlDoc.documentElement.childNodes;
// Add the node list length to a string.
var data = "No. of Child Nodes: " + nodeList.length;
// Append the node list item number and type to the string.
for( var i=0; i < nodeList.length; i++ )
{
data += "<li>Node: " + i ;
data += " Node Type: " + nodeList[i].nodeType;
}
// Display the string.
document.getElementById("box").innerHTML = data;
}
window.onload=loadXmlDoc;
/* ]]> */
</script>
</head>
<body>
<div id = "box" style= "background:yellow;width:250px;padding:5px"> </div>
</body>
</html>
To understand the difference between how IE and Firefox treat the same XML data in their respective DOMs, it's useful to inspect the type of each node in the childNodes array. The nodeType property of each node contains a numeric value according to its type - an element node is 1 and a text node is 3.
Add this loop to the getXmlData() function and open the page in IE and Firefox to see the difference.
// Append the node list item number and type to the string.
for( var i=0; i < nodeList.length; i++ )
{
data += "<li>Node: " + i ;
data += " Node Type: " + nodeList[i].nodeType;
}
The results illustrate why the childNodes arrays are different sizes - IE leaves out the whitespace-only text nodes that sit between elements, so its childNodes array here holds only element nodes, whereas Firefox also includes those text nodes.
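The difference in length can be reproduced with plain objects. The sketch below assumes, for illustration, a root element containing three <book> elements, each on its own line. The element, text and isWhitespaceOnly helpers are hypothetical stand-ins, not part of the DOM.

```javascript
// Hypothetical stand-ins for DOM nodes (1 = element, 3 = text).
function element(nodeName) {
  return { nodeType: 1, nodeName: nodeName, nodeValue: null };
}
function text(nodeValue) {
  return { nodeType: 3, nodeName: "#text", nodeValue: nodeValue };
}

// Child nodes as reported by a parser that keeps the line breaks between elements.
const withWhitespace = [
  text("\n"), element("book"),
  text("\n"), element("book"),
  text("\n"), element("book"),
  text("\n")
];

// A parser that discards whitespace-only text nodes reports only the elements.
function isWhitespaceOnly(node) {
  return node.nodeType === 3 && /^\s*$/.test(node.nodeValue);
}
const withoutWhitespace = withWhitespace.filter(function (node) {
  return !isWhitespaceOnly(node);
});

console.log(withWhitespace.length);    // 7
console.log(withoutWhitespace.length); // 3
```

Three elements produce four surrounding line breaks, which accounts for the larger count.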
The nodeType property, introduced in the previous example, can represent a component of an XML document using these values:
| Component | Node Type | Component | Node Type |
| --- | --- | --- | --- |
| Element | 1 | Processing instruction | 7 |
| Attribute | 2 | Comment | 8 |
| Text | 3 | Document | 9 |
| CDATA section | 4 | Document type | 10 |
| Entity reference | 5 | Document fragment | 11 |
| Entity | 6 | Notation | 12 |
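When inspecting nodes it can help to print a name instead of the bare number. The following sketch turns the table into a lookup. NODE_TYPE_NAMES and describeNodeType are hypothetical names introduced here, not part of the DOM.

```javascript
// Names for the numeric nodeType values listed in the table.
const NODE_TYPE_NAMES = {
  1: "Element",
  2: "Attribute",
  3: "Text",
  4: "CDATA section",
  5: "Entity reference",
  6: "Entity",
  7: "Processing instruction",
  8: "Comment",
  9: "Document",
  10: "Document type",
  11: "Document fragment",
  12: "Notation"
};

// Return a readable name for a nodeType value, or "Unknown" for anything else.
function describeNodeType(nodeType) {
  return NODE_TYPE_NAMES[nodeType] || "Unknown";
}

// Example: the node types of a text node, an element, and a comment.
[3, 1, 8].forEach(function (type) {
  console.log(type + " = " + describeNodeType(type));
});
```

This prints "3 = Text", "1 = Element" and "8 = Comment" on separate lines.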
In order to select only element nodes, for any browser, a loop can make a conditional test of each node's nodeType value and select only those nodes with a value of 1.
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>Filtering by Node Type</title>
<meta http-equiv = "Content-Type" content = "text/html;charset=UTF-8" />
<script type="text/javascript">
/* <![CDATA[ */
// MIKE'S NOTE: the CDATA tags in these XHTML documents are
// enclosed within /* */ to prevent browser detection
// while still allowing the XHTML to be validated.
var xmlDoc;
function loadXmlDoc()
{
try
{
if (window.ActiveXObject)
{
xmlDoc = new ActiveXObject("Microsoft.XMLDOM");
xmlDoc.async=false;
xmlDoc.load("books.xml");
getXmlData();
}
else if (document.implementation && document.implementation.createDocument)
{
xmlDoc = document.implementation.createDocument("","",null);
xmlDoc.onload=getXmlData;
xmlDoc.load("books.xml");
}
}
catch(err)
{
document.getElementById("box").innerHTML = "XML data is unavailable";
}
}
function getXmlData()
{
// Get all child nodes under the root element.
var nodeList = xmlDoc.documentElement.childNodes;
// Create a new array to store element nodes.
var nodeArr = new Array();
// Assign only element nodes to the filtered array.
for( var i = 0; i < nodeList.length; i++ )
{
if(nodeList[i].nodeType == 1 )
nodeArr[ nodeArr.length ] = nodeList[i];
}
// Add the filtered array size to a string.
var data = "No. of Filtered Nodes: " + nodeArr.length;
// Append the node name of each filtered node to the string.
for( var i = 0; i < nodeArr.length; i++ )
{
data += "<li>" + nodeArr[i].nodeName;
}
// Display the string.
document.getElementById("box").innerHTML = data;
}
window.onload=loadXmlDoc;
/* ]]> */
</script>
</head>
<body>
<div id = "box" style= "background:yellow;width:250px;padding:5px"> </div>
</body>
</html>
As the structure of XML documents presents data within elements on nested levels it's logical to address all element nodes in JS with nested loops - each loop iterating through all the element nodes on one particular level. For instance, an outer loop can iterate through all element nodes on the top level, and a nested inner loop can iterate through all child nodes on the level below. Additional nested loops can be used to address further levels in order to get the data from all element nodes.
Data stored within an attribute can be addressed by specifying the attribute name as the argument to the containing element's getAttribute method. For example, you can address the data in an id attribute within a <book> element as book.getAttribute("id").
Using nested loops and the getAttribute method allows all data within an XML document to be retrieved by JS via the XML DOM. The following steps demonstrate this process to retrieve all data from the books.xml document.
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>Get All Data</title>
<meta http-equiv = "Content-Type" content = "text/html;charset=UTF-8" />
<script type="text/javascript">
/* <![CDATA[ */
// MIKE'S NOTE: the CDATA tags in these XHTML documents are
// enclosed within /* */ to prevent browser detection
// while still allowing the XHTML to be validated.
var xmlDoc;
function loadXmlDoc()
{
try
{
if (window.ActiveXObject)
{
xmlDoc = new ActiveXObject("Microsoft.XMLDOM");
xmlDoc.async=false;
xmlDoc.load("books.xml");
getXmlData();
}
else if (document.implementation && document.implementation.createDocument)
{
xmlDoc = document.implementation.createDocument("","",null);
xmlDoc.onload=getXmlData;
xmlDoc.load("books.xml");
}
}
catch(err)
{
document.getElementById("box").innerHTML = "XML data is unavailable";
}
}
function getXmlData()
{
// Get all <book> element nodes.
var nodeList = xmlDoc.getElementsByTagName("book");
// Initialize a variable with an empty string.
var data = "";
// Loop through all <book> element nodes.
for( var i=0; i < nodeList.length; i++)
{
// Append the id attribute value to the string.
data += "ISBN: " + nodeList[i].getAttribute("id");
// Loop through each child node of the <book> element node.
for( var j=0; j < nodeList[i].childNodes.length; j++)
{
if( nodeList[i].childNodes[j].nodeType == 1 )
{
// Append each child node name to the string.
data += "<br>" +
nodeList[i].childNodes[j].nodeName.toUpperCase() ;
// Append each child node's text node value to the string.
data += ": " +
nodeList[i].childNodes[j].firstChild.nodeValue;
}
}
// Add a horizontal rule below each data set.
data += "<hr>";
}
// Display the string.
document.getElementById("box").innerHTML = data;
}
window.onload=loadXmlDoc;
/* ]]> */
</script>
</head>
<body>
<div id = "box" style= "background:yellow;width:250px;padding:5px"> </div>
</body>
</html>
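Nested loops need one loop per level of the document. As an alternative technique, not the one used in the listing above, a recursive function can visit element nodes at any depth. The sketch below uses hypothetical helpers (makeElement, makeText, collectData) over plain objects modelled on tree.xml. The attrs object is a simplification of how the DOM stores attributes, and only getAttribute is imitated.

```javascript
// Hypothetical stand-ins for DOM nodes so the sketch runs without a browser.
function makeText(value) {
  return { nodeType: 3, nodeName: "#text", nodeValue: value, childNodes: [] };
}
function makeElement(nodeName, attrs, children) {
  return {
    nodeType: 1,
    nodeName: nodeName,
    childNodes: children,
    getAttribute: function (name) {
      return Object.prototype.hasOwnProperty.call(attrs, name) ? attrs[name] : null;
    }
  };
}

// Visit an element and then each of its children, however deeply they are nested.
function collectData(node, depth, lines) {
  if (node.nodeType !== 1) { return lines; }   // skip text nodes, comments, and so on
  let line = "  ".repeat(depth) + node.nodeName.toUpperCase();
  const id = node.getAttribute("id");
  if (id !== null) { line += " (id " + id + ")"; }
  const first = node.childNodes[0];
  if (first && first.nodeType === 3 && first.nodeValue.trim() !== "") {
    line += ": " + first.nodeValue;
  }
  lines.push(line);
  node.childNodes.forEach(function (child) { collectData(child, depth + 1, lines); });
  return lines;
}

// The structure of tree.xml, without the comment and whitespace.
const catalog = makeElement("catalog", {}, [
  makeElement("book", { id: "1" }, [
    makeElement("title", {}, [makeText("XML in easy steps")]),
    makeElement("author", {}, [makeText("Mike McGrath")])
  ])
]);

const lines = collectData(catalog, 0, []);
console.log(lines.join("\n"));
```

The output lists CATALOG, then BOOK (id 1) indented beneath it, then the TITLE and AUTHOR lines with their text values indented one level further.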
When retrieving data from the XML DOM you may often want to select particular data by querying the value within an element - just as a database query retrieves particular data from a database. In both cases, the query tests a specified value then retrieves particular data according to the result of the test. For instance, in a database table named "books" a query might test an "author" column for a specified name, then return all book data for each match. Similarly, in the XML document books.xml a query might test an <author> element for a specified name, then return all book data for each match.
The following code approximates the SQL database query SELECT book FROM books WHERE author = "Mike McGrath";
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>Selecting Specific Data</title>
<meta http-equiv = "Content-Type" content = "text/html;charset=UTF-8" />
<script type = "text/javascript">
/* <![CDATA[ */
// MIKE'S NOTE: the CDATA tags in these XHTML documents are
// enclosed within /* */ to prevent browser detection
// while still allowing the XHTML to be validated.
var xmlDoc;
function loadXmlDoc()
{
try
{
if (window.ActiveXObject)
{
xmlDoc = new ActiveXObject("Microsoft.XMLDOM");
xmlDoc.async = false;
xmlDoc.load("books.xml");
getXmlData();
}
else if (document.implementation && document.implementation.createDocument)
{
xmlDoc = document.implementation.createDocument("","",null);
xmlDoc.onload=getXmlData;
xmlDoc.load("books.xml");
}
}
catch(err)
{
document.getElementById("box").innerHTML = "XML data is unavailable";
}
}
function getXmlData()
{
// Get all <author> element nodes.
var nodeList = xmlDoc.getElementsByTagName("author");
// Declare two variables.
var book, data = "Selected books by Mike McGrath:";
// Loop through all <author> element nodes.
for( var i=0; i< nodeList.length; i++)
{
// Test the text node value to match the query string.
if(nodeList[i].firstChild.nodeValue == "Mike McGrath")
{
// Store the <book> parent node of the matched <author> element.
book = nodeList[i].parentNode;
// Loop through all child nodes of the stored <book> element.
for( var j = 0; j < book.childNodes.length; j++ )
{
// Match only the <title> element node.
if( (book.childNodes[j].nodeType ==1) && (book.childNodes[j].nodeName == "title") )
{
// Append the selected text node value (the title).
data += "<dt>"+ book.childNodes[j].firstChild.nodeValue;
// Append the selected id attribute value (the ISBN).
data += "<dd>" + book.getAttribute("id") ;
}
}
}
}
// Display the string.
document.getElementById("box").innerHTML = data;
}
window.onload=loadXmlDoc;
/* ]]> */
</script>
</head>
<body>
<div id = "box" style= "background:yellow;width:250px;padding:5px"> </div>
</body>
</html>
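The query in the listing above hard-codes the author name. A possible refinement is to pass the name as a parameter, like the value in a WHERE clause. This sketch makes several assumptions: selectByAuthor and the make... helpers are hypothetical, the nodes are plain objects rather than real DOM nodes, and the second and third records are invented placeholders, not the contents of books.xml.

```javascript
// Hypothetical stand-ins for DOM nodes so the sketch runs without a browser.
function makeText(value) {
  return { nodeType: 3, nodeName: "#text", nodeValue: value, parentNode: null, childNodes: [] };
}
function makeElement(nodeName, attrs, children) {
  const node = {
    nodeType: 1,
    nodeName: nodeName,
    parentNode: null,
    childNodes: children,
    firstChild: children.length > 0 ? children[0] : null,
    getAttribute: function (name) {
      return Object.prototype.hasOwnProperty.call(attrs, name) ? attrs[name] : null;
    }
  };
  children.forEach(function (child) { child.parentNode = node; });
  return node;
}
function makeBook(id, title, author) {
  return makeElement("book", { id: id }, [
    makeElement("title", {}, [makeText(title)]),
    makeElement("author", {}, [makeText(author)])
  ]);
}

// Placeholder data: only the first record mirrors tree.xml, the rest are invented.
const bookNodes = [
  makeBook("1", "XML in easy steps", "Mike McGrath"),
  makeBook("2", "Placeholder Title A", "Another Author"),
  makeBook("3", "Placeholder Title B", "Mike McGrath")
];
// The <author> elements, as getElementsByTagName("author") would supply them.
const authorNodes = bookNodes.map(function (b) { return b.childNodes[1]; });

// Return the title and id of every book whose <author> text equals the given name.
function selectByAuthor(authorList, name) {
  const matches = [];
  for (let i = 0; i < authorList.length; i++) {
    if (authorList[i].firstChild.nodeValue !== name) { continue; }
    const parent = authorList[i].parentNode;
    for (let j = 0; j < parent.childNodes.length; j++) {
      const child = parent.childNodes[j];
      if (child.nodeType === 1 && child.nodeName === "title") {
        matches.push({ title: child.firstChild.nodeValue, id: parent.getAttribute("id") });
      }
    }
  }
  return matches;
}

const selected = selectByAuthor(authorNodes, "Mike McGrath");
selected.forEach(function (row) { console.log(row.title + " (" + row.id + ")"); });
```

Returning the matches as data, rather than building an HTML string inside the loop, keeps the query separate from how the results are displayed.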