commons-xml/java/external/xdocs/dom/core/introduction.html

513 lines
20 KiB
HTML

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<!--
Generated: Thu Nov 09 17:42:28 EST 2000 jfouffa.w3.org
-->
<html lang='en' xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>What is the Document Object Model?</title>
<link rel='stylesheet' type='text/css' href='./spec.css' />
<link rel='stylesheet' type='text/css' href='W3C-REC.css' />
<link rel='next' href='core.html' />
<link rel='contents' href='Overview.html#contents' />
<link rel='index' href='def-index.html' />
<link rel='previous' href='copyright-notice.html' />
</head>
<body>
<div class='navbar' align='center'><a accesskey='p'
href='copyright-notice.html'>previous</a> &nbsp; <a accesskey='n'
href='core.html'>next</a> &nbsp; <a accesskey='c'
href='Overview.html#contents'>contents</a> &nbsp; <a accesskey='i'
href='def-index.html'>index</a>
<hr title='Navigation area separator' />
</div>
<div class='noprint' style='text-align: right'>
<p style='font-family: monospace;font-size:small'>13 November,
2000</p>
</div>
<div class='div1'><a id="Introduction" name='Introduction'></a>
<h1 id='Introduction-h1' class='div1'>What is the Document Object
Model?</h1>
<dl>
<dt><i>Editors</i></dt>
<dd>Philippe Le H&eacute;garet, W3C</dd>
<dd>Lauren Wood, SoftQuad Software Inc., WG Chair</dd>
<dd>Jonathan Robie, Texcel (for DOM Level 1)</dd>
</dl>
<div class='div2'><a id="ID-E7C3082" name='ID-E7C3082'></a>
<h2 id='ID-E7C3082-h2' class='div2'>Introduction</h2>
<p>The Document Object Model (DOM) is an application programming
interface (<a href='glossary.html#dt-API'><em>API</em></a>) for
valid <a href='glossary.html#dt-HTML'><em>HTML</em></a> and
well-formed <a href='glossary.html#dt-XML'><em>XML</em></a>
documents. It defines the logical structure of documents and the
way a document is accessed and manipulated. In the DOM
specification, the term "document" is used in the broad sense -
increasingly, XML is being used as a way of representing many
different kinds of information that may be stored in diverse
systems, and much of this would traditionally be seen as data
rather than as documents. Nevertheless, XML presents this data as
documents, and the DOM may be used to manage this data.</p>
<p>With the Document Object Model, programmers can build documents,
navigate their structure, and add, modify, or delete elements and
content. Anything found in an HTML or XML document can be accessed,
changed, deleted, or added using the Document Object Model, with a
few exceptions - in particular, the DOM <a
href='glossary.html#dt-interface'><em>interfaces</em></a> for the
XML internal and external subsets have not yet been specified.</p>
<p>As a W3C specification, one important objective for the Document
Object Model is to provide a standard programming interface that
can be used in a wide variety of environments and <a
href='glossary.html#dt-application'><em>applications</em></a>. The
DOM is designed to be used with any programming language. In order
to provide a precise, language-independent specification of the DOM
interfaces, we have chosen to define the specifications in Object
Management Group (OMG) IDL [<a class='noxref'
href='references.html#OMGIDL'>OMGIDL</a>], as defined in the CORBA
2.3.1 specification [<a class='noxref'
href='references.html#CORBA'>CORBA</a>]. In addition to the OMG IDL
specification, we provide <a
href='glossary.html#dt-lang-binding'><em>language bindings</em></a>
for Java [<a class='noxref' href='references.html#Java'>Java</a>]
and ECMAScript [<a class='noxref'
href='references.html#ECMAScript'>ECMAScript</a>] (an
industry-standard scripting language based on JavaScript [<a
class='noxref' href='references.html#JavaScript'>JavaScript</a>]
and JScript [<a class='noxref'
href='references.html#JScript'>JScript</a>]).</p>
<p><b>Note:</b> OMG IDL is used only as a language-independent and
implementation-neutral way to specify <a
href='glossary.html#dt-interface'><em>interfaces</em></a>. Various
other IDLs could have been used ([<a class='noxref'
href='references.html#COM'>COM</a>], [<a class='noxref'
href='references.html#JavaIDL'>JavaIDL</a>], [<a class='noxref'
href='references.html#MSIDL'>MIDL</a>], ...). In general, IDLs are
designed for specific computing environments. The Document Object
Model can be implemented in any computing environment, and does not
require the object binding runtimes generally associated with such
IDLs.</p>
</div>
<!-- div2 ID-E7C3082 -->
<div class='div2'><a id="ID-E7C30821" name='ID-E7C30821'></a>
<h2 id='ID-E7C30821-h2' class='div2'>What the Document Object Model
is</h2>
<p>The DOM is a programming <a
href='glossary.html#dt-API'><em>API</em></a> for documents. It is
based on an object structure that closely resembles the structure
of the documents it <a
href='glossary.html#dt-model'><em>models</em></a>. For instance,
consider this table, taken from an HTML document:</p>
<div class='code-block'>
<pre>
&lt;TABLE&gt;
&lt;TBODY&gt;
&lt;TR&gt;
&lt;TD&gt;Shady Grove&lt;/TD&gt;
&lt;TD&gt;Aeolian&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD&gt;Over the River, Charlie&lt;/TD&gt;
&lt;TD&gt;Dorian&lt;/TD&gt;
&lt;/TR&gt;
&lt;/TBODY&gt;
&lt;/TABLE&gt;
</pre>
</div>
<p>A graphical representation of the DOM of the example table
is:<br />
</p>
<div align='center'>
<hr width='90%' size='2' />
<img src='./images/table.gif'
alt='graphical representation of the DOM of the example table' />
<hr width='90%' size='2' />
<b>graphical representation of the DOM of the example table</b>
<hr width='90%' size='2' />
</div>
<p>In the DOM, documents have a logical structure which is very
much like a tree; to be more precise, which is like a "forest" or
"grove", which can contain more than one tree. Each document
contains zero or one doctype nodes, one root element node, and zero
or more comments or processing instructions; the root element
serves as the root of the element tree for the document. However,
the DOM does not specify that documents must be
<em>implemented</em> as a tree or a grove, nor does it specify how
the relationships among objects be implemented. The DOM is a
logical model that may be implemented in any convenient manner. In
this specification, we use the term <em>structure model</em> to
describe the tree-like representation of a document. We also use
the term "tree" when referring to the arrangement of those
information items which can be reached by using "tree-walking"
methods; (this does not include attributes). One important property
of DOM structure models is <em>structural isomorphism</em>: if any
two Document Object Model implementations are used to create a
representation of the same document, they will create the same
structure model, in accordance with the XML Information Set [<a
class='noxref' href='references.html#InfoSet'>Infoset</a>].</p>
<p><b>Note:</b> There may be some variations depending on the
parser being used to build the DOM. For instance, the DOM may not
contain whitespaces in element content if the parser discards
them.</p>
<p>The name "Document Object Model" was chosen because it is an "<a
href='glossary.html#dt-object-model'><em>object model</em></a>" in
the traditional object oriented design sense: documents are modeled
using objects, and the model encompasses not only the structure of
a document, but also the behavior of a document and the objects of
which it is composed. In other words, the nodes in the above
diagram do not represent a data structure, they represent objects,
which have functions and identity. As an object model, the DOM
identifies:</p>
<ul>
<li>the interfaces and objects used to represent and manipulate a
document</li>
<li>the semantics of these interfaces and objects - including both
behavior and attributes</li>
<li>the relationships and collaborations among these interfaces and
objects</li>
</ul>
<p>The structure of SGML documents has traditionally been
represented by an abstract <a
href='glossary.html#dt-datamodel'><em>data model</em></a>, not by
an object model. In an abstract <a
href='glossary.html#dt-datamodel'><em>data model</em></a>, the
model is centered around the data. In object oriented programming
languages, the data itself is encapsulated in objects that hide the
data, protecting it from direct external manipulation. The
functions associated with these objects determine how the objects
may be manipulated, and they are part of the object model.</p>
</div>
<!-- div2 ID-E7C30821 -->
<div class='div2'><a id="ID-E7C30822" name='ID-E7C30822'></a>
<h2 id='ID-E7C30822-h2' class='div2'>What the Document Object Model
is not</h2>
<p>This section is designed to give a more precise understanding of
the DOM by distinguishing it from other systems that may seem to be
like it.</p>
<ul>
<li>The Document Object Model is not a binary specification. DOM
programs written in the same language binding will be source code
compatible across platforms, but the DOM does not define any form
of binary interoperability.</li>
<li>The Document Object Model is not a way of persisting objects to
XML or HTML. Instead of specifying how objects may be represented
in XML, the DOM specifies how XML and HTML documents are
represented as objects, so that they may be used in object oriented
programs.</li>
<li>The Document Object Model is not a set of data structures; it
is an <a href='glossary.html#dt-object-model'><em>object
model</em></a> that specifies interfaces. Although this document
contains diagrams showing parent/child relationships, these are
logical relationships defined by the programming interfaces, not
representations of any particular internal data structures.</li>
<li>The Document Object Model does not define what information in a
document is relevant or how information in a document is
structured. For XML, this is specified by the W3C XML Information
Set [<a class='noxref' href='references.html#InfoSet'>Infoset</a>].
The DOM is simply an <a
href='glossary.html#dt-API'><em>API</em></a> to this information
set.</li>
<li>The Document Object Model, despite its name, is not a
competitor to the Component Object Model (COM). COM, like CORBA, is
a language independent way to specify interfaces and objects; the
DOM is a set of interfaces and objects designed for managing HTML
and XML documents. The DOM may be implemented using
language-independent systems like COM or CORBA; it may also be
implemented using language-specific bindings like the Java or
ECMAScript bindings specified in this document.</li>
</ul>
</div>
<!-- div2 ID-E7C30822 -->
<div class='div2'><a id="ID-E7C30823" name='ID-E7C30823'></a>
<h2 id='ID-E7C30823-h2' class='div2'>Where the Document Object
Model came from</h2>
<p>The DOM originated as a specification to allow JavaScript
scripts and Java programs to be portable among Web browsers.
"Dynamic HTML" was the immediate ancestor of the Document Object
Model, and it was originally thought of largely in terms of
browsers. However, when the DOM Working Group was formed at W3C, it
was also joined by vendors in other domains, including HTML or XML
editors and document repositories. Several of these vendors had
worked with SGML before XML was developed; as a result, the DOM has
been influenced by SGML Groves and the HyTime standard. Some of
these vendors had also developed their own object models for
documents in order to provide an API for SGML/XML editors or
document repositories, and these object models have also influenced
the DOM.</p>
</div>
<!-- div2 ID-E7C30823 -->
<div class='div2'><a id="ID-E7C30824" name='ID-E7C30824'></a>
<h2 id='ID-E7C30824-h2' class='div2'>Entities and the DOM Core</h2>
<p>In the fundamental DOM interfaces, there are no objects
representing entities. Numeric character references, and references
to the pre-defined entities in HTML and XML, are replaced by the
single character that makes up the entity's replacement. For
example, in:</p>
<div class='code-block'>
<pre>
&lt;p&gt;This is a dog &amp;amp; a cat&lt;/p&gt;
</pre>
</div>
<p>the "&amp;amp;" will be replaced by the character "&amp;", and
the text in the P element will form a single continuous sequence of
characters. Since numeric character references and pre-defined
entities are not recognized as such in CDATA sections, or in the
SCRIPT and STYLE elements in HTML, they are not replaced by the
single character they appear to refer to. If the example above were
enclosed in a CDATA section, the "&amp;amp;" would not be replaced
by "&amp;"; neither would the &lt;p&gt; be recognized as a start
tag. The representation of general entities, both internal and
external, are defined within the extended (XML) interfaces of DOM
Level 1 [<a class='noxref' href='references.html#DOM-Level-1'>DOM
Level 1</a>].</p>
<p>Note: When a DOM representation of a document is serialized as
XML or HTML text, applications will need to check each character in
text data to see if it needs to be escaped using a numeric or
pre-defined entity. Failing to do so could result in invalid HTML
or XML. Also, <a
href='glossary.html#dt-implementation'><em>implementations</em></a>
should be aware of the fact that serialization into a character
encoding ("charset") that does not fully cover ISO 10646 may fail
if there are characters in markup or CDATA sections that are not
present in the encoding.</p>
</div>
<!-- div2 ID-E7C30824 -->
<div class='div2'><a id="ID-Conformance" name='ID-Conformance'></a>
<h2 id='ID-Conformance-h2' class='div2'>Conformance</h2>
<p>This section explains the different levels of conformance to DOM
Level 2. DOM Level 2 consists of 14 modules. It is possible to
conform to DOM Level 2, or to a DOM Level 2 module.</p>
<p>An implementation is DOM Level 2 conformant if it supports the
Core module defined in this document (see <a
href='core.html#ID-BBACDC08'>Fundamental Interfaces</a>). An
implementation conforms to a DOM Level 2 module if it supports all
the interfaces for that module and the associated semantics.</p>
<p>Here is the complete list of DOM Level 2.0 modules and the
features used by them. Feature names are case-insensitive.</p>
<dl>
<dt><b>Core module</b></dt>
<dd>defines the feature <a
href='core.html#ID-BBACDC08'><em>"Core"</em></a>.</dd>
<dt><b>XML module</b></dt>
<dd>defines the feature <a
href='core.html#ID-E067D597'><em>"XML"</em></a>.</dd>
<dt><b>HTML module</b></dt>
<dd>defines the feature "HTML". (see [<a class='noxref'
href='references.html#DOMHTML-inf'>DOM Level 2 HTML</a>]).
<p><b>Note:</b> At time of publication, this DOM Level 2 module is
not yet a W3C Recommendation.</p>
</dd>
<dt><b>Views module</b></dt>
<dd>defines the feature <a
href='http://www.w3.org/TR/DOM-Level-2-Views/views.html'><em>"Views"</em></a>
in [<a class='noxref' href='references.html#DOMViews-inf'>DOM Level
2 Views</a>].</dd>
<dt><b>Style Sheets module</b></dt>
<dd>defines the feature <a
href='http://www.w3.org/TR/DOM-Level-2-Style/stylesheets.html'><em>"StyleSheets"</em></a>
in [<a class='noxref' href='references.html#DOMStyleSheets-inf'>DOM
Level 2 Style Sheets</a>].</dd>
<dt><b>CSS module</b></dt>
<dd>defines the feature <a
href='http://www.w3.org/TR/DOM-Level-2-Style/css.html'><em>"CSS"</em></a>
in [<a class='noxref' href='references.html#DOMCSS-inf'>DOM Level 2
CSS</a>].</dd>
<dt><b>CSS2 module</b></dt>
<dd>defines the feature <a
href='http://www.w3.org/TR/DOM-Level-2-Style/css.html'><em>"CSS2"</em></a>
in [<a class='noxref' href='references.html#DOMCSS-inf'>DOM Level 2
CSS</a>].</dd>
<dt><b>Events module</b></dt>
<dd>defines the feature <a
href='http://www.w3.org/TR/DOM-Level-2-Events/events.html'><em>"Events"</em></a>
in [<a class='noxref' href='references.html#DOMEvents-inf'>DOM
Level 2 Events</a>].</dd>
<dt><b>User interface Events module</b></dt>
<dd>defines the feature <a
href='http://www.w3.org/TR/DOM-Level-2-Events/events.html'><em>"UIEvents"</em></a>
in [<a class='noxref' href='references.html#DOMEvents-inf'>DOM
Level 2 Events</a>].</dd>
<dt><b>Mouse Events module</b></dt>
<dd>defines the feature <a
href='http://www.w3.org/TR/DOM-Level-2-Events/events.html'><em>"MouseEvents"</em></a>
in [<a class='noxref' href='references.html#DOMEvents-inf'>DOM
Level 2 Events</a>].</dd>
<dt><b>Mutation Events module</b></dt>
<dd>defines the feature <a
href='http://www.w3.org/TR/DOM-Level-2-Events/events.html'><em>"MutationEvents"</em></a>
in [<a class='noxref' href='references.html#DOMEvents-inf'>DOM
Level 2 Events</a>].</dd>
<dt><b>HTML Events module</b></dt>
<dd>defines the feature <a
href='http://www.w3.org/TR/DOM-Level-2-Events/events.html'><em>"HTMLEvents"</em></a>
in [<a class='noxref' href='references.html#DOMEvents-inf'>DOM
Level 2 Events</a>].</dd>
<dt><b>Range module</b></dt>
<dd>defines the feature <a
href='http://www.w3.org/TR/DOM-Level-2-Traversal-Range/ranges.html'>
<em>"Range"</em></a> in [<a class='noxref'
href='references.html#DOMRange-inf'>DOM Level 2 Range</a>].</dd>
<dt><b>Traversal module</b></dt>
<dd>defines the feature <a
href='http://www.w3.org/TR/DOM-Level-2-Traversal-Range/traversal.html'>
<em>"Traversal"</em></a> in [<a class='noxref'
href='references.html#DOMTraversal-inf'>DOM Level 2
Traversal</a>].</dd>
</dl>
<p>A DOM implementation must not return <code>"true"</code> to the
<code>hasFeature(feature, version)</code> <a
href='glossary.html#dt-method'><em>method</em></a> of the <a
href='core.html#ID-102161490'><code>DOMImplementation</code></a>
interface for that feature unless the implementation conforms to
that module. The <code>version</code> number for all features used
in DOM Level 2.0 is "2.0".</p>
</div>
<!-- div2 ID-Conformance -->
<div class='div2'><a id="ID-E7C30826" name='ID-E7C30826'></a>
<h2 id='ID-E7C30826-h2' class='div2'>DOM Interfaces and DOM
Implementations</h2>
<p>The DOM specifies interfaces which may be used to manage XML or
HTML documents. It is important to realize that these interfaces
are an abstraction - much like "abstract base classes" in C++, they
are a means of specifying a way to access and manipulate an
application's internal representation of a document. Interfaces do
not imply a particular concrete implementation. Each DOM
application is free to maintain documents in any convenient
representation, as long as the interfaces shown in this
specification are supported. Some DOM implementations will be
existing programs that use the DOM interfaces to access software
written long before the DOM specification existed. Therefore, the
DOM is designed to avoid implementation dependencies; in
particular,</p>
<ol>
<li>Attributes defined in the IDL do not imply concrete objects
which must have specific data members - in the language bindings,
they are translated to a pair of get()/set() functions, not to a
data member. Read-only attributes have only a get() function in the
language bindings.</li>
<li>DOM applications may provide additional interfaces and objects
not found in this specification and still be considered DOM
conformant.</li>
<li>Because we specify interfaces and not the actual objects that
are to be created, the DOM cannot know what constructors to call
for an implementation. In general, DOM users call the createX()
methods on the Document class to create document structures, and
DOM implementations create their own internal representations of
these structures in their implementations of the createX()
functions.</li>
</ol>
<p>The Level 1 interfaces were extended to provide both Level 1 and
Level 2 functionality.</p>
<p>DOM implementations in languages other than Java or ECMAScript
may choose bindings that are appropriate and natural for their
language and run time environment. For example, some systems may
need to create a Document2 class which inherits from Document and
contains the new methods and attributes.</p>
<p>DOM Level 2 does not specify multithreading mechanisms.</p>
</div>
<!-- div2 ID-E7C30826 --></div>
<!-- div1 Introduction -->
<div class='navbar' align='center'>
<hr title='Navigation area separator' />
<a accesskey='p' href='copyright-notice.html'>previous</a> &nbsp;
<a accesskey='n' href='core.html'>next</a> &nbsp; <a accesskey='c'
href='Overview.html#contents'>contents</a> &nbsp; <a accesskey='i'
href='def-index.html'>index</a></div>
</body>
</html>