diff --git a/java/external/xdocs/dom/core/i18n.html b/java/external/xdocs/dom/core/i18n.html
deleted file mode 100644
index 782128f..0000000
--- a/java/external/xdocs/dom/core/i18n.html
+++ /dev/null
@@ -1,303 +0,0 @@
-Accessing code point boundaries
-

13 November, 2000

-
- -
- -

Appendix B: Accessing code point boundaries

- -
-
Mark Davis, IBM
- -
Lauren Wood, SoftQuad Software Inc.
-
- -
-

Table of contents

- - -
- -
- -

B.1: Introduction

- -

This appendix is an informative, not a normative, part of the Level 2 DOM specification.

- -

Characters are represented in Unicode by numbers called code points (also called scalar values). These numbers can range from 0 up to 1,114,111 = 10FFFF₁₆ (although some of these values are illegal). Each code point can be directly encoded with a 32-bit code unit. This encoding is termed UCS-4 (or UTF-32). The DOM specification, however, uses UTF-16, in which the most frequent characters (which have values up to FFFF₁₆) are represented by a single 16-bit code unit, while characters above FFFF₁₆ use a special pair of code units called a surrogate pair. For more information, see [Unicode] or the Unicode Web site.
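The split between single code units and surrogate pairs can be sketched in Java as follows. This is an illustration only and not part of the appendix: the class name SurrogatePairSketch and the method toUtf16 are hypothetical, and the arithmetic is the standard UTF-16 encoding form. The sketch does not reject the surrogate code points D800₁₆ to DFFF₁₆, which are among the illegal values mentioned above.

```java
final class SurrogatePairSketch {
    /**
     * Encodes one code point as one or two UTF-16 code units.
     * Hypothetical helper for illustration; not defined by the DOM specification.
     */
    static char[] toUtf16(int codePoint) {
        if (codePoint < 0 || codePoint > 0x10FFFF) {
            throw new IllegalArgumentException("not a code point: " + codePoint);
        }
        if (codePoint <= 0xFFFF) {
            // Fits in a single 16-bit code unit.
            return new char[] { (char) codePoint };
        }
        // Above FFFF: spread the remaining 20 bits over two code units.
        int v = codePoint - 0x10000;
        char high = (char) (0xD800 + (v >> 10));   // upper 10 bits
        char low = (char) (0xDC00 + (v & 0x3FF));  // lower 10 bits
        return new char[] { high, low };
    }
}
```

For example, toUtf16(0x1D50A) yields the two code units D835 and DD0A, while toUtf16(0x41) yields the single code unit 0041.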

- -

While indexing by code points as opposed to code units is not common in programs, some specifications such as XPath (and therefore XSLT and XPointer) use code point indices. For interfacing with such formats it is recommended that the programming language provide string processing methods for converting code point indices to code unit indices and back. Some languages do not provide these functions natively; for these it is recommended that the native String type that is bound to DOMString be extended to enable this conversion. An example of how such an API might look is supplied below.

- -

Note: Since these methods are supplied as an illustrative example of the type of functionality that is required, the names of the methods, exceptions, and interface may differ from those given here.

-
- - -
- -

B.2: Methods

- -
-
Interface StringExtend
- -
-

Extensions to a language's native String class or interface

- -
-

-IDL Definition
- -
-
-
-interface StringExtend {
-  int                findOffset16(in int offset32)
-                                        raises(StringIndexOutOfBoundsException);
-  int                findOffset32(in int offset16)
-                                        raises(StringIndexOutOfBoundsException);
-};
-
-
- -
-
- -
Methods
- -
-
-
findOffset16
- -
-
Returns the UTF-16 offset that corresponds to a UTF-32 offset. Used for random access.

Note: You can always round-trip from a UTF-32 offset to a UTF-16 offset and back. You can round-trip from a UTF-16 offset to a UTF-32 offset and back if and only if the offset16 is not in the middle of a surrogate pair. Unmatched surrogates count as a single UTF-16 value.

- -
Parameters - -
-
-
offset32 of type int
- -
UTF-32 offset.
-
-
-
-
- - -
Return Value - -
- - - - - -
-

int

-
-

UTF-16 offset

-
-
-
- - -
Exceptions - -
- - - - - -
-

StringIndexOutOfBoundsException

-
-

if offset32 is out of bounds.
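One possible reading of findOffset16 is sketched below in Java. It is a sketch under stated assumptions, not the binding itself: the class name Offset16Sketch is hypothetical, and the method takes the string as an explicit argument because java.lang.String cannot be extended. It walks the string one code point at a time, treating a matched surrogate pair as one code point and an unmatched surrogate as one code point of its own.

```java
final class Offset16Sketch {
    /**
     * Converts a code point (UTF-32) offset into a code unit (UTF-16) offset.
     * Hypothetical static form of the StringExtend method, for illustration.
     */
    static int findOffset16(String source, int offset32) {
        if (offset32 < 0) {
            throw new StringIndexOutOfBoundsException(offset32);
        }
        int offset16 = 0;
        for (int remaining = offset32; remaining > 0; remaining--) {
            if (offset16 >= source.length()) {
                // Fewer code points in the string than requested.
                throw new StringIndexOutOfBoundsException(offset32);
            }
            char c = source.charAt(offset16++);
            // A high surrogate followed by a low surrogate is one code point.
            if (Character.isHighSurrogate(c)
                    && offset16 < source.length()
                    && Character.isLowSurrogate(source.charAt(offset16))) {
                offset16++;
            }
        }
        return offset16;
    }
}
```

For a string made of "a", the code point U+1D50A and "b" (four code units), findOffset16(s, 2) returns 3, the code unit index of "b". Java's own String.offsetByCodePoints(0, offset32) performs a comparable conversion, although it reports errors with IndexOutOfBoundsException.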

-
-
-
- - -
- - -
- -
findOffset32
- -
-
Returns the UTF-32 offset corresponding to a UTF-16 offset. Used for random access. To find the UTF-32 length of a string, use:
-
-len32 = source.findOffset32(source.length());
-
-
- -

Note: If the UTF-16 offset is into the middle of a surrogate pair, then the UTF-32 offset of the end of the pair is returned; that is, the index of the char after the end of the pair. You can always round-trip from a UTF-32 offset to a UTF-16 offset and back. You can round-trip from a UTF-16 offset to a UTF-32 offset and back if and only if the offset16 is not in the middle of a surrogate pair. Unmatched surrogates count as a single UTF-16 value.

- -
Parameters - -
-
-
offset16 of type int
- -
UTF-16 offset
-
-
-
-
- - -
Return Value - -
- - - - - -
-

int

-
-

UTF-32 offset

-
-
-
- - -
Exceptions - -
- - - - - -
-

StringIndexOutOfBoundsException

-
-

if offset16 is out of bounds.
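The reverse conversion might look like the following Java sketch. As before, the class name Offset32Sketch and the static, two-argument form are assumptions made for illustration. The loop counts code points that start before offset16, so an offset in the middle of a surrogate pair yields the index of the code point after the pair, as the note describes.

```java
final class Offset32Sketch {
    /**
     * Converts a code unit (UTF-16) offset into a code point (UTF-32) offset.
     * Hypothetical static form of the StringExtend method, for illustration.
     */
    static int findOffset32(String source, int offset16) {
        if (offset16 < 0 || offset16 > source.length()) {
            throw new StringIndexOutOfBoundsException(offset16);
        }
        int offset32 = 0;
        int i = 0;
        while (i < offset16) {
            char c = source.charAt(i++);
            // Consume the low half of a matched pair together with the high half.
            if (Character.isHighSurrogate(c)
                    && i < source.length()
                    && Character.isLowSurrogate(source.charAt(i))) {
                i++;
            }
            offset32++;
        }
        return offset32;
    }
}
```

With the same four-code-unit string ("a", U+1D50A, "b"), findOffset32(s, 2) and findOffset32(s, 3) both return 2, and findOffset32(s, s.length()) returns 3, the length of the string in code points.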

-
-
-
- - -
- - -
-
-
-
-
-
-
- -
diff --git a/java/external/xdocs/dom/core/images/table.gif b/java/external/xdocs/dom/core/images/table.gif
deleted file mode 100644
index bdcea3b..0000000
Binary files a/java/external/xdocs/dom/core/images/table.gif and /dev/null differ