diff --git a/java/external/xdocs/dom/core/i18n.html b/java/external/xdocs/dom/core/i18n.html
deleted file mode 100644
index 782128f..0000000
--- a/java/external/xdocs/dom/core/i18n.html
+++ /dev/null
@@ -1,303 +0,0 @@
-
-
-
-
-This appendix is an informative, not a normative, part of the
-Level 2 DOM specification.
-
-Characters are represented in Unicode by numbers called code
-points (also called scalar values). These numbers can
-range from 0 up to 1,114,111 = 10FFFF₁₆ (although some
-of these values are illegal). Each code point can be directly
-encoded with a 32-bit code unit. This encoding is termed UCS-4 (or
-UTF-32). The DOM specification, however, uses UTF-16, in which the
-most frequent characters (which have values less than
-FFFF₁₆) are represented by a single 16-bit code unit,
-while characters above FFFF₁₆ use a special pair of code
-units called a surrogate pair. For more information, see [Unicode] or the
-Unicode Web site.
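The difference between code points and UTF-16 code units described above can be illustrated with a minimal Java sketch. The class name `SurrogateDemo` and the helper `utf16Units` are hypothetical names chosen for this illustration; only the standard `java.lang.Character` methods are relied on.

```java
// Illustrative sketch only: SurrogateDemo and utf16Units are hypothetical names.
class SurrogateDemo {
    /** Number of 16-bit code units needed to encode one code point in UTF-16. */
    static int utf16Units(int codePoint) {
        return Character.charCount(codePoint);
    }

    public static void main(String[] args) {
        int bmp = 0x00E9;    // U+00E9, not above FFFF: a single code unit
        int supp = 0x1D11E;  // U+1D11E, above FFFF: needs a surrogate pair

        System.out.println(utf16Units(bmp));   // prints 1
        System.out.println(utf16Units(supp));  // prints 2

        // Character.toChars returns the UTF-16 encoding of the code point.
        char[] pair = Character.toChars(supp);
        System.out.printf("%04X %04X%n", (int) pair[0], (int) pair[1]); // prints D834 DD1E
    }
}
```

A string containing U+1D11E therefore has one more code unit than it has code points, which is why code point indices and code unit indices diverge after the first such character.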
- -While indexing by code points as opposed to code units is not
-common in programs, some specifications such as XPath (and
-therefore XSLT and XPointer) use code point indices. For
-interfacing with such formats it is recommended that the
-programming language provide string processing methods for
-converting code point indices to code unit indices and back. Some
-languages do not provide these functions natively; for these it is
-recommended that the native String type that is bound
-to DOMString be
-extended to enable this conversion. An example of how such an API
-might look is supplied below.
-Note: Since these methods are supplied as an illustrative
-example of the type of functionality that is required, the names of
-the methods, exceptions, and interface may differ from those given
-here.
-Extensions to a language's native String class or interface
- -
-interface StringExtend {
- int findOffset16(in int offset32)
- raises(StringIndexOutOfBoundsException);
- int findOffset32(in int offset16)
- raises(StringIndexOutOfBoundsException);
-};
-
-findOffset16
-Note: You can always round-trip from a UTF-32 offset to a
-UTF-16 offset and back. You can round-trip from a UTF-16 offset to
-a UTF-32 offset and back if and only if the offset16 is not in the
-middle of a surrogate pair. Unmatched surrogates count as a single
-UTF-16 value.
-
-offset32 of type int
-
-UTF-16 offset
-
-if offset32 is out of bounds.
-
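One way findOffset16 might be realized in Java is sketched below. This is an assumption-laden illustration, not the binding defined by the specification: the class name `Offset16Sketch` and the static two-argument form (the string is passed explicitly instead of extending `String`) are hypothetical. Unmatched surrogates are counted as a single UTF-16 value, as the note above requires.

```java
// Illustrative sketch only: Offset16Sketch and the static signature are assumptions.
final class Offset16Sketch {
    /**
     * Converts a code point (UTF-32) offset into a code unit (UTF-16) offset.
     * Throws StringIndexOutOfBoundsException if offset32 is out of bounds.
     */
    static int findOffset16(String source, int offset32) {
        if (offset32 < 0) {
            throw new StringIndexOutOfBoundsException(offset32);
        }
        int offset16 = 0;
        for (int remaining = offset32; remaining > 0; remaining--) {
            if (offset16 >= source.length()) {
                throw new StringIndexOutOfBoundsException(offset32);
            }
            // codePointAt yields a lone surrogate as-is, so charCount is 1 for it
            // and 2 only for a well-formed surrogate pair.
            offset16 += Character.charCount(source.codePointAt(offset16));
        }
        return offset16;
    }

    public static void main(String[] args) {
        String s = "a\uD834\uDD1Eb"; // 'a', U+1D11E (surrogate pair), 'b'
        System.out.println(findOffset16(s, 2)); // prints 3: 'b' is code unit 3
    }
}
```

The standard method `String.offsetByCodePoints(0, offset32)` performs the same walk; the explicit loop is used here only to make the counting rule visible.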
-findOffset32
-len32 = findOffset32(source, source.length());
-
-
-Note: If the UTF-16 offset is into the middle of a
-surrogate pair, then the UTF-32 offset of the end of the
-pair is returned; that is, the index of the char after the end of
-the pair. You can always round-trip from a UTF-32 offset to a
-UTF-16 offset and back. You can round-trip from a UTF-16 offset to
-a UTF-32 offset and back if and only if the offset16 is not in the
-middle of a surrogate pair. Unmatched surrogates count as a single
-UTF-16 value.
-
-offset16 of type int
-
-UTF-32 offset
-
-if offset16 is out of bounds.
-
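A possible Java rendering of findOffset32 is sketched below, again as an illustration under stated assumptions and not as the specification's binding: the class name `Offset32Sketch` and the static two-argument form are hypothetical. The sketch leans on the standard `String.codePointCount`, which counts a high surrogate at the end of the range as one code point, so an offset in the middle of a surrogate pair maps to the UTF-32 offset of the end of the pair.

```java
// Illustrative sketch only: Offset32Sketch and the static signature are assumptions.
final class Offset32Sketch {
    /**
     * Converts a code unit (UTF-16) offset into a code point (UTF-32) offset.
     * Throws StringIndexOutOfBoundsException if offset16 is out of bounds.
     */
    static int findOffset32(String source, int offset16) {
        if (offset16 < 0 || offset16 > source.length()) {
            throw new StringIndexOutOfBoundsException(offset16);
        }
        // Unpaired surrogates in the range [0, offset16) count as one code point each.
        return source.codePointCount(0, offset16);
    }

    public static void main(String[] args) {
        String source = "a\uD834\uDD1Eb"; // 'a', U+1D11E (surrogate pair), 'b'

        // UTF-32 length of the string.
        int len32 = findOffset32(source, source.length());
        System.out.println(len32); // prints 3

        // Offset 2 is in the middle of the pair: the end of the pair is returned.
        System.out.println(findOffset32(source, 2)); // prints 2
        System.out.println(findOffset32(source, 3)); // prints 2
    }
}
```

Offsets 2 and 3 both map to 2, which shows why a UTF-16 offset in the middle of a surrogate pair cannot be round-tripped.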