LYCOS RETRIEVER
Unicode: Unicode Standard
When writing about a Unicode character, it is normal to write "U+" followed by a hexadecimal number indicating the character's code point. For code points in the Basic Multilingual Plane (BMP), four digits are used; for code points outside the BMP, five or six digits are used, as required. Older versions of the standard used similar notations, but with slightly different rules. For example, Unicode 3.0 used "U-" followed by eight digits, and allowed "U+" to be used only with exactly four digits in order to indicate a code unit, not a code point.
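The current notation described above can be illustrated with a short Java sketch. The class name `CodePointNotation` and the helper `toUPlus` are hypothetical names chosen for this example. The sketch assumes the present-day rule only (at least four hexadecimal digits, more as required), not the older "U-" form.

```java
final class CodePointNotation {
    // Hypothetical helper: formats a code point as "U+" followed by
    // at least four uppercase hexadecimal digits, more if needed.
    static String toUPlus(int codePoint) {
        if (!Character.isValidCodePoint(codePoint)) {
            throw new IllegalArgumentException("not a Unicode code point: " + codePoint);
        }
        return String.format("U+%04X", codePoint);
    }

    public static void main(String[] args) {
        System.out.println(toUPlus(0x41));     // BMP, four digits: U+0041
        System.out.println(toUPlus(0x1F600));  // outside the BMP, five digits: U+1F600
        System.out.println(toUPlus(0x10FFFD)); // outside the BMP, six digits: U+10FFFD
    }
}
```

The `%04X` format pads to four digits but never truncates, so code points outside the BMP come out with five or six digits without any special casing.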
The most common Java task that requires some Unicode know-how is opening a file that contains Unicode data. Tip #1: Use Readers and Writers to work with Unicode data. Java's standard InputStream and OutputStream objects are for reading and writing binary bytes; they are not Unicode-aware. Readers and Writers are Unicode-aware and will perform all the necessary encoding and decoding for you.
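As a minimal sketch of the tip above, the following wraps byte streams in a Reader and a Writer so that encoding and decoding happen in one place. The class and method names (`UnicodeFileIo`, `writeUtf8`, `readUtf8`) are hypothetical. Naming UTF-8 explicitly instead of relying on the platform default is an assumption of this example, not something the quoted tip prescribes.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

final class UnicodeFileIo {
    // The Writer encodes Java chars into UTF-8 bytes on the way out.
    static void writeUtf8(Path file, String text) throws IOException {
        try (BufferedWriter out = new BufferedWriter(
                new OutputStreamWriter(Files.newOutputStream(file), StandardCharsets.UTF_8))) {
            out.write(text);
        }
    }

    // The Reader decodes UTF-8 bytes back into Java chars on the way in.
    static String readUtf8(Path file) throws IOException {
        StringBuilder sb = new StringBuilder();
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(Files.newInputStream(file), StandardCharsets.UTF_8))) {
            char[] buf = new char[4096];
            int n;
            while ((n = in.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("unicode-demo", ".txt");
        try {
            String text = "na\u00EFve \u65E5\u672C\u8A9E"; // 9 chars, some non-ASCII
            writeUtf8(tmp, text);
            System.out.println(readUtf8(tmp).equals(text)); // true
            System.out.println(Files.size(tmp));            // 16 (bytes on disk)
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```

The nine characters occupy sixteen bytes on disk, which is the gap between characters and bytes that a raw InputStream would leave the caller to handle.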
For the first time, in Unicode 3.1, characters are encoded beyond the original 16-bit code space, or the BMP (Plane 0). These new characters, encoded at code positions of U+10000 or higher, are synchronized with the international standard ISO/IEC 10646-2. In addition to two Private Use Areas, plane 15 (U+F0000 to U+FFFFD) and plane 16 (U+100000 to U+10FFFD), Unicode 3.1 and 10646-2 define three new supplementary planes:
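The code positions quoted above can be checked with a small Java sketch. The class `SupplementaryPlanes` and the helper `planeOf` are hypothetical names. The sketch assumes only that each plane spans 0x10000 code points, so the plane number is the code point shifted right by 16 bits.

```java
final class SupplementaryPlanes {
    // Hypothetical helper: each plane holds 0x10000 code points,
    // so the plane number is the value above the low 16 bits.
    static int planeOf(int codePoint) {
        if (!Character.isValidCodePoint(codePoint)) {
            throw new IllegalArgumentException("not a Unicode code point: " + codePoint);
        }
        return codePoint >>> 16;
    }

    public static void main(String[] args) {
        int first = 0x10000; // first code position beyond the BMP
        System.out.println(planeOf(first));                            // 1
        System.out.println(Character.isSupplementaryCodePoint(first)); // true

        // In Java's UTF-16 strings such a code point takes two char units.
        char[] units = Character.toChars(first);
        System.out.printf("%04X %04X%n", (int) units[0], (int) units[1]); // D800 DC00

        System.out.println(planeOf(0xF0000));  // 15 (Private Use)
        System.out.println(planeOf(0x10FFFD)); // 16 (Private Use)
    }
}
```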
Oracle's support of Unicode is quite comprehensive. Oracle Database 10g Release 2 provides full support for Unicode 4.0, the standard for multilingual support. This support allows customers to develop, deploy, and host multiple languages in a single central database or as part of a grid. Oracle ... offers the flexibility to store all data in a Unicode database in UTF-8 or to incrementally store select columns in the Unicode datatype in UTF-8 or UTF-16.
Unicode is the international standard whose goal is to assign every character needed by every written human language a single unique integer, called a code point. It is the explicit aim of Unicode to replace traditional character encodings such as those defined by the ISO 8859 standard, which are used in the various countries of the world but are largely incompatible with each other.
Unicode enables the exchange of text data internationally and creates the foundation for global software. The recognized standard technology for internationalization worldwide, Unicode currently enables the conversion of all major languages of the Americas, Europe, the Middle East, Africa, India, Asia, and the Pacific Basin.