LYCOS RETRIEVER
Unicode: Unicode Transformation Format
Unicode defines two mapping methods: the Unicode Transformation Format (UTF) encodings, and the Universal Character Set (UCS) encodings. An encoding maps the range of Unicode code points (or possibly a subset of it) to sequences of values in some fixed-size range, termed code values. The numbers in the names of the encodings indicate the number of bits in one code value (for UTF encodings) or the number of bytes per code value (for UCS encodings). UTF-8 and UTF-16 are probably the most commonly used encodings. UCS-2 is an obsolete subset of UTF-16; UCS-4 and UTF-32 are functionally equivalent.
Source:
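As an illustration of the code value sizes described above, the following minimal Python sketch counts how many code values a single character needs in UTF-8, UTF-16, and UTF-32. The function name code_unit_counts and the sample characters are chosen here for illustration and do not come from the quoted source; the sketch relies only on Python's built-in codecs.

```python
def code_unit_counts(text):
    """Return how many code values `text` needs in each encoding.

    The byte length is divided by the size of one code value:
    1 byte for UTF-8, 2 bytes for UTF-16, 4 bytes for UTF-32.
    The "-le" codec variants are used so that no byte order mark is added.
    """
    return {
        "utf-8": len(text.encode("utf-8")),
        "utf-16": len(text.encode("utf-16-le")) // 2,
        "utf-32": len(text.encode("utf-32-le")) // 4,
    }


# Illustrative samples: U+0041, U+00E9, U+20AC, and U+1D11E (outside the 16-bit range).
for sample in ["A", "\u00e9", "\u20ac", "\U0001D11E"]:
    print(f"U+{ord(sample):04X}", code_unit_counts(sample))
```

For these samples, UTF-8 needs between one and four 8-bit values, UTF-16 needs one 16-bit value except for U+1D11E, which needs two, and UTF-32 always needs exactly one 32-bit value.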
UTF-16: This is the 16-bit encoding form of the Unicode Standard where characters are assigned a unique 16-bit value, with the exception of characters encoded by surrogate pairs, which consist of a pair of 16-bit values. The Unicode 16-bit encoding form is identical to the International Organization for Standardization/International Electrotechnical Commission (ISO/IEC) transformation format UTF-16. In UTF-16, any characters that are mapped up to the number 65,535 are encoded as a single 16-bit value; characters mapped above the number 65,535 are encoded as pairs of 16-bit values. (For more information on surrogate pairs, see "Surrogate Pairs" later in this chapter.) UTF-16 little-endian is the encoding standard at Microsoft (and in the Windows operating system).
Source:
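The surrogate pair rule for characters above 65,535 can be sketched in a few lines of Python. This is a minimal illustration of the standard UTF-16 arithmetic, not code from the quoted source; the function name to_surrogate_pair is hypothetical, and the result is checked against Python's own utf-16-be codec.

```python
def to_surrogate_pair(code_point):
    """Split a code point above 0xFFFF into a UTF-16 surrogate pair.

    Returns (high, low), where high is in 0xD800..0xDBFF and
    low is in 0xDC00..0xDFFF.
    """
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("code point does not need a surrogate pair")
    offset = code_point - 0x10000        # 20-bit value
    high = 0xD800 + (offset >> 10)       # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)      # bottom 10 bits
    return high, low


high, low = to_surrogate_pair(0x1D11E)
print(f"{high:04X} {low:04X}")  # D834 DD1E

# Cross-check against Python's built-in big-endian UTF-16 encoder.
expected = "\U0001D11E".encode("utf-16-be")
assert expected == high.to_bytes(2, "big") + low.to_bytes(2, "big")
```

Code points up to 65,535 never reach this function in a real encoder; they are written directly as a single 16-bit value.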
There are several formats for storing Unicode code points. When combined with the byte order of the hardware (big endian or little endian), they are known officially as "character encoding schemes." They are ... known by their UTF acronyms, which stand for "Unicode Transformation Format" or "Universal Character Set Transformation Format." See byte order.
Source:
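To make the role of byte order concrete, the sketch below serializes the same character under the big-endian and little-endian UTF-16 encoding schemes using Python's built-in codecs. The helper name utf16_schemes is an assumption made for this example; codecs.BOM_UTF16_BE is the standard library's constant for the big-endian byte order mark.

```python
import codecs


def utf16_schemes(text):
    """Encode `text` under both fixed-byte-order UTF-16 schemes."""
    return {
        "UTF-16BE": text.encode("utf-16-be"),  # most significant byte first
        "UTF-16LE": text.encode("utf-16-le"),  # least significant byte first
    }


schemes = utf16_schemes("A")
print(schemes["UTF-16BE"].hex())  # 0041
print(schemes["UTF-16LE"].hex())  # 4100

# With a byte order mark in front, the generic "utf-16" decoder
# can work out the byte order by itself.
marked = codecs.BOM_UTF16_BE + schemes["UTF-16BE"]
print(marked.decode("utf-16"))  # A
```

The same 16-bit code value, 0x0041, produces two different byte sequences, which is why the byte order has to be either fixed by the scheme name or signaled in the data.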
This is a no-frills plain-text web page containing text in many languages encoded in Unicode Transformation Format 8 (UTF-8). You might see a lot of "unknown glyph" boxes or gibberish, depending on your browser, font, and locale.
Source: