Unicode Visualization

Unicode Information

"Unicode is an information technology standard for the consistent encoding, representation, and handling of text expressed in most of the world's writing systems. The standard, which is maintained by the Unicode Consortium, defines as of the current version (15.0) 149,186 characters covering 161 modern and historic scripts, as well as symbols, emoji (including in colors), and non-visual control and formatting codes.

The Unicode Standard, however, includes more than just the base code. Alongside the character encodings, the Consortium's official publication includes a wide variety of details about the scripts and how to display them: normalization rules, decomposition, collation, rendering, and bidirectional text display order for multilingual texts, and so on. The Standard also includes reference data files and visual charts to help developers and designers correctly implement the repertoire.

Unicode can be stored using several different encodings, which translate the character codes into sequences of bytes. The Unicode standard defines three and several other encodings exist, all in practice variable-length encodings. The most common encodings are the ASCII-compatible UTF-8, the ASCII-incompatible UTF-16 (compatible with the obsolete UCS-2), and the Chinese Unicode encoding standard GB18030 which is not an official Unicode standard but is used in China and implements Unicode fully." Wikipedia: Unicode
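To make the difference between these encodings concrete, here is a minimal sketch that encodes a few sample characters with Python's built-in codecs and prints the resulting bytes as hex. The helper name `encode_all` and the choice of sample characters are illustrative assumptions, not part of the page above.

```python
def encode_all(ch: str) -> dict[str, str]:
    """Return the bytes of `ch` in several encodings, as space-separated hex."""
    # Big-endian variants are used so that no byte order mark is prepended.
    encodings = ["utf-8", "utf-16-be", "utf-32-be", "gb18030"]
    return {name: ch.encode(name).hex(" ") for name in encodings}


# Sample characters (arbitrary choice): ASCII, a BMP symbol, a supplementary-plane emoji.
for ch in ["A", "\u20ac", "\U0001F600"]:
    print(f"U+{ord(ch):04X}", encode_all(ch))
```

The ASCII letter occupies a single byte in UTF-8 but two in UTF-16, while the emoji needs four bytes in UTF-8 and a surrogate pair (two 16-bit units) in UTF-16, which is why both are variable-length encodings.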

Unicode organizes its code space into 17 planes of 65,536 code points each. The number of code points available for characters is reduced by the 2,048 surrogates and the 66 noncharacters:

$$17 \text{ planes} \cdot 65{,}536 \text{ code points} - 2{,}048 \text{ surrogates} - 66 \text{ noncharacters} = 1{,}111{,}998 \text{ characters}$$
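The arithmetic can be checked with a short sketch that derives each term from the code point ranges instead of hard-coding it. The constant and function names are illustrative assumptions. The ranges themselves (surrogates U+D800 to U+DFFF, noncharacters U+FDD0 to U+FDEF plus the last two code points of every plane) follow the Unicode Standard.

```python
PLANES = 17
CODE_POINTS_PER_PLANE = 0x10000       # 65,536
SURROGATES = 0xDFFF - 0xD800 + 1      # 2,048 code points reserved for UTF-16


def is_noncharacter(cp: int) -> bool:
    """True for U+FDD0..U+FDEF and for the last two code points of each plane."""
    return 0xFDD0 <= cp <= 0xFDEF or (cp & 0xFFFE) == 0xFFFE


def plane_of(ch: str) -> int:
    """Plane index (0 to 16) of a single character: the code point divided by 65,536."""
    return ord(ch) >> 16


total = PLANES * CODE_POINTS_PER_PLANE
noncharacters = sum(is_noncharacter(cp) for cp in range(total))
available = total - SURROGATES - noncharacters

print(f"{total:,} - {SURROGATES:,} - {noncharacters} = {available:,}")
```

Counting the noncharacters by scanning the whole code space is slower than writing 66 directly, but it confirms the figure independently: 32 in the U+FDD0 block plus 2 at the end of each of the 17 planes.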

