Unicode is a computing industry standard for the consistent encoding, representation and handling of text expressed in most of the world's writing systems.

Codes between SC and TC

Characters that are written identically in Simplified Chinese (SC) and Traditional Chinese (TC) share a single unified code point, while characters whose written forms differ are assigned separate code points. For example:

中华 / 中華 - 中: U+4E2D, 华: U+534E, 華: U+83EF. Note: 中 has only one code point.
汉字 / 漢字 - 汉: U+6C49, 漢: U+6F22, 字: U+5B57. Note: 字 has only one code point.
学习 / 學習 - 学: U+5B66, 學: U+5B78, 习: U+4E60, 習: U+7FD2.
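
The code points listed above can be checked with a few lines of Python. This is a minimal sketch: the helper names `code_point` and `compare_words` are made up for illustration, and the position-by-position pairing assumes the SC and TC spellings have the same length and map one character to one character, which holds for these three words but not for Chinese text in general.

```python
def code_point(ch: str) -> str:
    """Return the U+XXXX notation for a single character."""
    return f"U+{ord(ch):04X}"


def compare_words(simplified: str, traditional: str):
    """Pair characters position by position and report their code points.

    Assumes both words have the same length (illustrative only).
    """
    rows = []
    for sc, tc in zip(simplified, traditional):
        # shared is True when SC and TC use the same unified code point
        rows.append((sc, code_point(sc), tc, code_point(tc), sc == tc))
    return rows


if __name__ == "__main__":
    for sc_word, tc_word in [("中华", "中華"), ("汉字", "漢字"), ("学习", "學習")]:
        for sc, sc_cp, tc, tc_cp, shared in compare_words(sc_word, tc_word):
            status = "shared" if shared else "distinct"
            print(f"{sc} {sc_cp}  {tc} {tc_cp}  {status}")
```

Running the script prints one row per character position, marking 中 and 字 as shared and the other pairs as distinct.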

Heteronyms (多音字)

Characters with more than one reading, given here in pinyin with tone numbers:

  • 朝 : chao2 / zhao1
  • 单 : dan1 / shan4
  • 仇 : chou2 / qiu2

Online Tools

  1. Escaped Unicode, Decimal NCRs, Hexadecimal NCRs, UTF-8 Converter -- recommended.
  2. A visual Unicode database -- for searching or browsing.
  3. Unicode character inspector -- for lookups. It also reports duplicated input.
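
The notations named by the first tool (escaped Unicode, decimal NCRs, hexadecimal NCRs, UTF-8) can also be produced locally. The sketch below uses only the Python standard library. The function names are hypothetical, and the output formatting (uppercase hex digits, space-separated bytes) is an assumption rather than a description of what the online converter emits.

```python
import html


def escaped_unicode(text: str) -> str:
    """\\uXXXX for BMP characters, \\UXXXXXXXX for code points above U+FFFF."""
    return "".join(
        f"\\u{ord(c):04X}" if ord(c) <= 0xFFFF else f"\\U{ord(c):08X}"
        for c in text
    )


def decimal_ncr(text: str) -> str:
    """Decimal numeric character references, e.g. &#20013;"""
    return "".join(f"&#{ord(c)};" for c in text)


def hex_ncr(text: str) -> str:
    """Hexadecimal numeric character references, e.g. &#x4E2D;"""
    return "".join(f"&#x{ord(c):X};" for c in text)


def utf8_hex(text: str) -> str:
    """UTF-8 encoding shown as space-separated hex bytes."""
    return " ".join(f"{b:02X}" for b in text.encode("utf-8"))


if __name__ == "__main__":
    sample = "中"
    print(escaped_unicode(sample))  # \u4E2D
    print(decimal_ncr(sample))      # &#20013;
    print(hex_ncr(sample))          # &#x4E2D;
    print(utf8_hex(sample))         # E4 B8 AD
    # html.unescape decodes both NCR forms back to the original text
    print(html.unescape(hex_ncr(sample)) == sample)  # True
```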

TBD - To be done

See Also

Last modified on May 22, 2015, 1:35:06 PM