Supplementary Character Support Approach
- Use the primitive type int to represent code points in low-level APIs, such as the static methods of the Character class.
- Interpret char sequences in all forms (char[], implementations of java.lang.CharSequence, implementations of java.text.CharacterIterator) as UTF-16 sequences, and promote their use in higher-level APIs.
- Provide APIs to easily convert between various char and code point based representations.
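As a rough illustration of the three points above, the sketch below converts between an int code point and its UTF-16 char representation using static methods of the Character class. The class name CodePointConversions and its helper methods are made up for this example; only the Character calls come from the JDK.

```java
public class CodePointConversions {

    // Encode one code point as UTF-16: one char for a BMP character,
    // two chars (a surrogate pair) for a supplementary character.
    public static char[] toUtf16(int codePoint) {
        return Character.toChars(codePoint);
    }

    // Decode the code point that starts at the given char index
    // of any UTF-16 char sequence (String, StringBuilder, ...).
    public static int decodeAt(CharSequence text, int index) {
        return Character.codePointAt(text, index);
    }

    public static void main(String[] args) {
        int gClef = 0x1D11E; // U+1D11E MUSICAL SYMBOL G CLEF, outside the BMP

        char[] units = toUtf16(gClef);
        System.out.println(Character.isSupplementaryCodePoint(gClef)); // true
        System.out.println(units.length);                              // 2
        System.out.println(Character.isHighSurrogate(units[0]));       // true
        System.out.println(Character.isLowSurrogate(units[1]));        // true

        // Round trip: the char pair decodes back to the same int code point.
        String text = new String(units);
        System.out.println(decodeAt(text, 0) == gClef);                // true
    }
}
```

The int return type is what lets a single value carry a supplementary character; a char can only ever hold one half of the pair.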
Good blog on Unicode support in J2SE 5: John Conner's blog
Highlights:
# char is a UTF-16 code unit, not a code point
# new low-level APIs use an int to represent a Unicode code point
# high level APIs have been updated to understand surrogate pairs
# a preference for char sequence APIs (String, CharSequence, char[]) over methods that take a single char
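The highlights above can be seen in a small sketch that walks a char sequence by code point rather than by char. The class CodePointIteration and its codePoints helper are hypothetical names for this example; the sample string uses U+10400 (DESERET CAPITAL LETTER LONG I) as an arbitrary supplementary character.

```java
import java.util.ArrayList;
import java.util.List;

public class CodePointIteration {

    // Collect the code points of a UTF-16 char sequence, stepping over
    // one or two chars at a time depending on each code point's width.
    public static List<Integer> codePoints(CharSequence text) {
        List<Integer> result = new ArrayList<>();
        for (int i = 0; i < text.length(); ) {
            int cp = Character.codePointAt(text, i);
            result.add(cp);
            i += Character.charCount(cp); // 1 for BMP, 2 for supplementary
        }
        return result;
    }

    public static void main(String[] args) {
        // 'a', then U+10400 written as a surrogate pair, then 'b'
        String s = "a\uD801\uDC00b";

        System.out.println(s.length());                      // 4 (UTF-16 code units)
        System.out.println(s.codePointCount(0, s.length())); // 3 (code points)

        // A lone surrogate is not a letter, but the full code point is.
        System.out.println(Character.isLetter(s.charAt(1)));      // false
        System.out.println(Character.isLetter(s.codePointAt(1))); // true

        System.out.println(codePoints(s).size());            // 3
    }
}
```

Looping with charAt and i++ would visit the two surrogates separately, which is why the char-by-char style is discouraged for text that may contain supplementary characters.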