1.unicodedata.lookup():通过索引中的名称查找相应的字符

2.unicodedata.name():通过字符查找名称,与unicodedata.lookup()相反

3.unicodedata.decimal():返回表示数字字符的数值

4.unicodedata.digit():把一个合法的数字字符串转换为数字值

5.unicodedata.numeric():把一个表示数字的字符串转换为浮点数返回,与unicodedata.digit()不同的是:它可以任意表示数值的字符都可以,不仅仅限于0到9的字符

6.unicodedata.category():把一个字符返回它在UNICODE里分类的类型

UNICODE具体类型如下:
Code Description
[Cc] Other, Control
[Cf] Other, Format
[Cn] Other, Not Assigned (no characters in the file have this property)
[Co] Other, Private Use
[Cs] Other, Surrogate
[LC] Letter, Cased
[Ll] Letter, Lowercase
[Lm] Letter, Modifier
[Lo] Letter, Other
[Lt] Letter, Titlecase
[Lu] Letter, Uppercase
[Mc] Mark, Spacing Combining
[Me] Mark, Enclosing
[Mn] Mark, Nonspacing
[Nd] Number, Decimal Digit
[Nl] Number, Letter
[No] Number, Other
[Pc] Punctuation, Connector
[Pd] Punctuation, Dash
[Pe] Punctuation, Close
[Pf] Punctuation, Final quote (may behave like Ps or Pe depending on usage)
[Pi] Punctuation, Initial quote (may behave like Ps or Pe depending on usage)
[Po] Punctuation, Other
[Ps] Punctuation, Open
[Sc] Symbol, Currency
[Sk] Symbol, Modifier
[Sm] Symbol, Math
[So] Symbol, Other
[Zl] Separator, Line
[Zp] Separator, Paragraph
[Zs] Separator, Space
上述代码的结果依次如下:

网友评论