Unicode character classes

Sequence Description
\p{C} Other.
\p{Cc} Other, control.
\p{Cf} Other, format.
\p{Co} Other, private use.
\p{Cs} Other, surrogate.
\p{L} Letter.
\p{LC} Letter, cased.
\p{Ll} Letter, lowercase.
\p{Lm} Letter, modifier.
\p{Lo} Letter, other.
\p{Lt} Letter, titlecase.
\p{Lu} Letter, uppercase.
\p{M} Mark.
\p{Mc} Mark, spacing combining.
\p{Me} Mark, enclosing.
\p{Mn} Mark, nonspacing.
\p{N} Number.
\p{Nd} Number, decimal digit.
\p{Nl} Number, letter.
\p{No} Number, other.
\p{P} Punctuation.
\p{Pc} Punctuation, connector.
\p{Pd} Punctuation, dash.
\p{Pe} Punctuation, close.
\p{Pf} Punctuation, final quote.
\p{Pi} Punctuation, initial quote.
\p{Po} Punctuation, other.
\p{Ps} Punctuation, open.
\p{S} Symbol.
\p{Sc} Symbol, currency.
\p{Sk} Symbol, modifier.
\p{Sm} Symbol, math.
\p{So} Symbol, other.
\p{Z} Separator.
\p{Zl} Separator, line.
\p{Zp} Separator, paragraph.
\p{Zs} Separator, space.

These character classes are only available if the option --enable-parle-utf32 was passed at compile time.
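
To get a feel for which characters each sequence covers, the Unicode general categories can be inspected outside the extension. The sketch below is only an illustration in Python, using the standard unicodedata module; it does not call Parle, and the helper name matches_class is hypothetical. It assumes that a one-letter class such as \p{L} covers every two-letter category starting with that letter, and that \p{LC} means Lu, Ll or Lt, as in the Unicode standard.

```python
import unicodedata


def matches_class(ch: str, cls: str) -> bool:
    """Check whether a single character belongs to a general-category class.

    `cls` is the name inside \\p{...}, for example "L", "Lu" or "LC".
    This is an illustrative helper, not part of any regex engine.
    """
    cat = unicodedata.category(ch)  # two-letter category, e.g. "Lu"
    if cls == "LC":
        # Cased letter: uppercase, lowercase or titlecase.
        return cat in ("Lu", "Ll", "Lt")
    if len(cls) == 1:
        # Major class: compare only the first letter of the category.
        return cat.startswith(cls)
    return cat == cls


# A few sample characters and the category Python reports for them.
for ch in ["A", "a", "5", "_", "€", " "]:
    print(repr(ch), unicodedata.category(ch))
```

With this helper, matches_class("€", "Sc") and matches_class("_", "Pc") both hold, while matches_class("5", "L") does not. The exact set of characters in each category depends on the Unicode version of the tool in use, so results at the edges may differ from what a given build of the extension matches.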
