Enum GraphemeClusterClass
Unicode Grapheme_Cluster_Break property values and local rule sentinels. See https://www.unicode.org/reports/tr29/#Grapheme_Cluster_Break_Property_Values
public enum GraphemeClusterClass
Fields
Any = 0
Rule sentinel that matches any code point.
This is not a Unicode property value; it represents the "any" operand in UAX #29 boundary rules.
CarriageReturn = 1
U+000D CARRIAGE RETURN (CR).
Control = 3
Control characters, line and paragraph separators, format characters, and default-ignorable unassigned code points that form hard grapheme cluster boundaries.
This class excludes CR, LF, U+200C ZERO WIDTH NON-JOINER, U+200D ZERO WIDTH JOINER, and prepended concatenation marks because those participate in more specific UAX #29 rules.
Extend = 4
Extending code points that remain in the same extended grapheme cluster as the preceding base.
This includes Grapheme_Extend code points, emoji modifiers, U+200C ZERO WIDTH NON-JOINER, and a small number of spacing marks needed for canonical equivalence.
ExtendedPictographic = 13
Extended pictographic code points used by GB11 emoji ZWJ sequence handling.
This is not itself a Grapheme_Cluster_Break property value; UAX #29 uses it when matching emoji ZWJ sequences.
HangulLead = 8
Hangul leading consonant Jamo (Hangul_Syllable_Type = L).
HangulLeadVowel = 11
Hangul LV syllables.
HangulLeadVowelTail = 12
Hangul LVT syllables.
HangulTail = 10
Hangul trailing consonant Jamo (Hangul_Syllable_Type = T).
HangulVowel = 9
Hangul vowel Jamo (Hangul_Syllable_Type = V).
LineFeed = 2
U+000A LINE FEED (LF).
Other = 255
Other.
This is the Unicode Other/XX fallback for code points without an explicit grapheme cluster break class.
Prepend = 6
Code points that prepend to the following grapheme cluster.
This includes Indic_Syllabic_Category values Consonant_Preceding_Repha and Consonant_Prefixed, plus Prepended_Concatenation_Mark code points.
RegionalIndicator = 5
Regional indicator symbols used to build flag emoji pairs.
SpacingMark = 7
Spacing marks that extend the previous grapheme cluster.
This includes spacing marks whose Grapheme_Cluster_Break value is not Extend, plus U+0E33 THAI CHARACTER SARA AM and U+0EB3 LAO VOWEL SIGN AM.
ZeroWidthJoiner = 14
U+200D ZERO WIDTH JOINER.
Remarks
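UAX #29 applies these classes in ordered boundary rules to determine extended grapheme clusters; the first rule that matches a position decides whether a boundary falls there. Two members are not Grapheme_Cluster_Break property values: Any is a rule sentinel, and ExtendedPictographic is a separate property that GB11 consults when matching emoji ZWJ sequences.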
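To illustrate how ordered rules consume these classes, here is a minimal sketch of the UAX #29 rules that can be decided from two adjacent classes alone (GB3 through GB9b, then GB999). It is an illustration under stated assumptions, not the library's implementation: `PairwiseRules` and `IsBoundary` are hypothetical names, and the enum is copied locally so the block compiles on its own. GB11 (emoji ZWJ sequences) and GB12/GB13 (regional indicator pairs) need context beyond the adjacent pair, so they are deliberately left out; a real segmenter would have to check them before reaching the final fallback.

```csharp
// Local copy of the documented members and values so the sketch compiles on its own.
// The underlying type is left at the C# default because the page does not state it.
public enum GraphemeClusterClass
{
    Any = 0, CarriageReturn = 1, LineFeed = 2, Control = 3, Extend = 4,
    RegionalIndicator = 5, Prepend = 6, SpacingMark = 7,
    HangulLead = 8, HangulVowel = 9, HangulTail = 10,
    HangulLeadVowel = 11, HangulLeadVowelTail = 12,
    ExtendedPictographic = 13, ZeroWidthJoiner = 14, Other = 255,
}

// Hypothetical helper for illustration; not part of the documented API.
public static class PairwiseRules
{
    // Returns true when a boundary falls between two adjacent classes, using only
    // the rules that depend on the pair itself. GB11 and GB12/GB13 are NOT handled:
    // ZWJ followed by ExtendedPictographic, and RegionalIndicator pairs, fall
    // through to GB999 here and therefore report a boundary.
    public static bool IsBoundary(GraphemeClusterClass prev, GraphemeClusterClass next)
    {
        // GB3: CR x LF
        if (prev == GraphemeClusterClass.CarriageReturn && next == GraphemeClusterClass.LineFeed)
            return false;

        // GB4, GB5: always break after and before Control, CR, and LF.
        if (IsHardBreak(prev) || IsHardBreak(next))
            return true;

        // GB6: L x (L | V | LV | LVT)
        if (prev == GraphemeClusterClass.HangulLead &&
            (next == GraphemeClusterClass.HangulLead ||
             next == GraphemeClusterClass.HangulVowel ||
             next == GraphemeClusterClass.HangulLeadVowel ||
             next == GraphemeClusterClass.HangulLeadVowelTail))
            return false;

        // GB7: (LV | V) x (V | T)
        if ((prev == GraphemeClusterClass.HangulLeadVowel || prev == GraphemeClusterClass.HangulVowel) &&
            (next == GraphemeClusterClass.HangulVowel || next == GraphemeClusterClass.HangulTail))
            return false;

        // GB8: (LVT | T) x T
        if ((prev == GraphemeClusterClass.HangulLeadVowelTail || prev == GraphemeClusterClass.HangulTail) &&
            next == GraphemeClusterClass.HangulTail)
            return false;

        // GB9, GB9a: never break before Extend, ZWJ, or SpacingMark.
        if (next == GraphemeClusterClass.Extend ||
            next == GraphemeClusterClass.ZeroWidthJoiner ||
            next == GraphemeClusterClass.SpacingMark)
            return false;

        // GB9b: never break after Prepend.
        if (prev == GraphemeClusterClass.Prepend)
            return false;

        // GB999: Any / Any. Break everywhere else.
        return true;
    }

    private static bool IsHardBreak(GraphemeClusterClass c) =>
        c == GraphemeClusterClass.Control ||
        c == GraphemeClusterClass.CarriageReturn ||
        c == GraphemeClusterClass.LineFeed;
}
```

Rule order matters in this sketch: because GB5 is tested before GB9b, `IsBoundary(GraphemeClusterClass.Prepend, GraphemeClusterClass.Control)` reports a boundary even though Prepend otherwise attaches to whatever follows it. The final `return true` plays the role that the Any sentinel plays in the written rules.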