Reland "Fix RenderText itemization for complex emoji"
This is a reland of 55549387

Original change's description:
> Fix RenderText itemization for complex emoji
>
> > This CL refactors how RenderText performs
> > ItemizeTextToRuns in the corner cases where a run
> > should be split into grapheme clusters.
> >
> > The previous code named these cases:
> >   * Unusual characters
> >   * Special characters
> >   * Non-regular characters
> > These names should be more specific about the purpose of splitting the runs.
> >
> > Also, the algorithm used to split a sequence of codepoints
> > was based on the code block of each codepoint, with a few
> > special cases that tried to merge adjacent codepoints into
> > a grapheme. The algorithm was incorrect in multiple cases (e.g. emoji).
> >
> > Unicode provides a way to split a sequence of codepoints into
> > graphemes and grapheme clusters.
> > It defines a state machine that uses the codepoint properties
> > to decide whether the current location is a grapheme boundary.
> >
> > The ICU library provides an API over the character properties
> > to help iterate over graphemes:
> >   * ubrk_open
> >   * ubrk_first
> >   * ubrk_next
> >   * ubrk_close
> >
> > The class base::i18n::BreakIterator(..., BREAK_CHARACTER)
> > provides an easy-to-use wrapper over that API.
> >
> > This CL replaces the previous character-based splitting
> > algorithm with the grapheme-based version.
> >
> > See emoji sequences:
> >   http://www.unicode.org/reports/tr51/
> >   1.4.5 Emoji Sequences
> >
> > The full emoji list:
> >   * http://unicode.org/emoji/charts/full-emoji-list.html
> > Emoji data, used to make our unittests:
> >   * http://www.unicode.org/Public/emoji/12.0/emoji-data.txt
> >
> > see: UNICODE TEXT SEGMENTATION (http://unicode.org/reports/tr29/)
> > see: https://cs.chromium.org/chromium/src/third_party/icu/source/common/unicode/ubrk.h
> > see: https://cs.chromium.org/chromium/src/base/i18n/break_iterator.h
>
> Change-Id: I6b9a9c79021f2ce0e2db7cdefdd0838b5911f445
> Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/1788804
> Commit-Queue: Etienne Bergeron <etienneb@chromium.org>
> Reviewed-by: Alexei Svitkine <asvitkine@chromium.org>
> Reviewed-by: Robert Liao <robliao@chromium.org>
> Cr-Commit-Position: refs/heads/master@{#712704}

Change-Id: I64c877907d6e961bd223a7c4cb1193c240b0b7ac
Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/1900347
Reviewed-by: Alexei Svitkine <asvitkine@chromium.org>
Commit-Queue: Etienne Bergeron <etienneb@chromium.org>
Cr-Commit-Position: refs/heads/master@{#713179}
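
Illustration (not part of the CL): the description above contrasts splitting by code block with splitting at grapheme cluster boundaries. The sketch below shows the boundary idea on raw code points. It is a deliberately simplified approximation and not the ICU or base::i18n::BreakIterator implementation: the helper names (IsExtendOrZwj, IsRegionalIndicator, ClusterStarts) are hypothetical, the character ranges cover only a small subset of the properties in UAX #29, and the zero-width-joiner rule is looser than the real one, which also checks that both neighbours are pictographic.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: code points that attach to the preceding one.
// Only a small, hand-picked subset of the real Unicode property data.
bool IsExtendOrZwj(char32_t c) {
  return (c >= 0x0300 && c <= 0x036F) ||    // combining diacritical marks
         (c >= 0xFE00 && c <= 0xFE0F) ||    // variation selectors
         (c >= 0x1F3FB && c <= 0x1F3FF) ||  // emoji skin-tone modifiers
         c == 0x200D;                       // zero width joiner (ZWJ)
}

// Hypothetical helper: regional indicator symbols, which pair up into flags.
bool IsRegionalIndicator(char32_t c) {
  return c >= 0x1F1E6 && c <= 0x1F1FF;
}

// Returns the index (in code points) at which each cluster starts.
std::vector<std::size_t> ClusterStarts(const std::u32string& text) {
  std::vector<std::size_t> starts;
  std::size_t ri_run = 0;  // consecutive regional indicators before index i
  for (std::size_t i = 0; i < text.size(); ++i) {
    const char32_t cur = text[i];
    bool is_boundary = true;
    if (i > 0) {
      const char32_t prev = text[i - 1];
      if (IsExtendOrZwj(cur)) {
        is_boundary = false;  // never break before an extending code point
      } else if (prev == 0x200D) {
        is_boundary = false;  // simplified: anything after a ZWJ joins it
      } else if (IsRegionalIndicator(prev) && IsRegionalIndicator(cur) &&
                 ri_run % 2 == 1) {
        is_boundary = false;  // second half of a flag pair
      }
    }
    ri_run = IsRegionalIndicator(cur) ? ri_run + 1 : 0;
    if (is_boundary) {
      starts.push_back(i);
    }
  }
  return starts;
}
```

For the 12 code points "a", "e" + U+0301, the family sequence U+1F468 U+200D U+1F469 U+200D U+1F467, the flag pair U+1F1E8 U+1F1E6, and U+1F44D U+1F3FD, this sketch reports cluster starts at 0, 1, 3, 8 and 10. A splitter keyed only on code blocks would cut the family sequence at each ZWJ, which is the class of error the CL describes.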