Revert "Fix RenderText itemization for complex emoji"

This reverts commit 55549387.

Reason for revert: breaks gfx_unittests on (none) GPU on Mac

FallbackFontCommonScript/RenderTextTestWithFallbackFontCase.FallbackFont/common21
Expected equality of these values:
  missing_glyphs
    Which is: 1
  0

Original change's description:
> Fix RenderText itemization for complex emoji
> >
> > This CL is a refactoring of the way RenderText does
> > ItemizeTextToRuns for the corner cases where a run
> > should be split into grapheme clusters.
> >
> > The previous code was naming these cases:
> >   * Unusual characters
> >   * Special characters
> >   * Non-regular characters
> > But they should be more specific about the purpose of splitting the runs.
> >
> > Also, the algorithm used for splitting a sequence of codepoints
> > was based on the code block of the corresponding codepoints,
> > and a few cases were trying to merge adjacent codepoints into
> > a grapheme. The algorithm was incorrect in multiple cases (e.g. emoji).
> >
> > Unicode provides a way to split a sequence of codepoints into
> > graphemes and grapheme clusters.
> > It provides a state machine which uses the codepoint properties
> > to decide if the current location is a grapheme boundary.
> >
> > The ICU library provides an API over the character properties
> > to help iterate over graphemes.
> >   * ubrk_open
> >   * ubrk_first
> >   * ubrk_next
> >   * ubrk_close
> >
> > The class base::i18n::BreakIterator(..., BREAK_CHARACTER)
> > provides an easy-to-use wrapper over that API.
> >
> > The current CL replaces the previous character-based splitting
> > algorithm with the grapheme-based version.
> >
> > See emoji sequence:
> >   http://www.unicode.org/reports/tr51/
> >   1.4.5 Emoji Sequences
> >
> > The full emoji list:
> >   * http://unicode.org/emoji/charts/full-emoji-list.html
> > Emoji data, used to make our unittests:
> >   * http://www.unicode.org/Public/emoji/12.0/emoji-data.txt
> >
> > see: UNICODE TEXT SEGMENTATION (http://unicode.org/reports/tr29/)
> > see: https://cs.chromium.org/chromium/src/third_party/icu/source/common/unicode/ubrk.h
> > see: https://cs.chromium.org/chromium/src/base/i18n/break_iterator.h
>
> Change-Id: I6b9a9c79021f2ce0e2db7cdefdd0838b5911f445
> Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/1788804
> Commit-Queue: Etienne Bergeron <etienneb@chromium.org>
> Reviewed-by: Alexei Svitkine <asvitkine@chromium.org>
> Reviewed-by: Robert Liao <robliao@chromium.org>
> Cr-Commit-Position: refs/heads/master@{#712704}

TBR=robliao@chromium.org,asvitkine@chromium.org,etienneb@chromium.org

Change-Id: I6ea8c752e40ed07522d0d336ada289934c899477
No-Presubmit: true
No-Tree-Checks: true
No-Try: true
Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/1900433
Reviewed-by: Joshua Pawlicki <waffles@chromium.org>
Commit-Queue: Joshua Pawlicki <waffles@chromium.org>
Cr-Commit-Position: refs/heads/master@{#712778}