    Revert "Fix RenderText itemization for complex emoji" · 67c6d71e
    Joshua Pawlicki authored
    
    This reverts commit 55549387.
    
    Reason for revert: breaks gfx_unittests on (none) GPU on Mac
    FallbackFontCommonScript/RenderTextTestWithFallbackFontCase.FallbackFont/common21
    Expected equality of these values:
      missing_glyphs
        Which is: 1
      0
    
    Original change's description:
    > Fix RenderText itemization for complex emoji
    >
    > This CL refactors the way RenderText does ItemizeTextToRuns
    > for the corner cases where a run should be split into
    > grapheme clusters.
    >
    > The previous code named these cases:
    >   * Unusual characters
    >   * Special characters
    >   * Non-regular characters
    > But the names should be more specific about the purpose of
    > splitting the runs.
    >
    > Also, the algorithm used for splitting a sequence of codepoints
    > was based on the code block of the corresponding codepoints,
    > with a few cases that tried to merge adjacent codepoints into
    > a grapheme. The algorithm was incorrect in multiple cases (e.g. emoji).
    >
    > Unicode provides a way to split a sequence of codepoints into
    > graphemes and grapheme clusters.
    > It defines a state machine that uses the codepoint properties
    > to decide whether the current location is a grapheme boundary.
    >
    > The ICU library provides an API over the character properties
    > to help iterate over graphemes:
    >   * ubrk_open
    >   * ubrk_first
    >   * ubrk_next
    >   * ubrk_close
    >
    > The class base::i18n::BreakIterator(..., BREAK_CHARACTER)
    > provides an easy-to-use wrapper over that API.
    >
    > The current CL replaces the previous character-based splitting
    > algorithm with the grapheme-based version.
    >
    > See emoji sequences:
    >   http://www.unicode.org/reports/tr51/
    >   1.4.5 Emoji Sequences
    >
    > The full emoji list:
    >   * http://unicode.org/emoji/charts/full-emoji-list.html
    > Emoji data, used to make our unittests:
    >   * http://www.unicode.org/Public/emoji/12.0/emoji-data.txt
    >
    > see: UNICODE TEXT SEGMENTATION (http://unicode.org/reports/tr29/)
    > see: https://cs.chromium.org/chromium/src/third_party/icu/source/common/unicode/ubrk.h
    > see: https://cs.chromium.org/chromium/src/base/i18n/break_iterator.h
    >
    > Change-Id: I6b9a9c79021f2ce0e2db7cdefdd0838b5911f445
    > Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/1788804
    > Commit-Queue: Etienne Bergeron <etienneb@chromium.org>
    > Reviewed-by: Alexei Svitkine <asvitkine@chromium.org>
    > Reviewed-by: Robert Liao <robliao@chromium.org>
    > Cr-Commit-Position: refs/heads/master@{#712704}
    
    TBR=robliao@chromium.org,asvitkine@chromium.org,etienneb@chromium.org
    
    Change-Id: I6ea8c752e40ed07522d0d336ada289934c899477
    No-Presubmit: true
    No-Tree-Checks: true
    No-Try: true
    Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/1900433
    Reviewed-by: Joshua Pawlicki <waffles@chromium.org>
    Commit-Queue: Joshua Pawlicki <waffles@chromium.org>
    Cr-Commit-Position: refs/heads/master@{#712778}
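
    Illustration (not part of either change): the original description
    argues that runs should be split on grapheme cluster boundaries rather
    than on the code block of each codepoint. The sketch below is a minimal,
    self-contained C++ approximation of that idea. It does not use ICU or
    base::i18n::BreakIterator and is not the Chromium implementation. The
    names ExtendsPreviousCluster and SplitIntoClusters are hypothetical, and
    the joining rule is a deliberately small subset of the rules in UAX #29,
    assumed here only to show why a ZWJ emoji sequence should stay in one
    piece.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: returns true for codepoints that, in this simplified
// model, attach to the preceding codepoint instead of starting a new cluster.
// The real UAX #29 rules cover many more properties than these four ranges.
bool ExtendsPreviousCluster(char32_t cp) {
  return cp == 0x200D ||                      // ZERO WIDTH JOINER
         (cp >= 0xFE00 && cp <= 0xFE0F) ||    // variation selectors
         (cp >= 0x1F3FB && cp <= 0x1F3FF) ||  // emoji skin tone modifiers
         (cp >= 0x0300 && cp <= 0x036F);      // combining diacritical marks
}

// Hypothetical helper: splits a sequence of codepoints into approximate
// grapheme clusters. A codepoint joins the current cluster when it is an
// extending codepoint, or when the previous codepoint was a ZWJ.
// Simplification: UAX #29 only joins across a ZWJ between pictographic
// characters; this sketch joins whatever follows a ZWJ.
std::vector<std::u32string> SplitIntoClusters(const std::u32string& text) {
  std::vector<std::u32string> clusters;
  bool previous_was_zwj = false;
  for (char32_t cp : text) {
    const bool joins = !clusters.empty() &&
                       (ExtendsPreviousCluster(cp) || previous_was_zwj);
    if (joins) {
      clusters.back().push_back(cp);
    } else {
      clusters.push_back(std::u32string(1, cp));
    }
    previous_was_zwj = (cp == 0x200D);
  }
  return clusters;
}
```

    With this sketch, the family sequence U+1F468 U+200D U+1F469 U+200D
    U+1F467 comes back as a single five-codepoint cluster, whereas splitting
    by code block would separate the joiners (General Punctuation) from the
    emoji. Production code would rely on the ICU break iterator named in the
    original description rather than a hand-written rule like this one.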
render_text_unittest.cc 255 KB