[Lookalikes] Expand common words used in target embedding.

This CL expands the list of common words ignored by target embedding to
the full list from //components/url_formatter/spoof_checks/common_words.
It also removes any words on that list from the existing list of common
words, which is now maintained as a supplemental list.

This CL also adds logic to disable common word detection for domains on
the special list of domains that are allowed to be embedded, but only at
the end (higher-value domains that use common words). This is necessary
because, e.g., "office" is on the full common word list but was not on
the old list.

A side effect of this change is that the common word list is now
included on Android, which causes a large increase in binary size. A
future change may reduce the size of the word list used on Android, but
that would be a substantial engineering effort, and this list is an
important mitigation in a security feature. The list is already stored
efficiently as a DAFSA, so I know of no obvious way to shrink it.

Binary-Size: Size increase is unavoidable (see above).
Bug: 1154726
Change-Id: I417d92761377f6b6e11772b8f06c9b36b5083676
Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/2630738
Commit-Queue: Joe DeBlasio <jdeblasio@chromium.org>
Reviewed-by: Mustafa Emre Acer <meacer@chromium.org>
Cr-Commit-Position: refs/heads/master@{#844156}