Allow whole-script confusable Cyrillic domains only on Cyrillic TLDs

A whole-script confusable Cyrillic domain consists of entirely Cyrillic characters that look identical to Latin characters (e.g. xn--80ak6aa92e[.]com decodes to аррӏе[.]com where аррӏе is in fact '\x0430\x0440\x0440\x04cf\x0435').

A previous change allowed whole-script confusable Cyrillic characters on non-ASCII top level domains only. This means that xn--80ak6aa92e[.]com remains punycode (TLD is .com) but xn--80ak6aa92e[.]xn--p1ai is decoded as аррӏе[.]рф (TLD is Cyrillic). However, this also allows spoofs in other non-ASCII TLDs such as аррӏе[.]中国 so it's not a sufficient measure.

This change further limits allowable whole-script confusable Cyrillic domains to Cyrillic TLDs (instead of non-ASCII) and a small list of additional TLDs containing a large number of Cyrillic domains (bg, by, kz, pyc, ru, su, ua, uz). The idea is that users familiar with Cyrillic are more likely to encounter these TLDs and notice any discrepancies in the displayed domain name.

Bug: 968505
Change-Id: Ib7462c9776f3640a5f60e5c79ac1a0c5d7b2028c
Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/1881887
Commit-Queue: Mustafa Emre Acer <meacer@chromium.org>
Reviewed-by: Christopher Thompson <cthomp@chromium.org>
Reviewed-by: Peter Kasting <pkasting@chromium.org>
Cr-Commit-Position: refs/heads/master@{#712764}
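
The rule described in this message can be sketched roughly as follows. This is an illustrative Python approximation, not the code in this change: the function names, the small look-alike character set, and the restriction to the basic Cyrillic block are all assumptions made for the example. Only the list of additional TLDs is taken from the message above.

```python
# Illustrative sketch only; names and character sets are assumptions,
# not the identifiers or data used by the actual change.

# Additional ASCII TLDs listed in the commit message.
ALLOWED_ASCII_TLDS = {"bg", "by", "kz", "pyc", "ru", "su", "ua", "uz"}

# Assumed, deliberately small subset of Cyrillic letters that resemble
# Latin letters (a, c, e, i, j, l, o, p, s, x, y).
LATIN_LOOKALIKE_CYRILLIC = set(
    "\u0430\u0441\u0435\u0456\u0458\u04cf\u043e\u0440\u0455\u0445\u0443"
)


def is_cyrillic(ch: str) -> bool:
    """Assumption: only the basic Cyrillic block U+0400..U+04FF is checked."""
    return "\u0400" <= ch <= "\u04ff"


def is_whole_script_confusable(label: str) -> bool:
    """True if every character of the label is a Latin look-alike."""
    return bool(label) and all(ch in LATIN_LOOKALIKE_CYRILLIC for ch in label)


def tld_allows_confusable_cyrillic(tld: str) -> bool:
    """True for an all-Cyrillic TLD or one of the listed ASCII TLDs."""
    if tld and all(is_cyrillic(ch) for ch in tld):
        return True
    return tld in ALLOWED_ASCII_TLDS


def passes_cyrillic_confusable_check(label: str, tld: str) -> bool:
    """Return False when the label should stay in punycode under this rule.

    Labels that are not whole-script confusable pass here; any other
    safety checks are out of scope for this sketch.
    """
    if not is_whole_script_confusable(label):
        return True
    return tld_allows_confusable_cyrillic(tld)


spoof = "\u0430\u0440\u0440\u04cf\u0435"  # the look-alike label from the message
for tld in ("com", "\u0440\u0444", "\u4e2d\u56fd", "ru"):
    print(tld, passes_cyrillic_confusable_check(spoof, tld))
```

Under these assumptions the look-alike label is rejected on .com and .中国 but accepted on .рф and .ru, matching the behavior the message describes.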