Commit 4d666348, authored by Joe DeBlasio, committed by Commit Bot

Include U+0517 in set of Cyrillic/Latin lookalikes.

Cyrillic letter U+0517 (ԗ) looks somewhat similar to the Latin letter p.
This CL adds this character to the set of Cyrillic characters that look
like Latin characters. Domains made up entirely of Cyrillic/Latin
lookalikes are displayed as punycode in URLs.

Bug: 863663
Change-Id: I4340c48d124c9c4cd3d3b5d0f9d3865d709e082d
Reviewed-on: https://chromium-review.googlesource.com/c/1286825
Commit-Queue: Joe DeBlasio <jdeblasio@chromium.org>
Commit-Queue: Peter Kasting <pkasting@chromium.org>
Reviewed-by: Peter Kasting <pkasting@chromium.org>
Cr-Commit-Position: refs/heads/master@{#600582}
parent c5aef007
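
The rule in the commit message (a label made up entirely of Cyrillic/Latin lookalikes is shown as punycode) can be sketched in standalone C++. This is a simplified illustration, not the Chromium implementation: the real code keeps the set in an `icu::UnicodeSet`, while this sketch uses a plain `std::u32string`. The helper name `LabelIsAllLatinAlike` is hypothetical, and the code points were transcribed by hand from the updated set, so treat the list as illustrative.

```cpp
#include <algorithm>
#include <string>

// Assumption: hand-transcribed code points of the lookalike set after this
// CL, including U+0517. The real code builds an icu::UnicodeSet instead.
const std::u32string kCyrillicLatinAlike =
    U"\u0430\u0441\u0501\u0435\u04bb\u0456\u0458\u04cf\u043e\u0440\u0517"
    U"\u051b\u0455\u051d\u0445\u0443\u044a\u042c\u04bd\u043f\u0433\u0475\u0461";

// Hypothetical helper: returns true when the label is non-empty and every
// code point in it belongs to the lookalike set. Such a label would be
// treated as spoofable and displayed as punycode.
bool LabelIsAllLatinAlike(const std::u32string& label) {
  return !label.empty() &&
         std::all_of(label.begin(), label.end(), [](char32_t c) {
           return kCyrillicLatinAlike.find(c) != std::u32string::npos;
         });
}
```

With this sketch, the label from the new test case (`\x0455\x0441\x043e\x0517\x0435`) consists only of lookalikes, whereas музей contains letters outside the set and would still be displayed as Unicode.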
@@ -177,7 +177,7 @@ IDNSpoofChecker::IDNSpoofChecker() {
// These Cyrillic letters look like Latin. A domain label entirely made of
// these letters is blocked as a simplified whole-script-spoofable.
cyrillic_letters_latin_alike_ = icu::UnicodeSet(
-icu::UnicodeString::fromUTF8("[асԁеһіјӏорԛѕԝхуъЬҽпгѵѡ]"), status);
+icu::UnicodeString::fromUTF8("[асԁеһіјӏорԗԛѕԝхуъЬҽпгѵѡ]"), status);
cyrillic_letters_latin_alike_.freeze();
cyrillic_letters_ =
@@ -382,6 +382,9 @@ const IDNTestCase idn_cases[] = {
// музей (museum in Russian) has characters without a Latin-look-alike.
{"xn--e1adhj9a.com", L"\x043c\x0443\x0437\x0435\x0439.com", true},
+// ѕсоԗе.com is Cyrillic with Latin lookalikes.
+{"xn--e1ari3f61c.com", L"\x0455\x0441\x043e\x0517\x0435.com", false},
// Combining Diacritic marks after a script other than Latin-Greek-Cyrillic
{"xn--rsa2568fvxya.com", L"\xd55c\x0301\xae00.com", false}, // 한́글.com
{"xn--rsa0336bjom.com", L"\x6f22\x0307\x5b57.com", false}, // 漢̇字.com