Commit 0a3c9b58 authored by Bruce Long's avatar Bruce Long Committed by Commit Bot

Windows Spellcheck: Generic preferred languages can be enabled

The browser language settings page allows a user to add preferred
languages, and only the added languages are candidates for toggling
on for spellchecking. Several of the accept languages come in both
regionally specific form (Brazilian Portuguese, "pt-BR" e.g.) and
generic form (Portuguese, "pt" e.g.). However, the current logic in the
the SpellcheckService class that maps accept language tags to more
complete BCP47 tags specifying Windows spellcheck dictionaries does
not map the platform dictionary name to the generic name if an exact
match is already found. For example, "pt-BR" gets mapped to its exact
match "pt-BR", but not to "pt". Since there is no Hunspell support for
generic "pt", you cannot enable generic Portuguese for spellchecking.

A more insidious issue is seen with Italian. If you install the
Italian (Italy) Windows language pack, you get dictionary support
for "it-IT" and "it-CH" (Italian as spoken in Italy and Switzerland
respectively). Since there is an exact match with the "it-IT" accept
language tag, generic "it" does not get mapped. But since generic "it"
does have Hunspell support, the hybrid spellcheck fallback logic will
kick in here. A user can enable generic Italian for spellcheck but in
fact Hunspell will be used, even with the Italian Windows language
pack in place.

The fix is to add an entry to the dictionary map for the generic
language subtag if a regionally specific platform spellcheck dictionary
is available, and pass the generic subtag to the platform API when
creating a Windows spellchecker. The platform API is smart enough to
recognize the primary language associated with a language pack, so
passing the "pt" subtag will in fact create the "pt-BR" spellchecker.

Note this fix only works when the Windows platform spellchecking
feature is enabled. If using Hunspell exclusively, you still can't
toggle generic languages unless they can be mapped to an available
Hunspell dictionary

Bug: 1105610
Change-Id: Ib995289a55ed8c8c51bf83e3a1ca4118a07814c8
Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/2310804Reviewed-by: default avatarRouslan Solomakhin <rouslan@chromium.org>
Commit-Queue: Bruce Long <brlong@microsoft.com>
Cr-Commit-Position: refs/heads/master@{#791808}
parent 6c73d460
...@@ -237,12 +237,18 @@ bool SpellcheckService::SignalStatusEvent( ...@@ -237,12 +237,18 @@ bool SpellcheckService::SignalStatusEvent(
// static // static
std::string SpellcheckService::GetSupportedAcceptLanguageCode( std::string SpellcheckService::GetSupportedAcceptLanguageCode(
const std::string& supported_language_full_tag) { const std::string& supported_language_full_tag,
bool generic_only) {
// Default to accept language in hardcoded list of Hunspell dictionaries // Default to accept language in hardcoded list of Hunspell dictionaries
// (kSupportedSpellCheckerLanguages). // (kSupportedSpellCheckerLanguages).
std::string supported_accept_language = std::string supported_accept_language =
spellcheck::GetCorrespondingSpellCheckLanguage( spellcheck::GetCorrespondingSpellCheckLanguage(
supported_language_full_tag); supported_language_full_tag);
if (generic_only) {
supported_accept_language = SpellcheckService::GetLanguageAndScriptTag(
supported_accept_language,
/* include_script_tag= */ false);
}
#if defined(OS_WIN) #if defined(OS_WIN)
if (!spellcheck::UseBrowserSpellChecker()) if (!spellcheck::UseBrowserSpellChecker())
...@@ -258,6 +264,11 @@ std::string SpellcheckService::GetSupportedAcceptLanguageCode( ...@@ -258,6 +264,11 @@ std::string SpellcheckService::GetSupportedAcceptLanguageCode(
std::vector<std::string> accept_languages; std::vector<std::string> accept_languages;
l10n_util::GetAcceptLanguages(&accept_languages); l10n_util::GetAcceptLanguages(&accept_languages);
if (generic_only) {
return GetSupportedAcceptLanguageCodeGenericOnly(
supported_language_full_tag, accept_languages);
}
// First try exact match. Per BCP47, tags are in ASCII and should be treated // First try exact match. Per BCP47, tags are in ASCII and should be treated
// as case-insensitive (although there are conventions for the capitalization // as case-insensitive (although there are conventions for the capitalization
// of subtags). // of subtags).
...@@ -284,10 +295,10 @@ std::string SpellcheckService::GetSupportedAcceptLanguageCode( ...@@ -284,10 +295,10 @@ std::string SpellcheckService::GetSupportedAcceptLanguageCode(
return base::EqualsCaseInsensitiveASCII( return base::EqualsCaseInsensitiveASCII(
SpellcheckService::GetLanguageAndScriptTag( SpellcheckService::GetLanguageAndScriptTag(
supported_language_full_tag, supported_language_full_tag,
/* include_script_tag */ true), /* include_script_tag= */ true),
SpellcheckService::GetLanguageAndScriptTag( SpellcheckService::GetLanguageAndScriptTag(
accept_language, accept_language,
/* include_script_tag */ true)); /* include_script_tag= */ true));
}); });
if (iter != accept_languages.end()) if (iter != accept_languages.end())
...@@ -297,30 +308,8 @@ std::string SpellcheckService::GetSupportedAcceptLanguageCode( ...@@ -297,30 +308,8 @@ std::string SpellcheckService::GetSupportedAcceptLanguageCode(
// kok as an accept language, but if the Konkani language pack is // kok as an accept language, but if the Konkani language pack is
// installed the Windows spellcheck API reports kok-Deva-IN for the // installed the Windows spellcheck API reports kok-Deva-IN for the
// dictionary name. // dictionary name.
iter = return GetSupportedAcceptLanguageCodeGenericOnly(supported_language_full_tag,
std::find_if(accept_languages.begin(), accept_languages.end(), accept_languages);
[supported_language_full_tag](const auto& accept_language) {
return base::EqualsCaseInsensitiveASCII(
SpellcheckService::GetLanguageAndScriptTag(
supported_language_full_tag,
/* include_script_tag */ false),
SpellcheckService::GetLanguageAndScriptTag(
accept_language,
/* include_script_tag */ false));
});
if (iter != accept_languages.end()) {
// Special case for Serbian--"sr" implies Cyrillic script. Don't mark it as
// supported for sr-Latn*.
if (base::EqualsCaseInsensitiveASCII(
SpellcheckService::GetLanguageAndScriptTag(
supported_language_full_tag,
/* include_script_tag */ true),
"sr-Latn")) {
return "";
}
return *iter;
}
#endif // defined(OS_WIN) #endif // defined(OS_WIN)
...@@ -615,7 +604,7 @@ void SpellcheckService::InitWindowsDictionaryLanguages( ...@@ -615,7 +604,7 @@ void SpellcheckService::InitWindowsDictionaryLanguages(
for (const auto& windows_spellcheck_language : windows_spellcheck_languages) { for (const auto& windows_spellcheck_language : windows_spellcheck_languages) {
std::string accept_language = std::string accept_language =
SpellcheckService::GetSupportedAcceptLanguageCode( SpellcheckService::GetSupportedAcceptLanguageCode(
windows_spellcheck_language); windows_spellcheck_language, /* generic_only */ false);
AddWindowsSpellcheckDictionary(accept_language, AddWindowsSpellcheckDictionary(accept_language,
windows_spellcheck_language); windows_spellcheck_language);
...@@ -627,9 +616,25 @@ void SpellcheckService::InitWindowsDictionaryLanguages( ...@@ -627,9 +616,25 @@ void SpellcheckService::InitWindowsDictionaryLanguages(
if (base::EqualsCaseInsensitiveASCII( if (base::EqualsCaseInsensitiveASCII(
"sr-Cyrl", SpellcheckService::GetLanguageAndScriptTag( "sr-Cyrl", SpellcheckService::GetLanguageAndScriptTag(
windows_spellcheck_language, windows_spellcheck_language,
/* include_script_tag */ true))) { /* include_script_tag= */ true))) {
AddWindowsSpellcheckDictionary("sr", windows_spellcheck_language); AddWindowsSpellcheckDictionary("sr", windows_spellcheck_language);
} }
// Add the generic language with the region subtag removed too if it exists
// in the list of accept languages, and use it when calling the Windows
// spellcheck APIs. For example, if the preferred language settings include
// just generic Portuguese (pt), but the Portuguese (Brazil) platform
// language pack (pt-BR) is installed, we want an entry for it so that the
// generic Portuguese language can be enabled for spellchecking. The Windows
// platform spellcheck API has logic to load the pt-BR dictionary if only pt
// is specified as the BCP47 language tag. The use of a map in
// AddWindowsSpellcheckDictionary ensures there won't be duplicate entries
// if a generic language was already added above (ar-SA would already be
// mapped to ar since the accept language ar-SA is not recognized by the
// browser e.g.).
accept_language = SpellcheckService::GetSupportedAcceptLanguageCode(
windows_spellcheck_language, /* generic_only */ true);
AddWindowsSpellcheckDictionary(accept_language, accept_language);
} }
// A user may have removed a language pack for a non-Hunspell language after // A user may have removed a language pack for a non-Hunspell language after
...@@ -671,6 +676,9 @@ void SpellcheckService::OverrideBinderForTesting(SpellCheckerBinder binder) { ...@@ -671,6 +676,9 @@ void SpellcheckService::OverrideBinderForTesting(SpellCheckerBinder binder) {
std::string SpellcheckService::GetLanguageAndScriptTag( std::string SpellcheckService::GetLanguageAndScriptTag(
const std::string& full_tag, const std::string& full_tag,
bool include_script_tag) { bool include_script_tag) {
if (full_tag.empty())
return "";
std::string language_and_script_tag; std::string language_and_script_tag;
std::vector<std::string> subtags = base::SplitString( std::vector<std::string> subtags = base::SplitString(
...@@ -694,6 +702,39 @@ std::string SpellcheckService::GetLanguageAndScriptTag( ...@@ -694,6 +702,39 @@ std::string SpellcheckService::GetLanguageAndScriptTag(
return language_and_script_tag; return language_and_script_tag;
} }
#if defined(OS_WIN)
// static
std::string SpellcheckService::GetSupportedAcceptLanguageCodeGenericOnly(
const std::string& supported_language_full_tag,
const std::vector<std::string>& accept_languages) {
auto iter =
std::find_if(accept_languages.begin(), accept_languages.end(),
[supported_language_full_tag](const auto& accept_language) {
return base::EqualsCaseInsensitiveASCII(
SpellcheckService::GetLanguageAndScriptTag(
supported_language_full_tag,
/* include_script_tag= */ false),
SpellcheckService::GetLanguageAndScriptTag(
accept_language,
/* include_script_tag= */ false));
});
if (iter != accept_languages.end()) {
// Special case for Serbian--"sr" implies Cyrillic script. Don't mark it as
// supported for sr-Latn*.
if (base::EqualsCaseInsensitiveASCII(
SpellcheckService::GetLanguageAndScriptTag(
supported_language_full_tag,
/* include_script_tag= */ true),
"sr-Latn")) {
return "";
}
return *iter;
}
return "";
}
// static // static
bool SpellcheckService::HasPrivateUseSubTag(const std::string& full_tag) { bool SpellcheckService::HasPrivateUseSubTag(const std::string& full_tag) {
std::vector<std::string> subtags = base::SplitString( std::vector<std::string> subtags = base::SplitString(
...@@ -704,7 +745,6 @@ bool SpellcheckService::HasPrivateUseSubTag(const std::string& full_tag) { ...@@ -704,7 +745,6 @@ bool SpellcheckService::HasPrivateUseSubTag(const std::string& full_tag) {
return base::Contains(subtags, "x"); return base::Contains(subtags, "x");
} }
#if defined(OS_WIN)
// static // static
std::string SpellcheckService::GetTagToPassToWindowsSpellchecker( std::string SpellcheckService::GetTagToPassToWindowsSpellchecker(
const std::string& accept_language, const std::string& accept_language,
...@@ -733,7 +773,7 @@ std::string SpellcheckService::GetTagToPassToWindowsSpellchecker( ...@@ -733,7 +773,7 @@ std::string SpellcheckService::GetTagToPassToWindowsSpellchecker(
return SpellcheckService::GetLanguageAndScriptTag( return SpellcheckService::GetLanguageAndScriptTag(
supported_language_full_tag, supported_language_full_tag,
/* include_script_tag */ true); /* include_script_tag= */ true);
} }
#endif // defined(OS_WIN) #endif // defined(OS_WIN)
......
...@@ -103,9 +103,12 @@ class SpellcheckService : public KeyedService, ...@@ -103,9 +103,12 @@ class SpellcheckService : public KeyedService,
// Get the best match of a supported accept language code for the provided // Get the best match of a supported accept language code for the provided
// language tag. Returns an empty string if no match is found. Method cannot // language tag. Returns an empty string if no match is found. Method cannot
// be defined in spellcheck_common.h as it depends on l10n_util, and code // be defined in spellcheck_common.h as it depends on l10n_util, and code
// under components cannot depend on ui/base. // under components cannot depend on ui/base. If |generic_only| is true,
// then only return the language subtag (first part of the full BCP47 tag)
// if the generic accept language is supported by the browser.
static std::string GetSupportedAcceptLanguageCode( static std::string GetSupportedAcceptLanguageCode(
const std::string& supported_language_full_tag); const std::string& supported_language_full_tag,
bool generic_only = false);
#if defined(OS_WIN) #if defined(OS_WIN)
// Since Windows platform dictionary support is determined asynchronously, // Since Windows platform dictionary support is determined asynchronously,
...@@ -216,12 +219,18 @@ class SpellcheckService : public KeyedService, ...@@ -216,12 +219,18 @@ class SpellcheckService : public KeyedService,
static std::string GetLanguageAndScriptTag(const std::string& full_tag, static std::string GetLanguageAndScriptTag(const std::string& full_tag,
bool include_script_tag); bool include_script_tag);
#if defined(OS_WIN)
// Returns the language subtag (first part of the full BCP47 tag)
// if the generic accept language is supported by the browser.
static std::string GetSupportedAcceptLanguageCodeGenericOnly(
const std::string& supported_language_full_tag,
const std::vector<std::string>& accept_languages);
// Returns true if full BCP47 language tag contains private use subtag (e.g in // Returns true if full BCP47 language tag contains private use subtag (e.g in
// the tag "ja-Latn-JP-x-ext"), indicating the tag is only for use by private // the tag "ja-Latn-JP-x-ext"), indicating the tag is only for use by private
// agreement. // agreement.
static bool HasPrivateUseSubTag(const std::string& full_tag); static bool HasPrivateUseSubTag(const std::string& full_tag);
#if defined(OS_WIN)
// Returns the BCP47 language tag to pass to the Windows spellcheck API, based // Returns the BCP47 language tag to pass to the Windows spellcheck API, based
// on the accept language and full tag, with special logic for languages that // on the accept language and full tag, with special logic for languages that
// can be written in different scripts. // can be written in different scripts.
......
...@@ -722,13 +722,15 @@ class SpellcheckServiceWindowsHybridBrowserTestDelayInit ...@@ -722,13 +722,15 @@ class SpellcheckServiceWindowsHybridBrowserTestDelayInit
// Used for faking the presence of Windows spellcheck dictionaries. // Used for faking the presence of Windows spellcheck dictionaries.
const std::vector<std::string> kWindowsSpellcheckLanguages = { const std::vector<std::string> kWindowsSpellcheckLanguages = {
"fi-FI" // Finnish has no Hunspell support. "fi-FI", // Finnish has no Hunspell support.
"fr-FR", // French has both Windows and Hunspell support. "fr-FR", // French has both Windows and Hunspell support.
"pt-BR" // Portuguese (Brazil) has both Windows and Hunspell support, but
// generic pt does not have Hunspell support.
}; };
// Used for testing whether primary preferred language is enabled by default for // Used for testing whether primary preferred language is enabled by default for
// spellchecking. // spellchecking.
const char kAcceptLanguages[] = "fi-FI,fi,ar-AR,fr-FR,hr,ceb"; const char kAcceptLanguages[] = "fi-FI,fi,ar-AR,fr-FR,fr,hr,ceb,pt-BR,pt";
const std::vector<std::string> kSpellcheckDictionariesBefore = { const std::vector<std::string> kSpellcheckDictionariesBefore = {
// Note that Finnish is initially unset, but has Windows spellcheck // Note that Finnish is initially unset, but has Windows spellcheck
// dictionary present. // dictionary present.
...@@ -736,16 +738,23 @@ const std::vector<std::string> kSpellcheckDictionariesBefore = { ...@@ -736,16 +738,23 @@ const std::vector<std::string> kSpellcheckDictionariesBefore = {
// dictionary is not present. // dictionary is not present.
"fr-FR", // French has both Windows and Hunspell support, and its Windows "fr-FR", // French has both Windows and Hunspell support, and its Windows
// spellcheck dictionary is present. // spellcheck dictionary is present.
"fr", // Generic language should also be toggleable for spellcheck.
"hr", // Croatian has Hunspell support. "hr", // Croatian has Hunspell support.
"ceb" // Cebuano doesn't have any dictionary support and should be removed "ceb", // Cebuano doesn't have any dictionary support and should be
// from preferences. // removed from preferences.
"pt-BR", // Portuguese (Brazil) has both Windows and Hunspell support, and
// its Windows spellcheck dictionary is present.
"pt" // Generic language should also be toggleable for spellcheck.
}; };
const std::vector<std::string> kSpellcheckDictionariesAfter = { const std::vector<std::string> kSpellcheckDictionariesAfter = {
"fi", // Finnish should have been enabled for spellchecking since it's the "fi", // Finnish should have been enabled for spellchecking since
// primary language. // it's the primary language.
"fr-FR", // French should still be there. "fr-FR", // French should still be there.
"hr" // So should Croatian. "fr", // Should still be entry for generic French.
"hr", // So should Croatian.
"pt-BR", // Portuguese (Brazil) should still be there.
"pt" // Should still be entry for generic Portuguese.
}; };
// As a prelude to the next test, sets the initial accept languages and // As a prelude to the next test, sets the initial accept languages and
...@@ -803,8 +812,7 @@ IN_PROC_BROWSER_TEST_F(SpellcheckServiceWindowsHybridBrowserTestDelayInit, ...@@ -803,8 +812,7 @@ IN_PROC_BROWSER_TEST_F(SpellcheckServiceWindowsHybridBrowserTestDelayInit,
std::map<std::string, std::string> std::map<std::string, std::string>
windows_spellcheck_dictionary_map_first_call = windows_spellcheck_dictionary_map_first_call =
service->windows_spellcheck_dictionary_map_; service->windows_spellcheck_dictionary_map_;
EXPECT_EQ(kWindowsSpellcheckLanguages.size(), EXPECT_FALSE(windows_spellcheck_dictionary_map_first_call.empty());
windows_spellcheck_dictionary_map_first_call.size());
// Check that the primary accept language has spellchecking enabled and // Check that the primary accept language has spellchecking enabled and
// that languages with no spellcheck support have spellchecking disabled. // that languages with no spellcheck support have spellchecking disabled.
......
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment