Commit 37414e4b authored by manuk, committed by Commit Bot

[omnibox]: Dedupe historyQuick provider classification.

This is the 16th refactoring CL aimed at reducing duplication and
inconsistency for classifying omnibox results.

The historyQuick provider compares the user input and the suggestion
text to find the corresponding matches during construction of the
ScoredHistoryMatch precursor to the suggestion's AutocompleteMatch. The
term matches are then used for both scoring and classifying.

With this CL, historyQuick classification uses the 'FindTermMatches' and
'ClassifyTermMatches' methods that other providers use. This improves
consistency with other providers at the cost of inconsistency with the
historyQuick provider's scoring.
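
To make the classification step concrete, here is a minimal sketch of how a
list of term matches could be turned into style runs. Everything in it
(ToyMatch, ToySpan, ToyStyle, ToyClassify) is a hypothetical stand-in written
for illustration, not the real 'ClassifyTermMatches'; it assumes the matches
are sorted and non-overlapping, and that each run extends to the next run's
offset.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a term match: `length` chars starting at `offset`.
struct ToyMatch {
  size_t offset;
  size_t length;
};

// Hypothetical style flags, loosely modeled on the MATCH/URL/NONE flags that
// appear in the diff.
enum ToyStyle { kNone = 0, kUrl = 1 << 0, kMatch = 1 << 1 };

// A style run: `style` applies from `offset` up to the next run's offset.
struct ToySpan {
  size_t offset;
  int style;
};

// Converts sorted, non-overlapping matches into alternating style runs.
std::vector<ToySpan> ToyClassify(const std::vector<ToyMatch>& matches,
                                 size_t text_length,
                                 int match_style,
                                 int non_match_style) {
  std::vector<ToySpan> spans;
  size_t pos = 0;  // End of the previous match.
  for (const ToyMatch& m : matches) {
    // Unmatched text between the previous match and this one.
    if (m.offset > pos)
      spans.push_back({pos, non_match_style});
    spans.push_back({m.offset, match_style});
    pos = m.offset + m.length;
  }
  // Trailing unmatched text, or the whole text if nothing matched.
  if (pos < text_length || spans.empty())
    spans.push_back({pos, non_match_style});
  return spans;
}
```

For the 9-character text 'zxx.xx/xx' with matches at offsets 4 and 7 (each of
length 1), calling ToyClassify with kMatch | kUrl and kUrl yields runs starting
at offsets 0, 4, 5, 7 and 8, alternating between the non-match and match
styles.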

The differences between the current classification and this CL's:

1) Prefix matching. With this CL, if the input is an exact prefix of the
suggestion text, subsequent matching words in the suggestion text are
not bolded. E.g. for input 'x' and suggestion 'x x', before this CL,
both occurrences would be bolded; with this CL, only the first will be.

2) Midword matching. Before this CL, midword matches were allowed for
the URL host domain. E.g., for input 'x' and suggestion 'zxx.xx/xx',
before this CL, the first 5 occurrences would be bolded,
'z[xx].[xx]/[x]x'; with this CL, only the 3rd and 5th will be bolded,
'zxx.[x]x/[x]x'.

3) Input word-break separators. Before this CL, the user input was not
broken by symbols, though the suggestion text was. E.g., for input
'x%x y' and suggestion 'x%x x y%y y', before this CL, it would bold as
'[x%x] x [y]%[y] [y]'; with this CL, it will bold as
'[x]%[x] [x] [y]%[y] [y]'.

All 3 changes apply to suggestion classification (bolding) only;
determining which suggestions to display and their ordering is
unaffected.
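
The new bolding rules listed above can be approximated with a short sketch.
This is a simplified, case-sensitive, ASCII-only model written for
illustration; ToyMatch, SplitIntoWords, ToyFindTermMatches and Bracket are
hypothetical names, not the real 'FindTermMatches'. It assumes that (a) an
input that is an exact prefix of the text yields only that prefix match, (b)
the input is otherwise split into words on non-alphanumeric characters, and
(c) a word may only match at the start of a word in the text.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a term match: `length` chars starting at `offset`.
struct ToyMatch {
  size_t offset;
  size_t length;
};

// Splits `s` into alphanumeric words; every other character is a separator,
// so 'x%x y' yields the terms 'x', 'x', 'y'.
std::vector<std::string> SplitIntoWords(const std::string& s) {
  std::vector<std::string> words;
  std::string current;
  for (unsigned char c : s) {
    if (std::isalnum(c)) {
      current.push_back(static_cast<char>(c));
    } else if (!current.empty()) {
      words.push_back(current);
      current.clear();
    }
  }
  if (!current.empty())
    words.push_back(current);
  return words;
}

std::vector<ToyMatch> ToyFindTermMatches(const std::string& input,
                                         const std::string& text) {
  // Prefix rule: an exact prefix match suppresses all later matches.
  if (!input.empty() && text.compare(0, input.size(), input) == 0)
    return {ToyMatch{0, input.size()}};

  const std::vector<std::string> terms = SplitIntoWords(input);
  std::vector<ToyMatch> matches;
  size_t i = 0;
  while (i < text.size()) {
    const bool is_alnum = std::isalnum(static_cast<unsigned char>(text[i])) != 0;
    const bool word_start =
        is_alnum &&
        (i == 0 || !std::isalnum(static_cast<unsigned char>(text[i - 1])));
    size_t best = 0;  // Longest term matching at this word start, if any.
    if (word_start) {
      for (const std::string& term : terms) {
        if (term.size() > best && text.compare(i, term.size(), term) == 0)
          best = term.size();
      }
    }
    if (best > 0) {
      matches.push_back({i, best});
      i += best;  // The rest of this word is mid-word and cannot match.
    } else {
      ++i;
    }
  }
  return matches;
}

// Renders matches with the '[...]' notation used in this commit message.
std::string Bracket(const std::string& text,
                    const std::vector<ToyMatch>& matches) {
  std::string out;
  size_t pos = 0;
  for (const ToyMatch& m : matches) {
    out += text.substr(pos, m.offset - pos);
    out += '[' + text.substr(m.offset, m.length) + ']';
    pos = m.offset + m.length;
  }
  out += text.substr(pos);
  return out;
}
```

Under these assumptions, Bracket(text, ToyFindTermMatches(input, text))
reproduces the "with this CL" bolding for the three examples above: '[x] x',
'zxx.[x]x/[x]x' and '[x]%[x] [x] [y]%[y] [y]'.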

Re consistency with other providers: A user may see suggestions from
different providers with similar texts. E.g., the user input 'the' could
provide both search and historyQuick suggestions with the text 'the cake
ate the moon'. It would be surprising if such suggestions with the same
text were bolded differently; e.g. if the search suggestion were bolded
'[the] cake ate the moon', whereas the historyQuick suggestion were
bolded '[the] cake ate [the] moon'.

Re inconsistency with historyQuick's scoring: Scoring is consistent with
the previous bolding; all previously bolded terms would either
contribute (positively or negatively) to the suggestion's score or
disqualify the suggestion (e.g. 'm' disqualifies 'yahoo.com'). E.g., 'y
hoo' would bold both terms in the suggestion URL '[y]a[hoo].com', and
both would contribute to the score. With the classification changes,
only 'y' would be bolded, which suggests the 'hoo' term did not
contribute to scoring even though it did.

Bug: 366623
Change-Id: I478a9d4fcc63abe7aa55dba4274e8896d6bdc388
Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/1593717
Commit-Queue: manuk hovanesian <manukh@chromium.org>
Reviewed-by: Tommy Li <tommycli@chromium.org>
Cr-Commit-Position: refs/heads/master@{#664505}
parent 28f9927e
@@ -20,12 +20,12 @@
 #include "components/bookmarks/browser/bookmark_model.h"
 #include "components/history/core/browser/history_database.h"
 #include "components/history/core/browser/history_service.h"
+#include "components/omnibox/browser/autocomplete_match_classification.h"
 #include "components/omnibox/browser/autocomplete_match_type.h"
 #include "components/omnibox/browser/autocomplete_provider_client.h"
 #include "components/omnibox/browser/autocomplete_result.h"
 #include "components/omnibox/browser/history_url_provider.h"
 #include "components/omnibox/browser/in_memory_url_index.h"
-#include "components/omnibox/browser/in_memory_url_index_types.h"
 #include "components/omnibox/browser/omnibox_field_trial.h"
 #include "components/omnibox/browser/url_prefix.h"
 #include "components/prefs/pref_service.h"
@@ -237,27 +237,44 @@ AutocompleteMatch HistoryQuickProvider::QuickMatchToACMatch(
         !PreventInlineAutocomplete(autocomplete_input_);
   }
-  // The term match offsets should be adjusted based on the formatting
-  // applied to the suggestion contents displayed in the dropdown.
-  std::vector<size_t> offsets =
-      OffsetsFromTermMatches(history_match.url_matches);
-  match.contents = url_formatter::FormatUrlWithOffsets(
+  // HistoryQuick classification diverges from relevance scoring. Specifically,
+  // 1) All occurrences of the input contribute to relevance; e.g. for the input
+  // 'pre', the suggestion 'pre prefix' will be scored higher than 'pre suffix'.
+  // For classification though, if the input is a prefix of the suggestion text,
+  // only the prefix will be bolded; e.g. the 1st suggestion will display '[pre]
+  // prefix' as opposed to '[pre] [pre]fix'. This divergence allows consistency
+  // with other providers' and google.com's bolding.
+  // 2) Mid-word occurrences of the input within the suggestion URL contribute
+  // to relevance; e.g. for the input 'mail', the suggestion 'mail - gmail.com'
+  // will be scored higher than 'mail - outlook.live.com'. Mid-word matches only
+  // in the domain affect scoring. For classification though, mid-word matches
+  // are not bolded; e.g. the 1st suggestion will display '[mail] - gmail.com'.
+  // 3) User input is not broken on symbols for relevance calculations; e.g. for
+  // the input '#yolo', the suggestion 'how-to-yolo - yolo.com/#yolo' would be
+  // scored the same as 'how-to-tie-a-tie - yolo.com/#yolo/tie'. For
+  // classification though, user input is broken on symbols; e.g. the 1st
+  // suggestion will display 'how-to-[yolo] - [yolo].com/#[yolo]'.
+  match.contents = url_formatter::FormatUrl(
       info.url(),
       AutocompleteMatch::GetFormatTypes(
           autocomplete_input_.parts().scheme.len > 0 ||
               history_match.match_in_scheme,
           history_match.match_in_subdomain),
-      net::UnescapeRule::SPACES, nullptr, nullptr, &offsets);
-  TermMatches new_matches =
-      ReplaceOffsetsInTermMatches(history_match.url_matches, offsets);
-  match.contents_class =
-      SpansFromTermMatch(new_matches, match.contents.length(), true);
-  // Format the description autocomplete presentation.
+      net::UnescapeRule::SPACES, nullptr, nullptr, nullptr);
+  auto contents_terms =
+      FindTermMatches(autocomplete_input_.text(), match.contents);
+  match.contents_class = ClassifyTermMatches(
+      contents_terms, match.contents.size(),
+      ACMatchClassification::MATCH | ACMatchClassification::URL,
+      ACMatchClassification::URL);
   match.description = info.title();
-  match.description_class = SpansFromTermMatch(
-      history_match.title_matches, match.description.length(), false);
+  auto description_terms =
+      FindTermMatches(autocomplete_input_.text(), match.description);
+  match.description_class = ClassifyTermMatches(
+      description_terms, match.description.size(), ACMatchClassification::MATCH,
+      ACMatchClassification::NONE);
   match.RecordAdditionalInfo("typed count", info.typed_count());
   match.RecordAdditionalInfo("visit count", info.visit_count());