Commit 4eb24b29, authored by mgiuca, committed by Commit bot

Changed the app list string matching formula.

This is used to rank apps and webstore results. This should not (really)
affect the relative ranking of any results, only the absolute scores
that they are assigned internally. However, this will be relevant in the
future when we start comparing scores of different types of results
against each other.

The algorithm (used to score app and webstore results) now has a
different tapering formula, designed to reach a higher score with fewer
keystrokes. Previously, it was based on the percentage of the full title
you had typed (which unfairly de-prioritized apps with long titles, such
as "Google Keep - notes and lists"). Now, it has an exponential curve,
so you get a reasonably high score with just a few letters matched, and
then it tapers off, approaching 1.0 as you type more letters.
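
As a rough illustration of the two curves described above, the sketch below compares a length-normalized score with the exponential taper. It is not the Chromium code: `OldScore` and `NewScore` are hypothetical names, and it assumes each matched prefix character contributes one point of raw relevance, which simplifies the real scoring.

```cpp
#include <cassert>
#include <cmath>
#include <string>

// Hypothetical stand-in for the previous behavior: the raw relevance is
// divided by the title length, so a long title scores lower than a short
// one for the same number of matched characters.
double OldScore(double raw_relevance, const std::string& title) {
  assert(raw_relevance >= 0.0);
  return title.empty() ? raw_relevance : raw_relevance / title.length();
}

// The taper described in this change: 1 - 0.5^relevance. Each additional
// point halves the remaining distance to 1.0, and the title length no
// longer matters.
double NewScore(double raw_relevance) {
  assert(raw_relevance >= 0.0);
  return 1.0 - std::pow(0.5, raw_relevance);
}
```

Under that assumption, three matched characters give `OldScore(3, "Google Keep - notes and lists")` of 3/29 (about 0.10) but `NewScore(3)` of 0.875, whatever the title.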

BUG=422610

Review URL: https://codereview.chromium.org/1138193002

Cr-Commit-Position: refs/heads/master@{#329574}
parent 64067bf2
@@ -4,6 +4,8 @@
 #include "ui/app_list/search/tokenized_string_match.h"
+#include <cmath>
 #include "base/i18n/string_search.h"
 #include "base/logging.h"
 #include "ui/app_list/search/tokenized_string_char_iterator.h"
@@ -218,10 +220,14 @@ bool TokenizedStringMatch::Calculate(const TokenizedString& query,
     }
   }
-  // Using length() for normalizing is not 100% correct but should be good
-  // enough compared with using real char count of the text.
-  if (text.text().length())
-    relevance_ /= text.text().length();
+  // Temper the relevance score with an exponential curve. Each point of
+  // relevance (roughly, each keystroke) is worth less than the last. This means
+  // that typing a few characters of a word is enough to promote matches very
+  // high, with any subsequent characters being worth comparatively less.
+  // TODO(mgiuca): This doesn't really play well with Omnibox results, since as
+  // you type more characters, the app/omnibox results tend to jump over each
+  // other.
+  relevance_ = 1.0 - std::pow(0.5, relevance_);
   return relevance_ > kNoMatchScore;
 }
@@ -14,7 +14,7 @@ namespace app_list {
 namespace test {
 // Returns a string of |text| marked the hits in |match| using block bracket.
-// e.g. text= "Text", hits = [{0,1}], returns "[T]ext".
+// e.g. text= "Text", match.hits = [{0,1}], returns "[T]ext".
 std::string MatchHit(const base::string16& text,
                      const TokenizedStringMatch& match) {
   base::string16 marked = text;
@@ -119,5 +119,36 @@ TEST(TokenizedStringMatchTest, Relevance) {
   }
 }
+// More specialized tests of the absolute relevance scores. (These tests are
+// minimal, because they are so brittle. Changing the scoring algorithm will
+// require updating this test.)
+TEST(TokenizedStringMatchTest, AbsoluteRelevance) {
+  const double kEpsilon = 0.006;
+  struct {
+    const char* text;
+    const char* query;
+    double expected_score;
+  } kTestCases[] = {
+      // The first few chars should increase the score extremely high. After
+      // that, they should count less.
+      // NOTE: 0.87 is a magic number, as it is the Omnibox score for a "pretty
+      // good" match. We want a 3-letter prefix match to be slightly above 0.87.
+      {"Google Chrome", "g", 0.5},
+      {"Google Chrome", "go", 0.75},
+      {"Google Chrome", "goo", 0.88},
+      {"Google Chrome", "goog", 0.94},
+  };
+  TokenizedStringMatch match;
+  for (size_t i = 0; i < arraysize(kTestCases); ++i) {
+    const base::string16 text(base::UTF8ToUTF16(kTestCases[i].text));
+    EXPECT_TRUE(match.Calculate(base::UTF8ToUTF16(kTestCases[i].query), text));
+    EXPECT_NEAR(match.relevance(), kTestCases[i].expected_score, kEpsilon)
+        << "Test case " << i << " : text=" << kTestCases[i].text
+        << ", query=" << kTestCases[i].query
+        << ", expected_score=" << kTestCases[i].expected_score;
+  }
+}
 }  // namespace test
 }  // namespace app_list
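
The NOTE in the test above says a 3-letter prefix match should land slightly above 0.87. A minimal sketch of how that can be checked against the `1.0 - std::pow(0.5, relevance_)` curve follows. `TaperedScore` and `PointsToExceed` are hypothetical helpers written for this illustration, not part of the change, and the sketch treats relevance as whole points (roughly one per keystroke, as the code comment puts it).

```cpp
#include <cassert>
#include <cmath>

// Same curve as the new line in Calculate(): 1 - 0.5^relevance.
double TaperedScore(double raw_relevance) {
  return 1.0 - std::pow(0.5, raw_relevance);
}

// Hypothetical helper: the smallest whole number of relevance points whose
// tapered score is strictly greater than |threshold|.
int PointsToExceed(double threshold) {
  assert(threshold < 1.0);  // The curve approaches 1.0 but never passes it.
  int points = 0;
  while (TaperedScore(points) <= threshold)
    ++points;
  return points;
}
```

With these definitions, `PointsToExceed(0.87)` is 3, since two points give 0.75 and three give 0.875. That matches the expected scores of 0.75 and 0.88 (within `kEpsilon`) in the test table.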