Commit 4e19b363 authored by Alex Moshchuk's avatar Alex Moshchuk Committed by Commit Bot

Reorganize isolated origins into a map indexed by site URLs.

Previously isolated origins were stored in a set.  This resulted in
large overheads when URLs needed to be checked against isolated
origins, since the set had to be walked through linearly, running a
fairly slow comparison function
(IsolatedOriginUtil::DoesOriginMatchIsolatedOrigin()) on each isolated
origin, and this lookup turned out to be fairly common.

This CL reorganizes the underlying data structure into a map keyed by
site URLs.  For example, if we add {https://foo.com,
https://bar.foo.com, https://www.bar.com} as isolated origins, the map
will look like this:
  https://foo.com -> { https://foo.com, https://bar.foo.com }
  https://bar.com -> { https://www.bar.com }

When checking a URL against isolated origins, we now can just look up
the map using the URL's site and then look for the most specific
isolated origins as before.  The site can be found in O(log n) time,
and the corresponding list of origins to search using
DoesOriginMatchIsolatedOrigin() changes from O(n) to O(originsPerSite),
which is expected to be ~1.

Bug: 877653
Change-Id: I41667f94cea7f1f8e5b53fa23443d83cfbb3c0ed
Reviewed-on: https://chromium-review.googlesource.com/1208942
Commit-Queue: Alex Moshchuk <alexmos@chromium.org>
Reviewed-by: default avatarŁukasz Anforowicz <lukasza@chromium.org>
Reviewed-by: default avatarCharlie Reis <creis@chromium.org>
Cr-Commit-Position: refs/heads/master@{#590063}
parent 27357612
...@@ -1230,10 +1230,18 @@ void ChildProcessSecurityPolicyImpl::AddIsolatedOrigins( ...@@ -1230,10 +1230,18 @@ void ChildProcessSecurityPolicyImpl::AddIsolatedOrigins(
return true; // Remove. return true; // Remove.
}); });
// Taking the lock once and doing a batch insertion via base::flat_set::insert
// is important because of performance characteristics of base::flat_set.
base::AutoLock lock(lock_); base::AutoLock lock(lock_);
isolated_origins_.insert(origins_to_add.begin(), origins_to_add.end()); for (const url::Origin& origin : origins_to_add) {
// GetSiteForOrigin() is used to look up the site URL of |origin| to speed
// up the isolated origin lookup. This only performs a straightforward
// translation of an origin to eTLD+1; it does *not* take into account
// effective URLs, isolated origins, and other logic that's not needed
// here, but *is* typically needed for making process model decisions. Be
// very careful about using GetSiteForOrigin() elsewhere, and consider
// whether you should be using GetSiteForURL() instead.
GURL key(SiteInstanceImpl::GetSiteForOrigin(origin));
isolated_origins_[key].insert(origin);
}
} }
bool ChildProcessSecurityPolicyImpl::IsIsolatedOrigin( bool ChildProcessSecurityPolicyImpl::IsIsolatedOrigin(
...@@ -1245,20 +1253,53 @@ bool ChildProcessSecurityPolicyImpl::IsIsolatedOrigin( ...@@ -1245,20 +1253,53 @@ bool ChildProcessSecurityPolicyImpl::IsIsolatedOrigin(
bool ChildProcessSecurityPolicyImpl::GetMatchingIsolatedOrigin( bool ChildProcessSecurityPolicyImpl::GetMatchingIsolatedOrigin(
const url::Origin& origin, const url::Origin& origin,
url::Origin* result) { url::Origin* result) {
// GetSiteForOrigin() is used to look up the site URL of |origin| to speed
// up the isolated origin lookup. This only performs a straightforward
// translation of an origin to eTLD+1; it does *not* take into account
// effective URLs, isolated origins, and other logic that's not needed
// here, but *is* typically needed for making process model decisions. Be
// very careful about using GetSiteForOrigin() elsewhere, and consider
// whether you should be using GetSiteForURL() instead.
return GetMatchingIsolatedOrigin(
origin, SiteInstanceImpl::GetSiteForOrigin(origin), result);
}
bool ChildProcessSecurityPolicyImpl::GetMatchingIsolatedOrigin(
const url::Origin& origin,
const GURL& site_url,
url::Origin* result) {
*result = url::Origin(); *result = url::Origin();
base::AutoLock lock(lock_); base::AutoLock lock(lock_);
// Look up the list of origins corresponding to |origin|'s site.
auto it = isolated_origins_.find(site_url);
// Subtle corner case: if the site's host ends with a dot, do the lookup
// without it. A trailing dot shouldn't be able to bypass isolated origins:
// if "https://foo.com" is an isolated origin, "https://foo.com." should
// match it.
if (it == isolated_origins_.end() && site_url.host().back() == '.') {
GURL::Replacements replacements;
std::string host = site_url.host();
host.pop_back();
replacements.SetHostStr(host);
it = isolated_origins_.find(site_url.ReplaceComponents(replacements));
}
// If multiple isolated origins are registered with a common domain suffix, // If multiple isolated origins are registered with a common domain suffix,
// return the most specific one. For example, if foo.isolated.com and // return the most specific one. For example, if foo.isolated.com and
// isolated.com are both isolated origins, bar.foo.isolated.com should return // isolated.com are both isolated origins, bar.foo.isolated.com should
// foo.isolated.com. // return foo.isolated.com.
bool found = false; bool found = false;
for (auto isolated_origin : isolated_origins_) { if (it != isolated_origins_.end()) {
if (IsolatedOriginUtil::DoesOriginMatchIsolatedOrigin(origin, for (const url::Origin& isolated_origin : it->second) {
isolated_origin)) { if (IsolatedOriginUtil::DoesOriginMatchIsolatedOrigin(origin,
if (!found || result->host().length() < isolated_origin.host().length()) { isolated_origin)) {
*result = isolated_origin; if (!found ||
found = true; result->host().length() < isolated_origin.host().length()) {
*result = isolated_origin;
found = true;
}
} }
} }
} }
...@@ -1268,8 +1309,11 @@ bool ChildProcessSecurityPolicyImpl::GetMatchingIsolatedOrigin( ...@@ -1268,8 +1309,11 @@ bool ChildProcessSecurityPolicyImpl::GetMatchingIsolatedOrigin(
void ChildProcessSecurityPolicyImpl::RemoveIsolatedOriginForTesting( void ChildProcessSecurityPolicyImpl::RemoveIsolatedOriginForTesting(
const url::Origin& origin) { const url::Origin& origin) {
GURL key(SiteInstanceImpl::GetSiteForOrigin(origin));
base::AutoLock lock(lock_); base::AutoLock lock(lock_);
isolated_origins_.erase(origin); isolated_origins_[key].erase(origin);
if (isolated_origins_[key].empty())
isolated_origins_.erase(key);
} }
} // namespace content } // namespace content
...@@ -12,6 +12,7 @@ ...@@ -12,6 +12,7 @@
#include <vector> #include <vector>
#include "base/compiler_specific.h" #include "base/compiler_specific.h"
#include "base/containers/flat_map.h"
#include "base/containers/flat_set.h" #include "base/containers/flat_set.h"
#include "base/gtest_prod_util.h" #include "base/gtest_prod_util.h"
#include "base/macros.h" #include "base/macros.h"
...@@ -97,6 +98,15 @@ class CONTENT_EXPORT ChildProcessSecurityPolicyImpl ...@@ -97,6 +98,15 @@ class CONTENT_EXPORT ChildProcessSecurityPolicyImpl
bool GetMatchingIsolatedOrigin(const url::Origin& origin, bool GetMatchingIsolatedOrigin(const url::Origin& origin,
url::Origin* result) override; url::Origin* result) override;
// A version of GetMatchingIsolatedOrigin that takes in both the |origin| and
// the |site_url| that |origin| corresponds to. |site_url| is the key by
// which |origin| will be looked up in |isolated_origins_|; this function
// allows it to be passed in when it is already known to avoid recomputing it
// internally.
bool GetMatchingIsolatedOrigin(const url::Origin& origin,
const GURL& site_url,
url::Origin* result);
// Returns if |child_id| can read all of the |files|. // Returns if |child_id| can read all of the |files|.
bool CanReadAllFiles(int child_id, const std::vector<base::FilePath>& files); bool CanReadAllFiles(int child_id, const std::vector<base::FilePath>& files);
...@@ -361,7 +371,17 @@ class CONTENT_EXPORT ChildProcessSecurityPolicyImpl ...@@ -361,7 +371,17 @@ class CONTENT_EXPORT ChildProcessSecurityPolicyImpl
// when making process model decisions, rather than the origin's scheme and // when making process model decisions, rather than the origin's scheme and
// eTLD+1. Each of these origins requires a dedicated process. This set is // eTLD+1. Each of these origins requires a dedicated process. This set is
// protected by |lock_|. // protected by |lock_|.
base::flat_set<url::Origin> isolated_origins_; //
// The origins are stored in a map indexed by a site URL computed for each
// origin. For example, adding https://foo.com, https://bar.foo.com, and
// https://www.bar.com would result in the following structure:
// https://foo.com -> { https://foo.com, https://bar.foo.com }
// https://bar.com -> { https://www.bar.com }
// This organization speeds up lookups of isolated origins. The site can be
// found in O(log n) time, and the corresponding list of origins to search
// using the expensive DoesOriginMatchIsolatedOrigin() comparison is
// typically small.
base::flat_map<GURL, base::flat_set<url::Origin>> isolated_origins_;
DISALLOW_COPY_AND_ASSIGN(ChildProcessSecurityPolicyImpl); DISALLOW_COPY_AND_ASSIGN(ChildProcessSecurityPolicyImpl);
}; };
......
...@@ -9,6 +9,7 @@ ...@@ -9,6 +9,7 @@
#include "base/logging.h" #include "base/logging.h"
#include "base/test/mock_log.h" #include "base/test/mock_log.h"
#include "content/browser/child_process_security_policy_impl.h" #include "content/browser/child_process_security_policy_impl.h"
#include "content/browser/site_instance_impl.h"
#include "content/public/common/bindings_policy.h" #include "content/public/common/bindings_policy.h"
#include "content/public/common/url_constants.h" #include "content/public/common/url_constants.h"
#include "content/test/test_content_browser_client.h" #include "content/test/test_content_browser_client.h"
...@@ -1086,12 +1087,32 @@ TEST_F(ChildProcessSecurityPolicyTest, OriginGranting) { ...@@ -1086,12 +1087,32 @@ TEST_F(ChildProcessSecurityPolicyTest, OriginGranting) {
p->Remove(kRendererID); p->Remove(kRendererID);
} }
namespace {
// Helpers to construct (key, value) entries used to validate the
// isolated_origins_ map.
auto IsolatedOriginEntry(const url::Origin& origin) {
return std::pair<GURL, base::flat_set<url::Origin>>(
SiteInstanceImpl::GetSiteForOrigin(origin), {origin});
}
auto IsolatedOriginEntry(const url::Origin& origin1,
const url::Origin& origin2) {
EXPECT_EQ(SiteInstanceImpl::GetSiteForOrigin(origin1),
SiteInstanceImpl::GetSiteForOrigin(origin2));
return std::pair<GURL, base::flat_set<url::Origin>>(
SiteInstanceImpl::GetSiteForOrigin(origin1), {origin1, origin2});
}
} // namespace
// Verifies ChildProcessSecurityPolicyImpl::AddIsolatedOrigins method. // Verifies ChildProcessSecurityPolicyImpl::AddIsolatedOrigins method.
TEST_F(ChildProcessSecurityPolicyTest, AddIsolatedOrigins) { TEST_F(ChildProcessSecurityPolicyTest, AddIsolatedOrigins) {
url::Origin foo = url::Origin::Create(GURL("https://foo.com/")); url::Origin foo = url::Origin::Create(GURL("https://foo.com/"));
url::Origin bar = url::Origin::Create(GURL("https://bar.com/")); url::Origin bar = url::Origin::Create(GURL("https://bar.com/"));
url::Origin baz = url::Origin::Create(GURL("https://baz.com/")); url::Origin baz = url::Origin::Create(GURL("https://baz.com/"));
url::Origin foobar = url::Origin::Create(GURL("https://foobar.com/")); url::Origin quxfoo = url::Origin::Create(GURL("https://qux.foo.com/"));
url::Origin baz_http_8000 = url::Origin::Create(GURL("http://baz.com:8000/")); url::Origin baz_http_8000 = url::Origin::Create(GURL("http://baz.com:8000/"));
url::Origin baz_https_8000 = url::Origin baz_https_8000 =
url::Origin::Create(GURL("https://baz.com:8000/")); url::Origin::Create(GURL("https://baz.com:8000/"));
...@@ -1104,28 +1125,39 @@ TEST_F(ChildProcessSecurityPolicyTest, AddIsolatedOrigins) { ...@@ -1104,28 +1125,39 @@ TEST_F(ChildProcessSecurityPolicyTest, AddIsolatedOrigins) {
// Verify deduplication of the argument. // Verify deduplication of the argument.
p->AddIsolatedOrigins({foo, bar, bar}); p->AddIsolatedOrigins({foo, bar, bar});
EXPECT_THAT(p->isolated_origins_, testing::UnorderedElementsAre(foo, bar)); EXPECT_THAT(p->isolated_origins_,
testing::UnorderedElementsAre(IsolatedOriginEntry(foo),
IsolatedOriginEntry(bar)));
// Verify that the old set is extended (not replaced). // Verify that the old set is extended (not replaced).
p->AddIsolatedOrigins({baz}); p->AddIsolatedOrigins({baz});
EXPECT_THAT(p->isolated_origins_, EXPECT_THAT(p->isolated_origins_,
testing::UnorderedElementsAre(foo, bar, baz)); testing::UnorderedElementsAre(IsolatedOriginEntry(foo),
IsolatedOriginEntry(bar),
IsolatedOriginEntry(baz)));
// Verify deduplication against the old set. // Verify deduplication against the old set.
p->AddIsolatedOrigins({foo}); p->AddIsolatedOrigins({foo});
EXPECT_THAT(p->isolated_origins_, EXPECT_THAT(p->isolated_origins_,
testing::UnorderedElementsAre(foo, bar, baz)); testing::UnorderedElementsAre(IsolatedOriginEntry(foo),
IsolatedOriginEntry(bar),
IsolatedOriginEntry(baz)));
// Verify deduplication considers scheme and port differences. // Verify deduplication considers scheme and port differences. Note that
// origins that differ only in ports map to the same key.
p->AddIsolatedOrigins({baz, baz_http_8000, baz_https_8000}); p->AddIsolatedOrigins({baz, baz_http_8000, baz_https_8000});
EXPECT_THAT(p->isolated_origins_, EXPECT_THAT(p->isolated_origins_,
testing::UnorderedElementsAre(foo, bar, baz, baz_http_8000, testing::UnorderedElementsAre(
baz_https_8000)); IsolatedOriginEntry(foo), IsolatedOriginEntry(bar),
IsolatedOriginEntry(baz, baz_https_8000),
IsolatedOriginEntry(baz_http_8000)));
// Verify that adding an origin that is invalid for isolation will 1) log a // Verify that adding an origin that is invalid for isolation will 1) log a
// warning and 2) won't CHECK or crash the browser process, 3) will not add // warning and 2) won't CHECK or crash the browser process, 3) will not add
// the invalid origin, but will add the remaining origins passed to // the invalid origin, but will add the remaining origins passed to
// AddIsolatedOrigins. // AddIsolatedOrigins. Note that the new |quxfoo| origin should map to the
// same key (i.e., the https://foo.com/ site URL) as the existing |foo|
// origin.
{ {
base::test::MockLog mock_log; base::test::MockLog mock_log;
EXPECT_CALL(mock_log, EXPECT_CALL(mock_log,
...@@ -1134,10 +1166,12 @@ TEST_F(ChildProcessSecurityPolicyTest, AddIsolatedOrigins) { ...@@ -1134,10 +1166,12 @@ TEST_F(ChildProcessSecurityPolicyTest, AddIsolatedOrigins) {
.Times(1); .Times(1);
mock_log.StartCapturingLogs(); mock_log.StartCapturingLogs();
p->AddIsolatedOrigins({foobar, invalid_etld}); p->AddIsolatedOrigins({quxfoo, invalid_etld});
EXPECT_THAT(p->isolated_origins_, EXPECT_THAT(p->isolated_origins_,
testing::UnorderedElementsAre(foo, bar, baz, baz_http_8000, testing::UnorderedElementsAre(
baz_https_8000, foobar)); IsolatedOriginEntry(foo, quxfoo), IsolatedOriginEntry(bar),
IsolatedOriginEntry(baz, baz_https_8000),
IsolatedOriginEntry(baz_http_8000)));
} }
} }
......
...@@ -401,7 +401,11 @@ bool SiteInstanceImpl::IsSameWebSite(BrowserContext* browser_context, ...@@ -401,7 +401,11 @@ bool SiteInstanceImpl::IsSameWebSite(BrowserContext* browser_context,
} }
// If the sites are the same, check isolated origins. If either URL matches // If the sites are the same, check isolated origins. If either URL matches
// an isolated origin, compare origins rather than sites. // an isolated origin, compare origins rather than sites. As an optimization
// to avoid unneeded isolated origin lookups, shortcut this check if the two
// origins are the same.
if (src_origin == dest_origin)
return true;
auto* policy = ChildProcessSecurityPolicyImpl::GetInstance(); auto* policy = ChildProcessSecurityPolicyImpl::GetInstance();
url::Origin src_isolated_origin; url::Origin src_isolated_origin;
url::Origin dest_isolated_origin; url::Origin dest_isolated_origin;
...@@ -449,27 +453,21 @@ GURL SiteInstanceImpl::GetSiteForURL(BrowserContext* browser_context, ...@@ -449,27 +453,21 @@ GURL SiteInstanceImpl::GetSiteForURL(BrowserContext* browser_context,
: real_url; : real_url;
url::Origin origin = url::Origin::Create(url); url::Origin origin = url::Origin::Create(url);
// Isolated origins should use the full origin as their site URL. A subdomain
// of an isolated origin should also use that isolated origin's site URL. It
// is important to check |url| rather than |real_url| here, since some
// effective URLs (such as for NTP) need to be resolved prior to the isolated
// origin lookup.
auto* policy = ChildProcessSecurityPolicyImpl::GetInstance();
url::Origin isolated_origin;
if (policy->GetMatchingIsolatedOrigin(origin, &isolated_origin))
return isolated_origin.GetURL();
// If the url has a host, then determine the site. Skip file URLs to avoid a // If the url has a host, then determine the site. Skip file URLs to avoid a
// situation where site URL of file://localhost/ would mismatch Blink's origin // situation where site URL of file://localhost/ would mismatch Blink's origin
// (which ignores the hostname in this case - see https://crbug.com/776160). // (which ignores the hostname in this case - see https://crbug.com/776160).
if (!origin.host().empty() && origin.scheme() != url::kFileScheme) { if (!origin.host().empty() && origin.scheme() != url::kFileScheme) {
// Only keep the scheme and registered domain of |origin|. GURL site_url(GetSiteForOrigin(origin));
std::string domain = net::registry_controlled_domains::GetDomainAndRegistry(
origin.host(), // Isolated origins should use the full origin as their site URL. A
net::registry_controlled_domains::INCLUDE_PRIVATE_REGISTRIES); // subdomain of an isolated origin should also use that isolated origin's
std::string site = origin.scheme(); // site URL. It is important to check |origin| (based on |url|) rather than
site += url::kStandardSchemeSeparator; // |real_url| here, since some effective URLs (such as for NTP) need to be
site += domain.empty() ? origin.host() : domain; // resolved prior to the isolated origin lookup.
auto* policy = ChildProcessSecurityPolicyImpl::GetInstance();
url::Origin isolated_origin;
if (policy->GetMatchingIsolatedOrigin(origin, site_url, &isolated_origin))
return isolated_origin.GetURL();
// If an effective URL was used, augment the effective site URL with the // If an effective URL was used, augment the effective site URL with the
// underlying web site in the hash. This is needed to keep // underlying web site in the hash. This is needed to keep
...@@ -479,12 +477,16 @@ GURL SiteInstanceImpl::GetSiteForURL(BrowserContext* browser_context, ...@@ -479,12 +477,16 @@ GURL SiteInstanceImpl::GetSiteForURL(BrowserContext* browser_context,
// TODO(https://crbug.com/734722): Consider replacing this hack with // TODO(https://crbug.com/734722): Consider replacing this hack with
// a proper security principal. // a proper security principal.
if (should_use_effective_urls && url != real_url) { if (should_use_effective_urls && url != real_url) {
site += "#" + GetSiteForURL(browser_context, real_url, std::string non_translated_site_url(
false /* should_use_effective_urls */) GetSiteForURL(browser_context, real_url,
.spec(); false /* should_use_effective_urls */)
.spec());
GURL::Replacements replacements;
replacements.SetRefStr(non_translated_site_url.c_str());
site_url = site_url.ReplaceComponents(replacements);
} }
return GURL(site); return site_url;
} }
// If there is no host but there is a scheme, return the scheme. // If there is no host but there is a scheme, return the scheme.
...@@ -535,6 +537,18 @@ GURL SiteInstanceImpl::GetSiteForURL(BrowserContext* browser_context, ...@@ -535,6 +537,18 @@ GURL SiteInstanceImpl::GetSiteForURL(BrowserContext* browser_context,
return GURL(); return GURL();
} }
// static
GURL SiteInstanceImpl::GetSiteForOrigin(const url::Origin& origin) {
// Only keep the scheme and registered domain of |origin|.
std::string domain = net::registry_controlled_domains::GetDomainAndRegistry(
origin.host(),
net::registry_controlled_domains::INCLUDE_PRIVATE_REGISTRIES);
std::string site = origin.scheme();
site += url::kStandardSchemeSeparator;
site += domain.empty() ? origin.host() : domain;
return GURL(site);
}
// static // static
GURL SiteInstanceImpl::GetEffectiveURL(BrowserContext* browser_context, GURL SiteInstanceImpl::GetEffectiveURL(BrowserContext* browser_context,
const GURL& url) { const GURL& url) {
......
...@@ -127,12 +127,19 @@ class CONTENT_EXPORT SiteInstanceImpl final : public SiteInstance, ...@@ -127,12 +127,19 @@ class CONTENT_EXPORT SiteInstanceImpl final : public SiteInstance,
// Returns the site for the given URL, which includes only the scheme and // Returns the site for the given URL, which includes only the scheme and
// registered domain. Returns an empty GURL if the URL has no host. // registered domain. Returns an empty GURL if the URL has no host.
// |use_effective_urls| specifies whether to resolve |url| to an effective // |should_use_effective_urls| specifies whether to resolve |url| to an
// URL (via ContentBrowserClient::GetEffectiveURL()) before determining the // effective URL (via ContentBrowserClient::GetEffectiveURL()) before
// site. // determining the site.
static GURL GetSiteForURL(BrowserContext* context, static GURL GetSiteForURL(BrowserContext* context,
const GURL& url, const GURL& url,
bool use_effective_urls); bool should_use_effective_urls);
// Returns the site of a given |origin|. Unlike GetSiteForURL(), this does
// not utilize effective URLs, isolated origins, or other special logic. It
// only translates an origin into a site (i.e., scheme and eTLD+1) and is
// used internally by GetSiteForURL(). For making process model decisions,
// GetSiteForURL() should be used instead.
static GURL GetSiteForOrigin(const url::Origin& origin);
// Returns the URL to which a process should be locked for the given URL. // Returns the URL to which a process should be locked for the given URL.
// This is computed similarly to the site URL (see GetSiteForURL), but // This is computed similarly to the site URL (see GetSiteForURL), but
......
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment