Commit 87c5ea5e authored by Matt Menke, committed by Commit Bot

Update Life of a URLRequest

Bug: None
Change-Id: I108539cbf637e8120bed1ad448e0ac2d4060100b
Reviewed-on: https://chromium-review.googlesource.com/c/1327233
Commit-Queue: Matt Menke <mmenke@chromium.org>
Reviewed-by: Maks Orlovich <morlovich@chromium.org>
Cr-Commit-Position: refs/heads/master@{#609325}
parent 0ab8faf7
 # Life of a URLRequest
 This document is intended as an overview of the core layers of the network
-stack, their basic responsibilities, how they fit together, and where some of
-the pain points are, without going into too much detail. Though it touches a
-bit on child processes and the content/loader stack, the focus is on net/
-itself.
+stack and the network service, their basic responsibilities, and how they fit
+together, without going into too much detail. This doc assumes the network
+service is enabled, though the network service is not yet enabled by default
+on any platform.
 It's particularly targeted at people new to the Chrome network stack, but
 should also be useful for team members who may be experts at some parts of the
@@ -15,40 +15,25 @@ network stack, and then moves on to discuss how various components plug in.
 If you notice any inaccuracies in this document, or feel that things could be
 better explained, please do not hesitate to submit patches.
 # Anatomy of the Network Stack
+The network stack is located in //net/ in the Chrome repo, and uses the
+namespace "net". Whenever a class name in this doc has no namespace, it can
+generally be assumed it's in //net/ and is in the net namespace.
 The top-level network stack object is the URLRequestContext. The context has
 non-owning pointers to everything needed to create and issue a URLRequest. The
 context must outlive all requests that use it. Creating a context is a rather
 complicated process, and it's recommended that most consumers use
 URLRequestContextBuilder to do this.
-Chrome has a number of different URLRequestContexts, as there is often a need to
-keep cookies, caches, and socket pools separate for different types of requests.
-Here are the main ones used by Chrome browser:
-* The system URLRequestContext, also owned by the IOThread, used for requests
-that aren't associated with a profile.
-* Each profile, including incognito profiles, has a number of URLRequestContexts
-that are created as needed:
-* The main URLRequestContext is mostly created in ProfileIOData, though it
-has a couple components that are passed in from content's StoragePartition
-code. Several other components are shared with the system URLRequestContext,
-like the HostResolver.
-* Each non-incognito profile also has a media request context, which uses a
-different on-disk cache than the main request context. This prevents a
-single huge media file from evicting everything else in the cache. (See also
-crbug.com/789657)
-* On desktop platforms, each profile has a request context for extensions.
-* Each profile has two contexts for each isolated app (One for media, one
-for everything else).
 The primary use of the URLRequestContext is to create URLRequest objects using
 URLRequestContext::CreateRequest(). The URLRequest is the main interface used
-by consumers of the network stack. It is used to make the actual requests to a
-server. Each URLRequest tracks a single request across all redirects until an
-error occurs, it's canceled, or a final response is received, with a (possibly
-empty) body.
+by direct consumers of the network stack. It is used to drive requests for
+http, https, ftp, and some data URLs. Each URLRequest tracks a single request
+across all redirects until an error occurs, it's canceled, or a final response
+is received, with a (possibly empty) body.
 The HttpNetworkSession is another major network stack object. It owns the
 HttpStreamFactory, the socket pools, and the HTTP/2 and QUIC session pools. It
@@ -63,94 +48,130 @@ get their dependencies from the HttpNetworkSession.
 # How many "Delegates"?
-The network stack informs the embedder of important events for a request using
-two main interfaces: the URLRequest::Delegate interface and the NetworkDelegate
+A URLRequest informs the consumer of important events for a request using two
+main interfaces: the URLRequest::Delegate interface and the NetworkDelegate
 interface.
 The URLRequest::Delegate interface consists of a small set of callbacks needed
-to let the embedder drive a request forward. URLRequest::Delegates generally own
-the URLRequest.
-The NetworkDelegate is an object pointed to by the URLRequestContext and shared
-by all requests, and includes callbacks corresponding to most of the
-URLRequest::Delegate's callbacks, as well as an assortment of other methods. The
-NetworkDelegate is optional, while the URLRequest::Delegate is not.
+to let the embedder drive a request forward. The NetworkDelegate is an object
+pointed to by the URLRequestContext and shared by all requests, and includes
+callbacks corresponding to most of the URLRequest::Delegate's callbacks, as
+well as an assortment of other methods.
+# The Network Service and Mojo
+The network service, which lives in //services/network/, wraps //net/ objects,
+and provides cross-process network APIs and their implementations for the rest
+of Chrome. The network service uses the namespace "network" for all its classes.
+The Mojo interfaces it provides are in the network::mojom namespace. Mojo is
+Chrome's IPC layer. Generally there's a network::mojom::FooPtr proxy object in
+the consumer's process which also implements the network::mojom::Foo interface.
+When the proxy object's methods are invoked, it passes the call and all its
+arguments over a Mojo IPC channel to the implementation of the
+network::mojom::Foo interface in the network service (typically implemented by a
+class named network::Foo), which may be running in another process, or possibly
+another thread in the consumer's process.
+The network::NetworkService object is a singleton that is used by Chrome to create
+all other network service objects. The primary objects it is used to create are
+the network::NetworkContexts, each of which owns its own mostly independent
+URLRequestContext. Chrome has a number of different NetworkContexts, as there
+is often a need to keep cookies, caches, and socket pools separate for different
+types of requests, depending on what's making the request. Here are the main
+NetworkContexts used by Chrome:
+* The system NetworkContext, created and owned by Chrome's
+SystemNetworkContextManager, is used for requests that aren't associated with
+a particular user or Profile. It has no on-disk storage, so loses all state, like
+cookies, after each browser restart. It has no in-memory HTTP cache, either.
+SystemNetworkContextManager also sets up global network service preferences.
+* Each Chrome Profile, including incognito Profiles, has its own NetworkContext.
+Except for incognito and guest profiles, these contexts store information in
+their own on-disk store, which includes cookies and an HTTP cache, among other
+things. Each of these NetworkContexts is owned by a StoragePartition object in
+the browser process, and created by a Profile's ProfileNetworkContextService.
+* On platforms that support apps, each Profile has a NetworkContext for each app
+installed on that Profile. As with the main NetworkContext, these may have
+on-disk data, depending on the Profile and the App.
 # Life of a Simple URLRequest
-A request for data is normally dispatched from a child to the browser process.
-There a URLRequest is created to drive the request. A protocol-specific job
-(e.g. HTTP, data, file) is attached to the request. That job first checks the
-cache, and then creates a network connection object, if necessary, to actually
-fetch the data. That connection object interacts with network socket pools to
-potentially re-use sockets; the socket pools create and connect a socket if
-there is no appropriate existing socket. Once that socket exists, the HTTP
-request is dispatched, the response read and parsed, and the result returned
-back up the stack and sent over to the child process.
+A request for data is dispatched from some other process which results in
+creating a network::URLLoader in the network process. The URLLoader then
+creates a URLRequest to drive the request. A protocol-specific job
+(e.g. HTTP, data, file) is attached to the request. In the HTTP case, that job
+first checks the cache, and then creates a network connection object, if
+necessary, to actually fetch the data. That connection object interacts with
+network socket pools to potentially re-use sockets; the socket pools create and
+connect a socket if there is no appropriate existing socket. Once that socket
+exists, the HTTP request is dispatched, the response read and parsed, and the
+result returned back up the stack and sent over to the child process.
 Of course, it's not quite that simple :-}.
-Consider a simple request issued by a child process. Suppose it's an HTTP
-request, the response is uncompressed, no matching entry in the cache, and there
-are no idle sockets connected to the server in the socket pool.
+Consider a simple request issued by some process other than the network
+service's process. Suppose it's an HTTP request, the response is uncompressed,
+no matching entry in the cache, and there are no idle sockets connected to the
+server in the socket pool.
 Continuing with a "simple" URLRequest, here's a bit more detail on how things
 work.
-### Request starts in a child process
+### Request starts in some (non-network) process
 Summary:
-* A user (e.g. the WebURLLoaderImpl for Blink) asks ResourceDispatcher to start
-the request.
-* ResourceDispatcher sends an IPC to the ResourceDispatcherHost in the
-browser process.
-Chrome has a single browser process, which handles network requests and tab
-management, among other things, and multiple child processes, which are
-generally sandboxed so can't send out network requests directly. There are
-multiple types of child processes (renderer, GPU, plugin, etc). The renderer
-processes are the ones that lay out webpages and run HTML.
-Each child process has at most one ResourceDispatcher, which is responsible for
-all URL request-related communication with the browser process. When something
-in another process needs to issue a resource request, it calls into the
-ResourceDispatcher to start a request. A RequestPeer is passed in to receive
-messages related to the request. When started, the
-ResourceDispatcher assigns the request a per-renderer ID, and then sends the
-ID, along with all information needed to issue the request, to the
-ResourceDispatcherHost in the browser process.
-### ResourceDispatcherHost sets up the request in the browser process
+* A consumer (e.g. the content::ResourceDispatcher for Blink, the
+content::NavigationURLLoaderImpl for frame navigations, or a
+network::SimpleURLLoader) passes a network::ResourceRequest object and
+network::mojom::URLLoaderClient Mojo channel to a
+network::mojom::URLLoaderFactory, and tells it to create and start a
+network::mojom::URLLoader.
+* Mojo sends the network::ResourceRequest over an IPC pipe to a
+network::URLLoaderFactory in the network process.
+Chrome has a single browser process which handles starting and configuring other
+processes, tab management, and navigation, among other things, and multiple
+child processes, which are generally sandboxed and have no network access
+themselves, apart from the network service (which either runs in its own
+process, or potentially in the browser process to preserve RAM). There are
+multiple types of child processes (renderer, GPU, plugin, network, etc). The
+renderer processes are the ones that lay out webpages and run HTML.
+The browser process creates the top level network::mojom::NetworkContext
+objects, and uses them to create network::mojom::URLLoaderFactories, which it
+can set some security-related options on, before vending them to child
+processes. Child processes can then use them to directly talk to the network
+service.
+A consumer that wants to make a network request gets a URLLoaderFactory through
+some manner, assembles a bunch of parameters in the large ResourceRequest
+object, creates a network::mojom::URLLoaderClient Mojo channel for the
+network::mojom::URLLoader to use to talk back to it, and then passes them to
+the URLLoaderFactory, which returns a URLLoader object that it can use to
+manage the network request.
+### network::URLLoaderFactory sets up the request in the network service
 Summary:
-* ResourceDispatcherHost uses the URLRequestContext to create the URLRequest.
-* ResourceDispatcherHost creates a ResourceLoader and a chain of
-ResourceHandlers to manage the URLRequest.
-* ResourceLoader starts the URLRequest.
-The ResourceDispatcherHost (RDH), along with most of the network stack, lives
-on the browser process's IO thread. The browser process only has one RDH,
-which is responsible for handling all network requests initiated by
-ResourceDispatchers in all child processes, not just renderer processes.
-Requests initiated in the browser process don't go through the RDH, with some
-exceptions.
-When the RDH sees the request, it calls into a URLRequestContext to create the
-URLRequest. The URLRequestContext has pointers to all the network stack
-objects needed to issue the request over the network, such as the cache, cookie
-store, and host resolver. The RDH then creates a chain of ResourceHandlers
-each of which can monitor/modify/delay/cancel the URLRequest and the
-information it returns. The only one of these I'll talk about here is the
-AsyncResourceHandler, which is the last ResourceHandler in the chain. The RDH
-then creates a ResourceLoader (which is the URLRequest::Delegate), passes
-ownership of the URLRequest and the ResourceHandler chain to it, and then starts
-the ResourceLoader.
-The ResourceLoader checks that none of the ResourceHandlers want to cancel,
-modify, or delay the request, and then finally starts the URLRequest.
+* network::URLLoaderFactory creates a network::URLLoader.
+* network::URLLoader uses the network::NetworkContext's URLRequestContext to
+create and start a URLRequest.
+The URLLoaderFactory, along with all NetworkContexts and most of the network
+stack, lives on a single thread in the network service. It gets a reconstituted
+ResourceRequest object from the Mojo pipe, does some checks to make sure it
+can service the request, and if so, creates a URLLoader, passing the request and
+the NetworkContext associated with the URLLoaderFactory.
+The URLLoader then calls into a URLRequestContext to create the URLRequest. The
+URLRequestContext has pointers to all the network stack objects needed to issue
+the request over the network, such as the cache, cookie store, and host
+resolver. The URLLoader then calls into the ResourceScheduler, which may delay
+starting the request, based on priority and other activity. Eventually, the
+ResourceScheduler starts the request.
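The added text in this hunk says the ResourceScheduler may delay starting a request based on priority and other activity. A minimal sketch of that idea follows, assuming a fixed cap on in-flight requests and integer priorities. `ToyScheduler`, `PendingRequest`, and the cap are hypothetical illustrations and are not the real network::ResourceScheduler interface or policy.

```cpp
#include <cassert>
#include <cstddef>
#include <queue>
#include <string>
#include <utility>
#include <vector>

// Hypothetical toy model; not the real ResourceScheduler API.
struct PendingRequest {
  int priority;       // Larger value means more urgent (assumption).
  std::size_t order;  // Arrival order, used to break priority ties.
  std::string url;
};

// Orders the priority queue so the most urgent, earliest request is on top.
struct LowerUrgency {
  bool operator()(const PendingRequest& a, const PendingRequest& b) const {
    if (a.priority != b.priority) return a.priority < b.priority;
    return a.order > b.order;  // Earlier arrivals win ties.
  }
};

class ToyScheduler {
 public:
  explicit ToyScheduler(std::size_t max_in_flight)
      : max_in_flight_(max_in_flight) {}

  // Starts the request right away if there is capacity, else delays it.
  void Schedule(std::string url, int priority) {
    pending_.push(PendingRequest{priority, next_order_++, std::move(url)});
    MaybeStartRequests();
  }

  // Called when a started request completes; may start a delayed one.
  void OnRequestFinished() {
    assert(in_flight_ > 0);
    --in_flight_;
    MaybeStartRequests();
  }

  // URLs in the order they were actually started.
  const std::vector<std::string>& started() const { return started_; }

 private:
  void MaybeStartRequests() {
    while (in_flight_ < max_in_flight_ && !pending_.empty()) {
      started_.push_back(pending_.top().url);
      pending_.pop();
      ++in_flight_;
    }
  }

  std::size_t max_in_flight_;
  std::size_t in_flight_ = 0;
  std::size_t next_order_ = 0;
  std::priority_queue<PendingRequest, std::vector<PendingRequest>,
                      LowerUrgency>
      pending_;
  std::vector<std::string> started_;
};
```

With a cap of one, a request scheduled while another is in flight waits, and a later high-priority request is started ahead of an earlier low-priority one once capacity frees up.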
 ### Check the cache, request an HttpStream
@@ -222,10 +243,9 @@ and tells it to start the request.
 * HttpBasicStream sends the request, and waits for the response.
 * The HttpBasicStream sends the response headers back to the
 HttpNetworkTransaction.
-* The response headers are sent up to the URLRequest, to the ResourceLoader,
-and down through the ResourceHandler chain.
-* They're then sent by the last ResourceHandler in the chain (the
-AsyncResourceHandler) to the ResourceDispatcher, with an IPC.
+* The response headers are sent up through the URLRequest, to the
+network::URLLoader.
+* They're then sent to the network::mojom::URLLoaderClient via Mojo.
 The HttpNetworkTransaction passes the request headers to the HttpBasicStream,
 which uses an HttpStreamParser to (finally) format the request headers and body
@@ -235,56 +255,46 @@ The HttpStreamParser waits to receive the response and then parses the HTTP/1.x
 response headers, and then passes them up through both the
 HttpNetworkTransaction and HttpCache::Transaction to the URLRequestHttpJob. The
 URLRequestHttpJob saves any cookies, if needed, and then passes the headers up
-to the URLRequest and on to the ResourceLoader.
-The ResourceLoader passes them through the chain of ResourceHandlers, and then
-they make their way to the AsyncResourceHandler. The AsyncResourceHandler uses
-the renderer process ID ("child ID") to figure out which process the request
-was associated with, and then sends the headers along with the request ID to
-that process's ResourceDispatcher. The ResourceDispatcher uses the ID to
-figure out which RequestPeer the headers should be sent to, which
-sends them on to the RequestPeer.
+to the URLRequest and on to the network::URLLoader, which sends the data over
+a Mojo pipe to the network::mojom::URLLoaderClient, passed in to the URLLoader
+when it was created.
 ### Response body is read
 Summary:
-* AsyncResourceHandler allocates a 512k ring buffer of shared memory to read
-the body of the request.
-* AsyncResourceHandler tells the ResourceLoader to read the response body to
-the buffer, 32kB at a time.
-* AsyncResourceHandler informs the ResourceDispatcher of each read using
-cross-process IPCs.
-* ResourceDispatcher tells the AsyncResourceHandler when it's done with the
-data with each read, so it knows when parts of the buffer can be reused.
-Without waiting to hear back from the ResourceDispatcher, the ResourceLoader
-tells its ResourceHandler chain to allocate memory to receive the response
-body. The AsyncResourceHandler creates a 512KB ring buffer of shared memory,
-and then passes the first 32KB of it to the ResourceLoader for the first read.
-The ResourceLoader then passes a 32KB body read request down through the
-URLRequest all the way down to the HttpStreamParser. Once some data is read,
-possibly less than 32KB, the number of bytes read makes its way back to the
-AsyncResourceHandler, which passes the shared memory buffer and the offset and
-amount of data read to the renderer process.
-The AsyncResourceHandler relies on ACKs from the renderer to prevent it from
-overwriting data that the renderer has yet to consume. This process repeats
-until the response body is completely read.
+* network::URLLoader creates a raw Mojo data pipe, and passes one end to the
+network::mojom::URLLoaderClient.
+* The URLLoader requests a shared memory buffer from the Mojo data pipe.
+* The URLLoader tells the URLRequest to write to the memory buffer, and tells
+the pipe when data has been written to the buffer.
+* The last two steps repeat until the request is complete.
+Without waiting to hear back from the network::mojom::URLLoaderClient, the
+network::URLLoader allocates a raw Mojo data pipe, and passes the client the
+read end of the pipe. The URLLoader then grabs an IPC buffer from the pipe,
+and passes a 64KB body read request down through the URLRequest all the way
+down to the HttpStreamParser. Once some data is read, possibly less than 64KB,
+the number of bytes read makes its way back to the URLLoader, which then tells
+the Mojo pipe the read was complete, and then requests another buffer from the
+pipe, to continue writing data to. The pipe may apply back pressure, to limit
+the amount of unconsumed data that can be in shared memory buffers at once.
+This process repeats until the response body is completely read.
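The body-read loop described in the added text (get a buffer from the pipe, write at most 64KB, tell the pipe, repeat, with the pipe limiting unconsumed data) can be sketched as a byte-counting simulation. This is a toy under stated assumptions: `ToyDataPipe` and `PumpBody` are hypothetical names, no real Mojo calls are used, and the pipe capacity and consumer read size are arbitrary parameters rather than Chrome's actual values.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a data pipe: it only tracks byte counts.
class ToyDataPipe {
 public:
  explicit ToyDataPipe(std::size_t capacity) : capacity_(capacity) {}

  // Producer side: how many bytes may be written right now.
  std::size_t WritableBytes() const { return capacity_ - buffered_; }

  // Producer side: records that n bytes were written into the buffer.
  void CommitWrite(std::size_t n) {
    assert(n <= WritableBytes());
    buffered_ += n;
    max_buffered_ = std::max(max_buffered_, buffered_);
  }

  // Consumer side: drains up to n bytes and returns how many were consumed.
  std::size_t Read(std::size_t n) {
    const std::size_t consumed = std::min(n, buffered_);
    buffered_ -= consumed;
    return consumed;
  }

  std::size_t max_buffered() const { return max_buffered_; }

 private:
  std::size_t capacity_;
  std::size_t buffered_ = 0;
  std::size_t max_buffered_ = 0;
};

struct PumpResult {
  std::size_t delivered;     // Bytes the consumer received in total.
  std::size_t max_buffered;  // Peak unconsumed bytes held by the pipe.
};

// Alternates one producer write (at most 64KB, and never more than the pipe
// will accept) with one consumer read until the whole body is delivered.
PumpResult PumpBody(std::size_t body_size, std::size_t pipe_capacity,
                    std::size_t consumer_chunk) {
  assert(pipe_capacity > 0 && consumer_chunk > 0);
  const std::size_t kReadSize = 64 * 1024;
  ToyDataPipe pipe(pipe_capacity);
  std::size_t remaining = body_size;
  std::size_t delivered = 0;
  while (delivered < body_size) {
    // Back pressure: a full pipe leaves nothing writable this round.
    const std::size_t n =
        std::min(std::min(kReadSize, pipe.WritableBytes()), remaining);
    if (n > 0) {
      pipe.CommitWrite(n);
      remaining -= n;
    }
    delivered += pipe.Read(consumer_chunk);
  }
  return PumpResult{delivered, pipe.max_buffered()};
}
```

In this sketch a slow consumer never causes more than `pipe_capacity` bytes to sit unconsumed; the producer simply writes less, or nothing, until the consumer catches up.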
 ### URLRequest is destroyed
 Summary:
-* When complete, the RDH deletes the ResourceLoader, which deletes the
-URLRequest and the ResourceHandler chain.
+* When complete, the network::URLLoaderFactory deletes the network::URLLoader,
+which deletes the URLRequest.
 * During destruction, the HttpNetworkTransaction determines if the socket is
 reusable, and if so, tells the HttpBasicStream to return it to the socket pool.
-When the URLRequest informs the ResourceLoader it's complete, the
-ResourceLoader tells the ResourceHandlers, and the AsyncResourceHandler tells
-the ResourceDispatcher the request is complete. The RDH then deletes
-ResourceLoader, which deletes the URLRequest and ResourceHandler chain.
+When the URLRequest informs the network::URLLoader the request is complete, the
+URLLoader passes the message along to the network::mojom::URLLoaderClient, over
+its Mojo pipe, before telling the URLLoaderFactory to destroy the URLLoader,
+which results in destroying the URLRequest and closing all Mojo pipes related to
+the request.
 When the HttpNetworkTransaction is being torn down, it figures out if the
 socket is reusable. If not, it tells the HttpBasicStream to close the socket.
@@ -341,11 +351,9 @@ the browser process.
 ## Cancellation
-A request can be cancelled by the child process, by any of the
-ResourceHandlers in the chain, or by the ResourceDispatcherHost itself. When the
-cancellation message reaches the URLRequest, it passes on the fact it's been
-cancelled back to the ResourceLoader, which then sends the message down the
-ResourceHandler chain.
+A consumer can cancel a request at any time by deleting the
+network::mojom::URLLoader pipe used by the request. This will cause the
+network::URLLoader to destroy itself and its URLRequest.
 When an HttpNetworkTransaction for a cancelled request is being torn down, it
 figures out if the socket the HttpStream owns can potentially be reused, based
@@ -366,18 +374,24 @@ body, so the cache only has the headers. The cache then treats it as a complete
 entry, even if the headers indicated there will be a body.
 The URLRequestHttpJob then checks with the URLRequest if the redirect should be
-followed. The URLRequest then informs the ResourceLoader about the redirect, to
-give it a chance to cancel the request. The information makes its way down
-through the AsyncResourceHandler into the other process, via the
-ResourceDispatcher. Whatever issued the original request then checks if the
-redirect should be followed.
-The ResourceDispatcher then asynchronously sends a message back to either
-follow the redirect or cancel the request. In either case, the old
-HttpTransaction is destroyed, and the HttpNetworkTransaction attempts to drain
-the socket for reuse, just as in the cancellation case. If the redirect is
-followed, the URLRequest calls into the URLRequestJobFactory to create a new
-URLRequestJob, and then starts it.
+followed. The URLRequest then informs the network::URLLoader about the redirect,
+which passes information about the redirect to network::mojom::URLLoaderClient,
+in the consumer process. Whatever issued the original request then checks
+if the redirect should be followed.
+If the redirect should be followed, the URLLoaderClient calls back into the
+URLLoader over the network::mojom::URLLoader Mojo interface, which tells the
+URLRequest to follow the redirect. The URLRequest then creates a new
+URLRequestJob to send the new request. If the URLLoaderClient chooses to
+cancel the request instead, it can delete the network::mojom::URLLoader
+pipe, just like the cancellation case discussed above. In either case, the
+old HttpTransaction is destroyed, and the HttpNetworkTransaction attempts to
+drain the socket for reuse, as discussed in the previous section.
+In some cases, the consumer may choose to handle a redirect itself, like
+passing off the redirect to a ServiceWorker. In this case, the consumer cancels
+the request and then calls into some other network::mojom::URLLoaderFactory
+with the new URL to continue the request.
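The follow-or-cancel decision described in the added text can be sketched as a small loop in which the consumer is consulted on every redirect. Everything here is a hypothetical illustration: `RunToyRequest`, `ToyResult`, the redirect table, and the redirect limit parameter are assumptions made for the example, and no Mojo or //net/ types are involved.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>

// Outcome of the toy request.
struct ToyResult {
  bool completed;          // False if the consumer cancelled at a redirect.
  std::string final_url;   // URL being requested when the loop stopped.
  int redirects_followed;  // Number of redirects the consumer accepted.
};

// Stand-in for the consumer's decision on each redirect: return true to
// follow it, false to cancel the request.
using RedirectDecision =
    std::function<bool(const std::string& from, const std::string& to)>;

// `redirects` maps a URL to the URL its (simulated) response redirects to.
// A URL with no entry is treated as returning a final response.
ToyResult RunToyRequest(const std::map<std::string, std::string>& redirects,
                        std::string url,
                        const RedirectDecision& should_follow,
                        int max_redirects = 20) {
  assert(max_redirects >= 0);
  int followed = 0;
  for (;;) {
    const auto it = redirects.find(url);
    if (it == redirects.end())
      return ToyResult{true, url, followed};  // Final response received.
    // The consumer is asked before each redirect is followed; the limit
    // guards against redirect loops in this toy.
    if (followed >= max_redirects || !should_follow(url, it->second))
      return ToyResult{false, url, followed};
    url = it->second;  // A new job would be created for the new URL here.
    ++followed;
  }
}
```

A consumer that wants to handle a redirect itself would return false from the decision callback and then issue a fresh request for the new URL through whatever other factory it prefers.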
 ## Filters (gzip, deflate, brotli, etc)
@@ -546,8 +560,13 @@ priority socket request.
 ## Non-HTTP Schemes
-The URLRequestJobFactory has a ProtocolHandler for each supported scheme.
-Non-HTTP URLRequests have their own ProtocolHandlers. Some are implemented in
-net/, (like FTP, file, and data, though the renderer handles some data URLs
-internally), and others are implemented in content/ or chrome (like blob,
-chrome, and chrome-extension).
+The URLRequestJobFactory has a ProtocolHandler for ftp, http, https, and data
+URLs, though most data URLs are handled directly in the renderer. For other
+schemes, and non-network code that can intercept HTTP/HTTPS requests (like
+ServiceWorker, or extensions), there's typically another
+network::mojom::URLLoaderFactory class that is used instead of
+network::URLLoaderFactory. These URLLoaderFactories are not part of the
+network service. Some of these are web standards and handled in content/
+code (like blob:// and file:// URLs), while others are
+chrome-specific, and implemented in chrome/ (like chrome:// and
+chrome-extension:// URLs).
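The scheme split described in the added text can be summarized as a lookup from URL scheme to the layer that services it. The sketch below is only a reading aid: `ToyHandler`, `SchemeOf`, and `HandlerForUrl` are hypothetical names, the real code selects a URLLoaderFactory rather than returning an enum, and the interception cases (ServiceWorker, extensions) and renderer-handled data URLs are deliberately left out.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Which layer services a URL in this toy model.
enum class ToyHandler { kNetworkService, kContent, kChrome, kUnknown };

// Returns the lower-cased text before the first ':' or "" if there is none.
std::string SchemeOf(const std::string& url) {
  const std::size_t colon = url.find(':');
  if (colon == std::string::npos) return std::string();
  std::string scheme = url.substr(0, colon);
  for (char& c : scheme)
    c = static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
  return scheme;
}

ToyHandler HandlerForUrl(const std::string& url) {
  const std::string scheme = SchemeOf(url);
  // Schemes the text lists as having a ProtocolHandler in the network stack.
  if (scheme == "http" || scheme == "https" || scheme == "ftp" ||
      scheme == "data")
    return ToyHandler::kNetworkService;
  // Web-standard schemes the text says are handled in content/ code.
  if (scheme == "blob" || scheme == "file") return ToyHandler::kContent;
  // Chrome-specific schemes the text says are implemented in chrome/.
  if (scheme == "chrome" || scheme == "chrome-extension")
    return ToyHandler::kChrome;
  return ToyHandler::kUnknown;
}
```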