Abort sync cycles when download step fails

This change causes the syncer to exit from its download then commit
function if the download fails. This helps prevent the creation of
server-side conflicts, which would be more likely to happen if a
download failed but the following commit succeeded.

The main changes are as follows:
- Rather than proceed to the next step when a download updates failure
is detected, the syncer will exit. This should cause the
SyncScheduler to schedule a retry at a later time. 
- The definition of a download updates failure is now based on the
return code of the download attempt, rather than checking the contents
of the (possible non-existent) returned protobuf. This makes the error
detection logic used here more closely match the logic used to decide
whether or not to schedule retries.

This implementation has a side-effect on configure sync cycles. The old
behaviour was to handle failures by applying whatever updates we had
downloaded at that point. The new behaviour will leave updates
unapplied if any error occurs. This better matches a nearby comment's
assertion which states that we should not attempt to apply updates until
we have downloaded all of them. I believe the author of that comment
would approve of this change.

This change moves around some of the ExtensionActivityMonitor logic.
It's important that we not take the data from the extensions acitivity
monitor at the start of the cycle. Restoring that data is handled in
the commit building and response code, which might not be executed if we
need to break out early. This also fixes issue 123270.

Most of the diffs for this change are concentrated in the unit tests:
- Expose more of the SyncScheduler to the SyncerTest test harness.
- Add functions to SyncerTest for testing specific types of sync jobs.
- Add some functions that allow us to better control failures in
MockConnectionManager.
- Added tests for configure job success and failure.
- Added tests for update then commit job success and failure.
- Removed some redundant tests.
- Removed unused test infrastructure.

This change allows the integration tests to expose a bug that was
introduced back in r118572.  That commit assumed it was OK to not
retry a job that failed due to a MIGRATION_DONE response from the
server.  That is incorrect, and this error has been fixed.

BUG=122033, 123270
TEST=sync_unit_tests, specifically:
SyncerTest.UpdateThenCommit,
SyncerTest.UpdateFailsThenDontCommit,
SyncerTest.ConfigureDownloadsTwoBatchesSuccess,
SyncerTest.ConfigureFailsDontApplyUpdates


Review URL: http://codereview.chromium.org/10103017

git-svn-id: svn://svn.chromium.org/chrome/trunk/src@133754 0039d316-1c4b-4281-b951-d872f2087c98
parent 0bc80bf5
......@@ -182,10 +182,8 @@ class MigrationTest : public SyncTest {
AwaitQuiescence();
}
// Re-enable notifications if we disabled it.
if (do_test_without_notifications) {
EnableNotifications();
}
// TODO(rlarocque): It should be possible to re-enable notifications
// here, but doing so makes some windows tests flaky.
}
private:
......@@ -251,6 +249,7 @@ IN_PROC_BROWSER_TEST_F(MigrationSingleClientTest, BookmarksPrefsBoth) {
// Two data types with one being nigori.
// See crbug.com/124480.
IN_PROC_BROWSER_TEST_F(MigrationSingleClientTest,
DISABLED_PrefsNigoriIndividiaully) {
RunSingleClientMigrationTest(
......@@ -258,8 +257,7 @@ IN_PROC_BROWSER_TEST_F(MigrationSingleClientTest,
TRIGGER_NOTIFICATION);
}
// TODO(rlarocque): Re-enable this test when crbug.com/122033 is fixed.
IN_PROC_BROWSER_TEST_F(MigrationSingleClientTest, DISABLED_PrefsNigoriBoth) {
IN_PROC_BROWSER_TEST_F(MigrationSingleClientTest, PrefsNigoriBoth) {
RunSingleClientMigrationTest(
MakeList(MakeSet(syncable::PREFERENCES, syncable::NIGORI)),
MODIFY_PREF);
......@@ -294,6 +292,7 @@ IN_PROC_BROWSER_TEST_F(MigrationSingleClientTest,
// All data types plus nigori.
// See crbug.com/124480.
IN_PROC_BROWSER_TEST_F(MigrationSingleClientTest,
DISABLED_AllTypesWithNigoriIndividually) {
ASSERT_TRUE(SetupClients());
......@@ -302,9 +301,8 @@ IN_PROC_BROWSER_TEST_F(MigrationSingleClientTest,
RunSingleClientMigrationTest(migration_list, MODIFY_BOOKMARK);
}
// TODO(rlarocque): Re-enable this test when crbug.com/122033 is fixed.
IN_PROC_BROWSER_TEST_F(MigrationSingleClientTest,
DISABLED_AllTypesWithNigoriAtOnce) {
AllTypesWithNigoriAtOnce) {
ASSERT_TRUE(SetupClients());
syncable::ModelTypeSet all_types = GetPreferredDataTypes();
all_types.Put(syncable::NIGORI);
......@@ -345,15 +343,9 @@ class MigrationTwoClientTest : public MigrationTest {
DISALLOW_COPY_AND_ASSIGN(MigrationTwoClientTest);
};
#if defined(OS_MACOSX)
#define MAYBE_MigratePrefsThenModifyBookmark DISABLED_MigratePrefsThenModifyBookmark
#else
#define MAYBE_MigratePrefsThenModifyBookmark MigratePrefsThenModifyBookmark
#endif
// Easiest possible test of migration errors: triggers a server
// migration on one datatype, then modifies some other datatype.
IN_PROC_BROWSER_TEST_F(MigrationTwoClientTest,
MAYBE_MigratePrefsThenModifyBookmark) {
IN_PROC_BROWSER_TEST_F(MigrationTwoClientTest, MigratePrefsThenModifyBookmark) {
RunTwoClientMigrationTest(MakeList(syncable::PREFERENCES),
MODIFY_BOOKMARK);
}
......@@ -378,7 +370,7 @@ IN_PROC_BROWSER_TEST_F(MigrationTwoClientTest, MigrationHellWithoutNigori) {
RunTwoClientMigrationTest(migration_list, MODIFY_BOOKMARK);
}
// TODO(rlarocque) Re-enable this test when crbug.com/122033 is fixed.
// See crbug.com/124480.
IN_PROC_BROWSER_TEST_F(MigrationTwoClientTest,
DISABLED_MigrationHellWithNigori) {
ASSERT_TRUE(SetupClients());
......@@ -409,7 +401,8 @@ class MigrationReconfigureTest : public MigrationTwoClientTest {
DISALLOW_COPY_AND_ASSIGN(MigrationReconfigureTest);
};
IN_PROC_BROWSER_TEST_F(MigrationReconfigureTest, DISABLED_SetSyncTabs) {
IN_PROC_BROWSER_TEST_F(MigrationReconfigureTest,
DISABLED_SetSyncTabs) {
if (!ServerSupportsErrorTriggering()) {
LOG(WARNING) << "Test skipped in this server environment.";
return;
......
......@@ -682,6 +682,37 @@ const char* SyncScheduler::GetDecisionString(
return "";
}
// static
void SyncScheduler::SetSyncerStepsForPurpose(
SyncSessionJob::SyncSessionJobPurpose purpose,
SyncerStep* start,
SyncerStep* end) {
switch (purpose) {
case SyncSessionJob::CONFIGURATION:
*start = DOWNLOAD_UPDATES;
*end = APPLY_UPDATES;
return;
case SyncSessionJob::CLEAR_USER_DATA:
*start = CLEAR_PRIVATE_DATA;
*end = CLEAR_PRIVATE_DATA;
return;
case SyncSessionJob::NUDGE:
case SyncSessionJob::POLL:
*start = SYNCER_BEGIN;
*end = SYNCER_END;
return;
case SyncSessionJob::CLEANUP_DISABLED_TYPES:
*start = CLEANUP_DISABLED_TYPES;
*end = CLEANUP_DISABLED_TYPES;
return;
default:
NOTREACHED();
*start = SYNCER_END;
*end = SYNCER_END;
return;
}
}
void SyncScheduler::PostTask(
const tracked_objects::Location& from_here,
const char* name, const base::Closure& task) {
......@@ -730,36 +761,6 @@ void SyncScheduler::ScheduleSyncSessionJob(const SyncSessionJob& job) {
delay);
}
void SyncScheduler::SetSyncerStepsForPurpose(
SyncSessionJob::SyncSessionJobPurpose purpose,
SyncerStep* start, SyncerStep* end) {
DCHECK_EQ(MessageLoop::current(), sync_loop_);
switch (purpose) {
case SyncSessionJob::CONFIGURATION:
*start = DOWNLOAD_UPDATES;
*end = APPLY_UPDATES;
return;
case SyncSessionJob::CLEAR_USER_DATA:
*start = CLEAR_PRIVATE_DATA;
*end = CLEAR_PRIVATE_DATA;
return;
case SyncSessionJob::NUDGE:
case SyncSessionJob::POLL:
*start = SYNCER_BEGIN;
*end = SYNCER_END;
return;
case SyncSessionJob::CLEANUP_DISABLED_TYPES:
*start = CLEANUP_DISABLED_TYPES;
*end = CLEANUP_DISABLED_TYPES;
return;
default:
NOTREACHED();
*start = SYNCER_END;
*end = SYNCER_END;
return;
}
}
void SyncScheduler::DoSyncSessionJob(const SyncSessionJob& job) {
DCHECK_EQ(MessageLoop::current(), sync_loop_);
if (!ShouldRunJob(job)) {
......
......@@ -179,6 +179,7 @@ class SyncScheduler : public sessions::SyncSession::Delegate {
};
friend class SyncSchedulerTest;
friend class SyncSchedulerWhiteboxTest;
friend class SyncerTest;
FRIEND_TEST_ALL_PREFIXES(SyncSchedulerWhiteboxTest,
DropNudgeWhileExponentialBackOff);
......@@ -247,6 +248,12 @@ class SyncScheduler : public sessions::SyncSession::Delegate {
static const char* GetDecisionString(JobProcessDecision decision);
// Assign |start| and |end| to appropriate SyncerStep values for the
// specified |purpose|.
static void SetSyncerStepsForPurpose(
SyncSessionJob::SyncSessionJobPurpose purpose,
SyncerStep* start, SyncerStep* end);
// Helpers that log before posting to |sync_loop_|. These will only post
// the task in between calls to Start/Stop.
void PostTask(const tracked_objects::Location& from_here,
......@@ -340,12 +347,6 @@ class SyncScheduler : public sessions::SyncSession::Delegate {
// Creates a session for a poll and performs the sync.
void PollTimerCallback();
// Assign |start| and |end| to appropriate SyncerStep values for the
// specified |purpose|.
void SetSyncerStepsForPurpose(SyncSessionJob::SyncSessionJobPurpose purpose,
SyncerStep* start,
SyncerStep* end);
// Used to update |connection_code_|, see below.
void UpdateServerConnectionManagerStatus(
HttpResponse::ServerConnectionCode code);
......
......@@ -110,17 +110,6 @@ void Syncer::SyncShare(sessions::SyncSession* session,
switch (current_step) {
case SYNCER_BEGIN:
// This isn't perfect, as we can end up bundling extensions activity
// intended for the next session into the current one. We could do a
// test-and-reset as with the source, but note that also falls short if
// the commit request fails (e.g. due to lost connection), as we will
// fall all the way back to the syncer thread main loop in that case,
// creating a new session when a connection is established, losing the
// records set here on the original attempt. This should provide us
// with the right data "most of the time", and we're only using this
// for analysis purposes, so Law of Large Numbers FTW.
session->context()->extensions_monitor()->GetAndClearRecords(
session->mutable_extensions_activity());
session->context()->PruneUnthrottledTypes(base::TimeTicks::Now());
session->SendEventNotification(SyncEngineEvent::SYNC_CYCLE_BEGIN);
......@@ -166,13 +155,19 @@ void Syncer::SyncShare(sessions::SyncSession* session,
case STORE_TIMESTAMPS: {
StoreTimestampsCommand store_timestamps;
store_timestamps.Execute(session);
// We should download all of the updates before attempting to process
// them.
if (session->status_controller().ServerSaysNothingMoreToDownload() ||
!session->status_controller().download_updates_succeeded()) {
next_step = APPLY_UPDATES;
} else {
// We download all of the updates before attempting to apply them.
if (!session->status_controller().download_updates_succeeded()) {
// We may have downloaded some updates, but if the latest download
// attempt failed then we don't have all the updates. We'll leave
// it to a retry job to pick up where we left off.
last_step = SYNCER_END; // Necessary for CONFIGURATION mode.
next_step = SYNCER_END;
DVLOG(1) << "Aborting sync cycle due to download updates failure";
} else if (!session->status_controller()
.ServerSaysNothingMoreToDownload()) {
next_step = DOWNLOAD_UPDATES;
} else {
next_step = APPLY_UPDATES;
}
break;
}
......@@ -203,6 +198,19 @@ void Syncer::SyncShare(sessions::SyncSession* session,
if (!session->status_controller().commit_ids().empty()) {
DVLOG(1) << "Building a commit message";
// This isn't perfect, since the set of extensions activity may not
// correlate exactly with the items being committed. That's OK as
// long as we're looking for a rough estimate of extensions activity,
// not an precise mapping of which commits were triggered by which
// extension.
//
// We will push this list of extensions activity back into the
// ExtensionsActivityMonitor if this commit turns out to not contain
// any bookmarks, or if the commit fails.
session->context()->extensions_monitor()->GetAndClearRecords(
session->mutable_extensions_activity());
BuildCommitCommand build_commit_command;
build_commit_command.Execute(session);
......
This diff is collapsed.
......@@ -175,7 +175,8 @@ class StatusController {
// Returns true if the last download_updates_command received a valid
// server response.
bool download_updates_succeeded() const {
return updates_response().has_get_updates();
return shared_.error.value().last_download_updates_result
== SYNCER_OK;
}
// Returns true if the last updates response indicated that we were fully
......
......@@ -215,6 +215,8 @@ TEST_F(SyncSessionTest, MoreToSyncIfUnsyncedGreaterThanCommitted) {
TEST_F(SyncSessionTest, MoreToDownloadIfDownloadFailed) {
status()->set_updates_request_types(ParamsMeaningAllEnabledTypes());
status()->set_last_download_updates_result(NETWORK_IO_ERROR);
// When DownloadUpdatesCommand fails, these should be false.
EXPECT_FALSE(status()->ServerSaysNothingMoreToDownload());
EXPECT_FALSE(status()->download_updates_succeeded());
......@@ -229,6 +231,7 @@ TEST_F(SyncSessionTest, MoreToDownloadIfGotChangesRemaining) {
// When the server returns changes_remaining, that means there's
// more to download.
status()->set_last_download_updates_result(SYNCER_OK);
status()->mutable_updates_response()->mutable_get_updates()
->set_changes_remaining(1000L);
EXPECT_FALSE(status()->ServerSaysNothingMoreToDownload());
......@@ -242,54 +245,7 @@ TEST_F(SyncSessionTest, MoreToDownloadIfGotChangesRemaining) {
TEST_F(SyncSessionTest, MoreToDownloadIfGotNoChangesRemaining) {
status()->set_updates_request_types(ParamsMeaningAllEnabledTypes());
// When the server returns a timestamp, that means we're up to date.
status()->mutable_updates_response()->mutable_get_updates()
->set_changes_remaining(0);
EXPECT_TRUE(status()->ServerSaysNothingMoreToDownload());
EXPECT_TRUE(status()->download_updates_succeeded());
// Download updates has its own loop in the syncer; it shouldn't factor
// into HasMoreToSync.
EXPECT_FALSE(session_->HasMoreToSync());
}
TEST_F(SyncSessionTest, MoreToDownloadIfGotNoChangesRemainingForSubset) {
status()->set_updates_request_types(ParamsMeaningJustOneEnabledType());
// When the server returns a timestamp, that means we're up to date for that
// type. But there may still be more to download if there are other
// datatypes that we didn't request on this go-round.
status()->mutable_updates_response()->mutable_get_updates()
->set_changes_remaining(0);
EXPECT_TRUE(status()->ServerSaysNothingMoreToDownload());
EXPECT_TRUE(status()->download_updates_succeeded());
// Download updates has its own loop in the syncer; it shouldn't factor
// into HasMoreToSync.
EXPECT_FALSE(session_->HasMoreToSync());
}
TEST_F(SyncSessionTest, MoreToDownloadIfGotChangesRemainingAndEntries) {
status()->set_updates_request_types(ParamsMeaningAllEnabledTypes());
// The actual entry count should not factor into the HasMoreToSync
// determination.
status()->mutable_updates_response()->mutable_get_updates()->add_entries();
status()->mutable_updates_response()->mutable_get_updates()
->set_changes_remaining(1000000L);;
EXPECT_FALSE(status()->ServerSaysNothingMoreToDownload());
EXPECT_TRUE(status()->download_updates_succeeded());
// Download updates has its own loop in the syncer; it shouldn't factor
// into HasMoreToSync.
EXPECT_FALSE(session_->HasMoreToSync());
}
TEST_F(SyncSessionTest, MoreToDownloadIfGotNoChangesRemainingAndEntries) {
status()->set_updates_request_types(ParamsMeaningAllEnabledTypes());
// The actual entry count should not factor into the HasMoreToSync
// determination.
status()->mutable_updates_response()->mutable_get_updates()->add_entries();
status()->set_last_download_updates_result(SYNCER_OK);
status()->mutable_updates_response()->mutable_get_updates()
->set_changes_remaining(0);
EXPECT_TRUE(status()->ServerSaysNothingMoreToDownload());
......
......@@ -48,7 +48,7 @@ MockConnectionManager::MockConnectionManager(syncable::Directory* directory)
store_birthday_sent_(false),
client_stuck_(false),
commit_time_rename_prepended_string_(""),
fail_next_postbuffer_(false),
countdown_to_postbuffer_fail_(0),
directory_(directory),
mid_commit_observer_(NULL),
throttling_(false),
......@@ -111,8 +111,9 @@ bool MockConnectionManager::PostBufferToPath(PostBufferParams* params,
InvalidateAndClearAuthToken();
}
if (fail_next_postbuffer_) {
fail_next_postbuffer_ = false;
if (--countdown_to_postbuffer_fail_ == 0) {
// Fail as countdown hits zero.
params->response.server_status = HttpResponse::SYNC_SERVER_ERROR;
return false;
}
......@@ -564,30 +565,6 @@ const CommitResponse& MockConnectionManager::last_commit_response() const {
return *commit_responses_->back();
}
void MockConnectionManager::ThrottleNextRequest(
ResponseCodeOverrideRequestor* visitor) {
base::AutoLock lock(response_code_override_lock_);
throttling_ = true;
if (visitor)
visitor->OnOverrideComplete();
}
void MockConnectionManager::FailWithAuthInvalid(
ResponseCodeOverrideRequestor* visitor) {
base::AutoLock lock(response_code_override_lock_);
fail_with_auth_invalid_ = true;
if (visitor)
visitor->OnOverrideComplete();
}
void MockConnectionManager::StopFailingWithAuthInvalid(
ResponseCodeOverrideRequestor* visitor) {
base::AutoLock lock(response_code_override_lock_);
fail_with_auth_invalid_ = false;
if (visitor)
visitor->OnOverrideComplete();
}
bool MockConnectionManager::IsModelTypePresentInSpecifics(
const google::protobuf::RepeatedPtrField<
sync_pb::DataTypeProgressMarker>& filter,
......
......@@ -117,25 +117,11 @@ class MockConnectionManager : public browser_sync::ServerConnectionManager {
// issue multiple requests during a sync cycle.
void NextUpdateBatch();
void FailNextPostBufferToPathCall() { fail_next_postbuffer_ = true; }
void FailNextPostBufferToPathCall() { countdown_to_postbuffer_fail_ = 1; }
void FailNthPostBufferToPathCall(int n) { countdown_to_postbuffer_fail_ = n; }
void SetClearUserDataResponseStatus(sync_pb::SyncEnums::ErrorType errortype);
// A visitor class to allow a test to change some monitoring state atomically
// with the action of overriding the response codes sent back to the Syncer
// (for example, so you can say "ThrottleNextRequest, and assert no more
// requests are made once throttling is in effect" in one step.
class ResponseCodeOverrideRequestor {
public:
// Called with response_code_override_lock_ acquired.
virtual void OnOverrideComplete() = 0;
protected:
virtual ~ResponseCodeOverrideRequestor() {}
};
void ThrottleNextRequest(ResponseCodeOverrideRequestor* visitor);
void FailWithAuthInvalid(ResponseCodeOverrideRequestor* visitor);
void StopFailingWithAuthInvalid(ResponseCodeOverrideRequestor* visitor);
void FailNonPeriodicGetUpdates() { fail_non_periodic_get_updates_ = true; }
// Simple inspectors.
......@@ -222,6 +208,11 @@ class MockConnectionManager : public browser_sync::ServerConnectionManager {
return store_birthday_;
}
// Explicitly indicate that we will not be fetching some updates.
void ClearUpdatesQueue() {
update_queue_.clear();
}
// Locate the most recent update message for purpose of alteration.
sync_pb::SyncEntity* GetMutableLastUpdate();
......@@ -299,8 +290,9 @@ class MockConnectionManager : public browser_sync::ServerConnectionManager {
bool client_stuck_;
std::string commit_time_rename_prepended_string_;
// Fail on the next call to PostBufferToPath().
bool fail_next_postbuffer_;
// On each PostBufferToPath() call, we decrement this counter. The call fails
// iff we hit zero at that call.
int countdown_to_postbuffer_fail_;
// Our directory. Used only to ensure that we are not holding the transaction
// lock when performing network I/O. Can be NULL if the test author is
......
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment