Sync: Improve handling of database load failures

This failure path has not received a lot of testing until now. Here are
the issues addressed by this patch:

- We usually check the return value of step() calls. However, we do not
check the return value of prepare() calls, which are more likely to fail.
If they do fail, we will DCHECK() or go on to dereference an invalid
pointer in step(). This patch checks the return value of one particular
prepare statement, the one in CheckIntegrity().

- Disable DCHECKs on sqlite errors, DirectoryManager open failure,
and SyncManager initialization failure. This will allow us to test these
error paths.

- Be careful in ShutdownOnSyncThread(). The directory will not be fully
intialized during shutdown if the database load failed.

- Add a ProfileSyncService unit test that simulates a load from an
unreadable database. The harness had to be modified slightly to make
this possible.

- Remove a setup_for_test_mode_ flag in SyncManager::SyncInternal::Init.
I don't know what the original intent of this flag was. However, I do
know that it prevents me from properly simulating a database load failure
and removing it seems to have no ill effects.

- Do not delete the database from DirectoryBackingStore. If this code
were to get executed it would put us into an inconsistent state. See
issue 103824. However, it's unlikely this code would get executed. If
the database were actually corrupt, we would DCHECK or de-reference an
invalid pointer on our way to this code because we don't check the
return value of the attempt to prepare an SQL statement in
DirectoryBackingStore::CheckIntegrity().

- Modify the DirectoryBackingStoreTest.Corruption unit test to expect the
new behaviour.

- Disable sync when backend initialize fails. Such a failure could
be due to bad local state. We don't know the actual cause because the
information is not available from the ProfileSyncService callback. The
safe course of action is to clear our local sync state and try again
later. It's the easiest way to get back to the most well travelled sync
initialization path.

- Fix error handling logic in OpenAndConfigureHandleHelper.  It used
to rely on a specially-crafted scoped_ptr to close the database if we
had to leave the function unsuccessfully.  This was wrong in two ways.
First, it did not reset the handle to NULL, which meant that the
DirectoryBackingStore would attempt to free the handle again when it
was destructed.  Second, it failed to clean up the handle when the
return value was not SQLITE_OK.  (Though I suppose it would have been
cleaned up by the DirectoryBackingStore destructor, thanks to the
previous issue).

BUG=103307, 103824
TEST=DirectoryBackingStoreTest.Corruption, ProfileSyncServiceTest.CorruptDatabase


Review URL: http://codereview.chromium.org/8568028

git-svn-id: svn://svn.chromium.org/chrome/trunk/src@110356 0039d316-1c4b-4281-b951-d872f2087c98
parent c4cef04e
......@@ -420,8 +420,11 @@ void SyncBackendHost::Core::OnInitializationComplete(
base::Bind(&Core::HandleInitializationCompletedOnFrontendLoop, this,
js_backend, success));
// Initialization is complete, so we can schedule recurring SaveChanges.
sync_loop_->PostTask(FROM_HERE, base::Bind(&Core::StartSavingChanges, this));
if (success) {
// Initialization is complete, so we can schedule recurring SaveChanges.
sync_loop_->PostTask(FROM_HERE,
base::Bind(&Core::StartSavingChanges, this));
}
}
void SyncBackendHost::Core::OnAuthError(const AuthError& auth_error) {
......@@ -608,7 +611,7 @@ void SyncBackendHost::Core::DoInitialize(const DoInitializeOptions& options) {
host_->sync_notifier_factory_.CreateSyncNotifier(),
options.restored_key_for_bootstrapping,
options.setup_for_test_mode);
DCHECK(success) << "Syncapi initialization failed!";
LOG_IF(ERROR, !success) << "Syncapi initialization failed!";
}
void SyncBackendHost::Core::DoCheckServerReachable() {
......
......@@ -802,7 +802,7 @@ bool SyncManager::SyncInternal::Init(
bool signed_in = SignIn(credentials);
if (signed_in || setup_for_test_mode_) {
if (signed_in) {
if (scheduler()) {
scheduler()->Start(
browser_sync::SyncScheduler::CONFIGURATION_MODE, base::Closure());
......@@ -894,7 +894,6 @@ bool SyncManager::SyncInternal::OpenDirectory() {
DCHECK(!initialized_) << "Should only happen once";
bool share_opened = dir_manager()->Open(username_for_share(), this);
DCHECK(share_opened);
if (!share_opened) {
LOG(ERROR) << "Could not open share for:" << username_for_share();
return false;
......@@ -1291,7 +1290,7 @@ void SyncManager::SyncInternal::ShutdownOnSyncThread() {
net::NetworkChangeNotifier::RemoveIPAddressObserver(this);
observing_ip_address_changes_ = false;
if (dir_manager()) {
if (initialized_ && dir_manager()) {
{
// Cryptographer should only be accessed while holding a
// transaction.
......
......@@ -555,11 +555,16 @@ void ProfileSyncService::OnUnrecoverableError(
void ProfileSyncService::OnBackendInitialized(
const WeakHandle<JsBackend>& js_backend, bool success) {
if (HasSyncSetupCompleted()) {
UMA_HISTOGRAM_BOOLEAN("Sync.FirstBackendInitializeSuccess", success);
} else {
UMA_HISTOGRAM_BOOLEAN("Sync.RestoreBackendInitializeSuccess", success);
}
if (!success) {
// If backend initialization failed, abort. We only want to blow away
// state (DBs, etc) if this was a first-time scenario that failed.
wizard_.Step(SyncSetupWizard::FATAL_ERROR);
Shutdown(!HasSyncSetupCompleted());
// Something went unexpectedly wrong. Play it safe: nuke our current state
// and prepare ourselves to try again later.
DisableForUser();
return;
}
......
......@@ -77,14 +77,15 @@ class ProfileSyncServiceTest : public testing::Test {
// TODO(akalin): Refactor the StartSyncService*() functions below.
void StartSyncService() {
StartSyncServiceAndSetInitialSyncEnded(true, true, false, true);
StartSyncServiceAndSetInitialSyncEnded(true, true, false, true, true);
}
void StartSyncServiceAndSetInitialSyncEnded(
bool set_initial_sync_ended,
bool issue_auth_token,
bool synchronous_sync_configuration,
bool sync_setup_completed) {
bool sync_setup_completed,
bool expect_create_dtm) {
if (!service_.get()) {
// Set bootstrap to true and it will provide a logged in user for test
service_.reset(new TestProfileSyncService(&factory_,
......@@ -97,9 +98,13 @@ class ProfileSyncServiceTest : public testing::Test {
if (!sync_setup_completed)
profile_->GetPrefs()->SetBoolean(prefs::kSyncHasSetupCompleted, false);
// Register the bookmark data type.
EXPECT_CALL(factory_, CreateDataTypeManager(_, _)).
WillOnce(ReturnNewDataTypeManager());
if (expect_create_dtm) {
// Register the bookmark data type.
EXPECT_CALL(factory_, CreateDataTypeManager(_, _)).
WillOnce(ReturnNewDataTypeManager());
} else {
EXPECT_CALL(factory_, CreateDataTypeManager(_, _)).Times(0);
}
if (issue_auth_token) {
profile_->GetTokenService()->IssueAuthTokenForTest(
......@@ -192,7 +197,7 @@ TEST_F(ProfileSyncServiceTest, JsControllerHandlersBasic) {
TEST_F(ProfileSyncServiceTest,
JsControllerHandlersDelayedBackendInitialization) {
StartSyncServiceAndSetInitialSyncEnded(true, false, false, true);
StartSyncServiceAndSetInitialSyncEnded(true, false, false, true, true);
StrictMock<MockJsEventHandler> event_handler;
EXPECT_CALL(event_handler, HandleJsEvent(_, _)).Times(AtLeast(1));
......@@ -234,7 +239,7 @@ TEST_F(ProfileSyncServiceTest, JsControllerProcessJsMessageBasic) {
TEST_F(ProfileSyncServiceTest,
JsControllerProcessJsMessageBasicDelayedBackendInitialization) {
StartSyncServiceAndSetInitialSyncEnded(true, false, false, true);
StartSyncServiceAndSetInitialSyncEnded(true, false, false, true, true);
StrictMock<MockJsReplyHandler> reply_handler;
......@@ -277,8 +282,9 @@ TEST_F(ProfileSyncServiceTest, TestStartupWithOldSyncData) {
ASSERT_NE(-1,
file_util::WriteFile(sync_file3, nonsense3, strlen(nonsense3)));
StartSyncServiceAndSetInitialSyncEnded(false, false, true, false);
StartSyncServiceAndSetInitialSyncEnded(false, false, true, false, true);
EXPECT_FALSE(service_->HasSyncSetupCompleted());
EXPECT_FALSE(service_->sync_initialized());
// Since we're doing synchronous initialization, backend should be
// initialized by this call.
......@@ -300,6 +306,28 @@ TEST_F(ProfileSyncServiceTest, TestStartupWithOldSyncData) {
ASSERT_NE(file2text.compare(nonsense2), 0);
}
TEST_F(ProfileSyncServiceTest, CorruptDatabase) {
const char* nonesense = "not a database";
FilePath temp_directory = profile_->GetPath().AppendASCII("Sync Data");
FilePath sync_db_file = temp_directory.AppendASCII("SyncData.sqlite3");
ASSERT_TRUE(file_util::CreateDirectory(temp_directory));
ASSERT_NE(-1,
file_util::WriteFile(sync_db_file, nonesense, strlen(nonesense)));
// Initialize with HasSyncSetupCompleted() set to true and InitialSyncEnded
// false. This is to model the scenario that would result when opening the
// sync database fails.
StartSyncServiceAndSetInitialSyncEnded(false, true, true, true, false);
// The backend is not ready. Ensure the PSS knows this.
EXPECT_FALSE(service_->sync_initialized());
// Ensure we will be prepared to initialize a fresh DB next time.
EXPECT_FALSE(service_->HasSyncSetupCompleted());
}
} // namespace
} // namespace browser_sync
......@@ -191,34 +191,38 @@ DirectoryBackingStore::~DirectoryBackingStore() {
bool DirectoryBackingStore::OpenAndConfigureHandleHelper(
sqlite3** handle) const {
if (SQLITE_OK == sqlite_utils::OpenSqliteDb(backing_filepath_, handle)) {
sqlite_utils::scoped_sqlite_db_ptr scoped_handle(*handle);
sqlite3_busy_timeout(scoped_handle.get(), std::numeric_limits<int>::max());
sqlite3_busy_timeout(*handle, std::numeric_limits<int>::max());
{
string integrity_error;
bool is_ok = CheckIntegrity(scoped_handle.get(), &integrity_error);
bool is_ok = CheckIntegrity(*handle, &integrity_error);
if (!is_ok) {
LOG(ERROR) << "Integrity check failed: " << integrity_error;
sqlite3_close(*handle);
*handle = NULL;
return false;
}
}
{
sqlite_utils::SQLStatement statement;
statement.prepare(scoped_handle.get(), "PRAGMA fullfsync = 1");
statement.prepare(*handle, "PRAGMA fullfsync = 1");
if (SQLITE_DONE != statement.step()) {
LOG(ERROR) << sqlite3_errmsg(scoped_handle.get());
LOG(ERROR) << sqlite3_errmsg(*handle);
sqlite3_close(*handle);
*handle = NULL;
return false;
}
}
{
sqlite_utils::SQLStatement statement;
statement.prepare(scoped_handle.get(), "PRAGMA synchronous = 2");
statement.prepare(*handle, "PRAGMA synchronous = 2");
if (SQLITE_DONE != statement.step()) {
LOG(ERROR) << sqlite3_errmsg(scoped_handle.get());
LOG(ERROR) << sqlite3_errmsg(*handle);
sqlite3_close(*handle);
*handle = NULL;
return false;
}
}
sqlite3_busy_timeout(scoped_handle.release(),
kDirectoryBackingStoreBusyTimeoutMs);
sqlite3_busy_timeout(*handle, kDirectoryBackingStoreBusyTimeoutMs);
#if defined(OS_WIN)
// Do not index this file. Scanning can occur every time we close the file,
// which causes long delays in SQLite's file locking.
......@@ -229,14 +233,25 @@ bool DirectoryBackingStore::OpenAndConfigureHandleHelper(
#endif
return true;
} else {
// The sqlite3 API docs indicate a handle is usually returned even
// in the case of error. We clean it up here.
if (*handle != NULL) {
sqlite3_close(*handle);
*handle = NULL;
}
return false;
}
return false;
}
bool DirectoryBackingStore::CheckIntegrity(sqlite3* handle, string* error)
const {
sqlite_utils::SQLStatement statement;
statement.prepare(handle, "PRAGMA integrity_check(1)");
if (SQLITE_OK !=
statement.prepare(handle, "PRAGMA integrity_check(1)")) {
*error = sqlite3_errmsg(handle);
return false;
}
if (SQLITE_ROW != statement.step()) {
*error = sqlite3_errmsg(handle);
return false;
......@@ -289,37 +304,7 @@ DirOpenResult DirectoryBackingStore::Load(MetahandlesIndex* entry_bucket,
bool DirectoryBackingStore::BeginLoad() {
DCHECK(load_dbhandle_ == NULL);
bool ret = OpenAndConfigureHandleHelper(&load_dbhandle_);
if (ret)
return true;
// Something's gone wrong. Nuke the database and try again.
using ::operator<<; // For string16.
LOG(ERROR) << "Sync database " << backing_filepath_.value()
<< " corrupt. Deleting and recreating.";
file_util::Delete(backing_filepath_, false);
bool failed_again = !OpenAndConfigureHandleHelper(&load_dbhandle_);
// Using failed_again here lets us distinguish from cases where corruption
// occurred even when re-opening a fresh directory (they'll go in a separate
// double weight histogram bucket). Failing twice in a row means we disable
// sync, so it's useful to see this number separately.
int bucket = failed_again ? 2 : 1;
#if defined(OS_WIN)
UMA_HISTOGRAM_COUNTS_100("Sync.DirectoryOpenFailedWin", bucket);
#elif defined(OS_MACOSX)
UMA_HISTOGRAM_COUNTS_100("Sync.DirectoryOpenFailedMac", bucket);
#else
UMA_HISTOGRAM_COUNTS_100("Sync.DirectoryOpenFailedNotWinMac", bucket);
#if defined(OS_CHROMEOS)
UMA_HISTOGRAM_COUNTS_100("Sync.DirectoryOpenFailedCros", bucket);
#elif defined(OS_LINUX)
UMA_HISTOGRAM_COUNTS_100("Sync.DirectoryOpenFailedLinux", bucket);
#else
UMA_HISTOGRAM_COUNTS_100("Sync.DirectoryOpenFailedOther", bucket);
#endif // OS_LINUX && !OS_CHROMEOS
#endif // OS_WIN
return !failed_again;
return OpenAndConfigureHandleHelper(&load_dbhandle_);
}
void DirectoryBackingStore::EndLoad() {
......
......@@ -2082,15 +2082,7 @@ TEST_F(DirectoryBackingStoreTest, Corruption) {
scoped_ptr<DirectoryBackingStore> dbs(
new DirectoryBackingStore(GetUsername(), GetDatabasePath()));
// In release mode, we expect the sync database to nuke itself and start
// over if it detects invalid/corrupted data.
#if defined(NDEBUG) && !defined(DCHECK_ALWAYS_ON)
EXPECT_TRUE(dbs->BeginLoad());
#elif defined(NDEBUG) && defined(DCHECK_ALWAYS_ON)
EXPECT_DEATH(dbs->BeginLoad(), "stmt_");
#else
EXPECT_DEATH(dbs->BeginLoad(), "sqlite error");
#endif
EXPECT_FALSE(dbs->BeginLoad());
}
}
......
......@@ -33,9 +33,9 @@ class DebugSQLErrorHandler: public VanillaSQLErrorHandler {
public:
virtual int HandleError(int error, sqlite3* db) {
error_ = error;
NOTREACHED() << "sqlite error " << error
<< " " << sqlite3_errmsg(db)
<< " db " << static_cast<void*>(db);
LOG(ERROR) << "sqlite error " << error
<< " " << sqlite3_errmsg(db)
<< " db " << static_cast<void*>(db);
return error;
}
};
......
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment