forked from chromium/chromium
-
Notifications
You must be signed in to change notification settings - Fork 0
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
This document contains organic, artisanal links of the highest quality, hand-crafted for maximum robustness in the face of code change. Note that you can patch this CL and use ./tools/md_browser/md_browser.py to see a rendered version, though the styling isn't quite the same as Gitiles. BUG=658298 Review-Url: https://codereview.chromium.org/2669593004 Cr-Commit-Position: refs/heads/master@{#447613}
- Loading branch information
maxbogue
authored and
Commit bot
committed
Feb 1, 2017
1 parent
c84d488
commit 119f6c9
Showing
1 changed file
with
297 additions
and
0 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,297 @@ | ||
# Chrome Sync's Model API | ||
|
||
Chrome Sync operates on discrete, explicitly defined model types (bookmarks, | ||
preferences, tabs, etc). These model types are individually responsible for | ||
implementing their own local storage and responding to remote changes. This | ||
guide is for developers interested in syncing data for their model type to the | ||
cloud using Chrome Sync. It describes the newest version of the API, known as | ||
Unified Sync and Storage (USS). There is also the deprecated [SyncableService | ||
API] (aka Directory), which as of early 2016 is still used by most model types. | ||
|
||
[SyncableService API]: https://www.chromium.org/developers/design-documents/sync/syncable-service-api | ||
|
||
[TOC] | ||
|
||
## Overview | ||
|
||
To correctly sync data, USS requires that sync metadata be stored alongside your | ||
model data in a way such that they are written together atomically. **This is | ||
very important!** Sync must be able to update the metadata for any local data | ||
changes as part of the same write to disk. If you attempt to write data to disk | ||
and only notify sync afterwards, a crash in between the two writes can result in | ||
changes being dropped and never synced to the server, or data being duplicated | ||
due to being committed more than once. | ||
|
||
[`ModelTypeSyncBridge`][Bridge] is the interface the model code must implement. | ||
The bridge tends to be either a [`KeyedService`][KeyedService] or owned by one. | ||
The correct place for the bridge generally lies as close to where your model | ||
data is stored as possible, as the bridge needs to be able to inject metadata | ||
updates into any local data changes that occur. | ||
|
||
The bridge has access to a [`ModelTypeChangeProcessor`][MTCP] object, which it | ||
uses to communicate local changes to sync using the `Put` and `Delete` methods. | ||
The processor will communicate remote changes from sync to the bridge using the | ||
`ApplySyncChanges` method. [`MetadataChangeList`][MCL] is the way sync will | ||
communicate metadata changes to the storage mechanism. Note that it is typically | ||
implemented on a per-storage basis, not a per-type basis. | ||
|
||
[Bridge]: https://cs.chromium.org/chromium/src/components/sync/model/model_type_sync_bridge.h | ||
[KeyedService]: https://cs.chromium.org/chromium/src/components/keyed_service/core/keyed_service.h | ||
[MTCP]: https://cs.chromium.org/chromium/src/components/sync/model/model_type_change_processor.h | ||
[MCL]: https://cs.chromium.org/chromium/src/components/sync/model/metadata_change_list.h | ||
|
||
## Data | ||
|
||
### Specifics | ||
|
||
Model types will define a proto that contains the necessary fields of the | ||
corresponding native type (e.g. [`TypedUrlSpecifics`][TypedUrlSpecifics] | ||
contains a URL and a list of visit timestamps) and include it as a field in the | ||
generic [`EntitySpecifics`][EntitySpecifics] proto. This is the form that all | ||
communications with sync will use. This proto form of the model data is referred | ||
to as the specifics. | ||
|
||
[TypedUrlSpecifics]: https://cs.chromium.org/chromium/src/components/sync/protocol/typed_url_specifics.proto | ||
[EntitySpecifics]: https://cs.chromium.org/search/?q="message+EntitySpecifics"+file:sync.proto | ||
|
||
### Identifiers | ||
|
||
There are two primary identifiers for entities: **storage key** and **client | ||
tag**. The bridge will need to take an [`EntityData`][EntityData] object (which | ||
contains the specifics) and be able generate both of these from it. For | ||
non-legacy types without significant performance concerns, these will generally | ||
be the same. | ||
|
||
The storage key is used to uniquely identify entities locally within a client. | ||
It’s what’s used to refer to entities most of the time and, as its name implies, | ||
the bridge needs to be able to look up local data and metadata entries in the | ||
store using it. Because it is a local identifier, it can change as part of | ||
database migrations, etc. This may be desirable for efficiency reasons. | ||
|
||
The client tag is used to generate the **client tag hash**, which will identify | ||
entities **across clients**. This means that its implementation can **never | ||
change** once entities have begun to sync, without risking massive duplication | ||
of entities. This means it must be generated using only immutable data in the | ||
specifics. If your type does not have any immutable fields to use, you will need | ||
to add one (e.g. a GUID, though be wary as they have the potential to conflict). | ||
While the hash gets written to disk as part of the metadata, the tag itself is | ||
never persisted locally. | ||
|
||
[EntityData]: https://cs.chromium.org/chromium/src/components/sync/model/entity_data.h | ||
|
||
## Storage | ||
|
||
A crucial requirement of USS is that the model must add support for keeping | ||
sync’s metadata in the same storage as its normal data. The metadata consists of | ||
one [`EntityMetadata`][EntityMetadata] proto for each data entity, and one | ||
[`ModelTypeState`][ModelTypeState] proto containing metadata pertaining to the | ||
state of the entire type (the progress marker, for example). This typically | ||
requires two extra tables in a database to do (one for each type of proto). | ||
|
||
Since the processor doesn’t know anything about the store, the bridge provides | ||
it with an implementation of the [`MetadataChangeList`][MCL] interface. The | ||
change processor writes metadata through this interface when changes occur, and | ||
the bridge simply has to ensure it gets passed along to the store and written | ||
along with the data changes. | ||
|
||
[EntityMetadata]: https://cs.chromium.org/chromium/src/components/sync/protocol/entity_metadata.proto | ||
[ModelTypeState]: https://cs.chromium.org/chromium/src/components/sync/protocol/model_type_state.proto | ||
|
||
### ModelTypeStore | ||
|
||
While the model type may store its data however it chooses, many types use | ||
[`ModelTypeStore`][Store], which was created specifically to provide a | ||
convenient persistence solution. It’s backed by a [LevelDB] to store serialized | ||
protos to disk. `ModelTypeStore` provides two `MetadataChangeList` | ||
implementations for convenience; both accessed via | ||
[`ModelTypeStore::WriteBatch`][WriteBatch]. One passes metadata changes directly | ||
into an existing `WriteBatch` and another caches them in memory until a | ||
`WriteBatch` exists to consume them. | ||
|
||
The store interface abstracts away the type and will handle setting up tables | ||
for the type’s data, so multiple `ModelTypeStore` objects for different types | ||
can share the same LevelDB backend just by specifying the same path and task | ||
runner. Sync already has a backend it uses for DeviceInfo that can be shared by | ||
other types via the | ||
[`ProfileSyncService::GetModelTypeStoreFactory`][StoreFactory] method. | ||
|
||
[Store]: https://cs.chromium.org/chromium/src/components/sync/model/model_type_store.h | ||
[LevelDB]: http://leveldb.org/ | ||
[WriteBatch]: https://cs.chromium.org/search/?q="class+WriteBatch"+file:model_type_store.h | ||
[StoreFactory]: https://cs.chromium.org/search/?q=GetModelTypeStoreFactory+file:profile_sync_service.h | ||
|
||
## Implementing ModelTypeSyncBridge | ||
|
||
### Initialization | ||
|
||
The bridge is required to load all of the metadata for its type from storage and | ||
provide it to the processor via the [`ModelReadyToSync`][ModelReadyToSync] | ||
method **before any local changes occur**. This can be tricky if the thread the | ||
bridge runs on is different from the storage mechanism. No data will be synced | ||
with the server if the processor is never informed that the model is ready. | ||
|
||
Since the tracking of changes and updating of metadata is completely | ||
independent, there is no need to wait for the sync engine to start before | ||
changes can be made. This prevents the need for an expensive association step in | ||
the initialization. | ||
|
||
[ModelReadyToSync]: https://cs.chromium.org/search/?q=ModelReadyToSync+file:/model_type_change_processor.h | ||
|
||
### MergeSyncData | ||
|
||
This method is called only once, when a type is first enabled. Sync will | ||
download all the data it has for the type from the server and provide it to the | ||
bridge using this method. Sync filters out any tombstones for this call, so | ||
`EntityData::is_deleted()` will never be true for the provided entities. The | ||
bridge must then examine the sync data and the local data and merge them | ||
together: | ||
|
||
* Any remote entities that don’t exist locally must be be written to local | ||
storage. | ||
* Any local entities that don’t exist remotely must be provided to sync via | ||
[`ModelTypeChangeProcessor::Put`][Put]. | ||
* Any entities that appear in both sets must be merged and the model and sync | ||
informed accordingly. Decide which copy of the data to use (or a merged | ||
version or neither) and update the local store and sync as necessary to | ||
reflect the decision. How the decision is made can vary by model type. | ||
|
||
The [`MetadataChangeList`][MCL] passed into the function is already populated | ||
with metadata for all the data passed in (note that neither the data nor the | ||
metadata have been committed to storage yet at this point). It must be given to | ||
the processor for any `Put` or `Delete` calls so the relevant metadata can be | ||
added/updated/deleted, and then passed to the store for persisting along with | ||
the data. | ||
|
||
Note that if sync gets disabled and the metadata cleared, entities that | ||
originated from other clients will exist as “local” entities the next time sync | ||
starts and merge is called. Since tombstones are not provided for merge, this | ||
can result in reviving the entity if it had been deleted on another client in | ||
the meantime. | ||
|
||
[Put]: https://cs.chromium.org/search/?q=Put+file:/model_type_change_processor.h | ||
|
||
### ApplySyncChanges | ||
|
||
While `MergeSyncData` provides the state of sync data using `EntityData` | ||
objects, `ApplySyncChanges` provides changes to the state using | ||
[`EntityChange`][EntityChange] objects. These changes must be applied to the | ||
local state. | ||
|
||
Here’s an example implementation of a type using `ModelTypeStore`: | ||
|
||
```cpp | ||
base::Optional<ModelError> DeviceInfoSyncBridge::ApplySyncChanges( | ||
std::unique_ptr<MetadataChangeList> metadata_change_list, | ||
EntityChangeList entity_changes) { | ||
std::unique_ptr<WriteBatch> batch = store_->CreateWriteBatch(); | ||
for (const EntityChange& change : entity_changes) { | ||
if (change.type() == EntityChange::ACTION_DELETE) { | ||
batch->DeleteData(change.storage_key()); | ||
} else { | ||
batch->WriteData(change.storage_key(), | ||
change.data().specifics.your_type().SerializeAsString()); | ||
} | ||
} | ||
|
||
batch->TransferMetadataChanges(std::move(metadata_change_list)); | ||
store_->CommitWriteBatch(std::move(batch), base::Bind(...)); | ||
NotifyModelOfChanges(); | ||
return {}; | ||
} | ||
``` | ||
A conflict can occur when an entity has a pending local commit when an update | ||
for the same entity comes from another client. In this case, the bridge’s | ||
[`ResolveConflict`][ResolveConflict] method will have been called prior to the | ||
`ApplySyncChanges` call in order to determine what should happen. This method | ||
defaults to having the remote version overwrite the local version unless the | ||
remote version is a tombstone, in which case the local version wins. | ||
[EntityChange]: https://cs.chromium.org/chromium/src/components/sync/model/entity_change.h | ||
[ResolveConflict]: https://cs.chromium.org/search/?q=ResolveConflict+file:/model_type_sync_bridge.h | ||
### Local changes | ||
The [`ModelTypeChangeProcessor`][MTCP] must be informed of any local changes via | ||
its `Put` and `Delete` methods. Since the processor cannot do any useful | ||
metadata tracking until `MergeSyncData` is called, the `IsTrackingMetadata` | ||
method is provided. It can be checked as an optimization to prevent unnecessary | ||
processing preparing the parameters to a `Put` or `Delete` call. | ||
Here’s an example of handling a local write using `ModelTypeStore`: | ||
```cpp | ||
void WriteLocalChange(std::string key, ModelData data) { | ||
std::unique_ptr<WriteBatch> batch = store_->CreateWriteBatch(); | ||
if (change_processor()->IsTrackingMetadata()) { | ||
change_processor()->Put(key, ModelToEntityData(data), | ||
batch->GetMetadataChangeList()); | ||
} | ||
batch->WriteData(key, specifics->SerializeAsString()); | ||
store_->CommitWriteBatch(std::move(batch), base::Bind(...)); | ||
} | ||
``` | ||
|
||
## Error handling | ||
|
||
If any errors occur during store operations that could compromise the | ||
consistency of the data and metadata, the processor’s | ||
[`ReportError`][ReportError] method should be called. The only exception to this | ||
is errors during `MergeSyncData` or `ApplySyncChanges`, which should just return | ||
a [`ModelError`][ModelError]. | ||
|
||
This will inform sync of the error, which will stop all communications with the | ||
server so bad data doesn’t get synced. Since the metadata might no longer be | ||
valid, the bridge will asynchronously receive a `DisableSync` call (this is | ||
implemented by the abstract base class; subclasses don’t need to do anything). | ||
All the metadata will be cleared from the store (if possible), and the type will | ||
be started again from scratch on the next client restart. | ||
|
||
[ReportError]: https://cs.chromium.org/search/?q=ReportError+file:/model_type_change_processor.h | ||
[ModelError]: https://cs.chromium.org/chromium/src/components/sync/model/model_error.h | ||
|
||
## Sync Integration Checklist | ||
|
||
* Define your specifics proto in [`//components/sync/protocol/`][protocol]. | ||
* Add a field for it to [`EntitySpecifics`][EntitySpecifics]. | ||
* Add it to the [`ModelType`][ModelType] enum and | ||
[`kModelTypeInfoMap`][info_map]. | ||
* Add it to the [proto value conversions][conversions] files. | ||
* Register a [`ModelTypeController`][ModelTypeController] for your type in | ||
[`ProfileSyncComponentsFactoryImpl::RegisterDataTypes`][RegisterDataTypes]. | ||
* Tell sync how to access your `ModelTypeSyncBridge` in | ||
[`ChromeSyncClient::GetSyncBridgeForModelType`][GetSyncBridge]. | ||
* Add to the [start order list][kStartOrder]. | ||
* Add an field for encrypted data to [`NigoriSpecifics`][NigoriSpecifics]. | ||
* Add to two encrypted types translation functions in | ||
[`nigori_util.cc`][nigori_util]. | ||
* Add a [preference][pref_names] for tracking whether your type is enabled. | ||
* Map your type to the pref in [`GetPrefNameForDataType`][GetPrefName]. | ||
* Check whether you should be part of a [pref group][RegisterPrefGroup]. | ||
* Add to the `SyncModelTypes` enum and `SyncModelType` suffix in | ||
[`histograms.xml`][histograms]. | ||
* Add to the [`SYNC_DATA_TYPE_HISTOGRAM`][DataTypeHistogram] macro. | ||
|
||
[protocol]: https://cs.chromium.org/chromium/src/components/sync/protocol/ | ||
[ModelType]: https://cs.chromium.org/chromium/src/components/sync/base/model_type.h | ||
[info_map]: https://cs.chromium.org/search/?q="kModelTypeInfoMap%5B%5D"+file:model_type.cc | ||
[conversions]: https://cs.chromium.org/chromium/src/components/sync/protocol/proto_value_conversions.h | ||
[ModelTypeController]: https://cs.chromium.org/chromium/src/components/sync/driver/model_type_controller.h | ||
[RegisterDataTypes]: https://cs.chromium.org/search/?q="ProfileSyncComponentsFactoryImpl::RegisterDataTypes" | ||
[GetSyncBridge]: https://cs.chromium.org/search/?q=GetSyncBridgeForModelType+file:chrome_sync_client.cc | ||
[kStartOrder]: https://cs.chromium.org/search/?q="kStartOrder[]" | ||
[NigoriSpecifics]: https://cs.chromium.org/chromium/src/components/sync/protocol/nigori_specifics.proto | ||
[nigori_util]: https://cs.chromium.org/chromium/src/components/sync/syncable/nigori_util.cc | ||
[pref_names]: https://cs.chromium.org/chromium/src/components/sync/base/pref_names.h | ||
[GetPrefName]: https://cs.chromium.org/search/?q=::GetPrefNameForDataType+file:sync_prefs.cc | ||
[RegisterPrefGroup]: https://cs.chromium.org/search/?q=::RegisterPrefGroups+file:sync_prefs.cc | ||
[histograms]: https://cs.chromium.org/chromium/src/tools/metrics/histograms/histograms.xml | ||
[DataTypeHistogram]: https://cs.chromium.org/chromium/src/components/sync/base/data_type_histogram.h | ||
|
||
## Testing | ||
|
||
The [`TwoClientUssSyncTest`][UssTest] suite is probably a good place to start | ||
for integration testing. Especially note the use of a `StatusChangeChecker` to | ||
wait for events to happen. | ||
|
||
[UssTest]: https://cs.chromium.org/chromium/src/chrome/browser/sync/test/integration/two_client_uss_sync_test.cc |