
Base Architecture #9

Merged
merged 31 commits into from
Aug 3, 2017
47f0b39
Spec out EntitySnapshots
nevir Jul 28, 2017
6d13f83
Spec out GraphSnapshots; and drop the schema typing until we can turn…
nevir Jul 28, 2017
35b21f1
Document transactions
nevir Jul 28, 2017
90e7d48
Checkpoint
nevir Jul 28, 2017
13f3bf3
Outline mergePayload
nevir Jul 30, 2017
74bdfdc
Sketch _mergeReferenceEdits
nevir Jul 30, 2017
70ff21d
Drop changeId reversion for now
nevir Jul 30, 2017
63389d6
Outline _rebuildInboundReferences
nevir Jul 30, 2017
fee34a4
Sketch _removeOrphanedNodes
nevir Jul 30, 2017
babf4b0
Commit should return metadata, too
nevir Jul 30, 2017
a0ca6ef
value -> node
nevir Jul 30, 2017
15c73a1
Add esnext.asynciterable for @types/graphql
nevir Jul 30, 2017
ec1ea33
Fix links
nevir Jul 30, 2017
c7ccaab
Sync docs w/ reality
nevir Jul 30, 2017
6f71543
Talk about querying
nevir Jul 30, 2017
6858daa
Cleanup
nevir Jul 30, 2017
a8ac6bf
Start to sketch the Cache interface
nevir Jul 31, 2017
5cf8892
Sketch out optimistic updates
nevir Aug 2, 2017
be13412
Curse you eslint
nevir Aug 2, 2017
b488b2f
Document reads
nevir Aug 2, 2017
06e0161
Plumb Configuration through
nevir Aug 2, 2017
f815f52
root node ids
nevir Aug 2, 2017
9b11a5a
Scope configuration to emit only EntityIds
nevir Aug 2, 2017
c30a20d
Link to the read operation
nevir Aug 2, 2017
108cb18
Rename GraphTransaction to SnapshotEditor
nevir Aug 2, 2017
f21a41d
OptimisticUpdateQueue
nevir Aug 2, 2017
81cdd0e
Reorganize writing docs
nevir Aug 2, 2017
755f4fb
Sketch out writes
nevir Aug 2, 2017
47f5929
Compackt
nevir Aug 2, 2017
352f250
Misc
nevir Aug 2, 2017
8dec887
Typos
nevir Aug 3, 2017
2 changes: 1 addition & 1 deletion .eslintrc.yml
@@ -145,7 +145,7 @@ rules:
no-var: error
object-shorthand: error
prefer-arrow-callback: [error, { allowNamedFunctions: true }]
prefer-destructuring: error
prefer-destructuring: [error, { array: false }]
prefer-numeric-literals: error
prefer-rest-params: error
prefer-spread: error
6 changes: 5 additions & 1 deletion .vscode/settings.json
@@ -11,5 +11,9 @@
"{src,test}/**/*.js": true,
"node_modules/": true,
"output/": true
}
},
"eslint.validate": [
"typescript",
"typescriptreact"
]
}
123 changes: 123 additions & 0 deletions docs/Architecture.md
@@ -0,0 +1,123 @@
# Architecture Of This Cache

This document discusses the specific components of the cache, and roughly how they behave/interact. You may also want some additional background on the [motivation](./Motivation.md) behind the cache, as well as [the rationale](./Design%20Exploration.md) for some of the design choices.


## Design Requirements & Decisions

Contracts that we must adhere to:

1. All nodes are normalized in the cache. _This supports Apollo's existing, and desirable, normalization behavior_

2. Objects returned from query must not be mutated by the cache. _This allows downstream components to reason clearly about the values they're given from Apollo, and avoid excessive integrity checks._

Design points:

3. [The cache indexes entities](./Design%20Exploration.md#entities)

4. Cached entities directly (potentially circularly) [reference other cached entities](./Design%20Exploration.md#normalized-graph-cache)

5. Values from parameterized edges are [layered on top of entities via prototypes](./Design%20Exploration.md#dealing-with-parameterized-edges)

6. [Entities from the cache are _directly_ returned](./Design%20Exploration.md#normalized-graph-cache) where possible (no parameterized edges). _This minimizes the amount of work required when reading from the cache._

7. The cache will garbage collect orphaned entities, as well as provide a mechanism to directly evict entities (and any orphaned by that eviction).
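The collection behavior in (7) can be sketched as a reachability sweep from root nodes: anything not reachable via outbound references is orphaned. The names here (`RefGraph`, `orphanedIds`) are invented for illustration and are not part of this codebase:

```typescript
// Reachability sweep for design point (7): collect every node that no root
// can reach. Shapes are illustrative only.
type RefGraph = { [id: string]: { root: boolean; outbound: string[] } };

function orphanedIds(graph: RefGraph): string[] {
  const live = new Set<string>();
  const queue = Object.keys(graph).filter(id => graph[id].root);
  while (queue.length) {
    const id = queue.pop()!;
    if (live.has(id)) continue;
    live.add(id);
    queue.push(...graph[id].outbound); // Follow outbound references.
  }
  return Object.keys(graph).filter(id => !live.has(id));
}
```

The `live` set makes the sweep robust to the circular references permitted by (4).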


## Snapshots

At its core, the cache maintains a normalized graph of entities, and indexes into that graph for efficient retrieval. Additionally, due to requirement (2) and design decision (6), this normalized graph must be _immutable_.

To maintain this, the cache tracks the current version of a node (and the overall graph) via snapshots. _Note: this is similar, but not identical, to [Relay Modern's concept of snapshots](https://github.com/facebook/relay/blob/master/packages/relay-runtime/ARCHITECTURE.md#example-data-flow-reading-and-observing-the-store)._


### Node Snapshots

The cache maintains a [`NodeSnapshot`](../src/NodeSnapshot.ts) for _important_ nodes in the graph - unlike the existing implementations, it does not maintain metadata for a node unless it's necessary. This snapshot maintains a reference to that node (in the normalized graph), and some metadata.


#### Snapshot Metadata

In addition to the node reference, all node snapshots maintain base metadata:

**Root**: Some entities (such as the [query or mutation roots](http://facebook.github.io/graphql/#sec-Type-System)) are considered to be entry points to the graph. They, and all the entities they transitively reference, are considered active and will not be garbage collected.

**Inbound references**: Each node snapshot maintains a list of all _inbound_ references to that node. This allows us to only (shallow) copy the minimal set of nodes when making edits to the graph, due to the immutability constraint.

**Outbound references**: Similarly, each snapshot also maintains a list of all
outbound references, in order to support reference-counted garbage collection.
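Taken together, that metadata can be pictured roughly as follows. These are illustrative shapes only; the real definitions live in [`NodeSnapshot`](../src/NodeSnapshot.ts) and differ in their details:

```typescript
// Illustrative sketch of per-node metadata; not the real NodeSnapshot shape.
interface NodeReference {
  id: string;       // NodeId of the node on the other end of the reference.
  path: string[];   // Where the reference sits within the referencing node.
}

interface NodeSnapshotSketch {
  node: object;               // The node's current (immutable) value.
  root: boolean;              // Entry point; roots are never garbage collected.
  inbound: NodeReference[];   // Drives minimal shallow copying on edit.
  outbound: NodeReference[];  // Drives reference-counted garbage collection.
}
```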


#### Snapshot Types

There are several types of entities tracked by node snapshots, each with a specialized form of the snapshot:

[**Entities**](../src/NodeSnapshot.ts#L38-L69): Tracks an entity: an object modeling part of the application's domain.

[**Parameterized Values**](../src/NodeSnapshot.ts#L71-L111): Tracks the value of a parameterized edge, the node it occurs within, and the path to the edge. These are used at query time to layer the value of a parameterized edge on top of the underlying node.
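The prototype layering in (5) can be illustrated in a few lines. The data here is invented for the example:

```typescript
// Underlying (shared) entity node in the normalized graph.
const user = { id: 'user:1', name: 'Ada' };

// Layer a parameterized edge's value on top via the prototype chain: the view
// exposes the parameterized result, while un-parameterized fields read
// through to the underlying node.
const view = Object.create(user);
view.posts = [{ id: 'post:1', title: 'Hello' }]; // e.g. posts(first: 1)
```

Un-parameterized reads fall through to the shared node, so the parameterized layer adds no copying cost, and the value never leaks onto the underlying node itself.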


### Graph Snapshots

All node snapshots referencing a particular version of the graph are collected into an identity map - a [`GraphSnapshot`](../src/GraphSnapshot.ts). This becomes a readonly view of all the nodes, as well as the primary entry point into the cache.


### Reading From The Cache

Because the cache is built to store values in a format that can be directly returned (for un-parameterized edges), most of the work to perform a query revolves around making sure that the cache can satisfy the query. The high level approach to performing a query is roughly:

1. Pre-process the query (if not already done), extracting paths to parameterized edges.

2. If there are parameterized edges in the query, fill in the object path up to them, taking advantage of object prototypes to point to the underlying nodes.

3. Verify that the query is satisfied by the cache. _The naive approach is to walk the selection set(s) expressed by the query; it's probably good enough for now_.

4. Return the query root, or a view on top of it via (2).
Review comment: Do you mean (2)?

Reply (Contributor Author): Doh, yup.

Generally, when reading, we want to return whatever data we have, as well as a status indicating whether the query was completely satisfied. The caller can determine what to do if not satisfied.
Review comment: satisfying -> satisified?

Reply (Contributor Author): We want our queries to feel like:

[image]

See [`operations/read`](../src/operations/read.ts) for specific implementation details.

Note: this is likely the area of the cache with the most room for improvement. Step (3) has multiple opportunities for memoization and precomputation (per-query, per-fragment, per-node, etc).
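A naive version of steps (3) and (4) might look like the following sketch. The selection format is a stand-in for a real GraphQL selection set, and none of these names come from the actual implementation in [`operations/read`](../src/operations/read.ts):

```typescript
// A naive completeness walk: verify that every requested field exists on the
// cached value. Field lists stand in for a real GraphQL selection set.
type FieldSelection = { name: string; children?: FieldSelection[] };

function isSatisfied(value: any, selections: FieldSelection[]): boolean {
  if (value === null || typeof value !== 'object') return false;
  return selections.every(({ name, children }) => {
    if (!(name in value)) return false;
    return children ? isSatisfied(value[name], children) : true;
  });
}

// Mirrors the read contract described above: hand back whatever we have,
// plus a completeness flag so the caller can decide what to do next.
function readSketch(root: any, selections: FieldSelection[]) {
  return { result: root, complete: isSatisfied(root, selections) };
}
```

Note that the cached root is returned directly, per design decision (6), rather than being copied.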


### Writing To The Cache

As snapshots maintain a readonly immutable view into a version of the graph, we need a way to generate new versions. A [`SnapshotEditor`](../src/operations/SnapshotEditor.ts) encapsulates the logic for making edits to a snapshot in an immutable way (e.g. creating a new copy), following the builder pattern.

The logic for merging new values should apply only the minimal set of edits to the parent snapshot needed to reach the new desired state. This speeds up cache writes, and ensures that object identities change only when their values (or referenced nodes) have changed.

At a high level, this looks something like:

1. Merge all changed scalar values from the payload, generating new node snapshots along the way.

2. Update any references that should now point to a new node (now that all nodes with changed values have been built).

3. Update any nodes that _transitively_ reference edited nodes.

4. Garbage collect any newly orphaned subgraphs.

See [`SnapshotEditor#mergePayload`](../src/operations/SnapshotEditor.ts) for the specific implementation details.
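Steps (1) through (3) amount to copy-on-write driven by the inbound-reference metadata. The following is a simplified sketch with invented shapes; the real editor also rewrites the copied parents' references to point at the new child versions, which this sketch glosses over:

```typescript
// Copy-on-write sketch: editing one node forces shallow copies of every node
// that (transitively) references it, and nothing else.
type Graph = { [id: string]: { node: any; inbound: string[] } };

function editNode(graph: Graph, id: string, changes: object): Graph {
  const next: Graph = { ...graph };
  // (1) Shallow-copy the edited node, merging in the changed values.
  next[id] = { ...graph[id], node: { ...graph[id].node, ...changes } };
  // (2)/(3) Walk inbound references, shallow-copying each referencing node so
  // its identity changes too; untouched subgraphs keep their old identities.
  const queue = [...graph[id].inbound];
  while (queue.length) {
    const refId = queue.shift()!;
    if (next[refId] !== graph[refId]) continue; // Already copied.
    next[refId] = { ...graph[refId], node: { ...graph[refId].node } };
    queue.push(...graph[refId].inbound);
  }
  return next;
}
```

Because only the edited node and its (transitive) referrers get new identities, downstream consumers can use cheap identity checks to detect change.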


### Optimistic Updates

Optimistic updates are tricky. Mutations can specify an optimistic response, to be applied immediately on top of the existing state of the cache. There are some interesting rules surrounding them:

1. There can be any number of optimistic updates active at a time, and the values from more recent ones take precedence.

2. Any optimistic update can be reverted at any time (but typically when the mutation completes, success or error) - the rest must continue to overlay the underlying state of the cache.

3. The data expressed by the optimistic update MUST take precedence over the base cache, even if we've gotten newer values from the server.

4. When querying the cache for values, it should prefer values present in optimistic updates over those in the underlying cache.

Due to (2) and (3), we know that we cannot blindly merge optimistic updates into an existing snapshot - and that we must track the base cache snapshot. Also, due to (3), whenever we receive new values from the server, we effectively need to update the raw cache snapshot, and then replay optimistic updates.

The approach that seems best here is to:

* Track all optimistic updates individually via an [optimistic state queue](../src/OptimisticUpdateQueue.ts), where each update is represented in the same format as a GraphQL response payload.

* The cache tracks both a base graph snapshot and - if there are active optimistic updates - an optimistic graph snapshot. Every time either the raw snapshot changes, or the optimistic state queue changes, we regenerate the unified snapshot by replaying the optimistic updates on top of the base snapshot.
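That replay can be sketched as follows, with shallow object merges standing in for full payload merges, and invented shapes throughout:

```typescript
// Simplified replay: each optimistic update is a payload-shaped object, and
// the unified view is rebuilt by applying every queued update, oldest first,
// on top of the base snapshot.
type Snapshot = { [key: string]: any };
type Update = { id: number; payload: Snapshot };

function applyOptimisticUpdates(base: Snapshot, queue: Update[]): Snapshot {
  // Later updates take precedence (rule 1); starting from the base snapshot
  // every time means no optimistic value ever touches the base (rule 3).
  return queue.reduce((snap, update) => ({ ...snap, ...update.payload }), base);
}

function removeOptimistic(queue: Update[], id: number): Update[] {
  return queue.filter(update => update.id !== id);
}
```

Reverting one update is just `removeOptimistic` followed by a replay, which satisfies rule (2) without ever mutating the base snapshot.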

Review comment:

I think that it would be helpful to reason about optimistic updates by calling out explicit scenarios:

  1. User is offline for an extended period of time. App should behave as normal and network calls can be replayed once reconnected.
  2. User is mostly connected, updates will succeed the vast majority of the time.

In case 1 it seems reasonable to merge state changes directly to the cache. During an extended offline session, these changes would likely pile up and replaying events could become really slow.

Case 2 (well connected) seems like fewer updates would pile up at a given point in time and replaying them each time server data comes in wouldn't be too big of an issue.

One idea for a simplified optimistic update mechanism could be to merge all optimistic updates to the cache while still maintaining a queue of state changes. If one of the updates fails, you could bust the cache pulling fresh state from the server and then replay the additional queued state changes.

Reply (@nevir, Contributor Author, Aug 3, 2017):

> In case 1 it seems reasonable to merge state changes directly to the cache. During an extended offline session, these changes would likely pile up and replaying events could become really slow.

Unfortunately, I don't think it's safe to merge them directly into the cache:

  • When we do end up getting back online, after the update finally flushes, the server may disagree, but we still want to represent the optimistic state until all optimistic updates have flushed.

  • Those requests could still fail at some point, which should invalidate that specific update, but not others. If we merge directly into the cache, we lose all ability to safely roll back.


I think we can flip the merging around: merge all the updates into one delta (as opposed to replaying one at a time), and just apply that merged update when the base store changes. It gets a little tricky, though, as each update can be rooted at a different node in the graph

Pulling enough server state to cover all updates may be untenable (too much data, or too many joins)


One future improvement is to merge optimistic updates where possible, so that we have fewer updates to apply on each write.
4 changes: 4 additions & 0 deletions package.json
@@ -51,5 +51,9 @@
"rimraf": "^2.6.1",
"typescript": "^2.4.2",
"typescript-eslint-parser": "eslint/typescript-eslint-parser#ts-2.4"
},
"dependencies": {
"@types/graphql": "^0.10.0",
"tslib": "^1.7.0"
}
}
119 changes: 119 additions & 0 deletions src/Cache.ts
@@ -0,0 +1,119 @@
import { DocumentNode } from 'graphql'; // eslint-disable-line import/no-extraneous-dependencies, import/no-unresolved

import { getQueryDefinitionOrDie, getSelectionSetOrDie } from './ast';
import { CacheSnapshot } from './CacheSnapshot';
import { Configuration } from './Configuration';
import { read, SnapshotEditor } from './operations';
import { ChangeId, NodeId } from './schema';

export interface ReadOptions {
query: DocumentNode;
variables: object;
optimistic: boolean;
rootId?: NodeId;
previousResult?: any;
}

export interface WriteOptions {
dataId: string;
result: any;
document: DocumentNode;
variables?: object;
}

export interface Transaction {
(cache: Cache): void;
}

/**
* The Hermes cache.
*
* @see https://github.com/apollographql/apollo-client/issues/1971
* @see https://github.com/apollographql/apollo-client/blob/2.0-alpha/src/data/cache.ts
*/
export class Cache {

/** Configuration used by various operations made against the cache. */
private readonly _config: Configuration;

/** The current version of the cache. */
private _snapshot: CacheSnapshot;

/**
* Reads the selection expressed by a query from the cache.
*/
read(options: ReadOptions): { result: any, complete: boolean } {
// TODO: Can we drop non-optimistic reads?
// https://github.com/apollographql/apollo-client/issues/1971#issuecomment-319402170
const snapshot = options.optimistic ? this._snapshot.optimistic : this._snapshot.baseline;
const query = getQueryDefinitionOrDie(options.query);

return read(this._config, snapshot, query.selectionSet);
}

/**
* Registers a callback to be invoked whenever the results of the observed
* query change.
*/
watch(options: ReadOptions, callback: () => void): void {
// Random line to get ts/tslint to shut up.
return this.watch(options, callback);
}

/**
* Writes values for a selection to the cache.
*/
write(options: WriteOptions): void {
const selection = getSelectionSetOrDie(options.document);
const currentSnapshot = this._snapshot;

const editor = new SnapshotEditor(this._config, currentSnapshot.baseline);
editor.mergePayload(options.dataId, selection, options.result, options.variables);
const { snapshot: baseline, editedNodeIds } = editor.commit();

let optimistic = baseline;
if (currentSnapshot.optimisticStateQueue.hasUpdates()) {
const result = currentSnapshot.optimisticStateQueue.apply(this._config, baseline);
optimistic = result.snapshot;
for (const nodeId of result.editedNodeIds) {
editedNodeIds.add(nodeId);
}
}

// TODO: Let observers know about editedNodeIds.

this._snapshot = { baseline, optimistic, optimisticStateQueue: currentSnapshot.optimisticStateQueue };
}

/**
* Resets the cache, clearing all entries.
*/
async reset(): Promise<void> {
// Random line to get ts/tslint to shut up.
return this.reset();
}

/**
* Performs a set of operations against the cache as a single transaction.
*/
performTransaction(transaction: Transaction): void {
// Random line to get ts/tslint to shut up.
return this.performTransaction(transaction);
}

/**
* Records `transaction` as an optimistic update, identified by `id` so that
* it can later be removed.
*/
recordOptimisticTransaction(transaction: Transaction, id: ChangeId): void {
// Random line to get ts/tslint to shut up.
return this.recordOptimisticTransaction(transaction, id);
}

/**
* Remove an optimistic update from the queue.
*/
removeOptimistic(id: ChangeId): void {
// Random line to get ts/tslint to shut up.
return this.removeOptimistic(id);
}

}
16 changes: 16 additions & 0 deletions src/CacheSnapshot.ts
@@ -0,0 +1,16 @@
import { GraphSnapshot } from './GraphSnapshot';
import { OptimisticUpdateQueue } from './OptimisticUpdateQueue';

/**
* Maintains an immutable, point-in-time view of the cache.
*/
export class CacheSnapshot {
constructor(
/** The base snapshot for this version of the cache. */
public baseline: GraphSnapshot,
/** The optimistic view of this version of this cache (may be base). */
public optimistic: GraphSnapshot,
/** Individual optimistic updates for this version. */
public optimisticStateQueue: OptimisticUpdateQueue,
) {}
}
18 changes: 18 additions & 0 deletions src/Configuration.ts
@@ -0,0 +1,18 @@
import { EntityId } from './schema';

/**
* Configuration used throughout the cache's operation.
*/
export interface Configuration {

/**
* Given a node, determines a _globally unique_ identifier for it to be used
* by the cache.
*
* Generally, any node that is considered to be an entity (domain object) by
* the application should be given an id. All entities are normalized within
* the cache; everything else is not.
*/
entityIdForNode(node: any): EntityId | undefined;

}
44 changes: 44 additions & 0 deletions src/GraphSnapshot.ts
@@ -0,0 +1,44 @@
import { NodeSnapshot } from './NodeSnapshot';
import { NodeId } from './schema';

/**
* Maintains an identity map of all value snapshots that reference into a
* particular version of the graph.
*
* Provides an immutable view into the graph at a point in time.
*/
export class GraphSnapshot {

/**
* @internal
*/
constructor(
// TODO: Profile Object.create(null) vs Map.
private _values = Object.create(null) as { [Key in NodeId]: NodeSnapshot },
) {}

/**
* Retrieves the value identified by `id`.
*/
get(id: NodeId): object | undefined {
const snapshot = this.getSnapshot(id);
return snapshot ? snapshot.node : undefined;
}

/**
* Returns whether `id` exists as a value in the graph.
*/
has(id: NodeId): boolean {
return id in this._values;
}

/**
* Retrieves the snapshot for the value identified by `id`.
*
* @internal
*/
getSnapshot(id: NodeId): NodeSnapshot | undefined {
return this._values[id];
}

}