Implement InMemoryCache garbage collection and eviction. #5310

Merged 11 commits on Sep 13, 2019
Implement mark-and-sweep garbage collection for EntityCache.
This implementation currently requires calling cache.gc() manually, since
the timing of garbage collection is subject to developer taste.
benjamn committed Sep 12, 2019
commit a5ee5946e51e7a24627050ae7a3a96bf25d56ce3
125 changes: 110 additions & 15 deletions src/cache/inmemory/entityCache.ts
@@ -1,5 +1,6 @@
import { NormalizedCache, NormalizedCacheObject, StoreObject } from './types';
import { wrap, OptimisticWrapperFunction } from 'optimism';
import { isReference } from './helpers';

const hasOwn = Object.prototype.hasOwnProperty;

@@ -31,18 +32,18 @@ export abstract class EntityCache implements NormalizedCache {
}

public abstract addLayer(
- id: string,
+ layerId: string,
replay: (layer: EntityCache) => any,
): EntityCache;

- public abstract removeLayer(id: string): EntityCache;
+ public abstract removeLayer(layerId: string): EntityCache;

// Although the EntityCache class is abstract, it contains concrete
// implementations of the various NormalizedCache interface methods that
// are inherited by the Root and Layer subclasses.

public toObject(): NormalizedCacheObject {
- return this.data;
+ return { ...this.data };
Member Author

@benjamn benjamn Sep 12, 2019


Because the StoreWriter class updates entities non-destructively, this shallow copy of this.data is sufficient to provide a complete, immutable snapshot of the cache.
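
A minimal sketch of why a shallow copy is enough when writes are non-destructive. The `Snapshot` type and the `Book:1` / `Book:2` IDs are hypothetical stand-ins for illustration, not the actual cache types:

```typescript
// Hypothetical stand-ins for the cache's entity map; not the real Apollo types.
type StoreObject = { [field: string]: unknown };
type Snapshot = { [dataId: string]: StoreObject };

const data: Snapshot = {
  "Book:1": { title: "Dune" },
};

// Shallow copy: a new top-level object that shares the entity objects.
const snapshot: Snapshot = { ...data };

// Non-destructive writes: replace the entity object instead of mutating it.
data["Book:1"] = { ...data["Book:1"], title: "Dune Messiah" };
data["Book:2"] = { title: "Emma" };

// The snapshot still sees the old entity object and no new IDs.
const snapshotTitle = snapshot["Book:1"].title;
const snapshotHasBook2 = "Book:2" in snapshot;
```

If any code mutated an entity object in place, the snapshot would observe the change, so the guarantee depends entirely on the writer never doing that.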

}

public get(dataId: string): StoreObject {
@@ -53,12 +54,16 @@ export abstract class EntityCache implements NormalizedCache {
public set(dataId: string, value: StoreObject): void {
if (!hasOwn.call(this.data, dataId) || value !== this.data[dataId]) {
this.data[dataId] = value;
+ delete this.refs[dataId];
if (this.depend) this.depend.dirty(dataId);
}
}

public delete(dataId: string): void {
- this.data[dataId] = void 0;
+ if (this instanceof Layer) {
+ this.data[dataId] = void 0;
+ } else delete this.data[dataId];
+ delete this.refs[dataId];
if (this.depend) this.depend.dirty(dataId);
}

@@ -78,6 +83,78 @@ export abstract class EntityCache implements NormalizedCache {
});
}
}

private rootIds: {
[rootId: string]: Set<object>;
} = Object.create(null);

public retain(rootId: string, owner: object): void {
(this.rootIds[rootId] || (this.rootIds[rootId] = new Set<object>())).add(owner);
}

public release(rootId: string, owner: object): void {
const owners = this.rootIds[rootId];
if (owners && owners.delete(owner) && !owners.size) {
delete this.rootIds[rootId];
}
}

// This method will be overridden in the Layer class to merge root IDs for all
// layers (including the root).
public getRootIdSet() {
return new Set(Object.keys(this.rootIds));
}

// The goal of garbage collection is to remove IDs from the Root layer of the
// cache that are no longer reachable starting from any IDs that have been
// explicitly retained (see retain and release, above). Returns an array of
// dataId strings that were removed from the cache.
public gc() {
const ids = this.getRootIdSet();
const snapshot = this.toObject();
ids.forEach(id => {
Contributor


This method for creating a stack is just so cool

Member Author


I agree!!
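
For readers unfamiliar with the trick, here is a small standalone sketch of using `Set.prototype.forEach` as a worklist. The `graph` object and `reachableFrom` helper are hypothetical; the pattern relies on the standard behavior that values added to a Set during `forEach` are visited later in the same loop, in insertion order (so it behaves more like a queue than a stack):

```typescript
// Hypothetical reference graph: each ID maps to the IDs it references.
const graph: Record<string, string[]> = {
  ROOT_QUERY: ["Book:1"],
  "Book:1": ["Author:1"],
  "Author:1": ["Book:1"], // cycle back to Book:1
  "Book:2": [],           // not referenced by anything
};

function reachableFrom(roots: string[]): Set<string> {
  const ids = new Set(roots);
  ids.forEach(id => {
    // IDs added here are visited by later iterations of this same forEach.
    // Re-adding an ID already in the Set is a no-op, so cycles terminate.
    (graph[id] || []).forEach(child => ids.add(child));
  });
  return ids;
}

const reachable = reachableFrom(["ROOT_QUERY"]);
```

Here `reachable` ends up containing `ROOT_QUERY`, `Book:1`, and `Author:1`, while `Book:2` is never visited.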

if (hasOwn.call(snapshot, id)) {
// Because we are iterating over an ECMAScript Set, the IDs we add here
// will be visited in later iterations of the forEach loop only if they
// were not previously contained by the Set.
Object.keys(this.findChildIds(id)).forEach(ids.add, ids);
// By removing IDs from the snapshot object here, we protect them from
// getting removed from the root cache layer below.
delete snapshot[id];
}
});
const idsToRemove = Object.keys(snapshot);
if (idsToRemove.length) {
let root: EntityCache = this;
while (root instanceof Layer) root = root.parent;
Contributor


Ah, so this works up the layers to delete ids

Member Author

@benjamn benjamn Sep 12, 2019


The way I think about it:

  • Optimistic layers are temporary, so we don't really need to worry about garbage collecting them, since they should be totally removed sometime soon.
  • The goal of garbage collection, then, is to remove unreachable IDs from the Root layer of the cache.
  • The gc() method might have been called against a Layer object, which is important because optimistic Layers can retain otherwise unreachable entities.
    • If the layers didn't matter for garbage collection, we could just skip to the root at the beginning of the gc() method.
    • Instead, we skip to the root at the end of the gc() method to perform the final deletions of unreachable entities. That's what this code does.
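
The points above can be sketched with a heavily simplified model. `SketchStore` and its members are hypothetical and are not the actual `EntityCache` API; the sketch only illustrates marking through all layers and sweeping the root alone:

```typescript
// Hypothetical simplified store: each dataId maps to the IDs it references.
type Refs = Record<string, string[]>;

class SketchStore {
  public data: Refs = {};
  public roots = new Set<string>();
  constructor(public readonly parent: SketchStore | null = null) {}

  // Retained root IDs from this store and every store beneath it.
  rootIds(): Set<string> {
    const ids = this.parent ? this.parent.rootIds() : new Set<string>();
    this.roots.forEach(id => ids.add(id));
    return ids;
  }

  // Child IDs visible from this store: the union across all layers.
  childIds(id: string): string[] {
    const fromParent = this.parent ? this.parent.childIds(id) : [];
    return fromParent.concat(this.data[id] || []);
  }

  gc(): string[] {
    // Mark phase: starts from the store gc() was called on, so layers count.
    const reachable = this.rootIds();
    reachable.forEach(id => this.childIds(id).forEach(c => reachable.add(c)));
    // Sweep phase: skip down to the root and delete only there.
    let root: SketchStore = this;
    while (root.parent) root = root.parent;
    const removed = Object.keys(root.data).filter(id => !reachable.has(id));
    removed.forEach(id => { delete root.data[id]; });
    return removed;
  }
}

const rootStore = new SketchStore();
rootStore.data = { ROOT_QUERY: ["Book:1"], "Book:1": [], "Book:2": [], "Book:3": [] };
rootStore.roots.add("ROOT_QUERY");

// An optimistic layer whose version of ROOT_QUERY also references Book:2.
const layer = new SketchStore(rootStore);
layer.data = { ROOT_QUERY: ["Book:2"] };

const removedViaLayer = layer.gc();    // Book:2 is protected by the layer
const removedViaRoot = rootStore.gc(); // without the layer, Book:2 is unreachable
```

Calling `gc()` on the layer removes only `Book:3`; calling it on the root afterwards also removes `Book:2`, which shows why the starting layer matters for the mark phase.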

idsToRemove.forEach(root.delete, root);
Member Author

@benjamn benjamn Sep 12, 2019


By the way, this idiom is equivalent to

idsToRemove.forEach(id => root.delete(id))

except that it does not allocate a new function object.
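
A short standalone sketch of the idiom, using a hypothetical `Store` class rather than the real cache. The second argument to `forEach` becomes `this` inside the callback, and the method must tolerate the extra `(index, array)` arguments that `forEach` passes:

```typescript
// Hypothetical class for illustration only.
class Store {
  private data: Record<string, number> = { a: 1, b: 2, c: 3 };
  delete(id: string): void {
    delete this.data[id];
  }
  size(): number {
    return Object.keys(this.data).length;
  }
}

const store = new Store();

// `store` is passed as thisArg, so `this` inside delete refers to the store.
["a", "b"].forEach(store.delete, store);

// Without the thisArg, `this.data` is not the store's data and the call throws.
let threwWithoutThisArg = false;
try {
  ["c"].forEach(store.delete);
} catch (_err) {
  threwWithoutThisArg = true;
}
```

After this runs, only `c` remains in the store, because the unbound call failed before deleting anything.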

}
return idsToRemove;
}

// Lazily tracks { __ref: <dataId> } strings contained by this.data[dataId].
private refs: {
[dataId: string]: Record<string, true>;
} = Object.create(null);

public findChildIds(dataId: string): Record<string, true> {
if (!hasOwn.call(this.refs, dataId)) {
const found = this.refs[dataId] = Object.create(null);
// Use the little-known replacer function API of JSON.stringify to find
// { __ref } objects quickly and without a lot of traversal code.
JSON.stringify(this.data[dataId], (_key, value) => {
if (isReference(value)) {
found[value.__ref] = true;
} else if (value && typeof value === "object") {
// Returning the value allows the traversal to continue, which is
// necessary only when the value could contain other values that might
// be reference objects.
return value;
}
});
}
return this.refs[dataId];
}
}
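
The `JSON.stringify` replacer technique used by `findChildIds` can be tried in isolation. This is a sketch under assumptions: `isReference` below is a hypothetical re-implementation of the helper imported from `./helpers`, and `findRefs` is an illustrative name, not part of the PR:

```typescript
// Hypothetical reference shape and type guard, mirroring { __ref } objects.
interface Reference { __ref: string }

function isReference(value: unknown): value is Reference {
  return typeof value === "object" && value !== null &&
    typeof (value as { __ref?: unknown }).__ref === "string";
}

function findRefs(entity: unknown): Record<string, true> {
  const found: Record<string, true> = Object.create(null);
  // The serialized string is discarded; only the traversal matters.
  JSON.stringify(entity, (_key, value) => {
    if (isReference(value)) {
      found[value.__ref] = true;
      return undefined; // record the reference, do not descend into it
    }
    // Returning objects and arrays lets the traversal continue into them.
    if (value && typeof value === "object") return value;
    return undefined; // primitives cannot contain references
  });
  return found;
}

const refs = Object.keys(findRefs({
  title: "Dune",
  author: { __ref: "Author:1" },
  related: [{ __ref: "Book:2" }, { nested: { __ref: "Book:3" } }],
})).sort();
```

Here `refs` contains `Author:1`, `Book:2`, and `Book:3`, including the reference nested inside an array element.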

export namespace EntityCache {
@@ -107,14 +184,14 @@ export namespace EntityCache {
}

public addLayer(
- id: string,
+ layerId: string,
replay: (layer: EntityCache) => any,
): EntityCache {
// The replay function will be called in the Layer constructor.
- return new Layer(id, this, replay, this.sharedLayerDepend);
+ return new Layer(layerId, this, replay, this.sharedLayerDepend);
}

- public removeLayer(): Root {
+ public removeLayer(layerId: string): Root {
// Never remove the root layer.
return this;
}
@@ -125,27 +202,27 @@ export namespace EntityCache {
// of the EntityCache.Root class.
class Layer extends EntityCache {
Contributor


This is probably out of scope for this PR, but I think it would be quite helpful to write a set of comments on the architecture of the cache and its layers.

constructor(
- private id: string,
- private parent: EntityCache,
- private replay: (layer: EntityCache) => any,
+ public readonly id: string,
+ public readonly parent: Layer | EntityCache.Root,
+ public readonly replay: (layer: EntityCache) => any,
public readonly depend: DependType,
) {
super();
replay(this);
}

public addLayer(
- id: string,
+ layerId: string,
replay: (layer: EntityCache) => any,
): EntityCache {
- return new Layer(id, this, replay, this.depend);
+ return new Layer(layerId, this, replay, this.depend);
}

- public removeLayer(id: string): EntityCache {
+ public removeLayer(layerId: string): EntityCache {
// Remove all instances of the given id, not just the first one.
- const parent = this.parent.removeLayer(id);
+ const parent = this.parent.removeLayer(layerId);

- if (id === this.id) {
+ if (layerId === this.id) {
// Dirty every ID we're removing.
// TODO Some of these IDs could escape dirtying if value unchanged.
if (this.depend) {
@@ -168,6 +245,8 @@ class Layer extends EntityCache {
};
}

// All the other inherited accessor methods work as-is, but the get method
// needs to fall back to this.parent.get when accessing a missing dataId.
public get(dataId: string): StoreObject {
if (hasOwn.call(this.data, dataId)) {
return super.get(dataId);
@@ -183,6 +262,22 @@ class Layer extends EntityCache {
}
return this.parent.get(dataId);
}

// Return a Set<string> of all the ID strings that have been retained by this
// Layer *and* any layers/roots beneath it.
public getRootIdSet(): Set<string> {
const ids = this.parent.getRootIdSet();
super.getRootIdSet().forEach(ids.add, ids);
return ids;
}

public findChildIds(dataId: string): Record<string, true> {
const fromParent = this.parent.findChildIds(dataId);
return hasOwn.call(this.data, dataId) ? {
...fromParent,
...super.findChildIds(dataId),
} : fromParent;
}
}

export function supportsResultCaching(store: any): store is EntityCache {
5 changes: 5 additions & 0 deletions src/cache/inmemory/inMemoryCache.ts
@@ -175,6 +175,11 @@ export class InMemoryCache extends ApolloCache<NormalizedCacheObject> {
};
}

// Request garbage collection of unreachable normalized entities.
public gc() {
return this.optimisticData.gc();
}

public evict(query: Cache.EvictOptions): Cache.EvictionResult {
throw new InvariantError(`eviction is not implemented on InMemory Cache`);
}
5 changes: 5 additions & 0 deletions src/cache/inmemory/readFromStore.ts
@@ -223,6 +223,11 @@ export class StoreReader {
cacheRedirects: (config && config.cacheRedirects) || {},
};

// Any IDs read explicitly from the cache (including ROOT_QUERY, most
// frequently) will be retained as reachable root IDs on behalf of their
// owner DocumentNode objects, until/unless evicted for all owners.
store.retain(rootId, query);

const execResult = this.executeStoreQuery({
query,
objectOrReference: rootId === 'ROOT_QUERY'
8 changes: 8 additions & 0 deletions src/cache/inmemory/types.ts
@@ -30,6 +30,14 @@ export interface NormalizedCache {
* replace the state of the store
*/
replace(newData: NormalizedCacheObject): void;

/**
* Retain or release a given root ID on behalf of a specific "owner" object.
* During garbage collection, retained root IDs with one or more owners are
* considered immediately reachable. A single owner object counts only once.
*/
retain(rootId: string, owner: object): void;
release(rootId: string, owner: object): void;
}
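
The retain/release contract described in that doc comment can be illustrated with a small standalone sketch. `RootIdRegistry` and `isRetained` are hypothetical names for illustration; the bodies of `retain` and `release` follow the `EntityCache` implementation in this commit:

```typescript
// Hypothetical standalone registry; the real methods live on EntityCache.
class RootIdRegistry {
  private rootIds: Record<string, Set<object>> = Object.create(null);

  retain(rootId: string, owner: object): void {
    (this.rootIds[rootId] || (this.rootIds[rootId] = new Set<object>())).add(owner);
  }

  release(rootId: string, owner: object): void {
    const owners = this.rootIds[rootId];
    // Drop the entry entirely once its last owner is released.
    if (owners && owners.delete(owner) && !owners.size) {
      delete this.rootIds[rootId];
    }
  }

  // Illustrative helper, not part of the NormalizedCache interface.
  isRetained(rootId: string): boolean {
    return rootId in this.rootIds;
  }
}

const registry = new RootIdRegistry();
const queryA = {};
const queryB = {};

registry.retain("ROOT_QUERY", queryA);
registry.retain("ROOT_QUERY", queryA); // same owner again: counted only once
registry.retain("ROOT_QUERY", queryB);

registry.release("ROOT_QUERY", queryA);
const retainedAfterFirstRelease = registry.isRetained("ROOT_QUERY");

registry.release("ROOT_QUERY", queryB);
const retainedAfterAllReleased = registry.isRetained("ROOT_QUERY");
```

Because owners are stored in a Set, retaining twice with `queryA` needs only one release, and the root ID stays retained until `queryB` is released as well.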

/**
6 changes: 6 additions & 0 deletions src/cache/inmemory/writeToStore.ts
@@ -76,6 +76,12 @@ export class StoreWriter {
dataIdFromObject?: IdGetter;
}): NormalizedCache {
const operationDefinition = getOperationDefinition(query)!;

// Any IDs written explicitly to the cache (including ROOT_QUERY, most
// frequently) will be retained as reachable root IDs on behalf of their
// owner DocumentNode objects, until/unless evicted for all owners.
store.retain(dataId, query);

return this.writeSelectionSetToStore({
result,
dataId,