Add support for cache policies introspection #4210


Conversation


@jonathansamines commented Dec 29, 2020

Adding support for cache policy introspection as described in #4104. I am copying the original issue's problem statement for easy reference.

What problem are you trying to solve?

We are trying to build a general introspection mechanism for the caches of all hapi applications, similar to Spring Boot's actuator endpoints.

So far, by making use of existing public APIs, we have been able to introspect cache policies created by:

Server methods

We can iterate over server.methods and use the exposed .cache property to access each policy's stats and drop() method.

Get existing cache policies tracked by server methods

While it is possible to introspect cache policies from server methods, it is not ideal as we have to implement several safeguards to get it right.

function getMethodName(ns, methodName) {
  const nsf = ns.length ? `${ns}.` : ns;

  return `${nsf}${methodName}`;
}

function getMethodsStats(ns, stats, methods) {
  for (const [methodName, methodFn] of Object.entries(methods)) {
    // server method without namespace
    if (typeof methodFn === 'function') {
      // not all server methods have a cache available
      if (methodFn.cache) {
        stats[getMethodName(ns, methodName)] = methodFn.cache.stats;
      }
    } else {
      // namespaced server methods
      getMethodsStats(getMethodName(ns, methodName), stats, methodFn);
    }
  }

  return stats;
}
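To make the traversal concrete, here is a self-contained sketch of the two helpers above run against a small stand-in for server.methods — the stub's shape is hypothetical; a real hapi server populates server.methods via server.method():

```javascript
// Helpers from above, unchanged.
function getMethodName(ns, methodName) {
  const nsf = ns.length ? `${ns}.` : ns;
  return `${nsf}${methodName}`;
}

function getMethodsStats(ns, stats, methods) {
  for (const [methodName, methodFn] of Object.entries(methods)) {
    if (typeof methodFn === 'function') {
      // not all server methods have a cache available
      if (methodFn.cache) {
        stats[getMethodName(ns, methodName)] = methodFn.cache.stats;
      }
    } else {
      // namespaced server methods
      getMethodsStats(getMethodName(ns, methodName), stats, methodFn);
    }
  }
  return stats;
}

// Hypothetical stand-in: one uncached method and one cached method
// inside a namespace.
const methods = {
  add: (a, b) => a + b,
  geo: {
    lookup: Object.assign(() => null, {
      cache: { stats: { hits: 3, misses: 1 } }
    })
  }
};

const stats = getMethodsStats('', {}, methods);
console.log(stats); // { 'geo.lookup': { hits: 3, misses: 1 } }
```

Only methods that actually carry a .cache show up in the result, which is exactly why the safeguards above are needed.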
Get a particular server method caching policy
const method = Hoek.reach(server.methods, 'segment-name');

if (method && method.cache) {
  return method.cache.drop('key');
}
Plugins

Unfortunately, a large share of an application's cache policies often comes from external plugins (e.g. yar), and while some of them expose their cache policies, it is not practical to expect each plugin author to expose their policies and to manually keep track of them.

Changes

What changes are we introducing?

The framework now keeps track of all cache policies created by using server.cache and exposes them through a single server.caches property.

const cache = server.cache({ segment: 'countries', expiresIn: 60 * 60 * 1000 });
const policy = server.caches._default.countries;

assert(cache === policy);

Note: The main difference from the proposal described in #4104 is that segments are not directly available, as I hadn't originally considered that segments can be reused across cache provisions.

Note 2: As you can see, when the default cache is provisioned, it is available through the _default identifier.
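The segment-reuse caveat in the first note can be illustrated without hapi at all; the policy objects below are plain stand-ins, not real catbox policies:

```javascript
// Illustrative only: a registry keyed by segment name silently loses
// policies when two provisions reuse the same segment (the shared case).
const segments = {};

function register(policy) {
  segments[policy.segment] = policy; // a second registration overwrites the first
}

const policyA = { segment: 'test', expiresIn: 1000 };
const policyB = { segment: 'test', expiresIn: 2000 };

register(policyA);
register(policyB);

console.log(segments.test === policyB); // true: policyA is no longer tracked
```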

With that API in place, we can now create some API endpoints to introspect the cache policies:

List all cache policy stats
server.route({
  method: 'GET',
  path: '/actuator/cache',
  handler() {
    const stats = {};

    for (const [cacheName, cache] of Object.entries(server.caches)) {
      stats[cacheName] = {};
      for (const [segmentName, policy] of Object.entries(cache.segments)) {
        stats[cacheName][segmentName] = policy.stats;
      }
    }

    return stats;
  }
});
Get a single cache policy stats
server.route({
  method: 'GET',
  path: '/actuator/cache/{cacheName}/{segment}',
  handler(request) {
    const policy = server.caches?.[request.params.cacheName]?.segments[request.params.segment];

    if (policy) {
      return policy.stats;
    }

    throw Boom.notFound('cache policy segment or key not found');
  }
});
Drop a cache key within a cache policy
server.route({
  method: 'DELETE',
  path: '/actuator/cache/{cacheName}/{segment}/{key}',
  async handler(request, h) {
    const policy = server.caches?.[request.params.cacheName]?.segments[request.params.segment];

    if (policy) {
      await policy.drop(request.params.key);
      return h.response().code(204);
    }

    throw Boom.notFound('cache policy segment or key not found');
  }
});

Please let me know your thoughts!

API.md Outdated
const server = Hapi.server({ port: 80 });
const cache = server.cache({ segment: 'countries', expiresIn: 60 * 60 * 1000 });

console.log(server.caches['_default'].segments.countries === cache); // true
Member

I believe this is the first time that the name of the default cache has been public. If we want to give it a different name or expose a symbol for it, now might be a good time.

Member

@hapijs/tsc any thoughts on that?

Contributor

I agree with @devinivy here.

Contributor

I'm fine with _default. It is already a reserved name since you can't provision a cache with such a name.

Member

In my opinion the underscore doesn't just imply that it's internal, but that it's actually private.

Contributor Author

So, per @kanongil's comment, renaming _default may actually be a breaking change, because an application could already have such a name in use. Using a symbol might be a way to avoid that.

@devinivy I am assuming your suggestion would be to use an internal symbol, which we would have to expose in order to somehow "serialize" its name, right?

Member

I believe that's what @devinivy meant.

@devinivy
Member

I like how this came out, and I support landing it as a feature. Thanks for the complete contribution and laying out all the details!

Contributor

@kanongil left a comment

It definitely makes sense to expose the created policies. I would do it in a slightly different way, as per my comments.

As for the implementation, I skipped the review for now. But I can say that it is faulty, since it does not expose all policies when multiple are created for the same segment (when shared is enabled).

@kanongil
Contributor

kanongil commented Jan 4, 2021

Instead of exposing the cache provisions with their associated policies, this would be simpler if it were just exposed as an array of created policies. The same information would be exposed, since each policy includes a reference to the provisioned cache client. Maybe at server.cache.policies?

@jonathansamines
Contributor Author

@kanongil I think that would work as long as we also expose the cache name associated with each policy. We'll need that to disambiguate between policy objects created by different shared caches. Perhaps by exposing an additional property, e.g. policy.cache?

@jonathansamines
Contributor Author

Considering the feedback received, I am thinking a better API would be:

  const server = Hapi.server();
  await server.cache.provision({ provider: CatboxMemory, name: 'common', shared: true });
  const cacheA = server.cache({ cache: 'common', segment: 'test', expiresIn: 1000 });
  const cacheB = server.cache({ cache: 'common', segment: 'test', expiresIn: 2000 });
  
  const [policyA, policyB] = server.cache.policies.filter((policy) => policy.segment === 'test' && policy.cache === 'common');

  console.log(policyA === cacheA);  // true
  console.log(policyB === cacheB);  // true

That would solve the following problems:

  • We no longer expose provision information. That may be a good thing, as provisions were not previously exposed in any way, while policies were always returned by server.cache().
  • Would account for the cases where segments are shared. With the previous iteration, a shared cache would result in only the last policy for a segment being tracked.

The new API will require the following changes:

  • Update catbox to rename the internal _segment property to segment, thereby exposing it as part of its public API
  • Extend the policies returned by catbox with a new cache property. This would allow disambiguating different provisions with the same segment name

We still have to decide whether or not we want to rename the default cache. For now, I am more inclined to keep the current _default name, as it would allow this change to be released without breaking changes.

Thoughts?

@kanongil
Contributor

kanongil commented Jan 6, 2021

> @kanongil I think that would work as long as we also expose the cache name associated with each policy. We'll need that to disambiguate between policy objects created by different shared caches. Perhaps by exposing an additional property, e.g. policy.cache?

Ah, yes the name would not be exposed (which would actually make the _default naming issue a moot point).

I would probably add an optional name property to the policy options and expose it through a .name property. If we map the "_default" name to undefined, we can also avoid naming it.

Alternatively, we would need a way to map clients to names. But this is impossible, since multiple provisions could share the same client.

@jonathansamines
Contributor Author

jonathansamines commented Jan 7, 2021

@kanongil I am thinking it would also make sense to update Client with the optional .name property instead of Policy. Since Policy already has a public property to access Client (.client) that could be enough for our needs.

It also would accommodate nicely the existing model where we name provisions (clients). But now, catbox would also have access to those names.

@jonathansamines
Contributor Author

Created hapijs/catbox#236 to add the missing public APIs on catbox.

@kanongil
Contributor

kanongil commented Jan 7, 2021

I started thinking about how this tracking fits into lifecycle management, and I am afraid there is bad news for your efforts.

Fundamentally, there is no lifecycle management for policies generated from server.cache(), and they can in principle be used ephemerally. Once a consumer is done with a policy, it just needs to dereference it. As such, remembering these inside the server (in server.cache.policies) can lead to memory leaks, and should not be done.

So how do we enable your use case? This would need to be done using a configurable hook, which is called whenever a policy is generated, with the appropriate data (policy, cacheName, segment). An event seems suitable for this, but policies can be provisioned at any point after server creation, so registering a listener in a plugin might miss some early ones. Alternatively, it could be exposed as a callback server option, but that would make plugin integration very cumbersome.

Regardless of how this is triggered, your system would just listen for this event/callback and do its own tracking (preferably using WeakRefs and finalizers, which are supported since node v12.4).

How does this sound? Or am I overthinking it?

@jonathansamines
Contributor Author

> Fundamentally, there is no lifecycle management for policies generated from server.cache(), and they can in principle be used ephemerally. Once a consumer is done with a policy, it just needs to dereference it. As such, remembering these inside the server (in server.cache.policies) can lead to memory leaks, and should not be done.

Right, would that only be a problem if an application creates an indefinite number of policies (say, by creating policies on demand), or would normal usage present the same problem? Could the same be said of cache provisions?

> Regardless of how this is triggered, your system would just listen for this event/callback and do its own tracking (preferably using WeakRefs and finalizers, which are supported since node v12.4).

I am not quite familiar with WeakRefs yet, but if I can safely implement this feature at the plugin level by using events and WeakRefs, is it possible to implement the same at the framework level?

Finally, how about tracking policies as an opt-in feature? I am thinking a server option would work well to enable this kind of functionality.

@kanongil
Contributor

> Fundamentally, there is no lifecycle management for policies generated from server.cache(), and they can in principle be used ephemerally. Once a consumer is done with a policy, it just needs to dereference it. As such, remembering these inside the server (in server.cache.policies) can lead to memory leaks, and should not be done.

> Right, would that only be a problem if an application creates an indefinite number of policies (say, by creating policies on demand), or would normal usage present the same problem? Could the same be said of cache provisions?

For server method caching, I guess the lifetime follows the server, so this only applies to server.cache() policies. And yes, here it is possible to create on-demand policies. I don't know of any ephemeral use of this, but it is something we implicitly allow.

> Regardless of how this is triggered, your system would just listen for this event/callback and do its own tracking (preferably using WeakRefs and finalizers, which are supported since node v12.4).

> I am not quite familiar with WeakRefs yet, but if I can safely implement this feature at the plugin level by using events and WeakRefs, is it possible to implement the same at the framework level?

We could do this, but adding somewhat complicated logic to track something that we don't use internally seems less than ideal. Besides, even applied correctly, it won't perfectly track the usage, and expired policies could linger for some time.

As I propose, you would be responsible for the tracking, and you could choose to track the lifetimes, or just add a condition to using your tracking and require consumers not to use this kind of (rarely used?) ephemeral policy. Or live with the leaks, depending on how serious they are.

> Finally, how about tracking policies as an opt-in feature? I am thinking a server option would work well to enable this kind of functionality.

I don't think we want to do this in hapi itself. We presently lack a mechanism for plugins to track the usage, but this will be fixed if we add the proposed event. E.g.:

server.events.on('policy', (policy, cacheName, segment) => {
    // track the policy here
});
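Assuming the proposed 'policy' event lands (it is only sketched above, not an existing hapi API), a plugin-side tracker could hold policies through WeakRef so that tracking never keeps an ephemeral policy alive. A minimal sketch, with FinalizationRegistry cleaning up entries after collection:

```javascript
// Sketch: userland policy tracking with weak references.
// Keyed by 'cacheName/segment'; a real tracker would also need to
// handle multiple policies per segment (the shared case discussed above).
const tracked = new Map(); // key -> WeakRef to the policy

const registry = new FinalizationRegistry((key) => {
  tracked.delete(key); // runs some time after the policy is collected
});

function trackPolicy(policy, cacheName, segment) {
  const key = `${cacheName}/${segment}`;
  tracked.set(key, new WeakRef(policy));
  registry.register(policy, key);
}

function getPolicy(key) {
  const ref = tracked.get(key);
  return ref && ref.deref(); // undefined once the policy is collected
}

// Hypothetical wiring, if the event were added:
// server.events.on('policy', trackPolicy);
```

Note that FinalizationRegistry callbacks are not prompt, so dead entries may linger briefly — consistent with the caveat that this kind of tracking won't be perfectly accurate.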

@jonathansamines
Contributor Author

@kanongil That sounds good enough for my use case. Will give the implementation a try.

@jonathansamines
Contributor Author

Added policy event in #4219

@Nargonath
Member

Nargonath commented Jan 15, 2021

It seems the changes proposed in this PR won't be implemented, and a new PR has been created to provide an event that allows cache policy tracking in userland. Shouldn't we close this PR in favor of #4219, @jonathansamines?

@jonathansamines jonathansamines deleted the feature/cache-policy-instrospection branch August 19, 2021 03:34