Description
Initially the DLS cache was unbounded. This was not carelessness on our part: Lucene assumes that accessing live docs is cheap, so we didn't want it to run in linear time on a cache miss. Unfortunately, the ability to use templated role queries means you can have an unbounded number of bit sets in the cache, so we had to roll back this decision and move to a size-bounded cache in order to avoid out-of-memory errors. We got some feedback that this has considerably slowed down some clusters, and the default of 50MB that this cache may use is way too low.
We should probably move to a number that is more generous and that also depends on the available memory on the node, e.g. 10%?
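To make the proposal concrete, here is a minimal sketch of how a memory-relative budget compares with the fixed 50MB default. The function name `relative_cache_budget` and the sample memory sizes are hypothetical, and which memory figure counts as "available" (for example the JVM heap) is an assumption left to the caller.

```python
MB = 1024 * 1024
FIXED_DEFAULT_BYTES = 50 * MB  # the current fixed default discussed above


def relative_cache_budget(available_bytes: int, percent: int = 10) -> int:
    """Return a cache budget equal to `percent` of the available memory.

    Hypothetical helper: what counts as "available" memory is an
    assumption and must be decided by the caller.
    """
    if available_bytes < 0:
        raise ValueError("available_bytes must be non-negative")
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    # Integer arithmetic avoids floating-point rounding on large sizes.
    return available_bytes * percent // 100


if __name__ == "__main__":
    # Illustrative memory sizes only, not measurements from real clusters.
    for gb in (1, 8, 30):
        budget = relative_cache_budget(gb * 1024 * MB)
        print(
            f"{gb} GB available: {budget // MB} MB budget "
            f"(fixed default: {FIXED_DEFAULT_BYTES // MB} MB)"
        )
```

Under these assumptions, even a node with 1 GB of available memory would get roughly twice the current fixed budget, and the budget grows with the node instead of staying flat.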
I wonder whether we should also decrease the default TTL. I think the TTL is important on this cache because it helps avoid keeping the cache full all the time when some role queries are rarely used, which is probably typical with templated role queries. But the current default of 1 week means that someone running only one query every week would always use memory for this cache that might be better spent elsewhere. What about decreasing this default to maybe a couple of hours?
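The effect of the TTL on a rarely used entry can be estimated with a small back-of-the-envelope sketch. It assumes an expire-after-access TTL, perfectly periodic access, and eviction exactly at expiry; the function name `resident_fraction` is hypothetical, and the 2-hour value is only one reading of "a couple of hours".

```python
def resident_fraction(access_interval_hours: float, ttl_hours: float) -> float:
    """Fraction of time a periodically accessed entry stays in the cache.

    Assumes the TTL restarts on each access and the entry is evicted
    exactly when the TTL elapses (a simplification).
    """
    if access_interval_hours <= 0:
        raise ValueError("access_interval_hours must be positive")
    if ttl_hours < 0:
        raise ValueError("ttl_hours must be non-negative")
    # After each access the entry lives for the TTL, or until the next
    # access restarts the clock, whichever comes first.
    return min(ttl_hours, access_interval_hours) / access_interval_hours


if __name__ == "__main__":
    week = 7 * 24  # hours
    for ttl in (week, 2):
        share = resident_fraction(week, ttl)
        print(f"TTL {ttl}h, one query per week: resident {share:.1%} of the time")
```

With one query per week, a 1-week TTL keeps the entry resident all the time, while a 2-hour TTL would keep it resident for only about 1% of the time under these assumptions.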