
Slowlog loggers do not seem to be cleaned up when no longer needed #56171

Closed
@DaveCTurner

Description


Investigating https://discuss.elastic.co/t/index-creation-slows-down-over-time/230775, I wrote a script that repeatedly creates and deletes indices and left it running for several hours against a 7.6.2 cluster. By the end, each iteration had indeed slowed down as reported, from around 6s to around 10s.
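
For reference, the loop was essentially of this shape (a minimal sketch using the low-level Java REST client against a single node on localhost:9200; the index name and timing output are illustrative, not the exact script I ran):

    import org.apache.http.HttpHost;
    import org.elasticsearch.client.Request;
    import org.elasticsearch.client.RestClient;

    public class CreateDeleteLoop {
        public static void main(String[] args) throws Exception {
            try (RestClient client = RestClient.builder(new HttpHost("localhost", 9200, "http")).build()) {
                for (int i = 0; ; i++) {
                    long start = System.nanoTime();
                    // Create and immediately delete an index; with the behaviour reported
                    // here the create step gets slower as leaked slowlog loggers accumulate.
                    client.performRequest(new Request("PUT", "/create-delete-test"));
                    client.performRequest(new Request("DELETE", "/create-delete-test"));
                    System.out.println("iteration " + i + " took "
                        + (System.nanoTime() - start) / 1_000_000 + "ms");
                }
            }
        }
    }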

The hot threads API often reported a substantial amount of time spent manipulating loggers:

   20.3% (101.5ms out of 500ms) cpu usage by thread 'elasticsearch[node-0][clusterApplierService#updateTask][T#1]'
     2/10 snapshots sharing following 25 elements
       java.base@14/java.util.Collections$UnmodifiableCollection$1.hasNext(Collections.java:1048)
       app//org.elasticsearch.common.logging.Loggers.setLevel(Loggers.java:127)
       app//org.elasticsearch.common.logging.Loggers.setLevel(Loggers.java:111)
       app//org.elasticsearch.index.SearchSlowLog.setLevel(SearchSlowLog.java:127)
       app//org.elasticsearch.index.SearchSlowLog.<init>(SearchSlowLog.java:123)
       app//org.elasticsearch.index.IndexModule.<init>(IndexModule.java:151)
       app//org.elasticsearch.indices.IndicesService.createIndexService(IndicesService.java:595)
       app//org.elasticsearch.indices.IndicesService.createIndex(IndicesService.java:542)
       app//org.elasticsearch.indices.IndicesService.createIndex(IndicesService.java:173)
       app//org.elasticsearch.indices.cluster.IndicesClusterStateService.createIndices(IndicesClusterStateService.java:484)
       app//org.elasticsearch.indices.cluster.IndicesClusterStateService.applyClusterState(IndicesClusterStateService.java:246)
       app//org.elasticsearch.cluster.service.ClusterApplierService.lambda$callClusterStateAppliers$5(ClusterApplierService.java:517)
       app//org.elasticsearch.cluster.service.ClusterApplierService$$Lambda$4041/0x000000080175d440.accept(Unknown Source)
       java.base@14/java.lang.Iterable.forEach(Iterable.java:75)
       app//org.elasticsearch.cluster.service.ClusterApplierService.callClusterStateAppliers(ClusterApplierService.java:514)
       app//org.elasticsearch.cluster.service.ClusterApplierService.applyChanges(ClusterApplierService.java:485)
       app//org.elasticsearch.cluster.service.ClusterApplierService.runTask(ClusterApplierService.java:432)
       app//org.elasticsearch.cluster.service.ClusterApplierService.access$100(ClusterApplierService.java:73)
       app//org.elasticsearch.cluster.service.ClusterApplierService$UpdateTask.run(ClusterApplierService.java:176)
       app//org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:633)
       app//org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.runAndClean(PrioritizedEsThreadPoolExecutor.java:252)
       app//org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedEsThreadPoolExecutor.java:215)
       java.base@14/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130)
       java.base@14/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:630)
       java.base@14/java.lang.Thread.run(Thread.java:832)
     2/10 snapshots sharing following 24 elements
       app//org.elasticsearch.common.logging.Loggers.setLevel(Loggers.java:127)
       app//org.elasticsearch.common.logging.Loggers.setLevel(Loggers.java:111)
       app//org.elasticsearch.index.SearchSlowLog.setLevel(SearchSlowLog.java:128)
       app//org.elasticsearch.index.SearchSlowLog.<init>(SearchSlowLog.java:123)
       app//org.elasticsearch.index.IndexModule.<init>(IndexModule.java:151)
       app//org.elasticsearch.indices.IndicesService.createIndexService(IndicesService.java:595)
       app//org.elasticsearch.indices.IndicesService.createIndex(IndicesService.java:542)
       app//org.elasticsearch.indices.IndicesService.createIndex(IndicesService.java:173)
       app//org.elasticsearch.indices.cluster.IndicesClusterStateService.createIndices(IndicesClusterStateService.java:484)
       app//org.elasticsearch.indices.cluster.IndicesClusterStateService.applyClusterState(IndicesClusterStateService.java:246)
       app//org.elasticsearch.cluster.service.ClusterApplierService.lambda$callClusterStateAppliers$5(ClusterApplierService.java:517)
       app//org.elasticsearch.cluster.service.ClusterApplierService$$Lambda$4041/0x000000080175d440.accept(Unknown Source)
       java.base@14/java.lang.Iterable.forEach(Iterable.java:75)
       app//org.elasticsearch.cluster.service.ClusterApplierService.callClusterStateAppliers(ClusterApplierService.java:514)
       app//org.elasticsearch.cluster.service.ClusterApplierService.applyChanges(ClusterApplierService.java:485)
       app//org.elasticsearch.cluster.service.ClusterApplierService.runTask(ClusterApplierService.java:432)
       app//org.elasticsearch.cluster.service.ClusterApplierService.access$100(ClusterApplierService.java:73)
       app//org.elasticsearch.cluster.service.ClusterApplierService$UpdateTask.run(ClusterApplierService.java:176)
       app//org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:633)
       app//org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.runAndClean(PrioritizedEsThreadPoolExecutor.java:252)
       app//org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedEsThreadPoolExecutor.java:215)
       java.base@14/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130)
       java.base@14/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:630)
       java.base@14/java.lang.Thread.run(Thread.java:832)
     3/10 snapshots sharing following 14 elements
       app//org.elasticsearch.cluster.service.ClusterApplierService.lambda$callClusterStateAppliers$5(ClusterApplierService.java:517)
       app//org.elasticsearch.cluster.service.ClusterApplierService$$Lambda$4041/0x000000080175d440.accept(Unknown Source)
       java.base@14/java.lang.Iterable.forEach(Iterable.java:75)
       app//org.elasticsearch.cluster.service.ClusterApplierService.callClusterStateAppliers(ClusterApplierService.java:514)
       app//org.elasticsearch.cluster.service.ClusterApplierService.applyChanges(ClusterApplierService.java:485)
       app//org.elasticsearch.cluster.service.ClusterApplierService.runTask(ClusterApplierService.java:432)
       app//org.elasticsearch.cluster.service.ClusterApplierService.access$100(ClusterApplierService.java:73)
       app//org.elasticsearch.cluster.service.ClusterApplierService$UpdateTask.run(ClusterApplierService.java:176)
       app//org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:633)
       app//org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.runAndClean(PrioritizedEsThreadPoolExecutor.java:252)
       app//org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedEsThreadPoolExecutor.java:215)
       java.base@14/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130)
       java.base@14/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:630)
       java.base@14/java.lang.Thread.run(Thread.java:832)
     3/10 snapshots sharing following 10 elements
       java.base@14/jdk.internal.misc.Unsafe.park(Native Method)
       java.base@14/java.util.concurrent.locks.LockSupport.park(LockSupport.java:341)
       java.base@14/java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionNode.block(AbstractQueuedSynchronizer.java:505)
       java.base@14/java.util.concurrent.ForkJoinPool.managedBlock(ForkJoinPool.java:3137)
       java.base@14/java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1614)
       java.base@14/java.util.concurrent.PriorityBlockingQueue.take(PriorityBlockingQueue.java:548)
       java.base@14/java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1056)
       java.base@14/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1116)
       java.base@14/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:630)
       java.base@14/java.lang.Thread.run(Thread.java:832)

I took a heap dump and observed almost 150,000 loggers of the forms index.indexing.slowlog.<INDEX_UUID>, index.search.slowlog.fetch.<INDEX_UUID> and index.search.slowlog.query.<INDEX_UUID>, in roughly equal numbers of each. There were only 1280 other instances of org.apache.logging.log4j.core.Logger.

I suspect this is a consequence of #47234, which introduced per-index loggers.
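
To illustrate the suspected mechanism outside of Elasticsearch: each index creation registers three slowlog loggers named after the index UUID, and Loggers.setLevel then appears to walk every logger registered in the Log4j LoggerContext (consistent with the Collections$UnmodifiableCollection$1.hasNext frame above). If deleted indices never unregister their loggers, that walk gets longer on every iteration. A rough standalone Log4j 2 sketch of the accumulation (not Elasticsearch code; names and counts are illustrative; needs log4j-api and log4j-core on the classpath):

    import java.util.UUID;

    import org.apache.logging.log4j.LogManager;
    import org.apache.logging.log4j.core.Logger;
    import org.apache.logging.log4j.core.LoggerContext;

    public class SlowlogLoggerAccumulation {
        public static void main(String[] args) {
            LoggerContext context = LoggerContext.getContext(false);
            String[] prefixes = {
                "index.indexing.slowlog.",
                "index.search.slowlog.fetch.",
                "index.search.slowlog.query."
            };
            long batchStart = System.nanoTime();
            for (int index = 1; index <= 20_000; index++) {
                String indexUuid = UUID.randomUUID().toString();
                for (String prefix : prefixes) {
                    String name = prefix + indexUuid;
                    // Registers a logger that stays in the LoggerContext until it is
                    // explicitly cleaned up -- the suspected leak.
                    LogManager.getLogger(name);
                    // Mimic the hierarchy walk done when setting a logger's level: scan
                    // every registered logger looking for children of the one just
                    // configured. The cost of this scan grows with the number of leaked
                    // loggers, so every "index creation" is slower than the last.
                    int children = 0;
                    for (Logger logger : context.getLoggers()) {
                        if (logger.getName().startsWith(name + ".")) {
                            children++; // a real implementation would update the child's level here
                        }
                    }
                }
                if (index % 1_000 == 0) {
                    System.out.println(index + " simulated indices, "
                        + context.getLoggers().size() + " loggers registered, last batch took "
                        + (System.nanoTime() - batchStart) / 1_000_000 + "ms");
                    batchStart = System.nanoTime();
                }
            }
        }
    }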
