Streaming queries are very inefficient #1195

Closed

Description

@bboreham

I noticed high resource usage in ruler and traced it back to a change where I turned on:

   - -querier.batch-iterators=true
   - -querier.ingester-streaming=true

Upon reverting this change, CPU usage went down to a third of what it was, memory to a quarter, and network traffic to a fifth.

Profiling suggests vast amounts of memory being used here:

github.com/cortexproject/cortex/pkg/querier/batch.newMergeIterator
/go/src/github.com/cortexproject/cortex/pkg/querier/batch/merge.go
  Total:      6.78TB     7.29TB (flat, cum) 38.52%
     20            .          .            
     21            .          .           	currErr error 
     22            .          .           } 
     23            .          .            
     24            .          .           func newMergeIterator(cs []chunk.Chunk) *mergeIterator { 
     25            .   151.63GB           	css := partitionChunks(cs) 
     26       5.39GB     5.39GB           	its := make([]*nonOverlappingIterator, 0, len(css)) 
     27            .          .           	for _, cs := range css { 
     28            .   365.93GB           		its = append(its, newNonOverlappingIterator(cs)) 
     29            .          .           	} 
     30            .          .            
     31            .          .           	c := &mergeIterator{ 
     32            .          .           		its:        its, 
     33      10.82GB    10.82GB           		h:          make(iteratorHeap, 0, len(its)), 
     34       3.36TB     3.36TB           		batches:    make(batchStream, 0, len(its)*2*promchunk.BatchSize), 
     35       3.40TB     3.40TB           		batchesBuf: make(batchStream, 0, len(its)*2*promchunk.BatchSize), 
     36            .          .           	} 
     37            .          .            
     38            .          .           	for _, iter := range c.its { 
     39            .          .           		if iter.Next(1) { 
     40            .          .           			c.h = append(c.h, iter) 

I am unclear why those sizes include the *promchunk.BatchSize factor: they are allocating slices of Batch, and each Batch already holds BatchSize samples.
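To make the arithmetic concrete, here is a minimal sketch under the assumption that batchStream is a []promchunk.Batch and that Batch is a fixed-size struct holding BatchSize samples inline; the struct and constant below are simplified stand-ins, not the exact definitions from the repo:

```go
// Sketch of the suspected over-allocation in newMergeIterator.
// Batch here is a simplified stand-in for promchunk.Batch.
package main

import (
	"fmt"
	"unsafe"
)

const BatchSize = 12 // stand-in for promchunk.BatchSize

// Batch already has room for BatchSize samples inline, so a slice of
// Batch does not need its capacity multiplied by BatchSize again.
type Batch struct {
	Timestamps [BatchSize]int64
	Values     [BatchSize]float64
	Index      int
	Length     int
}

type batchStream []Batch

func main() {
	numIts := 100 // hypothetical number of non-overlapping iterators

	// What the profiled code does: capacity for numIts*2*BatchSize
	// Batch structs, i.e. BatchSize times more headroom than
	// numIts*2 batches' worth of samples.
	asWritten := make(batchStream, 0, numIts*2*BatchSize)

	// Capacity for numIts*2 Batch structs would already hold
	// numIts*2*BatchSize samples.
	withoutFactor := make(batchStream, 0, numIts*2)

	size := unsafe.Sizeof(Batch{})
	fmt.Printf("sizeof(Batch) = %d bytes\n", size)
	fmt.Printf("as written:         cap %4d batches = %7d bytes\n",
		cap(asWritten), uintptr(cap(asWritten))*size)
	fmt.Printf("without *BatchSize: cap %4d batches = %7d bytes\n",
		cap(withoutFactor), uintptr(cap(withoutFactor))*size)
}
```

If those assumptions hold, each of the two make calls over-allocates by a factor of BatchSize, and with two such buffers per mergeIterator that would multiply out to the terabyte-scale allocations shown in the profile above.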
