Streaming queries are very inefficient #1195

Closed

Description

@bboreham

I noticed high resource usage in ruler and traced it back to a change where I turned on:

   - -querier.batch-iterators=true
   - -querier.ingester-streaming=true

Upon reverting this change, CPU usage went down to a third of what it was, memory to a quarter, and network traffic to a fifth.

Profiling suggests vast amounts of memory being used here:

github.com/cortexproject/cortex/pkg/querier/batch.newMergeIterator
/go/src/github.com/cortexproject/cortex/pkg/querier/batch/merge.go
  Total:      6.78TB     7.29TB (flat, cum) 38.52%
     20            .          .            
     21            .          .           	currErr error 
     22            .          .           } 
     23            .          .            
     24            .          .           func newMergeIterator(cs []chunk.Chunk) *mergeIterator { 
     25            .   151.63GB           	css := partitionChunks(cs) 
     26       5.39GB     5.39GB           	its := make([]*nonOverlappingIterator, 0, len(css)) 
     27            .          .           	for _, cs := range css { 
     28            .   365.93GB           		its = append(its, newNonOverlappingIterator(cs)) 
     29            .          .           	} 
     30            .          .            
     31            .          .           	c := &mergeIterator{ 
     32            .          .           		its:        its, 
     33      10.82GB    10.82GB           		h:          make(iteratorHeap, 0, len(its)), 
     34       3.36TB     3.36TB           		batches:    make(batchStream, 0, len(its)*2*promchunk.BatchSize), 
     35       3.40TB     3.40TB           		batchesBuf: make(batchStream, 0, len(its)*2*promchunk.BatchSize), 
     36            .          .           	} 
     37            .          .            
     38            .          .           	for _, iter := range c.its { 
     39            .          .           		if iter.Next(1) { 
     40            .          .           			c.h = append(c.h, iter) 

I am unclear why those sizes include the *promchunk.BatchSize factor: they are allocating slices of Batch, and each Batch already holds BatchSize samples.
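To make the arithmetic concrete, here is a minimal sketch under the assumption that batchStream is a []promchunk.Batch and that Batch is a fixed-size struct holding BatchSize samples inline; the struct and constant below are simplified stand-ins, not the exact definitions from the repo:

```go
// Sketch of the suspected over-allocation in newMergeIterator.
// Batch here is a simplified stand-in for promchunk.Batch.
package main

import (
	"fmt"
	"unsafe"
)

const BatchSize = 12 // stand-in for promchunk.BatchSize

// Batch already has room for BatchSize samples inline, so a slice of
// Batch does not need its capacity multiplied by BatchSize again.
type Batch struct {
	Timestamps [BatchSize]int64
	Values     [BatchSize]float64
	Index      int
	Length     int
}

type batchStream []Batch

func main() {
	numIts := 100 // hypothetical number of non-overlapping iterators

	// What the profiled code does: capacity for numIts*2*BatchSize
	// Batch structs, i.e. BatchSize times more headroom than
	// numIts*2 batches' worth of samples.
	asWritten := make(batchStream, 0, numIts*2*BatchSize)

	// Capacity for numIts*2 Batch structs would already hold
	// numIts*2*BatchSize samples.
	withoutFactor := make(batchStream, 0, numIts*2)

	size := unsafe.Sizeof(Batch{})
	fmt.Printf("sizeof(Batch) = %d bytes\n", size)
	fmt.Printf("as written:         cap %4d batches = %7d bytes\n",
		cap(asWritten), uintptr(cap(asWritten))*size)
	fmt.Printf("without *BatchSize: cap %4d batches = %7d bytes\n",
		cap(withoutFactor), uintptr(cap(withoutFactor))*size)
}
```

If those assumptions hold, each of the two make calls over-allocates by a factor of BatchSize, and with two such buffers per mergeIterator that would multiply out to the terabyte-scale allocations shown in the profile above.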
