Loki: Change live tailing to only allow mutating the log line not the number of streams. (#6063)

* don't create more streams when live tailing logs, process the log line with the pipeline to allow mutating the content of the line but don't split it into new streams.

Signed-off-by: Edward Welch <edward.welch@grafana.com>

* update changelog

Signed-off-by: Edward Welch <edward.welch@grafana.com>

* don't mutate the incoming streams

Signed-off-by: Edward Welch <edward.welch@grafana.com>

* do not reuse the pipeline because it caches labels

Signed-off-by: Edward Welch <edward.welch@grafana.com>

* add note to upgrade guide

Signed-off-by: Edward Welch <edward.welch@grafana.com>
slim-bean authored May 2, 2022
1 parent bb2e952 commit 98fda9b
Showing 3 changed files with 50 additions and 34 deletions.
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -71,6 +71,7 @@
##### Fixes
* [5685](https://github.com/grafana/loki/pull/5685) **chaudum**: Assert that push values tuples consist of string values
##### Changes
* [6063](https://github.com/grafana/loki/pull/6063) **slim-bean**: Changes tailing API responses to not split a stream when using parsers in the query
* [6042](https://github.com/grafana/loki/pull/6042) **slim-bean**: Add a new configuration to allow fudging of ingested timestamps to guarantee sort order of duplicate timestamps at query time.
* [5777](https://github.com/grafana/loki/pull/5777) **tatchiuleung**: storage: make Azure blobID chunk delimiter configurable
* [5650](https://github.com/grafana/loki/pull/5650) **cyriltovena**: Remove more chunkstore and schema version below v9
24 changes: 24 additions & 0 deletions docs/sources/upgrading/_index.md
@@ -31,6 +31,30 @@ The output is incredibly verbose as it shows the entire internal config struct u…

## Main / Unreleased

### Loki

#### Tail API no longer creates multiple streams when using parsers.

We expect this change to have little practical impact, but it is a breaking change to existing behavior.

This change is likely to affect only users doing machine-to-machine work against Loki's tail API
who expect a parser in the query to alter the streams in a tail response.

Prior to this change, a tail request with a parser (e.g. json, logfmt, regexp, pattern) would split the
incoming log stream into multiple streams based on the labels extracted by the parser.

[PR 6063](https://github.com/grafana/loki/pull/6063) changes this behavior so that incoming streams
are no longer split when a parser is used in the query; instead, Loki returns exactly the same streams
with and without a parser in the query.

We found a significant performance impact when using parsers on live tailing queries: turning a single
stream with multiple entries into multiple streams with single entries often meant the tailing client
could not keep up with the number of streams being pushed, and tailed logs were dropped.

This change will have no impact on viewing the tail output from Grafana or logcli.
Parsers can still be used to do filtering and reformatting of live tailed log lines.
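
For programmatic consumers of the tail API, the difference is visible in the `streams` objects of each
websocket message: with a query such as `{job="nginx"} | json`, the stream labels in the response are now
the original `{job="nginx"}` rather than labels expanded with values extracted by the parser. The sketch
below illustrates this from the client side; the address, query, response struct, and use of the
gorilla/websocket package are illustrative assumptions, not part of this change.

```go
// Minimal sketch of a machine-to-machine tail client (assumptions: Loki at
// localhost:3100, gorilla/websocket, and the v1 tail endpoint).
package main

import (
	"fmt"
	"log"
	"net/url"

	"github.com/gorilla/websocket"
)

// tailMessage mirrors only the fields this sketch needs from a tail response.
type tailMessage struct {
	Streams []struct {
		Stream map[string]string `json:"stream"` // stream labels
		Values [][2]string       `json:"values"` // [timestamp, log line] pairs
	} `json:"streams"`
}

func main() {
	u := url.URL{
		Scheme:   "ws",
		Host:     "localhost:3100", // assumed Loki address
		Path:     "/loki/api/v1/tail",
		RawQuery: "query=" + url.QueryEscape(`{job="nginx"} | json`),
	}

	conn, _, err := websocket.DefaultDialer.Dial(u.String(), nil)
	if err != nil {
		log.Fatal(err)
	}
	defer conn.Close()

	for {
		var msg tailMessage
		if err := conn.ReadJSON(&msg); err != nil {
			log.Fatal(err)
		}
		for _, s := range msg.Streams {
			// After this change the labels here are the original stream labels
			// (e.g. {job="nginx"}); fields extracted by the json parser no longer
			// produce extra streams, only the log lines themselves may be rewritten.
			fmt.Println(s.Stream, len(s.Values), "entries")
		}
	}
}
```

A client that previously grouped or deduplicated tail output by the parsed labels will need to derive
those labels itself, for example by parsing the returned log lines.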

## 2.5.0

### Loki
59 changes: 25 additions & 34 deletions pkg/ingester/tailer.go
@@ -28,8 +28,7 @@ type tailer struct {
     id          uint32
     orgID       string
     matchers    []*labels.Matcher
-    pipeline    syntax.Pipeline
-    expr        syntax.Expr
+    expr        syntax.LogSelectorExpr
     pipelineMtx sync.Mutex
 
     sendChan chan *logproto.Stream
@@ -52,7 +51,9 @@ func newTailer(orgID, query string, conn TailServer, maxDroppedStreams int) (*tailer, error) {
     if err != nil {
         return nil, err
     }
-    pipeline, err := expr.Pipeline()
+    // Make sure we can build a pipeline. The stream processing code doesn't have a place to handle
+    // this error so make sure we handle it here.
+    _, err = expr.Pipeline()
     if err != nil {
         return nil, err
     }
@@ -61,7 +62,6 @@ func newTailer(orgID, query string, conn TailServer, maxDroppedStreams int) (*tailer, error) {
     return &tailer{
         orgID:          orgID,
         matchers:       matchers,
-        pipeline:       pipeline,
         sendChan:       make(chan *logproto.Stream, bufferSizeForTailResponse),
         conn:           conn,
         droppedStreams: make([]*logproto.DroppedStream, 0, maxDroppedStreams),
@@ -121,53 +121,44 @@ func (t *tailer) send(stream logproto.Stream, lbs labels.Labels) {
         return
     }
 
-    streams := t.processStream(stream, lbs)
-    if len(streams) == 0 {
-        return
-    }
-    for _, s := range streams {
-        select {
-        case t.sendChan <- s:
-        default:
-            t.dropStream(*s)
-        }
+    processed := t.processStream(stream, lbs)
+    select {
+    case t.sendChan <- processed:
+    default:
+        t.dropStream(stream)
     }
 }
 
-func (t *tailer) processStream(stream logproto.Stream, lbs labels.Labels) []*logproto.Stream {
+func (t *tailer) processStream(stream logproto.Stream, lbs labels.Labels) *logproto.Stream {
+    // Build a new pipeline for each call because the pipeline builds a cache of labels
+    // and if we don't start with a new pipeline that cache will grow unbounded.
+    // The error is ignored because it would be handled in the constructor of the tailer.
+    pipeline, _ := t.expr.Pipeline()
+
     // Optimization: skip filtering entirely, if no filter is set
-    if log.IsNoopPipeline(t.pipeline) {
-        return []*logproto.Stream{&stream}
+    if log.IsNoopPipeline(pipeline) {
+        return &stream
     }
     // pipeline are not thread safe and tailer can process multiple stream at once.
     t.pipelineMtx.Lock()
     defer t.pipelineMtx.Unlock()
 
-    streams := map[uint64]*logproto.Stream{}
-
-    sp := t.pipeline.ForStream(lbs)
+    responseStream := &logproto.Stream{
+        Labels:  stream.Labels,
+        Entries: make([]logproto.Entry, 0, len(stream.Entries)),
+    }
+    sp := pipeline.ForStream(lbs)
     for _, e := range stream.Entries {
-        newLine, parsedLbs, ok := sp.ProcessString(e.Timestamp.UnixNano(), e.Line)
+        newLine, _, ok := sp.ProcessString(e.Timestamp.UnixNano(), e.Line)
         if !ok {
            continue
        }
-        var stream *logproto.Stream
-        if stream, ok = streams[parsedLbs.Hash()]; !ok {
-            stream = &logproto.Stream{
-                Labels: parsedLbs.String(),
-            }
-            streams[parsedLbs.Hash()] = stream
-        }
-        stream.Entries = append(stream.Entries, logproto.Entry{
+        responseStream.Entries = append(responseStream.Entries, logproto.Entry{
            Timestamp: e.Timestamp,
            Line:      newLine,
        })
     }
-    streamsResult := make([]*logproto.Stream, 0, len(streams))
-    for _, stream := range streams {
-        streamsResult = append(streamsResult, stream)
-    }
-    return streamsResult
+    return responseStream
 }
 
 // isMatching returns true if lbs matches all matchers.
