Add support for pipeline.ordered setting for java execution #11524

colinsurprenant · 2020-01-21T21:23:29Z

Fixes #10938
Followup from abandoned PRs #11020 and #11099

TODO:

docs
test/spec harness around the new event ordering
performance tests

This PR adds the pipeline.ordered option that when used under the Java execution and only when pipeline.workers is set to 1 will preserve the batched events order throughout the filters+output execution.

Under this PR both the example in #11099 description and the aggregate filter scenario
#11099 (comment) are working correctly.

The Problem

The Java execution model is created from a graph representation of the configuration and under that model, each conditional evaluation is splitting the batch events is two, for the events evaluating on the true side and the false side of the condition and then moving down the computation path on each side in turns. This means that all events matching one side of the condition will be computed before the ones on the other side.

The Solution

Under that model, the only solution is to drip events from a given batch one-by-one down the filter+output computation so that the event ordering is always preserved.

This solution is obviously subobtimal and not very different from simply setting a single worker and a batch size of 1 element. The difference with this implementation is that a bigger batch size can still be specified and will be honoured at the input and PQ.

Possible Enhancement

Conceptually I think we could re-create the original batch after the filtering and before the output so that output plugins which leverage the batch such as the elasticsearch output could continue to leverage it.
Maybe we could have two compute method implementation/generation to either accept an array of events as it is now and also accept a single event to avoid the overhead of dealing with arrays/multiple elements when we know there is just one.

In both case I think these are optimizations/improvements that could be made as followups.

colinsurprenant · 2020-01-21T22:22:40Z

Some Ruby specs not passing because of the added warning logs - will fix that.

logstash-core/lib/logstash/environment.rb

logstash-core/lib/logstash/java_pipeline.rb

logstash-core/locales/en.yml

logstash-core/src/main/java/org/logstash/execution/WorkerLoop.java

colinsurprenant · 2020-01-22T04:38:14Z

Thanks @yaauie for the review.

Per our inline discussion, my intent with making pipeline.ordered defaults to true was to make sure that if a user set a single worker with the Java execution, the behaviour is going to be the same as the Ruby execution. I think we should still aim at preserving the Ruby execution behaviour in v7. WDYT?

yaauie · 2020-01-22T06:09:19Z

Re: defaulting to true

When I proposed a separate opt-in parameter (as opposed to relying on circumstances e.g. one pipeline worker), my reasoning was that maintaining strict ordrring comes with a significant penalty and should only ever be used in cases where a pipeline absolutely needs to rely on strictly-ordered execution.

While this implementation allows someone to opt-out with pipeline.ordered=false, I believe that it is far more likely to accidentally catch those whose configuration limits throughput for other reasons (e.g., rate-limiting a filter use by reducing throughput; one of their filters is not threadsafe and Logstash automatically overriding pipeline.workers to be 1).

colinsurprenant · 2020-01-22T14:51:37Z

@yaauie I hear what you say and agree on the principle; that's why I added the waning logs 'pipeline.ordered' setting is enabled and is likely less efficient, consider disabling if preserving event order is not necessary. Again for me it boils down to preserving a previous default behaviour. This default in v7 could be changed in v8.

I guess we could also flip the messaging and if there is a single worker and pipeline.ordered=false we could log a waning such as "this pipeline uses a single worker and 'pipeline.ordered' is disabled so event order will not be preserved ..." or something around these lines.
WDYT?

One of the problem I see is that by not fixing the default behaviour we will have to change the documentation of the aggregate filter for example

You should be very careful to set Logstash filter workers to 1 (-w 1 flag) for this filter to work correctly otherwise events may be processed out of sequence and unexpected results will occur.

And this is what we've been telling, to just use -w 1 to keep ordering. I am not sure that the expected (and documented) behaviour should be changed within a version (event if it mean it is less efficient)? On top of the warning log I added 'pipeline.ordered' setting is enabled and is likely less efficient ... we could complement that by underlining it in the release notes and maybe a blog post about it?

jsvd · 2020-01-22T14:55:10Z

What if simply introduce a default behaviour that, if pipeline.ordered is not set, it is enabled for workers = 1 and disabled for workers > 1? if the user sets this option (and we can know this), then it overrides the conditional behaviour.

[edit] this would actually reduce the need for warnings. it does the right thing by default, and if the user is overriding this setting then it's quite obvious what the consequence is (ordered events vs speed)

colinsurprenant · 2020-01-22T15:52:08Z

@jsvd note that pipeline.ordered can only work iif there is a single worker. The behaviour you are suggesting «introduce a default behaviour that, if pipeline.ordered is not set, it is enabled for workers = 1» is essentially what we have are: pipeline.ordered is enabled by default and depends on having a single worker. I think it will make it more obvious to have an explicit true default setting for pipeline.ordered instead of not having a default at all?

jsvd · 2020-01-22T17:01:52Z

Allowing someone to set pipeline.ordered: true + pipeline.workers: 3 and resulting in events not being ordered doesn't feel like a good experience, regardless of warning messages which can easily be missed.

What if we had a third setting value called "auto" (default), and it would work like this:

when "auto", order iif worker = 1
when "true", order if worker =1, throw error otherwise
when "false", don't order

colinsurprenant · 2020-01-22T17:17:12Z

@jsvd yeah, I like that. I would suggest that in v7 it defaults to auto as you suggest (to fix currently broken expectations) and in v8 make it defaults to false to prevent for the unnecessary induced extra slowdown of ordering when having a forced single worker pipeline (when non thread safe plugins are used for example)

colinsurprenant · 2020-01-22T19:24:21Z

@jsvd furthermore, with you suggestion this option becomes generic for both Java and Ruby execution, no need to specify that this option is only for the Java execution. Obviously if false is set, it won't change the Ruby execution behaviour but that is not important IMO.

yaauie · 2020-01-22T20:09:52Z

The one caveat I'd add on @jsvd's "default:auto" solution, is it is possible for safe_pipeline_worker_count to force their pipeline worker count down to 1 (even if they have specified something greater) due to a filter or output's declaration of non-threadsafety, and I wouldn't necessarily expect this to also incur the penalty of drip-mode sequential ordering.

colinsurprenant · 2020-01-22T20:12:37Z

@yaauie we could log a warning in this case?

yaauie · 2020-01-22T20:40:51Z

If our setting is defined like:

Setting::String.new('pipeline.ordered', 'auto', true, %w(true false auto))

Then we could do something like:

def effective_pipeline_ordered_value
  case settings.value('pipeline.ordered')
  when 'true'  then true
  when 'false' then false
  when 'auto'
    settings.set?('pipeline.workers') && settings.value('pipeline.workers') == 1
  else
    fail 'illegal state'
  end
end

colinsurprenant · 2020-01-22T20:46:27Z

@yaauie yeah, I think it makes sense to apply the auto setting only when the pipeline.workers is explicitly set to 1.

colinsurprenant · 2020-01-22T22:43:32Z

I have reorg'ed the validation as discussed. I also added the default option in logstash.yml and note that I also added a check for when pipeline.ordered is true and the number of workers is > 1 in the Ruby pipeline to avoid any possible confusion. This does not change the default with the Ruby execution behaviour since the default option is "auto" and will do nothing f since ordering is already preserved when using a single worker (similarly, false will also do nothing).

colinsurprenant · 2020-01-22T23:34:43Z

/cc @karenzone for docs.

karenzone

Nice work on the docs! I made a few suggestions, but will leave them up to your discretion. Otherwise LGTM

docs/static/settings-file.asciidoc

docs/static/running-logstash-command-line.asciidoc

colinsurprenant · 2020-01-23T22:42:17Z

Added java execution specs.

colinsurprenant · 2020-01-23T23:08:16Z

Also added a Ruby pipeline spec. At this point I think this is code-complete (pending green build) and ready for final reviewing.

colinsurprenant · 2020-01-24T17:17:04Z

I've done some performance tests.

Config 1

A relatively simple config with a few conditional branches.

input {
  generator {
    lines => [
      "line 1",
      "line 2",
      "line 3"
    ]
    count => 5000000
  }
}

filter {
  if [message] == "line 1" {
    mutate { add_field => { "line1" => "true" } }
  } else if [message] == "line 2" {
    mutate { add_field => { "line2" => "true" } }
  } else if [message] == "line 3" {
    mutate { add_field => { "line3" => "true" } }
  }
}

output { stdout { codec => dots } }

Test

bin/logstash --pipeline.ordered {true|false} --pipeline.workers 1 --java-execution {true|false} -f tmp/config1.conf | pv -Wbart > /dev/null

Results

Java Execution

pipeline.ordered	elapsed	steady TPS
false	0:02:07	123k/s
true	0:03:07	82k/s

Ruby Execution

pipeline.ordered	elapsed	steady TPS
false/true	0:02:05	125k/s

Config 2

A slightly more taxing config with regex conditionals and a grok filter.

input {
  generator {
    lines => [
      "the quick brown fox jumped over the lazy line 1",
      "the slow brown fox jumped over the lazy line 2",
      "the average brown fox jumped over the lazy line 3"
    ]
    count => 5000000
  }
}

filter {
  if [message] =~ "line 1" {
    mutate { add_field => { "line1" => "true" } }
  } else if [message] =~ "line 2" {
    mutate { add_field => { "line2" => "true" } }
  } else if [message] =~ "line 3" {
    mutate { add_field => { "line3" => "true" } }
  }
  grok {
    match => { "message" => "the %{WORD:speed} %{DATA} line %{NUMBER:num}" }
  }
}

output { stdout { codec => dots } }

Test

bin/logstash --pipeline.ordered {true|false} --pipeline.workers 1 --java-execution {true|false} -f tmp/config2.conf | pv -Wbart > /dev/null

Results

Java Execution

pipeline.ordered	elapsed	steady TPS
false	0:04:56	55k/s
true	0:08:22	30k/s

Ruby Execution

pipeline.ordered	elapsed	steady TPS
false/true	0:05:30	45k/s

colinsurprenant · 2020-01-24T17:22:53Z

A conclusion to the above performance tests is that the Ruby execution (which always preserve order) is actually faster than preserving order with the Java execution with this implementation.

I would still move forward with this as-is because it is now "doing it right" and followup on it to improve performance since the Ruby engine will be deprecated in the future.

colinsurprenant · 2020-01-27T17:01:29Z

I tried exposing the events collection from QueueBatch to avoid the extra to_a() RubyArray creation when ordering but there is not significative performance improvement.
I think the improvement we should look into would be to find the way to drip events into the filters but re-aggregate the batch after the filter but before the outputs, this would allow benefiting from the outputs multi_receive. This would possibly break ordering at the outputs if conditionals are used but I don't think ordering is actually important at the outputs because typically conditionals are only used to fork between outputs and in that respect ordering would not be changed.

andsel · 2020-01-27T17:08:03Z

So when Java pipeline compiler compiles the pipeline we could artificially put an aggregator "filter" after as joint point between last user filter and the output section that wait for all dripped events before moving on

colinsurprenant · 2020-01-27T17:15:23Z

@andsel That's an idea. I wonder if it would not be better to look into specializing the compute method to have a drip mode that would avoid shuffling single-element arrays and at the end of the filters it would aggregate the dripped events and then continue on with the outputs.

In any case I think we should probably move this optimization into another issue; I suggest we merge this PR knowing it might not be optimal but correct in behaviour and then followup with potential optimizations?

yaauie · 2020-01-27T18:47:02Z

+1 to merging as-is and deferring follow-up optimization to a separate issue.

colinsurprenant · 2020-01-27T19:32:16Z

@yaauie @jsvd let me know if we are good to go or there are other concerns we should look into?

colinsurprenant · 2020-01-29T18:10:55Z

This PR will be merged for v8 and v7.7. for 7.6 we have decided not to backport, at least not until we find a solution for the batch re-aggregation after the filters and before outputs (issue in #11550).

Note that in the mean time the workaround is to use the Ruby execution --java-execution false which does preserve events order with using a single worker.

reuse rubyArray for single element batches rename preserveBatchOrder to preserveEventOrder allow boolean and string values for the pipeline.ordered setting, reorg validation update docs yml typo Update docs/static/running-logstash-command-line.asciidoc Co-Authored-By: Karen Metts <35154725+karenzone@users.noreply.github.com> Update docs/static/running-logstash-command-line.asciidoc Co-Authored-By: Karen Metts <35154725+karenzone@users.noreply.github.com> java execution specs and spec support docs corrections per review typo close not shutdown Ruby pipeline spec

colinsurprenant · 2020-03-30T17:59:50Z

Rebatching before outputs was solved in #11710 and will be available un 7.7.0.

colinsurprenant added work in progress discuss Java Execution labels Jan 21, 2020

This was referenced Jan 21, 2020

[meta] Pipeline Java Execution important issues #11175

Closed

[POC][DISCUSS] preserve batched events order #11099

Closed

yaauie reviewed Jan 22, 2020

View reviewed changes

colinsurprenant requested a review from karenzone January 22, 2020 23:35

karenzone approved these changes Jan 23, 2020

View reviewed changes

colinsurprenant changed the title ~~[WIP] add support for pipeline.ordered setting for java execution~~ Add support for pipeline.ordered setting for java execution Jan 23, 2020

colinsurprenant added enhancement bug and removed discuss work in progress labels Jan 23, 2020

yaauie approved these changes Jan 27, 2020

View reviewed changes

colinsurprenant mentioned this pull request Jan 29, 2020

improve pipeline.ordered setting to rebatch before outputs #11550

Closed

colinsurprenant force-pushed the pipeline-ordered branch from ee81f7a to ceab96f Compare January 29, 2020 18:18

colinsurprenant mentioned this pull request Jan 29, 2020

[7.7 clean backport of #11524] add support for pipeline.ordered setting for java execution #11552

Merged

colinsurprenant added the v8.0.0 label Jan 29, 2020

colinsurprenant merged commit 0bc9fa5 into elastic:master Jan 29, 2020

colinsurprenant mentioned this pull request Feb 4, 2020

Java execution reorders events even with 1 pipeline worker #10938

Closed

colinsurprenant mentioned this pull request Mar 23, 2020

Rebatch after filters when using pipeline.ordered #11710

Merged

4 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add support for pipeline.ordered setting for java execution #11524

Add support for pipeline.ordered setting for java execution #11524

colinsurprenant commented Jan 21, 2020 •

edited

Loading

colinsurprenant commented Jan 21, 2020

colinsurprenant commented Jan 22, 2020

yaauie commented Jan 22, 2020

colinsurprenant commented Jan 22, 2020

jsvd commented Jan 22, 2020 •

edited

Loading

colinsurprenant commented Jan 22, 2020

jsvd commented Jan 22, 2020

colinsurprenant commented Jan 22, 2020

colinsurprenant commented Jan 22, 2020 •

edited

Loading

yaauie commented Jan 22, 2020

colinsurprenant commented Jan 22, 2020

yaauie commented Jan 22, 2020

colinsurprenant commented Jan 22, 2020

colinsurprenant commented Jan 22, 2020

colinsurprenant commented Jan 22, 2020

karenzone left a comment

colinsurprenant commented Jan 23, 2020

colinsurprenant commented Jan 23, 2020

colinsurprenant commented Jan 24, 2020 •

edited

Loading

colinsurprenant commented Jan 24, 2020

colinsurprenant commented Jan 27, 2020

andsel commented Jan 27, 2020

colinsurprenant commented Jan 27, 2020

yaauie commented Jan 27, 2020

colinsurprenant commented Jan 27, 2020

colinsurprenant commented Jan 29, 2020

colinsurprenant commented Mar 30, 2020

Add support for pipeline.ordered setting for java execution #11524

Add support for pipeline.ordered setting for java execution #11524

Conversation

colinsurprenant commented Jan 21, 2020 • edited Loading

The Problem

The Solution

Possible Enhancement

colinsurprenant commented Jan 21, 2020

colinsurprenant commented Jan 22, 2020

yaauie commented Jan 22, 2020

colinsurprenant commented Jan 22, 2020

jsvd commented Jan 22, 2020 • edited Loading

colinsurprenant commented Jan 22, 2020

jsvd commented Jan 22, 2020

colinsurprenant commented Jan 22, 2020

colinsurprenant commented Jan 22, 2020 • edited Loading

yaauie commented Jan 22, 2020

colinsurprenant commented Jan 22, 2020

yaauie commented Jan 22, 2020

colinsurprenant commented Jan 22, 2020

colinsurprenant commented Jan 22, 2020

colinsurprenant commented Jan 22, 2020

karenzone left a comment

Choose a reason for hiding this comment

colinsurprenant commented Jan 23, 2020

colinsurprenant commented Jan 23, 2020

colinsurprenant commented Jan 24, 2020 • edited Loading

Config 1

Test

Results

Java Execution

Ruby Execution

Config 2

Test

Results

Java Execution

Ruby Execution

colinsurprenant commented Jan 24, 2020

colinsurprenant commented Jan 27, 2020

andsel commented Jan 27, 2020

colinsurprenant commented Jan 27, 2020

yaauie commented Jan 27, 2020

colinsurprenant commented Jan 27, 2020

colinsurprenant commented Jan 29, 2020

colinsurprenant commented Mar 30, 2020

colinsurprenant commented Jan 21, 2020 •

edited

Loading

jsvd commented Jan 22, 2020 •

edited

Loading

colinsurprenant commented Jan 22, 2020 •

edited

Loading

colinsurprenant commented Jan 24, 2020 •

edited

Loading