Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for pipeline.ordered setting for java execution #11524

Merged
merged 1 commit into from
Jan 29, 2020

Conversation

colinsurprenant
Copy link
Contributor

@colinsurprenant colinsurprenant commented Jan 21, 2020

Fixes #10938
Followup from abandoned PRs #11020 and #11099

TODO:

  • docs
  • test/spec harness around the new event ordering
  • performance tests

This PR adds the pipeline.ordered option that when used under the Java execution and only when pipeline.workers is set to 1 will preserve the batched events order throughout the filters+output execution.

Under this PR both the example in #11099 description and the aggregate filter scenario
#11099 (comment) are working correctly.

The Problem

The Java execution model is created from a graph representation of the configuration and under that model, each conditional evaluation is splitting the batch events is two, for the events evaluating on the true side and the false side of the condition and then moving down the computation path on each side in turns. This means that all events matching one side of the condition will be computed before the ones on the other side.

The Solution

Under that model, the only solution is to drip events from a given batch one-by-one down the filter+output computation so that the event ordering is always preserved.

This solution is obviously subobtimal and not very different from simply setting a single worker and a batch size of 1 element. The difference with this implementation is that a bigger batch size can still be specified and will be honoured at the input and PQ.

Possible Enhancement

  • Conceptually I think we could re-create the original batch after the filtering and before the output so that output plugins which leverage the batch such as the elasticsearch output could continue to leverage it.
  • Maybe we could have two compute method implementation/generation to either accept an array of events as it is now and also accept a single event to avoid the overhead of dealing with arrays/multiple elements when we know there is just one.

In both case I think these are optimizations/improvements that could be made as followups.

@colinsurprenant
Copy link
Contributor Author

Some Ruby specs not passing because of the added warning logs - will fix that.

logstash-core/lib/logstash/environment.rb Outdated Show resolved Hide resolved
logstash-core/lib/logstash/java_pipeline.rb Outdated Show resolved Hide resolved
logstash-core/locales/en.yml Outdated Show resolved Hide resolved
@colinsurprenant
Copy link
Contributor Author

Thanks @yaauie for the review.

Per our inline discussion, my intent with making pipeline.ordered defaults to true was to make sure that if a user set a single worker with the Java execution, the behaviour is going to be the same as the Ruby execution. I think we should still aim at preserving the Ruby execution behaviour in v7. WDYT?

@yaauie
Copy link
Member

yaauie commented Jan 22, 2020

Re: defaulting to true

When I proposed a separate opt-in parameter (as opposed to relying on circumstances e.g. one pipeline worker), my reasoning was that maintaining strict ordrring comes with a significant penalty and should only ever be used in cases where a pipeline absolutely needs to rely on strictly-ordered execution.

While this implementation allows someone to opt-out with pipeline.ordered=false, I believe that it is far more likely to accidentally catch those whose configuration limits throughput for other reasons (e.g., rate-limiting a filter use by reducing throughput; one of their filters is not threadsafe and Logstash automatically overriding pipeline.workers to be 1).

@colinsurprenant
Copy link
Contributor Author

@yaauie I hear what you say and agree on the principle; that's why I added the waning logs 'pipeline.ordered' setting is enabled and is likely less efficient, consider disabling if preserving event order is not necessary. Again for me it boils down to preserving a previous default behaviour. This default in v7 could be changed in v8.

I guess we could also flip the messaging and if there is a single worker and pipeline.ordered=false we could log a waning such as "this pipeline uses a single worker and 'pipeline.ordered' is disabled so event order will not be preserved ..." or something around these lines.
WDYT?

One of the problem I see is that by not fixing the default behaviour we will have to change the documentation of the aggregate filter for example

You should be very careful to set Logstash filter workers to 1 (-w 1 flag) for this filter to work correctly otherwise events may be processed out of sequence and unexpected results will occur.

And this is what we've been telling, to just use -w 1 to keep ordering. I am not sure that the expected (and documented) behaviour should be changed within a version (event if it mean it is less efficient)? On top of the warning log I added 'pipeline.ordered' setting is enabled and is likely less efficient ... we could complement that by underlining it in the release notes and maybe a blog post about it?

@jsvd
Copy link
Member

jsvd commented Jan 22, 2020

What if simply introduce a default behaviour that, if pipeline.ordered is not set, it is enabled for workers = 1 and disabled for workers > 1? if the user sets this option (and we can know this), then it overrides the conditional behaviour.

[edit] this would actually reduce the need for warnings. it does the right thing by default, and if the user is overriding this setting then it's quite obvious what the consequence is (ordered events vs speed)

@colinsurprenant
Copy link
Contributor Author

@jsvd note that pipeline.ordered can only work iif there is a single worker. The behaviour you are suggesting «introduce a default behaviour that, if pipeline.ordered is not set, it is enabled for workers = 1» is essentially what we have are: pipeline.ordered is enabled by default and depends on having a single worker. I think it will make it more obvious to have an explicit true default setting for pipeline.ordered instead of not having a default at all?

@jsvd
Copy link
Member

jsvd commented Jan 22, 2020

Allowing someone to set pipeline.ordered: true + pipeline.workers: 3 and resulting in events not being ordered doesn't feel like a good experience, regardless of warning messages which can easily be missed.

What if we had a third setting value called "auto" (default), and it would work like this:

  • when "auto", order iif worker = 1
  • when "true", order if worker =1, throw error otherwise
  • when "false", don't order

@colinsurprenant
Copy link
Contributor Author

@jsvd yeah, I like that. I would suggest that in v7 it defaults to auto as you suggest (to fix currently broken expectations) and in v8 make it defaults to false to prevent for the unnecessary induced extra slowdown of ordering when having a forced single worker pipeline (when non thread safe plugins are used for example)

@colinsurprenant
Copy link
Contributor Author

colinsurprenant commented Jan 22, 2020

@jsvd furthermore, with you suggestion this option becomes generic for both Java and Ruby execution, no need to specify that this option is only for the Java execution. Obviously if false is set, it won't change the Ruby execution behaviour but that is not important IMO.

@yaauie
Copy link
Member

yaauie commented Jan 22, 2020

The one caveat I'd add on @jsvd's "default:auto" solution, is it is possible for safe_pipeline_worker_count to force their pipeline worker count down to 1 (even if they have specified something greater) due to a filter or output's declaration of non-threadsafety, and I wouldn't necessarily expect this to also incur the penalty of drip-mode sequential ordering.

@colinsurprenant
Copy link
Contributor Author

@yaauie we could log a warning in this case?

@yaauie
Copy link
Member

yaauie commented Jan 22, 2020

If our setting is defined like:

Setting::String.new('pipeline.ordered', 'auto', true, %w(true false auto))

Then we could do something like:

def effective_pipeline_ordered_value
  case settings.value('pipeline.ordered')
  when 'true'  then true
  when 'false' then false
  when 'auto'
    settings.set?('pipeline.workers') && settings.value('pipeline.workers') == 1
  else
    fail 'illegal state'
  end
end

@colinsurprenant
Copy link
Contributor Author

@yaauie yeah, I think it makes sense to apply the auto setting only when the pipeline.workers is explicitly set to 1.

@colinsurprenant
Copy link
Contributor Author

I have reorg'ed the validation as discussed. I also added the default option in logstash.yml and note that I also added a check for when pipeline.ordered is true and the number of workers is > 1 in the Ruby pipeline to avoid any possible confusion. This does not change the default with the Ruby execution behaviour since the default option is "auto" and will do nothing f since ordering is already preserved when using a single worker (similarly, false will also do nothing).

@colinsurprenant
Copy link
Contributor Author

/cc @karenzone for docs.

Copy link
Contributor

@karenzone karenzone left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work on the docs! I made a few suggestions, but will leave them up to your discretion. Otherwise LGTM

docs/static/settings-file.asciidoc Outdated Show resolved Hide resolved
docs/static/settings-file.asciidoc Outdated Show resolved Hide resolved
docs/static/running-logstash-command-line.asciidoc Outdated Show resolved Hide resolved
docs/static/running-logstash-command-line.asciidoc Outdated Show resolved Hide resolved
@colinsurprenant
Copy link
Contributor Author

Added java execution specs.

@colinsurprenant
Copy link
Contributor Author

Also added a Ruby pipeline spec. At this point I think this is code-complete (pending green build) and ready for final reviewing.

@colinsurprenant colinsurprenant changed the title [WIP] add support for pipeline.ordered setting for java execution Add support for pipeline.ordered setting for java execution Jan 23, 2020
@colinsurprenant
Copy link
Contributor Author

colinsurprenant commented Jan 24, 2020

I've done some performance tests.

Config 1

A relatively simple config with a few conditional branches.

input {
  generator {
    lines => [
      "line 1",
      "line 2",
      "line 3"
    ]
    count => 5000000
  }
}

filter {
  if [message] == "line 1" {
    mutate { add_field => { "line1" => "true" } }
  } else if [message] == "line 2" {
    mutate { add_field => { "line2" => "true" } }
  } else if [message] == "line 3" {
    mutate { add_field => { "line3" => "true" } }
  }
}

output { stdout { codec => dots } }

Test

bin/logstash --pipeline.ordered {true|false} --pipeline.workers 1 --java-execution {true|false} -f tmp/config1.conf | pv -Wbart > /dev/null

Results

Java Execution

pipeline.ordered elapsed steady TPS
false 0:02:07 123k/s
true 0:03:07 82k/s

Ruby Execution

pipeline.ordered elapsed steady TPS
false/true 0:02:05 125k/s

Config 2

A slightly more taxing config with regex conditionals and a grok filter.

input {
  generator {
    lines => [
      "the quick brown fox jumped over the lazy line 1",
      "the slow brown fox jumped over the lazy line 2",
      "the average brown fox jumped over the lazy line 3"
    ]
    count => 5000000
  }
}

filter {
  if [message] =~ "line 1" {
    mutate { add_field => { "line1" => "true" } }
  } else if [message] =~ "line 2" {
    mutate { add_field => { "line2" => "true" } }
  } else if [message] =~ "line 3" {
    mutate { add_field => { "line3" => "true" } }
  }
  grok {
    match => { "message" => "the %{WORD:speed} %{DATA} line %{NUMBER:num}" }
  }
}

output { stdout { codec => dots } }

Test

bin/logstash --pipeline.ordered {true|false} --pipeline.workers 1 --java-execution {true|false} -f tmp/config2.conf | pv -Wbart > /dev/null

Results

Java Execution

pipeline.ordered elapsed steady TPS
false 0:04:56 55k/s
true 0:08:22 30k/s

Ruby Execution

pipeline.ordered elapsed steady TPS
false/true 0:05:30 45k/s

@colinsurprenant
Copy link
Contributor Author

A conclusion to the above performance tests is that the Ruby execution (which always preserve order) is actually faster than preserving order with the Java execution with this implementation.

I would still move forward with this as-is because it is now "doing it right" and followup on it to improve performance since the Ruby engine will be deprecated in the future.

@colinsurprenant
Copy link
Contributor Author

  • I tried exposing the events collection from QueueBatch to avoid the extra to_a() RubyArray creation when ordering but there is not significative performance improvement.
  • I think the improvement we should look into would be to find the way to drip events into the filters but re-aggregate the batch after the filter but before the outputs, this would allow benefiting from the outputs multi_receive. This would possibly break ordering at the outputs if conditionals are used but I don't think ordering is actually important at the outputs because typically conditionals are only used to fork between outputs and in that respect ordering would not be changed.

@andsel
Copy link
Contributor

andsel commented Jan 27, 2020

So when Java pipeline compiler compiles the pipeline we could artificially put an aggregator "filter" after as joint point between last user filter and the output section that wait for all dripped events before moving on

@colinsurprenant
Copy link
Contributor Author

@andsel That's an idea. I wonder if it would not be better to look into specializing the compute method to have a drip mode that would avoid shuffling single-element arrays and at the end of the filters it would aggregate the dripped events and then continue on with the outputs.

In any case I think we should probably move this optimization into another issue; I suggest we merge this PR knowing it might not be optimal but correct in behaviour and then followup with potential optimizations?

@yaauie
Copy link
Member

yaauie commented Jan 27, 2020

+1 to merging as-is and deferring follow-up optimization to a separate issue.

@colinsurprenant
Copy link
Contributor Author

@yaauie @jsvd let me know if we are good to go or there are other concerns we should look into?

@colinsurprenant
Copy link
Contributor Author

This PR will be merged for v8 and v7.7. for 7.6 we have decided not to backport, at least not until we find a solution for the batch re-aggregation after the filters and before outputs (issue in #11550).

Note that in the mean time the workaround is to use the Ruby execution --java-execution false which does preserve events order with using a single worker.

reuse rubyArray for single element batches

rename preserveBatchOrder to preserveEventOrder

allow boolean and string values for the pipeline.ordered setting, reorg validation

update docs

yml typo

Update docs/static/running-logstash-command-line.asciidoc

Co-Authored-By: Karen Metts <35154725+karenzone@users.noreply.github.com>

Update docs/static/running-logstash-command-line.asciidoc

Co-Authored-By: Karen Metts <35154725+karenzone@users.noreply.github.com>

java execution specs and spec support

docs corrections per review

typo

close not shutdown

Ruby pipeline spec
@colinsurprenant
Copy link
Contributor Author

Rebatching before outputs was solved in #11710 and will be available un 7.7.0.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Java execution reorders events even with 1 pipeline worker
5 participants