Pipeline size limits and how to work around them #16320

Closed
@jsvd

Description

Logstash users can create extremely large pipelines due to the complex nature of their data processing needs. There is no documentation or code-level validation of limits on how big pipelines can get.
The following is an exercise in seeing how Logstash breaks when creating large pipelines, whether due to many filters, many outputs, or many conditionals.

This should serve as a guide for implementing protections and documentation that inform users that their pipeline won't work and, if possible, tell them what they need to tweak to make it work.

Too Many Filters

Use the following to generate a pipeline with many filters:

bin/ruby -e 'n = 2000; str = "input { heartbeat {} }\nfilter {\n" + ("  drop {}\n" * n) + "}\noutput { stdout {} }"; IO.write("cfg", str)'

It generates a pipeline in the shape of:

input { heartbeat {} }
filter {
  drop {}
  [ thousands ]
  drop {}
}
output { stdout {} }
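
To look for the filter count at which a given installation starts failing, the one-liner can be wrapped in a function so that `n` is easy to vary. This is only a minimal sketch: the name `many_filters_config` is made up for illustration, and the string it builds is meant to match the shape shown above.

```ruby
# Builds the same config shape as the one-liner above: one heartbeat input,
# n drop filters, and one stdout output.
# `many_filters_config` is a hypothetical helper name, not part of Logstash.
def many_filters_config(n)
  "input { heartbeat {} }\nfilter {\n" + ("  drop {}\n" * n) + "}\noutput { stdout {} }"
end
```

Calling `IO.write("cfg", many_filters_config(2000))` should write the same file as the original command.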

With 2000 filters Logstash fails with:

[2024-07-11T16:23:41,762][INFO ][logstash.javapipeline    ][main] Starting pipeline {:pipeline_id=>"main", "pipeline.workers"=>1, "pipeline.batch.size"=>125, "pipeline.batch.delay"=>50, "pipeline.max_inflight"=>125, "pipeline.sources"=>["/tmp/logstash-8.14.2/cfg"], :thread=>"#<Thread:0x2dc90929 /private/tmp/logstash-8.14.2/logstash-core/lib/logstash/java_pipeline.rb:134 run>"}
[2024-07-11T16:23:41,775][FATAL][org.logstash.Logstash    ][main] uncaught error (in thread Ruby-0-Thread-9: /private/tmp/logstash-8.14.2/logstash-core/lib/logstash/java_pipeline.rb:289)
java.lang.StackOverflowError: null
	at org.logstash.config.ir.CompiledPipeline$CompiledExecution.compileDependencies(org/logstash/config/ir/CompiledPipeline.java:534) ~[logstash-core.jar:?]
	at org.logstash.config.ir.CompiledPipeline$CompiledExecution.flatten(org/logstash/config/ir/CompiledPipeline.java:514) ~[logstash-core.jar:?]
	at org.logstash.config.ir.CompiledPipeline$CompiledExecution.filterDataset(org/logstash/config/ir/CompiledPipeline.java:435) ~[logstash-core.jar:?]
	at org.logstash.config.ir.CompiledPipeline$CompiledExecution.lambda$compileDependencies$6(org/logstash/config/ir/CompiledPipeline.java:537) ~[logstash-core.jar:?]

Note that the stack trace is printed to stdout, so it doesn't show up in the log entries when log.format=json:

{"level":"INFO","loggerName":"logstash.javapipeline","timeMillis":1720711602230,"thread":"[main]-pipeline-manager","logEvent":{"message":"Starting pipeline","pipeline_id":"main","pipeline.workers":1,"pipeline.batch.size":125,"pipeline.batch.delay":50,"pipeline.max_inflight":125,"pipeline.sources":["/tmp/logstash-8.14.2/cfg"],"thread":"#<Thread:0x770e64f2 /private/tmp/logstash-8.14.2/logstash-core/lib/logstash/java_pipeline.rb:134 run>"}}
{"level":"FATAL","loggerName":"org.logstash.Logstash","timeMillis":1720711602251,"thread":"Ruby-0-Thread-9: /private/tmp/logstash-8.14.2/logstash-core/lib/logstash/java_pipeline.rb:289","logEvent":{"message":"uncaught error (in thread Ruby-0-Thread-9: /private/tmp/logstash-8.14.2/logstash-core/lib/logstash/java_pipeline.rb:289)"}}
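
Because the JSON entries carry only the message, anything that watches the log would have to key on the FATAL level rather than on the exception name. The sketch below shows one way that might look. It assumes the field names seen in the two log lines above (`level`, `logEvent`, `message`), and `fatal_messages` is a hypothetical helper, not a Logstash API.

```ruby
require "json"

# Returns the messages of all FATAL entries in an array of log lines.
# Field names follow the JSON log lines shown above.
# `fatal_messages` is a hypothetical helper, not part of Logstash.
def fatal_messages(lines)
  lines.filter_map do |line|
    begin
      entry = JSON.parse(line)
    rescue JSON::ParserError
      next # skip non-JSON lines, such as a stack trace printed to stdout
    end
    next unless entry.is_a?(Hash)
    entry.dig("logEvent", "message") if entry["level"] == "FATAL"
  end
end
```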

This is solved by increasing Xss (or ThreadStackSize). In the example above with 2000 filters, setting it to 4m (instead of the 2m default) allows the pipeline to start.
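
One way to pass the larger stack size without editing config/jvm.options is the LS_JAVA_OPTS environment variable. The sketch below only builds the environment and argument list. It assumes it is run from the Logstash home directory with the `cfg` file generated above, and `logstash_invocation` is a hypothetical helper name.

```ruby
# Builds the environment and argument list for starting Logstash with a larger
# thread stack size. `logstash_invocation` is a hypothetical helper name.
# Assumes the script runs from the Logstash home directory and that the
# pipeline config is the ./cfg file generated above.
def logstash_invocation(stack_size, config_path = "cfg")
  env = { "LS_JAVA_OPTS" => "-Xss#{stack_size}" }
  argv = ["bin/logstash", "-f", config_path]
  [env, argv]
end

env, argv = logstash_invocation("4m")
# Uncomment to actually start Logstash:
# system(env, *argv)
```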

Too Many Conditionals

bin/ruby -e 'n = 2000; str = "input { heartbeat {} }\nfilter {\n" + ("  if [message] == 1 {\n" * n) + "    sleep { time => 0.001 }\n  " + ("}" * n) + "\n}\noutput { stdout {} }"; IO.write("cfg", str); puts str'

It generates a pipeline in the shape of:

input { heartbeat {} }
filter {
  if [message] == 1 {
  [ thousands of ifs]
  if [message] == 1 {
    sleep { time => 0.001 }
  }}}
}
output { stdout {} }

Fails with:

[2024-07-11T16:40:57,767][ERROR][logstash.agent           ] Failed to execute action {:action=>LogStash::PipelineAction::Create/pipeline_id:main, :exception=>"Java::JavaLang::StackOverflowError", :message=>"", :backtrace=>["private.tmp.logstash_minus_8_dot_14_dot_2.vendor.bundle.jruby.$3_dot_1_dot_0.gems.treetop_minus_1_dot_6_dot_12.lib.treetop.runtime.syntax_node.RUBY$method$initialize$0(/private/tmp/logstash-8.14.2/vendor/bundle/jruby/3.1.0/gems/treetop-1.6.12/lib/treetop/runtime/syntax_node.rb:7)", "private.tmp.logstash_minus_8_dot_14_dot_2.vendor.bundle.jruby.$3_dot_1_dot_0.gems.treetop_minus_1_dot_6_dot_12.lib.treetop.runtime.compiled_parser.RUBY$method$instantiate_node$0(/private/tmp/logstash-8.14.2/vendor/bundle/jruby/3.1.0/gems/treetop-1.6.12/lib/treetop/runtime/compiled_parser.rb:100)",

This is also solved by increasing Xss (or ThreadStackSize). However, to work with 2000 conditionals, Xss had to be set to 20m, and startup took 4 minutes on a modern MacBook Pro.
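
Since the failure here happens while parsing deeply nested blocks, a protective pre-flight check could estimate how deep a config nests before Logstash tries to load it. The following is a naive sketch under stated assumptions: it counts every brace, so braces inside quoted strings or comments are miscounted, and `max_brace_depth` is a hypothetical helper. No safe threshold is implied; the only data points are the ones in this issue.

```ruby
# Estimates the deepest brace nesting in a pipeline config string.
# Naive: it counts every "{" and "}" and so miscounts braces that appear
# inside quoted strings or comments.
# `max_brace_depth` is a hypothetical helper, not part of Logstash.
def max_brace_depth(config)
  depth = 0
  max = 0
  config.each_char do |ch|
    if ch == "{"
      depth += 1
      max = depth if depth > max
    elsif ch == "}"
      depth -= 1
    end
  end
  max
end
```

For the config generated above, the result is n + 2: the filter block, the n nested conditionals, and the sleep filter.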

Too Many Outputs

bin/ruby -e 'n = 4000; str = "input { heartbeat {} }\noutput {\n" + ("  stdout {}\n" * n) +  "}"; IO.write("cfg", str)'

It generates a pipeline in the shape of:

input { heartbeat {} }
output {
  stdout {}
  [ thousands of outputs ]
  stdout {}
}

Fails with:

[2024-07-11T16:57:09,101][INFO ][logstash.javapipeline    ][main] Starting pipeline {:pipeline_id=>"main", "pipeline.workers"=>1, "pipeline.batch.size"=>125, "pipeline.batch.delay"=>50, "pipeline.max_inflight"=>125, "pipeline.sources"=>["/tmp/logstash-8.14.2/cfg"], :thread=>"#<Thread:0x543b4d23 /private/tmp/logstash-8.14.2/logstash-core/lib/logstash/java_pipeline.rb:134 run>"}
[2024-07-11T16:57:12,736][ERROR][logstash.javapipeline    ][main] Worker loop initialization error {:pipeline_id=>"main", :error=>"Compiling \"CompiledDataset2\" in \"null\": Code of method \"compute(Lorg/jruby/RubyArray;ZZ)Ljava/util/Collection;\" of class \"org.logstash.generated.CompiledDataset2\" grows beyond 64 KB", :exception=>Java::OrgCodehausCommonsCompiler::InternalCompilerException, :stacktrace=>"org.codehaus.janino.UnitCompiler.compile2(org/codehaus/janino/UnitCompiler.java:367)\norg.codehaus.janino.UnitCompiler.access$000(org/codehaus/janino/UnitCompiler.java:227)\norg.codehaus.janino.UnitCompiler$1.visitCompilationUnit(org/codehaus/janino/UnitCompiler.java:337)\norg.codehaus.janino.UnitCompiler$1.visitCompilationUnit

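Increasing Xss doesn't help in this case, as this is a code compilation limit: the generated compute method exceeds the JVM's 64 KB limit on the bytecode size of a single method.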
