Description
Logstash users can create extremely large pipelines due to the complex nature of their data-processing needs. There is neither documentation nor code-level validation of limits on how big pipelines can get.
The following is an exercise in seeing how Logstash breaks when pipelines become very large, whether due to many filters, many conditionals, or many outputs.
This should serve as a guide for implementing protections and documentation that inform users their pipeline won't work and, where possible, tell them what to tweak to make it work.
Too Many Filters
Use the following to generate a pipeline with many filters:
bin/ruby -e 'n = 2000; str = "input { heartbeat {} }\nfilter {\n" + (" drop {}\n" * n) + "}\noutput { stdout {} }"; IO.write("cfg", str)'
It generates a pipeline in the shape of:
input { heartbeat {} }
filter {
drop {}
[ thousands ]
drop {}
}
output { stdout {} }
With 2000 filters Logstash fails with:
[2024-07-11T16:23:41,762][INFO ][logstash.javapipeline ][main] Starting pipeline {:pipeline_id=>"main", "pipeline.workers"=>1, "pipeline.batch.size"=>125, "pipeline.batch.delay"=>50, "pipeline.max_inflight"=>125, "pipeline.sources"=>["/tmp/logstash-8.14.2/cfg"], :thread=>"#<Thread:0x2dc90929 /private/tmp/logstash-8.14.2/logstash-core/lib/logstash/java_pipeline.rb:134 run>"}
[2024-07-11T16:23:41,775][FATAL][org.logstash.Logstash ][main] uncaught error (in thread Ruby-0-Thread-9: /private/tmp/logstash-8.14.2/logstash-core/lib/logstash/java_pipeline.rb:289)
java.lang.StackOverflowError: null
at org.logstash.config.ir.CompiledPipeline$CompiledExecution.compileDependencies(org/logstash/config/ir/CompiledPipeline.java:534) ~[logstash-core.jar:?]
at org.logstash.config.ir.CompiledPipeline$CompiledExecution.flatten(org/logstash/config/ir/CompiledPipeline.java:514) ~[logstash-core.jar:?]
at org.logstash.config.ir.CompiledPipeline$CompiledExecution.filterDataset(org/logstash/config/ir/CompiledPipeline.java:435) ~[logstash-core.jar:?]
at org.logstash.config.ir.CompiledPipeline$CompiledExecution.lambda$compileDependencies$6(org/logstash/config/ir/CompiledPipeline.java:537) ~[logstash-core.jar:?]
Note that the stack trace is printed to stdout, so when log.format=json is set the stack trace doesn't show up in the log output:
{"level":"INFO","loggerName":"logstash.javapipeline","timeMillis":1720711602230,"thread":"[main]-pipeline-manager","logEvent":{"message":"Starting pipeline","pipeline_id":"main","pipeline.workers":1,"pipeline.batch.size":125,"pipeline.batch.delay":50,"pipeline.max_inflight":125,"pipeline.sources":["/tmp/logstash-8.14.2/cfg"],"thread":"#<Thread:0x770e64f2 /private/tmp/logstash-8.14.2/logstash-core/lib/logstash/java_pipeline.rb:134 run>"}}
{"level":"FATAL","loggerName":"org.logstash.Logstash","timeMillis":1720711602251,"thread":"Ruby-0-Thread-9: /private/tmp/logstash-8.14.2/logstash-core/lib/logstash/java_pipeline.rb:289","logEvent":{"message":"uncaught error (in thread Ruby-0-Thread-9: /private/tmp/logstash-8.14.2/logstash-core/lib/logstash/java_pipeline.rb:289)"}}
This is solved by increasing Xss (or ThreadStackSize). In the example above with 2000 filters, setting it to 4m (instead of the default 2m) allows the pipeline to start.
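For reference, a minimal sketch of the two usual ways to raise the thread stack size, assuming a stock Logstash install (the 4m value simply mirrors the experiment above and is not a general recommendation):

# config/jvm.options: raise the per-thread stack size from the default
-Xss4m

# or, for a one-off run:
LS_JAVA_OPTS="-Xss4m" bin/logstash -f cfg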
Too Many Conditionals
Use the following to generate a pipeline with many nested conditionals (the one-liner also prints the generated config):
bin/ruby -e 'n = 2000; str = "input { heartbeat {} }\nfilter {\n" + (" if [message] == 1 {\n" * n) + " sleep { time => 0.001 }\n " + ("}" * n) + "\n}\noutput { stdout {} }"; IO.write("cfg", str); puts str'
It generates a pipeline in the shape of:
input { heartbeat {} }
filter {
if [message] == 1 {
[ thousands of ifs ]
if [message] == 1 {
sleep { time => 0.001 }
}}}
}
output { stdout {} }
Fails with:
[2024-07-11T16:40:57,767][ERROR][logstash.agent ] Failed to execute action {:action=>LogStash::PipelineAction::Create/pipeline_id:main, :exception=>"Java::JavaLang::StackOverflowError", :message=>"", :backtrace=>["private.tmp.logstash_minus_8_dot_14_dot_2.vendor.bundle.jruby.$3_dot_1_dot_0.gems.treetop_minus_1_dot_6_dot_12.lib.treetop.runtime.syntax_node.RUBY$method$initialize$0(/private/tmp/logstash-8.14.2/vendor/bundle/jruby/3.1.0/gems/treetop-1.6.12/lib/treetop/runtime/syntax_node.rb:7)", "private.tmp.logstash_minus_8_dot_14_dot_2.vendor.bundle.jruby.$3_dot_1_dot_0.gems.treetop_minus_1_dot_6_dot_12.lib.treetop.runtime.compiled_parser.RUBY$method$instantiate_node$0(/private/tmp/logstash-8.14.2/vendor/bundle/jruby/3.1.0/gems/treetop-1.6.12/lib/treetop/runtime/compiled_parser.rb:100)",
This is also solved by increasing Xss (or ThreadStackSize), using the same jvm.options / LS_JAVA_OPTS approach shown above. However, to get 2000 conditionals working, Xss had to be set to 20m, and startup took 4 minutes on a modern MacBook Pro.
Too Many Outputs
Use the following to generate a pipeline with many outputs:
bin/ruby -e 'n = 4000; str = "input { heartbeat {} }\noutput {\n" + (" stdout {}\n" * n) + "}"; IO.write("cfg", str)'
It generates a pipeline in the shape of:
input { heartbeat {} }
output {
stdout {}
[ thousands of outputs ]
stdout {}
}
With 4000 outputs Logstash fails with:
[2024-07-11T16:57:09,101][INFO ][logstash.javapipeline ][main] Starting pipeline {:pipeline_id=>"main", "pipeline.workers"=>1, "pipeline.batch.size"=>125, "pipeline.batch.delay"=>50, "pipeline.max_inflight"=>125, "pipeline.sources"=>["/tmp/logstash-8.14.2/cfg"], :thread=>"#<Thread:0x543b4d23 /private/tmp/logstash-8.14.2/logstash-core/lib/logstash/java_pipeline.rb:134 run>"}
[2024-07-11T16:57:12,736][ERROR][logstash.javapipeline ][main] Worker loop initialization error {:pipeline_id=>"main", :error=>"Compiling \"CompiledDataset2\" in \"null\": Code of method \"compute(Lorg/jruby/RubyArray;ZZ)Ljava/util/Collection;\" of class \"org.logstash.generated.CompiledDataset2\" grows beyond 64 KB", :exception=>Java::OrgCodehausCommonsCompiler::InternalCompilerException, :stacktrace=>"org.codehaus.janino.UnitCompiler.compile2(org/codehaus/janino/UnitCompiler.java:367)\norg.codehaus.janino.UnitCompiler.access$000(org/codehaus/janino/UnitCompiler.java:227)\norg.codehaus.janino.UnitCompiler$1.visitCompilationUnit(org/codehaus/janino/UnitCompiler.java:337)\norg.codehaus.janino.UnitCompiler$1.visitCompilationUnit
Increasing Xss doesn't help in this case, since it's a limit of code compilation: the generated compute method grows beyond the 64 KB limit on compiled method size.
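One possible mitigation, not verified here, would be to split the outputs across several pipelines so that each compiled pipeline stays well under the per-method limit. A sketch of what that could look like in config/pipelines.yml (pipeline ids and config paths below are made up for illustration):

# config/pipelines.yml: hypothetical split of the outputs into two pipelines
- pipeline.id: outputs-a
  path.config: "/tmp/outputs-a.conf"   # first half of the stdout outputs
- pipeline.id: outputs-b
  path.config: "/tmp/outputs-b.conf"   # second half of the stdout outputs

Whether this actually avoids the compilation limit for very large output counts has not been tested as part of this exercise.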