Getting "dump an error event: error_class=Fluent::Plugin::ConcatFilter::TimeoutError" #37

@ntanh1

Hi team,

I'm using the concat plugin v2.1.0 in my Fluentd container. The config is as follows:

<source>
  type forward
  port 24224
  bind 0.0.0.0
</source>

<filter *.*>
  @type concat
  key log
  separator ""
  stream_identity_key container_id
  multiline_start_regexp /^---SL---/
  multiline_end_regexp /^---EL---&/
  flush_interval 10
</filter>

<filter *.*>
  @type record_transformer

  #this allows ruby syntax in the below conversion
  enable_ruby true
  <record>
    log ${record["log"] != nil ? record["log"].sub('---SL---','') : ''}
  </record>
</filter>

<filter *.*>
  @type record_transformer

  #this allows ruby syntax in the below conversion
  enable_ruby true
  <record>
    log ${record["log"] != nil ? record["log"].sub('---EL---&','') : ''}
  </record>
</filter>

further processing

................................................

So each log event marks its start with ---SL--- and its end with ---EL---&.
The events come from a Java app running in another container, which uses the Docker fluentd logging driver.
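
To make the marker handling concrete, here is a minimal Ruby sketch of what the two record_transformer expressions above do to a record once the event has been concatenated. `strip_markers` and the sample record are hypothetical and only for illustration; the `sub` calls and the nil check are the ones from the config.

```ruby
# Hypothetical helper that mirrors the two record_transformer filters:
# return '' when "log" is missing, otherwise remove the first occurrence
# of each marker.
def strip_markers(record)
  log = record["log"]
  return "" if log.nil?

  log.sub('---SL---', '').sub('---EL---&', '')
end

# Shortened sample record (assumed shape, JSON body abbreviated).
sample = { "log" => "---SL---{\"level\":\"INFO\"}---EL---&\r" }

# p shows the raw string, including any trailing control characters.
p strip_markers(sample)
```

Note that `sub` only removes the marker text itself, so any character that follows the end marker (such as a trailing `\r`) stays in the record.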

The problem is that I'm getting a timeout flush for seemingly random events, e.g.:

2017-09-26 07:27:16 +0000 [warn]: #0 fluent/log.rb:336:call: dump an error event: error_class=Fluent::Plugin::ConcatFilter::TimeoutError error="Timeout flush: docker.5eabe1bb5d52:5eabe1bb5d52251a978270f141ba2657c1e2ac3a5febfe081e48f0039aae7646" tag="docker.5eabe1bb5d52" time=#<Fluent::EventTime:0x007fd9c861a9c8 @sec=1506410836, @nsec=182951490> record={"container_name"=>"/gfast-sim-id-1-1", "source"=>"stdout", "log"=>"---SL---{\"date\":1506410826922,\"level\":\"INFO\",\"thread\":\"TestANV.1-1-1-Thread-1\",\"category\":\"com.alcatel.netconf.simulator.fwk.DeviceServer\",\"message\":\"TestANV.1-1-1 says hello with : NetconfClientInfo{username\\u003d\\u0027TLS-CLIENT\\u0027, sessionId\\u003d159, m_remoteHost\\u003d\\u0027anv\\u0027, m_remotePort\\u003d\\u00276524\\u0027}\"}---EL---&\r", "container_id"=>"5eabe1bb5d52251a978270f141ba2657c1e2ac3a5febfe081e48f0039aae7646"}

As you can see, the log event is complete: there are no further pieces of that event to wait for and concatenate. So I cannot understand why the timeout happened.
It occurs quite randomly, sometimes with TestANV.1-1-1, sometimes with TestANV.1-1-2 (I have 5 such entities).
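
One way to check the markers outside Fluentd is to run the two patterns from the concat filter against the `log` value from the warning. This is a minimal sketch, assuming the plugin applies each pattern to the whole `log` string with an ordinary Ruby regexp match. `first_line?`, `last_line?` and the shortened `SAMPLE_LOG` are hypothetical names for illustration, not the plugin's API.

```ruby
# Patterns copied verbatim from the concat filter config.
START_RE = /^---SL---/
END_RE   = /^---EL---&/

# Hypothetical helpers: true when the pattern matches anywhere in the text.
def first_line?(text)
  !!(START_RE =~ text)
end

def last_line?(text)
  !!(END_RE =~ text)
end

# Shortened version of the "log" value from the warning: the JSON body is
# abbreviated, but both markers and the trailing "\r" are kept as reported.
SAMPLE_LOG = "---SL---{\"level\":\"INFO\"}---EL---&\r"

puts "start marker matched: #{first_line?(SAMPLE_LOG)}" # prints "start marker matched: true"
puts "end marker matched: #{last_line?(SAMPLE_LOG)}"    # prints "end marker matched: false"
```

In Ruby, `^` anchors at the start of a line, so a pattern beginning with `^` can only match a marker that sits at the very start of the string or directly after a newline. A record whose end marker comes at the end of the string, followed by `\r`, is therefore worth checking this way.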

Can someone please help?
