Add functionality to the modify filter to move fields to the start or end #6103

Merged 1 commit on Oct 21, 2022

Contributor

@seveas seveas commented Sep 26, 2022

In logging pipelines where downstream log receivers inspect only part of each message for efficiency reasons, it is useful to have known-important fields at the start of the message and known-large fields at the end. This patch adds move_to_start and move_to_end rules to the modify filter so that a user can arrange this.

The specific use case for which I wrote this is sending exceptions with giant backtraces via Fluent Bit to Kafka and then Splunk. Our Splunk ingestor inspects each message for a splunk_index field to determine which index to route to. For efficiency reasons, it only inspects the first few hundred bytes. This patch lets me always move the splunk_index field to the front and the backtrace to the back, so the ingestor can find the routing field within the bytes it inspects.
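
To illustrate why field order matters here, the following is a minimal Python sketch, not the filter's C implementation. The helper names (move_fields, find_index_in_window), the 200-byte window, the JSON serialization, and the sample record are all assumptions made for illustration. Prefix matching is also an assumption, suggested by `move_to_start splunk_` matching splunk_index in the example configuration.

```python
import json


def move_fields(record, start_prefix=None, end_prefix=None):
    """Return a copy of record with matching keys moved to the front or back.

    Hypothetical model only: keys are matched by prefix, and the relative
    order of all other keys is preserved.
    """
    items = list(record.items())
    if start_prefix is not None:
        front = [kv for kv in items if kv[0].startswith(start_prefix)]
        rest = [kv for kv in items if not kv[0].startswith(start_prefix)]
        items = front + rest
    if end_prefix is not None:
        back = [kv for kv in items if kv[0].startswith(end_prefix)]
        rest = [kv for kv in items if not kv[0].startswith(end_prefix)]
        items = rest + back
    return dict(items)


def find_index_in_window(payload, window=200):
    """Mimic a receiver that only looks at the first `window` bytes."""
    head = payload[:window]
    marker = b'"splunk_index": "'
    pos = head.find(marker)
    if pos < 0:
        return None
    start = pos + len(marker)
    end = head.find(b'"', start)
    return head[start:end].decode() if end >= 0 else None


record = {
    "request_id": "abc123",
    "backtrace": "frame\n" * 500,  # stand-in for a giant backtrace
    "splunk_index": "exceptions",
}

before = json.dumps(record).encode()
after = json.dumps(move_fields(record, "splunk_", "backtrace")).encode()

print(find_index_in_window(before))  # None
print(find_index_in_window(after))   # exceptions
```

In the unmodified record the routing field sits behind several kilobytes of backtrace, so a receiver that reads only a short prefix never sees it. After reordering, it is the first key in the serialized message.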


Enter [N/A] in the box, if an item is not applicable to your change.

Testing
Before we can approve your change, please submit the following in a comment:

  • Example configuration file for the change
  • Debug log output from testing the change
  • Attached Valgrind output that shows no leaks or memory corruption was found

If this is a change to packaging of containers or native binaries then please confirm it works for all targets.

Documentation

  • Documentation required for this feature

Backporting

  • Backport to latest stable release.

Fluent Bit is licensed under Apache 2.0, by submitting this pull request I understand that this code will be released under the terms of that license.

Contributor Author

seveas commented Sep 26, 2022

Example configuration, logfile input and debug log:

[INPUT]
    name tail
    read_from_head true
    path random.log
    parser logfmt
    exit_on_eof true

[FILTER]
    name modify
    match *
    move_to_start splunk_
    move_to_end backtrace

Input file (random.log):

XsI2p=/YTi8OBmUY splunk_index=exceptions fBUwQ=qsu68ms3VV backtrace="Back trace here" P+aQn=Hlh4+4XGVR M2cPv=LHrTQcBrhQ Zv9Ii=RRQF+ovKxF
backtrace="Back trace here" zxdbL=ujMjUm0DtR nHHnQ=WQw5bQdrrX splunk_index=exceptions 2geJD=8ni4bADWio sq/tR=MMFV6b0Wi6 r2RkK=odGYgfaVxz
splunk_index=exceptions dVK8C=7KCIUKxu0u 8P9GF=ust/rfkqKc KBSUr=QsAwe0qVBS SrpVi=9DDVS7YYUd backtrace="Back trace here" pw6e1=ihl4hDxNnl
m93CB=Rb2XpoNoZy Bh2Nf=nZn54wqRUq xh1YH=xTyL+KyId7 DqdsL=DjeYdll9nN splunk_index=exceptions backtrace="Back trace here" zhbRP=EFvKPtorhm
splunk_index=exceptions KRMLo=NhuhCNgoBy X0WYQ=v4WAmgYGw8 backtrace="Back trace here" rXHMG=PmNo0fxIlY fRX0P=J7FQ5Zaru6 thCJH=feuf+2sGRT
PyBFy=thSoPXnOhx IGirj=DEzW+jfWxM kUKRZ=UxBAOSbjy3 s//aT=25C/b0vOOr splunk_index=exceptions backtrace="Back trace here" 4JWgr=gd0pHy9cot
5jviw=DXPZLaEcps splunk_index=exceptions backtrace="Back trace here" BmcIR=AKdCWXEIky ZyCFJ=qAbDAd53/3 auOMg=WGZUBeqfCj OLl3s=Z1rJut6w9z
splunk_index=exceptions ieD/u=mOLfubIVac backtrace="Back trace here" bvdhB=YPSzQQO7r6 riMD6=9HkSCeyWdJ ilhca=VS0VeDb3HK lqhHb=flEP97ubLm
hrzdc=OSOaZcvuCB MzlD5=jaBhcZwwvR n2d2s=yaFfWm7vj1 MLy3t=tmgKomT362 VyjBc=m3O2yIwuBx splunk_index=exceptions backtrace="Back trace here"
9Mtma=FLuVDRbcoC backtrace="Back trace here" hVzqY=UfChGRo3D4 Wo7Sv=iD0LeEQdyU lg08P=8Hslz6j6UY JmQkc=ynx+F5W34v splunk_index=exceptions
backtrace="Back trace here" OKbSL=p3jIjqH5Kw +G00j=YU1uJgTVtG splunk_index=exceptions Jht8y=+Vv9XoDOMn Ear2n=wyEONNPGNE JIdhp=WfnF6ksJBz
d72fL=FkBgOTq9YX 4IiVR=PalJYDgre4 eyIPN=gFnMUZuZQ2 ucEam=GWW0sm3EF5 backtrace="Back trace here" ctdpz=XCqEw75Vvm splunk_index=exceptions
4NmHg=gUSHcd301d H/W/s=yz88/SkJqd 4/m1Y=/vNzmne10y l7XJf=v6+wrOwmSf splunk_index=exceptions backtrace="Back trace here" 7wlvw=l1VOUgeE0g
splunk_index=exceptions backtrace="Back trace here" k/f/g=CNWC7BainY K9x9E=F6KmL/c9s7 zMLVX=uvjjFjl9jc lMlkj=415e+glN1t 2cYb9=iTUzWPThBK
NMIQj=WkwYwV7tZ1 splunk_index=exceptions wdiVJ=wBVAoKa+Xd BS5zK=ijnvv8H8jU Sg/AO=oWc7txKZDS backtrace="Back trace here" POPQi=WTMyUiDHBi
Rw688=64L8ms153a nAUa4=cIdMgp4qJN YYZlS=pk0SdRW8Rh splunk_index=exceptions UQYDv=q44KRFnL4r backtrace="Back trace here" OmswD=8jPBGqaEvY
Cvc7W=xyX8xKL6WP O8jDh=yiQbqCpUex UTAw7=rzo7qbWtpz splunk_index=exceptions backtrace="Back trace here" 593iN=IQh5Mg5MOW weETh=9sjMb7AJsi
dul1G=Er34iROdwI tYwHQ=tsYmAKsnHS backtrace="Back trace here" gwahk=xcOPtSb8P9 splunk_index=exceptions fyz0C=jV7YvioZ15 ZGCli=VWSHJwdmou
lR/61=74J/60aWDU xZBZd=9m05PdS84y 5+sbu=oakOsMwmhy +O71+=CUWPpW5Ctl backtrace="Back trace here" miESW=7KdDEkh7vD splunk_index=exceptions
1hc21=o3DUqfgxHk splunk_index=exceptions A9taj=JAu+yPFlmH dF4IF=+74LkysATt backtrace="Back trace here" a7fTB=GNsvcVEv45 i/KFA=6WZm7K0fSF

Debug log:

Fluent Bit v2.0.0
* Copyright (C) 2015-2022 The Fluent Bit Authors
* Fluent Bit is a CNCF sub-project under the umbrella of Fluentd
* https://fluentbit.io

[2022/09/26 19:48:18] [ info] Configuration:
[2022/09/26 19:48:18] [ info]  flush time     | 1.000000 seconds
[2022/09/26 19:48:18] [ info]  grace          | 5 seconds
[2022/09/26 19:48:18] [ info]  daemon         | 0
[2022/09/26 19:48:18] [ info] ___________
[2022/09/26 19:48:18] [ info]  inputs:
[2022/09/26 19:48:18] [ info]      tail
[2022/09/26 19:48:18] [ info] ___________
[2022/09/26 19:48:18] [ info]  filters:
[2022/09/26 19:48:18] [ info]      modify.0
[2022/09/26 19:48:18] [ info] ___________
[2022/09/26 19:48:18] [ info]  outputs:
[2022/09/26 19:48:18] [ info]      stdout.0
[2022/09/26 19:48:18] [ info] ___________
[2022/09/26 19:48:18] [ info]  collectors:
[2022/09/26 19:48:18] [ info] [fluent bit] version=2.0.0, commit=bcbbec0e6b, pid=54999
[2022/09/26 19:48:18] [debug] [engine] coroutine stack size: 12288 bytes (12.0K)
[2022/09/26 19:48:18] [ info] [storage] version=1.2.0, type=memory-only, sync=normal, checksum=disabled, max_chunks_up=128
[2022/09/26 19:48:18] [ info] [cmetrics] version=0.4.0
[2022/09/26 19:48:18] [debug] [tail:tail.0] created event channels: read=21 write=22
[2022/09/26 19:48:18] [debug] [input:tail:tail.0] flb_tail_fs_stat_init() initializing stat tail input
[2022/09/26 19:48:18] [debug] [input:tail:tail.0] scanning path random.log
[2022/09/26 19:48:18] [debug] [input:tail:tail.0] inode=116645693 with offset=0 appended as random.log
[2022/09/26 19:48:18] [debug] [input:tail:tail.0] scan_glob add(): random.log, inode 116645693
[0] tail.0: [1664214498.583125000, {"splunk_index"=>"exceptions", "XsI2p"=>"/YTi8OBmUY", "fBUwQ"=>"qsu68ms3VV", "P+aQn"=>"Hlh4+4XGVR", "M2cPv"=>"LHrTQcBrhQ", "Zv9Ii"=>"RRQF+ovKxF", "backtrace"=>"Back trace here"}]
[2022/09/26 19:48:18] [debug] [input:tail:tail.0] 1 new files found on path 'random.log'
[1] tail.0: [1664214498.583131000, {"splunk_index"=>"exceptions", "zxdbL"=>"ujMjUm0DtR", "nHHnQ"=>"WQw5bQdrrX", "2geJD"=>"8ni4bADWio", "sq/tR"=>"MMFV6b0Wi6", "r2RkK"=>"odGYgfaVxz", "backtrace"=>"Back trace here"}]
[2022/09/26 19:48:18] [ info] [output:stdout:stdout.0] worker #0 started
[2022/09/26 19:48:18] [debug] [filter:modify:modify.0] Initialized modify filter with 0 conditions and 2 rules
[2] tail.0: [1664214498.583134000, {"splunk_index"=>"exceptions", "dVK8C"=>"7KCIUKxu0u", "8P9GF"=>"ust/rfkqKc", "KBSUr"=>"QsAwe0qVBS", "SrpVi"=>"9DDVS7YYUd", "pw6e1"=>"ihl4hDxNnl", "backtrace"=>"Back trace here"}]
[2022/09/26 19:48:18] [debug] [stdout:stdout.0] created event channels: read=28 write=29
[3] tail.0: [1664214498.583136000, {"splunk_index"=>"exceptions", "m93CB"=>"Rb2XpoNoZy", "Bh2Nf"=>"nZn54wqRUq", "xh1YH"=>"xTyL+KyId7", "DqdsL"=>"DjeYdll9nN", "zhbRP"=>"EFvKPtorhm", "backtrace"=>"Back trace here"}]
[2022/09/26 19:48:18] [debug] [router] match rule tail.0:stdout.0
[2022/09/26 19:48:18] [ info] [http_server] listen iface=127.0.0.1 tcp_port=2020
[2022/09/26 19:48:18] [ info] [sp] stream processor started
[4] tail.0: [1664214498.583139000, {"splunk_index"=>"exceptions", "KRMLo"=>"NhuhCNgoBy", "X0WYQ"=>"v4WAmgYGw8", "rXHMG"=>"PmNo0fxIlY", "fRX0P"=>"J7FQ5Zaru6", "thCJH"=>"feuf+2sGRT", "backtrace"=>"Back trace here"}]
[2022/09/26 19:48:18] [debug] [input chunk] update output instances with new chunk size diff=2940
[2022/09/26 19:48:18] [debug] [input:tail:tail.0] [static files] processed 2.7K
[5] tail.0: [1664214498.583140000, {"splunk_index"=>"exceptions", "PyBFy"=>"thSoPXnOhx", "IGirj"=>"DEzW+jfWxM", "kUKRZ"=>"UxBAOSbjy3", "s//aT"=>"25C/b0vOOr", "4JWgr"=>"gd0pHy9cot", "backtrace"=>"Back trace here"}]
[2022/09/26 19:48:18] [ info] [input:tail:tail.0] inode=116645693 file=random.log ended, stop
[2022/09/26 19:48:18] [debug] [input:tail:tail.0] inode=116645693 file=random.log promote to TAIL_EVENT
[2022/09/26 19:48:18] [debug] [input:tail:tail.0] [static files] processed 0b, done
[2022/09/26 19:48:18] [debug] [task] created task=0x600001df4000 id=0 OK
[2022/09/26 19:48:18] [debug] [output:stdout:stdout.0] task_id=0 assigned to thread #0
[6] tail.0: [1664214498.583142000, {"splunk_index"=>"exceptions", "5jviw"=>"DXPZLaEcps", "BmcIR"=>"AKdCWXEIky", "ZyCFJ"=>"qAbDAd53/3", "auOMg"=>"WGZUBeqfCj", "OLl3s"=>"Z1rJut6w9z", "backtrace"=>"Back trace here"}]
[2022/09/26 19:48:18] [ warn] [engine] service will shutdown in max 5 seconds
[2022/09/26 19:48:18] [ info] [input] pausing tail.0
[7] tail.0: [1664214498.583144000, {"splunk_index"=>"exceptions", "ieD/u"=>"mOLfubIVac", "bvdhB"=>"YPSzQQO7r6", "riMD6"=>"9HkSCeyWdJ", "ilhca"=>"VS0VeDb3HK", "lqhHb"=>"flEP97ubLm", "backtrace"=>"Back trace here"}]
[8] tail.0: [1664214498.583146000, {"splunk_index"=>"exceptions", "hrzdc"=>"OSOaZcvuCB", "MzlD5"=>"jaBhcZwwvR", "n2d2s"=>"yaFfWm7vj1", "MLy3t"=>"tmgKomT362", "VyjBc"=>"m3O2yIwuBx", "backtrace"=>"Back trace here"}]
[9] tail.0: [1664214498.583148000, {"splunk_index"=>"exceptions", "9Mtma"=>"FLuVDRbcoC", "hVzqY"=>"UfChGRo3D4", "Wo7Sv"=>"iD0LeEQdyU", "lg08P"=>"8Hslz6j6UY", "JmQkc"=>"ynx+F5W34v", "backtrace"=>"Back trace here"}]
[10] tail.0: [1664214498.583150000, {"splunk_index"=>"exceptions", "OKbSL"=>"p3jIjqH5Kw", "+G00j"=>"YU1uJgTVtG", "Jht8y"=>"+Vv9XoDOMn", "Ear2n"=>"wyEONNPGNE", "JIdhp"=>"WfnF6ksJBz", "backtrace"=>"Back trace here"}]
[11] tail.0: [1664214498.583152000, {"splunk_index"=>"exceptions", "d72fL"=>"FkBgOTq9YX", "4IiVR"=>"PalJYDgre4", "eyIPN"=>"gFnMUZuZQ2", "ucEam"=>"GWW0sm3EF5", "ctdpz"=>"XCqEw75Vvm", "backtrace"=>"Back trace here"}]
[12] tail.0: [1664214498.583154000, {"splunk_index"=>"exceptions", "4NmHg"=>"gUSHcd301d", "H/W/s"=>"yz88/SkJqd", "4/m1Y"=>"/vNzmne10y", "l7XJf"=>"v6+wrOwmSf", "7wlvw"=>"l1VOUgeE0g", "backtrace"=>"Back trace here"}]
[13] tail.0: [1664214498.583156000, {"splunk_index"=>"exceptions", "k/f/g"=>"CNWC7BainY", "K9x9E"=>"F6KmL/c9s7", "zMLVX"=>"uvjjFjl9jc", "lMlkj"=>"415e+glN1t", "2cYb9"=>"iTUzWPThBK", "backtrace"=>"Back trace here"}]
[14] tail.0: [1664214498.583157000, {"splunk_index"=>"exceptions", "NMIQj"=>"WkwYwV7tZ1", "wdiVJ"=>"wBVAoKa+Xd", "BS5zK"=>"ijnvv8H8jU", "Sg/AO"=>"oWc7txKZDS", "POPQi"=>"WTMyUiDHBi", "backtrace"=>"Back trace here"}]
[15] tail.0: [1664214498.583159000, {"splunk_index"=>"exceptions", "Rw688"=>"64L8ms153a", "nAUa4"=>"cIdMgp4qJN", "YYZlS"=>"pk0SdRW8Rh", "UQYDv"=>"q44KRFnL4r", "OmswD"=>"8jPBGqaEvY", "backtrace"=>"Back trace here"}]
[16] tail.0: [1664214498.583161000, {"splunk_index"=>"exceptions", "Cvc7W"=>"xyX8xKL6WP", "O8jDh"=>"yiQbqCpUex", "UTAw7"=>"rzo7qbWtpz", "593iN"=>"IQh5Mg5MOW", "weETh"=>"9sjMb7AJsi", "backtrace"=>"Back trace here"}]
[17] tail.0: [1664214498.583163000, {"splunk_index"=>"exceptions", "dul1G"=>"Er34iROdwI", "tYwHQ"=>"tsYmAKsnHS", "gwahk"=>"xcOPtSb8P9", "fyz0C"=>"jV7YvioZ15", "ZGCli"=>"VWSHJwdmou", "backtrace"=>"Back trace here"}]
[18] tail.0: [1664214498.583165000, {"splunk_index"=>"exceptions", "lR/61"=>"74J/60aWDU", "xZBZd"=>"9m05PdS84y", "5+sbu"=>"oakOsMwmhy", "+O71+"=>"CUWPpW5Ctl", "miESW"=>"7KdDEkh7vD", "backtrace"=>"Back trace here"}]
[19] tail.0: [1664214498.583167000, {"splunk_index"=>"exceptions", "1hc21"=>"o3DUqfgxHk", "A9taj"=>"JAu+yPFlmH", "dF4IF"=>"+74LkysATt", "a7fTB"=>"GNsvcVEv45", "i/KFA"=>"6WZm7K0fSF", "backtrace"=>"Back trace here"}]
[2022/09/26 19:48:18] [debug] [out flush] cb_destroy coro_id=0
[2022/09/26 19:48:18] [debug] [task] destroy task=0x600001df4000 (task_id=0)
[2022/09/26 19:48:19] [ info] [engine] service has stopped (0 pending tasks)
[2022/09/26 19:48:19] [ info] [input] pausing tail.0
[2022/09/26 19:48:19] [debug] [input:tail:tail.0] inode=116645693 removing file name random.log
[2022/09/26 19:48:19] [ info] [output:stdout:stdout.0] thread worker #0 stopping...
[2022/09/26 19:48:19] [ info] [output:stdout:stdout.0] thread worker #0 stopped

… end
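
As a sanity check on the output above, the first line of random.log and record [0] can be compared with a small Python model. This is a sketch under assumptions: parse_logfmt and reorder are hypothetical helpers, shlex is used as a stand-in for the logfmt parser, and keys are matched by prefix. A stable sort keeps the relative order of the fields that are not moved.

```python
import shlex


def parse_logfmt(line):
    """Parse a simple logfmt line into a dict, keeping the field order."""
    record = {}
    for token in shlex.split(line):  # shlex handles the quoted backtrace value
        key, _, value = token.partition("=")
        record[key] = value
    return record


def reorder(record, start_prefix, end_prefix):
    """Model of move_to_start / move_to_end using a stable sort by rank."""
    def rank(item):
        key = item[0]
        if key.startswith(end_prefix):
            return 2  # goes to the end
        if key.startswith(start_prefix):
            return 0  # goes to the start
        return 1      # stays in between, original order preserved
    return dict(sorted(record.items(), key=rank))


# First line of the sample input file.
line = (
    'XsI2p=/YTi8OBmUY splunk_index=exceptions fBUwQ=qsu68ms3VV '
    'backtrace="Back trace here" P+aQn=Hlh4+4XGVR M2cPv=LHrTQcBrhQ '
    'Zv9Ii=RRQF+ovKxF'
)

record = reorder(parse_logfmt(line), "splunk_", "backtrace")
print(list(record))
# ['splunk_index', 'XsI2p', 'fBUwQ', 'P+aQn', 'M2cPv', 'Zv9Ii', 'backtrace']
```

The key order printed by the model matches record [0] in the debug log: splunk_index first, backtrace last, and the remaining fields in their input order.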

In logging pipelines where downstream log receivers only inspect part of
the message for efficiency reasons, it is useful to have known-important
fields at the start of the message and/or known-large fields at the end.
This lets a user do so.

Signed-off-by: Dennis Kaarsemaker <dennis@kaarsemaker.net>
@edsiper edsiper merged commit bb84001 into fluent:master Oct 21, 2022
Member

edsiper commented Oct 21, 2022

thanks. notes:

  • please make sure to prefix your commits with the component name being modified
  • please submit another PR with a unit case for this functionality
  • submit an update to fluent-bit-docs

seveas added a commit to seveas/fluent-bit-docs that referenced this pull request Oct 21, 2022
These were added in fluent/fluent-bit#6103

Signed-off-by: Dennis Kaarsemaker <dennis@kaarsemaker.net>
lecaros pushed a commit to fluent/fluent-bit-docs that referenced this pull request Oct 21, 2022
These were added in fluent/fluent-bit#6103

Signed-off-by: Dennis Kaarsemaker <dennis@kaarsemaker.net>
mgeriesa pushed a commit to mgeriesa/fluent-bit that referenced this pull request Oct 25, 2022
…to the start or end (fluent#6103)

Signed-off-by: Dennis Kaarsemaker <dennis@kaarsemaker.net>
Signed-off-by: Manal Geries <mgeriesa@gmail.com>
sumitd2 pushed a commit to sumitd2/fluent-bit that referenced this pull request Feb 8, 2023
…to the start or end (fluent#6103)

Signed-off-by: Dennis Kaarsemaker <dennis@kaarsemaker.net>
Signed-off-by: root <root@sumit-acs.novalocal>