
Check lua transform for memory leaks #1721

Closed
MOZGIII opened this issue Feb 5, 2020 · 2 comments · Fixed by #1990
Labels
domain: performance (Anything related to Vector's performance)
transform: lua (Anything `lua` transform related)
type: enhancement (A value-adding code change that enhances its existing functionality)
type: task (Generic non-code related tasks)

Comments

MOZGIII (Contributor) commented Feb 5, 2020

Some time ago there was an internal discussion with the team suggesting that the lua transform leaks memory. This issue is a follow-up to that discussion.

We need to check what is going on with the lua transform. It might not be a "did not free" kind of leak (a quick check sketch is included after the related issues below).

Related to:
#825
#1549
#1708
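
One quick way to narrow this down is to watch the Lua-side heap from inside the transform itself: if collectgarbage("count") keeps climbing even right after a forced collection, the extra memory is still reachable from Lua (accumulating state) rather than simply not freed. A minimal sketch in plain standard Lua, assuming the transform keeps a single Lua state across events; the counter global, the 10000-event threshold, and the use of print as a stand-in for whatever logging is convenient are all made up for illustration, not a confirmed Vector API:

-- collectgarbage("collect") forces a full GC cycle; collectgarbage("count")
-- then reports the Lua-managed heap in KiB. A value that keeps growing after
-- forced collections points at reachable state, not merely unfreed memory.
counter = (counter or 0) + 1
if counter % 10000 == 0 then
  collectgarbage("collect")
  print(string.format("lua heap after gc: %.1f KiB", collectgarbage("count")))
end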

MOZGIII added this to the Raise reliability confidence milestone Feb 5, 2020
MOZGIII self-assigned this Feb 5, 2020
binarylogic added the transform: lua, type: performance, and type: task labels Feb 6, 2020
binarylogic (Contributor) commented

@MOZGIII I was able to get a copy of their config:

Running: vector 0.7.1 (v0.7.1 x86_64-unknown-linux-musl 2020-01-24)

data_dir = "/var/lib/vector"

[sources.in]
type = "syslog"
mode = "tcp"
address = "0.0.0.0:5240"

##
## company-api
##
[transforms.company_api]
type = "field_filter"
inputs = ["in"]
field = "appname"
value = "company-api"

[transforms.company_api_json]
type = "json_parser"
inputs = ["company_api"]
drop_invalid = true

[transforms.company_api_lua]
type = "lua"
inputs = ["company_api_json"]
source = """
event["time"] = math.floor(event["timestamp"])
event["hostname"] = event["host"]
event["metadata_trace_id"] = event["metadata.trace_id"]
event["metadata_guild_id"] = event["metadata.guild_id"]
event["metadata_channel_id"] = event["metadata.channel_id"]
event["metadata_method"] = event["metadata.method"]
"""

[transforms.company_api_clean]
type = "remove_fields"
inputs = ["company_api_lua"]
fields = [
    # company-api
    "metadata.trace_id",
    "metadata.guild_id",
    "metadata.channel_id",
    "metadata.method",
    "metadata.error_code",
    "metadata.reactions_scanned",
    "metadata.min_message_id",
    "metadata.max_message_id",
    "metadata.mention_everyone",
    "metadata.mentions",
    "metadata.mention_roles",
    "metadata.circuit_breaker_tripped",
    "metadata.rpc_ip",
    "metadata.user_id",
    "metadata.user_ip",
    "metadata.session_id",
    "metadata.shard",
    "metadata.shard[0]",
    "metadata.shard[1]",
    "metadata.connection_type",
    "metadata.connection_id",
    "metadata.message_id",
    # rsyslog
    "timestamp",
    "pid",
    "appname",
    "facility",
    "host",
    "msgid",
    "procid",
    "severity",
    "version"
]

[sinks.company_api_clickhouse]
type = "clickhouse"
inputs = ["company_api_clean"]
host = "http://clickhouse-url:4213"
table = "company_api"
in_flight_limit = 100
rate_limit_duration_secs = 5
rate_limit_num = 60

[sinks.company_api_clickhouse.buffer]
type = "memory"
max_events = 1000
when_full = "block"

[sinks.company_api_clickhouse.batch]
max_size = 5245000
timeout_secs = 5

##
## company-admin
##
[transforms.company_admin]
type = "field_filter"
inputs = ["in"]
field = "appname"
value = "company-admin"

[transforms.company_admin_json]
type = "json_parser"
inputs = ["company_admin"]
drop_invalid = true

[transforms.company_admin_lua]
type = "lua"
inputs = ["company_admin_json"]
source = """
event["time"] = math.floor(event["timestamp"])
event["hostname"] = event["host"]
event["metadata_trace_id"] = event["metadata.trace_id"]
event["metadata_method"] = event["metadata.method"]
"""

[transforms.company_admin_clean]
type = "remove_fields"
inputs = ["company_admin_lua"]
fields = [
    # company-admin
    "metadata.trace_id",
    "metadata.guild_id",
    "metadata.channel_id",
    "metadata.method",
    "metadata.error_code",
    "metadata.reactions_scanned",
    "metadata.min_message_id",
    "metadata.max_message_id",
    "metadata.mention_everyone",
    "metadata.mentions",
    "metadata.mention_roles",
    "metadata.circuit_breaker_tripped",
    "metadata.rpc_ip",
    "metadata.user_id",
    "metadata.user_ip",
    "metadata.session_id",
    "metadata.shard",
    "metadata.shard[0]",
    "metadata.shard[1]",
    "metadata.connection_type",
    "metadata.connection_id",
    "metadata.message_id",
    # rsyslog
    "timestamp",
    "pid",
    "appname",
    "facility",
    "host",
    "msgid",
    "procid",
    "severity",
    "version"
]

[sinks.company_admin_clickhouse]
type = "clickhouse"
inputs = ["company_admin_clean"]
host = "http://clickhouse-url:4213"
table = "company_admin"

##
## company-media-proxy
##
[transforms.company_media_proxy]
type = "field_filter"
inputs = ["in"]
field = "appname"
value = "company-media-proxy"

[transforms.company_media_proxy_json]
type = "json_parser"
inputs = ["company_media_proxy"]
drop_invalid = true

[transforms.company_media_proxy_lua]
type = "lua"
inputs = ["company_media_proxy_json"]
source = """
event["time"] = math.floor(event["ts"])
event["hostname"] = event["host"]
"""

[transforms.company_media_proxy_clean]
type = "remove_fields"
inputs = ["company_media_proxy_lua"]
fields = [
    # company-media-proxy
    "ts",
    "level",
    "path",
    "error",
    "type",
    "height_px",
    "width_px",
    "width",
    "height",
    "ext",
    "size",
    "size_bytes",
    # rsyslog
    "timestamp",
    "pid",
    "appname",
    "facility",
    "host",
    "msgid",
    "procid",
    "severity",
    "version"
]

[sinks.company_media_proxy_clickhouse]
type = "clickhouse"
inputs = ["company_media_proxy_clean"]
host = "http://clickhouse-url:4213"
table = "company_media_proxy"
in_flight_limit = 100
rate_limit_duration_secs = 5
rate_limit_num = 60

[sinks.company_media_proxy_clickhouse.buffer]
type = "memory"
max_events = 1000
when_full = "block"

[sinks.company_media_proxy_clickhouse.batch]
max_size = 5245000
timeout_secs = 5

##
## company-unfurler
##
[transforms.company_unfurler]
type = "field_filter"
inputs = ["in"]
field = "appname"
value = "company-unfurler"

[transforms.company_unfurler_hostname]
type = "lua"
inputs = ["company_unfurler"]
source = """
event["hostname"] = event["host"]
"""

[transforms.company_unfurler_json]
type = "json_parser"
inputs = ["company_unfurler_hostname"]
drop_invalid = true

[transforms.company_unfurler_filter]
type = "field_filter"
inputs = ["company_unfurler_json"]
field = "msg"
value = "unfurl"

[transforms.company_unfurler_lua]
type = "lua"
inputs = ["company_unfurler_filter"]
source = """
event["time"] = event["ts"]:gsub("%+00:00", ""):gsub("%.%d+$", ""):gsub("T", " ")
"""

[transforms.company_unfurler_clean]
type = "remove_fields"
inputs = ["company_unfurler_lua"]
fields = [
  # company-unfurler
  "ts",
  "level",
  "duration",
  # rsyslog
  "timestamp",
  "pid",
  "appname",
  "facility",
  "msgid",
  "procid",
  "severity",
  "version"
]

[sinks.company_unfurler_clickhouse]
type = "clickhouse"
inputs = ["company_unfurler_clean"]
host = "http://clickhouse-url:4213"
table = "company_unfurler"
in_flight_limit = 100
rate_limit_duration_secs = 5
rate_limit_num = 60

[sinks.company_unfurler_clickhouse.buffer]
type = "memory"
max_events = 1000
when_full = "block"

##
## audit
##

[transforms.audit]
type = "field_filter"
inputs = ["in"]
field = "appname"
value = "audit"

[transforms.audit_lua]
type = "lua"
inputs = ["audit"]
source = """
event["tag"] = event["appname"]
event["hostname"] = event["host"]
event["pid"] = event["pid"]
event["content"] = event["message"]
event["time"] = event["timestamp"]:gsub("Z", ""):gsub("%.%d+$", ""):gsub("T", " ")
"""

[transforms.audit_clean]
type = "remove_fields"
inputs = ["audit_lua"]
fields = [
    # rsyslog
    "timestamp",
    "message",
    "pid",
    "appname",
    "facility",
    "host",
    "msgid",
    "procid",
    "severity",
    "version"
]

[sinks.audit_clickhouse]
type = "clickhouse"
inputs = ["audit_clean"]
host = "http://clickhouse-url:4213"
table = "syslog"
in_flight_limit = 100
rate_limit_duration_secs = 5
rate_limit_num = 60

[sinks.audit_clickhouse.buffer]
type = "memory"
max_events = 1000
when_full = "block"

##
## Debug
##
# [sinks.out]
# inputs = [
#   "audit_clean",
#   "company_api_clean",
#   "company_admin_clean",
#   "company_media_proxy_clean",
#   "company_unfurler_clean"
# ]
# type = "console"
# encoding = "json"

MOZGIII (Contributor, Author) commented Mar 1, 2020

Thanks, that will come in handy. I plan to hold off on this issue until #1549 closes, since that alone may solve the problem. If the bug persists (it should be easy to observe under load), I can start looking for the cause. I am fairly confident the execution model changes will resolve this.
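
To make "observe under load" concrete: one low-effort check is to poll the vector process's resident set size while traffic flows through the syslog source; VmRSS climbing steadily under constant load is the signal to dig further. A rough, Linux-only sketch in plain Lua (nothing here is Vector-specific; the PID is passed on the command line):

-- rss_watch.lua: print a process's VmRSS once per second (Linux /proc only).
-- Usage: lua rss_watch.lua <vector_pid>
local pid = assert(arg[1], "usage: lua rss_watch.lua <pid>")
while true do
  for line in io.lines("/proc/" .. pid .. "/status") do
    if line:match("^VmRSS:") then
      print(os.date("%H:%M:%S"), line)
    end
  end
  os.execute("sleep 1")
end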

ghost assigned ghost and unassigned MOZGIII Mar 5, 2020
ghost closed this as completed in #1990 Mar 6, 2020
binarylogic added the type: enhancement and domain: performance labels and removed the type: performance label Aug 6, 2020
binarylogic unassigned ghost Aug 6, 2020
This issue was closed.