out_pgsql: add a configurable value called "daemon" for out pgsql plugin #7215

Open: wants to merge 3 commits into base: master

@TomlinfreeGit TomlinfreeGit commented Apr 18, 2023

Update the out_pgsql plugin: add a configurable parameter that lets the plugin run in a daemon mode.
In daemon mode, a configuration error in out_pgsql does not cause Fluent Bit to crash or exit.

N/A

Enter [N/A] in the box, if an item is not applicable to your change.

Testing
Before we can approve your change, please submit the following in a comment:

  • Example configuration file for the change
  • Debug log output from testing the change
  • Attached Valgrind output that shows no leaks or memory corruption was found

If this is a change to packaging of containers or native binaries then please confirm it works for all targets.

  • Run local packaging test showing all targets (including any new ones) build.
  • Set ok-package-test label to test for all targets (requires maintainer to do).

Documentation

  • Documentation required for this feature

Backporting

  • Backport to latest stable release.

Fluent Bit is licensed under Apache 2.0, by submitting this pull request I understand that this code will be released under the terms of that license.

@TomlinfreeGit
Author

TomlinfreeGit commented Apr 18, 2023

Example config file for running out_pgsql in daemon mode:

[SERVICE]
    Flush  5
    Daemon  False
    Log_Level  error

[INPUT]
    name  cpu
    tag  cpu.local

[OUTPUT]
    name  pgsql
    host  192.168.142.132
    port  15432
    user  postgres
    password  yourpassword
    match  *
    database  agent
    table  test
    Daemon  True

[OUTPUT]
    name  stdout
    match  *

Run output and Valgrind result:

image
image


Example config file for running out_pgsql without daemon mode:

[SERVICE]
    Flush  5
    Daemon  False
    Log_Level  error

[INPUT]
    name  cpu
    tag  cpu.local

[OUTPUT]
    name  pgsql
    host  192.168.142.132
    port  15432
    user  postgres
    password  yourpassword
    match  *
    database  agent
    table  test
    Daemon  False

[OUTPUT]
    name  stdout
    match  *

Run output and Valgrind result:

image
image

@TomlinfreeGit
Author

Related documentation pull request:
fluent/fluent-bit-docs#1081

@TomlinfreeGit TomlinfreeGit temporarily deployed to pr April 21, 2023 15:43 — with GitHub Actions Inactive
@TomlinfreeGit TomlinfreeGit force-pushed the feature/daemon-out-pgsql branch from b4c4d34 to f3d62a4 Compare April 22, 2023 14:10
@TomlinfreeGit TomlinfreeGit changed the title Feature/daemon out pgsql out_pgsql: add a configurable value called "daemon" for out pgsql plugin Apr 22, 2023
@edsiper
Member

edsiper commented Apr 25, 2023

@sxd pls take a look

@sxd
Member

sxd commented Apr 25, 2023

@edsiper will take a look later today

@TomlinfreeGit the DCO check is failing. Can you please fix that in the meantime?

Best Regards!

@@ -143,6 +143,10 @@ int pgsql_next_connection(struct flb_pgsql_config *ctx)
struct mk_list *head;
int ret_conn = 1;

if (ctx ==NULL) {
Member

Another space is missing here too.

Author

fixed in the commit: out_pgsql: code-format &if config err then log err

@sxd
Member

sxd commented Apr 25, 2023

@TomlinfreeGit So the idea is to just restart, or to make the error not a crash here, right? What if the user doesn't see this error? How are they supposed to know that there's an error in the configuration?

Tom added 2 commits April 26, 2023 09:06
Signed-off-by: Tom <yao.lin@siemens.com>
Signed-off-by: Tom <yao.lin@siemens.com>
@TomlinfreeGit TomlinfreeGit force-pushed the feature/daemon-out-pgsql branch from f3d62a4 to 53ce527 Compare April 26, 2023 01:07
Signed-off-by: Tom <yao.lin@siemens.com>
@TomlinfreeGit
Author

Yes. With other output plugins, a configuration error does not crash Fluent Bit: the plugin just logs an error when it fails to flush data, and the failure can be noticed through the health check. With out_pgsql, however, a configuration error crashes Fluent Bit directly.

  1. The purpose of this PR is to let out_pgsql handle configuration errors the same way as other output plugins.
  2. out_pgsql configuration errors are logged both at init and at flush. Users who miss the logs can still notice the problem through the health check.
    image
    image
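
A minimal, self-contained C sketch of the behaviour described in the two points above. It is not the code in this PR and does not use Fluent Bit's real plugin API: `sketch_ctx`, `sketch_init`, `sketch_flush`, the return codes, and the log text are all hypothetical stand-ins. It only illustrates the idea that, with the option enabled, a failed connection at init is logged and deferred, so that each flush reports an error instead of the process exiting.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical return codes, stand-ins for the engine's real ones. */
enum sketch_rc { SKETCH_OK = 0, SKETCH_ERROR = -1, SKETCH_RETRY = 1 };

/* Hypothetical plugin context; not Fluent Bit's struct flb_pgsql_config. */
struct sketch_ctx {
    bool daemon;     /* value of the proposed "daemon" option */
    bool connected;  /* did the connection attempt at init succeed? */
};

/*
 * Init: without daemon mode a connection failure is fatal (the caller
 * would abort startup). With daemon mode the failure is logged and
 * init still succeeds, leaving the context marked as not connected.
 * connect_ok simulates the outcome of the real connection attempt.
 */
enum sketch_rc sketch_init(struct sketch_ctx *ctx, bool daemon, bool connect_ok)
{
    assert(ctx != NULL);
    ctx->daemon = daemon;
    ctx->connected = connect_ok;

    if (!connect_ok) {
        /* Illustrative message only, not a real Fluent Bit log line. */
        fprintf(stderr, "sketch: init failed to connect\n");
        if (!daemon) {
            return SKETCH_ERROR;
        }
    }
    return SKETCH_OK;
}

/* Flush: a missing or unconnected context reports a retry instead of crashing. */
enum sketch_rc sketch_flush(const struct sketch_ctx *ctx)
{
    if (ctx == NULL || !ctx->connected) {
        fprintf(stderr, "sketch: flush skipped, no connection\n");
        return SKETCH_RETRY;
    }
    return SKETCH_OK;
}
```

In this model, the error is reported on every flush attempt, which is what would let a failure counter (and therefore the health check) pick it up.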

@TomlinfreeGit TomlinfreeGit requested a review from sxd April 26, 2023 02:18
@sxd
Member

sxd commented Apr 26, 2023

@TomlinfreeGit I tested this for about 10 minutes and I don't see any error in the logs saying that something needs to be fixed. The plugin just finished, which doesn't look like desired behavior: everything starts and appears to work, but if I can't flush the logs to the desired database, that may be an issue.

This is what I can see in the logs:

Fluent Bit v2.1.0
* Copyright (C) 2015-2022 The Fluent Bit Authors
* Fluent Bit is a CNCF sub-project under the umbrella of Fluentd
* https://fluentbit.io

[2023/04/26 12:50:03] [error] [output:pgsql:pgsql.0] failed connecting to host=192.168.2.69 with error: connection to server at "192.168.2.69", port 5432 failed: No route to host
	Is the server running on that host and accepting TCP/IP connections?

And that's it, now on the API I'm getting this:

curl -i http://127.0.0.1:2020/api/v2/health
HTTP/1.1 200 OK
Server: Monkey/1.7.0
Date: Wed, 26 Apr 2023 10:53:51 GMT
Transfer-Encoding: chunked
Content-Type: application/json

{"fluent-bit":{"version":"2.1.0","edition":"Community","flags":["FLB_HAVE_IN_STORAGE_BACKLOG","FLB_HAVE_CHUNK_TRACE","FLB_HAVE_PARSER","FLB_HAVE_RECORD_ACCESSOR","FLB_HAVE_STREAM_PROCESSOR","FLB_HAVE_TLS","FLB_HAVE_OPENSSL","FLB_HAVE_METRICS","FLB_HAVE_WASM","FLB_HAVE_AWS","FLB_HAVE_AWS_CREDENTIAL_PROCESS","FLB_HAVE_SIGNV4","FLB_HAVE_SQLDB","FLB_HAVE_METRICS","FLB_HAVE_HTTP_SERVER","FLB_HAVE_SYSTEMD","FLB_HAVE_VALGRIND","FLB_HAVE_FORK","FLB_HAVE_TIMESPEC_GET","FLB_HAVE_GMTOFF","FLB_HAVE_UNIX_SOCKET","FLB_HAVE_ATTRIBUTE_ALLOC_SIZE","FLB_HAVE_PROXY_GO","FLB_HAVE_LIBBACKTRACE","FLB_HAVE_REGEX","FLB_HAVE_UTF8_ENCODER","FLB_HAVE_LUAJIT","FLB_HAVE_C_TLS","FLB_HAVE_ACCEPT4","FLB_HAVE_INOTIFY","FLB_HAVE_GETENTROPY","FLB_HAVE_GETENTROPY_SYS_RANDOM"]}}

So, how will someone notice that there's an error if everything starts without a problem? In my opinion, a daemon should fail if there's anything that may cause an error. Inside a pod/container, for example, it will be restarted until the problem is fixed. This is more like making the error invisible.

On the other hand, which plugins in Fluent Bit behave like what you want to add?
Can you elaborate more on why we should implement this? I may be wrong, so I want to read more.

Thanks in advance!

@TomlinfreeGit
Author

Hi, I can give some examples to explain how other output plugins behave when the user provides a wrong configuration.

Example 1 (a.config):

[SERVICE]
    Flush        5
    Daemon       False
    Log_Level    error
    HTTP_Server  On
    HTTP_Listen  0.0.0.0
    HTTP_Port    2020
    Health_Check On

[INPUT]
    name     cpu
    interval_sec 5
    tag      cpu.local

[OUTPUT]
    name http
    host 192.1.1.1
    match  *
    retry_limit 1

[OUTPUT]
    name influxdb
    host 192.2.3.5
    port 8086
    bucket org
    match *
    retry_limit 1

[OUTPUT]
    name opensearch
    host 192.2.2.2
    port 9200
    match *
    retry_limit 1

Running a.config for several minutes, Fluent Bit does not crash, and you get the following logs:
image

Example 2 (b.config):

[SERVICE]
    Flush        5
    Daemon       False
    Log_Level    error
    HTTP_Server  On
    HTTP_Listen  0.0.0.0
    HTTP_Port    2020
    Health_Check On

[INPUT]
    name     cpu
    interval_sec 5
    tag      cpu.local

[OUTPUT]
    name            pgsql
    host            192.168.142.132
    port            15432
    user            postgres
    password        yourpwd
    match           *
    database        agent
    table           test

[OUTPUT]
    name stdout
    match *

Running b.config, Fluent Bit exits immediately, even though the configuration of the other output plugin (stdout) is correct, and you get the following logs:
image

Example 3 (c.config), out_pgsql running in daemon mode:

[SERVICE]
    Flush        5
    Daemon       False
    Log_Level    error
    HTTP_Server  On
    HTTP_Listen  0.0.0.0
    HTTP_Port    2020
    Health_Check On

[INPUT]
    name     cpu
    interval_sec 5
    tag      cpu.local

[OUTPUT]
    name http
    host 192.1.1.1
    match  *
    retry_limit 1

[OUTPUT]
    name influxdb
    host 192.2.3.5
    port 8086
    bucket org
    match *
    retry_limit 1

[OUTPUT]
    name opensearch
    host 192.2.2.2
    port 9200
    match *
    retry_limit 1

[OUTPUT]
    name            pgsql
    host            192.168.142.132
    port            15432
    user            postgres
    password        yourpwd
    match           *
    database        agent
    table           test
    retry_limit     1
    Daemon          True

Running c.config for several minutes, Fluent Bit does not crash, and you get the following logs:
image

@TomlinfreeGit
Author

Thanks for your kind review.
From your comment, I guess you didn't turn on the health check. You can turn it on by setting "Health_Check On" in the [SERVICE] section:

[SERVICE]
    Flush        5
    Daemon       False
    Log_Level    error
    HTTP_Server  On
    HTTP_Listen  0.0.0.0
    HTTP_Port    2020
    Health_Check On

Then call http://127.0.0.1:2020/api/v1/health to check the health status of Fluent Bit:
https://docs.fluentbit.io/manual/administration/monitoring#health-check-for-fluent-bit

@TomlinfreeGit TomlinfreeGit requested a review from sxd April 28, 2023 01:07
@sxd
Member

sxd commented May 22, 2023

@TomlinfreeGit Hi! I was on holiday for the last month, so I didn't look into this. I'm getting into it today.

@TomlinfreeGit
Author

Please help with a review.

@TomlinfreeGit
Author

@sxd Hi, please help with a review if possible. Thanks very much!
