Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

New component: receiver/mysqlrecordsreceiver #12082

Closed
JKashyap96 opened this issue Jul 5, 2022 · 6 comments
Closed

New component: receiver/mysqlrecordsreceiver #12082

JKashyap96 opened this issue Jul 5, 2022 · 6 comments
Labels
Sponsor Needed New component seeking sponsor Stale

Comments

@JKashyap96
Copy link

JKashyap96 commented Jul 5, 2022

The purpose and use-cases of the new component

This receiver queries MySQL for database records and creates a log record for each database record.

Use Cases

'BasicAuth' Password Authentication Mode Use Case:

  • The receiver supports basic username and password authentication for a database user

'BasicAuth' Password Encryption Use Case:

  • The receiver supports password encryption for 'BasicAuth' authentication_mode
  • To generate an encrypted password:
    • Specify 'encrypt_secret_path' to a secret file containing a 24 character string
    • Include the following in the otel config:
      service:
        telemetry:
          logs:
            level: debug
      The encrypted password will only be printed in the console with a debug log level. Once generated, the user can remove the telemetry field so as to enable logging at the default info level. To use the encrypted password, the user needs to specify password_type as 'encrypted' and also the encrypt_secret_path to the same secret file.

'IAMRDSAuth' Password Authentication Mode Use Case:

  • The receiver supports connecting to an AWS RDS MySQL DB instance
  • We first enable IAM database authentication
  • We create an IAM Policy for IAM Database Access
  • Specify 'aws_certificate_path' to a pem file containing certificates for different AWS regions
  • Internally, the receiver uses an authentication token to access the database instance instead of a password.

State Management Use Case:

  • The receiver supports saving the state of a query fetch into a csv file where a unique/auto-increment field is present in a table of a database.
  • The unique/auto-increment field can either be of type 'NUMBER' or 'TIMESTAMP', where a 'NUMBER' should be a non-negative integer and a 'TIMESTAMP' should be of the default timestamp storage format in mysql, i.e. '2006-01-02 15:04:05'.
  • This is basically the delta mode state management feature of the receiver where the current value/state of the unique/auto-increment field is saved in a csv file which can be retrieved later so as to fetch records after the saved state value.

Example configuration for the component

Overall config fields:

receivers:
  mysqlrecords:
    # authentication_mode is used for identifying the way of connecting to a mysql database instance
    # it has two possible values namely, 'BasicAuth' and 'IAMRDSAuth'
    # this is a mandatory field
    authentication_mode: BasicAuth

    # this is the username of the database user
    # this is a mandatory field
    username: testuser

    # this is the database name
    # this is a mandatory field
    database: testdatabase

    # this is the host name of the database instance
    # this is a mandatory field
    dbhost: testhost

    # for a RDS MySQL instance, this is the value of the region where the instance is present
    # this is a mandatory field when authentication_mode: 'IAMRDSAuth' and is not required in 'BasicAuth'.
    region: us-east-1

    # this is the path for the pem file containing certificates for different AWS regions
    # details : https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/UsingWithRDS.SSL.html
    # this is a mandatory field when authentication_mode: 'IAMRDSAuth' and is not required in 'BasicAuth'.
    aws_certificate_path: global-bundle.pem

    # this is the password of the database user
    # this will be skipped while using authentication_mode : 'IAMRDSAuth' as an authentication token is used as a password in this case
    password: testpass

    # password_type refers to how the password of the user is entered in the receiver configuration
    # it has two possible values, namely, 'plaintext' and 'encrypted'
    # the default value of password_type is 'plaintext'
    # it has to be mandatorily passed on with value 'encrypted' so as to decrypt an encrypted password with a secret string stored in file in encrypt_secret_path
    password_type: encrypted

    # this is the path to a secret file containing a 24 character string that is used for encrypting a plaintext password
    # when specified with a plaintext password, it will result in a console output with an encrypted password for the plaintext which can be used instead used as a password in the config
    # this is a mandatory path required while passing an encrypted password in the config
    encrypt_secret_path: /path/to/secret/file.txt

    # this is the database port, will be considered 3306 by default if not specified
    dbport: 3306

    # this is the structure for database queries which are required to query from a database instance
    db_queries:

      # this is a user-defined value which a user needs to put in as an identifier for each query that the user wants to run for the receiver
      # it has to be unique for each query
      # this is a mandatory field for the db_queries struct
      - queryid: Q1

        # this is the query string the user wants to run for the receiver
        # this is a mandatory field for the db_queries struct
        query: select * from persons

        # STATE MANAGEMENT Feature

        # index_column_name is the name of the unique/auto-increment field present in the table
        index_column_name: PersonID

        # this is the value for the type of the unique/auto-increment field mentioned above
        # it has two possible values namely, 'NUMBER' and 'TIMESTAMP'
        # this is mandatory field that needs to be specified by an user trying to save the state of the index_column_name of a database query
        index_column_type: NUMBER

        # while doing state management of a query, a user can explicitly define the identifier value in a table, after which the records should be fetched in
        # this is the explicitly defined identifier value for a particular database query
        # for 'NUMBER' type the default value is 0 and for 'TIMESTAMP' the default value is currentTime - 48hrs
        initial_index_column_start_value: 5

    # this is required to ensure connections are closed by the driver safely before connection is closed by MySQL server, OS, or other middlewares
    # default is 3
    setconnmaxlifetimemins: 3

    # this is highly recommended to limit the number of connections used by the application. There is no recommended limit number because it depends on application and MySQL server
    # default is 5
    setmaxopenconns: 5

    # this is recommended to be set same to setmaxopenconns, when it is smaller than setmaxopenconns, connections can be opened and closed much more frequently than you expect.
    # default is 5
    setmaxidleconns: 5

    # this indicates the number of producer and consumer workers/threads which will used to fetch, convert and consume database records
    # by default it considers the value to be the number of queries that are to be run in the receiver
    # user can configure a maximum of 10 workers
    setmaxnodatabaseworkers: 4

    # this is the protocol value required for establishing a database connection
    # default is 'tcp'
    transport: tcp

    # default is true
    allow_native_passwords: true

    # this is the collection interval for collecting database records
    # default is 10s
    collection_interval: 10s

Example MySQL reference:

mysql> select * from Persons where PersonID = 3;
+----------+----------+-----------+----------+----------+
| PersonID | LastName | FirstName | Address | City |
+----------+----------+-----------+----------+----------+
| 3 | Kalita | Raktim | Hatigaon | Guwahati |
+----------+----------+-----------+----------+----------+
1 row in set (0.01 sec)

Example config:

receivers:
  mysqlrecords:
    authentication_mode: BasicAuth
    dbhost: localhost
    dbport: 3306
    transport: tcp
    username: test
    password: test
    database: employees
    db_queries:
      - queryid: Q1
        query: select * from Persons where PersonID = 3
    setmaxnodatabaseworkers: 1
exporters:
  logging:
    loglevel: debug
    sampling_initial: 2
    sampling_thereafter: 500

Example logging exporter console output:

Resource SchemaURL:
ScopeLogs #0
ScopeLogs SchemaURL:
InstrumentationScope
LogRecord #0
Timestamp: 1970-01-01 00:00:00 +0000 UTC
Severity:
ShortName:
Body: {"Address":"Hatigaon","City":"Guwahati","FirstName":"Raktim","LastName":"Kalita","PersonID":"3"}
Trace ID:
Span ID:
Flags: 0

Telemetry data types supported

logs

Is this a vendor-specific component? If so, are you proposing to contribute this as a representative of the vendor?

not really a vendor specific component, but trying to contribute as a Sumo Logic representative.

Sponsor (Optional)

@codeboten codeboten added the Sponsor Needed New component seeking sponsor label Jul 5, 2022
@tigrannajaryan
Copy link
Member

Quick comment on the state management: to persist the state make sure to use the existing storage extension concept: https://github.com/open-telemetry/opentelemetry-collector/tree/main/extension/experimental/storage

@github-actions
Copy link
Contributor

github-actions bot commented Nov 9, 2022

This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.

@andrzej-stencel
Copy link
Member

I believe that instead of creating a new component, it makes more sense to re-use the SQL Query receiver, which already does this for metrics - it runs SQL queries and transforms the results into metrics. We should extend that receiver by adding logs support instead of creating a new component. I've created an issue for this here:

@JKashyap96
Copy link
Author

JKashyap96 commented Mar 23, 2023

Hi @astencel-sumo. I agree with your suggestion of extending the SQL Query Receiver. I hope the use cases here above get implemented with the state management feature modified using persistent storage as mentioned in the new issue. Looking forward towards this contribution :)

@github-actions github-actions bot removed the Stale label May 26, 2023
@github-actions
Copy link
Contributor

This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.

@andrzej-stencel
Copy link
Member

Closing this as the SQL Query receiver now supports logs collection.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Sponsor Needed New component seeking sponsor Stale
Projects
None yet
Development

No branches or pull requests

4 participants