
Conversation

remeh
Contributor

@remeh remeh commented Nov 26, 2020

What does this PR do?

This PR adds support for Logs collection in the Datadog Agent Extension.

Log collection

To collect logs from the environment, the extension adds a /lambda/logs route to the extension HTTP server (running on port 8124), on which it receives logs from the AWS environment. These logs are the logs of the function, of the extensions, and of the platform.

The pkg/serverless/aws package has been introduced, with helpers to get the ARN of the running function and to parse the log messages received on the HTTP server.

Logs aggregation / processing

An instance of the Logs Agent is started in the extension if the DD_LOGS_ENABLED flag is set. This Logs Agent uses the regular pipeline, except that:

  • it uses a newly introduced NullAuditor (which does not write any registry).
  • a new input source has been introduced, capable of receiving log messages from AWS through a channel (see pkg/logs/input/channel). Note that it is specialized for AWS log messages but could be made more generic with additional code if needed.
  • the JSON encoder has been modified to support providing a timestamp to messages, and, when built with the serverless build tag, it does not append any hostname to the logs.

Synchronous flush of the pipeline

The Extension needs to do a synchronous flush of the buffered data on a signal. Today, it is used to naively flush at the end of the execution of the function; however, we are already working on a smarter flush mechanism.

In order to do so, I had to adapt different parts of the pipeline:

  • Add Pipeline.Flush, triggering a flush of the Processor and of the Sender.
  • Add Processor.Flush, sending buffered messages to the processor / encoder.
  • Add Sender.Flush, sending the buffered data to the BatchStrategy sender.
  • Add BatchStrategy.Flush, synchronously sending the HTTP request to the intake to flush the data.

A mutex has been introduced and the select statements were all modified to support a flush signal. This may have some performance impact, for which I want to add benchmarks.

Miscellaneous changes

  • Add message.NewMessageWithTime to send a log with a known/given timestamp into the pipeline.
  • Auditor has been renamed RegistryAuditor, and an Auditor interface has been introduced instead (to allow creating the NullAuditor).
  • Some of the messages contain useful information that we turn into metrics and send directly into the aggregator: see here.

remeh added 30 commits October 12, 2020 15:11
…s to the logs agent

Introducing the `logs/input/channel` implementation supporting receiving logs through
a Go channel.
…ssages.

This has been introduced to not change the original pipeline process.
The original `Auditor` has been renamed `RegistryAuditor`.
`Auditor` is now an interface for both `RegistryAuditor` and `NullAuditor`.
This commit is also adding the reading of the function ARN from the invoke event.
…sing the DogStatsD server.

Currently supported enhanced metrics: billed_duration, duration, init_duration, max_memory_used.
This commit also adds the parsing of the time in the AWS log.
…d ARN support.

This way, we can use the timestamp from the AWS LogMessage while sending the logs
to the intake. This is what this commit is doing.
@remeh remeh requested a review from a team as a code owner November 26, 2020 15:24
…rics and logs.

For feature-parity with the Datadog Forwarder, we want to use this
configuration entry / environment variable to append tags to every
metric and log sent by the extension.

// Loop terminates when the channel is closed.
for logline := range t.inputChan {
	origin := message.NewOrigin(t.source)
Member

I think we should also update t.source.BytesRead to be consistent with the other tailers.

@remeh remeh modified the milestones: 7.25.0, 7.26.0 Dec 9, 2020
Contributor

@prognant prognant left a comment


Overall it looks good to me and fits pretty well within the existing Logs Agent code.

Instead of spawning a new HTTP server (using yet another TCP port), we are re-using
the already spawned HTTP server, which is already used to communicate with the libraries.

It also means there is no longer a configuration field to change the HTTP port. This may
have to be addressed if 8124 is not available for some reason.
@remeh remeh force-pushed the remeh/serverless-logs branch from e1ac48c to 7f24782 Compare January 7, 2021 09:21
@remeh remeh requested a review from a team as a code owner January 7, 2021 12:57
Contributor

@prognant prognant left a comment


For the logs agent part it looks good to me. I think it will work pretty well.

Comment on lines +12 to +15
// buildCommonFormat returns the log common format seelog string
func buildCommonFormat(loggerName LoggerName) string {
	return fmt.Sprintf("%%Date(%s) | %s | %%LEVEL | %%Msg%%n", getLogDateFormat(), loggerName)
}
Contributor


I'm just curious: why do we remove contextual information from the agent log when running in the serverless context?
I guess it's not a problem here not to check for JMXFetch.

Contributor Author


We are mimicking the logs that users are used to having while running functions in AWS Lambda (through our client libraries, for instance).

Comment on lines -56 to -58
if loggerName == "JMXFETCH" {
return `%Msg%n`
}
Contributor


I think this should be included in pkg/config/log_format.go, for both buildCommonFormat and buildJSONFormat.

Contributor Author


Very nice catch, this must be a merge error or merge artifact.

@KSerrania KSerrania removed the request for review from a team January 11, 2021 10:51
Contributor

@prognant prognant left a comment


LGTM, just a quick question about tagging: will there be any tag to specifically isolate one lambda execution stream?

@remeh
Contributor Author

remeh commented Jan 12, 2021

@prognant thanks for the reviews... 🙇

will there be any tag to specifically isolate one lambda execution stream?

You mean, to identify that data sent by the Agent came through a Lambda using the Extension? The functionname tag, with the function name as its value, is set on logs and on the enhanced metrics, but for now there is not much more to differentiate logs emitted by a regular agent from logs emitted by the extension. Is this what you were wondering?

@prognant
Contributor

@prognant thanks for the reviews... 🙇

will there be any tag to specifically isolate one lambda execution stream?

You mean, to identify that data sent by the Agent came through a Lambda using the Extension? The functionname tag, with the function name as its value, is set on logs and on the enhanced metrics, but for now there is not much more to differentiate logs emitted by a regular agent from logs emitted by the extension. Is this what you were wondering?

I was thinking about isolating/filtering the logs (could also be metrics) emitted by one single execution of a given lambda function. I.e. if a lambda function has a huge number of concurrent invocations, having a tag per lambda execution (like a unique UUID for each execution) would be useful, WDYT?

@remeh
Contributor Author

remeh commented Jan 12, 2021

I was thinking about isolating/filtering the logs (could also be metrics) emitted by one single execution of a given lambda function. I.e. if a lambda function has a huge number of concurrent invocations, having a tag per lambda execution (like a unique UUID for each execution) would be useful, WDYT?

Oh sorry, I read your question too quickly! We already support that with the request ID, which is set here. This request ID is then sent in the JSON and is available in the Datadog app, for instance in the log explorer, to filter by a given request ID, etc. 👍

@remeh remeh merged commit 519c2cc into master Jan 12, 2021
@remeh remeh deleted the remeh/serverless-logs branch January 12, 2021 11:17
sergei-deliveroo pushed a commit to deliveroo/datadog-agent that referenced this pull request Mar 24, 2021
… Extension (DataDog#6861)

* Add the Serverless Datadog Agent implementation.

See https://docs.datadoghq.com/serverless/datadog_lambda_library/extension/

* Fix 3rd party license list since new pkg are imported from the AWS SDK.

* forwarder: adapt the SyncDefaultForwarder to recent changes in the Forwarder interface.

* aggregator: fix unit tests with Flush now being a public function.

* serverless: linter compliance.

* serverless: errcheck compliance.

* serverless: unused code linter.

* serverless: linter compliance.

* serverless: add http server listening for server logs and sending logs to the logs agent

Introducing the `logs/input/channel` implementation supporting receiving logs through
a Go channel.

* serverless/logs: introduce a `NullAuditor` not doing anything with messages.

This has been introduced to not change the original pipeline process.
The original `Auditor` has been renamed `RegistryAuditor`.
`Auditor` is now an interface for both `RegistryAuditor` and `NullAuditor`.

* serverless: remove debug messages

* serverless/logs: add a sync `Flush()` method to the logs pipeline.

This commit is also adding the reading of the function ARN from the invoke event.

* serverless/logs: remove unwanted debug lines.

* serverless/logs: read REPORT messages and send AWS enhanced metrics using the DogStatsD server.

Currently supported enhanced metrics: billed_duration, duration, init_duration, max_memory_used.

* serverless/logs: enhanced metrics are distributions.

* serverless/logs: comment on the sync flush of logs then statsd.

* serverless/logs: configurable logs type we're subscribing to.

This commit also adds the parsing of the time in the AWS log.

* serverless/arn: remove version from the ARN string if any.

* serverless/logs: create aws package containing both AWS LogMessage and ARN support.

This way, we can use the timestamp from the AWS LogMessage while sending the logs
to the intake. This is what this commit is doing.

* serverless/logs: parse time up to the millisecond.

* serverless/logs: send the enhanced metrics with the proper timestamp.

* serverless/logs: comments and send the report log to the intake.

* serverless/logs: send aws.lambda.enhanced.memorysize

* serverless/logs: send formatted platform logs.

* serverless/logs: cleaning comments here and there.

* config: fix log conflict while adding the serverless format.

* serverless/logs: linter compliance.

* cmd/serverless: start -> run

* dogstatsd: add a unit test in serverless mode.

* tasks: remove serverless test tag

* forwarder: rename `SyncDefaultForwarder` to `SyncForwarder` since it's not a default forwarder.

* [cmd/serverless] change default port for the HTTP server collecting logs

* serverless: support using DD_TAGS to add extra tags while sending metrics and logs.

For feature-parity with the Datadog Forwarder, we want to use this
configuration entry / environment variable to append tags to every
metric and log sent by the extension.

* serverless/logs: send the logs with the ARN and RequestId information.

* serverless: linter compliance

* serverless/logs: use the function name if available as the "service".

* serverless: hit the correct domain using the `AWS_LAMBDA_RUNTIME_API` env var.

* serverless: re-use the daemon http server to receive the logs from AWS.

Instead of spawning a new HTTP server (using yet another tcp port), we are re-using
the already spawned HTTP server which is already used to communicate with the libraries.

It also means there is no longer a configuration field to change the HTTP port. This may
have to be addressed if 8124 is not available for some reason.

* dogstatsd: fix a unit test.

* dogstatsd: fix a unit test.

* tasks/dogstatsd: slightly increase the maximum valid size of the dogstatsd binary.

* misc: use the correct log-format for non-serverless logs.

This must be a merge error/artifact.
Labels: changelog/no-changelog, team/agent-core