Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[receiver/githubreceiver] add skeleton for tracing GH Actions events #34944

Closed

Conversation

msarahan
Copy link

This is an initial skeleton of work adapted from @krzko from the discussion in #27460. It has much/most of the implementation details left out, but the general idea is to refactor what krzko had into a folder in internal, and plumb the configuration in with the top-level github (scraper) configuration.

I think we will want to remove the restriction that at least one scraper must be configured, since this receiver could be used with the GHA tracing and not the metrics.

CC @adrielp - not sure if this is what you had in mind for a skeleton, but I hope it is helpful. If you have more specific directions for what you'd like to see, let me know.

Copy link

linux-foundation-easycla bot commented Aug 30, 2024

CLA Signed

The committers listed above are authorized under a signed CLA.

@adrielp
Copy link
Contributor

adrielp commented Sep 2, 2024

@msarahan - Thanks for starting this contribution! I appreciate the quick turnaround on getting the skeleton opened up here.

A couple things to start off with to get this PR ready for merge:

  • the title needs to change, check out title guidelines - something like [receiver/githubreceiver] add GitHub tracing webhook skeleton or similar would probably suffice
  • a changelog needs to be added.
  • looks like there's a couple workflow failures, one of which I'm unfamiliar with (tracing correctness) - will have to look into that one. but most can be solved by running the following script locally. I put it at the same dir level as my collector-contrib fork and run it from there.
  • The metadata.yaml needs to be updated. A little below this section in the docs is helpful. For this PR I think all that is needed is to add development: [metrics, traces] and re-generate. (double check that though).

It has much/most of the implementation details left out, but the general idea is to refactor what krzko had into a folder in internal, and plumb the configuration in with the top-level github (scraper) configuration.

Sounds right 👍🏼

I think we will want to remove the restriction that at least one scraper must be configured, since this receiver could be used with the GHA tracing and not the metrics.

Agreed with a slight change. until the Webhook has functionality, I don't think this should change. And when it does become functionally available, we should check to ensure at least one of metrics/traces is configured in the event that a user forgets to configure either.

@msarahan msarahan changed the title adapt krzko's work into skeleton in githubreceiver for Actions tracing [receiver/githubreceiver] add skeleton for tracing GH Actions events Sep 3, 2024
@msarahan
Copy link
Author

msarahan commented Sep 4, 2024

Thanks for the review, @adrielp. I think I fixed up most of it, except

The metadata.yaml needs to be updated. A little below this section in the docs is helpful. For this PR I think all that is needed is to add development: [metrics, traces] and re-generate.

I found that making this change generated code that assumed that there was a happy path for tracing, but this PR does not add that happy path.

Also, there's an error when running your prep.sh script that I've been trying to understand:

ERRO [linters_context/goanalysis] buildir: panic during analysis: Cannot range over: func(yield func(K, V) bool), goroutine 6175 [running]:
runtime/debug.Stack()
        runtime/debug/stack.go:26 +0x5e
github.com/golangci/golangci-lint/pkg/goanalysis.(*action).analyzeSafe.func1()
        github.com/golangci/golangci-lint@v1.59.1/pkg/goanalysis/runner_action.go:105 +0x5a
panic({0x1815360?, 0xc000877900?})
        runtime/panic.go:785 +0x132
honnef.co/go/tools/go/ir.(*builder).rangeStmt(0xc002465910, 0xc00266af00, 0xc00211e9c0, 0x0, {0x1d35710, 0xc00211e9c0})
        honnef.co/go/tools@v0.4.7/go/ir/builder.go:2225 +0x894
honnef.co/go/tools/go/ir.(*builder).stmt(0xc002465910, 0xc00266af00, {0x1d3a278?, 0xc00211e9c0?})
        honnef.co/go/tools@v0.4.7/go/ir/builder.go:2438 +0x20a
honnef.co/go/tools/go/ir.(*builder).stmtList(...)
        honnef.co/go/tools@v0.4.7/go/ir/builder.go:859
honnef.co/go/tools/go/ir.(*builder).stmt(0xc002465910, 0xc00266af00, {0x1d3a188?, 0xc0023965d0?})
        honnef.co/go/tools@v0.4.7/go/ir/builder.go:2396 +0x1415
honnef.co/go/tools/go/ir.(*builder).buildFunction(0xc002465910, 0xc00266af00)
        honnef.co/go/tools@v0.4.7/go/ir/builder.go:2508 +0x417
honnef.co/go/tools/go/ir.(*builder).buildFuncDecl(0xc002465910, 0xc0006e99e0, 0xc002396600)
        honnef.co/go/tools@v0.4.7/go/ir/builder.go:2545 +0x189
honnef.co/go/tools/go/ir.(*Package).build(0xc0006e99e0)
        honnef.co/go/tools@v0.4.7/go/ir/builder.go:2649 +0xb46
sync.(*Once).doSlow(0xc004c28700?, 0xc00211efc0?)
        sync/once.go:76 +0xb4
sync.(*Once).Do(...)
        sync/once.go:67
honnef.co/go/tools/go/ir.(*Package).Build(...)
        honnef.co/go/tools@v0.4.7/go/ir/builder.go:2567
honnef.co/go/tools/internal/passes/buildir.run(0xc004c28620)
        honnef.co/go/tools@v0.4.7/internal/passes/buildir/buildir.go:86 +0x18b
github.com/golangci/golangci-lint/pkg/goanalysis.(*action).analyze(0xc0037b1ed0)
        github.com/golangci/golangci-lint@v1.59.1/pkg/goanalysis/runner_action.go:191 +0x9cd
github.com/golangci/golangci-lint/pkg/goanalysis.(*action).analyzeSafe.func2()
        github.com/golangci/golangci-lint@v1.59.1/pkg/goanalysis/runner_action.go:113 +0x17
github.com/golangci/golangci-lint/pkg/timeutils.(*Stopwatch).TrackStage(0xc002422910, {0x1a4edad, 0x7}, 0xc00499e748)
        github.com/golangci/golangci-lint@v1.59.1/pkg/timeutils/stopwatch.go:111 +0x44
github.com/golangci/golangci-lint/pkg/goanalysis.(*action).analyzeSafe(0x0?)
        github.com/golangci/golangci-lint@v1.59.1/pkg/goanalysis/runner_action.go:112 +0x6e
github.com/golangci/golangci-lint/pkg/goanalysis.(*loadingPackage).analyze.func2(0xc0037b1ed0)
        github.com/golangci/golangci-lint@v1.59.1/pkg/goanalysis/runner_loadingpackage.go:80 +0xa5
created by github.com/golangci/golangci-lint/pkg/goanalysis.(*loadingPackage).analyze in goroutine 2743
        github.com/golangci/golangci-lint@v1.59.1/pkg/goanalysis/runner_loadingpackage.go:75 +0x1e9 
ERRO [linters_context/goanalysis] fact_purity: panic during analysis: interface conversion: interface {} is nil, not *buildir.IR, goroutine 6167 [running]:
runtime/debug.Stack()
        runtime/debug/stack.go:26 +0x5e
github.com/golangci/golangci-lint/pkg/goanalysis.(*action).analyzeSafe.func1()
        github.com/golangci/golangci-lint@v1.59.1/pkg/goanalysis/runner_action.go:105 +0x5a
panic({0x18c86e0?, 0xc002715110?})
        runtime/panic.go:785 +0x132
honnef.co/go/tools/analysis/facts/purity.purity(0xc00038ed20)
        honnef.co/go/tools@v0.4.7/analysis/facts/purity/purity.go:109 +0x299
github.com/golangci/golangci-lint/pkg/goanalysis.(*action).analyze(0xc0037c9b10)
        github.com/golangci/golangci-lint@v1.59.1/pkg/goanalysis/runner_action.go:191 +0x9cd
github.com/golangci/golangci-lint/pkg/goanalysis.(*action).analyzeSafe.func2()
        github.com/golangci/golangci-lint@v1.59.1/pkg/goanalysis/runner_action.go:113 +0x17
github.com/golangci/golangci-lint/pkg/timeutils.(*Stopwatch).TrackStage(0xc002422910, {0x1a98caf, 0xb}, 0xc003d0bf48)
        github.com/golangci/golangci-lint@v1.59.1/pkg/timeutils/stopwatch.go:111 +0x44
github.com/golangci/golangci-lint/pkg/goanalysis.(*action).analyzeSafe(0xc002264a80?)
        github.com/golangci/golangci-lint@v1.59.1/pkg/goanalysis/runner_action.go:112 +0x6e
github.com/golangci/golangci-lint/pkg/goanalysis.(*loadingPackage).analyze.func2(0xc0037c9b10)
        github.com/golangci/golangci-lint@v1.59.1/pkg/goanalysis/runner_loadingpackage.go:80 +0xa5
created by github.com/golangci/golangci-lint/pkg/goanalysis.(*loadingPackage).analyze in goroutine 2743
        github.com/golangci/golangci-lint@v1.59.1/pkg/goanalysis/runner_loadingpackage.go:75 +0x1e9 
ERRO [linters_context/goanalysis] typedness: panic during analysis: interface conversion: interface {} is nil, not *buildir.IR, goroutine 6176 [running]:
runtime/debug.Stack()

It seems like the "real" error is Cannot range over: func(yield func(K, V) bool), and that typedness: panic during analysis: interface conversion: interface {} is nil, not *buildir.IR might be "aftermath", but my go is rusty. I'll keep working on it, but any insight you can add would be much appreciated.

@github-actions github-actions bot added the cmd/otelcontribcol otelcontribcol command label Sep 4, 2024
@adrielp
Copy link
Contributor

adrielp commented Sep 5, 2024

The metadata.yaml needs to be updated. A little below this section in the docs is helpful. For this PR I think all that is needed is to add development: [metrics, traces] and re-generate.
I found that making this change generated code that assumed that there was a happy path for tracing, but this PR does not add that happy path.

I see that component lifecycle tests are generated for testing and fail with:

│                                                                                                                                                                     │
│=== Failed                                                                                                                                                           │
│=== FAIL: . TestComponentLifecycle/traces-shutdown (0.00s)                                                                                                           │
│    generated_component_test.go:59:                                                                                                                                  │
│                Error Trace:    /Users/adrielperkins/dev/trash/opentelemetry-collector-contrib/receiver/githubreceiver/generated_component_test.go:59                │
│                Error:          Received unexpected error:                                                                                                           │
│                                telemetry type is not supported                                                                                                      │
│                Test:           TestComponentLifecycle/traces-shutdown                                                                                               │
│                                                                                                                                                                     │
│=== FAIL: . TestComponentLifecycle/traces-lifecycle (0.00s)                                                                                                          │
│    generated_component_test.go:65:                                                                                                                                  │
│                Error Trace:    /Users/adrielperkins/dev/trash/opentelemetry-collector-contrib/receiver/githubreceiver/generated_component_test.go:65                │
│                Error:          Received unexpected error:                                                                                                           │
│                                telemetry type is not supported                                                                                                      │
│                Test:           TestComponentLifecycle/traces-lifecycle                                                                                              │
│                                                                                                                                                                     │
│=== FAIL: . TestComponentLifecycle (2.01s)    

I need to do the in-depth code review beyond the core fundamentals already mentioned. I'm thinking we should be able to add a tiny bit more to make this pass where the receiver with traces configured can run no-op. (the indepth code review is on my todo list)

Also, there's an error when running your prep.sh script that I've been trying to understand:

ERRO [linters_context/goanalysis] buildir: panic during analysis: Cannot range over: func(yield func(K, V) bool), goroutine 6175 [running]:
runtime/debug.Stack()
.........

So looks like this is a bug in golanglint-ci. I went in on your branch and checked out running it against other receivers. I ended up doing the following:

cd opentelemetry-collector-contrib/internal/tools
go get -u github.com/golangci/golangci-lint && make tidy
cd - && cd receiver/githubreceiver
../../.tools/golangci-lint --config ../../.golangci.yml run

Doing so updated golanglint-ci from v1.59.1 to v1.60.3 which had no issues.

go.mod Outdated Show resolved Hide resolved
cmd/otelcontribcol/go.mod Outdated Show resolved Hide resolved
"go.opentelemetry.io/collector/confmap"
"go.opentelemetry.io/collector/otelcol/otelcoltest"
"go.opentelemetry.io/collector/receiver/scraperhelper"

"github.com/open-telemetry/opentelemetry-collector-contrib/receiver/githubreceiver/internal"
"github.com/open-telemetry/opentelemetry-collector-contrib/receiver/githubreceiver/internal/metadata"
"github.com/open-telemetry/opentelemetry-collector-contrib/receiver/githubreceiver/internal/scraper/githubscraper"
"github.com/open-telemetry/opentelemetry-collector-contrib/receiver/githubreceiver/internal/webhook"
"github.com/open-telemetry/opentelemetry-collector-contrib/receiver/webhookeventreceiver"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we should be importing the webhookevent receiver component as part of our tests here and instead should follow the pattern the webhookevent receiver uses to test itself.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure what you mean here. I am using the webhookreceiver to try to avoid reimplementing the webhook functionality. I'm interpreting your request as "please don't use code from other receivers" - which I'm fine to respect. One thing that I've tried to interpret this is to use anonymous structs that have the same fields as the expected webhook configuration struct. Struct stuff doesn't really work that way (much to my Python-headed duck-typing consternation), so the next nearest thing is an interface with getter-type functions. Am I on the right track, or do you mean something else?

One thing I'm a little hung up on is that the Validate() method for the webhookreceiver's config is being called somehow, and I can't figure out when or how. Perhaps this is one reason why you're avoiding using the webhookreceiver.Config struct. I have a debugger breakpoint on the only line where that can be added to our error collection, and the breakpoint is never hit. Nevertheless, the tests fail with:

=== RUN   TestLoadConfig
    config_test.go:36: 
                Error Trace:    /home/msarahan/code/opentelemetry-collector-contrib/receiver/githubreceiver/config_test.go:36
                Error:          Received unexpected error:
                                receivers::github/customname: missing receiver server endpoint from config
                Test:           TestLoadConfig
--- FAIL: TestLoadConfig (0.00s)
=== RUN   TestLoadConfig_Traces
    config_test.go:89: 
                Error Trace:    /home/msarahan/code/opentelemetry-collector-contrib/receiver/githubreceiver/config_test.go:89
                Error:          Received unexpected error:
                                receivers::github/customname: missing receiver server endpoint from config
                Test:           TestLoadConfig_Traces

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am using the webhookreceiver to try to avoid reimplementing the webhook functionality. I'm interpreting your request as "please don't use code from other receivers" - which I'm fine to respect. One thing that I've tried to interpret this is to use anonymous structs that have the same fields as the expected webhook configuration struct.

I think my suggestion is to re-implement it's functionality so that this component is entirely decoupled from any changes to the WebHook receiver component. This removes a dependency that can change separately from this code.

@crobert-1, @TylerHelmuth, @andrzej-stencel - I'm curious to hear your thoughts here and get your opinion.

@@ -40,6 +43,14 @@ func TestLoadConfig(t *testing.T) {
defaultConfigGitHubScraper.(*Config).Scrapers = map[string]internal.Config{
githubscraper.TypeStr: (&githubscraper.Factory{}).CreateDefaultConfig(),
}
defaultConfigGitHubScraper.(*Config).WebhookConfig = webhook.Config{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Webhook should be separate from defaultConfigGitHubScraper.

Copy link
Author

@msarahan msarahan Sep 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I renamed this defaultConfigGithubReceiver to make it more clear that it is the top-level configuration struct. I also renamed WebhookConfig to Tracing.

Does this resolve your concern, or is it still needing further alteration?

…tor-contrib into githubreceiver-actions-traces
@@ -24,6 +25,7 @@ type Config struct {
scraperhelper.ControllerConfig `mapstructure:",squash"`
Scrapers map[string]internal.Config `mapstructure:"scrapers"`
metadata.MetricsBuilderConfig `mapstructure:",squash"`
Traces traces.Config `mapstructure:"traces"`
Copy link
Contributor

@adrielp adrielp Sep 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Elfo404 - This is what I was talking about. Could you give some purview on the complications or ease of backend logic that is able so separate events that are built into traces vs events that could be built into metrics of just sent as raw events?

My thinking here, and this could come in a subsequent iteration, is that it'd be simpler to only require the configuration of one GitHub app to sent all data, instead of requiring the configuration of multiple to point to different hooks that process logs and traces.

Also curious your thoughts here @TylerHelmuth

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, I was looking at a previous revision which called it Webhook.

I think keeping it simple from a config POV is better (i.e. reverting back to calling this webhook). Reason being you wouldn't have to manage multiple GH apps. (which if needed for other reasons you can still do by having different instances of the receiver with different config, for instance because of rate limits). Then we can add different config sections to customize the behavior of how traces or metrics are generated.

Internally is pretty straightforward to differentiate between events; take a look at what we did in https://github.com/grafana/grafana-ci-otel-collector/blob/main/receiver/githubactionsreceiver/receiver.go#L229, where we differentiate on the event type, and generate metrics on non-skipped workflow_job events, and after generate traces and logs on completed workflow_job and workflow_run events (https://github.com/grafana/grafana-ci-otel-collector/blob/main/receiver/githubactionsreceiver/receiver.go#L257-L293).

it's a bit of spaghetti code, but the metrics bit was thrown together in very little time. I'm sure that with some routing logic we can keep the implementation clean and not require users to define different webhooks.

TLDR; my thought is that the configuration should be independent on how the event handling is implemented, the fact that traces are generated thanks to the webhook is an implementation detail. WDYT?

@@ -0,0 +1,168 @@
# GitHub Actions Receiver

<!-- status autogenerated section -->
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This whole auto generated section can be deleted.

I think most of the information in this readme should be in the readme at the root of the receiver, and if necessary, can link to this as an internal document after adjustement.

var _ component.Config = (*Config)(nil)

// Validate checks the receiver configuration is valid
func (cfg *Config) Validate() error {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TODO: revisit this after the discussion of using the webhook receiver as part of this or re-implementing similar code.

Co-authored-by: Adriel Perkins <adrielp@liatrio.com>
Copy link
Contributor

github-actions bot commented Oct 4, 2024

This PR was marked stale due to lack of activity. It will be closed in 14 days.

@github-actions github-actions bot added the Stale label Oct 4, 2024
@msarahan
Copy link
Author

msarahan commented Oct 4, 2024

My work has taken a turn away from this, and I'm afraid I probably can't continue here. Would anyone like to pick up this work, or shall I close this PR?

@github-actions github-actions bot removed the Stale label Oct 5, 2024
@adrielp
Copy link
Contributor

adrielp commented Oct 8, 2024

My work has taken a turn away from this, and I'm afraid I probably can't continue here. Would anyone like to pick up this work, or shall I close this PR?

Firstly, thanks for the work on this @msarahan. Secondly thanks for letting us know. I'll try to pick up this work from here. I know contributions can take time, but if you ever have bandwidth again, feel free to contribute, always happy to see new folks contributing!

Copy link
Contributor

This PR was marked stale due to lack of activity. It will be closed in 14 days.

@github-actions github-actions bot added the Stale label Oct 23, 2024
Copy link
Contributor

github-actions bot commented Nov 7, 2024

Closed as inactive. Feel free to reopen if this PR is still being worked on.

@github-actions github-actions bot closed this Nov 7, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants