Pr/exec #1787
base: main
Conversation
- Add Exec scraper type to api/v1 mirroring playbook ExecAction
- Implement ExecScraper with script execution via duty/shell.Run()
- Support AWS/GCP/Azure/K8s connection credential injection
- Support git checkout before script execution
- Parse stdout as JSON/YAML to create multiple config items
- Register exec scraper in ScraperSpec and common registry
- Add comprehensive unit tests for output parsing
- All tests pass and linting clean
Add 8 example ScrapeConfig fixtures demonstrating exec scraper capabilities:
- exec-simple.yaml: Basic inline bash script with JSON output
- exec-aws-connection.yaml: AWS CLI with credential injection
- exec-git-checkout.yaml: Script execution from Git repository
- exec-python.yaml: Python script with Kubernetes connection
- exec-nodejs.yaml: Node.js script with AWS connection
- exec-env-transform.yaml: Environment variables and transforms
- exec-yaml-output.yaml: YAML output parsing example
- exec-comprehensive.yaml: Multi-feature demonstration

Also includes exec-README.md with detailed documentation covering:
- Feature overview for each fixture
- Key capabilities demonstrated
- Usage patterns and best practices
- Scheduling recommendations
- Add exec_suite_test.go with Ginkgo test suite setup
- Add exec_git_integration_test.go with git checkout integration test
- Test verifies exec scraper can clone repo and execute inline script
- Uses file:// URL to clone config-db repo itself
- Inline script checks for go.mod to verify checkout worked
- All tests pass successfully
Tests exec scraper's ability to checkout git repos and run scripts
- Created fixtures/exec-backstage-catalog.yaml with Python script to scrape Backstage entities
- Parses Components, APIs, Systems, and Domains from catalog-model/examples
- Extracts metadata, properties (type, lifecycle), and tags from multiple sources
- Maps relationships (owner, domain, system, dependsOn, consumesApis) using CEL expressions
- Handles multi-document YAML files with safe_load_all
- Created integration test in scrapers/exec/exec_backstage_integration_test.go
- Tests entity scraping, relationship extraction, properties, and tag aggregation
- Fixed exec_git_integration_test.go unused variable

Note: Git checkout currently fails with 'reference not found' - requires investigation
Walkthrough
Adds an Exec scraping feature.
Changes
Sequence Diagram(s)

```mermaid
sequenceDiagram
    autonumber
    actor Scheduler
    participant ExecScraper
    participant Checkout as Git/Checkout
    participant Shell as ShellExecutor
    participant Parser
    participant Results as ResultsCollector
    Scheduler->>ExecScraper: Trigger Scrape(ctx)
    ExecScraper->>ExecScraper: iterate Exec configs
    alt config has checkout
        ExecScraper->>Checkout: perform git checkout
        Checkout-->>ExecScraper: workspace ready / error
    end
    ExecScraper->>Shell: run script (env, connections, workspace)
    Shell-->>ExecScraper: stdout, stderr, exitCode
    alt exitCode == 0
        ExecScraper->>Parser: parseOutput(stdout)
        alt parsed as JSON array/object
            Parser-->>ExecScraper: structured data
            ExecScraper->>Results: create result(s) per item
        else parsed as YAML
            Parser-->>ExecScraper: YAML→JSON data
            ExecScraper->>Results: create result(s)
        else plain text
            Parser-->>ExecScraper: raw text
            ExecScraper->>Results: create single textual result
        end
    else
        ExecScraper->>Results: emit error result (stderr, exitCode)
    end
    ExecScraper-->>Scheduler: aggregated ScrapeResults
```
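The parsing branches in the diagram can be sketched as a small Go function. This is a simplified stand-in for the real code in scrapers/exec/exec.go: `ScrapeResult` here is a placeholder type, and the YAML fallback is omitted because it requires a third-party package.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ScrapeResult is a simplified stand-in for the real result type.
type ScrapeResult struct {
	Config any
}

// parseOutput mirrors the decision flow in the sequence diagram:
// try JSON first (array => one result per item, object/scalar => one
// result), otherwise fall back to raw text. The real scraper also
// attempts YAML between these two steps (omitted here to stay
// dependency-free).
func parseOutput(stdout string) []ScrapeResult {
	var v any
	if err := json.Unmarshal([]byte(stdout), &v); err == nil {
		if items, ok := v.([]any); ok {
			results := make([]ScrapeResult, 0, len(items))
			for _, item := range items {
				results = append(results, ScrapeResult{Config: item})
			}
			return results
		}
		return []ScrapeResult{{Config: v}}
	}
	// Plain-text fallback: a single textual result.
	return []ScrapeResult{{Config: stdout}}
}

func main() {
	fmt.Println(len(parseOutput(`[{"id":"a"},{"id":"b"}]`))) // 2
	fmt.Println(len(parseOutput(`{"id":"a"}`)))              // 1
	fmt.Println(len(parseOutput("not json at all")))         // 1
}
```

The per-item fan-out for arrays is what lets a single script emit many config items in one run.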
Pre-merge checks and finishing touches
❌ Failed checks (1 warning, 1 inconclusive)
✅ Passed checks (1 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Actionable comments posted: 4
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
chart/crds/configs.flanksource.com_scrapeconfigs.yaml (1)
1875-2087: Consider marking `account` as required for `clickhouse.azureBlobStorage`
The CRD correctly reflects the Go struct (only `collection` is required). However, the `account` field is used unconditionally in `GetAccountKeyCommand()` at runtime without validation—if it's empty, the command will fail. Either mark `account` as required in the CRD schema, or add controller-level validation to prevent users from creating specs that pass admission but fail at execution.
♻️ Duplicate comments (5)
fixtures/exec-comprehensive.yaml (3)
29-48: Critical YAML syntax error: duplicate "name" key.
Lines 29 and 32 both define a `name` key in the same YAML mapping. This is the same structural error as in Example 1.
51-62: Critical YAML syntax error: duplicate "name" key.
Lines 51 and 54 both define a `name` key in the same YAML mapping. This is the same structural error as in Examples 1 and 2.
65-99: Critical YAML syntax error: duplicate "name" key.
Lines 65 and 68 both define a `name` key in the same YAML mapping. This is the same structural error as in the previous examples.
fixtures/exec-env-transform.yaml (1)
8-11: Critical YAML syntax error: duplicate "name" key.
Lines 8 and 11 both define a `name` key in the same YAML mapping, which is invalid YAML syntax. This is the same structural error present in `exec-comprehensive.yaml`.
fixtures/exec-backstage-catalog.yaml (1)
11-14: Critical YAML syntax error: duplicate "name" key.
Lines 11 and 14 both define a `name` key in the same YAML mapping, which is invalid YAML syntax. This is the same structural error present in other exec fixture files.
Additionally, this fixture is used by the failing integration test `exec_backstage_integration_test.go`. The duplicate key issue may contribute to unexpected behavior, though the primary test failure appears to be related to git checkout.
🧹 Nitpick comments (2)
go.mod (1)
1-464: Verify dependency updates for compatibility.
The go.mod file shows extensive dependency updates across AWS SDK v2, Azure SDK, Kubernetes libraries, and OpenTelemetry components. While these updates appear to align with the new Exec feature requirements, ensure:
- All updated dependencies are tested in integration tests
- Breaking changes in major version bumps are addressed
- Security advisories for the updated versions are reviewed
Do you want me to generate a script to check for known vulnerabilities in the updated dependencies?
scrapers/exec/exec_test.go (1)
75-98: Plain-text assertion is tightly coupled to internal representation
The plain-text test currently expects the config to be a quoted string (`"This is plain text..."`), which bakes in how the value is serialized rather than the underlying content. That makes the test brittle if the implementation changes to keep the raw string and defer JSON quoting to a later stage.
Consider relaxing this to assert the underlying text only, e.g. just checking that `results[0].Config` is a string containing the original message (without enforcing JSON-quoted formatting).
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
`go.sum` is excluded by `!**/*.sum`
📒 Files selected for processing (27)
- `.gitignore` (1 hunks)
- `api/v1/exec.go` (1 hunks)
- `api/v1/scrapeconfig_types.go` (1 hunks)
- `api/v1/types.go` (3 hunks)
- `api/v1/zz_generated.deepcopy.go` (3 hunks)
- `chart/crds/configs.flanksource.com_scrapeconfigs.yaml` (5 hunks)
- `chart/crds/configs.flanksource.com_scrapeplugins.yaml` (1 hunks)
- `config/schemas/config_exec.schema.json` (1 hunks)
- `config/schemas/scrape_config.schema.json` (8 hunks)
- `fixtures/exec-README.md` (1 hunks)
- `fixtures/exec-aws-connection.yaml` (1 hunks)
- `fixtures/exec-backstage-catalog.yaml` (1 hunks)
- `fixtures/exec-comprehensive.yaml` (1 hunks)
- `fixtures/exec-env-transform.yaml` (1 hunks)
- `fixtures/exec-git-checkout-test.yaml` (1 hunks)
- `fixtures/exec-git-checkout.yaml` (1 hunks)
- `fixtures/exec-nodejs.yaml` (1 hunks)
- `fixtures/exec-python.yaml` (1 hunks)
- `fixtures/exec-simple.yaml` (1 hunks)
- `fixtures/exec-yaml-output.yaml` (1 hunks)
- `go.mod` (10 hunks)
- `scrapers/common.go` (2 hunks)
- `scrapers/exec/exec.go` (1 hunks)
- `scrapers/exec/exec_backstage_integration_test.go` (1 hunks)
- `scrapers/exec/exec_git_integration_test.go` (1 hunks)
- `scrapers/exec/exec_suite_test.go` (1 hunks)
- `scrapers/exec/exec_test.go` (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (8)
api/v1/types.go (2)
- api/v1/exec.go (1): `Exec` (9-26)
- api/v1/common.go (1): `BaseScraper` (240-255)

scrapers/common.go (1)
- scrapers/exec/exec.go (1): `ExecScraper` (12-12)

api/v1/scrapeconfig_types.go (1)
- api/v1/exec.go (1): `Exec` (9-26)

scrapers/exec/exec_backstage_integration_test.go (2)
- api/context.go (1): `NewScrapeContext` (29-34)
- scrapers/exec/exec.go (1): `ExecScraper` (12-12)

api/v1/exec.go (1)
- api/v1/common.go (1): `BaseScraper` (240-255)

scrapers/exec/exec_suite_test.go (3)
- api/context.go (2): `ScrapeContext` (16-27), `NewScrapeContext` (29-34)
- api/v1/scrapeconfig_types.go (1): `ScrapeConfig` (50-56)
- api/v1/config.go (1): `ParseConfigs` (30-43)

api/v1/zz_generated.deepcopy.go (2)
- api/v1/exec.go (1): `Exec` (9-26)
- api/v1/common.go (1): `BaseScraper` (240-255)

scrapers/exec/exec_test.go (1)
- api/v1/exec.go (1): `Exec` (9-26)
🪛 GitHub Actions: Test
scrapers/exec/exec_backstage_integration_test.go
[error] 35-35: Exec Scraper - Backstage Catalog Integration test failed. got error: failed to execute script: error during git clone: reference not found
scrapers/exec/exec_git_integration_test.go
[error] 33-33: Exec Scraper - Git Checkout Integration test failed. Expected 2 config items from script output but got 1; error: failed to checkout git repository and execute script
🪛 YAMLlint (1.37.1)
fixtures/exec-git-checkout-test.yaml
[error] 12-12: duplication of key "name" in mapping
(key-duplicates)
fixtures/exec-aws-connection.yaml
[error] 11-11: duplication of key "name" in mapping
(key-duplicates)
fixtures/exec-env-transform.yaml
[error] 11-11: duplication of key "name" in mapping
(key-duplicates)
fixtures/exec-backstage-catalog.yaml
[error] 14-14: duplication of key "name" in mapping
(key-duplicates)
fixtures/exec-yaml-output.yaml
[error] 11-11: duplication of key "name" in mapping
(key-duplicates)
fixtures/exec-comprehensive.yaml
[error] 12-12: duplication of key "name" in mapping
(key-duplicates)
[error] 32-32: duplication of key "name" in mapping
(key-duplicates)
[error] 54-54: duplication of key "name" in mapping
(key-duplicates)
[error] 68-68: duplication of key "name" in mapping
(key-duplicates)
🔇 Additional comments (23)
.gitignore (1)
29-34: LGTM!
The ignore patterns are appropriate for build artifacts and the new exec-checkout workflow.
chart/crds/configs.flanksource.com_scrapeplugins.yaml (1)
6-6: LGTM!
Controller-gen version bump to v0.19.0 aligns with the broader API surface changes introduced in this PR.
fixtures/exec-git-checkout-test.yaml (1)
46-46: Verify placeholder URL before deployment.
The checkout URL is set to a placeholder that will be replaced. Ensure that this fixture is only used in tests where the URL is properly substituted, or consider adding a comment indicating this is test-only.
api/v1/zz_generated.deepcopy.go (1)
644-676: LGTM!
The auto-generated DeepCopy implementation for the Exec type correctly handles all fields including pointer fields (Checkout), slices (Env, Artifacts), and nested structs (BaseScraper, Connections).
config/schemas/config_exec.schema.json (1)
1-749: LGTM! Comprehensive schema definition.
The JSON schema provides thorough type definitions for the Exec feature with:
- Strict validation via `additionalProperties: false`
- Proper required field enforcement (script, path, etc.)
- Complete connection type coverage (AWS, GCP, Azure, K8s, Git)
- Flexible environment variable and artifact handling
The schema aligns well with the API type definitions in api/v1/exec.go.
fixtures/exec-simple.yaml (1)
1-31: LGTM! Clean example fixture.
This fixture provides a clear, minimal example of the Exec scraper with:
- Simple inline bash script
- JSON output via heredoc
- JSONPath-based field mappings
Good starting point for users learning the Exec feature.
api/v1/scrapeconfig_types.go (1)
114-116: LGTM! Consistent type detection pattern.
The exec type detection follows the same pattern as existing scraper types and is correctly placed at the end of the conditional chain.
scrapers/common.go (1)
7-7: ExecScraper registration wiring looks correct
Importing `scrapers/exec` and registering `exec.ExecScraper{}` in `All` is consistent with the existing pattern for other scrapers.
Also applies to: 31-31
fixtures/exec-README.md (1)
1-174: README coverage and examples look consistent with the Exec feature set
The described fixtures, runtimes, connection handling, output formats, and patterns match the added exec YAMLs and the new Exec scraper behavior. This is a solid, discoverable overview for users.
fixtures/exec-git-checkout.yaml (1)
1-20: Git checkout exec fixture looks structurally sound
The example correctly wires `exec`, inline Bash, and a `checkout` block, matching the new Exec model and providing a clear Git-based usage pattern.
api/v1/types.go (1)
19-20: Exec type is correctly wired into config registry and plugin application
Adding `"exec": Exec{}` to `AllScraperConfigs`, the `Exec []Exec` field on `ScraperSpec` with `json/yaml:"exec,omitempty"`, and the corresponding `ApplyPlugin` loop is consistent with how other scraper types are integrated.
Also applies to: 72-73, 145-147
scrapers/exec/exec_test.go (1)
9-73: Good coverage of parseOutput for structured outputs
The tests for JSON object/array, YAML, and empty output exercise the main branches of `parseOutput` and at least validate result counts and absence of errors, which should catch regressions in basic parsing behavior.
14-86: The fixture configuration and repository are properly configured.
The `exec-backstage-catalog.yaml` fixture correctly specifies `https://github.com/backstage/backstage` and the expected path `packages/catalog-model/examples` exists in the repository with valid YAML example files. The git checkout configuration contains no apparent errors (the `depth: 1` setting is valid, and using the default branch without an explicit ref is standard practice).
To diagnose the actual pipeline failure, please provide the specific error logs from the pipeline run. The assertion about "reference not found" during git clone cannot be confirmed from the fixture configuration alone.
scrapers/exec/exec_git_integration_test.go (1)
14-66: No issues found - fixture and script are correctly implemented.
The fixture file `fixtures/exec-git-checkout-test.yaml` exists and is properly configured. The script correctly outputs 2 JSON objects matching the test expectations (config-1 and config-2), and the test code is well-formed and properly aligned with the fixture structure. No evidence of pipeline failure or missing components exists.
Likely an incorrect or invalid review comment.
fixtures/exec-comprehensive.yaml (1)
9-26: The YAML structure is correct and intentional—not a syntax error.
Lines 9 and 12 do not create a YAML parsing error. The first `name` (line 9) identifies the exec scraper configuration, while the second `name` (line 12) specifies the JSONPath expression to extract the name field from scraped items. This is the documented schema pattern for exec scrapers, as evidenced by:
- All four examples in this file follow the same pattern (lines 9/12, 29/32, 51/54, 65/68)
- The README explicitly documents this under "Key Features Demonstrated → Field extraction: JSONPath-based ID, name, type extraction"
- The comprehensive example demonstrates this pattern works across multiple scraper types
The config-db exec scraper schema intentionally uses `name` for both the config identifier and the output field extraction specification.
Likely an incorrect or invalid review comment.
scrapers/exec/exec.go (4)
1-16: LGTM!
Clean package structure with appropriate imports and a straightforward `CanScrape` implementation.
18-48: LGTM!
The scrape loop correctly:
- Maps configuration fields to the shell execution config
- Handles execution errors gracefully by appending error results and continuing
- Captures stderr in the error message for non-zero exit codes
- Delegates output parsing to `parseOutput`
50-74: Verify plain text handling behavior.
YAML is extremely permissive—almost any plain text is valid YAML. This means the plain text fallback (lines 71-73) will rarely execute. Instead, plain text outputs will:
- Fail JSON parsing
- Pass YAML parsing (as a scalar string)
- Go through `createResultsFromJSON` → get re-marshaled as a JSON string

For example, script output `Hello World` becomes `"Hello World"` (quoted), and multi-line text gets escaped newlines.
If this is intentional (always produce JSON-compatible output), this is fine. If raw text preservation is expected, consider checking `config.Format` or adding explicit format detection.
76-112: LGTM!
The function correctly handles all JSON types:
- Arrays produce one result per item
- Objects and scalars produce single results
- Marshal errors are captured as error results without stopping processing
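That behavior can be sketched as follows; `Result` and the function body are simplified stand-ins for the real `createResultsFromJSON` in scrapers/exec/exec.go, not the actual implementation.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Result is a simplified stand-in: either a Config payload or an Error.
type Result struct {
	Config string
	Error  error
}

// createResultsFromJSON mirrors the behavior described above: arrays
// yield one result per item, objects/scalars yield a single result,
// and a marshal error becomes an error result instead of aborting
// the whole batch.
func createResultsFromJSON(v any) []Result {
	items := []any{v}
	if arr, ok := v.([]any); ok {
		items = arr
	}
	results := make([]Result, 0, len(items))
	for _, item := range items {
		b, err := json.Marshal(item)
		if err != nil {
			results = append(results, Result{Error: err})
			continue
		}
		results = append(results, Result{Config: string(b)})
	}
	return results
}

func main() {
	fmt.Println(len(createResultsFromJSON([]any{map[string]any{"id": "a"}, "b"}))) // 2
	fmt.Println(createResultsFromJSON("scalar")[0].Config)                         // the scalar, JSON-quoted
}
```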
config/schemas/scrape_config.schema.json (3)
922-1008: LGTM!
The `Exec` definition correctly:
- Inherits BaseScraper fields for consistent metadata
- Requires `script` as the only mandatory field
- References valid definitions for `connections`, `checkout`, `env`, and `artifacts`
1009-1035: LGTM!
The `ExecConnections` definition provides flexible cloud provider connection options (AWS, GCP, Azure, Kubernetes) with proper references to their respective connection types.
2995-3000: LGTM!
The `exec` property is correctly added to `ScraperSpec` as an optional array of `Exec` definitions, consistent with other scraper types in the schema.
chart/crds/configs.flanksource.com_scrapeconfigs.yaml (1)
2508-3794: Exec scraper schema looks consistent; ensure it stays generated from the canonical types
The new `spec.exec[]` schema (artifacts, checkout, connections, env, script) looks structurally consistent with the rest of the CRD:
- Uses the same `valueFrom` pattern (configMapKeyRef/helmRef/secretKeyRef/serviceAccount) as other credential-bearing fields.
- `connections` mirrors existing AWS/Azure/GCP/Kubernetes connection shapes.
- `env` reuses the familiar EnvVar-like structure.
- `script` is required and clearly documented as "inline or path to script file".
Given the size and complexity, it's important this section remains derived from the Go API types / JSON Schemas rather than hand-edited, so future changes to `Exec`/`ExecConnections` don't silently diverge from the installed CRD.
If not already part of CI, consider a simple guard:

```bash
#!/bin/bash
# Regenerate CRDs & schemas and fail if anything drifted.
make generate-crds generate-schemas # or the project's existing targets
git diff --exit-code -- chart/crds/configs.flanksource.com_scrapeconfigs.yaml config/schemas
```

to ensure the CRD, Go types, and JSON Schemas for `exec` remain in sync.
Actionable comments posted: 1
🧹 Nitpick comments (1)
cmd/server.go (1)
103-109: Scoped `nolint` for deprecated Echo logger is acceptable, but plan a migration
The localized `//nolint:staticcheck // SA1019 ignore.` around `middleware.LoggerWithConfig` is a reasonable way to quiet the deprecation warning for now. Consider migrating to the recommended logging approach from Echo in a follow-up so this suppression can eventually be removed.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (9)
- `Makefile` (1 hunks)
- `cmd/server.go` (2 hunks)
- `fixtures/exec-aws-connection.yaml` (1 hunks)
- `fixtures/exec-backstage-catalog.yaml` (1 hunks)
- `fixtures/exec-comprehensive.yaml` (1 hunks)
- `fixtures/exec-env-transform.yaml` (1 hunks)
- `fixtures/exec-git-checkout-test.yaml` (1 hunks)
- `fixtures/exec-yaml-output.yaml` (1 hunks)
- `scrapers/exec/exec_backstage_integration_test.go` (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (6)
- fixtures/exec-comprehensive.yaml
- fixtures/exec-yaml-output.yaml
- fixtures/exec-backstage-catalog.yaml
- scrapers/exec/exec_backstage_integration_test.go
- fixtures/exec-aws-connection.yaml
- fixtures/exec-git-checkout-test.yaml
🔇 Additional comments (3)
cmd/server.go (1)
28-32: Config-DB imports re-grouping looks fine
Reordering `v1`, `db`, `jobs`, and `utils` alongside `scrapers` is clean and non-functional; no issues from this change.
Makefile (1)
196-199: golangci-lint version bump is straightforward; verify CI/local compatibility
Updating `GOLANGCI_LINT_VERSION` to `v2.7.2` fits the existing `/v2` install path and should be safe, but it may enable/alter some linters. Please confirm `make golangci-lint` and CI lints still pass as expected with this version.
fixtures/exec-env-transform.yaml (1)
54-55: .toJSON() is a valid and well-supported method in gomplate v3.
The expression uses gomplate templating (not CEL), and `.toJSON()` is a documented gomplate function that marshals objects to JSON strings. This pattern is consistently used throughout the codebase for relationship expressions.
Likely an incorrect or invalid review comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (3)
api/v1/exec.go (2)
9-13: Consider adding validation for the Script field.
The `Script` field lacks an `omitempty` tag, making it required in the JSON/YAML, but Go strings default to empty. Without validation tags or runtime checks, an empty script could reach the execution layer and cause issues.
Consider one of these approaches:
- Add a validation tag if using a validation framework:

```diff
- Script string `json:"script" yaml:"script" template:"true"`
+ Script string `json:"script" yaml:"script" template:"true" validate:"required,min=1"`
```

- Document that validation is performed at runtime in the scraper implementation.
21-22: Consider whether environment variables need templating support.
The `Env` field lacks a `template:"true"` tag while other dynamic fields like `Script` and `Artifacts` have it. If environment variable values need to support dynamic templating (e.g., referencing other config values), consider adding the template tag.
If templating is needed for environment variables:

```diff
- Env []types.EnvVar `json:"env,omitempty" yaml:"env,omitempty"`
+ Env []types.EnvVar `json:"env,omitempty" yaml:"env,omitempty" template:"true"`
```

chart/crds/configs.flanksource.com_scrapeconfigs.yaml (1)
3338-3342: Consider renaming boolean `serviceAccount` flag under exec.connections.kubernetes
Within `spec.exec[*].connections.kubernetes`, `serviceAccount` is a boolean toggle ("will allow access to KUBERNETES env vars"), whereas throughout the rest of the CRD `serviceAccount` is a string field (the SA name) inside various `valueFrom` blocks.
The meaning difference is clear from the description, but the reused name with a different type may be confusing to users and tooling.
If you have the flexibility, consider renaming this boolean to something like `useServiceAccountEnv` or `enableServiceAccountEnv` for clarity and to avoid overloading `serviceAccount` across different shapes.
Also applies to: 3362-3406
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
- `api/v1/exec.go` (1 hunks)
- `api/v1/zz_generated.deepcopy.go` (3 hunks)
- `chart/crds/configs.flanksource.com_scrapeconfigs.yaml` (5 hunks)
- `fixtures/exec-python.yaml` (1 hunks)
- `scrapers/exec/exec.go` (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (3)
- scrapers/exec/exec.go
- api/v1/zz_generated.deepcopy.go
- fixtures/exec-python.yaml
🧰 Additional context used
🧬 Code graph analysis (1)
api/v1/exec.go (1)
api/v1/common.go (2)
`BaseScraper` (240-255), `Script` (30-35)
🔇 Additional comments (6)
api/v1/exec.go (2)
3-7: Imports are correctly specified and all types exist in duty v1.0.1126.
`connection.ExecConnections`, `connection.GitConnection`, `types.EnvVar`, and `shell.Artifact` are all active in the codebase with proper code generation support.
15-19: Consider adding `template:"true"` to `Checkout` field for consistency.
The `Connections` field supports templating via `template:"true"`, but `Checkout` does not. If Git repository URLs or checkout parameters should support dynamic values (e.g., branch names or repository URLs from environment variables), the template tag should be added to maintain consistency with how `Script` and `Artifacts` fields are handled.
Note: The `Env` field also lacks the template tag, suggesting this may be intentional. If repository URLs should remain static for stability or security reasons, the current design is appropriate.
chart/crds/configs.flanksource.com_scrapeconfigs.yaml (4)
6-6: controller-gen version bump looks fine
The CRD annotation is updated to `v0.19.0`; no issues from the manifest perspective assuming the generator/tooling is aligned with this version in the build pipeline.
1985-2499: ClickHouse + Azure Blob additions are structurally consistent
The new/expanded `clickhouse.azureBlobStorage` and related ClickHouse fields (`clickhouseURL`, `query`, credentials with `valueFrom`, required `collection`) mirror the existing credential/valueFrom patterns used elsewhere in the CRD and keep query-based behavior for ClickHouse. I don't see structural or type inconsistencies here.
2508-3795: Exec scraper schema looks coherent; nice reuse of connection/env patterns
The new `spec.exec[]` block (`artifacts`, `checkout`, `connections`, `env`, `script`, transforms, etc.) matches existing schema conventions:
- `artifacts[*].path` is correctly required.
- `checkout` and nested connection/credential fields follow the established `{name, value, valueFrom}` pattern.
- `connections` covers AWS/Azure/GCP/Kubernetes variants with the same shapes used elsewhere.
- `env[]` and `script` are modelled as expected, with `script` correctly marked as required.
This all lines up with an "exec + optional git checkout + connection injection" model and should be easy to consume from clients.
3500-3502: Exec `script` semantics are clear; required flag is appropriate
`script` is documented as "an inline script to run" and is the only required field in the exec item, which matches the intended API: checkout/env/connections are optional helpers around a core script. This looks like the right balance between flexibility and validation.
Also applies to: 3793-3794
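Pulling the pieces above together, a minimal `spec.exec[]` entry consistent with the schema described in this review might look like the following. This is an illustrative sketch only: the field shapes are inferred from the review comments, with `script` as the sole required field.

```yaml
# Illustrative sketch; shapes inferred from the review above.
apiVersion: configs.flanksource.com/v1
kind: ScrapeConfig
metadata:
  name: exec-minimal
spec:
  exec:
    - script: |
        echo '{"id": "demo", "name": "demo-item", "type": "Demo::Item"}'
      env:
        - name: REGION
          value: us-east-1
```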
Actionable comments posted: 2
🧹 Nitpick comments (1)
scrapers/exec/exec_git_integration_test.go (1)
42-66: Make config field assertions safer to avoid panics on shape changes
The direct type assertions on the nested `config` fields can panic if the Exec scraper ever changes the shape or type of the returned payload:

```go
configData1 := config1["config"].(map[string]any)
...
configData2 := config2["config"].(map[string]any)
```

You can keep the test strict while avoiding panics by checking the assertion result:

```diff
- configData1 := config1["config"].(map[string]any)
- Expect(configData1["key"]).To(Equal("value1"))
- Expect(configData1["source"]).To(Equal("git-checkout-test"))
+ configData1, ok := config1["config"].(map[string]any)
+ Expect(ok).To(BeTrue(), "config field should be a map[string]any")
+ Expect(configData1["key"]).To(Equal("value1"))
+ Expect(configData1["source"]).To(Equal("git-checkout-test"))
@@
- configData2 := config2["config"].(map[string]any)
- Expect(configData2["key"]).To(Equal("value2"))
- Expect(configData2["source"]).To(Equal("git-checkout-test"))
+ configData2, ok := config2["config"].(map[string]any)
+ Expect(ok).To(BeTrue(), "config field should be a map[string]any")
+ Expect(configData2["key"]).To(Equal("value2"))
+ Expect(configData2["source"]).To(Equal("git-checkout-test"))
```

This keeps the intent of the assertions but makes the test failure mode clearer and less brittle to internal representation changes.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
- `fixtures/exec-azure-devops.yaml` (1 hunks)
- `fixtures/exec-git-checkout-test.yaml` (1 hunks)
- `scrapers/exec/exec_git_integration_test.go` (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
scrapers/exec/exec_git_integration_test.go (4)
- api/v1/exec.go (1): `Exec` (9-26)
- api/context.go (1): `NewScrapeContext` (29-34)
- scrapers/exec/exec_suite_test.go (1): `DefaultContext` (22-22)
- scrapers/exec/exec.go (1): `ExecScraper` (13-13)
🪛 GitHub Actions: Test
scrapers/exec/exec_git_integration_test.go
[error] 34-34: Exec Scraper - Git Checkout Integration: should checkout git repository and execute script. Expected 2 config items from script output but got 1. failed to execute script: error during git clone: reference not found.
🔇 Additional comments (1)
fixtures/exec-git-checkout-test.yaml (1)
1-45: Fixture structure and script output look consistent
The ScrapeConfig wiring (type/id/name JSONPaths, inline script JSON payload, and git checkout settings) is coherent and matches the Exec scraper expectations; no issues spotted here.
Actionable comments posted: 0
🧹 Nitpick comments (2)
.github/workflows/test.yml (2)
109-110: Consider pinning the PyYAML version for reproducibility.
Same concern as in the `test` job: installing `pyyaml` without a version constraint can lead to non-reproducible builds and supply chain risks.
Apply this diff to pin the PyYAML version:

```diff
- - name: Install pyyaml for backstage test
-   run: pip3 install pyyaml
+ - name: Install pyyaml for backstage test
+   run: pip3 install pyyaml==6.0.2
```
28-29: Pin PyYAML version in both test jobs for reproducibility.
Installing `pyyaml` without a version constraint reduces build reproducibility and introduces supply chain risk. Pinning dependencies in files like requirements.txt is widely considered good practice, and pinning versions is crucial for maintaining stability and reproducibility, as builds could fail due to breaking changes in new package versions.
Both the `test` job (lines 28-29) and `test-prod` job (lines 109-110) install pyyaml without pinning. Pin both to the same version:

```yaml
- name: Install pyyaml for backstage test
  run: pip3 install pyyaml==6.0.2
```

Note: The e2e job correctly does not require this dependency, as backstage integration tests are only part of the main test suite, not the e2e tests.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
.github/workflows/test.yml(2 hunks)fixtures/exec-backstage-catalog.yaml(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- fixtures/exec-backstage-catalog.yaml
requires: flanksource/duty#1687
Summary by CodeRabbit
New Features
Documentation
CRD / Schema
Tests
Chores