Skip to content

fix: imported safe-output fragments no longer override explicit threat-detection: false#18231

Merged
pelikhan merged 4 commits intomainfrom
copilot/fix-imported-safe-outputs
Feb 25, 2026
Merged

fix: imported safe-output fragments no longer override explicit threat-detection: false#18231
pelikhan merged 4 commits intomainfrom
copilot/fix-imported-safe-outputs

Conversation

Copy link
Contributor

Copilot AI commented Feb 24, 2026

When a workflow sets threat-detection: false (or enabled: false), imported fragments that lack a threat-detection key would silently re-enable it via auto-defaulting in parseThreatDetectionConfig, triggering a spurious validation error when sandbox.agent: false was also set.

Root cause: result.ThreatDetection == nil was ambiguous — it could mean either "not configured" (no safe-outputs in main) or "explicitly disabled", so mergeSafeOutputConfig couldn't distinguish them.

Changes

  • imports.goThreatDetection is only merged from an imported fragment if the fragment's raw config map explicitly contains a threat-detection key. Auto-enabled defaults (fragments that have safe-outputs but no explicit threat-detection key) are ignored during merge. This makes threat-detection a true workflow-level concern.
Original prompt

This section details on the original issue you should resolve

<issue_title>fix: imported safe-output fragments override explicit threat-detection: false</issue_title>
<issue_description>## Bug

When a workflow explicitly disables threat detection (threat-detection: false or threat-detection: { enabled: false }), imported safe-output fragments re-enable it during merge, causing a compilation error when sandbox.agent: false is also set.

Reproduction

Workflow frontmatter:

sandbox:
  agent: false
safe-outputs:
  activation-comments: false
  threat-detection: false

With any imported fragment that has its own safe-outputs: (e.g., safe-output-add-comment.md with safe-outputs: { add-comment: { max: 1 } }).

Error:

error: threat detection requires sandbox.agent to be enabled. Threat detection runs inside the agent sandbox (AWF) with fully blocked network. Either enable sandbox.agent or remove the threat-detection configuration from safe-outputs.

Root Cause

mergeSafeOutputConfig() in imports.go:512-634 calls extractSafeOutputsConfig() on each imported fragment. The auto-enablement logic at safe_outputs_config.go:479-491 fires because the fragment's outputMap has no threat-detection key, so it defaults to &ThreatDetectionConfig{}.

Then at imports.go:633-635:

if result.ThreatDetection == nil && importedConfig.ThreatDetection != nil {
    result.ThreatDetection = importedConfig.ThreatDetection
}

The main workflow's explicit disable (nil) gets overwritten by the fragment's auto-enabled default.

Suggested Fix

Skip merging ThreatDetection from imported fragments — it's a workflow-level concern, not a per-fragment setting. Alternatively, suppress auto-enablement of threat detection inside extractSafeOutputsConfig when called from the import merge path.</issue_description>

Comments on the Issue (you are @copilot in this section)


🔒 GitHub Advanced Security automatically protects Copilot coding agent pull requests. You can protect all pull requests by enabling Advanced Security for your repositories. Learn more about Advanced Security.



✨ PR Review Safe Output Test - Run 22377081592

💥 [THE END] — Illustrated by Smoke Claude

…t-detection: false

When a workflow sets threat-detection: false (or enabled: false), add a
ThreatDetectionExplicitlyDisabled flag to SafeOutputsConfig. The flag is
set in extractSafeOutputsConfig when the threat-detection key is present
but resolves to nil. mergeSafeOutputConfig now guards the ThreatDetection
merge with !result.ThreatDetectionExplicitlyDisabled so imported fragments
cannot re-enable threat detection that was explicitly disabled.

Adds three tests:
- TestMergeSafeOutputsThreatDetectionExplicitDisableNotOverridden (unit)
- TestMergeSafeOutputsThreatDetectionImportedWhenMainHasNone (regression)
- TestSafeOutputsImportDoesNotReenableThreatDetection (integration)

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Copilot AI changed the title [WIP] Fix imported safe-output fragments overriding threat detection settings fix: imported safe-output fragments no longer override explicit threat-detection: false Feb 24, 2026
@pelikhan
Copy link
Contributor

@copilot

@pelikhan
Copy link
Contributor

@copilot Review the code and make it as clean and concise as possible. Simplify the implementation to improve clarity and maintainability.

…instead of sentinel

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Copy link
Contributor Author

Copilot AI commented Feb 25, 2026

@copilot Review the code and make it as clean and concise as possible. Simplify the implementation to improve clarity and maintainability.

Simplified in eb45e44. Replaced the ThreatDetectionExplicitlyDisabled boolean sentinel (touching 3 files) with a direct raw-map check in imports.go:

if result.ThreatDetection == nil {
    if _, hasTD := config["threat-detection"]; hasTD && importedConfig.ThreatDetection != nil {
        result.ThreatDetection = importedConfig.ThreatDetection
    }
}

This means ThreatDetection is only propagated from an import that explicitly carries a threat-detection key — auto-enabled defaults in fragments are ignored. The ThreatDetectionExplicitlyDisabled struct field and the corresponding else if in safe_outputs_config.go are both gone.

@kdehl16-web

This comment has been minimized.

@github-actions
Copy link
Contributor

Commit pushed: 7c9871c

💥 [THE END] — Illustrated by Smoke Claude

@pelikhan pelikhan marked this pull request as ready for review February 25, 2026 01:49
Copilot AI review requested due to automatic review settings February 25, 2026 01:49
@pelikhan pelikhan merged commit da64eba into main Feb 25, 2026
@pelikhan pelikhan deleted the copilot/fix-imported-safe-outputs branch February 25, 2026 01:50
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Fixes a merge-time regression where imported safe-outputs fragments could unintentionally re-enable threat detection (via defaulting) even when the main workflow explicitly disabled it.

Changes:

  • Gate ThreatDetection merge from imports on the imported fragment explicitly containing a threat-detection key.
  • Add unit/integration coverage to ensure explicit threat-detection: false in the main workflow is preserved across imports.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

File Description
pkg/workflow/imports.go Prevents imported fragments without an explicit threat-detection key from setting ThreatDetection during merge.
pkg/workflow/safe_outputs_import_test.go Adds tests covering explicit disable preservation and explicit import enable behavior.
tmp-smoke-test-22377081592.txt Adds a temporary smoke-test artifact file (likely unintended).

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +1833 to +1843
// Simulate main workflow that explicitly disabled threat-detection:
// threat-detection: false → parseThreatDetectionConfig returns nil.
topConfig := &SafeOutputsConfig{
ThreatDetection: nil,
AddComments: &AddCommentsConfig{},
}

// Import fragment with safe-outputs but no threat-detection key.
importedJSON := []string{
`{"add-comment":{"max":1}}`,
}
Copy link

Copilot AI Feb 25, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test doesn’t exercise the regression path because topConfig pre-defines AddComments, so MergeSafeOutputs will delete the imported add-comment entry as an override conflict before mergeSafeOutputConfig runs. That means the test would pass even without the threat-detection merge guard. Consider using a topConfig that represents a workflow with safe-outputs present but without add-comment (e.g., only meta/messages fields) and assert that AddComments is merged while ThreatDetection stays nil.

Copilot uses AI. Check for mistakes.
@@ -0,0 +1 @@
Test file for PR push - smoke test run 22377081592
Copy link

Copilot AI Feb 25, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like a temporary CI/smoke-test artifact that shouldn’t be committed to the repository. Please remove this file from the PR (and add it to .gitignore if it’s generated by local/CI workflows).

Suggested change
Test file for PR push - smoke test run 22377081592

Copilot uses AI. Check for mistakes.
github-actions bot added a commit that referenced this pull request Feb 25, 2026
…cedence

- Add `validate` command section to setup/cli.md (Building section, after
  compile). The command was added in #18191 but was missing from the main
  CLI reference page; only the compilation-process.md reference table had
  been updated.

- Add NOTE callout to reference/threat-detection.md clarifying that an
  explicit `threat-detection: false` in a workflow takes precedence over
  imported fragments, documenting the behaviour fixed in #18231.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

fix: imported safe-output fragments override explicit threat-detection: false

4 participants