Skip to content

Conversation

@cesarfda
Copy link
Contributor

This pull request adds a new table plugin for secret scanning to the Kolide Launcher, integrating the gitleaks library to detect secrets in files and raw data. It introduces the kolide_secret_scan table, which can be queried for secret findings, and includes comprehensive unit tests for the new functionality. The PR also updates several dependencies in go.mod to support the new feature.

New secret scanning table:

  • Added ee/tables/secretscan/table.go implementing the kolide_secret_scan table using gitleaks to scan file paths and raw data for secrets. The table returns details such as rule ID, description, severity, line/column numbers, entropy, and a redacted context for each finding.
  • Registered the new table in pkg/osquery/table/table.go so it is available for queries. [1] [2]

Testing and reliability:

  • Added ee/tables/secretscan/table_test.go with extensive unit tests covering table registration, secret detection, severity mapping, redaction logic, error handling, and edge cases.

Local query with interactive:

SELECT * FROM kolide_secret_scan WHERE path = '/tmp/test_secrets.txt';
+-----------------------+----------+-----------------+---------------------------------------------------------------------------------------------------------+----------+-------------+--------------+------------+---------+------------------+
| path | raw_data | rule_id | description | severity | line_number | column_start | column_end | entropy | redacted_context |
+-----------------------+----------+-----------------+---------------------------------------------------------------------------------------------------------+----------+-------------+--------------+------------+---------+------------------+
| /tmp/test_secrets.txt | | slack-bot-token | Identified a Slack Bot token, which may compromise bot integrations and communication channel security. | high | 1 | 17 | 59 | 4.71 | xoxb... |
+-----------------------+----------+-----------------+---------------------------------------------------------------------------------------------------------+----------+-------------+--------------+------------+---------+------------------+

image

Dependency updates:

  • Added github.com/zricethezav/gitleaks/v8 and several related indirect dependencies to go.mod for secret scanning support. Updated other dependencies to latest versions for compatibility and security. [1] [2] [3] [4]

Copy link
Contributor

@directionless directionless left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would like the scanning tests to be a little different. I would like:

One is a test using the table test pattern and a set of specific strings as input. For each string they should:

  • Test raw_data
  • Create a temp file and test with it.
  • Validate that each of those two produces the expected result.

The other is takes specific test files, that are sourced from a test-data directory and validates their results.

Both cases should have examples of different kinds of secrets, as well as some non-secret files.

Comment on lines +22 to +23
// highSeverityRules contains rule IDs that are considered high severity
var highSeverityRules = map[string]bool{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a way we can just pull this whole list from the library, and not enumerate it? I'm think about how it might change over time.

Comment on lines +16 to +17
"github.com/zricethezav/gitleaks/v8/detect"
"github.com/zricethezav/gitleaks/v8/report"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How does the resultant launcher binary size change with this addition?

// Initialize gitleaks detector with default config (~150 secret patterns)
// If initialization fails, we store the error and return it at query time
// rather than returning nil (which would cause a panic during plugin registration)
detector, detectorErr := detect.NewDetectorDefaultConfig()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This has the effect of creating this once, per launcher session, and reusing it. Is that the intended effect?

)
continue
}
results = append(results, fileResults...)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to include filepath in the results, or is it coming out of the scanFile?

Comment on lines +173 to +176
// Don't echo back the raw data for security
for i := range rawResults {
rawResults[i]["raw_data"] = "[scanned]"
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this work? I would expect sqlite to filter it out.

select * where raw_data = 'secret'

and if the table returns raw_data = [scanned] sqlite will filter it out

}

// determineSeverity maps a rule ID and entropy to a severity level
func determineSeverity(ruleID string, entropy float32) string {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I cannot decide if we know enough to do this. We might not, and it might be better to surface the raw data, and we can be more opinionated in checks. I'd probably strike this

return sharedDetector, sharedDetectorErr
}

func TestTablePlugin(t *testing.T) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test feels of low value.

}
}

func TestGenerate_DetectorError(t *testing.T) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test feels low value

Comment on lines +232 to +234
// redact returns a redacted version of a secret for safe logging/display
// Shows first 4 characters followed by "..." to provide context without exposing the full secret
func redact(secret string) string {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is clever, but I think the thresholds are off. I might say we should fully redact everything under 6 characters, and for 6 and above redact all but the first 3.

Because a 4 character feels like it leaks too much

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants