Move detectors.IsKnownFalsePositive from the detectors and into the engine #2643

dustin-decker · 2024-03-29T03:01:44Z

Description:

Move detectors.IsKnownFalsePositive from the detectors and into the engine.

Refactor-robot was used, but the result was reviewed and altered in a few cases.

Checklist:

Tests passing (make test-community)?
Lint passing (make lint; this requires golangci-lint)?

rgmz · 2024-03-29T03:31:05Z

Imo, it would be ideal for this to still be configurable to some degree. Certain secret patterns may not play nicely with the default list and may even have their own separate list (e.g., #1953.)

dustin-decker · 2024-03-29T03:35:05Z

We're working on centralizing the filtering and can make it more configurable in the future. I have left the detectors that I saw that had custom lists in place.

The wordlists were also filtered to a 4-character minimum a while back, which resolved the case you identified in #1953, but I recognize that it's just one example and the problem can still occur in other cases.
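As an illustration of the minimum-length filtering mentioned above, here is a minimal sketch. The function name `filterShortWords` and the sample words are hypothetical and are not taken from the trufflehog codebase; the idea is only that very short words are dropped so they cannot match inside unrelated secrets.

```go
package main

import "fmt"

// filterShortWords is a hypothetical helper: it returns only the words
// whose length is at least minLen bytes.
func filterShortWords(words []string, minLen int) []string {
	kept := make([]string, 0, len(words))
	for _, w := range words {
		if len(w) >= minLen {
			kept = append(kept, w)
		}
	}
	return kept
}

func main() {
	// Sample words chosen for illustration only.
	words := []string{"abc", "test", "example"}
	fmt.Println(filterShortWords(words, 4))
}
```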

rgmz · 2024-04-02T17:19:51Z

pkg/detectors/falsepositives.go

+					if !IsKnownFalsePositive(string(result.Raw), falsePositives, wordCheck) {
+						filteredResults = append(filteredResults, result)
+					}


Do you think it would be useful to trace log that something was filtered? There have been a few times — most recently #2620 — where the cause of something not being detected wasn't immediately obvious, and required custom logging to see that the FP check was responsible.

(Incidentally, there's a lot of noise from this specific log)

@rgmz in internal discussions we've actually been leaning towards expanding --results to include these in order to solve that problem. We're not quite there yet but it seems like a potentially good way to go.

I like that. It would be an intuitive expansion of the flag.

I've extended the results flag with an additional filtered_unverified option. It just logs that the result was omitted; it doesn't actually 'notify' on it, though.
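The idea discussed in this thread, filtering unverified results centrally while leaving a trace of what was dropped, could look roughly like the sketch below. Everything here is an assumption for illustration: the `result` type, the simplified `isKnownFalsePositive`, and the `logf` callback are stand-ins, not the engine's actual types or logging API.

```go
package main

import (
	"fmt"
	"strings"
)

// result is a hypothetical stand-in for the detector result type.
type result struct {
	Raw      string
	Verified bool
}

// isKnownFalsePositive is a simplified, hypothetical stand-in for
// detectors.IsKnownFalsePositive: it reports whether raw contains any
// word from the list, ignoring case.
func isKnownFalsePositive(raw string, words []string) bool {
	lower := strings.ToLower(raw)
	for _, w := range words {
		if strings.Contains(lower, strings.ToLower(w)) {
			return true
		}
	}
	return false
}

// filterResults keeps verified results unconditionally and drops
// unverified results that match the word list, calling logf once for
// each dropped result so the omission is visible.
func filterResults(results []result, words []string, logf func(format string, args ...any)) []result {
	var kept []result
	for _, r := range results {
		if !r.Verified && isKnownFalsePositive(r.Raw, words) {
			logf("filtered unverified result: %q", r.Raw)
			continue
		}
		kept = append(kept, r)
	}
	return kept
}

func main() {
	results := []result{{Raw: "AKIAEXAMPLEKEY"}, {Raw: "sk_live_9f8a7b"}}
	kept := filterResults(results, []string{"example"}, func(format string, args ...any) {
		fmt.Printf(format+"\n", args...)
	})
	fmt.Println(len(kept))
}
```

Passing the logger as a callback keeps the sketch independent of any particular logging library.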

rosecodym · 2024-04-03T21:16:52Z

pkg/detectors/falsepositives.go

+			switch result.DetectorType {
+			case detectorspb.DetectorType_CustomRegex:
+				filteredResults = append(filteredResults, result)
+				break


Why are we breaking for these detector types? My understanding is that the effect will be that when a custom regex (or gcp) detector returns multiple unverified results, we'll ignore all but the first one. Is that intended? And even if it is, is this function the place for it? (It doesn't sound like it has anything to do with false positives.)

The lint action is actually flagging this as a "redundant break" because it is breaking out of the switch, not the for.

🤦 thanks for correcting me!
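For reference, the Go behavior at issue in this thread: an unlabeled `break` inside a `switch` exits only the `switch`, so an enclosing `for` loop keeps iterating. The sketch below uses hypothetical function names and plain strings rather than detector types; a labeled `break` is what would actually leave the loop.

```go
package main

import "fmt"

// countVisited mirrors the shape of the reviewed snippet: the break
// leaves the switch only, so every item is still visited.
func countVisited(items []string) int {
	visited := 0
	for _, item := range items {
		switch item {
		case "custom":
			break // exits the switch, not the for loop (redundant)
		}
		visited++
	}
	return visited
}

// countUntilCustom uses a labeled break, which does exit the loop as
// soon as "custom" is seen.
func countUntilCustom(items []string) int {
	visited := 0
loop:
	for _, item := range items {
		switch item {
		case "custom":
			break loop
		}
		visited++
	}
	return visited
}

func main() {
	items := []string{"a", "custom", "b"}
	fmt.Println(countVisited(items), countUntilCustom(items))
}
```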

rosecodym

Ok, I went through the detector changes and I made this list of detectors that had their false positive check expanded, i.e. these detectors now perform wordlist-based false positive checking on material that they did not before:

azurebatch used to filter on account key, but now filters on endpoint
azurecontainerregistry used to filter on password, but now filters on endpoint
column used to filter on nothing, but now filters on Raw
facebookoauth used to filter on api id, but now filters on api secret
githubapp used to filter on nothing, but now filters on private key
paypaloauth used to filter on key id, but now filters on key secret
postgres used to filter on the password only, but now filters on the entire connection string
pusherchannelkey used to filter on the secret, but now filters on the app id
shopify used to filter on the key, but now filters on the key+domain
trufflehogenterprise used to filter on the secret, but now filters on the key

I don't think that any of these are necessarily wrong but we should ensure that they're all acceptable. (E.g. shopify now runs word lists against domains.)

It also looks like the uri detector lost all of its custom logic (specifically, it no longer checks the allowKnownTestSites flag and no longer skips the word list). This seems less likely to have been intentional.

dustin-decker · 2024-04-08T22:00:21Z

Thank you for your analysis, Cody.

I've made the following changes; the rest seemed okay to me:

[x] azurebatch used to filter on account key, but now filters on endpoint - excluded
[x] azurecontainerregistry used to filter on password, but now filters on endpoint - excluded
[x] postgres used to filter on the password only, but now filters on the entire connection string - excluded, pattern is specific
[x] shopify used to filter on the key, but now filters on the key+domain - excluded, pattern is specific
[x] uri lost all of its custom logic (specifically it no longer checks the allowKnownTestSites flag, and no longer skips the word list). This seems less likely to have been intentional - excluded, it does still use the allowKnownTestSites flag

I've also extended the results flag with an additional filtered_unverified option. It just logs that the result was omitted; it doesn't actually 'notify' on it, though.

rosecodym

Looks pretty good! I'm happy to have been wrong about the work level. I noticed that my previous detector list was missing one: dotmailer used to run the FP check against the password, but now it runs it against the key ID, which looks like an email address. That might be another one to exclude.

It also looks like you've got some new lint errors, but nothing show-stopping. Thanks for doing this!

This PR adds false positive information to the Result protobuf message in anticipation of us tracking it as first-class secret metadata. We're not doing that yet (it's blocked behind #2643) but setting up the messages now means we'll be able to do it later with less of a code delta.

This PR:

Creates an optional interface that detectors can use to customize their false positive detection
Implements this interface on detectors that have custom logic (in most cases this "custom logic" is simply a no-op because the detector does not participate in false positive detection)
Eliminates inline (old-style) false positive exclusion in a few detectors that #2643 missed
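The "optional interface" pattern described above can be sketched in Go with a type assertion. All names below (`detector`, `customFalsePositiveChecker`, `defaultCheck`, and the two example detectors) are hypothetical and chosen for illustration; they are not claimed to match the identifiers used in the actual PR.

```go
package main

import (
	"fmt"
	"strings"
)

// result is a hypothetical stand-in for a detector result.
type result struct{ Raw string }

// detector is a hypothetical minimal detector interface.
type detector interface{ Name() string }

// customFalsePositiveChecker is a hypothetical optional interface that
// a detector may implement to override the default check.
type customFalsePositiveChecker interface {
	IsFalsePositive(r result) bool
}

// plainDetector does not implement the optional interface.
type plainDetector struct{}

func (plainDetector) Name() string { return "plain" }

// optOutDetector implements the optional interface as a no-op, which
// opts it out of false positive filtering entirely.
type optOutDetector struct{}

func (optOutDetector) Name() string                { return "optout" }
func (optOutDetector) IsFalsePositive(result) bool { return false }

// defaultCheck is a simplified placeholder for the shared word-list check.
func defaultCheck(r result) bool {
	return strings.Contains(strings.ToLower(r.Raw), "example")
}

// isFalsePositive prefers the detector's own logic when it implements
// the optional interface and falls back to the default check otherwise.
func isFalsePositive(d detector, r result) bool {
	if c, ok := d.(customFalsePositiveChecker); ok {
		return c.IsFalsePositive(r)
	}
	return defaultCheck(r)
}

func main() {
	r := result{Raw: "my-example-key"}
	fmt.Println(isFalsePositive(plainDetector{}, r), isFalsePositive(optOutDetector{}, r))
}
```

The appeal of this design is that detectors without special needs require no changes, while the engine remains the single place where filtering is applied.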

dustin-decker force-pushed the remove-fp-filtering branch from e58f065 to db5ab72 March 29, 2024 03:04

dustin-decker added 2 commits March 28, 2024 20:15

Remove detectors.IsKnownFalsePositive from detectors

fd0e685

Centralize false positive removal in engine

03098bc

dustin-decker force-pushed the remove-fp-filtering branch from db5ab72 to 03098bc March 29, 2024 03:15

dustin-decker added 3 commits March 28, 2024 20:19

Merge branch 'main' of github.com:trufflesecurity/trufflehog into remove-fp-filtering

55d030d

Don't apply fp filtering on custom regex to preserve previous behavior.

e2bf219

fix empty branch

ff4bb0a

dustin-decker marked this pull request as ready for review March 29, 2024 03:36

dustin-decker requested review from a team as code owners March 29, 2024 03:36

zricethezav approved these changes Mar 29, 2024

dustin-decker added 2 commits March 29, 2024 09:11

update excludes

f9e3d74

update filtering

3fce83e

rgmz reviewed Apr 2, 2024

rosecodym reviewed Apr 3, 2024

Add result flag option and exclude some detectors

211f183

rosecodym approved these changes Apr 9, 2024

rosecodym mentioned this pull request Apr 22, 2024

Add false positive info to protobuf #2729

Merged

2 tasks

dustin-decker merged commit 14e44db into main Apr 22, 2024
9 of 10 checks passed

dustin-decker deleted the remove-fp-filtering branch April 22, 2024 22:18

rgmz mentioned this pull request Apr 24, 2024

Scan commit metadata #2713

Merged

2 tasks

rosecodym mentioned this pull request Apr 24, 2024

Expose detector-specific false positive logic #2743

Merged

2 tasks

rosecodym mentioned this pull request May 6, 2024

Private keys not being detected by Trufflehog >=3.74.0 #2788

Closed

ahrav mentioned this pull request May 6, 2024

[bug] - Ignore FP check for the private key detector #2793

Merged

2 tasks

rgmz mentioned this pull request Jun 13, 2024

Return match/reason from detectors.IsKnownFalsePositive #2969

Merged

2 tasks
