Merge pull request #2283 from DataDog/add-query-string-automatic-redaction

Add query string automatic redaction
lloeki authored Sep 23, 2022
2 parents 6970088 + 91bee81 commit 0a9a7ae
Showing 14 changed files with 350 additions and 33 deletions.
3 changes: 1 addition & 2 deletions .rubocop.yml
@@ -70,8 +70,7 @@ Metrics/BlockLength:
     - spec/**/*

 Metrics/ModuleLength:
-  Exclude:
-    - spec/**/*
+  Enabled: false

 Metrics/ParameterLists:
   Enabled: false
27 changes: 22 additions & 5 deletions docs/GettingStarted.md
@@ -1522,15 +1522,20 @@ run app
 | `headers` | Hash of HTTP request or response headers to add as tags to the `rack.request`. Accepts `request` and `response` keys with Array values e.g. `['Last-Modified']`. Adds `http.request.headers.*` and `http.response.headers.*` tags respectively. | `{ response: ['Content-Type', 'X-Request-ID'] }` |
 | `middleware_names` | Enable this if you want to use the last executed middleware class as the resource name for the `rack` span. If enabled alongside the `rails` instrumention, `rails` takes precedence by setting the `rack` resource name to the active `rails` controller when applicable. Requires `application` option to use. | `false` |
 | `quantize` | Hash containing options for quantization. May include `:query` or `:fragment`. | `{}` |
-| `quantize.base` | Defines behavior for URL base (scheme, host, port). Removes URL base from `http.url` tag by default, leaving a path, and sets `http.base_url`. May be `:show` to keep URL base in `http.url` tag and not set `http.base_url` tag. Option must be nested inside the `quantize` option. | `nil` |
+| `quantize.base` | Defines behavior for URL base (scheme, host, port). May be `:show` to keep URL base in `http.url` tag and not set `http.base_url` tag, or `nil` to remove URL base from `http.url` tag by default, leaving a path and setting `http.base_url`. Option must be nested inside the `quantize` option. | `nil` |
 | `quantize.query` | Hash containing options for query portion of URL quantization. May include `:show` or `:exclude`. See options below. Option must be nested inside the `quantize` option. | `{}` |
-| `quantize.query.show` | Defines which values should always be shown. Shows no values by default. May be an Array of strings, or `:all` to show all values. Option must be nested inside the `query` option. | `nil` |
-| `quantize.query.exclude` | Defines which values should be removed entirely. Excludes nothing by default. May be an Array of strings, or `:all` to remove the query string entirely. Option must be nested inside the `query` option. | `nil` |
-| `quantize.fragment` | Defines behavior for URL fragments. Removes fragments by default. May be `:show` to show URL fragments. Option must be nested inside the `quantize` option. | `nil` |
+| `quantize.query.show` | Defines which values should always be shown. May be an Array of strings, `:all` to show all values, or `nil` to show no values. Option must be nested inside the `query` option. | `nil` |
+| `quantize.query.exclude` | Defines which values should be removed entirely. May be an Array of strings, `:all` to remove the query string entirely, or `nil` to exclude nothing. Option must be nested inside the `query` option. | `nil` |
+| `quantize.query.obfuscate` | Defines query string redaction behaviour. May be a hash of options, `:internal` to use the default internal obfuscation settings, or `nil` to disable obfuscation. Note that obfuscation is a string-wise operation, not a key-value operation. When enabled, `query.show` defaults to `:all` if otherwise unset. Option must be nested inside the `query` option. | `nil` |
+| `quantize.query.obfuscate.with` | Defines the string to replace obfuscated matches with. May be a String. Option must be nested inside the `query.obfuscate` option. | `'<redacted>'` |
+| `quantize.query.obfuscate.regex` | Defines the regex with which the query string will be redacted. May be a Regexp, or `:internal` to use the default internal Regexp, which redacts well-known sensitive data. Each match is redacted entirely by replacing it with `query.obfuscate.with`. Option must be nested inside the `query.obfuscate` option. | `:internal` |
+| `quantize.fragment` | Defines behavior for URL fragments. May be `:show` to show URL fragments, or `nil` to remove fragments. Option must be nested inside the `quantize` option. | `nil` |
 | `request_queuing` | Track HTTP request time spent in the queue of the frontend server. See [HTTP request queuing](#http-request-queuing) for setup details. Set to `true` to enable. | `false` |
 | `web_service_name` | Service name for frontend server request queuing spans. (e.g. `'nginx'`) | `'web-server'` |

-Deprecation notice: `quantize.base` will change its default from `:exclude` to `:show` in a future version. Voluntarily moving to `:show` is recommended.
+Deprecation notice:
+- `quantize.base` will change its default from `:exclude` to `:show` in a future version. Voluntarily moving to `:show` is recommended.
+- `quantize.query.show` will change its default to `:all` in a future version, together with `quantize.query.obfuscate` changing to `:internal`. Voluntarily moving to these future values is recommended.

**Configuring URL quantization behavior**

@@ -1567,6 +1572,18 @@ Datadog.configure do |c|
   # Show URL fragments
   # http://example.com/path?category_id=1&sort_by=asc#featured --> /path?category_id&sort_by#featured
   c.tracing.instrument :rack, quantize: { fragment: :show }
+
+  # Obfuscate query string, defaulting to showing all values
+  # http://example.com/path?password=qwerty&sort_by=asc#featured --> /path?<redacted>&sort_by=asc
+  c.tracing.instrument :rack, quantize: { query: { obfuscate: {} } }
+
+  # Obfuscate query string using the provided regex, defaulting to showing all values
+  # http://example.com/path?category_id=1&sort_by=asc#featured --> /path?<redacted>&sort_by=asc
+  c.tracing.instrument :rack, quantize: { query: { obfuscate: { regex: /category_id=\d+/ } } }
+
+  # Obfuscate query string using a custom redaction string
+  # http://example.com/path?password=qwerty&sort_by=asc#featured --> /path?REMOVED&sort_by=asc
+  c.tracing.instrument :rack, quantize: { query: { obfuscate: { with: 'REMOVED' } } }
 end
 ```
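A minimal standalone sketch of the string-wise redaction described in the `quantize.query.obfuscate` documentation rows may help here. It does not call ddtrace: `DEMO_REGEX` and `redact` are hypothetical names, and the pattern is a deliberately small stand-in for the much broader internal regex.

```ruby
# Hypothetical stand-in for the internal regex (assumption):
# it only matches `password=...` and `token=...` up to the next '&'.
DEMO_REGEX = /(?:password|token)=[^&]+/i.freeze

# Replace every regex match in the raw query string.
# The query is never parsed into key-value pairs.
def redact(query, regex: DEMO_REGEX, with: '<redacted>')
  query.gsub(regex, with)
end

puts redact('password=qwerty&sort_by=asc')
puts redact('password=qwerty&sort_by=asc', with: 'REMOVED')
puts redact('category_id=1&sort_by=asc', regex: /category_id=\d+/)
```

Because each match covers `key=value` as a plain substring, the key disappears together with the value, which is what "string-wise, not key-value" amounts to in practice.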

@@ -68,7 +68,6 @@ def self.libdatadog_folder_relative_to_native_lib_folder(

       # Used to check if profiler is supported, including user-visible clear messages explaining why their
       # system may not be supported.
-      # rubocop:disable Metrics/ModuleLength
       module Supported
         private_class_method def self.explain_issue(*reason, suggested:)
           { reason: reason, suggested: suggested }
@@ -284,7 +283,6 @@ def self.pkg_config_missing?(command: $PKGCONFIG) # rubocop:disable Style/Global
           no_binaries_for_current_platform unless Libdatadog.pkgconfig_folder
         end
       end
-      # rubocop:enable Metrics/ModuleLength
     end
   end
 end
2 changes: 0 additions & 2 deletions lib/datadog/appsec/contrib/rack/gateway/watcher.rb
@@ -13,7 +13,6 @@ module Contrib
       module Rack
         module Gateway
           # Watcher for Rack gateway events
-          # rubocop:disable Metrics/ModuleLength
           module Watcher
             # rubocop:disable Metrics/AbcSize
             # rubocop:disable Metrics/MethodLength
@@ -161,7 +160,6 @@ def active_span
               end
             end
           end
-          # rubocop:enable Metrics/ModuleLength
         end
       end
     end
2 changes: 0 additions & 2 deletions lib/datadog/ci/ext/environment.rb
@@ -10,7 +10,6 @@ module Datadog
   module CI
     module Ext
       # Defines constants for CI tags
-      # rubocop:disable Metrics/ModuleLength:
       module Environment
         include Kernel # Ensure that kernel methods are always available (https://sorbet.org/docs/error-reference#7003)

@@ -513,7 +512,6 @@ def extract_name_email(name_and_email)
           [nil, name_and_email]
         end
       end
-      # rubocop:enable Metrics/ModuleLength:
     end
   end
 end
2 changes: 1 addition & 1 deletion lib/datadog/core/configuration.rb
@@ -9,7 +9,7 @@
 module Datadog
   module Core
     # Configuration provides a unique access point for configurations
-    module Configuration # rubocop:disable Metrics/ModuleLength
+    module Configuration
       include Kernel # Ensure that kernel methods are always available (https://sorbet.org/docs/error-reference#7003)

       # Used to ensure that @components initialization/reconfiguration is performed one-at-a-time, by a single thread.
2 changes: 0 additions & 2 deletions lib/datadog/core/telemetry/collector.rb
@@ -18,7 +18,6 @@ module Datadog
   module Core
     module Telemetry
       # Module defining methods for collecting metadata for telemetry
-      # rubocop:disable Metrics/ModuleLength
       module Collector
         include Datadog::Core::Configuration

@@ -228,7 +227,6 @@ def patch_error(integration)
           end
         end
       end
-      # rubocop:enable Metrics/ModuleLength
     end
   end
 end
2 changes: 0 additions & 2 deletions lib/datadog/core/workers/async.rb
@@ -8,7 +8,6 @@ module Workers
       module Async
         # Adds threading behavior to workers
         # to run tasks asynchronously.
-        # rubocop:disable Metrics/ModuleLength
         module Thread
           FORK_POLICY_STOP = :stop
           FORK_POLICY_RESTART = :restart
@@ -175,7 +174,6 @@ def restart_after_fork(&block)
             end
           end
         end
-        # rubocop:enable Metrics/ModuleLength
       end
     end
   end
2 changes: 1 addition & 1 deletion lib/datadog/profiling.rb
@@ -6,7 +6,7 @@

 module Datadog
   # Contains profiler for generating stack profiles, etc.
-  module Profiling # rubocop:disable Metrics/ModuleLength
+  module Profiling
     GOOGLE_PROTOBUF_MINIMUM_VERSION = Gem::Version.new('3.0')
     private_constant :GOOGLE_PROTOBUF_MINIMUM_VERSION

2 changes: 0 additions & 2 deletions lib/datadog/tracing/contrib/aws/services.rb
@@ -3,7 +3,6 @@
 module Datadog
   module Tracing
     module Contrib
-      # rubocop:disable Metrics/ModuleLength:
       module Aws
         SERVICES = %w[
           ACM
@@ -117,7 +116,6 @@ module Aws
           XRay
         ].freeze
       end
-      # rubocop:enable Metrics/ModuleLength:
     end
   end
 end
2 changes: 0 additions & 2 deletions lib/datadog/tracing/contrib/ethon/easy_patch.rb
@@ -18,7 +18,6 @@ def self.included(base)
           end

           # InstanceMethods - implementing instrumentation
-          # rubocop:disable Metrics/ModuleLength
           module InstanceMethods
             include Contrib::HttpAnnotationHelper

@@ -168,7 +167,6 @@ def analytics_sample_rate
               datadog_configuration[:analytics_sample_rate]
             end
           end
-          # rubocop:enable Metrics/ModuleLength
         end
       end
     end
2 changes: 0 additions & 2 deletions lib/datadog/tracing/contrib/grape/endpoint.rb
@@ -9,7 +9,6 @@ module Datadog
   module Tracing
     module Contrib
       module Grape
-        # rubocop:disable Metrics/ModuleLength
         # Endpoint module includes a list of subscribers to create
         # traces when a Grape endpoint is hit
         module Endpoint
@@ -245,7 +244,6 @@ def datadog_configuration
             end
           end
         end
-        # rubocop:enable Metrics/ModuleLength
       end
     end
   end
76 changes: 68 additions & 8 deletions lib/datadog/tracing/contrib/utils/quantization/http.rb
@@ -59,22 +59,26 @@ def query(query, options = {})

             def query!(query, options = {})
               options ||= {}
-              options[:show] = options[:show] || []
+              options[:obfuscate] = {} if options[:obfuscate] == :internal
+              options[:show] = options[:show] || (options[:obfuscate] ? :all : [])
               options[:exclude] = options[:exclude] || []

               # Short circuit if query string is meant to exclude everything
               # or if the query string is meant to include everything
               return '' if options[:exclude] == :all
-              return query if options[:show] == :all

-              collect_query(query, uniq: true) do |key, value|
-                if options[:exclude].include?(key)
-                  [nil, nil]
-                else
-                  value = options[:show].include?(key) ? value : nil
-                  [key, value]
+              unless options[:show] == :all && !(options[:obfuscate] && options[:exclude])
+                query = collect_query(query, uniq: true) do |key, value|
+                  if options[:exclude].include?(key)
+                    [nil, nil]
+                  else
+                    value = options[:show] == :all || options[:show].include?(key) ? value : nil
+                    [key, value]
+                  end
                 end
               end
+
+              options[:obfuscate] ? obfuscate_query(query, options[:obfuscate]) : query
             end

             # Iterate over each key value pair, yielding to the block given.
@@ -105,6 +109,62 @@ def collect_query(query, options = {})
             end

             private_class_method :collect_query
+
+            # Scans over the query string and obfuscates sensitive data by
+            # replacing matches with an opaque value
+            def obfuscate_query(query, options = {})
+              options[:regex] = nil if options[:regex] == :internal
+              re = options[:regex] || OBFUSCATOR_REGEX
+              with = options[:with] || OBFUSCATOR_WITH
+
+              query.gsub(re, with)
+            end
+
+            private_class_method :obfuscate_query
+
+            OBFUSCATOR_WITH = '<redacted>'.freeze
+
+            # rubocop:disable Layout/LineLength
+            OBFUSCATOR_REGEX = %r{
+              (?: # JSON-ish leading quote
+                (?:"|%22)?
+              )
+              (?: # common keys
+                (?:old_?|new_?)?p(?:ass)?w(?:or)?d(?:1|2)? # pw, password variants
+                |pass(?:_?phrase)? # pass, passphrase variants
+                |secret
+                |(?: # key, key_id variants
+                  api_?
+                  |private_?
+                  |public_?
+                  |access_?
+                  |secret_?
+                )key(?:_?id)?
+                |token
+                |consumer_?(?:id|key|secret)
+                |sign(?:ed|ature)?
+                |auth(?:entication|orization)?
+              )
+              (?:
+                # '=' query string separator, plus value til next '&' separator
+                (?:\s|%20)*(?:=|%3D)[^&]+
+                # JSON-ish '": "somevalue"', key being handled with case above, without the opening '"'
+                |(?:"|%22) # closing '"' at end of key
+                (?:\s|%20)*(?::|%3A)(?:\s|%20)* # ':' key-value spearator, with surrounding spaces
+                (?:"|%22) # opening '"' at start of value
+                (?:%2[^2]|%[^2]|[^"%])+ # value
+                (?:"|%22) # closing '"' at end of value
+              )
+              |(?: # other common secret values
+                bearer(?:\s|%20)+[a-z0-9._\-]+
+                |token(?::|%3A)[a-z0-9]{13}
+                |gh[opsu]_[0-9a-zA-Z]{36}
+                |ey[I-L](?:[\w=-]|%3D)+\.ey[I-L](?:[\w=-]|%3D)+(?:\.(?:[\w.+/=-]|%3D|%2F|%2B)+)?
+                |-{5}BEGIN(?:[a-z\s]|%20)+PRIVATE(?:\s|%20)KEY-{5}[^\-]+-{5}END(?:[a-z\s]|%20)+PRIVATE(?:\s|%20)KEY(?:-{5})?(?:\n|%0A)?
+                |(?:ssh-(?:rsa|dss)|ecdsa-[a-z0-9]+-[a-z0-9]+)(?:\s|%20)*(?:[a-z0-9/.+]|%2F|%5C|%2B){100,}(?:=|%3D)*(?:(?:\s+)[a-z0-9._-]+)?
+              )
+            }ix.freeze
+            # rubocop:enable Layout/LineLength
           end
         end
       end
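The option handling in `query!` can be sketched in isolation as follows. This is a simplified approximation, not the library code: `quantize_query` and `NARROW_REGEX` are hypothetical names, the pair splitting stands in for `collect_query` (whose `uniq:` behaviour is not reproduced), and the regex is far narrower than `OBFUSCATOR_REGEX`.

```ruby
# Hypothetical narrow pattern standing in for OBFUSCATOR_REGEX (assumption).
NARROW_REGEX = /(?:password|token)=[^&]+/i.freeze

# Simplified sketch of the flow: normalise options, apply exclude/show per
# pair, then run string-wise obfuscation on the resulting query string.
def quantize_query(query, options = {})
  obfuscate = options[:obfuscate]
  obfuscate = {} if obfuscate == :internal
  # With obfuscation enabled, values are shown unless :show says otherwise.
  show = options[:show] || (obfuscate ? :all : [])
  exclude = options[:exclude] || []

  return '' if exclude == :all

  pairs = query.split('&').map { |pair| pair.split('=', 2) }
  kept = pairs.reject { |key, _| exclude.include?(key) }
  quantized = kept.map do |key, value|
    (show == :all || show.include?(key)) && value ? "#{key}=#{value}" : key
  end.join('&')

  return quantized unless obfuscate

  quantized.gsub(obfuscate[:regex] || NARROW_REGEX, obfuscate[:with] || '<redacted>')
end

puts quantize_query('category_id=1&sort_by=asc')
puts quantize_query('password=qwerty&sort_by=asc', obfuscate: :internal)
puts quantize_query('password=qwerty&sort_by=asc', obfuscate: { with: 'REMOVED' }, exclude: ['sort_by'])
```

In this sketch, exclusion and value hiding run before redaction, so a key whose value was already stripped by `show` no longer matches a `key=value` pattern.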