Skip to content

Commit

Permalink
Merge pull request #2265 from DataDog/use-full-url-in-http-url-tag
Browse files Browse the repository at this point in the history
Use full URL in `http.url` tag
  • Loading branch information
lloeki committed Sep 23, 2022
2 parents c4e8a0a + 8c848ec commit 6970088
Show file tree
Hide file tree
Showing 16 changed files with 158 additions and 73 deletions.
2 changes: 1 addition & 1 deletion .rubocop.yml
Original file line number Diff line number Diff line change
Expand Up @@ -61,7 +61,7 @@ Style/NumericLiterals:
Enabled: false

Metrics/ClassLength:
Max: 140
Enabled: false

Metrics/BlockLength:
Max: 42
Expand Down
27 changes: 19 additions & 8 deletions docs/GettingStarted.md
Original file line number Diff line number Diff line change
Expand Up @@ -1522,39 +1522,50 @@ run app
| `headers` | Hash of HTTP request or response headers to add as tags to the `rack.request`. Accepts `request` and `response` keys with Array values e.g. `['Last-Modified']`. Adds `http.request.headers.*` and `http.response.headers.*` tags respectively. | `{ response: ['Content-Type', 'X-Request-ID'] }` |
| `middleware_names` | Enable this if you want to use the last executed middleware class as the resource name for the `rack` span. If enabled alongside the `rails` instrumention, `rails` takes precedence by setting the `rack` resource name to the active `rails` controller when applicable. Requires `application` option to use. | `false` |
| `quantize` | Hash containing options for quantization. May include `:query` or `:fragment`. | `{}` |
| `quantize.base` | Defines behavior for URL base (scheme, host, port). Removes URL base from `http.url` tag by default, leaving a path, and sets `http.base_url`. May be `:show` to keep URL base in `http.url` tag and not set `http.base_url` tag. Option must be nested inside the `quantize` option. | `nil` |
| `quantize.query` | Hash containing options for query portion of URL quantization. May include `:show` or `:exclude`. See options below. Option must be nested inside the `quantize` option. | `{}` |
| `quantize.query.show` | Defines which values should always be shown. Shows no values by default. May be an Array of strings, or `:all` to show all values. Option must be nested inside the `query` option. | `nil` |
| `quantize.query.exclude` | Defines which values should be removed entirely. Excludes nothing by default. May be an Array of strings, or `:all` to remove the query string entirely. Option must be nested inside the `query` option. | `nil` |
| `quantize.fragment` | Defines behavior for URL fragments. Removes fragments by default. May be `:show` to show URL fragments. Option must be nested inside the `quantize` option. | `nil` |
| `request_queuing` | Track HTTP request time spent in the queue of the frontend server. See [HTTP request queuing](#http-request-queuing) for setup details. Set to `true` to enable. | `false` |
| `web_service_name` | Service name for frontend server request queuing spans. (e.g. `'nginx'`) | `'web-server'` |

Deprecation notice: `quantize.base` will change its default from `:exclude` to `:show` in a future version. Voluntarily moving to `:show` is recommended.

**Configuring URL quantization behavior**

```ruby
Datadog.configure do |c|
# Default behavior: all values are quantized, fragment is removed.
# http://example.com/path?category_id=1&sort_by=asc#featured --> http://example.com/path?category_id&sort_by
# http://example.com/path?categories[]=1&categories[]=2 --> http://example.com/path?categories[]
# Default behavior: all values are quantized, base is removed, fragment is removed.
# http://example.com/path?category_id=1&sort_by=asc#featured --> /path?category_id&sort_by
# http://example.com:8080/path?categories[]=1&categories[]=2 --> /path?categories[]
# Remove URL base (scheme, host, port)
# http://example.com/path?category_id=1&sort_by=asc#featured --> /path?category_id&sort_by#featured
c.tracing.instrument :rack, quantize: { base: :exclude }
# Show URL base
# http://example.com/path?category_id=1&sort_by=asc#featured --> http://example.com/path?category_id&sort_by#featured
c.tracing.instrument :rack, quantize: { base: :show }
# Show values for any query string parameter matching 'category_id' exactly
# http://example.com/path?category_id=1&sort_by=asc#featured --> http://example.com/path?category_id=1&sort_by
# http://example.com/path?category_id=1&sort_by=asc#featured --> /path?category_id=1&sort_by
c.tracing.instrument :rack, quantize: { query: { show: ['category_id'] } }
# Show all values for all query string parameters
# http://example.com/path?category_id=1&sort_by=asc#featured --> http://example.com/path?category_id=1&sort_by=asc
# http://example.com/path?category_id=1&sort_by=asc#featured --> /path?category_id=1&sort_by=asc
c.tracing.instrument :rack, quantize: { query: { show: :all } }
# Totally exclude any query string parameter matching 'sort_by' exactly
# http://example.com/path?category_id=1&sort_by=asc#featured --> http://example.com/path?category_id
# http://example.com/path?category_id=1&sort_by=asc#featured --> /path?category_id
c.tracing.instrument :rack, quantize: { query: { exclude: ['sort_by'] } }
# Remove the query string entirely
# http://example.com/path?category_id=1&sort_by=asc#featured --> http://example.com/path
# http://example.com/path?category_id=1&sort_by=asc#featured --> /path
c.tracing.instrument :rack, quantize: { query: { exclude: :all } }
# Show URL fragments
# http://example.com/path?category_id=1&sort_by=asc#featured --> http://example.com/path?category_id&sort_by#featured
# http://example.com/path?category_id=1&sort_by=asc#featured --> /path?category_id&sort_by#featured
c.tracing.instrument :rack, quantize: { fragment: :show }
end
```
Expand Down
2 changes: 0 additions & 2 deletions lib/datadog/appsec/configuration/settings.rb
Original file line number Diff line number Diff line change
Expand Up @@ -5,7 +5,6 @@ module AppSec
module Configuration
# Configuration settings, acting as an integration registry
# TODO: as with Configuration, this is a trivial implementation
# rubocop:disable Metrics/ClassLength
class Settings
class << self
def boolean
Expand Down Expand Up @@ -188,7 +187,6 @@ def reset!
initialize
end
end
# rubocop:enable Metrics/ClassLength
end
end
end
3 changes: 0 additions & 3 deletions lib/datadog/core/configuration/agent_settings_resolver.rb
Original file line number Diff line number Diff line change
Expand Up @@ -18,8 +18,6 @@ module Configuration
#
# Whenever there is a conflict (different configurations are provided in different orders), it MUST warn the users
# about it and pick a value based on the following priority: code > environment variable > defaults.
#
# rubocop:disable Metrics/ClassLength
class AgentSettingsResolver
AgentSettings = \
Struct.new(
Expand Down Expand Up @@ -359,7 +357,6 @@ def adapter(kind_or_custom_adapter, *args, **kwargs)
end
end
end
# rubocop:enable Metrics/ClassLength
end
end
end
2 changes: 0 additions & 2 deletions lib/datadog/core/configuration/components.rb
Original file line number Diff line number Diff line change
Expand Up @@ -18,7 +18,6 @@ module Datadog
module Core
module Configuration
# Global components for the trace library.
# rubocop:disable Metrics/ClassLength
class Components
class << self
def build_health_metrics(settings)
Expand Down Expand Up @@ -431,7 +430,6 @@ def shutdown!(replacement = nil)
telemetry.emit_closing! unless replacement
end
end
# rubocop:enable Metrics/ClassLength
end
end
end
3 changes: 0 additions & 3 deletions lib/datadog/core/configuration/settings.rb
Original file line number Diff line number Diff line change
Expand Up @@ -15,7 +15,6 @@ module Configuration
# Global configuration settings for the trace library.
# @public_api
# rubocop:disable Metrics/BlockLength
# rubocop:disable Metrics/ClassLength
# rubocop:disable Layout/LineLength
class Settings
include Base
Expand Down Expand Up @@ -713,9 +712,7 @@ def initialize(*_)
end
end
end

# rubocop:enable Metrics/BlockLength
# rubocop:enable Metrics/ClassLength
# rubocop:enable Layout/LineLength
end
end
Expand Down
2 changes: 1 addition & 1 deletion lib/datadog/profiling/collectors/old_stack.rb
Original file line number Diff line number Diff line change
Expand Up @@ -15,7 +15,7 @@ module Collectors
# Runs on its own background thread.
#
# This class has the prefix "Old" because it will be deprecated by the new native CPU Profiler
class OldStack < Core::Worker # rubocop:disable Metrics/ClassLength
class OldStack < Core::Worker
include Core::Workers::Polling

DEFAULT_MAX_TIME_USAGE_PCT = 2.0
Expand Down
88 changes: 62 additions & 26 deletions lib/datadog/tracing/contrib/rack/middlewares.rb
Original file line number Diff line number Diff line change
Expand Up @@ -123,18 +123,6 @@ def call(env)
# rubocop:disable Metrics/PerceivedComplexity
# rubocop:disable Metrics/MethodLength
def set_request_tags!(trace, request_span, env, status, headers, response, original_env)
# http://www.rubydoc.info/github/rack/rack/file/SPEC
# The source of truth in Rack is the PATH_INFO key that holds the
# URL for the current request; but some frameworks may override that
# value, especially during exception handling.
#
# Because of this, we prefer to use REQUEST_URI, if available, which is the
# relative path + query string, and doesn't mutate.
#
# REQUEST_URI is only available depending on what web server is running though.
# So when its not available, we want the original, unmutated PATH_INFO, which
# is just the relative path without query strings.
url = env['REQUEST_URI'] || original_env['PATH_INFO']
request_header_collection = Header::RequestHeaderCollection.new(env)
request_headers_tags = parse_request_headers(request_header_collection)
response_headers_tags = parse_response_headers(headers || {})
Expand Down Expand Up @@ -176,14 +164,35 @@ def set_request_tags!(trace, request_span, env, status, headers, response, origi
request_span.set_tag(Tracing::Metadata::Ext::HTTP::TAG_METHOD, env['REQUEST_METHOD'])
end

url = parse_url(env, original_env)

if request_span.get_tag(Tracing::Metadata::Ext::HTTP::TAG_URL).nil?
options = configuration[:quantize]
options = configuration[:quantize] || {}

# Quantization::HTTP.url base defaults to :show, but we are transitioning
options[:base] ||= :exclude

request_span.set_tag(
Tracing::Metadata::Ext::HTTP::TAG_URL,
Contrib::Utils::Quantization::HTTP.url(url, options)
)
end

if request_span.get_tag(Tracing::Metadata::Ext::HTTP::TAG_BASE_URL).nil?
options = configuration[:quantize]

unless options[:base] == :show
base_url = Contrib::Utils::Quantization::HTTP.base_url(url)

unless base_url.empty?
request_span.set_tag(
Tracing::Metadata::Ext::HTTP::TAG_BASE_URL,
base_url
)
end
end
end

if request_span.get_tag(Tracing::Metadata::Ext::HTTP::TAG_CLIENT_IP).nil?
Tracing::ClientIp.set_client_ip_tag(
request_span,
Expand All @@ -192,19 +201,6 @@ def set_request_tags!(trace, request_span, env, status, headers, response, origi
)
end

if request_span.get_tag(Tracing::Metadata::Ext::HTTP::TAG_BASE_URL).nil?
request_obj = ::Rack::Request.new(env)

base_url = if request_obj.respond_to?(:base_url)
request_obj.base_url
else
# Compatibility for older Rack versions
request_obj.url.chomp(request_obj.fullpath)
end

request_span.set_tag(Tracing::Metadata::Ext::HTTP::TAG_BASE_URL, base_url)
end

if request_span.get_tag(Tracing::Metadata::Ext::HTTP::TAG_STATUS_CODE).nil? && status
request_span.set_tag(Tracing::Metadata::Ext::HTTP::TAG_STATUS_CODE, status)
end
Expand Down Expand Up @@ -238,6 +234,46 @@ def configuration
Datadog.configuration.tracing[:rack]
end

def parse_url(env, original_env)
request_obj = ::Rack::Request.new(env)

# scheme, host, and port
base_url = if request_obj.respond_to?(:base_url)
request_obj.base_url
else
# Compatibility for older Rack versions
request_obj.url.chomp(request_obj.fullpath)
end

# https://github.com/rack/rack/blob/main/SPEC.rdoc
#
# The source of truth in Rack is the PATH_INFO key that holds the
# URL for the current request; but some frameworks may override that
# value, especially during exception handling.
#
# Because of this, we prefer to use REQUEST_URI, if available, which is the
# relative path + query string, and doesn't mutate.
#
# REQUEST_URI is only available depending on what web server is running though.
# So when its not available, we want the original, unmutated PATH_INFO, which
# is just the relative path without query strings.
#
# SCRIPT_NAME is the first part of the request URL path, so that
# the application can know its virtual location. It should be
# prepended to PATH_INFO to reflect the correct user visible path.
request_uri = env['REQUEST_URI'].to_s
fullpath = if request_uri.empty?
query_string = original_env['QUERY_STRING'].to_s
path = original_env['SCRIPT_NAME'].to_s + original_env['PATH_INFO'].to_s

query_string.empty? ? path : "#{path}?#{query_string}"
else
request_uri
end

base_url + fullpath
end

def parse_user_agent_header(headers)
headers.get(Tracing::Metadata::Ext::HTTP::HEADER_USER_AGENT)
end
Expand Down
16 changes: 15 additions & 1 deletion lib/datadog/tracing/contrib/utils/quantization/http.rb
Original file line number Diff line number Diff line change
Expand Up @@ -22,6 +22,14 @@ def url(url, options = {})
options[:placeholder] || PLACEHOLDER
end

def base_url(url, options = {})
URI.parse(url).tap do |uri|
uri.path = ''
uri.query = nil
uri.fragment = nil
end.to_s
end

def url!(url, options = {})
options ||= {}

Expand All @@ -32,8 +40,14 @@ def url!(url, options = {})
uri.query = (!query.nil? && query.empty? ? nil : query)
end

# Remove any URI framents
# Remove any URI fragments
uri.fragment = nil unless options[:fragment] == :show

if options[:base] == :exclude
uri.host = nil
uri.port = nil
uri.scheme = nil
end
end.to_s
end

Expand Down
2 changes: 0 additions & 2 deletions lib/datadog/tracing/span_operation.rb
Original file line number Diff line number Diff line change
Expand Up @@ -18,7 +18,6 @@ module Tracing
# It gives a Span a context which can be used to
# build a Span. When completed, it yields the Span.
#
# rubocop:disable Metrics/ClassLength
# @public_api
class SpanOperation
include Metadata
Expand Down Expand Up @@ -516,6 +515,5 @@ def duration_nano
alias :span_type :type
alias :span_type= :type=
end
# rubocop:enable Metrics/ClassLength
end
end
2 changes: 0 additions & 2 deletions lib/datadog/tracing/trace_operation.rb
Original file line number Diff line number Diff line change
Expand Up @@ -22,7 +22,6 @@ module Tracing
# For async support, a {Datadog::Tracing::TraceOperation} should be employed
# per execution context (e.g. Thread, etc.)
#
# rubocop:disable Metrics/ClassLength
# @public_api
class TraceOperation
include Metadata::Tagging
Expand Down Expand Up @@ -454,6 +453,5 @@ def build_trace(spans, partial = false)
)
end
end
# rubocop:enable Metrics/ClassLength
end
end
2 changes: 0 additions & 2 deletions lib/datadog/tracing/trace_segment.rb
Original file line number Diff line number Diff line change
Expand Up @@ -11,7 +11,6 @@ module Datadog
module Tracing
# Serializable construct representing a trace
# @public_api
# rubocop:disable Metrics/ClassLength
class TraceSegment
TAG_NAME = 'name'.freeze
TAG_RESOURCE = 'resource'.freeze
Expand Down Expand Up @@ -203,6 +202,5 @@ def service_tag
meta[TAG_SERVICE]
end
end
# rubocop:enable Metrics/ClassLength
end
end
2 changes: 0 additions & 2 deletions lib/datadog/tracing/tracer.rb
Original file line number Diff line number Diff line change
Expand Up @@ -23,7 +23,6 @@ module Tracing
# example, a trace can be used to track the entire time spent processing a complicated web request.
# Even though the request may require multiple resources and machines to handle the request, all
# of these function calls and sub-requests would be encapsulated within a single trace.
# rubocop:disable Metrics/ClassLength
class Tracer
attr_reader \
:trace_flush,
Expand Down Expand Up @@ -528,6 +527,5 @@ def skip_trace(name)
end
end
end
# rubocop:enable Metrics/ClassLength
end
end
Loading

0 comments on commit 6970088

Please sign in to comment.