Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use full URL in http.url tag #2265

Merged
merged 9 commits into from
Sep 23, 2022
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 1 addition & 1 deletion .rubocop.yml
Original file line number Diff line number Diff line change
Expand Up @@ -61,7 +61,7 @@ Style/NumericLiterals:
Enabled: false

Metrics/ClassLength:
Max: 140
Enabled: false

Metrics/BlockLength:
Max: 42
Expand Down
27 changes: 19 additions & 8 deletions docs/GettingStarted.md
Original file line number Diff line number Diff line change
Expand Up @@ -1522,39 +1522,50 @@ run app
| `headers` | Hash of HTTP request or response headers to add as tags to the `rack.request`. Accepts `request` and `response` keys with Array values e.g. `['Last-Modified']`. Adds `http.request.headers.*` and `http.response.headers.*` tags respectively. | `{ response: ['Content-Type', 'X-Request-ID'] }` |
| `middleware_names` | Enable this if you want to use the last executed middleware class as the resource name for the `rack` span. If enabled alongside the `rails` instrumention, `rails` takes precedence by setting the `rack` resource name to the active `rails` controller when applicable. Requires `application` option to use. | `false` |
| `quantize` | Hash containing options for quantization. May include `:query` or `:fragment`. | `{}` |
| `quantize.base` | Defines behavior for URL base (scheme, host, port). Removes URL base from `http.url` tag by default, leaving a path, and sets `http.base_url`. May be `:show` to keep URL base in `http.url` tag and not set `http.base_url` tag. Option must be nested inside the `quantize` option. | `nil` |
| `quantize.query` | Hash containing options for query portion of URL quantization. May include `:show` or `:exclude`. See options below. Option must be nested inside the `quantize` option. | `{}` |
| `quantize.query.show` | Defines which values should always be shown. Shows no values by default. May be an Array of strings, or `:all` to show all values. Option must be nested inside the `query` option. | `nil` |
| `quantize.query.exclude` | Defines which values should be removed entirely. Excludes nothing by default. May be an Array of strings, or `:all` to remove the query string entirely. Option must be nested inside the `query` option. | `nil` |
| `quantize.fragment` | Defines behavior for URL fragments. Removes fragments by default. May be `:show` to show URL fragments. Option must be nested inside the `quantize` option. | `nil` |
| `request_queuing` | Track HTTP request time spent in the queue of the frontend server. See [HTTP request queuing](#http-request-queuing) for setup details. Set to `true` to enable. | `false` |
| `web_service_name` | Service name for frontend server request queuing spans. (e.g. `'nginx'`) | `'web-server'` |

Deprecation notice: `quantize.base` will change its default from `:exclude` to `:show` in a future version. Voluntarily moving to `:show` is recommended.

**Configuring URL quantization behavior**

```ruby
Datadog.configure do |c|
# Default behavior: all values are quantized, fragment is removed.
# http://example.com/path?category_id=1&sort_by=asc#featured --> http://example.com/path?category_id&sort_by
# http://example.com/path?categories[]=1&categories[]=2 --> http://example.com/path?categories[]
Copy link
Contributor Author

@lloeki lloeki Sep 16, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ironically, the doc here was consistent with the internal Quantization::HTTP.url but not with the user-facing Rack integration http.url tagging behaviour!

# Default behavior: all values are quantized, base is removed, fragment is removed.
# http://example.com/path?category_id=1&sort_by=asc#featured --> /path?category_id&sort_by
# http://example.com:8080/path?categories[]=1&categories[]=2 --> /path?categories[]

# Remove URL base (scheme, host, port)
# http://example.com/path?category_id=1&sort_by=asc#featured --> /path?category_id&sort_by#featured
c.tracing.instrument :rack, quantize: { base: :exclude }

# Show URL base
# http://example.com/path?category_id=1&sort_by=asc#featured --> http://example.com/path?category_id&sort_by#featured
c.tracing.instrument :rack, quantize: { base: :show }

# Show values for any query string parameter matching 'category_id' exactly
# http://example.com/path?category_id=1&sort_by=asc#featured --> http://example.com/path?category_id=1&sort_by
# http://example.com/path?category_id=1&sort_by=asc#featured --> /path?category_id=1&sort_by
c.tracing.instrument :rack, quantize: { query: { show: ['category_id'] } }
marcotc marked this conversation as resolved.
Show resolved Hide resolved

# Show all values for all query string parameters
# http://example.com/path?category_id=1&sort_by=asc#featured --> http://example.com/path?category_id=1&sort_by=asc
# http://example.com/path?category_id=1&sort_by=asc#featured --> /path?category_id=1&sort_by=asc
c.tracing.instrument :rack, quantize: { query: { show: :all } }

# Totally exclude any query string parameter matching 'sort_by' exactly
# http://example.com/path?category_id=1&sort_by=asc#featured --> http://example.com/path?category_id
# http://example.com/path?category_id=1&sort_by=asc#featured --> /path?category_id
c.tracing.instrument :rack, quantize: { query: { exclude: ['sort_by'] } }

# Remove the query string entirely
# http://example.com/path?category_id=1&sort_by=asc#featured --> http://example.com/path
# http://example.com/path?category_id=1&sort_by=asc#featured --> /path
c.tracing.instrument :rack, quantize: { query: { exclude: :all } }

# Show URL fragments
# http://example.com/path?category_id=1&sort_by=asc#featured --> http://example.com/path?category_id&sort_by#featured
# http://example.com/path?category_id=1&sort_by=asc#featured --> /path?category_id&sort_by#featured
c.tracing.instrument :rack, quantize: { fragment: :show }
end
```
Expand Down
2 changes: 0 additions & 2 deletions lib/datadog/appsec/configuration/settings.rb
Original file line number Diff line number Diff line change
Expand Up @@ -5,7 +5,6 @@ module AppSec
module Configuration
# Configuration settings, acting as an integration registry
# TODO: as with Configuration, this is a trivial implementation
# rubocop:disable Metrics/ClassLength
class Settings
class << self
def boolean
Expand Down Expand Up @@ -188,7 +187,6 @@ def reset!
initialize
end
end
# rubocop:enable Metrics/ClassLength
end
end
end
3 changes: 0 additions & 3 deletions lib/datadog/core/configuration/agent_settings_resolver.rb
Original file line number Diff line number Diff line change
Expand Up @@ -18,8 +18,6 @@ module Configuration
#
# Whenever there is a conflict (different configurations are provided in different orders), it MUST warn the users
# about it and pick a value based on the following priority: code > environment variable > defaults.
#
# rubocop:disable Metrics/ClassLength
class AgentSettingsResolver
AgentSettings = \
Struct.new(
Expand Down Expand Up @@ -359,7 +357,6 @@ def adapter(kind_or_custom_adapter, *args, **kwargs)
end
end
end
# rubocop:enable Metrics/ClassLength
end
end
end
2 changes: 0 additions & 2 deletions lib/datadog/core/configuration/components.rb
Original file line number Diff line number Diff line change
Expand Up @@ -18,7 +18,6 @@ module Datadog
module Core
module Configuration
# Global components for the trace library.
# rubocop:disable Metrics/ClassLength
class Components
class << self
def build_health_metrics(settings)
Expand Down Expand Up @@ -431,7 +430,6 @@ def shutdown!(replacement = nil)
telemetry.emit_closing! unless replacement
end
end
# rubocop:enable Metrics/ClassLength
end
end
end
3 changes: 0 additions & 3 deletions lib/datadog/core/configuration/settings.rb
Original file line number Diff line number Diff line change
Expand Up @@ -15,7 +15,6 @@ module Configuration
# Global configuration settings for the trace library.
# @public_api
# rubocop:disable Metrics/BlockLength
# rubocop:disable Metrics/ClassLength
# rubocop:disable Layout/LineLength
class Settings
include Base
Expand Down Expand Up @@ -713,9 +712,7 @@ def initialize(*_)
end
end
end

# rubocop:enable Metrics/BlockLength
# rubocop:enable Metrics/ClassLength
# rubocop:enable Layout/LineLength
end
end
Expand Down
2 changes: 1 addition & 1 deletion lib/datadog/profiling/collectors/old_stack.rb
Original file line number Diff line number Diff line change
Expand Up @@ -15,7 +15,7 @@ module Collectors
# Runs on its own background thread.
#
# This class has the prefix "Old" because it will be deprecated by the new native CPU Profiler
class OldStack < Core::Worker # rubocop:disable Metrics/ClassLength
class OldStack < Core::Worker
include Core::Workers::Polling

DEFAULT_MAX_TIME_USAGE_PCT = 2.0
Expand Down
88 changes: 62 additions & 26 deletions lib/datadog/tracing/contrib/rack/middlewares.rb
Original file line number Diff line number Diff line change
Expand Up @@ -123,18 +123,6 @@ def call(env)
# rubocop:disable Metrics/PerceivedComplexity
# rubocop:disable Metrics/MethodLength
def set_request_tags!(trace, request_span, env, status, headers, response, original_env)
# http://www.rubydoc.info/github/rack/rack/file/SPEC
# The source of truth in Rack is the PATH_INFO key that holds the
# URL for the current request; but some frameworks may override that
# value, especially during exception handling.
#
# Because of this, we prefer to use REQUEST_URI, if available, which is the
# relative path + query string, and doesn't mutate.
#
# REQUEST_URI is only available depending on what web server is running though.
# So when its not available, we want the original, unmutated PATH_INFO, which
# is just the relative path without query strings.
url = env['REQUEST_URI'] || original_env['PATH_INFO']
request_header_collection = Header::RequestHeaderCollection.new(env)
request_headers_tags = parse_request_headers(request_header_collection)
response_headers_tags = parse_response_headers(headers || {})
Expand Down Expand Up @@ -176,14 +164,35 @@ def set_request_tags!(trace, request_span, env, status, headers, response, origi
request_span.set_tag(Tracing::Metadata::Ext::HTTP::TAG_METHOD, env['REQUEST_METHOD'])
end

url = parse_url(env, original_env)

if request_span.get_tag(Tracing::Metadata::Ext::HTTP::TAG_URL).nil?
options = configuration[:quantize]
options = configuration[:quantize] || {}

# Quantization::HTTP.url base defaults to :show, but we are transitioning
options[:base] ||= :exclude
Comment on lines +170 to +173
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This has to be done here, as doing it as a config default would be silently overridden when :quantize is manually set, which would be most surprising to the user.

Probably a general problem with such nested configuration keys.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I understand it correctly, this will be removed in the next major release, is that right?

If so, I suggest writing them down in the comment, so we can find it later.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, can you create a new file named docs/NextReleaseUpgradeGuide.md (name suggestion only), just add one section:

# Breaking changes

To aid with the next release process, please tag future breaking changes the inline code comment `# DEV-2.0:` prefix. 

## List of breaking changes

* `http.url` will change its default...

We've been meaning to do it, and it seems like your PR wants to have a breaking change in the next release.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With this, I suggest adding the comment in this line with the prefix # DEV-2.0:.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good points! Will do.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I'll do that in another PR, along with this suggestion: #2283 (comment)


request_span.set_tag(
Tracing::Metadata::Ext::HTTP::TAG_URL,
Contrib::Utils::Quantization::HTTP.url(url, options)
)
end

if request_span.get_tag(Tracing::Metadata::Ext::HTTP::TAG_BASE_URL).nil?
options = configuration[:quantize]

unless options[:base] == :show
base_url = Contrib::Utils::Quantization::HTTP.base_url(url)

unless base_url.empty?
request_span.set_tag(
Tracing::Metadata::Ext::HTTP::TAG_BASE_URL,
base_url
)
end
end
end

if request_span.get_tag(Tracing::Metadata::Ext::HTTP::TAG_CLIENT_IP).nil?
Tracing::ClientIp.set_client_ip_tag(
request_span,
Expand All @@ -192,19 +201,6 @@ def set_request_tags!(trace, request_span, env, status, headers, response, origi
)
end

if request_span.get_tag(Tracing::Metadata::Ext::HTTP::TAG_BASE_URL).nil?
request_obj = ::Rack::Request.new(env)

base_url = if request_obj.respond_to?(:base_url)
request_obj.base_url
else
# Compatibility for older Rack versions
request_obj.url.chomp(request_obj.fullpath)
end

request_span.set_tag(Tracing::Metadata::Ext::HTTP::TAG_BASE_URL, base_url)
end

if request_span.get_tag(Tracing::Metadata::Ext::HTTP::TAG_STATUS_CODE).nil? && status
request_span.set_tag(Tracing::Metadata::Ext::HTTP::TAG_STATUS_CODE, status)
end
Expand Down Expand Up @@ -238,6 +234,46 @@ def configuration
Datadog.configuration.tracing[:rack]
end

def parse_url(env, original_env)
request_obj = ::Rack::Request.new(env)

# scheme, host, and port
base_url = if request_obj.respond_to?(:base_url)
request_obj.base_url
else
# Compatibility for older Rack versions
request_obj.url.chomp(request_obj.fullpath)
end

# https://github.com/rack/rack/blob/main/SPEC.rdoc
#
# The source of truth in Rack is the PATH_INFO key that holds the
# URL for the current request; but some frameworks may override that
# value, especially during exception handling.
#
# Because of this, we prefer to use REQUEST_URI, if available, which is the
# relative path + query string, and doesn't mutate.
#
# REQUEST_URI is only available depending on what web server is running though.
# So when its not available, we want the original, unmutated PATH_INFO, which
# is just the relative path without query strings.
#
# SCRIPT_NAME is the first part of the request URL path, so that
# the application can know its virtual location. It should be
# prepended to PATH_INFO to reflect the correct user visible path.
request_uri = env['REQUEST_URI'].to_s
fullpath = if request_uri.empty?
query_string = original_env['QUERY_STRING'].to_s
path = original_env['SCRIPT_NAME'].to_s + original_env['PATH_INFO'].to_s

query_string.empty? ? path : "#{path}?#{query_string}"
else
request_uri
end

base_url + fullpath
end

def parse_user_agent_header(headers)
headers.get(Tracing::Metadata::Ext::HTTP::HEADER_USER_AGENT)
end
Expand Down
16 changes: 15 additions & 1 deletion lib/datadog/tracing/contrib/utils/quantization/http.rb
Original file line number Diff line number Diff line change
Expand Up @@ -22,6 +22,14 @@ def url(url, options = {})
options[:placeholder] || PLACEHOLDER
lloeki marked this conversation as resolved.
Show resolved Hide resolved
end

def base_url(url, options = {})
URI.parse(url).tap do |uri|
uri.path = ''
uri.query = nil
uri.fragment = nil
end.to_s
end

def url!(url, options = {})
options ||= {}

Expand All @@ -32,8 +40,14 @@ def url!(url, options = {})
uri.query = (!query.nil? && query.empty? ? nil : query)
end

# Remove any URI framents
# Remove any URI fragments
uri.fragment = nil unless options[:fragment] == :show

if options[:base] == :exclude
uri.host = nil
uri.port = nil
uri.scheme = nil
end
end.to_s
end

Expand Down
2 changes: 0 additions & 2 deletions lib/datadog/tracing/span_operation.rb
Original file line number Diff line number Diff line change
Expand Up @@ -18,7 +18,6 @@ module Tracing
# It gives a Span a context which can be used to
# build a Span. When completed, it yields the Span.
#
# rubocop:disable Metrics/ClassLength
# @public_api
class SpanOperation
include Metadata
Expand Down Expand Up @@ -516,6 +515,5 @@ def duration_nano
alias :span_type :type
alias :span_type= :type=
end
# rubocop:enable Metrics/ClassLength
end
end
2 changes: 0 additions & 2 deletions lib/datadog/tracing/trace_operation.rb
Original file line number Diff line number Diff line change
Expand Up @@ -22,7 +22,6 @@ module Tracing
# For async support, a {Datadog::Tracing::TraceOperation} should be employed
# per execution context (e.g. Thread, etc.)
#
# rubocop:disable Metrics/ClassLength
# @public_api
class TraceOperation
include Metadata::Tagging
Expand Down Expand Up @@ -454,6 +453,5 @@ def build_trace(spans, partial = false)
)
end
end
# rubocop:enable Metrics/ClassLength
end
end
2 changes: 0 additions & 2 deletions lib/datadog/tracing/trace_segment.rb
Original file line number Diff line number Diff line change
Expand Up @@ -11,7 +11,6 @@ module Datadog
module Tracing
# Serializable construct representing a trace
# @public_api
# rubocop:disable Metrics/ClassLength
class TraceSegment
TAG_NAME = 'name'.freeze
TAG_RESOURCE = 'resource'.freeze
Expand Down Expand Up @@ -203,6 +202,5 @@ def service_tag
meta[TAG_SERVICE]
end
end
# rubocop:enable Metrics/ClassLength
end
end
2 changes: 0 additions & 2 deletions lib/datadog/tracing/tracer.rb
Original file line number Diff line number Diff line change
Expand Up @@ -23,7 +23,6 @@ module Tracing
# example, a trace can be used to track the entire time spent processing a complicated web request.
# Even though the request may require multiple resources and machines to handle the request, all
# of these function calls and sub-requests would be encapsulated within a single trace.
# rubocop:disable Metrics/ClassLength
class Tracer
attr_reader \
:trace_flush,
Expand Down Expand Up @@ -528,6 +527,5 @@ def skip_trace(name)
end
end
end
# rubocop:enable Metrics/ClassLength
end
end
Loading