Skip to content

Commit

Permalink
Block public access to some expensive parameters (#1084)
Browse files Browse the repository at this point in the history
Some query parameters cause expensive queries, so they are now disallowed for public, non-logged-in usage. If you try to use them without being logged in, you’ll get a 403 response with a message about what parameters for that controller require logging in.

If we need to extend this to other places, we can use the new `BlockedParamsConcern` on a controller.

Fixes #1070.
  • Loading branch information
Mr0grog authored Feb 7, 2023
1 parent f422b89 commit e02fd14
Show file tree
Hide file tree
Showing 8 changed files with 131 additions and 3 deletions.
28 changes: 25 additions & 3 deletions app/assets/data/swagger.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -229,7 +229,6 @@ paths:
format: uuid4
- $ref: '#/parameters/chunk'
- $ref: '#/parameters/chunk_size'
- $ref: '#/parameters/include_total'
- $ref: '#/parameters/sort'
- name: capture_time
in: query
Expand All @@ -241,7 +240,7 @@ paths:
- name: source_metadata[:key]
type: string
in: query
description: >
description: >-
Filter results by a given `key` in the `source_metadata` field. You
can include this parameter multiple times to filter by more than
one `key`. *Note that this field is not indexed, so these queries
Expand All @@ -250,6 +249,9 @@ paths:
- `/api/v0/versions?source_metadata[version_id]=12345678`
- `/api/v0/versions?source_metadata[account]=versionista1&source_metadata[has_content]=true`
🚫 You must be logged in to use this parameter!
- name: different
in: query
description: >-
Expand All @@ -266,6 +268,8 @@ paths:
If present, include a `change_from_previous` field in the result
that represents a change object between this version and the
previous version of this page.
🚫 You must be logged in to use this parameter!
- name: include_change_from_earliest
in: query
type: boolean
Expand All @@ -285,6 +289,8 @@ paths:
notation for intervals. For example, to get all 4XX status versions:
?status=[400,500)
🚫 You must be logged in to use this parameter!
responses:
'200':
description: successful operation
Expand Down Expand Up @@ -381,6 +387,8 @@ paths:
If present, include a `change_from_previous` field in the result
that represents a change object between this version and the
previous version of this page.
🚫 You must be logged in to use this parameter!
- name: include_change_from_earliest
in: query
type: boolean
Expand All @@ -390,6 +398,8 @@ paths:
If present, include a `change_from_earliest` field in the result
that represents a change object between this version and the
earliest version of this page.
🚫 You must be logged in to use this parameter!
responses:
'200':
description: successful operation
Expand Down Expand Up @@ -923,7 +933,6 @@ paths:
parameters:
- $ref: '#/parameters/chunk'
- $ref: '#/parameters/chunk_size'
- $ref: '#/parameters/include_total'
- $ref: '#/parameters/sort'
- name: capture_time
in: query
Expand Down Expand Up @@ -953,6 +962,9 @@ paths:
- `/api/v0/versions?source_metadata[version_id]=12345678`
- `/api/v0/versions?source_metadata[account]=versionista1&source_metadata[has_content]=true`
🚫 You must be logged in to use this parameter!
- name: include_change_from_previous
in: query
type: boolean
Expand All @@ -962,6 +974,8 @@ paths:
If present, include a `change_from_previous` field in the result
that represents a change object between this version and the
previous version of this page.
🚫 You must be logged in to use this parameter!
- name: include_change_from_earliest
in: query
type: boolean
Expand All @@ -971,6 +985,8 @@ paths:
If present, include a `change_from_earliest` field in the result
that represents a change object between this version and the
earliest version of this page.
🚫 You must be logged in to use this parameter!
- name: status
in: query
type: string
Expand All @@ -981,6 +997,8 @@ paths:
notation for intervals. For example, to get all 4XX status versions:
?status=[400,500)
🚫 You must be logged in to use this parameter!
responses:
'200':
description: successful operation
Expand Down Expand Up @@ -1012,6 +1030,8 @@ paths:
If present, include a `change_from_previous` field in the result
that represents a change object between this version and the
previous version of this page.
🚫 You must be logged in to use this parameter!
- name: include_change_from_earliest
in: query
type: boolean
Expand All @@ -1021,6 +1041,8 @@ paths:
If present, include a `change_from_earliest` field in the result
that represents a change object between this version and the
earliest version of this page.
🚫 You must be logged in to use this parameter!
responses:
'200':
description: successful operation
Expand Down
3 changes: 3 additions & 0 deletions app/controllers/api/v0/annotations_controller.rb
Original file line number Diff line number Diff line change
@@ -1,5 +1,8 @@
class Api::V0::AnnotationsController < Api::V0::ApiController
include SortingConcern
include BlockedParamsConcern

block_params_for_public_users params: [:include_total]
before_action(only: [:create]) { authorize :api, :annotate? }
before_action :set_annotation, only: [:show]

Expand Down
3 changes: 3 additions & 0 deletions app/controllers/api/v0/changes_controller.rb
Original file line number Diff line number Diff line change
@@ -1,5 +1,8 @@
class Api::V0::ChangesController < Api::V0::ApiController
include SortingConcern
include BlockedParamsConcern

block_params_for_public_users params: [:include_total]

def index
query = changes_collection
Expand Down
8 changes: 8 additions & 0 deletions app/controllers/api/v0/pages_controller.rb
Original file line number Diff line number Diff line change
@@ -1,5 +1,13 @@
class Api::V0::PagesController < Api::V0::ApiController
include SortingConcern
include BlockedParamsConcern

# Params that can cause expensive performance overhead require logging in.
block_params_for_public_users actions: [:index],
params: [
:include_earliest,
:include_latest
]

def index
query = page_collection
Expand Down
10 changes: 10 additions & 0 deletions app/controllers/api/v0/versions_controller.rb
Original file line number Diff line number Diff line change
@@ -1,9 +1,19 @@
class Api::V0::VersionsController < Api::V0::ApiController
include SortingConcern
include BlockedParamsConcern

SAMPLE_DAYS_DEFAULT = 183
SAMPLE_DAYS_MAX = 365

# Params that can cause expensive performance overhead require logging in.
block_params_for_public_users actions: :all,
params: [:source_metadata, :status]
block_params_for_public_users actions: [:index, :sampled],
params: [
:include_change_from_previous,
:include_change_from_earliest
]

def index
query = version_collection
paging = pagination(query)
Expand Down
57 changes: 57 additions & 0 deletions app/controllers/concerns/blocked_params_concern.rb
Original file line number Diff line number Diff line change
@@ -0,0 +1,57 @@
# Block unauthenticated requests that use certain params with a 403 (Forbidden)
# error. This can be used to prevent abuse of options that may cause expensive
# operations.
module BlockedParamsConcern
extend ActiveSupport::Concern

module ClassMethods
attr_reader :blocked_public_params

private

# Raise an exception on any non-logged-in request that uses the specified
# params.
#
# @param actions [:all, Array<Symbol>] Only block the params on these
# actions. If `:all` or not set, blocking is applied to all actions.
# @param params [Array<Symbol, String>] Param names to block.
#
# @example
# class MyController
# include BlockedParamsConcern
# block_params_for_public_users actions: [:index, :show]
# params: [:bad, :params, :here]
# end
def block_params_for_public_users(actions: nil, params:)
actions = nil if actions == :all
actions = [actions] unless actions.nil? || actions.is_a?(Array)
actions = actions&.collect(&:to_s)

@blocked_public_params ||= {}
@blocked_public_params[actions] = params.collect(&:to_s)
end
end

included do
before_action :check_non_public_params!
end

protected

def check_non_public_params!
return unless self.class.blocked_public_params

blocked = self.class.blocked_public_params.flat_map do |actions, params|
actions.nil? || actions.include?(action_name) ? params : nil
end

raise_for_non_public_params!(blocked)
end

def raise_for_non_public_params!(blocked_params)
if !current_user && blocked_params.intersect?(params.keys)
names = blocked_params.collect {|p| "`#{p}`"}.join(', ')
raise Api::ForbiddenError, "You must be logged in to use the params: #{names}"
end
end
end
14 changes: 14 additions & 0 deletions test/controllers/api/v0/pages_controller_test.rb
Original file line number Diff line number Diff line change
Expand Up @@ -184,6 +184,13 @@ def chunk_size_in_url(url)
assert_kind_of Hash, results[0]['earliest'], '"earliest" property was not a hash'
end

test 'should not include earliest version if not logged in' do
with_rails_configuration(:allow_public_view, true) do
get api_v0_pages_path(include_earliest: true)
assert_response :forbidden
end
end

test 'includes latest version if include_latest = true' do
sign_in users(:alice)
get api_v0_pages_path(include_latest: true)
Expand All @@ -193,6 +200,13 @@ def chunk_size_in_url(url)
assert_kind_of Hash, results[0]['latest'], '"latest" property was not a hash'
end

test 'should not include lastest version if not logged in' do
with_rails_configuration(:allow_public_view, true) do
get api_v0_pages_path(include_latest: true)
assert_response :forbidden
end
end

test 'includes versions if include_versions = true' do
# Temporarily skip because of https://github.com/edgi-govdata-archiving/web-monitoring-db/issues/264
# until https://github.com/edgi-govdata-archiving/web-monitoring-db/issues/274
Expand Down
11 changes: 11 additions & 0 deletions test/controllers/api/v0/versions_controller_test.rb
Original file line number Diff line number Diff line change
Expand Up @@ -305,6 +305,17 @@ def teardown
)
end

test 'cannot query by source_metadata fields if not logged in' do
with_rails_configuration(:allow_public_view, true) do
version = versions(:page1_v1)
get api_v0_versions_path(params: {
source_metadata: { version_id: version.source_metadata['version_id'] }
})

assert_response(:forbidden)
end
end

test 'meta.total_results should be the total results across all chunks' do
# For now, there no reasonable way to count versions in a reasonable amount
# of time, so this functionality is disabled. This test is kept in case
Expand Down

0 comments on commit e02fd14

Please sign in to comment.