Skip to content

Conversation

@rusackas
Copy link
Member

SUMMARY

This PR fixes a security vulnerability (issue #31944) where users with "can samples on Datasource" permission could read data samples from datasets they don't have proper access to.

Root Cause:
The get_samples() function in superset/views/datasource/utils.py was creating QueryContext instances and calling get_payload() directly without first enforcing access control through raise_for_access().

This allowed users who only had the "can samples on Datasource" permission to bypass datasource-level security checks and read samples from datasets they shouldn't have access to.

Fix:
Added raise_for_access() calls on both the samples and count_star query contexts before fetching any data. This ensures users must have proper datasource access (schema access, datasource access, or ownership) before samples can be retrieved.

Code Change:

try:
    # Enforce access control before fetching data.
    # This prevents users with "can samples on Datasource" permission from
    # reading samples from datasets they don't have access to.
    samples_instance.raise_for_access()
    count_star_instance.raise_for_access()

    count_star_data = count_star_instance.get_payload()["queries"][0]
    # ... rest of the function

BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF

N/A - This is a security fix with no UI changes.

TESTING INSTRUCTIONS

  1. Create a user with only "can samples on Datasource" permission
  2. Create a dataset that the user does NOT have access to (no datasource_access, schema_access, or ownership)
  3. Before this fix: User could access /datasource/samples?datasource_id=<id>&datasource_type=table and retrieve samples
  4. After this fix: User receives a 403/401 error when attempting to access samples from datasets they don't have permission to access

Unit Tests:
The PR includes comprehensive unit tests that verify:

  • raise_for_access() is called before fetching data
  • Security exceptions are properly raised when access is denied
  • Both samples and count_star query contexts are checked for access

ADDITIONAL INFORMATION

  • Has associated issue: Fixes get_samples() doesn't raise for access #31944
  • Required feature flags:
  • Changes UI
  • Includes DB Migration (follow approval process in SIP-59)
    • Migration is atomic, supports rollback & is backwards-compatible
    • Confirm DB migration upgrade and downgrade tested
    • Runtime estimates and downtime expectations provided
  • Introduces new feature or API
  • Removes existing feature or API

🤖 Generated with Claude Code

This commit fixes a security vulnerability (issue #31944) where users
with "can samples on Datasource" permission could read data samples
from datasets they don't have access to.

The get_samples() function was creating QueryContext instances and
calling get_payload() directly without first enforcing access control.
This allowed users to bypass datasource-level security checks.

The fix adds raise_for_access() calls on both the samples and count_star
query contexts before fetching any data. This ensures users must have
proper datasource access before samples can be retrieved.

Fixes #31944

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
@dosubot dosubot bot added the authentication:access-control Rlated to access control label Dec 11, 2025
The original tests tried to patch Flask's current_app proxy, but this
caused issues with MagicMock returning unexpected values (coroutines
instead of integers) when comparing in get_limit_clause.

By mocking get_limit_clause directly, we avoid Flask app context
complexities and make the tests more focused on testing the actual
access control logic.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

get_samples() doesn't raise for access

1 participant