Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: add config to disable dataset ownership on the old api #13051

Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
4 changes: 2 additions & 2 deletions .github/workflows/docker_build_push.sh
Original file line number Diff line number Diff line change
Expand Up @@ -21,14 +21,14 @@ SHA=$(git rev-parse HEAD)
REPO_NAME="apache/superset"

if [[ "${GITHUB_EVENT_NAME}" == "pull_request" ]]; then
REFSPEC=$(echo "${GITHUB_HEAD_REF}" | sed 's/[^a-zA-Z0-9]/-/' | head -c 40)
REFSPEC=$(echo "${GITHUB_HEAD_REF}" | sed 's/[^a-zA-Z0-9]/-/g' | head -c 40)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Needed fix, sed was only replacing one non alphanumeric character

PR_NUM=$(echo "${GITHUB_REF}" | sed 's:refs/pull/::' | sed 's:/merge::')
LATEST_TAG="pr-${PR_NUM}"
elif [[ "${GITHUB_EVENT_NAME}" == "release" ]]; then
REFSPEC=$(echo "${GITHUB_REF}" | sed 's:refs/tags/::' | head -c 40)
LATEST_TAG="${REFSPEC}"
else
REFSPEC=$(echo "${GITHUB_REF}" | sed 's:refs/heads/::' | sed 's/[^a-zA-Z0-9]/-/' | head -c 40)
REFSPEC=$(echo "${GITHUB_REF}" | sed 's:refs/heads/::' | sed 's/[^a-zA-Z0-9]/-/g' | head -c 40)
LATEST_TAG="${REFSPEC}"
fi

Expand Down
6 changes: 6 additions & 0 deletions superset/config.py
Original file line number Diff line number Diff line change
Expand Up @@ -1057,6 +1057,12 @@ class CeleryConfig: # pylint: disable=too-few-public-methods
'class="alert-link">here</a>.'
)

# Turn this key to False to disable ownership check on the old dataset MVC and
# datasource API /datasource/save.
#
# Warning: This config key is deprecated and will be removed in version 2.0.0"
OLD_API_CHECK_DATASET_OWNERSHIP = True
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's also deprecate this configuration flag for 2.0

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this is deprecated in 2.0, do we think we'll have another way to allow non-owners to edit datasets (perhaps only allowing non-destructive changes)? The concept of shared physical datasets is something that's pretty relied on at Airbnb, and I think is a "good product feature" since it prevents duplication of datasets and ensures everyone is using the same source of truth

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hrm... @etr2460 before this point, my understanding was that this was a hole in our RBAC configuration, and what this flag was for was to allow organizations time to migrate their existing data. If this is a requirement on Superset going forward then I think it would be good to discuss it. @dpgaspar mentioned that he thinks eventually migrating to a group permission rather than an individual permission could accommodate this need - anyone in a specified group would be able to edit the datasource, rather than a list of owners. Can you talk more about how this is relied on at Airbnb? Would providing a group as the "owner" of the datasource resolve the issue?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Providing a group could fix this, as long as it's easy to make sure everyone is in that group (not sure what group means in this case, are we talking about a role like Alpha or Gamma?).

Basically we rely on this because many physical datasets are used by many people on many teams (think a table like bookings). Many people would want to build charts based off this table, and make calculated columns or metrics that could be shared across other charts or to other people (maybe there's a calculated column called is_weekend that does logic based on the dttm the booking is). We want to enable and encourage this behavior, otherwise we'd end up in a situation where there are dozens of physical datasets for bookings, each with different metrics and calculated columns (and even worse, different definitions of the same metrics or calculated columns).

Finally, while we do want everyone to be able to take some actions on a dataset (adding metrics, calculated columns, better column descriptions, etc.) it would be nice to restrict other actions to owners only (deleting the dataset, changing the SQL of a virtual dataset). In an ideal world, we'd have more fine grain permission on datasets that could allow anyone to do non-destructive actions, but only let owners/admins perform destructive ones.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Erik and I discussed a bit out of band - I think we're going to explore the topic of the correct behavior here and bring a proposal back to the community. For now, let's consider this conversation non-blocking for this PR.


# SQLA table mutator, every time we fetch the metadata for a certain table
# (superset.connectors.sqla.models.SqlaTable), we call this hook
# to allow mutating the object with this callback.
Expand Down
48 changes: 41 additions & 7 deletions superset/connectors/sqla/views.py
Original file line number Diff line number Diff line change
Expand Up @@ -173,13 +173,25 @@ class TableColumnInlineView( # pylint: disable=too-many-ancestors
edit_form_extra_fields = add_form_extra_fields

def pre_add(self, item: "models.SqlMetric") -> None:
check_ownership(item.table)
logger.warning(
"This endpoint is deprecated and will be removed in version 2.0.0"
)
if app.config["OLD_API_CHECK_DATASET_OWNERSHIP"]:
check_ownership(item.table)

def pre_update(self, item: "models.SqlMetric") -> None:
check_ownership(item.table)
logger.warning(
"This endpoint is deprecated and will be removed in version 2.0.0"
)
if app.config["OLD_API_CHECK_DATASET_OWNERSHIP"]:
check_ownership(item.table)

def pre_delete(self, item: "models.SqlMetric") -> None:
check_ownership(item.table)
logger.warning(
"This endpoint is deprecated and will be removed in version 2.0.0"
)
if app.config["OLD_API_CHECK_DATASET_OWNERSHIP"]:
check_ownership(item.table)


class SqlMetricInlineView( # pylint: disable=too-many-ancestors
Expand Down Expand Up @@ -256,13 +268,25 @@ class SqlMetricInlineView( # pylint: disable=too-many-ancestors
edit_form_extra_fields = add_form_extra_fields

def pre_add(self, item: "models.SqlMetric") -> None:
check_ownership(item.table)
logger.warning(
"This endpoint is deprecated and will be removed in version 2.0.0"
)
if app.config["OLD_API_CHECK_DATASET_OWNERSHIP"]:
check_ownership(item.table)

def pre_update(self, item: "models.SqlMetric") -> None:
check_ownership(item.table)
logger.warning(
"This endpoint is deprecated and will be removed in version 2.0.0"
)
if app.config["OLD_API_CHECK_DATASET_OWNERSHIP"]:
check_ownership(item.table)

def pre_delete(self, item: "models.SqlMetric") -> None:
check_ownership(item.table)
logger.warning(
"This endpoint is deprecated and will be removed in version 2.0.0"
)
if app.config["OLD_API_CHECK_DATASET_OWNERSHIP"]:
check_ownership(item.table)


class RowLevelSecurityListWidget(
Expand Down Expand Up @@ -476,10 +500,17 @@ class TableModelView( # pylint: disable=too-many-ancestors
}

def pre_add(self, item: "TableModelView") -> None:
logger.warning(
"This endpoint is deprecated and will be removed in version 2.0.0"
)
validate_sqlatable(item)

def pre_update(self, item: "TableModelView") -> None:
check_ownership(item)
logger.warning(
"This endpoint is deprecated and will be removed in version 2.0.0"
)
if app.config["OLD_API_CHECK_DATASET_OWNERSHIP"]:
check_ownership(item)

def post_add( # pylint: disable=arguments-differ
self,
Expand Down Expand Up @@ -522,6 +553,9 @@ def edit(self, pk: str) -> FlaskResponse:
def refresh( # pylint: disable=no-self-use, too-many-branches
self, tables: Union["TableModelView", List["TableModelView"]]
) -> FlaskResponse:
logger.warning(
"This endpoint is deprecated and will be removed in version 2.0.0"
)
if not isinstance(tables, list):
tables = [tables]

Expand Down
6 changes: 3 additions & 3 deletions superset/views/core.py
Original file line number Diff line number Diff line change
Expand Up @@ -989,7 +989,7 @@ def schemas( # pylint: disable=no-self-use
self, db_id: int, force_refresh: str = "false"
) -> FlaskResponse:
logger.warning(
"This API endpoint is deprecated and will be removed in version 1.0.0"
"This API endpoint is deprecated and will be removed in version 2.0.0"
)
db_id = int(db_id)
database = db.session.query(Database).get(db_id)
Expand Down Expand Up @@ -1754,7 +1754,7 @@ def publish( # pylint: disable=no-self-use
) -> FlaskResponse:
"""Gets and toggles published status on dashboards"""
logger.warning(
"This API endpoint is deprecated and will be removed in version 1.0.0"
"This API endpoint is deprecated and will be removed in version 2.0.0"
)
session = db.session()
Role = ab_models.Role
Expand Down Expand Up @@ -2071,7 +2071,7 @@ def select_star(
) -> FlaskResponse:
logging.warning(
"%s.select_star "
"This API endpoint is deprecated and will be removed in version 1.0.0",
"This API endpoint is deprecated and will be removed in version 2.0.0",
self.__class__.__name__,
)
stats_logger.incr(f"{self.__class__.__name__}.select_star.init")
Expand Down
11 changes: 6 additions & 5 deletions superset/views/datasource.py
Original file line number Diff line number Diff line change
Expand Up @@ -22,7 +22,7 @@
from flask_appbuilder.security.decorators import has_access_api
from flask_babel import _

from superset import db
from superset import app, db
from superset.connectors.connector_registry import ConnectorRegistry
from superset.datasets.commands.exceptions import DatasetForbiddenError
from superset.exceptions import SupersetException, SupersetSecurityException
Expand Down Expand Up @@ -55,10 +55,11 @@ def save(self) -> FlaskResponse:

if "owners" in datasource_dict and orm_datasource.owner_class is not None:
# Check ownership
try:
check_ownership(orm_datasource)
except SupersetSecurityException:
raise DatasetForbiddenError()
if app.config["OLD_API_CHECK_DATASET_OWNERSHIP"]:
try:
check_ownership(orm_datasource)
except SupersetSecurityException:
raise DatasetForbiddenError()

datasource_dict["owners"] = (
db.session.query(orm_datasource.owner_class)
Expand Down