-
Notifications
You must be signed in to change notification settings - Fork 16.1k
A tagging system for dashboards, charts and queries #5524
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Fixing tests. |
| from .base import BaseSupersetView | ||
|
|
||
|
|
||
| class ObjectTypeConverter(BaseConverter): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we move this to a utils.py file?
Also get some unit test for these basic methods?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just to be clear, this class is used for validating the path in the views defined in this file. For example:
@expose('/tags/<object_type:object_type>/<int:object_id>/', methods=['GET'])
def get(self, object_type, object_id):
...Using the validator, /tags/chart/1/ is a valid path (since chart is in the ObjectTypes enum) but /tags/foo/1 is invalid. Since the concept of object_type is only used in tag views, I think moving this to utils.py will not be very useful, since it's unlikely that it will be used in other places.
I'll add unit tests, good point.
|
(1) Would suggest that the 'Altered' token goes before the tags, so as to reduce all the movement when you add tags |
⚡️
You're right. Right now it works by separating the tags with a comma,
👍 |
Codecov Report
@@ Coverage Diff @@
## master #5524 +/- ##
==========================================
- Coverage 63.6% 63.54% -0.06%
==========================================
Files 359 364 +5
Lines 22781 23185 +404
Branches 2530 2556 +26
==========================================
+ Hits 14489 14734 +245
- Misses 8277 8436 +159
Partials 15 15
Continue to review full report at Codecov.
|
|
Phew! So many fixes. |
|
@vylc I added better searching for tags: |
mistercrunch
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wanted to release this round of review on the Python side, I'll do another round on the JS side, coming soon.
| Session = sessionmaker(autoflush=False) | ||
|
|
||
|
|
||
| class TagTypes(enum.Enum): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I though enums were a py3.4 thing but they appear to have been backported to 2.7, maybe 2.7.6?
I'm starting to think we should deprecate our supporte for <py3.6 and embrace all the new great stuff.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I think even in 2.7.3. There's also a backport enum34 which we could add as an optional dependency, something like:
# setup.py
if sys.version_info < (2, 7, 3):
deps.append('enum34')And I'm +1 on moving towards newer versions. I think when developing libraries it's important to be more conservative, but for applications it's fine.
| """ | ||
|
|
||
| # explicit tags, added manually by the owner | ||
| custom = 1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Personally [despite the perf implications] I prefer having readable strings in the db. You could also model this as fks to reference tables so that the db has all the information instead of a fk to an enum that lives only in code.
From what I'm seeing I'm thinking the tag name may have the info anyways, which makes it work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SQLAlchemy is storing them as strings in the DB, though (at least in sqlite):
sqlite> SELECT DISTINCT object_type FROM tagged_object;
chart
dashboard
query
sqlite> .schema tagged_object
CREATE TABLE tagged_object (
created_on DATETIME,
changed_on DATETIME,
id INTEGER NOT NULL,
tag_id INTEGER,
object_id INTEGER,
object_type VARCHAR(9),
created_by_fk INTEGER,
changed_by_fk INTEGER,
PRIMARY KEY (id),
FOREIGN KEY(tag_id) REFERENCES tag (id),
CONSTRAINT objecttypes CHECK (object_type IN ('query', 'chart', 'dashboard')),
FOREIGN KEY(created_by_fk) REFERENCES ab_user (id),
FOREIGN KEY(changed_by_fk) REFERENCES ab_user (id)
);| session = Session(bind=connection) | ||
|
|
||
| # add `owner:` tags | ||
| for owner_id in cls.get_owners_ids(target): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You could refactor 109-117 into it's own method as it looks the same as 148-158
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
⚡️
| object_type = 'query' | ||
|
|
||
| @classmethod | ||
| def get_owners_ids(cls, target): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Side note that eventually I'd like to get all models to be consistent and have and owners many to many field. No need to change it as part of this PR
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
| TaggedObject.object_id == target.obj_id, | ||
| Tag.type == TagTypes.favorited_by, | ||
| Tag.name == name, | ||
| ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you could just query.delete() at the next line here (as long as you change session.query(TaggedObject.id) -> session.query(TaggedObject)), no need for another query I think.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried that, but SQLAlchemy complained saying it can't delete() on a join. It works where we have only a single table, eg:
session.query(TaggedObject).filter(
TaggedObject.object_type == cls.object_type,
TaggedObject.object_id == target.id,
).delete()
superset/views/tags.py
Outdated
|
|
||
| @expose('/tags/suggestions/', methods=['GET']) | ||
| def suggestions(self): | ||
| query = db.session.query( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: the way I like to break lines when doing method chaining in python is to wrap in parenthesis and break before the dot as in:
query = (
db.session.query(TaggedObject)
.group_by(TaggedObject.tag_id)
.order_by(func.count().desc())
.all()
)There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems very javascripty, but I'm ok with doing it that way. :)
I miss the pipe operator from Hack:
function piped_example(array<int> $arr): int {
return $arr
|> \array_map($x ==> $x * $x, $$)
|> \array_filter($$, $x ==> $x % 2 == 0)
|> \count($$);
}
superset/views/tags.py
Outdated
| tags = json.dumps([ | ||
| {'id': obj.tag.id, 'name': obj.tag.name} for obj in query]) | ||
|
|
||
| return Response(tags, status=200, content_type='application/json') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you can use return self.json_success(tags) where tag is the object, not the json string.
optional side-mission: Looks like we may want to move https://github.com/apache/incubator-superset/blob/master/superset/views/core.py#L86 to view/base.py as a method of BaseSupersetView
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this new BaseSupersetView.json_success expects an object, it could be a way to standardize to simplejson.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cool, I'll do that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, this is more complicated than I was expecting. The core view uses different encoding options depending on the visualization type (by calling viz_obj.json_dumps), so it requires a lot of refactoring.
Let's do that on a separate PR.
| return Response(status=201) # 201 CREATED | ||
|
|
||
| @expose('/tags/<object_type:object_type>/<int:object_id>/', methods=['DELETE']) | ||
| def delete(self, object_type, object_id): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that some of these endpoints we'd get for free by creating a FAB ModelView object, it may not be exactly what you need though, unclear. I'm ok with the more explicit/controlled endpoints.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
| TaggedObject.object_id == SavedQuery.id, | ||
| TaggedObject.object_type == ObjectTypes.query, | ||
| ), | ||
| ).group_by(TaggedObject.object_id, TaggedObject.object_type) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Trying to understand what kind of SQL will come out of that query object... Is the group_by needed?
Also you may be able to parameterize the ObjectTypes as in:
tagged_models = [Slice, Dashboard, Query]
for tm in tagged_models:
query = query.outerjoin(tm, {...})It would make it easier to add more tagged models in the future but may make the code more opaque.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The group by is needed because a user might search for tags "a" and "b", and if an object is tagged with both it would return duplicates.
|
@williaster, I've replaced the URL polyfill with |
|
@elibrumbaugh, thanks for the feedback!
Good point, right now this is far from optimal: I'll see if I can make the component have a fix width and have the tags wrap. Thoughts on this?
The colors come from the Bootstrap theme. I created #5590 to address the problem separately.
From a UX point of view we think they should not be hidden, neither to read nor to write. Let's talk about this Friday during our meeting? I'm ok with changing the way tags are created — it should be easy to change that while keeping the data model and the REST endpoints from this PR.
😄 |
|
@elibrumbaugh can you think of a good way to get back to that overall landing page w/out having to click on a tag? not completely intuitive clicking on a tab will take you back to that page |
|
@williaster, are you ok with moving forward and merging this PR, and then changing the way the tag component is display in dashboards? |
|
@betodealmeida in the meetup last week I thought we discussed possibly splitting this into two parts: one for backend and one for frontend? So I'd propose either
I think we should not merge PRs that affect UI/UX if the UI/UX is not where we want it to be. |
|
@williaster, I removed the tag input from this PR, leaving only the backend. |
| return ( | ||
| <div className="react-tags-rw"> | ||
| {this.state.tags.map(tag => ( | ||
| <Label bsStyle="info"> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this needs a key prop as explained here.
|
Moved the backend to #6823. |
|
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. |
|
this looks so good, is there a link to the frontend files? |
|
@1AB9502 I'll post the PR for the frontend today for review. |
|
Huge thanks! |
|
@mihir174 @michael-s-molina @Steejay Hey both, not sure if you were aware of this PR. Tagging was once bought up with both BE and FE solutions, it was closed because the proposed UI/UX was not ideal. To bring tagging(BE) back to life, we will have to rewrite a big trunk of work basically, but still, a shortcut to solve the content organization problem. |
|
thanks @junlincc! tagging seems like a potential solution! From previous research around query grouping, sharing was another part of the workflow that was important to consider when thinking about obj organization. how might we share a group of queries (dashboards, charts, etc)? Searching for the object is one of many steps in a workflow. What other steps do we need to consider when designing for object organization? Some open questions to consider as we move forward. |


This PR introduces a tagging system for Superset, allowing dashboards, charts and queries to be tagged. It also allows searching for a given tag, and will be the basis for a new landing page (see #5327).
Adding tags
Here's an example showing adding tags to a chart (dashboards work the same way). Suggestions come from pre-existing tags, and are ordered by popularity:
And here's how it looks like when a user is not an owner of the chart/dashboard:
Searching for tags
Clicking on a tag will take the user to a page searching for that tag. For example, clicking on the "test" tag:
For this, I added a new tab "Tags" to the welcome page, using the same layout as the "Favorites" tab. I changed the page so that the active tab can be specified by the URL anchor, as well as the search parameter. For example, the URL
/superset/welcome?q=test#tagswill show the "Tags" tab instead of the default "Dashboards" with the search form pre-populated with "test".Implicit tags
Dashboard, chart and (saved) queries have implicit tags related to their owners, types and favorites. For example, all objects owned by the admin have the tag
owner:1. All charts have the tagtype:chart. Objects favorited by the admin have the tagfavorited_by:1.These tags are automatically added by a migration script, and kept in sync through SQLAlchemy event listeners. They are currently not surfaced to the user, but can be searched for. For example, it's possible to search for
owner:1in the welcome page to see all objects owned by the admin, or even search forowner:{{ current_user_id() }}.Queries
Currently queries cannot be tagged, because I wasn't sure were to add the component. Any suggestions, @vylc?