feat: Save column data into json_metadata for all Query executions #20059
Conversation
Is there a reason not to reuse the ExtraJSONMixin?
I'm using the extra mixin here.
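For context, a minimal sketch of the ExtraJSONMixin pattern referenced above (the column definition and parsing details are assumptions about its shape, not a verbatim copy of Superset's implementation):

```python
import json
from typing import Any

import sqlalchemy as sa


class ExtraJSONMixin:
    """Mixin that stores arbitrary JSON in an extra_json text column."""

    extra_json = sa.Column(sa.Text, default="{}")

    @property
    def extra(self) -> dict:
        # Parse the serialized blob lazily, tolerating empty values.
        try:
            return json.loads(self.extra_json or "{}")
        except json.JSONDecodeError:
            return {}

    def set_extra_json_key(self, key: str, value: Any) -> None:
        # Merge a single key into the stored JSON blob.
        extra = self.extra
        extra[key] = value
        self.extra_json = json.dumps(extra)
```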
Codecov Report
@@ Coverage Diff @@
## master #20059 +/- ##
===========================================
- Coverage 66.45% 54.56% -11.89%
===========================================
Files 1721 1721
Lines 64479 64476 -3
Branches 6795 6794 -1
===========================================
- Hits 42852 35184 -7668
- Misses 19897 27564 +7667
+ Partials 1730 1728 -2
superset/queries/dao.py
Outdated
        # save payload into query object
        db.session.add(query)
        query.set_extra_json_key("columns", columns)
        return None
why do these have a return statement?
superset/queries/dao.py
Outdated
    def save_metadata(query: Query, query_payload: str) -> None:
        # parse payload
        try:
            payload: Dict[str, Any] = json.loads(query_payload)
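The diff view truncates the function above. For context, a plausible completion of save_metadata based on this thread (a sketch; the payload's columns key and the import paths are assumptions, not necessarily the committed code):

```python
import json
from typing import Any, Dict

from superset import db
from superset.models.sql_lab import Query


def save_metadata(query: Query, query_payload: str) -> None:
    # parse payload
    try:
        payload: Dict[str, Any] = json.loads(query_payload)
    except json.JSONDecodeError:
        return

    # pull the column metadata out of the executed query's payload
    columns = payload.get("columns", [])

    # save payload into query object
    query.set_extra_json_key("columns", columns)
    db.session.add(query)
```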
The fact that we are dumping JSON and then reloading it within the same process doesn't bode well for me. It's probably also not a great idea to pass strings around between functions.
Can we refactor _execution_context_convertor to return the raw dict instead? (Maybe add a new public function?)
Yeah, but can we maybe update ExecutionContextConvertor.to_payload to return a Python dict and only do the JSON dumps in ExecuteSqlCommand.run? It looks like to_payload is only used in this one place.
Or better yet, replace json_success in _create_response_from_execution_context with self.json_response(...), which dumps the dict to a string automatically.
We should do JSON serialization only when we are very close to sending it in the response.
cc @ofekisr
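A rough sketch of what this suggestion amounts to (the class and method names come from the thread; the bodies and wiring are illustrative assumptions):

```python
import json
from typing import Any, Dict


class ExecutionContextConvertor:
    """Sketch: build the response payload as a plain dict."""

    def to_payload(self, execution_context: Any, status: Any) -> Dict[str, Any]:
        # Return the raw dict; leave serialization to the caller.
        return {
            "query_id": getattr(execution_context, "query_id", None),
            "status": str(status),
        }


def run_command(convertor: ExecutionContextConvertor,
                execution_context: Any, status: Any) -> str:
    # ExecuteSqlCommand.run would dump to JSON once, at the boundary.
    payload = convertor.to_payload(execution_context, status)
    return json.dumps(payload, default=str)
```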
My apologies. The fact that the command is called …
"payload": self._execution_context_convertor.to_payload( | ||
self._execution_context, status | ||
), | ||
"payload": self._execution_context_convertor.serialize_payload(), |
Can we maybe just let the command return a raw dict here, too, and use self.json_response in whatever view handler consumes the result, so that the json.dumps in the executor itself can be cleaned up?
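For reference, a minimal sketch of that idea in a Flask view (the view function and its wiring are hypothetical; json_response mirrors the shared helper being referenced):

```python
import json
from typing import Any

from flask import Response


def json_response(obj: Any, status: int = 200) -> Response:
    # Serialize only here, at the view boundary.
    return Response(
        json.dumps(obj, default=str),
        status=status,
        mimetype="application/json",
    )


def sql_json_view(command: Any) -> Response:
    # The command returns a raw dict; the view owns serialization.
    result = command.run()
    return json_response(result)
```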
But there is specific logic in the json.dumps that needs the status to understand how to serialize. I don't see a clean way to do this unless we move all of this logic outside of the command.
Can we revisit this in another refactor ticket?
Sounds good.
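To make the earlier concern concrete: the status-dependent serialization being discussed looks roughly like this (a hedged reconstruction; the status value, row cap, and encoder are assumptions, not Superset's exact code):

```python
import json
from datetime import datetime
from typing import Any, Dict

MAX_DISPLAY_ROWS = 10_000  # hypothetical display cap


def _json_default(obj: Any) -> Any:
    # Serialize datetimes as ISO strings; stringify everything else.
    if isinstance(obj, datetime):
        return obj.isoformat()
    return str(obj)


def serialize_payload(payload: Dict[str, Any], status: str) -> str:
    if status == "results":
        # Payloads that carry result rows get capped before dumping,
        # which is why the status must be known at serialization time.
        data = payload.get("data", [])
        payload = {**payload, "data": data[:MAX_DISPLAY_ROWS]}
    return json.dumps(payload, default=_json_default)
```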
I still think using the shared .json_response …
…pache#20059)
* add save_metadata function to QueryDAO
* use set_extra_json_key
* added test
* Update queries_test.py
* fix pylint
* add to session
* add to session
* refactor
* forgot the return
SUMMARY
Anytime a user executes a query in SQL Lab, we want to store column data inside the new json_metadata field so that we can leverage this information whenever the user wants to run the same query inside Explore.
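As an illustration, the column data stored under the columns key might look like the following (field names are an assumption modeled on SQL Lab's result-set column descriptions):

```python
import json

# Hypothetical column descriptions for an executed query's result set.
columns = [
    {"name": "ds", "type": "DATE", "is_dttm": True},
    {"name": "gender", "type": "VARCHAR", "is_dttm": False},
    {"name": "num", "type": "BIGINT", "is_dttm": False},
]

# What the query's extra JSON would contain after save_metadata runs.
print(json.dumps({"columns": columns}, indent=2))
```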
BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF
TESTING INSTRUCTIONS
ADDITIONAL INFORMATION