-
Notifications
You must be signed in to change notification settings - Fork 13.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: mechanism for re-saving charts #25196
Conversation
Codecov Report
@@ Coverage Diff @@
## master #25196 +/- ##
===========================================
- Coverage 69.09% 58.73% -10.37%
===========================================
Files 1904 1904
Lines 74598 74602 +4
Branches 8246 8246
===========================================
- Hits 51544 43815 -7729
- Misses 20917 28650 +7733
Partials 2137 2137
Flags with carried forward coverage won't be shown. Click here to find out more.
... and 292 files with indirect coverage changes 📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more |
@betodealmeida The other hand, I think we should try to avoid hacks too much in Superset, hacks have too much for now! |
Thank you for the PR @betodealmeida. We have been discussing this internally at Airbnb and it's clear that these problems happen because we are missing schema versioning and validation for these types of fields. We have many bugs that were reported because these schemas change every time we introduce a new feature, essentially creating a new version of it, but we don't correctly support/migrate old versions. The fact that many of these schema transformations are on the frontend is really problematic because we end up creating columns in the database to store those transformations so they can be used by Python code when processing API requests as you can see by #15824, #15865, and #15846. @john-bodley Here's the motivation for the Another critical problem of that approach is that Superset assets can be consumed through APIs, without a frontend, and in that case, changes in feature flags don't reflect in the schemas returned by these APIs. We have 20 feature flags that will be removed in 4.0 (including I understand the motivation for this PR but before we go down this road, is it possible to fix the issue using other means? Can you explain what breaks (video, stacktrace) and which chart types are affected? |
I totally agree, but keep in mind that there are hacks and hacks. This PR is a hack in a sense that "we shouldn't have to do this", not in the sense of "this is really brittle and risky". As is, this PR does nothing, and the only thing it does is re-save the chart when the admin sets the column to true. |
I totally agree, we need versioning, better separation of concerns between backend/frontend and query/presentation, as well as a simpler canonical data model for charts (which would also make it easier to develop new charts).
I agree here too, I've mentioned and been wanting to write a SIP proposing us to adopt blueprints, not only so it's easier to have multiple versions of an API at the same time, but we could even have blueprints for feature flags as well.
You and @geido probably know this better than me, but my understanding of the problem is that some legacy charts still require From the errors I saw, there's no stacktrace, just a subtle silent error where the chart query is generated incorrectly. To the user, apparently no error happens, but the data presented might be fundamentally wrong. I'm still running tests to see which charts are affected and if this PR can fix them. And to be clear, I don't think this is a road, I'm not proposing a new way of doing things. This is just a workaround allowing admins to fix some (or all) charts. This PR is a no-op unless the admin sets the flag to true in one or more charts, so it should be a safe workaround. I don't think we should need this — which is why I called it a hack — and I hope we never need it again in the future. Edit: regarding other fixes, the possible solutions we discussed all seemed risky, specially compared to this approach. We'd have to either modify chart metadata, or have the backend filter it out depending on the visualization type. |
I think in the long run we should really consider implementing a simple Node service that can render query context on the fly from form data (we could place this logic fairly easily in the websocket server). This would remove the need for persisting query context, which we can't update without resaving the chart if the |
I don't know if it's possible, but what I'd like instead is a simpler contract between visualizations and the backend, so that this isn't even needed. |
Thank you for the additional context @betodealmeida.
Got it. Let me know the results.
Me too. |
@betodealmeida @michael-s-molina while there's definitely a lot of standardization that can be done, I think this will be very difficult to achieve. For example, the table chart triggers different queries if the pagination control is checked. Just looking at existing |
@villebro As always, I think there's a middle ground between visualization-specific code and business logic that needs to be supported by our APIs. I think this work is highly correlated with the new plugin architecture we were thinking about. |
Can you share more about this? Can we have a bigger discussion? |
Also, @Vitor-Avila raised a good point: this won't work if the user viewing chart doesn't have permission to overwrite the chart, so they might be looking at incorrect data until an admin opens the chart. I'm going to work on a different approach. |
Sure. We can write a SIP or discussion thread about it after 3.0. |
I definitely don't want to go back to the times of But I think we can generalize the request. At the end of the day we need the following in order to get data (I might be forgetting something):
And then each viz type has its own controls for presentation. Also, each viz should have a standard method for data transformation, something like Edit: also, while we're there we should get rid of annotation layers and just have a chart support multiple datasources. |
I think a discussion or even a meetup might be better. I have a lot of ideas and I know other people also have them, it would be nice to brainstorm before we have a proposal. |
@betodealmeida this would be great, I'd love to participate 👍 |
Maybe when @michael-s-molina is in town we could do a committer meetup and discuss this and other similar topics? I'd love to talk about Flask blueprints and my vision for them. :) |
I would love that! |
SUMMARY
There's currently a bug in Superset where (1) if a chart is saved with a
granularity_sqla
, and (2) theGENERIC_CHART_AXIS
feature is enabled after the chart creation, then the chart will break. This happens for some charts. Strippinggranularity_sqla
is not an option, because some charts still need it even whenGENERIC_CHART_AXIS
is enabled.@Vitor-Avila found a workaround where saving the chart again will fix it. So in this PR I'm adding a new column to charts called
force_save
. When the column is true, the chart will save itself when loaded (in explore or a dashboard), and the column will be set to false. This way, charts can be forced to save by setting this column to true, which could be done conditionally (for certain viz types, eg).Note that while this is a hack, it's a pattern we've used before. For alerts and reports we had to introduce logic to re-save the chart in order to set
query_context
. See #15824, #15865, and #15846.BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF
TESTING INSTRUCTIONS
Still in progress.
ADDITIONAL INFORMATION