Description
[SIP-38] Visualization plugin refactoring
Motivation
One of the most frequently recurring questions in the Superset community, on Slack and elsewhere, is how to add a new data visualization. The answer, in short, has been “it’s hard.” While that may be true, the goal of this SIP is both to lay out the tactical refactoring the current implementation needs in order to mature and to propose a handful of roadmap features that make plugin development significantly easier. These changes will make upcoming modifications of existing plugins (see SIP-34) drastically simpler, and steer toward opening an ecosystem of Superset visualization plugins.
Much planning and work have already gone into making plugins easier to add and edit, including a new query API endpoint, but many blocking issues and code migrations remain before this process is complete. These issues, and proposed solutions for them, are enumerated below; additional suggestions are welcome. Special thanks to @kristw, @williaster, @xtinec, and @conglei for their significant contributions to the frontend and API work thus far.
Proposed Changes
General Goals:
- As much code and configuration as possible for individual visualization plugins should be moved out of `incubator-superset` and into the individual plugin’s repos (in a perfect world, a new plugin wouldn’t require touching two repos and opening two PRs).
- Reduce frustration in working on plugin repos, allowing people to more easily see changes as they make them.
Issue: |
Proposal: |
Issue: |
Proposal: |
Issue: |
Proposal: |
Issue: |
Proposal: |
Issue: |
Proposal: |
Additional (follow-up) refactoring tasks
- Follow CSS-in-JS patterns (see SIP-37) in viz components, sharing common theme styles/variables with `incubator-superset`. Theme variables may need to be moved to `superset-ui` to be consumed by both `superset-ui-plugins` and `incubator-superset`.
- Audit the i18n of plugin text for correctness and completeness, and address any issues found.
- Convert all viz components to TypeScript (see SIP-36).
New or Changed Public Interfaces
The query endpoint at `/api/v1/query` needs significant enhancement, as laid out in the proposals above (post-processing options, tests, docs).
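To make the idea of post-processing options concrete, here is a minimal sketch of what a request to the endpoint might look like. Only the `/api/v1/query` path comes from this SIP; the type names, field names, operation names, and the use of `POST` are all hypothetical and are not the actual Superset API.

```typescript
// Hypothetical shapes for illustration only: none of these names are taken
// from the Superset API. Only the endpoint path appears in this SIP.
type Aggregate = "sum" | "mean" | "count";

type PostProcessingStep =
  | { operation: "rollup"; groupBy: string[]; aggregates: Record<string, Aggregate> }
  | { operation: "pivot"; index: string[]; columns: string[]; values: string[] };

interface QueryRequest {
  datasource: string;
  metrics: string[];
  // Steps the backend would apply, in order, before returning the result.
  postProcessing: PostProcessingStep[];
}

// Build (but do not send) a request description. The HTTP method is an assumption.
function buildQueryRequest(request: QueryRequest): { url: string; method: string; body: string } {
  return {
    url: "/api/v1/query",
    method: "POST",
    body: JSON.stringify(request),
  };
}

// Ask the backend to roll the data up by region, so only the aggregated
// rows would need to travel over the wire.
const exampleRequest = buildQueryRequest({
  datasource: "sales",
  metrics: ["amount"],
  postProcessing: [
    { operation: "rollup", groupBy: ["region"], aggregates: { amount: "sum" } },
  ],
});
```

Describing post-processing as an ordered list of steps is one possible design; it would let a plugin declare the shape of data it needs without shipping transformation code of its own.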
New dependencies
N/A
Migration Plan and Compatibility
N/A
Rejected Alternatives
- Reintroducing viz plugins into `incubator-superset`
  Having the plugins be in their own repos is troublesome from a workflow perspective (due to the multiple PRs required, NPM Link work needed, and separate build processes required). The proposals laid out above seek to minimize this difficulty. While it is certainly possible (and indeed likely easier) to move the plugins back into Superset itself (like Redash and Metabase do), solving these more difficult problems seems more likely to open the door to a true plugin ecosystem for Superset.
- Moving data transformations to plugins (JS), deprecating Pandas
  The idea has been floated that perhaps data transformation (at least in some cases) might be more the responsibility of the viz plugin itself than the backend, and maybe if we moved that logic, we could deprecate Pandas. To test the theory, some basic benchmarking attempts were made on large rollup and pivot tasks, to compare the performance of Pandas against Zebras, Datalib, Ramda, and Lodash. This approach, at least as a global migration, was decided against for these reasons:
  - Sending an entire dataset over the wire, if the frontend just needs a rollup, is a waste of resources.
  - If post-processing is done on the backend, the result can be cached for use by multiple charts (or multiple clients and reloads).
  - Neither Zebras nor Datalib provides an out-of-the-box pivot function on par with Pandas, and the `pivotWith` "recipe" from the Ramda cookbook looked to be significantly slower than Pandas (approx 10x).
  - All these libraries provide grouping, sorting, map/reduce functionality, so you can pivot the data manually. But then, so does Lodash, which matched (or slightly beat) the other JS libraries' performance. This was still about 2x slower than Pandas.
  - TL;DR: If you want to avoid writing Python for a new viz or calling it through the new API, and want to do a little data munging on the frontend, just use lodash or vanilla JS for best results.
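As an illustration of the "little data munging on the frontend" case, here is a minimal sketch of a manual pivot with no library dependencies. The `pivotSum` helper and the sample data are hypothetical; this is not the code that was benchmarked, and it makes no claim about performance relative to Pandas or Lodash.

```typescript
type Row = Record<string, string | number>;

// Pivot rows into one output row per distinct `index` value, with one key per
// distinct `column` value. Values that land in the same cell are summed.
// Cells with no source rows are simply absent (not filled with 0).
function pivotSum(rows: Row[], index: string, column: string, value: string): Row[] {
  const byIndex = new Map<string | number, Row>();
  for (const row of rows) {
    const key = row[index];
    let out = byIndex.get(key);
    if (out === undefined) {
      out = { [index]: key };
      byIndex.set(key, out);
    }
    const col = String(row[column]);
    const prev = out[col];
    out[col] = (typeof prev === "number" ? prev : 0) + Number(row[value]);
  }
  // Map preserves insertion order, so rows come back in first-seen order.
  return Array.from(byIndex.values());
}

// Hypothetical sample data.
const sales: Row[] = [
  { region: "east", year: 2018, amount: 10 },
  { region: "east", year: 2019, amount: 15 },
  { region: "west", year: 2018, amount: 7 },
  { region: "east", year: 2018, amount: 5 },
];

// One row per region, with a summed `amount` under each year key.
const pivoted = pivotSum(sales, "region", "year", "amount");
```

A single pass with a `Map` keeps the sketch short. For anything larger than light reshaping, the reasons listed above still favor doing the work on the backend.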
Status: Implemented / Done