Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SIP-5] [Embeddable Charts] Decouple charts from slice and formData #5680

Closed
kristw opened this issue Aug 20, 2018 · 4 comments
Closed

[SIP-5] [Embeddable Charts] Decouple charts from slice and formData #5680

kristw opened this issue Aug 20, 2018 · 4 comments
Labels
sip Superset Improvement Proposal

Comments

@kristw
Copy link
Contributor

kristw commented Aug 20, 2018

@williaster @conglei @mistercrunch @betodealmeida @hughhhh

cc @john-bodley @michellethomas @graceguo-supercat @timifasubaa

[Embeddable Charts] Decouple charts from slice and formData

Motivation

One key part of the embeddable project is to move towards chart plugins system, which we can register only necessary charts for superset or register custom ones as wish. This will give more flexibility to the developers to customize their superset instances (making it more lightweight, include-only-needed, or include custom vis type) as well as improve maintainability of the superset codebase.

In order to do that, we aim to split the chart types (i.e. most of the things in src/visualizations) into one or more plugins (npm packages), independent from superset. Then, we will implement a registry mechanism for importing plugins. However, before getting to that points, there are a few issues:

  • For the plugin system to work, each chart should be independent from superset platform. Current chart codes are tightly coupled with slice. The charts are technically a child of the slice, but they take the parent as argument and call utility functions from slice (mostly jquery wrapper, which can be replaced with native js). It should to be more one-way dataflow and dispatch events that parent can pick-up rather than calling parents.
  • The formData is a big object that contains many fields, very different from one chart to another. It should be clearer what fields each chart needs.
  • Some charts are React components, but instead of using it in the slice directly, the current code contructs an empty <div> then call ReactDOM.render() on the <div>. This was done to support the non-React components. This may be not addressed yet in this SIP but the refactored chart code will enable us to remove this complicated logic.

Proposed change.

The goal of this SIP is to make each chart decoupled, ready to be converted into an independent React component, before separating them into plugins. And we do this cleanup early so we can continue to take improvements/bugfixes from the community while we replace the core part to use plugin system.

The immediate benefit of this process is clearer documenting the props for each visualization, and repairs/revision of some visualization code.

After the work in this SIP is done, the code in Chart.jsx can be simplified. ChartBody.jsx can be replaced with another component that handle only lazy-loading logic and rendering (extracted from Chart.jsx).

New or Changed Public Interfaces

Current chart code are in the form

function chart(slice, formData, setControlValue) {
}

We will change it to

const propTypes = ...;

function chart(element, props) {
  PropTypes.checkPropTypes(propTypes, props, 'prop', 'ChartName');
  const { data, width, height, ... } = props;
  // old code with no more calls to slice.xxx or formData.xxx or setControlValue
}

chart.propTypes = propTypes;

function adaptor(slice, formData, setControlValue) {
  const element = document.querySelector(slice.selector);
  return chart(element, { 
    data: ...
    width: ...
    height: ...
    xxx: ...
    onXXX: (...) => { setControlValue(...); }
  });
}

Eventually the adaptor will be broken into their own files or adapted to other formats. First pass will keep them in the same file to avoid merge conflicts at index.js.

Please see more concrete example in the "Work in progress" section below.

New dependencies

None

Progress

PR Status Description
#5469 Merged BigNumber
#5669 Merged Word cloud
#5670 Merged Tree Map
#5671 Merged Chord Vis
#5672 Merged iframe and markup
#5690 Merged Horizon Chart
#5691 Merged Force-directed Graph
#5699 Merged Sunburst
#5701 Merged Sankey
#5704 Merged Heatmap
#5705 Merged Pivot Table
#5707 Merged Table
#5718 Merged Partition
#5719 Merged World Map
#5721 Merged Country Map
#5758 Merged Histogram
#5760 Merged Calendar Heatmap
#5761 Merged Parallel Coordinates
#5762 Merged Paired t-test
#5763 Merged Rose
#5775 Merged Time Series Table
#5783 Merged MapBox
#5789 Merged FilterBox
#5838 Merged NVD3: Area, Bar, BoxPlot, Bubble, Bullet, compare, dist_bar, line, time_pivot, pie, dual_line

Also completed

  • line_multi
  • EventFlow
  • Deck.gl: deck_scatter, deck_screengrid, deck_grid, deck_hex, deck_path, deck_geojson, deck_arc, deck_polygon, deck_multi
@kristw kristw changed the title [SIP-5] [Embeddable Charts] Decouple charts from slice and formData [SIP-5] [Embeddable Charts] Decouple charts from slice and formData Aug 20, 2018
@williaster williaster added the sip Superset Improvement Proposal label Aug 20, 2018
@mistercrunch
Copy link
Member

Note that the chart function's signature is actually function deckGeoJson(slice, payload, setControlValue). setControlValue is a hook to a Redux action that allows for the chart to modify controls. For instance the deckgl visualizations zooming/panning set the ViewportControl. The adaptor function needs to receive/dispatch it.

To make this frontend visualization plugin possible, we'll need to figure out how to get rid of the visualization-specific backend code in viz.py over time, to make sure that visualizations plugins are fully defined as Javascript. One easy step forward is to move the backend's query_obj code, which takes form_data as input and returns a query object (simple logic). The get_data method is probably harder to migrate to the backend (takes a dataframe and returns nested objects), but it can probably be generalized quite a bit, so that it's not visualization-specific as much as ordering pandas-like operations on the result set.

So I'm wondering where these new frontend methods should live. I'm guessing you were thinking about export the adaptor and using this as an the interface between chart and the visualization plugin. Now I'm thinking we may want to export a more complex object.

export const visualization = {
  adaptor,
  queryGenerator: (formData) => {
    return generateQuery(formData);
  },
  dataframeOperations: (formData) => (['pivot_table',  { cols: [formData.cols], rows: formData.groupby }]),
};

I'm not sure how to model the query vs dataframe operations, should it be incorporate into the query, or made into its own thing. Clearly on the backend we need two abstractions as one is used to interact with connectors (querying) and the other applies to all connectors (dataframe transformation).

I also wonder whether we could expose most of the pandas API from the frontend by doing a bit of magic, where the dataframeOperations above would somehow do something like getattr(df, command)(**props) on the backend.

@kristw
Copy link
Contributor Author

kristw commented Aug 21, 2018

Thanks @mistercrunch

Note that the chart function's signature is actually function (slice, payload, setControlValue). setControlValue is a hook to a Redux action that allows for the chart to modify controls.

I have revised the proposal above to include how to handle setControlValue.

To make this frontend visualization plugin possible, we'll need to figure out how to get rid of the visualization-specific backend code in viz.py over time, to make sure that visualizations plugins are fully defined as Javascript.

Agree. Let me move this discussion into another SIP, so this one will be kept small and focus on the front-end chart code clean up. We aim that the first pass of this process is to decouple front-end side of these classic vis into packages and be able to use current APIs to power them (i.e. without modifying viz.py yet). Meanwhile there are other workstreams on defining the common query and data format for explore API v2 that can be used to power multiple visualizations and decide what are the post-transformations in js.

@john-bodley
Copy link
Member

@mistercrunch I agree with your comment

I'm not sure how to model the query vs dataframe operations, should it be incorporate into the query, or made into its own thing. Clearly on the backend we need two abstractions as one is used to interact with connectors (querying) and the other applies to all connectors (dataframe transformation).

and I'm also perplexed as to how to best generalize this. It seems a first step is to map arbitrary data (via a query) into a somewhat generic (potentially visualization aware) form.

@kristw
Copy link
Contributor Author

kristw commented Nov 14, 2018

Going to close this as complete: All visualization components have been refactored as described here.

Noting that this SIP was used more as formalizing architecture design and not for any formal voting.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
sip Superset Improvement Proposal
Projects
Status: Implemented / Done
Development

No branches or pull requests

4 participants