Update documentation and add schema details for Custom Dataset Preview #1884

Closed
1 task done
ravi-kumar-pilla opened this issue Apr 30, 2024 · 3 comments

Comments

@ravi-kumar-pilla
Contributor

Description

We have introduced custom dataset preview, which lets users implement a preview method on their own datasets. However, the documentation needs to be updated with details of the expected dict schema, and we should also add validation for the NewTypes (e.g., TablePreview).

Context

#1847 (comment)

Possible Implementation

  1. The return type of the preview function should match one of the following types:
TablePreview = NewType("TablePreview", dict)
ImagePreview = NewType("ImagePreview", bytes)
PlotlyPreview = NewType("PlotlyPreview", dict)
JSONPreview = NewType("JSONPreview", dict) 
  1. Kedro-Viz expects the dict to follow the schema below -

    TablePreview :

    preview={
        'index': number[], 
        'columns': string[], 
        'data': any[][]  // List[List[Any]] 
    }

    index - An array of 0-indexed integers, one per row (nrows entries)
    columns - An array of strings giving the column names (ncolumns entries)
    data - A 2D array holding the table data, one inner array per row

    Example -

    Catalog -

    companies:
      type: pandas.CSVDataset
      filepath: ${_base_location}/01_raw/companies.csv
      metadata:
        kedro-viz:
          layer: raw
          preview_args:
            nrows: 5

    TablePreview value returned from preview() function -

    preview={
        'index': [0, 1, 2, 3, 4],
        'columns': ['id', 'company_rating', 'company_location', 'total_fleet_count', 'iata_approved'],
        'data': [
            [35029, '100%', 'Niue', 4.0, 'f'],
            [30292, '67%', 'Anguilla', 6.0, 'f'],
            [19032, '67%', 'Russian Federation', 4.0, 'f'],
            [8238, '91%', 'Barbados', 15.0, 't'],
            [30342, nan, 'Sao Tome and Principe', 2.0, 't']
        ]
    }
  2. We should enforce the schema for the newly introduced NewTypes to avoid a blank UI, as mentioned here

  3. We should document the expected {key:value} pairs.
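
As a starting point for points 2 and 3, the schema check could be sketched roughly as below. This is only an illustration and not existing Kedro-Viz code: the `validate_table_preview` helper, the `REQUIRED_KEYS` constant and the error messages are hypothetical. Only the `TablePreview` NewType and the `index` / `columns` / `data` keys come from this issue.

```python
from typing import Any, NewType

TablePreview = NewType("TablePreview", dict)

# Keys taken from the schema described in this issue.
REQUIRED_KEYS = ("index", "columns", "data")


def validate_table_preview(preview: Any) -> TablePreview:
    """Hypothetical helper: raise ValueError if `preview` does not match the schema."""
    if not isinstance(preview, dict):
        raise ValueError("TablePreview must be a dict")

    missing = [key for key in REQUIRED_KEYS if key not in preview]
    if missing:
        raise ValueError(f"TablePreview is missing keys: {missing}")

    index, columns, data = (preview[key] for key in REQUIRED_KEYS)

    if not isinstance(index, list) or not all(isinstance(i, int) for i in index):
        raise ValueError("'index' must be a list of integers")
    if not isinstance(columns, list) or not all(isinstance(c, str) for c in columns):
        raise ValueError("'columns' must be a list of strings")
    if not isinstance(data, list) or len(data) != len(index):
        raise ValueError("'data' must contain one row per entry in 'index'")
    if any(not isinstance(row, list) or len(row) != len(columns) for row in data):
        raise ValueError("every row in 'data' must have one value per column")

    # NewType is an identity function at runtime, so the same dict is returned.
    return TablePreview(preview)
```

Failing early with a descriptive error is one way to surface a malformed preview to the dataset author instead of rendering a blank UI. Whether the check belongs in the backend, in the NewType wrapper, or in the docs only is still open.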

NOTE: This ticket needs to be updated with the schema details of the other return types (ImagePreview, PlotlyPreview, JSONPreview) for reference.
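
For the TablePreview case, a preview() implementation that does not rely on pandas might look roughly like this. It is a sketch under assumptions: `CSVRowsDataset` is a hypothetical stand-alone class (a real custom dataset would subclass Kedro's dataset base class), and `nrows` is assumed to be passed through from the `preview_args` entry shown in the catalog example.

```python
import csv
import io
from typing import NewType

TablePreview = NewType("TablePreview", dict)


class CSVRowsDataset:
    """Hypothetical dataset holding CSV text; stands in for a real custom dataset."""

    def __init__(self, csv_text: str):
        self._csv_text = csv_text

    def preview(self, nrows: int = 5) -> TablePreview:
        reader = csv.reader(io.StringIO(self._csv_text))
        columns = next(reader)  # the header row gives the column names
        # Take at most `nrows` rows; the csv module yields every value as a string.
        data = [row for _, row in zip(range(nrows), reader)]
        return TablePreview(
            {
                "index": list(range(len(data))),  # 0-indexed, one entry per row
                "columns": columns,
                "data": data,
            }
        )


dataset = CSVRowsDataset(
    "id,company_location\n35029,Niue\n30292,Anguilla\n19032,Russian Federation\n"
)
preview = dataset.preview(nrows=2)
```

Here `preview` contains two rows under `data`, `[0, 1]` under `index`, and the two header names under `columns`, which is the shape described in the schema above.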

Possible Alternatives

Checklist

  • Include labels so that we can categorise your feature request
@Huongg
Contributor

Huongg commented Jul 15, 2024

To summarise: we're going to document the table preview and exactly what schema we expect.

@Huongg Huongg moved this from Inbox to Backlog in Kedro-Viz Jul 15, 2024
@rashidakanchwala
Contributor

We also need to discuss how to approach TablePreview when we do this ticket.

@astrojuanlu
Member

I was trying to prove that this should work for non-Pandas datasets, but I couldn't get it to work at all: #1847 (comment)

@rashidakanchwala rashidakanchwala moved this from Backlog to Todo in Kedro-Viz Jul 22, 2024
@rashidakanchwala rashidakanchwala self-assigned this Jul 22, 2024
@rashidakanchwala rashidakanchwala changed the title Update documentation and schema validation for Custom Dataset Preview Update documentation and add schema details for Custom Dataset Preview Jul 22, 2024
@rashidakanchwala rashidakanchwala added Documentation and removed Python Pull requests that update Python code labels Aug 19, 2024
@github-project-automation github-project-automation bot moved this from Todo to Done in Kedro-Viz Sep 30, 2024
Projects
Status: Done
Development

No branches or pull requests

4 participants