Allow passing pandas indexes in addition to lists and arrays #2808

Closed

Description

There are some places where Altair currently requires a list or an array and rejects, for example, a pandas Index, even though an Index can easily be converted to either. The resulting errors can be confusing, so I suggest that we be more lenient and at least accept pandas Index objects by converting them to lists automatically. Optionally, we could convert any data structure that has a tolist() method.
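A minimal sketch of the optional duck-typed conversion, to make the proposal concrete. The helper name `to_list_if_possible` is hypothetical and is not part of Altair; where such a conversion would be applied inside the library is left open. `array.array` from the standard library stands in for a pandas Index here only because it also exposes `tolist()`.

```python
from array import array
from typing import Any


def to_list_if_possible(obj: Any) -> Any:
    """Hypothetical helper: return obj.tolist() when available, else obj unchanged."""
    tolist = getattr(obj, "tolist", None)
    if callable(tolist):
        return tolist()
    return obj


# array.array is a standard-library type with tolist(); it stands in here for
# a pandas Index or NumPy array so the sketch runs without extra dependencies.
options = to_list_if_possible(array("i", [4, 6, 8]))

# Objects without tolist(), such as plain lists, pass through untouched.
unchanged = to_list_if_possible(["a", "b"])
```

pandas `Index`, pandas `Series`, and NumPy arrays all provide `tolist()`, so the same check would cover them without importing pandas in the conversion code.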

In this example the error is fairly helpful, although still a bit confusing, since indexes and arrays are often used interchangeably when working with pandas directly:

import altair as alt
from vega_datasets import data

source = data.cars().melt(id_vars=['Origin', 'Name', 'Year', 'Horsepower', 'Cylinders'])
dropdown_options = source['variable'].drop_duplicates()  # A pandas Series; this line needs explicit conversion

dropdown = alt.binding_select(
    options=dropdown_options,
    name='X-axis column '
)
selection = alt.selection_point(
    fields=['variable'],
    value=[{'variable': dropdown_options[0]}],
    bind=dropdown
)

alt.Chart(source).mark_circle().encode(
    x=alt.X('value:Q', title=''),
    y='Horsepower',
    color='Origin',
).add_params(
    selection
).transform_filter(
    selection
)

SchemaValidationError: Invalid specification

        altair.vegalite.v5.schema.core.BindRadioSelect->options, validating 'type'

        {0: 'Miles_per_Gallon', 406: 'Displacement', 812: 'Weight_in_lbs', 1218: 'Acceleration'} is not of type 'array'
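For reference, the explicit conversion the example currently needs can be sketched with pandas alone. The small Series below is a stand-in for `source['variable']`, not the real cars data. The dict in the error message resembles what `Series.to_dict()` returns, which presumably is why the validator sees an object rather than an array; that reading is an inference from the message, not something confirmed in Altair's code.

```python
import pandas as pd

# Stand-in for source['variable']: a melted column with repeated names.
variable = pd.Series(
    ["Miles_per_Gallon"] * 2 + ["Displacement"] * 2 + ["Weight_in_lbs"] * 2
)

unique = variable.drop_duplicates()

# The Series keeps its original index labels, so it maps label -> value.
as_mapping = unique.to_dict()

# Explicit conversion to a plain list, which binding_select accepts.
dropdown_options = unique.tolist()
first_option = dropdown_options[0]
```

With the list in hand, `options=dropdown_options` and `dropdown_options[0]` in the example above work unchanged.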

In other cases, such as the one below, the traceback is huge and the error message is not helpful at all:

import altair as alt
import pandas as pd
from vega_datasets import data

barley = data.barley()

barley['variety'] = pd.Categorical(
    barley['variety'],
    ordered=True,
    categories=[
        'Manchuria',
        'No. 457',
        'No. 462',
        'No. 475',
        'Glabron',
        'Svansota',
        'Velvet',
        'Trebi',
        'Wisconsin No. 38',
        'Peatland'
    ]
)

alt.Chart(barley).mark_bar().encode(
    x=alt.X('variety', sort=barley['variety'].cat.categories),  # A pandas Index; this line needs manual conversion
    y=alt.Y('sum(yield)'),
    color='site:N'
)

ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
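The manual conversion for this case can be sketched the same way. The three-category Series below is a shortened stand-in for `barley['variety']`; the point is only that `.cat.categories` is a pandas Index and that `tolist()` turns it into the plain list that `sort=` currently expects.

```python
import pandas as pd

# Shortened stand-in for barley['variety'] with an explicit category order.
variety = pd.Series(
    pd.Categorical(
        ["Trebi", "Velvet", "Manchuria", "Trebi"],
        ordered=True,
        categories=["Manchuria", "Velvet", "Trebi"],
    )
)

categories = variety.cat.categories  # a pandas Index, not a list
sort_order = categories.tolist()     # plain Python list in category order

# In the chart above, this would be passed as:
#     alt.X('variety', sort=sort_order)
```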