feat: make pandas and NumPy optional dependencies, don't require PyArrow for plotting with Polars/Modin/cuDF #3452

MarcoGorelli · 2024-06-30T15:48:18Z

Suggested in #3445 (comment)

closes #3456
closes #3445
closes #3473
~~closes #3471~~ (this wasn't an issue to begin with)

Demo

Polars can work without PyArrow nor pandas installed (EDIT: and without NumPy either!):

PyArrow table is still supported

(notice the .to_arrow)

Ibis is still supported, and now Date dtype works too!

(previously, it would have been necessary to manually cast 'date' to 'datetime' type before plotting)

Performance

There's also a perf improvement to staying Polars-native where possible - for example, if I run

import polars as pl
import altair as alt
from datetime import date

# dataset: https://github.com/Mcompetitions/M6-methods/blob/main/assets_m6.csv
df = pl.read_csv('../scratch/assets.csv', try_parse_dates=True)
df = df.filter(pl.col('date')>date(2022, 11, 15))
def func():
    alt.Chart(df).mark_line().encode(
        x='date', y='price', color='symbol',
    ).to_json()
results = %timeit -o -q func()
results.best

in IPython, then I see:

Altair main branch: 0.033584889399935494
Altair MarcoGorelli:narwhalify branch: 0.024283270300657023

That's for a (4475, 3)-shaped dataframe - for dataframe with many columns (esp. with strings / categoricals) I'd expect the difference to grow

Addendum: I hadn't noticed this originally, but this PR also makes Altair's import times about twice as fast

dangotbanned · 2024-06-30T16:56:12Z

@MarcoGorelli if all the tests are passing locally for you, try running these two:

hatch run generate-schema-wrapper
hatch run update-init-file

If there are any changes other than whitespace, you'll get a more informative error than https://github.com/vega/altair/actions/runs/9733301918/job/26860053484?pr=3452

Otherwise those two should clean everything up.

tools/generate_schema_wrapper.py

altair/vegalite/v5/schema/channels.py

tools/generate_schema_wrapper.py

dangotbanned · 2024-06-30T19:38:07Z

https://github.com/vega/altair/actions/runs/9734437115/job/26862450764?pr=3452

@MarcoGorelli try this to run the tests on the matrix locally:

hatch test --all

The help for hatch is pretty useful as well:

hatch test -h

dangotbanned · 2024-07-02T11:31:56Z

@MarcoGorelli these were two minor things. I couldn't figure out a way to request these another way.

fix: add missing `pandas` import to doctest

519     Examples 520     -------- 521     >>> data = pd.DataFrame({'foo': ['A', 'B', 'A', 'B'], UNEXPECTED EXCEPTION: NameError("name 'pd' is not defined") Traceback (most recent call last):   File "..\Local\hatch\env\virtual\.pythons\3.12\python\Lib\doctest.py", line 1361, in __run           exec(compile(example.source, filename, "single",   File "<doctest altair.utils.core.parse_shorthand[0]>", line 1, in <module> NameError: name 'pd' is not defined C:\Users\danie\Documents\GitHub\altair\altair\utils\core.py:521: UnexpectedException ============================================== short test summary info ==============================================  FAILED altair/utils/core.py::altair.utils.core.parse_shorthand ========================= 1 failed, 1250 passed, 1 skipped, 2 xfailed, 1 xpassed in 30.64s ==========================

diff --git a/altair/utils/core.py b/altair/utils/core.py
index e6620146..d5d753db 100644
--- a/altair/utils/core.py
+++ b/altair/utils/core.py
@@ -520,6 +520,7 @@ def parse_shorthand(
 
     Examples
     --------
+    >>> import pandas as pd
     >>> data = pd.DataFrame({'foo': ['A', 'B', 'A', 'B'],
     ...                      'bar': [1, 2, 3, 4]})

refactor(typing): Reuse `_typing` alias as `InferredVegaLiteType`

diff --git a/altair/utils/core.py b/altair/utils/core.py
index e6620146..f3a5469a 100644
--- a/altair/utils/core.py
+++ b/altair/utils/core.py
@@ -42,6 +42,7 @@ else:
 if TYPE_CHECKING:
     from types import ModuleType
     import typing as t
+    from altair.vegalite.v5.schema._typing import StandardType_T as InferredVegaLiteType
     from altair.utils._dfi_types import DataFrame as DfiDataFrame
     from altair.utils.data import DataType
     from narwhals.typing import IntoExpr
@@ -199,9 +200,6 @@ TIMEUNITS = [
 ]
 
 
-InferredVegaLiteType = Literal["ordinal", "nominal", "quantitative", "temporal"]
-
-
 def infer_vegalite_type_for_pandas(
     data: object,
 ) -> InferredVegaLiteType | tuple[InferredVegaLiteType, list[Any]]:

mattijn · 2024-07-02T21:53:43Z

Thank you very much for making this Pull Request. That is very much appreciated!

This PR introduces a new dependency on narwhals. Narwhals is described as a lightweight and extensible compatibility layer between multiple dataframe libraries.
Narwhals can replace our hard dependency on pandas, the size of pandas is ~11 MB (without mentioning its subdependencies). This by itself is a very interesting development already.

Next to this it enables support for native dataframes (pandas still included) without converting these to a pyarrow Tables through the dataframe interchange protocol.

The latest release of Altair is ~850 kB and after this PR, Altair will have the following hard dependencies:

jinja2 : ~130 kB
jsonschema : ~90 kB
typing_extensions: ~35 kB
packaging : ~55 kB
narwhals : ~80 kB
~~numpy : 18 MB~~ (also removed as part of this PR)

When using Altair with only remote endpoints (like APIs that are returning JSON-data), I've the feeling that we do not need to rely on numpy. ~~Therefor I'm still hopefull that we can reduce this hard dependency on numpy as well one day.~~

I've a few questions regarding this PR and narwhals.

Just curious, why is the lower limit of pandas in narwhals version 1.1.5? As the supported lower limit of pandas in Altair before this PR was version 0.25.
You serialize the nw.Date type to:

if dtype == nw.Date:
	columns.append(nw.col(name).dt.to_string("%Y-%m-%d"))

And change a test from:

"a": {"date": "2019-01-01T00:00:00"},

To:

"a": {"date": "2019-01-01"},

Upon first sight, I think we cannot introduce this breaking change. Does this change still adhere to full ISO-8601 dates representation as is required by Javascript? See also past issues like this: #1027.

Can you explain the behavior of nw.from_native()?
When I look to this code:

data = nw.from_native(data, eager_only=True, strict=False)
if isinstance(data, nw.DataFrame):
	return data
if isinstance(data, DataFrameLike):
	from altair.utils.data import arrow_table_from_dfi_dataframe

	pa_table = arrow_table_from_dfi_dataframe(data)
	return nw.from_native(pa_table, eager_only=True)

I'm a bit surprised that the data object, as a result of passing it into nw.from_native can be a non-narwhals dataframe. If it is not supported by narwhals it returns the original data object?

Maybe a naive question, but do you have any intention to support dataframes that support the dataframe interface protocol within narwhals?
Not being a type expert, but I cannot really understand why there needs to be quotes around the type "IntoDataFrame", like in:

DataType: TypeAlias = Union[
    Dict[Any, Any], "IntoDataFrame", SupportsGeoInterface, DataFrameLike
]

I remember I have discussed this with @dangotbanned recently in #3431, see #3431 (comment) onwards, but I can't find a clear answer anymore on this topic.

Thanks again for this PR! Really promising! (also this PR will supersede #3384)

dangotbanned · 2024-07-03T09:14:23Z

Not being a type expert, but I cannot really understand why there needs to be quotes around the type "IntoDataFrame", like in:
DataType: TypeAlias = Union[
    Dict[Any, Any], "IntoDataFrame", SupportsGeoInterface, DataFrameLike
]
I remember I have discussed this with @dangotbanned recently in #3431, see #3431 (comment) onwards, but I can't find a clear answer anymore on this topic.

Thanks for the ping @mattijn

The good news is you can trust the ruff config now for managing cases like this (particularly TCH), what @MarcoGorelli has written is correct.

https://github.com/MarcoGorelli/altair/blob/2df9422dca75a26eb16fd0c8030b6e89b4d4fa0f/altair/utils/data.py#L44-L66

All of the imports within the TYPE_CHECKING block must be quoted when used in a runtime alias definition. IntoDataFrame is used in: DataType, TIntoDataFrame, SampleReturnType.

AFAIK, this should be fine to move to a regular import though

alt.utils.data requires explicit exports in alt.utils.__init__.__all__
narwhals is already imported at runtime in https://github.com/MarcoGorelli/altair/blob/2df9422dca75a26eb16fd0c8030b6e89b4d4fa0f/altair/utils/data.py#L26

altair/utils/data.py

The user will see this message ``` altair\utils\data.py:17:5: TID251 `typing.Optional` is banned: Use `Union[T, None]` instead. `typing.Optional` is likely to be confused with `altair.Optional`, which have similar but different semantic meaning. See vega#3449 Found 1 error. ``` Partial fix for vega#3452 (comment)

MarcoGorelli · 2024-07-03T11:07:24Z

Thanks so much for your quick and careful reviews @mattijn and @dangotbanned 🙏 ! Much appreciated

No reason :) I just lowered the minimum version of pandas in Narwhals - I set it to 0.25.3 rather than 0.25 though, as when I tried using 0.25 in CI it just froze indefinitely. 0.25.3 is also the one that Altair is currently testing against in CI - is it OK to set that as the minimum?
Interesting, thanks, and I'm sorry for not having picked up on that! Quoting the issue

If you were east of London when running this code, you wouldn't see this issue 😄.

that might be why I didn't see it when testing it out 😳 .
Polars doesn't let us format Dates with time directives (because the underlying Rust crate chrono doesn't either) - but, we can just cast to datetime, I've added a comment about why this is done. On my laptop, pyarrow.compute.strftime is about 200 times slower than casting Date to Datetime - so, the extra cast to ensure consistency, in the grand scheme of the things, is quite negligible, and not the bottleneck of the sanitize function
That's because of strict=False in the first line of the function https://narwhals-dev.github.io/narwhals/api-reference/narwhals/#narwhals.from_native, the idea is:
- strict=False: if the object isn't supported by Narwhals, then it just gets passed through as-is
- strict=True: if the object isn't supported by Narwhals, then raise
We'd like to have 2 levels of supported dataframes in Narwhals:
- full API
- basic. I.e. an extended and more user-friendly version of the dataframe interchange protocol. Anything which already supports the interchange protocol would be in scope. My main gripe with the interchange protocol is that it's bundled with dataframe libraries themselves, which means is super slow to change - for example, Jon reported BUG: Conversion of datetime64[ns, UTC] to Arrow C format string is not implemented pandas-dev/pandas#54239, and it got fixed, but it's going to be several years until the minimum pandas version gets bumped up enough that it can be used... 😭
thanks @dangotbanned for helping out with this one!

I've the feeling that we do not need to rely on numpy

"in for a penny, in for a pound" - I've added a commit to do this. NumPy is used very little, mostly isinstance checks and np.ceil, so I think it can already be made optional. Then Altair would truly be the lightest visualisation library in the ecosystem?

I've caught up with Polars devs, and they're on board with using Altair in polars.DataFrame.plot if the plots can be done directly without going via pandas, so we may want to have a conversation later about how that should look (personally, I think df.plot.line(x='x', y='y') as a shorthand for alt.Chart(df).mark_line().encode(x='x', y='y').interactive() would be really nice, and users would ❤️ love ❤️ it)

dangotbanned · 2024-07-03T11:58:58Z

Thanks @MarcoGorelli

When I originally suggested narwhals I didn't expect you'd take an active role - and this PR is certainly far beyond that! 😄
I get the impression that this is beneficial for both projects, as any issues you've encountered have been quickly resolved in subsequent narwhals releases

I've caught up with Polars devs, and they're on board with using Altair in polars.DataFrame.plot if the plots can be done directly without going via pandas, so we may want to have a conversation later about how that should look (personally, I think df.plot.line(x='x', y='y') as a shorthand for alt.Chart(df).mark_line().encode(x='x', y='y').interactive() would be really nice, and users would ❤️ love ❤️ it)

I think most people would agree on the value of this, but probably best for another issue.
@joelostblom would probably be able to chime in, there was also this discussion #2789 (comment)

I would expect there would be more support if the implementation was different to the existing hvPlot (polars) / hvPlot accessor.
Currently, the burden would be on altair to define what kinds of charts (e.g. line, scatter, etc), but this isn't a 1-to-1 with how many of the other plotting libraries work.

My suggestion would be for polars to define a protocol with some chart methods, with altair implementing those that make sense for quick EDA.
That way, these methods/functions may have less of a chance of leaking out into the rest of the altair api.

tools/schemapi/schemapi.py

MarcoGorelli · 2024-07-03T13:56:06Z

Thanks @dangotbanned !

I would expect there would be more support if the implementation was different to the existing hvPlot (polars) / hvPlot accessor.

Definitely - plot is marked as unstable in Polars 😎 What I had in mind was just something like

class PlotNameSpace:
    def __init__(self, df: DataFrame) -> None:
        self.df = df
    def line(self, *args, **kwargs):
        return alt.Chart(self.df).mark_line().encode(*args, **kwargs).interactive()
    def point(self, *args, **kwargs):
        return alt.Chart(self.df).mark_point().encode(*args, **kwargs).interactive()
    ...

Anyway, we can figure out the details later, whether or not Altair decides to go ahead with Narwhals

(for the record, I think hvplot is amazing and its developers are really friendly, so I'd still advocating for it to be advertised in the Polars docs - but if Altair could provide a more lightweight alternative which doesn't require other heavy libraries, I think it'd be a more suitable default)

mattijn · 2024-07-03T14:02:50Z

Really awesome to see you were able to remove numpy as a hard dependency too! Its great to see this happening!

A few more comments while you might still have this PR open.

In my opinion, there is no need to serialize microseconds for type Date (also reducing characters in the vega-lite specification). If we define local_iso_fmt_string = "%Y-%m-%dT%H:%M:%S" and use that when nw.Date applies and when nw.Datetime applies we can use f"{local_iso_fmt_string}%.f" or something similar.

A single comment like # using `strict=False` will return the `data` as-is if the object cannot be converted. would be appreciated for my future-self or other people who will be reading the following line:

altair/altair/utils/core.py

Line 477 in 7312f25

data = nw.from_native(data, eager_only=True, strict=False)

This line can be removed:

altair/altair/utils/core.py

Line 451 in 7312f25

# (~200x faster for pyarrow) and so isn't the bottleneck in this function.

Thanks again for pushing this!

mattijn · 2024-07-03T14:14:01Z

A bit off topic, but regarding polars.DataFrame.plot functionality: (1) you also might learn some lessons from this archived repo: https://github.com/altair-viz/pdvega, (2) there is also some work underway to extend the .interactive() method within altair #3394.

mattijn

For me this is a +1, it would be great if there is another maintainer doing a review.

binste · 2024-07-03T14:44:21Z

This is exciting! :) I'd also like to have a quick look. Hopefully get to it this weekend. @jonmmease if you have a moment you might want to have a look as well as you've worked on the interchange protocol implementation and hence might be aware of edge cases we need to consider.

MarcoGorelli · 2024-07-14T19:10:27Z

Thanks @jonmmease for your careful review, really appreciate it! 🙏

But when the "vegafusion" data transformer is enabled, VegaFusion is able to detect which columns are needed and it will request only the required columns when loading into pyarrow (See here).

Aah, that's amazing! I'd missed that.
To address this here, we've made a new release of Narwhals which includes interchange-level support - that is to say, if the input object implements the interchange protocol, then we let you inspect the schema

Like this, parse_shorthand doesn't need to do any data conversion, and so:

for Ibis users using the default data transformers, we fix the Date dtype bug
for Ibis users using the vegafusion data transformer, we preserves the status-quo 😇

(and maybe polars.LazyFrame should also implement the interchange protocol just at the metadata level, but that's one for a separate discussion)

mattijn · 2024-07-15T09:26:49Z

Thanks for the changes @MarcoGorelli!

I still have one question which I cannot really understand. Given the following example:

import datetime
import polars as pl
import pyarrow as pa
import narwhals as nw

data_type_date = [{"date": datetime.date(2024,7,15)}]
local_iso_fmt_string = "%Y-%m-%dT%H:%M:%S"
nw_selector = [nw.col('date').cast(nw.Datetime).dt.to_string(local_iso_fmt_string)]

pl_df = pl.DataFrame(data_type_date)
nw_pl_df = nw.from_native(pl_df, eager_only=True)
nw_pl_df.select(nw_selector).rows(named=True)

This returns

[{'date': '2024-07-15T00:00:00'}]

But if I do:

py_tb = pa.Table.from_pylist(data_type_date)
nw_py_tb = nw.from_native(py_tb, eager_only=True)
nw_py_tb.select(nw_selector).rows(named=True)

It returns

[{'date': '2024-07-15T00:00:00.000000'}]

I expected these output to be equal.

MarcoGorelli · 2024-07-15T09:42:24Z

thanks @mattijn for taking a look!

Libraries differ in how they interpret strftime directives - there's an issue about this in Arrow apache/arrow#20146
We also have a note about this to the Narwhals docs: https://narwhals-dev.github.io/narwhals/api-reference/expressions_dt/#narwhals.expression.ExprDateTimeNamespace.to_string

Practically speaking, if '2024-07-15' gets transformed to

'2024-07-15T00:00:00' for Polars and pandas-like libraries, and
'2024-07-15T00:00:00.000000' for PyArrow

does that cause any issues?

Note that, if the datetime contains fractional seconds, then the outputs match

In [2]: import datetime
   ...: import polars as pl
   ...: import pyarrow as pa
   ...: import narwhals as nw
   ...:
   ...: data_type_date = [{"date": datetime.datetime(2024,7,15,1,2,3,123456)}]
   ...: local_iso_fmt_string = "%Y-%m-%dT%H:%M:%S%.f"
   ...: nw_selector = [nw.col('date').cast(nw.Datetime).dt.to_string(local_iso_fmt_string)]
   ...:
   ...: pl_df = pl.DataFrame(data_type_date)
   ...: nw_pl_df = nw.from_native(pl_df, eager_only=True)
   ...: nw_pl_df.select(nw_selector).rows(named=True)
Out[2]: [{'date': '2024-07-15T01:02:03.123456'}]

In [3]: py_tb = pa.Table.from_pylist(data_type_date)
   ...: nw_py_tb = nw.from_native(py_tb, eager_only=True)
   ...: nw_py_tb.select(nw_selector).rows(named=True)
Out[3]: [{'date': '2024-07-15T01:02:03.123456'}]

mattijn · 2024-07-15T10:04:27Z

Yes for type nw.Datetime it works correctly. The question I have is about the nw.Date type.

One change with your latest comment compared to my comment above is that for me the local_iso_fmt_string does not contain a %.f.

I thought it was an issue with pyarrow, but if I do:

import pyarrow.compute as pc
pc.strftime(py_tb['date'], format=local_iso_fmt_string).to_pylist()

It returns

['2024-07-15T00:00:00']

That seems right. And in the narwhals docs you link to (thanks for the link!) it is mentioned that:

for PyArrow, we replace "%S.%f" with "%S".

That makes me even more confused with your statement:

Practically speaking, if '2024-07-15' gets transformed to

'2024-07-15T00:00:00' for Polars and pandas-like libraries, and

'2024-07-15T00:00:00.000000' for PyArrow

Thanks for answering so promptly! Appreciated.

MarcoGorelli · 2024-07-15T10:09:45Z

The difference is the .cast(nw.Datetime) 😉 If you start with a datetime, then PyArrow's strftime outputs fractional seconds even if we just specify '%S':

In [1]: import pyarrow as pa

In [2]: from datetime import date, datetime

In [3]: import pyarrow.compute as pc

In [4]: pc.strftime(pa.array([datetime(2020, 1, 1)]), '%Y-%m-%dT%H%:%M%:%S')
Out[4]:
<pyarrow.lib.StringArray object at 0x7fd513149cc0>
[
  "2020-01-01T00%:00%:00.000000"
]

So, in summary:

if we're starting with dates, then we cast to datetime and format with '%Y-%m-%dT%H:%M:%S'. PyArrow differs from other libraries here in that it outputs '.000000' too
if we start with datetimes, then we format with '%Y-%m-%dT%H:%M:%S%.f'

mattijn · 2024-07-15T10:14:38Z

Interesting... Let me see if I can find something on this at the pyarrow library. Thanks!

mattijn · 2024-07-15T11:10:39Z

We have to cast the nw.Date type to nw.Datetime only because polars does not support time directives isn't it? Maybe we check upfront if we have to cast or not. @jonmmease, how important is this? Do you have some suggestions here. Its about this piece:

altair/altair/utils/core.py

Lines 451 to 456 in 8b4b3db

    
           if dtype == nw.Date: 
        
               # Polars doesn't allow formatting `Date` with time directives. 
        
               # The date -> datetime cast is extremely fast compared with `to_string` 
        
               columns.append( 
        
                   nw.col(name).cast(nw.Datetime).dt.to_string(local_iso_fmt_string) 
        
               )

MarcoGorelli · 2024-07-15T11:52:17Z

Sure - if you're OK with a teeny-tiny bit of Polars-specific logic, this can work 👍

        if dtype == nw.Date and nw.get_native_namespace(data) is get_polars():
            # Polars doesn't allow formatting `Date` with time directives.
            # The date -> datetime cast is extremely fast compared with `to_string`
            columns.append(
                nw.col(name).cast(nw.Datetime).dt.to_string(local_iso_fmt_string)
            )
        elif dtype == nw.Date:
            columns.append(nw.col(name).dt.to_string(local_iso_fmt_string))
        elif dtype == nw.Datetime:
            columns.append(nw.col(name).dt.to_string(f"{local_iso_fmt_string}%.f"))

Just pushed a commit to do that

jonmmease

To address this here, we've made a new release of Narwhals which includes interchange-level support - that is to say, if the input object implements the interchange protocol, then we let you inspect the schema

Amazing! This looks great, and having the narwhals metadata interface to all DataFrame interchange protocol objects is really nice.

@mattijn I don't the difference in the inclusion of fractional seconds will make any different to Vega in the browser, so I wouldn't worry about it. That said, @MarcoGorelli 's small polars workaround looks fine as well, so I'm happy to keep that since it maintains identical results.

Thanks again for this great PR @MarcoGorelli, and for all of the work you put into narwhals to make it possible (even during the review process!). I'm ready for this to be merged, but want to pass it back over to you one last time in case any of my review comments shake something loose for you.

Please add a comment with a ✅ when you're ready for it to merge.

jonmmease · 2024-07-15T15:16:08Z

altair/utils/core.py

-        and column.describe_categorical["is_ordered"]
-        and column.describe_categorical["categories"] is not None
+        nw.is_ordered_categorical(column)
+        and not (categories := column.cat.get_categories()).is_empty()


Oh, nice. I had missed that dropping Python 3.7 means we can use the walrus operator!

😄 there's a few others if you're brave enough to try out auto-walrus

jonmmease · 2024-07-15T15:24:13Z

altair/utils/core.py

+    if isinstance(data, DataFrameLike):
+        from altair.utils.data import arrow_table_from_dfi_dataframe
+
+        pa_table = arrow_table_from_dfi_dataframe(data)


Not a priority now (and I can't think of any specific scenarios that would benefit), was just curious.

For posterity, this ibis team is in the process of removing the hard pyarrow dependency for some use cases (ibis-project/ibis#9552), but it will still be required for many (all?) specific backends I think.

jonmmease · 2024-07-15T15:25:07Z

altair/utils/core.py

+        # Early return if already a Narwhals DataFrame
+        return data
+    # Using `strict=False` will return `data` as-is if the object cannot be converted.
+    data = nw.from_native(data, eager_only=True, strict=False)


Ok, so they get wrapped by narwhals fine (just as normal pandas DataFrames), but then we use from_native whenever we need to be able to tell the difference between a GeoDataFrame and a regular DataFrame?

jonmmease · 2024-07-15T15:26:50Z

tests/utils/test_dataframe_interchange.py

I was referring to the file name itself. It's currently test_dataframe_interchange.py, and I was suggesting renaming to test_to_values_narwhals.py.

MarcoGorelli · 2024-07-15T15:50:19Z

thanks @jonmmease ! review comments look great to me, much appreciated 🙏 ✅

binste · 2024-07-15T17:46:23Z

Thanks to everyone involved! This is very exciting and we should cut a release soon :) I'll open a new discussion for it.

* refactor(typing): reduce type ignores in `api.py` * feat(typing): adds `_is_undefined` guard Direct copy from another PR, but is helpful here https://github.com/dangotbanned/altair/blob/d607c70824860ef3478d7d3996d26dc516d69369/altair/vegalite/v5/api.py#L504 * fix(typing): `ignore[index]` from #3455 * fix(typing): `[ignore[arg-type]` from #3452 * refactor(typing): use `_is_undefined` in some cases * refactor: reduce code duplication in `Chart.to_dict` * fix(typing): Recover `TopLevelMixin.resolve_` return types I removed these in 6adf564 due to how complex they were getting, but both `mypy` and `pyright` seem happy with this solution * fix(typing): narrow annotated types for `_check_if_valid_subspec` Following investigation during #3480 (comment) * refactor: remove unused `_wrapper_classes` argument from `SchemaBase` Fixes: - #3480 (comment) - #3480 (comment) - #3480 (comment)

MarcoGorelli changed the title ~~WIP: make pandas optional dependency, don't require PyArrow for plotting with Polars/Modin/cuDF~~ WIP: feat: make pandas optional dependency, don't require PyArrow for plotting with Polars/Modin/cuDF Jun 30, 2024

MarcoGorelli changed the title ~~WIP: feat: make pandas optional dependency, don't require PyArrow for plotting with Polars/Modin/cuDF~~ feat: WIP make pandas optional dependency, don't require PyArrow for plotting with Polars/Modin/cuDF Jun 30, 2024

dangotbanned reviewed Jun 30, 2024

View reviewed changes

tools/generate_schema_wrapper.py Outdated Show resolved Hide resolved

dangotbanned reviewed Jun 30, 2024

View reviewed changes

altair/vegalite/v5/schema/channels.py Show resolved Hide resolved

dangotbanned reviewed Jun 30, 2024

View reviewed changes

altair/vegalite/v5/schema/channels.py Show resolved Hide resolved

dangotbanned reviewed Jun 30, 2024

View reviewed changes

tools/generate_schema_wrapper.py Outdated Show resolved Hide resolved

MarcoGorelli mentioned this pull request Jul 2, 2024

Remove PyArrow dependency for Polars support #3445

Closed

MarcoGorelli changed the title ~~feat: WIP make pandas optional dependency, don't require PyArrow for plotting with Polars/Modin/cuDF~~ feat: make pandas optional dependency, don't require PyArrow for plotting with Polars/Modin/cuDF Jul 2, 2024

MarcoGorelli marked this pull request as ready for review July 2, 2024 15:26