PERF: pd.Series.map too slow for huge dictionary #34717

Closed
@charlesdong1991

Description

Just found a performance issue with pd.Series.map: it is very slow when the input is a huge dictionary.

I noticed a similar issue reported before: #21278. Indeed, for Series input the first run may be slow, but later runs are fast because a hashtable index is built. However, this does not seem to apply to dict input.

I slightly changed the example from #21278, and the runtime does not improve when it is run multiple times. It is also much faster to use apply with dict.get.

So I am curious whether this performance issue is known. I would expect pd.Series.map with a dict to perform about as well as pd.Series.apply(lambda x: ...).

import numpy as np
import pandas as pd

n = 1000000
domain = np.arange(0, n)
ranges = domain + 10
maptable = pd.Series(ranges, index=domain).sort_index().to_dict()
query_vals = pd.Series([1, 2, 3])

%timeit query_vals.map(maptable)

while it is much faster to do the following:

query_vals.apply(lambda x: maptable.get(x))
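One possible workaround, sketched here based on the hashtable-caching behavior noted in #21278, is to convert the dict to a pd.Series once and reuse that Series for every map call, so the per-call dict-to-Series conversion is avoided (the variable name lookup is mine, not from the issue):

```python
import numpy as np
import pandas as pd

n = 1000000
domain = np.arange(0, n)
maptable = pd.Series(domain + 10, index=domain).sort_index().to_dict()
query_vals = pd.Series([1, 2, 3])

# Build the Series once; repeated map() calls can then reuse its
# hashtable-backed index instead of rebuilding it from the dict.
lookup = pd.Series(maptable)
result = query_vals.map(lookup)
```

This should give the same values as query_vals.map(maptable), just without paying the conversion cost on every call.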

Labels: Performance (Memory or execution speed performance)