Commit 15f5265

REF: DataFrame._setitem_array dont use iloc.__setitem__
2 parents a5c1f5e + 0b16fb3 commit 15f5265

92 files changed

+2613
-2081
lines changed

asv_bench/benchmarks/series_methods.py

Lines changed: 18 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -284,16 +284,29 @@ def time_clip(self, n):
284284

285285
class ValueCounts:
286286

287-
params = ["int", "uint", "float", "object"]
288-
param_names = ["dtype"]
287+
params = [[10 ** 3, 10 ** 4, 10 ** 5], ["int", "uint", "float", "object"]]
288+
param_names = ["N", "dtype"]
289289

290-
def setup(self, dtype):
291-
self.s = Series(np.random.randint(0, 1000, size=100000)).astype(dtype)
290+
def setup(self, N, dtype):
291+
self.s = Series(np.random.randint(0, N, size=10 * N)).astype(dtype)
292292

293-
def time_value_counts(self, dtype):
293+
def time_value_counts(self, N, dtype):
294294
self.s.value_counts()
295295

296296

297+
class Mode:
298+
299+
params = [[10 ** 3, 10 ** 4, 10 ** 5], ["int", "uint", "float", "object"]]
300+
param_names = ["N", "dtype"]
301+
302+
def setup(self, N, dtype):
303+
np.random.seed(42)
304+
self.s = Series(np.random.randint(0, N, size=10 * N)).astype(dtype)
305+
306+
def time_mode(self, N, dtype):
307+
self.s.mode()
308+
309+
297310
class Dir:
298311
def setup(self):
299312
self.s = Series(index=tm.makeStringIndex(10000))
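The parametrized `ValueCounts` and `Mode` benchmarks above can be approximated outside of asv with a small timing loop. This is only a rough sketch: `build_series` is a hypothetical helper introduced here, it covers a single value of `N`, and it uses `timeit` rather than asv's own timing machinery.

```python
import timeit

import numpy as np
import pandas as pd


def build_series(n, dtype, seed=42):
    """Hypothetical helper mirroring the benchmark setup: 10 * n draws from [0, n)."""
    rng = np.random.RandomState(seed)
    return pd.Series(rng.randint(0, n, size=10 * n)).astype(dtype)


N = 10 ** 3
for dtype in ["int", "uint", "float", "object"]:
    s = build_series(N, dtype)
    # Time a handful of calls; asv would repeat and aggregate more carefully.
    t_counts = timeit.timeit(s.value_counts, number=3)
    t_mode = timeit.timeit(s.mode, number=3)
    print(f"{dtype:>6}: value_counts={t_counts:.4f}s mode={t_mode:.4f}s")
```

Scaling the length as `10 * N` keeps roughly ten repeats per distinct value, so the cost of the hash-based counting can be compared across sizes.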

ci/deps/azure-37.yaml

Lines changed: 1 addition & 1 deletion
@@ -18,7 +18,7 @@ dependencies:
1818
- numpy
1919
- python-dateutil
2020
- nomkl
21-
- pyarrow
21+
- pyarrow=0.15.1
2222
- pytz
2323
- s3fs>=0.4.0
2424
- moto>=1.3.14

ci/deps/azure-macos-37.yaml

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -21,7 +21,7 @@ dependencies:
2121
- numexpr
2222
- numpy=1.16.5
2323
- openpyxl
24-
- pyarrow>=0.15.0
24+
- pyarrow=0.15.1
2525
- pytables
2626
- python-dateutil==2.7.3
2727
- pytz

doc/source/user_guide/visualization.rst

Lines changed: 18 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -552,6 +552,9 @@ These can be specified by the ``x`` and ``y`` keywords.
552552
.. ipython:: python
553553
554554
df = pd.DataFrame(np.random.rand(50, 4), columns=["a", "b", "c", "d"])
555+
df["species"] = pd.Categorical(
556+
["setosa"] * 20 + ["versicolor"] * 20 + ["virginica"] * 10
557+
)
555558
556559
@savefig scatter_plot.png
557560
df.plot.scatter(x="a", y="b");
@@ -579,6 +582,21 @@ each point:
579582
df.plot.scatter(x="a", y="b", c="c", s=50);
580583
581584
585+
.. ipython:: python
586+
:suppress:
587+
588+
plt.close("all")
589+
590+
If a categorical column is passed to ``c``, then a discrete colorbar will be produced:
591+
592+
.. versionadded:: 1.3.0
593+
594+
.. ipython:: python
595+
596+
@savefig scatter_plot_categorical.png
597+
df.plot.scatter(x="a", y="b", c="species", cmap="viridis", s=50);
598+
599+
582600
.. ipython:: python
583601
:suppress:
584602
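As a rough illustration of what a discrete colorbar needs from the categorical ``species`` column added in this diff, the sketch below inspects the integer codes and the category labels of that column. It assumes, without this diff showing it, that the plotting code colors points by category code; no figure is drawn here.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.rand(50, 4), columns=["a", "b", "c", "d"])
df["species"] = pd.Categorical(
    ["setosa"] * 20 + ["versicolor"] * 20 + ["virginica"] * 10
)

# One integer code per row and one label per category: enough to place
# one colorbar tick per category (assumption about how the plot is built).
codes = df["species"].cat.codes
labels = list(df["species"].cat.categories)

print(labels)
print(codes.value_counts().sort_index().tolist())
```

With matplotlib installed, the call shown in the diff, ``df.plot.scatter(x="a", y="b", c="species", cmap="viridis", s=50)``, is the intended entry point.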

doc/source/whatsnew/v0.8.0.rst

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -176,7 +176,7 @@ New plotting methods
176176
Vytautas Jancauskas, the 2012 GSOC participant, has added many new plot
177177
types. For example, ``'kde'`` is a new option:
178178

179-
.. code-block:: python
179+
.. ipython:: python
180180
181181
s = pd.Series(
182182
np.concatenate((np.random.randn(1000), np.random.randn(1000) * 0.5 + 3))

doc/source/whatsnew/v1.2.2.rst

Lines changed: 3 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -14,7 +14,10 @@ including other versions of pandas.
1414

1515
Fixed regressions
1616
~~~~~~~~~~~~~~~~~
17+
- Fixed regression in :class:`DataFrame` constructor reordering elements when constructing from a datetime ndarray with dtype other than ``"datetime64[ns]"`` (:issue:`39422`)
1718
- Fixed regression in :meth:`~DataFrame.to_pickle` failing to create bz2/xz compressed pickle files with ``protocol=5`` (:issue:`39002`)
19+
- Fixed regression in :func:`pandas.testing.assert_series_equal` and :func:`pandas.testing.assert_frame_equal` always raising ``AssertionError`` when comparing extension dtypes (:issue:`39410`)
20+
- Fixed regression in :meth:`~DataFrame.to_csv` opening ``codecs.StreamWriter`` in binary mode instead of in text mode and ignoring user-provided ``mode`` (:issue:`39247`)
1821
-
1922

2023
.. ---------------------------------------------------------------------------

doc/source/whatsnew/v1.3.0.rst

Lines changed: 49 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -52,7 +52,9 @@ Other enhancements
5252
- :meth:`DataFrame.apply` can now accept NumPy unary operators as strings, e.g. ``df.apply("sqrt")``, which was already the case for :meth:`Series.apply` (:issue:`39116`)
5353
- :meth:`DataFrame.apply` can now accept non-callable DataFrame properties as strings, e.g. ``df.apply("size")``, which was already the case for :meth:`Series.apply` (:issue:`39116`)
5454
- :meth:`Series.apply` can now accept list-like or dictionary-like arguments that aren't lists or dictionaries, e.g. ``ser.apply(np.array(["sum", "mean"]))``, which was already the case for :meth:`DataFrame.apply` (:issue:`39140`)
55+
- :meth:`DataFrame.plot.scatter` can now accept a categorical column as the argument to ``c`` (:issue:`12380`, :issue:`31357`)
5556
- :meth:`.Styler.set_tooltips` allows on-hover tooltips to be added to styled HTML dataframes.
57+
- :meth:`Series.loc.__getitem__` and :meth:`Series.loc.__setitem__` with :class:`MultiIndex` now raise a helpful error message when the indexer has too many dimensions (:issue:`35349`)
5658

5759
.. ---------------------------------------------------------------------------
5860
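The string form of :meth:`DataFrame.apply` mentioned in the enhancements above can be tried with a minimal sketch such as the following. The frame and its column names are made up for illustration, and the example assumes a pandas version that includes this enhancement.

```python
import numpy as np
import pandas as pd

# Illustrative data only; any numeric frame would do.
df = pd.DataFrame({"x": [1.0, 4.0, 9.0], "y": [16.0, 25.0, 36.0]})

# "sqrt" is not a DataFrame method, so the string is resolved to the NumPy ufunc.
roots = df.apply("sqrt")

# The same spelling was already accepted by Series.apply.
roots_x = df["x"].apply("sqrt")

print(roots)
print(roots_x)
```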
@@ -95,6 +97,45 @@ Preserve dtypes in :meth:`~pandas.DataFrame.combine_first`
9597
combined.dtypes
9698
9799
100+
.. _whatsnew_130.notable_bug_fixes.setitem_with_bool_casting:
101+
102+
Consistent Casting With Setting Into Boolean Series
103+
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
104+
105+
Setting non-boolean values into a :class:`Series` with ``dtype=bool`` now consistently
106+
casts to ``dtype=object`` (:issue:`38709`)
107+
108+
.. ipython:: python
109+
110+
orig = pd.Series([True, False])
111+
ser = orig.copy()
112+
ser.iloc[1] = np.nan
113+
ser2 = orig.copy()
114+
ser2.iloc[1] = 2.0
115+
116+
*pandas 1.2.x*
117+
118+
.. code-block:: ipython
119+
120+
In [1]: ser
121+
Out[1]:
122+
0 1.0
123+
1 NaN
124+
dtype: float64
125+
126+
In [2]: ser2
127+
Out[2]:
128+
0 True
129+
1 2.0
130+
dtype: object
131+
132+
*pandas 1.3.0*
133+
134+
.. ipython:: python
135+
136+
ser
137+
ser2
138+
98139
.. _whatsnew_130.api_breaking.deps:
99140

100141
Increased minimum versions for dependencies
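To check how an installed pandas handles the boolean-Series casting described in the "Consistent Casting With Setting Into Boolean Series" note in this hunk, a small probe like the one below may help. It is a sketch, not part of the commit: `dtype_after_set` is a hypothetical helper, and because later pandas releases may warn or raise on such assignments instead of upcasting, it reports whichever outcome occurs rather than assuming one.

```python
import warnings

import numpy as np
import pandas as pd


def dtype_after_set(value):
    """Hypothetical helper: set `value` into a bool Series and report the outcome."""
    ser = pd.Series([True, False])
    try:
        with warnings.catch_warnings():
            # Some versions emit a warning before upcasting; silence it here.
            warnings.simplefilter("ignore")
            ser.iloc[1] = value
    except Exception as exc:
        return f"raised {type(exc).__name__}"
    return str(ser.dtype)


for value in (True, np.nan, 2.0):
    print(repr(value), "->", dtype_after_set(value))
```

According to the note above, pandas 1.3.0 should report ``object`` for both ``np.nan`` and ``2.0``, while 1.2.x reported ``float64`` for the ``np.nan`` case.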
@@ -242,7 +283,7 @@ Datetimelike
242283
Timedelta
243284
^^^^^^^^^
244285
- Bug in constructing :class:`Timedelta` from ``np.timedelta64`` objects with non-nanosecond units that are out of bounds for ``timedelta64[ns]`` (:issue:`38965`)
245-
-
286+
- Bug in constructing a :class:`TimedeltaIndex` incorrectly accepting ``np.datetime64("NaT")`` objects (:issue:`39462`)
246287
-
247288

248289
Timezones
@@ -289,11 +330,14 @@ Indexing
289330
- Bug in :meth:`DataFrame.loc.__setitem__` raising ValueError when expanding unique column for :class:`DataFrame` with duplicate columns (:issue:`38521`)
290331
- Bug in :meth:`DataFrame.iloc.__setitem__` and :meth:`DataFrame.loc.__setitem__` with mixed dtypes when setting with a dictionary value (:issue:`38335`)
291332
- Bug in :meth:`DataFrame.__setitem__` not raising ``ValueError`` when right hand side is a :class:`DataFrame` with wrong number of columns (:issue:`38604`)
333+
- Bug in :meth:`Series.__setitem__` raising ``ValueError`` when setting a :class:`Series` with a scalar indexer (:issue:`38303`)
292334
- Bug in :meth:`DataFrame.loc` dropping levels of :class:`MultiIndex` when :class:`DataFrame` used as input has only one row (:issue:`10521`)
293335
- Bug in :meth:`DataFrame.__getitem__` and :meth:`Series.__getitem__` always raising ``KeyError`` when slicing with existing strings on an :class:`Index` with milliseconds (:issue:`33589`)
294336
- Bug in setting ``timedelta64`` values into numeric :class:`Series` failing to cast to object dtype (:issue:`39086`)
295337
- Bug in setting :class:`Interval` values into a :class:`Series` or :class:`DataFrame` with mismatched :class:`IntervalDtype` incorrectly casting the new values to the existing dtype (:issue:`39120`)
338+
- Bug in setting ``datetime64`` values into a :class:`Series` with integer dtype incorrectly casting the ``datetime64`` values to integers (:issue:`39266`)
296339
- Bug in incorrectly raising in :meth:`Index.insert`, when setting a new column that cannot be held in the existing ``frame.columns``, or in :meth:`Series.reset_index` or :meth:`DataFrame.reset_index` instead of casting to a compatible dtype (:issue:`39068`)
340+
- Bug in :meth:`RangeIndex.append` where a single object of length 1 was concatenated incorrectly (:issue:`39401`)
297341

298342
Missing
299343
^^^^^^^
@@ -325,11 +369,13 @@ I/O
325369
- Bug in :func:`to_hdf` raising ``KeyError`` when trying to apply for subclasses of ``DataFrame`` or ``Series`` (:issue:`33748`)
326370
- Bug in :meth:`~HDFStore.put` raising a wrong ``TypeError`` when saving a DataFrame with non-string dtype (:issue:`34274`)
327371
- Bug in :func:`json_normalize` resulting in the first element of a generator object not being included in the returned ``DataFrame`` (:issue:`35923`)
372+
- Bug in :func:`read_csv` applying the thousands separator to date columns when the column should be parsed for dates and ``usecols`` is specified for ``engine="python"`` (:issue:`39365`)
328373
- Bug in :func:`read_excel` forward filling :class:`MultiIndex` names with multiple header and index columns specified (:issue:`34673`)
329374
- :func:`read_excel` now respects :func:`set_option` (:issue:`34252`)
330375
- Bug in :func:`read_csv` not switching ``true_values`` and ``false_values`` for nullable ``boolean`` dtype (:issue:`34655`)
331376
- Bug in :func:`read_json` when ``orient="split"`` does not maintain numeric string index (:issue:`28556`)
332377
- :meth:`read_sql` returned an empty generator if ``chunksize`` was non-zero and the query returned no results. It now returns a generator with a single empty dataframe (:issue:`34411`)
378+
- Bug in :func:`read_hdf` returning unexpected records when filtering on categorical string columns using ``where`` parameter (:issue:`39189`)
333379

334380
Period
335381
^^^^^^
@@ -341,7 +387,7 @@ Plotting
341387
^^^^^^^^
342388

343389
- Bug in :func:`scatter_matrix` raising when 2d ``ax`` argument passed (:issue:`16253`)
344-
-
390+
- Prevent warnings when matplotlib's ``constrained_layout`` is enabled (:issue:`25261`)
345391
-
346392

347393
Groupby/resample/rolling
@@ -363,7 +409,7 @@ Reshaping
363409
- Bug in :func:`join` over :class:`MultiIndex` returning the wrong result when one of the two indexes had only one level (:issue:`36909`)
364410
- :meth:`merge_asof` raises ``ValueError`` instead of cryptic ``TypeError`` in case of non-numerical merge columns (:issue:`29130`)
365411
- Bug in :meth:`DataFrame.join` not assigning values correctly when having a :class:`MultiIndex` where at least one dimension is of dtype ``Categorical`` with non-alphabetically sorted categories (:issue:`38502`)
366-
- :meth:`Series.value_counts` returns keys in original order (:issue:`12679`, :issue:`11227`)
412+
- :meth:`Series.value_counts` and :meth:`Series.mode` return consistent keys in original order (:issue:`12679`, :issue:`11227` and :issue:`39007`)
367413
- Bug in :meth:`DataFrame.apply` giving incorrect results when used with a string argument and ``axis=1`` where the axis argument was not supported; it now raises a ``ValueError`` instead (:issue:`39211`)
368414
-
369415
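The ordering change for :meth:`Series.value_counts` noted in the Reshaping entries above can be observed with a short sketch like this one. It assumes pandas 1.3 or later and passes ``sort=False`` so that only the order of first appearance is visible, not the sort by count; the data is made up for illustration.

```python
import pandas as pd

# Values first appear in the order 3, 1, 2.
s = pd.Series([3, 1, 3, 2, 1, 3])

# With sort=False the keys are not reordered by count, so they should
# follow the order in which each value first appears.
counts = s.value_counts(sort=False)

print(list(counts.index))
print(counts.tolist())
```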
