@@ -29,13 +29,6 @@ this area.
 production code, we recommended that you take advantage of the optimized
 pandas data access methods exposed in this chapter.
 
-.. warning::
-
-   Whether a copy or a reference is returned for a setting operation, may
-   depend on the context. This is sometimes called ``chained assignment`` and
-   should be avoided. See :ref:`Returning a View versus Copy
-   <indexing.view_versus_copy>`.
-
 See the :ref:`MultiIndex / Advanced Indexing <advanced>` for ``MultiIndex`` and more advanced indexing documentation.
 
 See the :ref:`cookbook<cookbook.selection>` for some advanced strategies.
@@ -299,12 +292,6 @@ largely as a convenience since it is such a common operation.
 Selection by label
 ------------------
 
-.. warning::
-
-   Whether a copy or a reference is returned for a setting operation, may depend on the context.
-   This is sometimes called ``chained assignment`` and should be avoided.
-   See :ref:`Returning a View versus Copy <indexing.view_versus_copy>`.
-
 .. warning::
 
    ``.loc`` is strict when you present slicers that are not compatible (or convertible) with the index type. For example
@@ -445,12 +432,6 @@ For more information about duplicate labels, see
 Selection by position
 ---------------------
 
-.. warning::
-
-   Whether a copy or a reference is returned for a setting operation, may depend on the context.
-   This is sometimes called ``chained assignment`` and should be avoided.
-   See :ref:`Returning a View versus Copy <indexing.view_versus_copy>`.
-
 pandas provides a suite of methods in order to get **purely integer based indexing**. The semantics follow closely Python and NumPy slicing. These are ``0-based`` indexing. When slicing, the start bound is *included*, while the upper bound is *excluded*. Trying to use a non-integer, even a **valid** label will raise an ``IndexError``.
 
 The ``.iloc`` attribute is the primary access method. The following are valid inputs:
@@ -1722,234 +1703,10 @@ You can assign a custom index to the ``index`` attribute:
    df_idx.index = pd.Index([10, 20, 30, 40], name="a")
    df_idx
 
-.. _indexing.view_versus_copy:
-
-Returning a view versus a copy
-------------------------------
-
-.. warning::
-
-    :ref:`Copy-on-Write <copy_on_write>`
-    will become the new default in pandas 3.0. This means that chained indexing will
-    never work. As a consequence, the ``SettingWithCopyWarning`` won't be necessary
-    anymore.
-    See :ref:`this section <copy_on_write_chained_assignment>`
-    for more context.
-    We recommend turning Copy-on-Write on to leverage the improvements with
-
-    ```
-    pd.options.mode.copy_on_write = True
-    ```
-
-    even before pandas 3.0 is available.
-
-When setting values in a pandas object, care must be taken to avoid what is called
-``chained indexing``. Here is an example.
-
-.. ipython:: python
-
-   dfmi = pd.DataFrame([list('abcd'),
-                        list('efgh'),
-                        list('ijkl'),
-                        list('mnop')],
-                       columns=pd.MultiIndex.from_product([['one', 'two'],
-                                                           ['first', 'second']]))
-   dfmi
-
-Compare these two access methods:
-
-.. ipython:: python
-
-   dfmi['one']['second']
-
-.. ipython:: python
-
-   dfmi.loc[:, ('one', 'second')]
-
-These both yield the same results, so which should you use? It is instructive to understand the order
-of operations on these and why method 2 (``.loc``) is much preferred over method 1 (chained ``[]``).
-
-``dfmi['one']`` selects the first level of the columns and returns a DataFrame that is singly-indexed.
-Then another Python operation ``dfmi_with_one['second']`` selects the series indexed by ``'second'``.
-This is indicated by the variable ``dfmi_with_one`` because pandas sees these operations as separate events.
-e.g. separate calls to ``__getitem__``, so it has to treat them as linear operations, they happen one after another.
-
-Contrast this to ``df.loc[:,('one','second')]`` which passes a nested tuple of ``(slice(None),('one','second'))`` to a single call to
-``__getitem__``. This allows pandas to deal with this as a single entity. Furthermore this order of operations *can* be significantly
-faster, and allows one to index *both* axes if so desired.
-
 Why does assignment fail when using chained indexing?
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
-.. warning::
-
-    :ref:`Copy-on-Write <copy_on_write>`
-    will become the new default in pandas 3.0. This means that chained indexing will
-    never work. As a consequence, the ``SettingWithCopyWarning`` won't be necessary
-    anymore.
-    See :ref:`this section <copy_on_write_chained_assignment>`
-    for more context.
-    We recommend turning Copy-on-Write on to leverage the improvements with
-
-    ```
-    pd.options.mode.copy_on_write = True
-    ```
-
-    even before pandas 3.0 is available.
-
-The problem in the previous section is just a performance issue. What's up with
-the ``SettingWithCopy`` warning? We don't **usually** throw warnings around when
-you do something that might cost a few extra milliseconds!
-
-But it turns out that assigning to the product of chained indexing has
-inherently unpredictable results. To see this, think about how the Python
-interpreter executes this code:
-
-.. code-block:: python
-
-   dfmi.loc[:, ('one', 'second')] = value
-   # becomes
-   dfmi.loc.__setitem__((slice(None), ('one', 'second')), value)
-
-But this code is handled differently:
-
-.. code-block:: python
-
-   dfmi['one']['second'] = value
-   # becomes
-   dfmi.__getitem__('one').__setitem__('second', value)
-
-See that ``__getitem__`` in there? Outside of simple cases, it's very hard to
-predict whether it will return a view or a copy (it depends on the memory layout
-of the array, about which pandas makes no guarantees), and therefore whether
-the ``__setitem__`` will modify ``dfmi`` or a temporary object that gets thrown
-out immediately afterward. **That's** what ``SettingWithCopy`` is warning you
-about!
-
-.. note:: You may be wondering whether we should be concerned about the ``loc``
-   property in the first example. But ``dfmi.loc`` is guaranteed to be ``dfmi``
-   itself with modified indexing behavior, so ``dfmi.loc.__getitem__`` /
-   ``dfmi.loc.__setitem__`` operate on ``dfmi`` directly. Of course,
-   ``dfmi.loc.__getitem__(idx)`` may be a view or a copy of ``dfmi``.
-
-Sometimes a ``SettingWithCopy`` warning will arise at times when there's no
-obvious chained indexing going on. **These** are the bugs that
-``SettingWithCopy`` is designed to catch! pandas is probably trying to warn you
-that you've done this:
-
-.. code-block:: python
-
-   def do_something(df):
-       foo = df[['bar', 'baz']]  # Is foo a view? A copy? Nobody knows!
-       # ... many lines here ...
-       # We don't know whether this will modify df or not!
-       foo['quux'] = value
-       return foo
-
-Yikes!
-
-.. _indexing.evaluation_order:
-
-Evaluation order matters
-~~~~~~~~~~~~~~~~~~~~~~~~
-
-.. warning::
-
-    :ref:`Copy-on-Write <copy_on_write>`
-    will become the new default in pandas 3.0. This means than chained indexing will
-    never work. As a consequence, the ``SettingWithCopyWarning`` won't be necessary
-    anymore.
-    See :ref:`this section <copy_on_write_chained_assignment>`
-    for more context.
-    We recommend turning Copy-on-Write on to leverage the improvements with
-
-    ```
-    pd.options.mode.copy_on_write = True
-    ```
-
-    even before pandas 3.0 is available.
-
-When you use chained indexing, the order and type of the indexing operation
-partially determine whether the result is a slice into the original object, or
-a copy of the slice.
-
-pandas has the ``SettingWithCopyWarning`` because assigning to a copy of a
-slice is frequently not intentional, but a mistake caused by chained indexing
-returning a copy where a slice was expected.
-
-If you would like pandas to be more or less trusting about assignment to a
-chained indexing expression, you can set the :ref:`option <options>`
-``mode.chained_assignment`` to one of these values:
-
-* ``'warn'``, the default, means a ``SettingWithCopyWarning`` is printed.
-* ``'raise'`` means pandas will raise a ``SettingWithCopyError``
-  you have to deal with.
-* ``None`` will suppress the warnings entirely.
-
-.. ipython:: python
-   :okwarning:
-
-   dfb = pd.DataFrame({'a': ['one', 'one', 'two',
-                             'three', 'two', 'one', 'six'],
-                       'c': np.arange(7)})
-
-   # This will show the SettingWithCopyWarning
-   # but the frame values will be set
-   dfb['c'][dfb['a'].str.startswith('o')] = 42
-
-This however is operating on a copy and will not work.
-
-.. ipython:: python
-   :okwarning:
-   :okexcept:
-
-   with pd.option_context('mode.chained_assignment','warn'):
-       dfb[dfb['a'].str.startswith('o')]['c'] = 42
-
-A chained assignment can also crop up in setting in a mixed dtype frame.
-
-.. note::
-
-   These setting rules apply to all of ``.loc/.iloc``.
-
-The following is the recommended access method using ``.loc`` for multiple items (using ``mask``) and a single item using a fixed index:
-
-.. ipython:: python
-
-   dfc = pd.DataFrame({'a': ['one', 'one', 'two',
-                             'three', 'two', 'one', 'six'],
-                       'c': np.arange(7)})
-   dfd = dfc.copy()
-   # Setting multiple items using a mask
-   mask = dfd['a'].str.startswith('o')
-   dfd.loc[mask, 'c'] = 42
-   dfd
-
-   # Setting a single item
-   dfd = dfc.copy()
-   dfd.loc[2, 'a'] = 11
-   dfd
-
-The following *can* work at times, but it is not guaranteed to, and therefore should be avoided:
-
-.. ipython:: python
-   :okwarning:
-
-   dfd = dfc.copy()
-   dfd['a'][2] = 111
-   dfd
-
-Last, the subsequent example will **not** work at all, and so should be avoided:
-
-.. ipython:: python
-   :okwarning:
-   :okexcept:
-
-   with pd.option_context('mode.chained_assignment','raise'):
-       dfd.loc[0]['a'] = 1111
-
-.. warning::
-
-   The chained assignment warnings / exceptions are aiming to inform the user of a possibly invalid
-   assignment. There may be false positives; situations where a chained assignment is inadvertently
-   reported.
+:ref:`Copy-on-Write <copy_on_write>` is the new default with pandas 3.0.
+This means that chained indexing will never work.
+See :ref:`this section <copy_on_write_chained_assignment>`
+for more context.
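
As a minimal sketch of what the added note implies (the frame is borrowed from the removed example, and nothing below is part of the patch itself), a single ``.loc`` call performs the assignment in one ``__setitem__`` on the frame, so the result does not depend on whether an intermediate selection is a view or a copy:

```python
import numpy as np
import pandas as pd

# Sample frame taken from the removed documentation example.
dfb = pd.DataFrame({
    "a": ["one", "one", "two", "three", "two", "one", "six"],
    "c": np.arange(7),
})

# Boolean mask selecting the rows whose "a" value starts with "o".
mask = dfb["a"].str.startswith("o")

# One indexing call on dfb itself: rows and column are chosen together,
# so the assignment updates dfb rather than a temporary object.
dfb.loc[mask, "c"] = 42

print(dfb)
```

The chained spelling ``dfb["c"][mask] = 42`` is deliberately not shown as working code, since the note above says it will not update ``dfb`` under Copy-on-Write.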