Description
Code Sample, a copy-pastable example if possible
import pandas as pd
df = pd.DataFrame(index=[1, 2], columns=list('ab'))
# Single assignment on existing column
df.loc[1, 'a'] = 1
# Multiple assignment on existing column
df.loc[2, ['a', 'b']] = [1, 2]
# Single assignment on new column
df.loc[3, 'c'] = 3
# Multiple assignment on single new column
df.loc[[1, 2], 'l'] = [6, 7]
# Multiple assignment on multiple new columns
# Surprisingly fails
try:
    df.loc[3, ['d', 'e']] = [6, 7]
    print(df)
except KeyError:
    print("Error")
Problem description
The behaviour of loc is surprising:
- When assigning, new columns and index labels are created as required.
- You can assign to multiple index labels and columns at once.
- BUT, if you try to create columns or index labels while doing a multiple assignment (for example, df.loc[3, ['d', 'e']] = [6, 7]), they are NOT created as required, and you get a KeyError.
Furthermore, using the indexing operator [] directly will create the required columns.
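Until loc handles this case, one possible workaround is to add the missing labels before assigning, so that loc only has to set values and never has to enlarge the frame. The sketch below is an assumption on my part, not an official recommendation: the names new_rows and new_cols are hypothetical, and it relies only on DataFrame.reindex, which fills newly created labels with NaN.

```python
import pandas as pd

df = pd.DataFrame(index=[1, 2], columns=list("ab"))

# Hypothetical names for the labels that the multiple assignment needs.
new_rows = [3]
new_cols = ["d", "e"]

# Add any missing labels up front; reindex fills the new cells with NaN.
index = list(df.index) + [r for r in new_rows if r not in df.index]
columns = list(df.columns) + [c for c in new_cols if c not in df.columns]
df = df.reindex(index=index, columns=columns)

# Every label now exists, so this is a plain assignment with no enlargement.
df.loc[3, ["d", "e"]] = [6, 7]
print(df)
```

This costs one extra copy of the frame per reindex call, so it is better suited to occasional assignments than to a tight loop.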
Expected Output
     a    b    c    l    d    e
1    1  NaN  NaN  6.0  NaN  NaN
2    1    2  NaN  7.0  NaN  NaN
3  NaN  NaN  3.0  NaN  6.0  7.0
Output of pd.show_versions()
INSTALLED VERSIONS
commit: None
python: 3.7.2.final.0
python-bits: 64
OS: Darwin
OS-release: 18.2.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: en_US.UTF-8
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
pandas: 0.24.1
pytest: 4.2.0
pip: 19.0.2
setuptools: 40.8.0
Cython: 0.28.5
numpy: 1.15.2
scipy: 1.2.0
pyarrow: None
xarray: None
IPython: 7.0.1
sphinx: 1.8.1
patsy: None
dateutil: 2.7.3
pytz: 2018.5
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: 3.0.2
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml.etree: 4.2.5
bs4: None
html5lib: 1.0.1
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: 2.10
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None
gcsfs: None