Closed
Description
Pandas version checks
- I have checked that this issue has not already been reported.
- I have confirmed this bug exists on the latest version of pandas.
- I have confirmed this bug exists on the main branch of pandas.
Reproducible Example
$ pip install numexpr==2.8.5
import numpy as np
import pandas as pd
df = pd.DataFrame(np.random.randint(0, 100, size=(100, 4)), columns=list('ABCD'))
a = 8
df.query("A == @a", engine="numexpr")
Issue Description
Traceback (most recent call last):
  File "C:\Users\QW664QA\Miniconda3\lib\site-packages\IPython\core\interactiveshell.py", line 3433, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-34-7f773e498449>", line 1, in <module>
    df.query("A == @a", engine="numexpr")
  File "C:\Users\QW664QA\Miniconda3\lib\site-packages\pandas\core\frame.py", line 4060, in query
    res = self.eval(expr, **kwargs)
  File "C:\Users\QW664QA\Miniconda3\lib\site-packages\pandas\core\frame.py", line 4191, in eval
    return _eval(expr, inplace=inplace, **kwargs)
  File "C:\Users\QW664QA\Miniconda3\lib\site-packages\pandas\core\computation\eval.py", line 353, in eval
    ret = eng_inst.evaluate()
  File "C:\Users\QW664QA\Miniconda3\lib\site-packages\pandas\core\computation\engines.py", line 80, in evaluate
    res = self._evaluate()
  File "C:\Users\QW664QA\Miniconda3\lib\site-packages\pandas\core\computation\engines.py", line 121, in _evaluate
    return ne.evaluate(s, local_dict=scope)
  File "C:\Users\QW664QA\Miniconda3\lib\site-packages\numexpr\necompiler.py", line 943, in evaluate
    raise e
  File "C:\Users\QW664QA\Miniconda3\lib\site-packages\numexpr\necompiler.py", line 851, in validate
    _names_cache[expr_key] = getExprNames(ex, context)
  File "C:\Users\QW664QA\Miniconda3\lib\site-packages\numexpr\necompiler.py", line 714, in getExprNames
    ex = stringToExpression(text, {}, context)
  File "C:\Users\QW664QA\Miniconda3\lib\site-packages\numexpr\necompiler.py", line 274, in stringToExpression
    raise ValueError(f'Expression {s} has forbidden control characters.')
ValueError: Expression (A) == (__pd_eval_local_a) has forbidden control characters.
So this is actually an issue with numexpr release 2.8.5, which went live on Sunday 6th August 2023:
- As an addendum to the use of NumExpr for parsing user inputs, is that NumExpr calls eval on the inputs. A regular expression is now applied to help sanitize the input expression string, forbidding '__', ':', and ';'. Attribute access
Not sure if this qualifies as a bug over there, but it breaks pandas if you have numexpr==2.8.5 installed.
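To illustrate why the rewritten expression in the traceback gets rejected, here is a minimal sketch of a sanitizer that forbids the three tokens named in the release note. The pattern `FORBIDDEN` and the helper `has_forbidden_tokens` are hypothetical stand-ins written for this example, not numexpr's actual regular expression or API.

```python
import re

# Hypothetical pattern: rejects '__', ':' and ';' as listed in the release note.
# This is NOT numexpr's real regex, only an approximation for illustration.
FORBIDDEN = re.compile(r"__|:|;")


def has_forbidden_tokens(expr: str) -> bool:
    """Return True if the expression contains any of the forbidden tokens."""
    return FORBIDDEN.search(expr) is not None


# The expression string shown in the ValueError above: the "@a" reference
# has become a name starting with a double underscore.
rewritten = "(A) == (__pd_eval_local_a)"

# An illustrative expression with a literal and no "@" variable.
literal = "(A) == (8)"

print(has_forbidden_tokens(rewritten))
print(has_forbidden_tokens(literal))
```

Under this assumption, any name with the `__pd_eval_local_` prefix would trip the check regardless of what the query itself does.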
Expected Behavior
df.query("A == 8", engine="numexpr") correctly queries the df and produces a valid response. So the issue is with using @ variables in the query, which get turned into those dunder-prefixed names (e.g. __pd_eval_local_a), although I guess it may manifest elsewhere.
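As a possible stopgap while numexpr 2.8.5 is installed, the same query can be run with `engine="python"`, which does not hand the expression to numexpr. The sketch below is untested against the exact versions listed in this report; `query_equal` is a hypothetical helper name, and the seeded random data is only there to make the example reproducible.

```python
import numpy as np
import pandas as pd


def query_equal(df: pd.DataFrame, value: int) -> pd.DataFrame:
    """Filter rows where column A equals `value`, bypassing numexpr."""
    # "@value" refers to the local variable; engine="python" evaluates the
    # expression without calling numexpr.
    return df.query("A == @value", engine="python")


rng = np.random.default_rng(0)
df = pd.DataFrame(rng.integers(0, 100, size=(100, 4)), columns=list("ABCD"))

result = query_equal(df, 8)

# Plain boolean indexing gives the same rows and avoids query() entirely.
expected = df[df["A"] == 8]
print(result.equals(expected))
```

Pinning numexpr below 2.8.5 is another option, at the cost of not getting the sanitization described in that release.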
Installed Versions
INSTALLED VERSIONS
------------------
commit : 66e3805
python : 3.9.12.final.0
python-bits : 64
OS : Windows
OS-release : 10
Version : 10.0.19044
machine : AMD64
processor : Intel64 Family 6 Model 140 Stepping 1, GenuineIntel
byteorder : little
LC_ALL : None
LANG : None
LOCALE : English_United Kingdom.1252
pandas : 1.3.5
numpy : 1.24.3
pytz : 2021.3
dateutil : 2.8.2
pip : 21.2.4
setuptools : 61.2.0
Cython : 3.0.0
pytest : None
hypothesis : None
sphinx : 6.1.3
blosc : None
feather : None
xlsxwriter : 3.0.3
lxml.etree : 4.9.1
html5lib : None
pymysql : None
psycopg2 : None
jinja2 : 3.1.2
IPython : 8.7.0
pandas_datareader: 0.10.0
bs4 : 4.11.1
bottleneck : None
fsspec : None
fastparquet : None
gcsfs : None
matplotlib : 3.6.2
numexpr : 2.8.5
odfpy : None
openpyxl : 3.0.10
pandas_gbq : None
pyarrow : 11.0.0
pyxlsb : None
s3fs : None
scipy : 1.9.3
sqlalchemy : None
tables : None
tabulate : 0.9.0
xarray : None
xlrd : 2.0.1
xlwt : None
numba : None