Conversation
Reviewer's GuideReplaces unsafe eval-based expression evaluation with a restricted AST interpreter, removes eval from API docs generation by using an explicit whitelist of known globals, and tightens template attribute/call access via an allowlist while also updating ignore rules. Class diagram for restricted AST-based evaluator in template moduleclassDiagram
class Sub {
+mapping
+eval(mapping)
+__str__()
}
class RepEval {
+expression
+mapping
+eval(mapping) tp.Any
+__str__() str
}
class TEMPLATE_ALLOWED_ATTRS {
<<module_constant>>
+np : prod
%% original type/value notation: {prod}
}
class ast {
<<external_module>>
}
class merge_dicts {
<<function>>
+merge_dicts(base_mapping, override_mapping) Mapping
}
RepEval ..> ast : uses
RepEval ..> TEMPLATE_ALLOWED_ATTRS : consults
RepEval ..> merge_dicts : calls
RepEval : +_eval_node(node) tp.Any
RepEval : supports ast.Constant
RepEval : supports ast.Name
RepEval : supports ast.BinOp (Add, Sub, Mult, Div, FloorDiv, Mod, Pow)
RepEval : supports ast.UnaryOp (USub, UAdd, Not)
RepEval : supports ast.BoolOp (And, Or)
RepEval : supports ast.Compare (Eq, NotEq, Lt, LtE, Gt, GtE)
RepEval : supports ast.Attribute (only allowed attrs on mapped names)
RepEval : supports ast.Call (only allowed mapped attributes)
RepEval : supports ast.Subscript (indices, slices, tuples)
RepEval : supports ast.List, ast.Tuple, ast.Dict
RepEval : raises ValueError on unsupported nodes
Flow diagram for restricted AST-based expression evaluationflowchart TD
A[RepEval.eval called] --> B[Merge default mapping and provided mapping using merge_dicts]
B --> C[Parse expression with ast.parse in eval mode]
C --> D[Call _eval_node on parsed.body]
D --> E{Node type}
E --> F[ast.Constant
return node.value]
E --> G[ast.Name
lookup in mapping or raise NameError]
E --> H[ast.BinOp
recursively eval left and right
apply supported arithmetic op or raise ValueError]
E --> I[ast.UnaryOp
recursively eval operand
apply USub, UAdd, Not or raise ValueError]
E --> J[ast.BoolOp
recursively eval values
apply And/Or or raise ValueError]
E --> K[ast.Compare
recursively eval chain of comparisons
support Eq, NotEq, Lt, LtE, Gt, GtE]
E --> L[ast.Attribute
only allow attribute on top level mapped name
check TEMPLATE_ALLOWED_ATTRS
getattr or raise error]
E --> M[ast.Call
only allow call of allowed attribute on mapped name
resolve function
recursively eval args and kwargs
call or raise error]
E --> N[ast.Subscript
recursively eval value and slice
handle Slice, Tuple, or single index]
E --> O[ast.List
recursively eval elements and return list]
E --> P[ast.Tuple
recursively eval elements and return tuple]
E --> Q[ast.Dict
recursively eval keys and values and return dict]
E --> R[Other node type
raise ValueError unsupported expression]
F --> S[Return to caller]
G --> S
H --> S
I --> S
J --> S
K --> S
L --> S
M --> S
N --> S
O --> S
P --> S
Q --> S
R --> S
Flow diagram for safe_default_value global resolution without evalflowchart TD
A[safe_default_value called with parameter p] --> B[Get default value from p]
B --> C{value is inspect.Parameter.empty?}
C -- Yes --> D[Return p unchanged]
C -- No --> E[Build _safe_globals map
os.environ, sys.stdin, sys.stdout, sys.stderr]
E --> F[Iterate over _safe_globals items
find name where obj is value]
F --> G{Found matching object?}
G -- Yes --> H[Set replacement to matching name string]
G -- No --> I[If value is CPUDispatcher use value.py_func.__name__
else leave replacement as None]
H --> J[Return parameter info with safe replacement]
I --> J[Return parameter info with possibly transformed default]
File-Level Changes
Tips and commandsInteracting with Sourcery
Customizing Your ExperienceAccess your dashboard to:
Getting Help
|
There was a problem hiding this comment.
Hey - I've found 1 issue, and left some high level feedback:
- The
ast.Subscripthandling currently setsidx = Noneforast.Slice, which will attemptval[None]; either implement proper slice evaluation (start, stop, step) or explicitly raise for slice subscripts to avoid surprising runtime errors. - The new evaluator allows arbitrary attribute access (
ast.Attribute) and method calls on any mapped object, which can still expose unsafe surface area depending on what is placed inmapping; consider constraining allowed attributes/callables via an explicit allowlist or wrapper objects instead of usinggetattrdirectly.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- The `ast.Subscript` handling currently sets `idx = None` for `ast.Slice`, which will attempt `val[None]`; either implement proper slice evaluation (start, stop, step) or explicitly raise for slice subscripts to avoid surprising runtime errors.
- The new evaluator allows arbitrary attribute access (`ast.Attribute`) and method calls on any mapped object, which can still expose unsafe surface area depending on what is placed in `mapping`; consider constraining allowed attributes/callables via an explicit allowlist or wrapper objects instead of using `getattr` directly.
## Individual Comments
### Comment 1
<location> `vectorbt/utils/template.py:195-197` </location>
<code_context>
+ kwargs = {kw.arg: _eval_node(kw.value) for kw in node.keywords}
+ return func(*args, **kwargs)
+ raise ValueError("only calls to mapped attributes are allowed")
+ if isinstance(node, ast.Subscript):
+ val = _eval_node(node.value)
+ idx = _eval_node(node.slice) if not isinstance(node.slice, ast.Slice) else None
+ return val[idx]
+ if isinstance(node, ast.List):
</code_context>
<issue_to_address>
**issue (bug_risk):** Subscript handling drops slices, leading to incorrect indexing behavior.
For slice syntax (`a[1:3]`, `a[::2]`), `node.slice` is an `ast.Slice`, but here `idx` becomes `None` and you end up doing `val[None]`. Instead, for `ast.Slice`, recursively evaluate `lower`, `upper`, and `step` and use `val[slice(lower, upper, step)]`. If relevant, also handle multi-dimensional cases (`ast.ExtSlice` / tuple indices).
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
There was a problem hiding this comment.
Hey - I've found 2 issues, and left some high level feedback:
- Consider wrapping
ast.parse(self.expression, mode="eval")in a try/except to convertSyntaxErrorinto a clearer, template-specific error message rather than bubbling up a raw parse error. - The inline
_eval_nodeimplementation insideRepEval.evalis quite large and complex; extracting it (and the operator/allowed-attr logic) into a dedicated helper/module would improve readability and reuse while keepingevalitself small.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- Consider wrapping `ast.parse(self.expression, mode="eval")` in a try/except to convert `SyntaxError` into a clearer, template-specific error message rather than bubbling up a raw parse error.
- The inline `_eval_node` implementation inside `RepEval.eval` is quite large and complex; extracting it (and the operator/allowed-attr logic) into a dedicated helper/module would improve readability and reuse while keeping `eval` itself small.
## Individual Comments
### Comment 1
<location> `vectorbt/utils/template.py:151-155` </location>
<code_context>
+ if isinstance(node.op, ast.Not):
+ return not operand
+ raise ValueError(f"unsupported unary operator: {node.op}")
+ if isinstance(node, ast.BoolOp):
+ values = [_eval_node(v) for v in node.values]
+ if isinstance(node.op, ast.And):
+ return all(values)
+ if isinstance(node.op, ast.Or):
+ return any(values)
+ raise ValueError(f"unsupported boolean operator: {node.op}")
</code_context>
<issue_to_address>
**issue (bug_risk):** Boolean operations do not short-circuit, which can change semantics and cost for expressions with function calls.
Here all operands are eagerly evaluated (`values = [...]`) before applying `all`/`any`, which changes the behavior of `and`/`or`—particularly for operands with side effects, expensive calls, or potential errors that would normally be skipped by short-circuiting. Please evaluate operands one by one in a loop and return as soon as the result is determined, to preserve Python’s short-circuit semantics.
</issue_to_address>
### Comment 2
<location> `vectorbt/utils/template.py:117` </location>
<code_context>
mapping = merge_dicts(self.mapping, mapping)
- return eval(self.expression, {}, mapping)
+ # Use a restricted AST evaluator to avoid arbitrary code execution
+ def _eval_node(node):
+ if isinstance(node, ast.Constant):
+ return node.value
</code_context>
<issue_to_address>
**issue (complexity):** Consider refactoring the new AST evaluator by centralizing operator mappings and extracting attribute, call, and subscript handling into helpers to keep the recursive dispatcher smaller and clearer.
You can keep the new security behavior but reduce complexity and duplication by factoring out operator tables and some helpers, while still using a single recursive dispatcher.
For example:
1. **Centralize operator handling**
Instead of repeating `if isinstance(node.op, ast.X)` across `BinOp`, `UnaryOp`, `BoolOp`, `Compare`, define small lookup tables once and use them:
```python
import operator as _op
_BIN_OPS = {
ast.Add: _op.add,
ast.Sub: _op.sub,
ast.Mult: _op.mul,
ast.Div: _op.truediv,
ast.FloorDiv: _op.floordiv,
ast.Mod: _op.mod,
ast.Pow: _op.pow,
}
_UNARY_OPS = {
ast.UAdd: _op.pos,
ast.USub: _op.neg,
ast.Not: _op.not_,
}
_BOOL_OPS = {
ast.And: all,
ast.Or: any,
}
```
Then inside `_eval_node`:
```python
if isinstance(node, ast.BinOp):
left = _eval_node(node.left)
right = _eval_node(node.right)
op_func = _BIN_OPS.get(type(node.op))
if op_func is None:
raise ValueError(f"unsupported binary operator: {node.op}")
return op_func(left, right)
if isinstance(node, ast.UnaryOp):
operand = _eval_node(node.operand)
op_func = _UNARY_OPS.get(type(node.op))
if op_func is None:
raise ValueError(f"unsupported unary operator: {node.op}")
return op_func(operand)
if isinstance(node, ast.BoolOp):
values = (_eval_node(v) for v in node.values)
agg = _BOOL_OPS.get(type(node.op))
if agg is None:
raise ValueError(f"unsupported boolean operator: {node.op}")
return agg(values)
```
For `Compare`, you can do something similar for per‑op functions to avoid the long `elif` chain while keeping the “chain comparison” semantics:
```python
_CMP_OPS = {
ast.Eq: _op.eq,
ast.NotEq: _op.ne,
ast.Lt: _op.lt,
ast.LtE: _op.le,
ast.Gt: _op.gt,
ast.GtE: _op.ge,
}
if isinstance(node, ast.Compare):
left = _eval_node(node.left)
for op, comparator in zip(node.ops, node.comparators):
right = _eval_node(comparator)
cmp_func = _CMP_OPS.get(type(op))
if cmp_func is None:
raise ValueError(f"unsupported comparison operator: {op}")
if not cmp_func(left, right):
return False
left = right
return True
```
2. **Extract attribute/call policy into helpers**
Move the security policy checks out of the main dispatcher to keep `_eval_node` focused on traversal:
```python
def _resolve_allowed_attr(mapping, base_name: str, attr: str):
if base_name not in mapping:
raise NameError(f"name '{base_name}' is not defined")
allowed = TEMPLATE_ALLOWED_ATTRS.get(base_name, set())
if attr not in allowed:
raise ValueError(f"access to attribute '{attr}' of '{base_name}' is not allowed")
return getattr(mapping[base_name], attr)
def _eval_call_attr(mapping, func_node, args, keywords, eval_node):
if not (isinstance(func_node, ast.Attribute) and isinstance(func_node.value, ast.Name)):
raise ValueError("only calls to mapped attributes are allowed")
func = _resolve_allowed_attr(mapping, func_node.value.id, func_node.attr)
if not callable(func):
raise ValueError(f"object '{func_node.attr}' of '{func_node.value.id}' is not callable")
eval_args = [eval_node(a) for a in args]
eval_kwargs = {kw.arg: eval_node(kw.value) for kw in keywords}
return func(*eval_args, **eval_kwargs)
```
Then:
```python
if isinstance(node, ast.Attribute):
if not isinstance(node.value, ast.Name):
raise ValueError("attribute access is only allowed on top-level mapped names")
return _resolve_allowed_attr(mapping, node.value.id, node.attr)
if isinstance(node, ast.Call):
return _eval_call_attr(mapping, node.func, node.args, node.keywords, _eval_node)
```
3. **Factor out subscript handling**
Subscript/slice handling can be a short helper to shorten `_eval_node`:
```python
def _eval_subscript(eval_node, value_node, slice_node):
val = eval_node(value_node)
s = slice_node
if isinstance(s, ast.Slice):
lower = eval_node(s.lower) if s.lower is not None else None
upper = eval_node(s.upper) if s.upper is not None else None
step = eval_node(s.step) if s.step is not None else None
return val[slice(lower, upper, step)]
if isinstance(s, ast.Tuple):
idx = tuple(eval_node(elt) for elt in s.elts)
return val[idx]
idx = eval_node(s)
return val[idx]
```
Then simply:
```python
if isinstance(node, ast.Subscript):
return _eval_subscript(_eval_node, node.value, node.slice)
```
These changes keep the same behavior and security guarantees but flatten the long `if isinstance` sections, centralize operator semantics in one place, and separate policy concerns into small helpers that are easier to audit and extend.
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Vulnerability Fixes:
Remote code execution (RCE) from untrusted input evaluated by eval
Risk: RCE lets attackers execute arbitrary code, access sensitive data, pivot the environment, or fully compromise the process when untrusted input reaches eval.
Cause: eval executes strings as code. Passing data derived from external sources without strict validation or a strict allowlist enables arbitrary code injection.
Fix
Remove eval usage. If parsing literals, use ast.literal_eval(). For calculations or logic, implement explicit handlers or a whitelisted function map. Validate inputs strictly. If isolation is unavoidable, use a sandbox like RestrictedPython with minimal, immutable globals.
Note
Dynamic expression execution will be removed or restricted; inputs that previously executed arbitrary expressions may be rejected or behave differently.
Remote code execution (RCE) from external input evaluated by exec
Risk: Attackers could execute arbitrary code on the server, exfiltrate data, modify state, or fully compromise the host process.
Cause: User-controlled strings are passed to exec without strict validation or isolation, enabling injected code to run with application privileges.
Fix
Remove exec usage. Replace dynamic code execution with explicit functions or a dispatch map. For data-only evaluation, use ast.literal_eval. If expression evaluation is required, implement a strict whitelist parser and never pass user input to exec.
Note
If the application relied on executing arbitrary expressions, those scripts will no longer run; only explicitly allowed operations will execute.
Summary by Sourcery
Harden expression evaluation and remove unsafe dynamic code execution from the codebase.
Bug Fixes:
Enhancements: