Unsafe use of eval() in calculate_reward enables code execution if tasks are untrusted #1196

@0xmrma

Description

Summary

tunix/rl/agentic/rewards/reward.py implements calculate_reward() by calling Python's eval() on a string derived from task["question"]. If an untrusted task/question is processed while this reward is enabled, arbitrary Python code can execute with the full privileges of the running process.

Location

  • File: tunix/rl/agentic/rewards/reward.py
  • Function: calculate_reward
  • Line: correct_value = eval(expression)

Why this matters

Many RL/agentic workflows consume tasks from external datasets/benchmarks. If those inputs are not fully trusted, eval() introduces a code-execution risk.

Reproduction (safe)

Set the task question to a harmless payload like:

  • __import__('os').system('echo PWNED')

Then execute it through TaskEnvironment(..., reward_fn=calculate_reward).
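To make the hazard concrete, here is a minimal, harmless illustration (not tunix code, and independent of the repro above): eval() on an attacker-controlled string can reach arbitrary Python via __import__, not just arithmetic.

```python
# Harmless stand-in for a destructive payload: instead of os.system(...),
# this just calls os.getcwd(). Any other os call would execute the same way.
payload = "__import__('os').getcwd()"

# eval() happily imports the os module and calls the function.
result = eval(payload)
print(result)  # prints the current working directory
```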

Suggested remediation

  • Replace eval() with a safe math expression evaluator (AST allowlist), or
  • Gate this behind an explicit “unsafe” flag, or move it to tests-only code paths.
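A sketch of the first option, using only the standard library (names like safe_eval are illustrative, not existing tunix APIs): parse the expression with ast and evaluate only an allowlist of arithmetic node types, so calls, attribute access, and name lookups raise instead of executing.

```python
import ast
import operator

# Allowlisted operator nodes mapped to their plain-Python implementations.
_BIN_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.Pow: operator.pow,
    ast.Mod: operator.mod,
}
_UNARY_OPS = {ast.UAdd: operator.pos, ast.USub: operator.neg}


def safe_eval(expression: str) -> float:
  """Evaluate a numeric expression without eval(); rejects everything else."""

  def _eval(node: ast.AST) -> float:
    if isinstance(node, ast.Expression):
      return _eval(node.body)
    if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
      return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in _BIN_OPS:
      return _BIN_OPS[type(node.op)](_eval(node.left), _eval(node.right))
    if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
      return _UNARY_OPS[type(node.op)](_eval(node.operand))
    # ast.Call, ast.Attribute, ast.Name, etc. all land here.
    raise ValueError(f"disallowed expression node: {type(node).__name__}")

  return _eval(ast.parse(expression, mode="eval"))
```

With this in place, safe_eval("2 + 3 * 4") returns 14, while the PoC string ("__import__('os').system('echo PWNED')") raises ValueError at the ast.Call node instead of running a subprocess.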
