Contributor

@biagiom biagiom commented Jan 2, 2026

Hello Datadog Security Labs Team,
this PR improves the api-obfuscation rule for Python introduced in #607, making it more robust so that it also covers cases where the function name is hidden through string obfuscation.

Motivations:
Let's consider the following code that leverages API obfuscation to execute a malicious Python script:

getattr(os, "system")("python3 malicious_payload.py")

The current rule successfully detects the code above because the function name is a string literal that matches a known Python function name.
However, if the attacker applies a string obfuscation technique, the malicious code is no longer detected:

  • base64 encoding: getattr(os, base64.b64decode("c3lzdGVt").decode()) ...
  • hex encoding: getattr(os, "\x73\x79\x73\x74\x65\x6d") ...
  • string splitting: getattr(os, "sys" + "tem") ...
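As a quick check of the three variants above, the following minimal sketch confirms that each expression evaluates to the same attribute name at runtime. The variable names are illustrative only, and nothing is executed through `os.system`; the attribute is only looked up.

```python
import base64
import os

plain = "system"

# The three obfuscated forms listed above, evaluated at runtime.
candidates = {
    "base64 encoding": base64.b64decode("c3lzdGVt").decode(),
    "hex encoding": "\x73\x79\x73\x74\x65\x6d",
    "string splitting": "sys" + "tem",
}

# Every variant yields the same string as the plain literal.
all_equal = all(value == plain for value in candidates.values())

# getattr therefore resolves each of them to the very same function object.
same_function = all(getattr(os, value) is os.system for value in candidates.values())

for label, value in candidates.items():
    print(f"{label}: {value!r}")
```

Since the obfuscated forms only become `"system"` when the code runs, a check that compares the argument as written in the source against a list of names cannot see them.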

Proposed Changes:
To fix this limitation, I have updated the rule to use a generic metavariable without enforcing specific regex matching on string literals. This allows the rule to flag usage where the argument is derived from function calls or operations, not just static strings.
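The difference between the two matching strategies can be illustrated with Python's `ast` module. This is only a sketch, not the rule changed in this PR: it assumes the earlier behavior amounts to comparing the argument's source text against a name pattern, and `NAME_PATTERN`, `literal_only_match`, and `generic_match` are hypothetical names introduced for the example.

```python
import ast
import re

# Hypothetical name list, for illustration only.
NAME_PATTERN = re.compile(r"""["'](system|popen|exec|eval)["']""")


def _invoked_getattr_calls(source):
    """Yield getattr(...) calls whose result is called immediately."""
    for node in ast.walk(ast.parse(source)):
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Call)
            and isinstance(node.func.func, ast.Name)
            and node.func.func.id == "getattr"
            and len(node.func.args) >= 2
        ):
            yield node.func


def literal_only_match(source):
    """Assumed old behavior: the name argument, as written, must match the pattern."""
    for call in _invoked_getattr_calls(source):
        text = ast.get_source_segment(source, call.args[1]) or ""
        if NAME_PATTERN.fullmatch(text):
            return True
    return False


def generic_match(source):
    """Sketch of the generic approach: any expression as the name argument is flagged."""
    return any(True for _ in _invoked_getattr_calls(source))


SAMPLES = [
    'getattr(os, "system")("id")',
    'getattr(os, base64.b64decode("c3lzdGVt").decode())("id")',
    r'getattr(os, "\x73\x79\x73\x74\x65\x6d")("id")',
    'getattr(os, "sys" + "tem")("id")',
]

literal_results = [literal_only_match(s) for s in SAMPLES]
generic_results = [generic_match(s) for s in SAMPLES]
```

In this sketch only the first sample passes the literal-only check, while the generic check flags all four. The trade-off is that a generic match is broader, which is why it needs to be validated against false positives.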

Validation:
First, I added new test cases covering the evasion techniques mentioned above. They all pass with the new rule.
Second, I repeated the tests from the previous PR (#607) on both the MalwareBench dataset and a private dataset of malicious packages. I did not observe any regressions: the detection rate remains the same, and no new false positives were generated.

Kind regards,
Biagio
