Contributor

@mrkickling mrkickling commented Sep 19, 2025

  • Implement attacker ttc penalty
  • Write tests

If the setting ttc_values_as_attacker_penalty is enabled, attackers get rewards as before but additionally get negative rewards for attack step TTCs.

Attack steps are still executed successfully and instantly even though they have a TTC value set, but the TTC value is given to the attacker agent as a penalty.

Requires TTCMode PRE_SAMPLE or EXPECTED_VALUE so that TTC values are pre-calculated.
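
To illustrate the behaviour described above, here is a minimal sketch of how such a penalty could be computed. It is not the code from this branch: the function `attacker_step_reward`, its parameters, and the local `TTCMode` stand-in (including the placeholder member `OTHER`) are assumptions made for the example. Only the setting name and the two mode names come from the description.

```python
from enum import Enum, auto


class TTCMode(Enum):
    # Stand-in enum for this sketch. Only PRE_SAMPLE and EXPECTED_VALUE
    # are named in the PR; OTHER is a hypothetical placeholder for any
    # mode that does not pre-calculate TTC values.
    PRE_SAMPLE = auto()
    EXPECTED_VALUE = auto()
    OTHER = auto()


def attacker_step_reward(
    compromised_steps,
    step_rewards,
    ttc_values,
    ttc_mode,
    ttc_values_as_attacker_penalty,
):
    """Return the attacker reward for the steps compromised in one action.

    Hypothetical helper: step_rewards and ttc_values are plain dicts
    keyed by attack step name.
    """
    # Rewards are granted as before, regardless of the setting.
    reward = sum(step_rewards.get(step, 0.0) for step in compromised_steps)

    if not ttc_values_as_attacker_penalty:
        return reward

    # The penalty needs TTC values that were calculated up front.
    if ttc_mode not in (TTCMode.PRE_SAMPLE, TTCMode.EXPECTED_VALUE):
        raise ValueError(
            "ttc_values_as_attacker_penalty requires TTCMode "
            "PRE_SAMPLE or EXPECTED_VALUE"
        )

    # Steps succeed instantly; their TTC is subtracted as a penalty.
    penalty = sum(ttc_values.get(step, 0.0) for step in compromised_steps)
    return reward - penalty
```

With rewards `{"a": 10.0, "b": 5.0}` and TTC values `{"a": 3.0, "b": 1.5}`, compromising both steps with the setting enabled would yield 15.0 minus 4.5 in this sketch, while disabling the setting leaves the reward at 15.0.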

@mrkickling mrkickling linked an issue Sep 19, 2025 that may be closed by this pull request
@mrkickling mrkickling marked this pull request as draft September 19, 2025 11:04
@mrkickling mrkickling marked this pull request as ready for review September 19, 2025 11:06
@sandorstormen sandorstormen changed the title Add initial solution with ttc values as attacker rewards Add initial solution with ttc values as attacker penalties Sep 19, 2025
@sandorstormen
Contributor

Can I set ttc_values_as_attacker_penalty to True and still select how TTC values are sampled?

@mrkickling
Contributor Author

mrkickling commented Sep 19, 2025

Can I set ttc_values_as_attacker_penalty to True and still select how TTC values are sampled?

Yes, but the current implementation on this branch requires you to pick either PRE_SAMPLE or EXPECTED_VALUE, since otherwise there isn't any pre-calculated TTC value. I explained this in a bit more detail in the description at the top.

Note: this is not necessarily the final solution. We could choose the other approach, which just penalizes attackers with 1 point for each step they attempt.
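
For comparison, the flat per-step alternative mentioned in the note could look roughly like the sketch below. The function name `flat_step_penalty_reward` and its parameters are hypothetical and do not exist on this branch; the only detail taken from the discussion is the penalty of 1 point per attempted step.

```python
def flat_step_penalty_reward(
    attempted_steps,
    compromised_steps,
    step_rewards,
    penalty_per_step=1.0,
):
    """Reward for compromised steps minus a fixed cost per attempted step.

    Hypothetical helper: step_rewards is a plain dict keyed by step name.
    """
    # Rewards are only granted for steps that were actually compromised.
    reward = sum(step_rewards.get(step, 0.0) for step in compromised_steps)

    # Every attempt costs the same amount, whether or not it succeeded
    # and regardless of the step's TTC value.
    return reward - penalty_per_step * len(attempted_steps)
```

Unlike the TTC-based penalty, this variant needs no pre-calculated TTC values, so it would not depend on the chosen TTCMode.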

@sandorstormen
Contributor

this is not necessarily the final solution, we could choose the other approach that just penalizes attackers with 1 point for each step they attempt.

But isn't this solution already implemented? You won't be able to train an attacker RL agent with ttc_values_as_attacker_penalty=False otherwise.

@mrkickling
Contributor Author

No, an attacker agent is not penalized with 1 point per step it takes, and never was. As things stand now, it is not penalized at all.

Development

Successfully merging this pull request may close these issues.

TTCs as Reward
2 participants