TTCs as Reward #165

@sandorstormen

Description

One way to represent cost in RL algorithms is to adjust the reward function. We should implement an option to represent TTCs in terms of reward. The reward function (or, more aptly, utility function) should return the negative TTC when performing an attack step that always succeeds.

There are two formulations for this:

1. Sampled: $r(a) = -\texttt{ttc value}$, where $\texttt{ttc value} \sim \text{TTCDist}(a)$
2. Expected: $r(a) = -\texttt{ttc value}$, where $\texttt{ttc value} = \mathbb{E}[\text{TTCDist}(a)]$

where $r(\cdot)$ is the reward/utility function and $a$ is an attack step.
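A minimal sketch of both formulations. All names here (`TTCReward`, the `ttc_dists`/`ttc_means` mappings, the `"phishing"` step, the exponential TTC distribution) are hypothetical and only illustrate the interface, not the actual simulator:

```python
import random


class TTCReward:
    """Reward/utility as the negated TTC of the performed attack step.

    ttc_dists maps attack-step names to zero-argument callables that
    return one sampled TTC value; ttc_means holds E[TTCDist(a)].
    """

    def __init__(self, ttc_dists, ttc_means, mode="sampled"):
        assert mode in ("sampled", "expected")
        self.ttc_dists = ttc_dists
        self.ttc_means = ttc_means
        self.mode = mode

    def reward(self, attack_step):
        if self.mode == "sampled":
            # Formulation 1: r(a) = -ttc, ttc ~ TTCDist(a)
            return -self.ttc_dists[attack_step]()
        # Formulation 2: r(a) = -E[TTCDist(a)]
        return -self.ttc_means[attack_step]


# Example: one attack step with an exponential TTC distribution, mean 10.
rng = random.Random(0)
dists = {"phishing": lambda: rng.expovariate(1 / 10)}
means = {"phishing": 10.0}

expected = TTCReward(dists, means, mode="expected")
sampled = TTCReward(dists, means, mode="sampled")
print(expected.reward("phishing"))  # -10.0
print(sampled.reward("phishing"))   # a negative draw from Exp(mean=10)
```

The sampled variant keeps the stochasticity of TTCs in the return signal, while the expected variant gives a deterministic, lower-variance reward; which one suits a given RL algorithm is part of what this option would let us compare.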

Let me note here that this has already been done by Sandor and Manuel with a simplified coreLang.

Labels: enhancement (New feature or request)