Describe the problem or improvement suggested
Many simple reward functions have the following pattern, which gets redundant and could be automated.
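As an illustration only (not the issue's original snippet), the kind of pattern meant here, assuming Ecole-style reward functions with `before_reset`/`extract` methods; `SCIPsomething` is a hypothetical stand-in for any SCIP metric accessor:

```python
def SCIPsomething(model):
    # Hypothetical stub for a SCIP metric accessor (e.g. number of nodes).
    return model["nodes"]

class SomeReward:
    """Manual finite-difference pattern that many reward functions repeat."""

    def before_reset(self, model):
        # Reset the baseline metric at the start of each episode.
        self.last_metric = SCIPsomething(model)

    def extract(self, model, done=False):
        # Return the change in the metric since the last call.
        metric = SCIPsomething(model)
        reward = metric - self.last_metric
        self.last_metric = metric
        return reward
```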
Describe the solution you would like

A `LambdaDiffReward` function that automates the following, where the type of `metric` is computed from the return type of `SCIPsomething`.
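A minimal sketch of such a `LambdaDiffReward`, under the same assumptions as above (Ecole-style `before_reset`/`extract` interface; all names hypothetical, and type inference from the callable's return type is omitted):

```python
class LambdaDiffReward:
    """Wraps a metric-extracting callable and automates the
    finite-difference boilerplate."""

    def __init__(self, fn):
        # fn maps a model to a metric value, e.g. SCIPsomething.
        self.fn = fn

    def before_reset(self, model):
        # Reset the baseline metric at the start of each episode.
        self.last_metric = self.fn(model)

    def extract(self, model, done=False):
        # Return the change in the metric since the last call.
        metric = self.fn(model)
        reward = metric - self.last_metric
        self.last_metric = metric
        return reward
```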
Describe alternatives you have considered
Interestingly, when we have `SomeReward`, we can get back to `SCIPsomething` by using `reward.cumsum()`. In NumPy, there is also `np.diff`, which does the opposite of `np.cumsum`, i.e. it computes the finite difference. Perhaps we could have a `LambdaReward` that does not compute differences and simply returns the output, as well as a `reward.diff()` that builds the difference.

Additional context
We should be consistent on whether reward functions compute finite differences or not.
Using `.cumsum()` and `.diff()` we can easily switch from one to the other.
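The `cumsum`/`diff` relationship can be checked directly in NumPy; note that `np.diff` shortens the array by one, so recovering the full per-step values also needs the first cumulative entry:

```python
import numpy as np

# Per-step rewards (finite differences) and their running total.
steps = np.array([3, 1, 4, 1, 5])
total = np.cumsum(steps)   # cumulative metric: [3, 4, 8, 9, 14]
back = np.diff(total)      # recovers steps[1:]: [1, 4, 1, 5]
# Prepending the first cumulative value recovers the full step array.
recovered = np.concatenate(([total[0]], back))
```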