Gamma or discount factor is a way to calculate the importance of actions based on the temporal proximity
Gamma is a way to calculate the importance of action. The following formula can be used to calculate the weighted reward.
Let's now calculate some rewards where 1 step is 1 point towards our reward, this should be linear right?
We're using a gamma of.995in this example
.995
Number of actions
Raw reward
Reward after Gamma
10
9.511
1000
6.654
10000
1.701e-18
Last updated 5 years ago