Class EpsilonGreedyExploration
Epsilon greedy exploration policy.
Inheritance
System.Object
EpsilonGreedyExploration
Implements
IExplorationPolicy
Namespace: Mars.Components.Services.Explorations
Assembly: Mars.Components.dll
Syntax
public class EpsilonGreedyExploration : object, IExplorationPolicy
Remarks
The class implements the epsilon-greedy exploration policy. Under this policy, the best action is chosen with probability 1 - epsilon. Otherwise, with probability epsilon, one of the other actions (excluding the best one) is chosen at random.
The epsilon value is also known as the exploration rate.
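The selection rule described above can be sketched as follows. This is a minimal illustration, not the source of `EpsilonGreedyExploration`: the `EpsilonGreedySketch` class, its `Choose` method, and the `rng` parameter are hypothetical names, and the sketch assumes that ties for the best estimate resolve to the lowest index and that exploratory picks are uniform over the non-best actions.

```csharp
using System;

// Hypothetical stand-in for illustration; not the Mars.Components implementation.
public static class EpsilonGreedySketch
{
    // Returns the index of the chosen action.
    // Assumptions: ties for the best estimate resolve to the lowest index,
    // and exploratory picks are uniform over the remaining actions.
    public static int Choose(double[] actionEstimates, double epsilon, Random rng)
    {
        if (actionEstimates == null || actionEstimates.Length == 0)
            throw new ArgumentException("At least one action estimate is required.", nameof(actionEstimates));

        int count = actionEstimates.Length;

        // Find the greedy (best-estimate) action.
        int best = 0;
        for (int i = 1; i < count; i++)
        {
            if (actionEstimates[i] > actionEstimates[best])
                best = i;
        }

        // Exploit with probability 1 - epsilon.
        // With a single action there is nothing else to explore.
        if (count == 1 || rng.NextDouble() >= epsilon)
            return best;

        // Explore: draw uniformly from the count - 1 non-best actions
        // by sampling 0 .. count - 2 and skipping over the best index.
        int pick = rng.Next(count - 1);
        return pick >= best ? pick + 1 : pick;
    }
}
```

With `epsilon` set to 0 the sketch always returns the best action, and with `epsilon` set to 1 it never does, which matches the two extremes of the policy as described.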
Constructors
EpsilonGreedyExploration(Double)
Initializes a new instance of the EpsilonGreedyExploration class.
Declaration
public EpsilonGreedyExploration(double epsilon)
Parameters
Type | Name | Description |
---|---|---|
System.Double | epsilon | Epsilon value (exploration rate). |
Properties
Epsilon
Epsilon value (exploration rate), in the range [0, 1].
Declaration
public double Epsilon { get; set; }
Property Value
Type | Description |
---|---|
System.Double |
Remarks
The value determines how much exploration the policy performs. A high value pushes the policy toward exploration: it more often chooses a random action, excluding the best one. A low value makes the policy greedier: it more often chooses the best action found so far.
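As a worked illustration of how the value shapes the choice, the selection probabilities can be written out explicitly. The symbols here are assumptions introduced for the example: $n$ is the number of actions, $a^{*}$ is the action with the best estimate, $\varepsilon$ is the `Epsilon` value, and the random pick is assumed to be uniform over the non-best actions (the documentation only says "randomly").

$$
P(a = a^{*}) = 1 - \varepsilon,
\qquad
P(a = a_i) = \frac{\varepsilon}{n - 1} \quad \text{for each } a_i \neq a^{*}.
$$

For example, with $\varepsilon = 0.2$ and $n = 5$, the best action is chosen with probability $0.8$ and each of the four other actions with probability $0.2 / 4 = 0.05$. Note that this differs from the common textbook variant in which the random pick is drawn from all actions, including the best one.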
Methods
ChooseAction(Double[])
Choose an action.
Declaration
public int ChooseAction(double[] actionEstimates)
Parameters
Type | Name | Description |
---|---|---|
System.Double[] | actionEstimates | Action estimates. |
Returns
Type | Description |
---|---|
System.Int32 | Returns selected action. |
Remarks
The method chooses an action based on the provided estimates. The
estimates can be any measure of an action's usefulness (expected total
reward, discounted reward, etc.).
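As one possible way to produce such estimates, the sketch below computes a discounted reward per action from sequences of observed rewards. The `EstimateSketch` class, its methods, and the discount factor `gamma` are hypothetical names introduced for illustration; the library does not prescribe how the estimates are obtained.

```csharp
using System;

// Hypothetical helper for illustration; not part of Mars.Components.
public static class EstimateSketch
{
    // Discounted sum: r[0] + gamma * r[1] + gamma^2 * r[2] + ...
    public static double DiscountedReward(double[] rewards, double gamma)
    {
        double total = 0.0;
        double weight = 1.0;
        foreach (double r in rewards)
        {
            total += weight * r;
            weight *= gamma;
        }
        return total;
    }

    // Builds one estimate per action from assumed per-action reward sequences.
    public static double[] BuildEstimates(double[][] rewardsPerAction, double gamma)
    {
        var estimates = new double[rewardsPerAction.Length];
        for (int a = 0; a < rewardsPerAction.Length; a++)
            estimates[a] = DiscountedReward(rewardsPerAction[a], gamma);
        return estimates;
    }
}
```

The array returned by `BuildEstimates` has one entry per action, which is the shape that `ChooseAction(double[])` takes as its `actionEstimates` parameter.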