    Class InfiniteQLearning

    Q-Learning algorithm with an infinite number of states.
    Inheritance
    System.Object
    InfiniteQLearning
    Namespace: Mars.Components.Services.Learning
    Assembly: Mars.Components.dll
    Syntax
    public class InfiniteQLearning : object
    Remarks
    This class provides an implementation of the Q-Learning algorithm, also known as off-policy Temporal Difference (TD) control.

    Constructors

    InfiniteQLearning(Int32, Int32, IExplorationPolicy)

    Initializes a new instance of the InfiniteQLearning class.
    Declaration
    public InfiniteQLearning(int states, int actions, IExplorationPolicy explorationPolicy)
    Parameters
    Type Name Description
    System.Int32 states Number of possible states.
    System.Int32 actions Number of possible actions.
    IExplorationPolicy explorationPolicy Exploration policy.
    Remarks
    This remark refers to a randomize parameter, which the declaration above does not include. Such a parameter specifies whether initial action estimates are randomized with small values. Randomizing action values can be useful when greedy exploration policies are used, because it ensures that the same kind of action is not always chosen.

    Properties

    ActionsCount

    Number of possible actions.
    Declaration
    public int ActionsCount { get; }
    Property Value
    Type Description
    System.Int32

    DiscountFactor

    Discount factor, in the range [0, 1].
    Declaration
    public double DiscountFactor { get; set; }
    Property Value
    Type Description
    System.Double
    Remarks
    Discount factor for the expected cumulative reward. The value acts as a multiplier for the expected reward: if it is set to 1, the expected cumulative reward is not discounted. As the value decreases, a smaller share of the expected reward is used when updating the action estimates.

    ExplorationPolicy

    Exploration policy.
    Declaration
    public IExplorationPolicy ExplorationPolicy { get; set; }
    Property Value
    Type Description
    IExplorationPolicy
    Remarks
    The policy used to select actions.

    LearningRate

    Learning rate, in the range [0, 1].
    Declaration
    public double LearningRate { get; set; }
    Property Value
    Type Description
    System.Double
    Remarks
    The value determines how strongly each update changes the Q-function during learning. The greater the value, the larger the change applied by an update; the lower the value, the smaller the change.
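
The roles of LearningRate and DiscountFactor can be illustrated with the textbook Q-Learning update for a single state-action estimate. The sketch below is an assumption about the rule, since this page does not show the library's internal implementation; QUpdateSketch and its parameter names are hypothetical and are not part of Mars.Components.

```csharp
using System;

public static class QUpdateSketch
{
    // Textbook Q-Learning update for one state-action estimate (assumed rule,
    // not taken from the library's source).
    //   oldEstimate      : current estimate for (previous state, action)
    //   reward           : reward received for taking the action
    //   bestNextEstimate : highest estimate among the actions of the next state
    //   learningRate     : value in [0, 1], weight given to the new information
    //   discountFactor   : value in [0, 1], multiplier for the expected future reward
    public static double Update(
        double oldEstimate,
        double reward,
        double bestNextEstimate,
        double learningRate,
        double discountFactor)
    {
        // Target value: immediate reward plus the discounted best future estimate.
        double target = reward + discountFactor * bestNextEstimate;

        // Move the old estimate towards the target by a fraction given by the learning rate.
        return oldEstimate + learningRate * (target - oldEstimate);
    }
}
```

For example, with an old estimate of 2, a reward of 1, a best next estimate of 4, a discount factor of 0.5, and a learning rate of 0.5, the target is 3 and the updated estimate is 2.5. A learning rate of 0 leaves the estimate unchanged, and a learning rate of 1 replaces it with the target.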

    StatesCount

    Number of possible states.
    Declaration
    public BigInteger StatesCount { get; }
    Property Value
    Type Description
    BigInteger

    TriedStatesCount

    Gets the number of states that have already been explored by the algorithm.
    Declaration
    public int TriedStatesCount { get; }
    Property Value
    Type Description
    System.Int32

    Methods

    GetAction(Int32)

    Gets the next action for the specified state.
    Declaration
    public int GetAction(int state)
    Parameters
    Type Name Description
    System.Int32 state Current state to get an action for.
    Returns
    Type Description
    System.Int32 The action for the state.
    Remarks
    The method returns an action according to the current ExplorationPolicy.

    UpdateState(Int32, Int32, Double, Int32)

    Updates the Q-function's value for the previous state-action pair.
    Declaration
    public void UpdateState(int previousState, int action, double reward, int nextState)
    Parameters
    Type Name Description
    System.Int32 previousState Previous state.
    System.Int32 action Action that leads from the previous state to the next state.
    System.Double reward Reward received for taking the specified action from the previous state.
    System.Int32 nextState Next state.
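
GetAction and UpdateState are typically called in alternation: choose an action, apply it, then report the outcome. The following is a minimal sketch of such a loop. To stay self-contained it takes the two methods as delegates whose shapes match the declarations above; QLearningLoop, RunEpisode, and the step callback (a stand-in for the simulation environment) are hypothetical names introduced for illustration.

```csharp
using System;

public static class QLearningLoop
{
    // Runs one episode and returns the total reward collected.
    //   getAction   : intended to be bound to InfiniteQLearning.GetAction
    //   updateState : intended to be bound to InfiniteQLearning.UpdateState
    //   step        : hypothetical environment callback; for (state, action) it
    //                 returns the reward, the next state, and an end-of-episode flag
    public static double RunEpisode(
        Func<int, int> getAction,
        Action<int, int, double, int> updateState,
        Func<int, int, (double Reward, int NextState, bool Done)> step,
        int startState,
        int maxSteps)
    {
        double totalReward = 0.0;
        int state = startState;

        for (int i = 0; i < maxSteps; i++)
        {
            // Ask the learner for an action in the current state.
            int action = getAction(state);

            // Apply the action in the environment.
            var (reward, nextState, done) = step(state, action);

            // Report the transition so the learner can update its estimate.
            updateState(state, action, reward, nextState);

            totalReward += reward;
            state = nextState;
            if (done) break;
        }

        return totalReward;
    }
}
```

Given an InfiniteQLearning instance named learner, the call would look like QLearningLoop.RunEpisode(learner.GetAction, learner.UpdateState, step, 0, 100), where step, the start state 0, and the limit of 100 steps are placeholders for model-specific values.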

    See Also

    QLearning
    Sarsa
    Copyright © MARS GROUP.
    HAW Hamburg