Documentation
Overview
Package qlearning is an experimental set of interfaces and helpers to implement the Q-learning algorithm in Go.
This is highly experimental and should be considered a toy.
See https://github.com/ecooper/qlearning/tree/master/examples for implementation examples.
Index
Constants
This section is empty.
Variables
This section is empty.
Functions
This section is empty.
Types
type Action
Action is an interface wrapping an action that can be applied to the model's current state.
BUG (ecooper): A state should apply an action, not the other way around.
type Agent
type Agent interface {
// Learn updates the model for a given state and action, using the
// provided Rewarder implementation.
Learn(*StateAction, Rewarder)
// Value returns the current Q-value for a State and Action.
Value(State, Action) float32
// String returns a string representation of the Agent.
String() string
}
Agent is an interface for a model's agent, able to learn from actions and to return the current Q-value of an action at a given state.
type Rewarder
type Rewarder interface {
// Reward calculates the reward value for a given action in a given
// state.
Reward(action *StateAction) float32
}
Rewarder is an interface wrapping the ability to provide a reward for the execution of an action in a given state.
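As an illustration, here is a minimal sketch of what a Rewarder implementation could look like. The local StateAction stand-in and the "win" state name are hypothetical, defined here only so the sketch compiles on its own (the package's real StateAction groups a State, an Action, and a Value):

```go
package main

import "fmt"

// StateAction is a simplified local stand-in for the package's
// type, holding only a state label for this sketch.
type StateAction struct {
	StateName string // hypothetical: the real struct holds a State
}

// winReward satisfies the Rewarder contract above:
// Reward(*StateAction) float32.
type winReward struct{}

// Reward pays out for reaching a terminal "win" state and applies
// a small step penalty otherwise, encouraging shorter solutions.
func (winReward) Reward(sa *StateAction) float32 {
	if sa.StateName == "win" {
		return 1
	}
	return -0.01
}

func main() {
	var r winReward
	fmt.Println(r.Reward(&StateAction{StateName: "win"}))  // 1
	fmt.Println(r.Reward(&StateAction{StateName: "lose"})) // -0.01
}
```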
type SimpleAgent
type SimpleAgent struct {
// contains filtered or unexported fields
}
SimpleAgent is an Agent implementation that stores Q-values in a map of maps.
func NewSimpleAgent
func NewSimpleAgent(lr, d float32) *SimpleAgent
NewSimpleAgent creates a SimpleAgent with the provided learning rate and discount factor.
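The documentation does not spell out the update rule SimpleAgent applies, but the learning rate and discount factor are the two knobs of the textbook Q-learning update. As a self-contained sketch of that arithmetic (an assumption, not the package's verified internals):

```go
package main

import "fmt"

// qUpdate applies the standard Q-learning update rule:
//   Q(s,a) <- Q(s,a) + lr * (reward + d*maxNext - Q(s,a))
// where lr is the learning rate, d the discount factor, and
// maxNext the highest Q-value reachable from the next state.
func qUpdate(q, lr, d, reward, maxNext float32) float32 {
	return q + lr*(reward+d*maxNext-q)
}

func main() {
	// With lr=0.5, d=0.9, a reward of 1 and no future value:
	// 0 + 0.5*(1 + 0.9*0 - 0) = 0.5
	fmt.Println(qUpdate(0, 0.5, 0.9, 1, 0))
}
```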
func (*SimpleAgent) Learn
func (agent *SimpleAgent) Learn(action *StateAction, reward Rewarder)
Learn updates the existing Q-value for the given State and Action using the Rewarder.
func (*SimpleAgent) String
func (agent *SimpleAgent) String() string
String returns the current Q-value map as a printed string.
BUG (ecooper): This is useless.
type State
type State interface {
// String returns a string representation of the given state.
// Implementers should take care to ensure that this is a consistent
// hash for a given state.
String() string
// Next provides a slice of possible Actions that could be applied to
// a state.
Next() []Action
}
State is an interface wrapping the current state of the model.
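To make the contract concrete, here is a toy State for a position on a number line. The interfaces are re-declared locally so the sketch compiles on its own, with Action reduced to just String, since its full definition is not reproduced in this doc (see the BUG note under Action):

```go
package main

import (
	"fmt"
	"strconv"
)

// Minimal local stand-ins for the package's interfaces.
type Action interface{ String() string }

type State interface {
	String() string
	Next() []Action
}

// move is a toy Action: a step of -1 or +1 along the line.
type move int

func (m move) String() string { return strconv.Itoa(int(m)) }

// lineState is a toy State: a position on a number line.
type lineState struct{ pos int }

// String doubles as a consistent hash of the state, as the
// State docs above require.
func (s lineState) String() string { return strconv.Itoa(s.pos) }

// Next returns the possible moves from this position.
func (s lineState) Next() []Action {
	return []Action{move(-1), move(+1)}
}

func main() {
	var s State = lineState{pos: 3}
	fmt.Println(s.String()) // 3
	for _, a := range s.Next() {
		fmt.Println(a.String())
	}
}
```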
type StateAction
StateAction is a struct grouping an Action with a given State. Additionally, a Value can be associated with a StateAction, which is typically the Q-value.
func NewStateAction
func NewStateAction(state State, action Action, val float32) *StateAction
NewStateAction creates a new StateAction for a State and Action.
func Next
func Next(agent Agent, state State) *StateAction
Next uses an Agent and State to find the highest-scored Action.
If several actions tie for the highest Q-value, one of the tied actions is selected at random.