Policies and Value Functions

Define policy and value function representations, such as deep neural networks and Q tables

A reinforcement learning policy is a mapping that selects an action to take based on observations from the environment. During training, the agent tunes the parameters of its policy representation to maximize the long-term reward.
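As a rough illustration of a policy as a mapping from observations to actions, the following Python sketch uses a small lookup table. It is not toolbox code: the table values, the function names `greedy_policy` and `q_learning_update`, and the choice of a tabular Q-learning update as the tuning rule are all assumptions made for this example.

```python
# Hypothetical Q table: rows are observations (states 0..2),
# columns are actions (0..1). The numbers are made up.
q_table = [
    [0.1, 0.5],
    [0.7, 0.2],
    [0.0, 0.0],
]

def greedy_policy(observation):
    """Map an observation to the action with the highest table value."""
    row = q_table[observation]
    return max(range(len(row)), key=lambda a: row[a])

def q_learning_update(obs, action, reward, next_obs, alpha=0.5, gamma=0.9):
    """Tune one table entry toward the one-step bootstrapped return.

    alpha (learning rate) and gamma (discount factor) are illustrative values.
    """
    target = reward + gamma * max(q_table[next_obs])
    q_table[obs][action] += alpha * (target - q_table[obs][action])
```

Here the table entries play the role of the policy parameters: calling `q_learning_update` after each observed transition changes which action `greedy_policy` selects.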

Reinforcement Learning Toolbox™ software provides objects for actor and critic representations. The actor represents the policy that selects the best action to take. The critic represents the value function that estimates the value of the current policy. Depending on your application and selected agent, you can define policy and value functions using deep neural networks, linear basis functions, or look-up tables. For more information, see Create Policy and Value Function Representations.
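To make the actor and critic roles concrete, here is a minimal Python sketch of the linear basis function option. It is an illustration under assumptions, not the toolbox API: the basis `[1, x, x^2]`, the weight values, and the names `basis` and `linear_output` are hypothetical.

```python
def basis(observation):
    """Hypothetical basis function: [1, x, x^2] for a scalar observation x."""
    x = observation
    return [1.0, x, x * x]

def linear_output(observation, weights):
    """Weighted sum of basis features; the weights are the learnable parameters."""
    return sum(w * f for w, f in zip(weights, basis(observation)))

# Made-up parameters for the two representations.
critic_weights = [0.5, -1.0, 0.25]  # critic: observation -> estimated value
actor_weights = [0.0, 2.0, 0.0]     # actor: observation -> continuous action

value = linear_output(2.0, critic_weights)
action = linear_output(2.0, actor_weights)
```

Both representations share the same structure and differ only in what the output means: the critic output is a value estimate, while the actor output is the action to take.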

Functions

Create Representations

`rlValueRepresentation`	Value function critic representation for reinforcement learning agents
`rlQValueRepresentation`	Q-Value function critic representation for reinforcement learning agents
`rlDeterministicActorRepresentation`	Deterministic actor representation for reinforcement learning agents
`rlStochasticActorRepresentation`	Stochastic actor representation for reinforcement learning agents
`rlRepresentationOptions`	Options set for reinforcement learning agent representations (critics and actors)
`rlTable`	Value table or Q table

Deep Neural Network Layers

`quadraticLayer`	Quadratic layer for actor or critic network
`scalingLayer`	Scaling layer for actor or critic network
`softplusLayer`	Softplus layer for actor or critic network
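The three layer names above correspond to simple mathematical operations. The Python sketch below shows the math each name commonly refers to. It is an illustration under assumptions, not the toolbox implementation: the parameter names `scale` and `bias` and the ordering of the quadratic monomials are chosen here for the example.

```python
import math

def scaling(values, scale=1.0, bias=0.0):
    """Linearly scale and shift each element: scale * v + bias."""
    return [scale * v + bias for v in values]

def softplus(values):
    """Smooth, always-positive activation log(1 + exp(v)).

    Written as max(v, 0) + log1p(exp(-|v|)) to avoid overflow for large v.
    """
    return [max(v, 0.0) + math.log1p(math.exp(-abs(v))) for v in values]

def quadratic(values):
    """All quadratic monomials v[i] * v[j] with i <= j (ordering is an assumption)."""
    n = len(values)
    return [values[i] * values[j] for i in range(n) for j in range(i, n)]
```

For example, `quadratic([2.0, 3.0])` yields the monomials `x1*x1`, `x1*x2`, and `x2*x2`. A scaling operation is typically useful for mapping a bounded network output onto an action range, and softplus for outputs that must stay positive, such as a standard deviation.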

Get and Set Agent Representations

`getActor`	Get actor representation from reinforcement learning agent
`setActor`	Set actor representation of reinforcement learning agent
`getCritic`	Get critic representation from reinforcement learning agent
`setCritic`	Set critic representation of reinforcement learning agent
`getLearnableParameters`	Obtain learnable parameter values from policy or value function representation
`setLearnableParameters`	Set learnable parameter values of policy or value function representation
`getModel`	Get computational model from policy or value function representation
`setModel`	Set computational model for policy or value function representation

Get Actions and Value Functions

`getAction`	Obtain action from agent or actor representation given environment observations
`getValue`	Obtain estimated value function representation
`getMaxQValue`	Obtain maximum state-value function estimate for Q-value function representation with discrete action space