Value function critic representation for reinforcement learning agents
This object implements a value function approximator to be used as a critic within a reinforcement learning agent. A value function maps an observation to a scalar value. The output represents the expected total long-term reward when the agent starts from the given observation and takes the best possible action. Value function critics therefore need only observations (not actions) as inputs. After you create an rlValueRepresentation critic, use it to create an agent that relies on a value function critic, such as an rlACAgent, rlPGAgent, or rlPPOAgent. For an example of this workflow, see Create Actor and Critic Representations. For more information on creating representations, see Create Policy and Value Function Representations.
critic = rlValueRepresentation(net,observationInfo,'Observation',obsName) creates the value function based critic from the deep neural network net. This syntax sets the ObservationInfo property of critic to the input observationInfo. obsName must contain the names of the input layers of net.
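The deep network syntax can be sketched as follows. This is a minimal illustration, assuming a release in which rlValueRepresentation and featureInputLayer are both available, with Reinforcement Learning Toolbox and Deep Learning Toolbox installed. The four-element observation space, the layer names 'myobs' and 'value', and the single fully connected layer are illustrative choices, not requirements.

```matlab
% Assumed observation space: a continuous 4-by-1 vector.
obsInfo = rlNumericSpec([4 1]);

% Small network mapping the observation to a scalar value.
% The input layer name must match the name passed as obsName.
net = [featureInputLayer(4,'Normalization','none','Name','myobs')
       fullyConnectedLayer(1,'Name','value')];

% Create the critic, naming the network input layer explicitly.
critic = rlValueRepresentation(net,obsInfo,'Observation',{'myobs'});

% Query the value estimate for a random observation.
v = getValue(critic,{rand(4,1)});
```

The network weights are initialized randomly, so the value returned by getValue differs from run to run until the critic is trained.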
critic = rlValueRepresentation(tab,observationInfo) creates the value function based critic with a discrete observation space, from the value table tab, which is an rlTable object containing a column array with as many elements as there are possible observations. This syntax sets the ObservationInfo property of critic to the input observationInfo.
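A table-based critic might be created along these lines. This is a sketch that assumes a finite observation space with four possible values; the specific values 1, 3, 5, and 7 are arbitrary.

```matlab
% Assumed discrete observation space with four possible observations.
obsInfo = rlFiniteSetSpec([1 3 5 7]);

% Value table with one entry per possible observation.
vTable = rlTable(obsInfo);

% Inspect the underlying column array (one row per observation).
tableSize = size(vTable.Table);

% Create the critic from the table.
critic = rlValueRepresentation(vTable,obsInfo);

% Query the value of one of the possible observations.
v = getValue(critic,{7});
```

You can assign initial values by writing to vTable.Table before creating the critic.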
critic = rlValueRepresentation({basisFcn,W0},observationInfo) creates the value function based critic using a custom basis function as the underlying approximator. The first input argument is a two-element cell in which the first element contains the handle basisFcn to a custom basis function, and the second element contains the initial weight vector W0. This syntax sets the ObservationInfo property of critic to the input observationInfo.
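The basis function syntax can be illustrated with a small example. This sketch assumes a 4-by-1 continuous observation; the basis function myBasisFcn and the weights in W0 are hypothetical and chosen only for demonstration. The critic output is the weighted sum W'*myBasisFcn(obs), so W0 must have as many elements as the basis function returns.

```matlab
% Assumed observation space: a continuous 4-by-1 vector.
obsInfo = rlNumericSpec([4 1]);

% Hypothetical basis function returning a 3-by-1 feature vector.
myBasisFcn = @(myobs) [myobs(2)^2; myobs(3)+exp(myobs(1)); abs(myobs(4))];

% Initial weights, one per element of the basis function output.
W0 = [3;5;2];

% Create the critic from the {basisFcn,W0} cell.
critic = rlValueRepresentation({myBasisFcn,W0},obsInfo);

% Query the value estimate for a sample observation.
v = getValue(critic,{[2 4 6 8]'});
```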
critic = rlValueRepresentation(___,options) creates the value function based critic using the additional option set options, which is an rlRepresentationOptions object. This syntax sets the Options property of critic to the options input argument. You can use this syntax with any of the previous input-argument combinations.
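As a brief sketch of the options syntax, the following passes an option set when creating a table-based critic. The learning rate and gradient threshold values are arbitrary assumptions for illustration, not recommended settings.

```matlab
% Assumed discrete observation space and value table.
obsInfo = rlFiniteSetSpec([1 3 5 7]);
vTable = rlTable(obsInfo);

% Option set with illustrative values.
opts = rlRepresentationOptions('LearnRate',1e-3,'GradientThreshold',1);

% Pass the options as the last input argument.
critic = rlValueRepresentation(vTable,obsInfo,opts);

% The option set is stored in the Options property of the critic.
lr = critic.Options.LearnRate;
```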
rlACAgent |
Actor-critic reinforcement learning agent |
rlPGAgent |
Policy gradient reinforcement learning agent |
rlPPOAgent |
Proximal policy optimization reinforcement learning agent |
getValue |
Obtain estimated value function representation |