setCritic

Set critic of reinforcement learning agent

Syntax

agent = setCritic(agent,critic)

Description

agent = setCritic(agent,critic) updates the reinforcement learning agent, agent, to use the specified critic object, critic.

Examples


Modify Critic Parameter Values


Assume that you have an existing trained reinforcement learning agent. For this example, load the trained agent from Train DDPG Agent to Control Double Integrator System.

load('DoubleIntegDDPG.mat','agent')

Obtain the critic function approximator from the agent.

critic = getCritic(agent);

Obtain the learnable parameters from the critic.

params = getLearnableParameters(critic)

params = 2×1 cell array
    {[-5.0077 -1.5619 -0.3475 -0.0961 -0.0455 -0.0026]}
    {[                                              0]}

Modify the parameter values. For this example, simply multiply all of the parameters by 2.

modifiedParams = cellfun(@(x) x*2,params,'UniformOutput',false);

Set the parameter values of the critic to the new modified values.

critic = setLearnableParameters(critic,modifiedParams);

Set the critic in the agent to the new modified critic.

setCritic(agent,critic);

Display the new parameter values.

getLearnableParameters(getCritic(agent))

ans = 2×1 cell array
    {[-10.0154 -3.1238 -0.6950 -0.1922 -0.0911 -0.0052]}
    {[                                               0]}
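The displayed values are rounded, so a programmatic comparison can be a more reliable check than reading the output. The following is a minimal sketch that assumes the variables agent and params from the steps above are still in the workspace.

```matlab
% Read the parameters back from the critic that is now stored in the agent.
newParams = getLearnableParameters(getCritic(agent));

% Compare each cell with twice the corresponding original value.
isDoubled = cellfun(@(p,q) isequal(p,2*q),newParams,params);

% Returns true only if every parameter array was doubled.
all(isDoubled)
```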

Modify Deep Neural Networks in Reinforcement Learning Agent


Create an environment with a continuous action space and obtain its observation and action specifications. For this example, load the environment used in the example Train DDPG Agent to Control Double Integrator System.

Load the predefined environment.

env = rlPredefinedEnv("DoubleIntegrator-Continuous");

Obtain observation and action specifications.

obsInfo = getObservationInfo(env);
actInfo = getActionInfo(env);

Create a PPO agent from the environment observation and action specifications. This agent uses default deep neural networks for its actor and critic.

agent = rlPPOAgent(obsInfo,actInfo);

To modify the deep neural networks within a reinforcement learning agent, you must first extract the actor and critic function approximators.

actor = getActor(agent);
critic = getCritic(agent);

Extract the deep neural networks from both the actor and critic function approximators.

actorNet = getModel(actor);
criticNet = getModel(critic);

The networks are dlnetwork objects. To view them using the plot function, you must convert them to layerGraph objects.

For example, view the actor network.

plot(layerGraph(actorNet))


To validate a network, use analyzeNetwork. For example, validate the critic network.

analyzeNetwork(criticNet)

You can modify the actor and critic networks and save them back to the agent. To modify the networks, you can use the Deep Network Designer app. To open the app for each network, use the following commands.

deepNetworkDesigner(layerGraph(criticNet))
deepNetworkDesigner(layerGraph(actorNet))

In Deep Network Designer, modify the networks. For example, you can add additional layers to your network. When you modify the networks, do not change the input and output layers of the networks returned by getModel. For more information on building networks, see Build Networks with Deep Network Designer.

To validate the modified network in Deep Network Designer, click Analyze for dlnetwork in the Analysis section. To export the modified network structures to the MATLAB® workspace, generate code for creating the new networks and run this code from the command line. Do not use the exporting option in Deep Network Designer. For an example that shows how to generate and run code, see Create Agent Using Deep Network Designer and Train Using Image Observations.

For this example, the code for creating the modified actor and critic networks is in the createModifiedNetworks helper script.

createModifiedNetworks

Each of the modified networks includes an additional fullyConnectedLayer and reluLayer in its main common path. View the modified actor network.

plot(layerGraph(modifiedActorNet))


After exporting the networks, insert the networks into the actor and critic function approximators.

actor = setModel(actor,modifiedActorNet);
critic = setModel(critic,modifiedCriticNet);

Finally, insert the modified actor and critic function approximators into the agent.

agent = setActor(agent,actor);
agent = setCritic(agent,critic);
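As a quick sanity check after replacing the actor and critic, you can ask the agent for an action from a random observation. This is a minimal sketch that assumes agent and obsInfo from the steps above are in the workspace. It only confirms that the modified agent still evaluates and does not validate the quality of the resulting policy.

```matlab
% Generate a random observation with the dimensions given by obsInfo.
obs = {rand(obsInfo.Dimension)};

% Query the agent. An error here usually points to a mismatch between
% the modified networks and the observation or action specifications.
act = getAction(agent,obs);
```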

Input Arguments


`agent`—Reinforcement learning agent
`rlQAgent`|`rlSARSAAgent`|`rlDQNAgent`|`rlPGAgent`|`rlDDPGAgent`|`rlTD3Agent`|`rlACAgent`|`rlSACAgent`|`rlPPOAgent`|`rlTRPOAgent`

Reinforcement learning agent that contains a critic, specified as one of the following:

rlQAgent
rlSARSAAgent
rlDQNAgent
rlPGAgent (when using a critic to estimate a baseline value function)
rlDDPGAgent
rlTD3Agent
rlACAgent
rlSACAgent
rlPPOAgent
rlTRPOAgent

Note

agent is a handle object. Therefore, it is updated by setCritic whether or not agent is returned as an output argument. For more information about handle objects, see Handle Object Behavior.
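The handle behavior described in the note can be illustrated with a second variable that refers to the same agent. This is a minimal sketch that assumes an agent variable already exists in the workspace, for example the DDPG agent loaded in the first example. The names agentAlias and scaledParams are introduced here for illustration only.

```matlab
% Build a critic whose parameters differ from the ones stored in the agent.
critic = getCritic(agent);
scaledParams = cellfun(@(x) x*0.5,getLearnableParameters(critic), ...
    'UniformOutput',false);
critic = setLearnableParameters(critic,scaledParams);

% agentAlias refers to the same handle object as agent, not to a copy.
agentAlias = agent;

% Call setCritic without requesting an output argument.
setCritic(agent,critic);

% Because both variables refer to one object, the alias should also
% report the scaled parameters.
aliasUpdated = isequal(getLearnableParameters(getCritic(agentAlias)),scaledParams)
```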

`critic`—Critic
`rlValueFunction` object | `rlQValueFunction` object | `rlVectorQValueFunction` object | two-element row vector of `rlQValueFunction` objects

Critic object, specified as one of the following:

rlValueFunction object: Use when agent is an rlACAgent, rlPGAgent, or rlPPOAgent object.
rlQValueFunction object: Use when agent is an rlQAgent, rlSARSAAgent, rlDQNAgent, rlDDPGAgent, or rlTD3Agent object with a single critic.
rlVectorQValueFunction object: Use when agent is an rlQAgent, rlSARSAAgent, or rlDQNAgent object with a discrete action space and a vector Q-value function critic.
Two-element row vector of rlQValueFunction objects: Use when agent is an rlTD3Agent or rlSACAgent object with two critics.
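When it is unclear which critic type an agent expects, one option is to inspect the critic that the agent currently holds and pass back an object of the same type and size. This is a minimal sketch that assumes an existing agent variable in the workspace.

```matlab
% Extract the current critic from the agent.
critic = getCritic(agent);

% Class of the critic, for example rlQValueFunction for a DDPG agent.
criticClass = class(critic)

% Number of critic objects. Agents with two critics return a
% two-element row vector.
numCritics = numel(critic)
```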

Output Arguments


`agent`— Updated reinforcement learning agent
`rlQAgent`|`rlSARSAAgent`|`rlDQNAgent`|`rlPGAgent`|`rlDDPGAgent`|`rlTD3Agent`|`rlACAgent`|`rlSACAgent`|`rlPPOAgent`|`rlTRPOAgent`

Updated agent, returned as an agent object. Note that agent is a handle object. Therefore, its critic is updated by setCritic whether or not agent is returned as an output argument. For more information about handle objects, see Handle Object Behavior.

Version History

Introduced in R2019a
