
rlDeterministicActorRepresentation

(Not recommended) Deterministic actor representation for reinforcement learning agents

rlDeterministicActorRepresentation is not recommended. Use rlContinuousDeterministicActor instead. For more information, see rlDeterministicActorRepresentation is not recommended.

Description

This object implements a function approximator to be used as a deterministic actor within a reinforcement learning agent with a continuous action space. A deterministic actor takes observations as inputs and returns as outputs the action that maximizes the expected cumulative long-term reward, thereby implementing a deterministic policy. After you create an rlDeterministicActorRepresentation object, use it to create a suitable agent, such as an rlDDPGAgent agent. For more information on creating representations, see Create Policies and Value Functions.

Creation

Description


actor = rlDeterministicActorRepresentation(net,observationInfo,actionInfo,'Observation',obsName,'Action',actName) creates a deterministic actor using the deep neural network net as approximator. This syntax sets the ObservationInfo and ActionInfo properties of actor to the inputs observationInfo and actionInfo, containing the specifications for observations and actions, respectively. actionInfo must specify a continuous action space; discrete action spaces are not supported. obsName must contain the names of the input layers of net that are associated with the observation specifications. The action names actName must be the names of the output layers of net that are associated with the action specifications.


actor = rlDeterministicActorRepresentation({basisFcn,W0},observationInfo,actionInfo) creates a deterministic actor using a custom basis function as underlying approximator. The first input argument is a two-element cell array in which the first element contains the handle basisFcn to a custom basis function, and the second element contains the initial weight matrix W0. This syntax sets the ObservationInfo and ActionInfo properties of actor to the inputs observationInfo and actionInfo, respectively.

actor = rlDeterministicActorRepresentation(___,options) creates a deterministic actor using the additional options set options, which is an rlRepresentationOptions object. This syntax sets the Options property of actor to the options input argument. You can use this syntax with any of the previous input-argument combinations.
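For example, a minimal sketch of this syntax, assuming that net, observationInfo, actionInfo, obsName, and actName have already been defined as described above (the option values shown are illustrative):

actorOpts = rlRepresentationOptions('LearnRate',1e-3,'GradientThreshold',1);  % illustrative values
actor = rlDeterministicActorRepresentation(net,observationInfo,actionInfo,...
    'Observation',obsName,'Action',actName,actorOpts);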

Input Arguments


Deep neural network used as the underlying approximator within the actor, specified as a deep neural network object.

The network input layers must be in the same order and with the same data type and dimensions as the signals defined in ObservationInfo. Also, the names of these input layers must match the observation names listed in obsName.

The network output layer must have the same data type and dimension as the signal defined in ActionInfo. Its name must be the action name specified in actName.

rlDeterministicActorRepresentation objects support recurrent deep neural networks.

For a list of deep neural network layers, see List of Deep Learning Layers. For more information on creating deep neural networks for reinforcement learning, see Create Policies and Value Functions.

Observation names, specified as a cell array of strings or character vectors. The observation names must be the names of the input layers in net.

Example: {'my_obs'}

Action name, specified as a single-element cell array that contains a character vector. It must be the name of the output layer of net.

Example: {'my_act'}

Custom basis function, specified as a function handle to a user-defined MATLAB function. The user-defined function can either be an anonymous function or a function on the MATLAB path. The action to be taken based on the current observation, which is the output of the actor, is the vector a = W'*B, where W is a weight matrix containing the learnable parameters and B is the column vector returned by the custom basis function.

When creating a deterministic actor representation, your basis function must have the following signature.

B = myBasisFunction(obs1,obs2,...,obsN)

Here obs1 to obsN are observations in the same order and with the same data type and dimensions as the signals defined in observationInfo.

Example:@(obs1,obs2,obs3) [obs3(2)*obs1(1)^2; abs(obs2(5)+obs3(1))]

Initial value of the basis function weights, W, specified as a matrix having as many rows as the length of the vector returned by the basis function and as many columns as the dimension of the action space.
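For instance, a quick sketch of this sizing rule (the basis function and specifications here are placeholders; a complete example appears in the Examples section):

myBasisFcn = @(obs) [obs(1); obs(2); obs(1)*obs(2)];   % basis returns a 3-element column vector
actInfo = rlNumericSpec([2 1]);                        % two-dimensional continuous action space
W0 = rand(3,2);                                        % rows = basis length, columns = action dimension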

Properties


Representation options, specified as an rlRepresentationOptions object. Available options include the optimizer used for training and the learning rate.
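As an illustrative sketch (the values shown are assumptions, not required defaults), you can create and adjust such an options object before passing it to the representation:

repOpts = rlRepresentationOptions('LearnRate',5e-4);  % set the learning rate; other options keep their defaults
repOpts.GradientThreshold = 1;                        % options can also be modified after creation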

Observation specifications, specified as an rlFiniteSetSpec or rlNumericSpec object or an array of such objects. These objects define properties such as the dimensions, data types, and names of the observation signals.

rlDeterministicActorRepresentation sets the ObservationInfo property of actor to the input observationInfo.

You can extract ObservationInfo from an existing environment or agent using getObservationInfo. You can also construct the specifications manually.

Action specifications for a continuous action space, specified as an rlNumericSpec object defining properties such as the dimensions, data type, and name of the action signals. The deterministic actor representation does not support discrete actions.

rlDeterministicActorRepresentation sets the ActionInfo property of actor to the input actionInfo.

You can extract ActionInfo from an existing environment or agent using getActionInfo. You can also construct the specification manually.

For custom basis function representations, the action signal must be a scalar, a column vector, or a discrete action.
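For example, a sketch of extracting both specifications from one of the predefined MATLAB environments (the environment keyword is just one continuous-action example):

env = rlPredefinedEnv("DoubleIntegrator-Continuous");  % predefined continuous-action environment
obsInfo = getObservationInfo(env);                     % rlNumericSpec for the observations
actInfo = getActionInfo(env);                          % rlNumericSpec, since the action space is continuous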

Object Functions

rlDDPGAgent Deep deterministic policy gradient reinforcement learning agent
rlTD3Agent Twin-delayed deep deterministic policy gradient reinforcement learning agent
getAction Obtain action from agent or actor given environment observations

Examples


Create an observation specification object (or alternatively use getObservationInfo to extract the specification object from an environment). For this example, define the observation space as a continuous four-dimensional space, so that a single observation is a column vector containing four doubles.

obsInfo = rlNumericSpec([4 1]);

Create an action specification object (or alternatively use getActionInfo to extract the specification object from an environment). For this example, define the action space as a continuous two-dimensional space, so that a single action is a column vector containing two doubles.

actInfo = rlNumericSpec([2 1]);

Create a deep neural network approximator for the actor. The input of the network (here called myobs) must accept a four-element vector (the observation vector just defined by obsInfo), and its output must be the action (here called myact) and be a two-element vector, as defined by actInfo.

net = [featureInputLayer(4,'Normalization','none','Name','myobs')
       fullyConnectedLayer(2,'Name','myact')];

Create the actor with rlDeterministicActorRepresentation, using the network, the observation and action specification objects, as well as the names of the network input and output layers.

actor = rlDeterministicActorRepresentation(net,obsInfo,actInfo,...
    'Observation',{'myobs'},'Action',{'myact'})
actor = 
  rlDeterministicActorRepresentation with properties:
         ActionInfo: [1x1 rl.util.rlNumericSpec]
    ObservationInfo: [1x1 rl.util.rlNumericSpec]
            Options: [1x1 rl.option.rlRepresentationOptions]

To check your actor, use getAction to return the action from a random observation, using the current network weights.

act = getAction(actor,{rand(4,1)});
act{1}
ans = 2x1 single column vector
   -0.5054
    1.5390

You can now use the actor to create a suitable agent (such as an rlDDPGAgent or rlTD3Agent agent).
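For instance, a hedged sketch of pairing this actor with a Q-value critic to create a DDPG agent with default options (the critic network below is a minimal illustration, not part of the original example):

% Minimal critic network mapping the observation and action to a scalar Q-value.
statePath  = [featureInputLayer(4,'Normalization','none','Name','obsInput')
              fullyConnectedLayer(16,'Name','fcState')];
actionPath = [featureInputLayer(2,'Normalization','none','Name','actInput')
              fullyConnectedLayer(16,'Name','fcAction')];
commonPath = [additionLayer(2,'Name','add')
              reluLayer('Name','relu')
              fullyConnectedLayer(1,'Name','QValue')];
criticNet = layerGraph(statePath);
criticNet = addLayers(criticNet,actionPath);
criticNet = addLayers(criticNet,commonPath);
criticNet = connectLayers(criticNet,'fcState','add/in1');
criticNet = connectLayers(criticNet,'fcAction','add/in2');
critic = rlQValueRepresentation(criticNet,obsInfo,actInfo,...
    'Observation',{'obsInput'},'Action',{'actInput'});
agent = rlDDPGAgent(actor,critic);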

Create an observation specification object (or alternatively use getObservationInfo to extract the specification object from an environment). For this example, define the observation space as a continuous three-dimensional space, so that a single observation is a column vector containing three doubles.

obsInfo = rlNumericSpec([3 1]);

The deterministic actor does not support discrete action spaces. Therefore, create a continuous action space specification object (or alternatively use getActionInfo to extract the specification object from an environment). For this example, define the action space as a continuous two-dimensional space, so that a single action is a column vector containing two doubles.

actInfo = rlNumericSpec([2 1]);

Create a custom basis function. Each element is a function of the observations defined by obsInfo.

myBasisFcn = @(myobs) [myobs(2)^2; myobs(1); 2*myobs(2)+myobs(1); -myobs(3)]
myBasisFcn = 
  function_handle with value:
    @(myobs)[myobs(2)^2;myobs(1);2*myobs(2)+myobs(1);-myobs(3)]

The output of the actor is the vector W'*myBasisFcn(myobs), which is the action taken as a result of the given observation. The weight matrix W contains the learnable parameters and must have as many rows as the length of the basis function output and as many columns as the dimension of the action space.

Define an initial parameter matrix.

W0 = rand(4,2);

Create the actor. The first argument is a two-element cell containing both the handle to the custom function and the initial weight matrix. The second and third arguments are, respectively, the observation and action specification objects.

actor = rlDeterministicActorRepresentation({myBasisFcn,W0},obsInfo,actInfo)
actor = 
  rlDeterministicActorRepresentation with properties:
         ActionInfo: [1x1 rl.util.rlNumericSpec]
    ObservationInfo: [1x1 rl.util.rlNumericSpec]
            Options: [1x1 rl.option.rlRepresentationOptions]

To check your actor, use the getAction function to return the action from a given observation, using the current parameter matrix.

a = getAction(actor,{[1 2 3]'});
a{1}
ans = 
  2x1 dlarray
    2.0595
    2.3788

You can now use the actor (along with a critic) to create a suitable continuous action space agent.

Create observation and action information. You can also obtain these specifications from an environment.

obsinfo = rlNumericSpec([4 1]);
actinfo = rlNumericSpec([2 1]);
numObs = obsinfo.Dimension(1);
numAct = actinfo.Dimension(1);

Create a recurrent deep neural network for the actor. To create a recurrent neural network, use a sequenceInputLayer as the input layer and include at least one lstmLayer.

net = [sequenceInputLayer(numObs,'Normalization','none','Name','state')
       fullyConnectedLayer(10,'Name','fc1')
       reluLayer('Name','relu1')
       lstmLayer(8,'OutputMode','sequence','Name','ActorLSTM')
       fullyConnectedLayer(20,'Name','CriticStateFC2')
       fullyConnectedLayer(numAct,'Name','action')
       tanhLayer('Name','tanh1')];

Create a deterministic actor representation for the network.

actorOptions = rlRepresentationOptions('LearnRate',1e-3,'GradientThreshold',1);
actor = rlDeterministicActorRepresentation(net,obsinfo,actinfo,...
    'Observation',{'state'},'Action',{'tanh1'},actorOptions);
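As a quick sanity check (a sketch, not part of the original example), you can query the recurrent actor with getAction for a single random observation using the current network weights:

act = getAction(actor,{rand(numObs,1)});
act{1}   % two-element action produced by the recurrent actor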

Version History

Introduced in R2020a


Not recommended starting in R2022a