

Specify custom reinforcement learning environment dynamics using functions


env = rlFunctionEnv(obsInfo,actInfo,stepfcn,resetfcn) creates a reinforcement learning environment using the provided observation and action specifications, obsInfo and actInfo, respectively. You also set the StepFcn and ResetFcn properties using MATLAB functions.

Observation specification, specified as an rlFiniteSetSpec or rlNumericSpec object or an array containing a mix of such objects. These objects define properties such as the dimensions, data types, and names of the observation signals.

Action specification, specified as an rlFiniteSetSpec or rlNumericSpec object. These objects define properties such as the dimensions, data types, and names of the action signals.


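Step behavior for the environment, specified as a function name, function handle, or anonymous function. The step function must have two inputs and four outputs, as illustrated by the following signature.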


[Observation,Reward,IsDone,LoggedSignals] = myStepFunction(Action,LoggedSignals)

To use additional input arguments beyond the required set, specify StepFcn using an anonymous function handle.
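
As a minimal sketch of the anonymous-function approach, the following wraps a three-input step function in a two-input handle. The function myStepFunction2, the structure envConstants, and its field values are hypothetical names used only for illustration. obsInfo, actInfo, and a reset function named myResetFunction are assumed to exist already.

```matlab
% Hypothetical step function with a third input, envConstants:
%   [Observation,Reward,IsDone,LoggedSignals] = ...
%       myStepFunction2(Action,LoggedSignals,envConstants)
envConstants.Gravity = 9.8;   % illustrative value
envConstants.Ts = 0.02;       % illustrative sample time, in seconds

% Wrap the three-input function in a two-input anonymous function.
% The value of envConstants is captured when the handle is created.
stepHandle = @(Action,LoggedSignals) ...
    myStepFunction2(Action,LoggedSignals,envConstants);

% Pass the handle in place of a function name.
env = rlFunctionEnv(obsInfo,actInfo,stepHandle,'myResetFunction');
```

Because the anonymous function captures envConstants at creation time, later changes to the workspace variable do not affect the environment unless you re-create the handle.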


  • Action - Current action, which must match the dimensions and data type specified in actInfo.

  • Observation - Returned observation, which must match the dimensions and data types specified in obsInfo.

  • Reward - Reward for the current step, returned as a scalar value.

  • IsDone - Logical value indicating whether to end the simulation episode. The step function that you define can include logic to decide whether to end the simulation based on the observation, reward, or any other values.

  • LoggedSignals - Any data that you want to pass from one step to the next, specified as a structure.
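
The inputs and outputs listed above can be sketched with a small step function. This is an illustrative example, not one of the functions shipped with the toolbox. It assumes a hypothetical one-dimensional system whose scalar position is stored in LoggedSignals.State and whose observation specification is rlNumericSpec([1 1]). The name simpleStepFunction, the sample time, and the position limit are all assumptions.

```matlab
function [Observation,Reward,IsDone,LoggedSignals] = simpleStepFunction(Action,LoggedSignals)
% Hypothetical step function for a 1-D point that the agent pushes
% left or right. The state is carried in LoggedSignals.State.

Ts = 0.1;      % assumed sample time, in seconds
Limit = 5;     % assumed position limit that ends the episode

% Advance the state, treating the action as a velocity command.
x = LoggedSignals.State + Ts*Action;

% Store the new state so that the next step can read it.
LoggedSignals.State = x;

% Return the observation, a scalar reward, and the termination flag.
Observation = x;
Reward = -abs(x);
IsDone = abs(x) > Limit;
end
```

Keeping the state in LoggedSignals rather than in a persistent variable means each episode starts from whatever the reset function returns.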

For an example showing multiple ways to define a step function, see Create MATLAB Environment Using Custom Functions.



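Reset behavior for the environment, specified as a function name, function handle, or anonymous function. The reset function must have no inputs and two outputs, as illustrated by the following signature.

[InitialObservation,LoggedSignals] = myResetFunction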




The InitialObservation output must match the dimensions and data type of obsInfo.

To pass information from the reset condition into the first step, specify that information in the reset function as the output structure LoggedSignals.

For an example showing multiple ways to define a reset function, see Create MATLAB Environment Using Custom Functions.
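
A reset function with this signature might look like the following sketch. It assumes a hypothetical environment whose only state is a scalar position, stored in a field named State. The function name simpleResetFunction, the field name, and the initial-condition range are illustrative assumptions.

```matlab
function [InitialObservation,LoggedSignals] = simpleResetFunction()
% Hypothetical reset function for a 1-D point whose position is the
% only state.

% Start each episode at a small random offset from the origin.
x0 = 0.1*(2*rand - 1);

% Store the state so that the first call to the step function can read it.
LoggedSignals.State = x0;

% The initial observation is the state itself, a scalar that would
% match an observation specification such as rlNumericSpec([1 1]).
InitialObservation = x0;
end
```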

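Information to pass to the next step, specified as a structure. When you create the environment, whatever you define as the LoggedSignals output of ResetFcn initializes this property. When a step occurs, the software populates this property with data to pass to the next step, as defined in StepFcn.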
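
One way to observe this behavior is to call the reset and step functions of the environment directly and inspect the property after each call. The sketch below assumes that env was created with rlFunctionEnv and that 10 is a valid action for its action specification, as it is for a cart-pole environment with actions of -10 and 10.

```matlab
% Assumes env already exists and that 10 is a valid action.
InitialObs = reset(env);     % runs the reset function
disp(env.LoggedSignals)      % structure returned by the reset function

[NextObs,Reward,IsDone,LoggedSignals] = step(env,10);  % runs the step function
disp(env.LoggedSignals)      % structure returned by the step function
```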

Object Functions

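getActionInfo Obtain action data specifications from reinforcement learning environment or agent
getObservationInfo Obtain observation data specifications from reinforcement learning environment or agent
train Train reinforcement learning agents within a specified environment
sim Simulate trained reinforcement learning agents within a specified environment
validateEnvironment Validate custom reinforcement learning environment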
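
As a brief sketch of how these functions are typically called, assuming env is an existing rlFunctionEnv object:

```matlab
% Retrieve the specifications stored in the environment.
obsInfo = getObservationInfo(env);
actInfo = getActionInfo(env);

% Check the custom step and reset functions before training.
% An error here usually points to a mismatch between the function
% outputs and the observation or action specifications.
validateEnvironment(env)
```

Running validateEnvironment before training is a cheap way to catch dimension or data type mismatches early.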



Create a reinforcement learning environment by supplying custom dynamic functions in MATLAB®. Using rlFunctionEnv, you can create a MATLAB reinforcement learning environment from an observation specification, an action specification, and step and reset functions that you define.

For this example, create an environment that represents a system for balancing a pole on a cart. The observations from the environment are the cart position, cart velocity, pendulum angle, and pendulum angle derivative. (For additional details about this environment, see Create MATLAB Environment Using Custom Functions.) Create an observation specification for those signals.

oinfo = rlNumericSpec([4 1]);
oinfo.Name = 'CartPole States';
oinfo.Description = 'x, dx, theta, dtheta';

The environment has a discrete action space where the agent can apply one of two possible force values to the cart, –10 N or 10 N. Create the action specification for those actions.

ActionInfo = rlFiniteSetSpec([-10 10]);
ActionInfo.Name = 'CartPole Action';

Next, specify the custom step and reset functions. For this example, use the supplied functions myResetFunction.m and myStepFunction.m. For details about these functions and how they are constructed, see Create MATLAB Environment Using Custom Functions.


env = rlFunctionEnv(oinfo,ActionInfo,'myStepFunction','myResetFunction');

You can create agents for env and train them within the environment as you would for any other reinforcement learning environment.
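
As a rough sketch of that workflow, the following creates a default DQN agent and trains it in env. It assumes that env, oinfo, and ActionInfo exist as created in this example, and that your release supports creating a default agent from the specifications alone. The training limits and stopping value are illustrative and untuned.

```matlab
% Create a default DQN agent from the observation and action
% specifications (assumes a release with default-agent support).
agent = rlDQNAgent(oinfo,ActionInfo);

% Illustrative training limits, not tuned values.
trainOpts = rlTrainingOptions( ...
    'MaxEpisodes',200, ...
    'MaxStepsPerEpisode',500, ...
    'StopTrainingCriteria','AverageReward', ...
    'StopTrainingValue',480);

% Train the agent within the custom environment.
trainingStats = train(agent,env,trainOpts);

% Simulate the trained agent for one episode of up to 500 steps.
simOpts = rlSimulationOptions('MaxSteps',500);
experience = sim(env,agent,simOpts);
```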


Version History

Introduced in R2019a