Emmanouil Tzorakoleftherakis, MathWorks
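Design, train, and simulate reinforcement learning agents using a visual interactive workflow in the Reinforcement Learning Designer app. Use the app to set up a reinforcement learning problem in Reinforcement Learning Toolbox™ without writing MATLAB® code. Work through the entire reinforcement learning workflow.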
Starting with the R2021a release of MATLAB, Reinforcement Learning Toolbox lets you interactively design, train, and simulate RL agents with the new Reinforcement Learning Designer app. Open the app from the command line or from the MATLAB toolstrip.

First, you need to create the environment object that your agent will train against. Reinforcement Learning Designer lets you import environment objects from the MATLAB workspace, select from several predefined environments, or create your own custom environment. For this example, let's create a predefined cart-pole MATLAB environment with a discrete action space, and we will also import from the MATLAB workspace a custom Simulink environment of a 4-legged robot with a continuous action space. You can delete or rename environment objects from the Environments pane as needed, and you can view the dimensions of the observation and action spaces in the Preview pane.

To create an agent, click New in the Agent section on the Reinforcement Learning tab. Depending on the selected environment, and the nature of the observation and action spaces, the app will show a list of compatible built-in training algorithms. For this demo, we will pick the DQN algorithm. The app will generate a DQN agent with a default critic architecture. You can adjust some of the default values for the critic as needed before creating the agent. The new agent will appear in the Agents pane, and the Agent Editor will show a summary view of the agent and the available hyperparameters that can be tuned. For example, let's change the agent's sample time and the critic's learn rate. Here, we can also adjust the exploration strategy of the agent and see how exploration will progress with respect to the number of training steps.

To view the default critic network, click View Critic Model on the DQN Agent tab. The Deep Learning Network Analyzer opens and displays the critic structure. You can change the critic neural network by importing a different critic network from the workspace. You can also import a different set of agent options or a different critic representation object altogether.

Click Train to specify training options such as stopping criteria for the agent. Here, let's set the max number of episodes to 1000 and leave the rest at their default values. To parallelize training, click the Use Parallel button. Parallelization options include additional settings such as the type of data workers will send back, whether data will be sent synchronously or not, and more. After setting the training options, you can generate a MATLAB script with the specified settings that you can use outside the app if needed. To start training, click Train. During the training process, the app opens the Training Session tab and displays the training progress.
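To make the exploration setting mentioned above more concrete, here is a minimal Python sketch of how an epsilon-greedy exploration rate might shrink over training steps. It assumes a multiplicative decay that stops at a floor value. The function name `epsilon_schedule` and the numeric values are illustrative assumptions, not settings read from the app, and the exact update rule used by the toolbox may differ.

```python
def epsilon_schedule(epsilon, epsilon_min, decay, steps):
    """Return the exploration rate after each training step.

    Assumed rule (illustrative only): while epsilon is above the floor,
    multiply it by (1 - decay), and never let it drop below the floor.
    """
    values = []
    for _ in range(steps):
        if epsilon > epsilon_min:
            epsilon = max(epsilon_min, epsilon * (1.0 - decay))
        values.append(epsilon)
    return values


# Hypothetical settings chosen for illustration.
schedule = epsilon_schedule(epsilon=1.0, epsilon_min=0.01, decay=0.005, steps=1000)

# Print the exploration rate at a few training steps.
for step in (1, 100, 500, 1000):
    print(f"step {step:4d}: epsilon = {schedule[step - 1]:.4f}")
```

A larger decay value makes the agent stop exploring sooner, while a higher floor keeps some random actions throughout training.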
If visualization of the environment is available, you can also view how the environment responds during training. You can stop training at any time and choose to accept or discard the training results. Accepted results will show up under the Results pane, and a new trained agent will also appear under Agents.

To simulate an agent, go to the Simulate tab and select the appropriate agent and environment object from the drop-down list. For this task, let's import a pretrained agent for the 4-legged robot environment we imported at the beginning. Double-click the agent object to open the Agent Editor. You can see that this is a DDPG agent that takes in 44 continuous observations and outputs 8 continuous torques. In the Simulate tab, select the desired number of simulations and the simulation length. If you need to run a large number of simulations, you can run them in parallel. After clicking Simulate, the app opens the Simulation Session tab. If available, you can view the visualization of the environment at this stage as well.

When the simulations are completed, you will be able to see the reward for each simulation as well as the reward mean and standard deviation. Remember that the reward signal is provided as part of the environment. To analyze the simulation results, click Inspect Simulation Data. In the Simulation Data Inspector, you can view the saved signals for each simulation episode. If you want to keep the simulation results, click Accept.

When you finish your work, you can choose to export any of the agents shown under the Agents pane. For convenience, you can also directly export the underlying actor or critic representations, actor or critic neural networks, and agent options. To save the app session for future use, click Save Session on the Reinforcement Learning tab. For more information, please refer to the documentation of Reinforcement Learning Toolbox.
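The reward summary mentioned above (per-simulation reward, mean, and standard deviation) can be reproduced by hand from exported rewards. The following is a minimal Python sketch. The reward values are made up for illustration, `summarize_rewards` is a hypothetical helper, and the sample (N-1) standard deviation is an assumption, since the transcript does not say which definition the app reports.

```python
import statistics


def summarize_rewards(episode_rewards):
    """Return (mean, standard deviation) of per-simulation total rewards.

    Uses the sample (N-1) standard deviation as an assumption and
    returns 0.0 when only one simulation is available.
    """
    mean = statistics.fmean(episode_rewards)
    std = statistics.stdev(episode_rewards) if len(episode_rewards) > 1 else 0.0
    return mean, std


# Hypothetical total rewards from five simulations (not real results).
rewards = [480.0, 500.0, 495.0, 500.0, 470.0]
mean_reward, std_reward = summarize_rewards(rewards)
print(f"mean reward: {mean_reward:.2f}, standard deviation: {std_reward:.2f}")
```

A small standard deviation relative to the mean suggests the agent behaves consistently across simulations, whereas a large one is a hint to inspect individual episodes.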