CAP 6629 Course Project 3: Reinforcement Learning with Neural Networks in Python

This Python machine-learning project uses a neural network to approximate the Q-value table in reinforcement learning.

CAP 6629: Reinforcement Learning Spring 2021
Course project 3
Due: 04/09/2021 (Friday), 11:59PM

Submission: A single PDF with your code (use any programming language), results and analysis.

Please follow the project report guidelines and submit the report and code in a single PDF file.

In project 2, you may have noticed that with a large grid-world maze, it takes the agent a long time to learn a value table. One way to address this challenge is to use a neural network to approximate the value function, as discussed in lecture 10. Two options are provided below; you may choose either one to implement.

a. Based on your results in project 2, you can build a neural network to approximate the Q table you obtained. The network then generates the Q values that guide the agent toward the goal.
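Option (a) amounts to supervised regression: treat the tabular Q values as targets and fit a network to reproduce them. A minimal NumPy sketch, assuming a 5x5 grid (25 states, 4 actions) and a random placeholder in place of the actual project 2 Q table:

```python
import numpy as np

np.random.seed(0)

# Placeholder standing in for the Q table learned in project 2:
# 25 states (5x5 grid) x 4 actions (up, down, left, right).
n_states, n_actions = 25, 4
q_table = np.random.uniform(-1.0, 1.0, size=(n_states, n_actions))

X = np.eye(n_states)   # one-hot state encodings as network inputs
Y = q_table            # regression targets

# Tiny two-layer MLP trained by full-batch gradient descent on MSE.
hidden = 32
rng = np.random.default_rng(0)
W1 = rng.normal(0.0, 0.1, size=(n_states, hidden))
b1 = np.zeros(hidden)
W2 = rng.normal(0.0, 0.1, size=(hidden, n_actions))
b2 = np.zeros(n_actions)
lr = 0.1

for epoch in range(5000):
    H = np.tanh(X @ W1 + b1)            # forward pass, hidden activations
    err = (H @ W2 + b2) - Y             # prediction error
    dW2 = H.T @ err / n_states          # backprop: output layer
    db2 = err.mean(axis=0)
    dH = (err @ W2.T) * (1.0 - H**2)    # backprop through tanh
    dW1 = X.T @ dH / n_states
    db1 = dH.mean(axis=0)
    W2 -= lr * dW2; b2 -= lr * db2
    W1 -= lr * dW1; b1 -= lr * db1

Q_hat = np.tanh(X @ W1 + b1) @ W2 + b2
mse = np.mean((Q_hat - Y) ** 2)
agree = np.mean(np.argmax(Q_hat, axis=1) == np.argmax(Y, axis=1))
print(f"fit MSE: {mse:.4f}, greedy-action agreement: {agree:.2f}")
```

In your report, the agreement metric is the one that matters for behavior: as long as the network's greedy action matches the table's in every state, the agent follows the same path to the goal even if individual Q values differ slightly.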

b. You may choose to implement an actor-critic architecture (ADP) as discussed in lecture 10. In this case, you will need to build an action network and a critic network to learn the Q table from scratch. This may require more time, but I am happy to help with your project.
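A minimal sketch of the actor-critic idea, using linear (one-hot) function approximation in place of full networks and a hypothetical 4x4 grid with the goal in the far corner. The critic learns a state value by TD(0), and its TD error drives the actor's softmax policy update:

```python
import numpy as np

np.random.seed(1)

# Hypothetical 4x4 grid world: start at (0, 0), goal at (3, 3).
SIZE = 4
GOAL = (SIZE - 1, SIZE - 1)
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]   # up, down, left, right

def step(state, a):
    dr, dc = ACTIONS[a]
    nxt = (min(max(state[0] + dr, 0), SIZE - 1),
           min(max(state[1] + dc, 0), SIZE - 1))  # walls keep the agent inside
    reward = 1.0 if nxt == GOAL else -0.05        # small per-step cost
    return nxt, reward, nxt == GOAL

def one_hot(state):
    x = np.zeros(SIZE * SIZE)
    x[state[0] * SIZE + state[1]] = 1.0
    return x

theta = np.zeros((SIZE * SIZE, len(ACTIONS)))   # actor weights (softmax prefs)
w = np.zeros(SIZE * SIZE)                       # critic weights (state value)
gamma, alpha_actor, alpha_critic = 0.95, 0.1, 0.2

for episode in range(3000):
    s = (0, 0)
    for t in range(50):
        x = one_hot(s)
        prefs = x @ theta
        probs = np.exp(prefs - prefs.max())
        probs /= probs.sum()
        a = np.random.choice(len(ACTIONS), p=probs)
        s2, r, done = step(s, a)
        # TD error from the critic drives both updates.
        delta = r + (0.0 if done else gamma * (one_hot(s2) @ w)) - x @ w
        w += alpha_critic * delta * x
        grad_log = -probs
        grad_log[a] += 1.0                      # gradient of log softmax policy
        theta += alpha_actor * delta * np.outer(x, grad_log)
        s = s2
        if done:
            break

# Greedy rollout: the learned policy should walk from start to goal.
s, steps = (0, 0), 0
while s != GOAL and steps < 20:
    s, _, _ = step(s, int(np.argmax(one_hot(s) @ theta)))
    steps += 1
print(f"goal reached: {s == GOAL} in {steps} steps")
```

For the project itself you would replace the linear one-hot features with the two networks the option calls for; the update structure (critic TD error feeding the actor) stays the same.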

Report suggestions:
1. State which option you will implement and provide the pseudocode.
2. Design your own grid world example.

3. Show the Q-value table trained with the neural network and compare it with the one obtained in project 2.
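For item 3, reporting both a numeric error and how often the two tables agree on the greedy action gives a fuller picture than either alone. A sketch with hypothetical placeholder tables (here the network output is simulated as the tabular result plus small noise):

```python
import numpy as np

np.random.seed(2)

# Placeholder tables: the tabular result from project 2 and a stand-in
# for the network's output (the table plus small noise).
q_project2 = np.random.uniform(-1.0, 1.0, size=(25, 4))
q_network = q_project2 + np.random.normal(0.0, 0.05, size=(25, 4))

# Root-mean-square error over all state-action entries.
rmse = np.sqrt(np.mean((q_network - q_project2) ** 2))
# Fraction of states where both tables pick the same greedy action.
agreement = np.mean(np.argmax(q_network, axis=1) == np.argmax(q_project2, axis=1))
print(f"RMSE: {rmse:.4f}, greedy-action agreement: {agreement:.2%}")
```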

