
Proximal Policy Optimization

In this project, you will be using the open-source RL library stable-baselines3 to learn a policy for the same arm goal-reaching task from the previous project.

Project Setup

For this project, we will again use Anaconda as our Python virtual environment manager.

Check out Project 5 from the SVN server:

Install the virtual environment:

cd project5
conda create --name project5 --file spec-file.txt

You can then activate and deactivate the virtual environment anywhere in the terminal with:

conda activate project5
conda deactivate

Important:

DO NOT install any other libraries or dependencies, or a different version of an already provided package. The autograder will give you a 0 if you import libraries that are not specified in the spec-file.txt.
If you are concerned you may have accidentally imported something that isn’t in the spec-file.txt, delete your conda environment and re-create it, then re-run your code to see if it still runs without error. You can also test your code on the MechTech lab machines, which don’t have any additional libraries installed.

Starter Code Explanation

Make sure to watch the lecture from 4/26 for details on how to run this project.

In addition to code you are already familiar with from the previous project (i.e. arm dynamics, etc.), we are providing a partially implemented environment in the ArmEnv class. The environment “wraps around” the arm dynamics to provide the key functions that an RL algorithm expects. Your implementation must follow the gym API.
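As a quick illustration of that contract, the sketch below shows how an RL algorithm typically drives a gym environment: reset() starts an episode and step() advances it. This assumes the classic gym interface (step() returning a 4-tuple) and that ArmEnv can be constructed with no arguments, which may differ from the actual starter code.

from arm_env import ArmEnv  # assumption: module and class name from the starter code

env = ArmEnv()                                  # constructor arguments may differ
obs = env.reset()                               # begin an episode, get the first observation
for _ in range(200):                            # 200 steps is an arbitrary illustration
    action = env.action_space.sample()          # a trained policy would choose the action instead
    obs, reward, done, info = env.step(action)  # classic gym API: observation, reward, done, info
    if done:
        obs = env.reset()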

Instructions

You must complete the implementation of ArmEnv in arm_env.py and train() in train_ppo.py. Details are below.

ArmEnv

Unlike the previous project, you will be implementing the majority of the key functions. You are also expected to deliberate over various choices for setting up the environment. You get to decide the components of the observation space, and you can choose whatever reward function you deem appropriate for the task.
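To make the design space concrete, here is a minimal sketch of one possible choice: an observation made of the joint angles, joint velocities, and the goal position, with a dense reward equal to the negative distance from the end-effector to the goal. The names below are placeholders rather than the actual starter-code identifiers, and these are not the only (or necessarily the best) choices.

import numpy as np

def get_obs(joint_angles, joint_velocities, goal):
    # One possible observation: [q, q_dot, goal_x, goal_y].
    return np.concatenate([joint_angles, joint_velocities, goal]).astype(np.float32)

def compute_reward(end_effector_pos, goal):
    # Dense reward: closer to the goal gives a larger (less negative) reward.
    return -float(np.linalg.norm(np.asarray(end_effector_pos) - np.asarray(goal)))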

train()

Here, you must fill in the train() function that actually trains a policy using PPO. You can refer to the stable-baselines3 documentation; a minimal training sketch is included after the links below.

Documentation:

https://stable-baselines3.readthedocs.io/en/master/modules/ppo.html?highlight=ppo

gym api:

https://gym.openai.com/docs/
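As a minimal sketch (the hyperparameters, policy choice, and save path are illustrative assumptions, not required values), a train() built on stable-baselines3 might look like this:

from stable_baselines3 import PPO

def train(env, total_timesteps=500_000, save_path="models/ppo_arm"):
    # Train a PPO policy on the provided environment and save it to disk.
    model = PPO("MlpPolicy", env, verbose=1)
    model.learn(total_timesteps=total_timesteps)
    model.save(save_path)  # writes save_path + ".zip"
    return model

The saved .zip file is what enjoy_ppo.py loads, and it is the kind of artifact you would eventually copy and rename to final.zip for grading.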

Grading

The script enjoy_ppo.py can be used to test your code. This is how we will run your code for grading:

python3 enjoy_ppo.py --model_path final.zip

While developing, you can also test a pre-saved model, like so:

python3 enjoy_ppo.py --model_path models/2022-04-10.12-04-17/models.zip

You can pass the --gui flag to enjoy_ppo.py and then you will also see what the policy is doing.

You MUST take the final model you want to be scored with, copy it into the project root folder, rename it to final.zip, and commit it to SVN. Failure to do so will result in getting 0 points on the project. Remember to test your grade by doing a clean SVN checkout: check out your submission into a new directory, create and activate the virtual environment, and run the grader, without a single modification or addition to these steps.

The grader will run five episodes, each with a different goal. For each goal, we expect the end-effector to reach the desired location and then stay there. If the distance at which the end-effector stabilizes is below what we consider an easy-to-reach threshold, the script will award 1.5 points. If the distance is below a tighter threshold, it will award an additional 1.5 points. The maximum is thus 3 points for each goal, for a total of 15 for the project.
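For illustration only, the scoring described above amounts to something like the sketch below. The actual thresholds and the way the grader measures the stabilized distance are not published, so the values here are placeholder assumptions.

EASY_THRESHOLD = 0.05   # placeholder value, not the real grading threshold
TIGHT_THRESHOLD = 0.02  # placeholder value, not the real grading threshold

def score_goal(stabilized_distance):
    # 1.5 points for the easy threshold, 1.5 more for the tight one: max 3 per goal.
    points = 0.0
    if stabilized_distance < EASY_THRESHOLD:
        points += 1.5
    if stabilized_distance < TIGHT_THRESHOLD:
        points += 1.5
    return points  # five goals, so at most 15 points total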