November 25, 2023

Machine Learning Assignment Writing | MiniProject 2: IMDB Sentiment Analysis

# because we are generating random data, set a random seed
set.seed(1)
# generate values in x spread evenly from 0 to 20
x <- seq(from=0, to=20, by=0.05)
# generate y according to the following known function of x
y <- 500 + 0.4 * (x-10)^3
# add random noise to y
noise <- rnorm(length(x), mean=10, sd=80)
noisy.y <- y + noise
# plot data
# red line for true underlying function generating y
{
plot(x,noisy.y)
lines(x, y, col="red")
}

a. With predictor x and outcome noisy.y, split the data into a training set and a test set.

b. Perform 10-fold CV for polynomials from degree 1 to 5 (use MSE as your error measure). This should
be done from scratch using a for loop. (Hint: It may be helpful to randomly permute and then split
the training set from the previous section into 10 evenly sized parts. You may need an if statement to
handle a potential problem in the last iteration of your loop.)

c. Plot the best model’s fitted line in blue and compare to the true function (the red line from the previous
plot).

d. Comment on the results of (c). Why was performance better or worse at different polynomial orders?

e. Report the CV error and test error at each order of polynomial. Which achieves the lowest CV error?
How does the CV error compare to the test error? Comment on the results.
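
The fold logic described in the hint for part (b) can be sketched roughly as follows. This is only one possible layout, not a reference solution. The 80/20 split ratio and the names `dat`, `train`, `test`, `perm`, `fold.size`, and `cv.mse` are assumptions made for illustration; the assignment does not specify them.

```r
# Illustrative sketch: names and the 80/20 split are assumptions, not part of the assignment
set.seed(1)
x <- seq(from = 0, to = 20, by = 0.05)
y <- 500 + 0.4 * (x - 10)^3
noisy.y <- y + rnorm(length(x), mean = 10, sd = 80)
dat <- data.frame(x = x, noisy.y = noisy.y)

# Assumed 80/20 train/test split
train.idx <- sample(nrow(dat), size = floor(0.8 * nrow(dat)))
train <- dat[train.idx, ]
test <- dat[-train.idx, ]

k <- 10
degrees <- 1:5

# Randomly permute the training rows, then cut them into k consecutive blocks
perm <- sample(nrow(train))
fold.size <- floor(nrow(train) / k)
cv.mse <- numeric(length(degrees))

for (d in degrees) {
  fold.mse <- numeric(k)
  for (i in 1:k) {
    start <- (i - 1) * fold.size + 1
    end <- i * fold.size
    # The last fold absorbs any leftover rows when nrow(train) is not divisible by k
    if (i == k) end <- nrow(train)
    val.rows <- perm[start:end]
    # Fit on the other k - 1 folds, evaluate MSE on the held-out fold
    fit <- lm(noisy.y ~ poly(x, d), data = train[-val.rows, ])
    pred <- predict(fit, newdata = train[val.rows, ])
    fold.mse[i] <- mean((train$noisy.y[val.rows] - pred)^2)
  }
  cv.mse[d] <- mean(fold.mse)
}

# Degree with the lowest average held-out MSE
best.degree <- degrees[which.min(cv.mse)]
```

The `if` statement is where the "potential problem in the last iteration" is handled: without it, rows beyond `k * fold.size` would never be used for validation. The test error in part (e) could be computed the same way by refitting each degree on all of `train` and predicting on `test`.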

3. Classifying a toy dataset

a. Pick a new dataset from the mlbench package (one we haven’t used in class that is 2-dimensional with
two classes; Hint: run ls("package:mlbench")). Experiment with classifying the data using KNN at
different values of k. Use cross-validation to choose your best model.

b. Plot misclassification error rate at different values of k.

c. Plot the decision boundary for your classifier using the function in the top code block,
plot_decision_boundary(). Make sure you load this function into memory before trying to
use it.

4. Performance measures for classification

Recall the Caravan data from the week 2 lab (part of the ISLR package). Train a KNN model with k=2 using
all the predictors in the dataset and the outcome Purchase. Create a confusion matrix with the test set
predictions and the actual values of Purchase. Using the values of the confusion matrix, calculate precision,
recall, and F1. (Note that Yes is the positive class and the confusion matrix may be differently oriented
than the one presented in class.)
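
As a small illustration of how precision, recall, and F1 are read off a confusion matrix, here is a sketch on made-up labels. The vectors `actual` and `predicted` are toy data invented for this example and are not Caravan results. Indexing the table by the names "Yes" and "No" rather than by position is one way to stay safe if the matrix is oriented differently.

```r
# Toy labels for illustration only; these are NOT Caravan predictions
actual <- factor(c("Yes", "No", "No", "Yes", "No", "Yes", "No", "No", "Yes", "No"),
                 levels = c("No", "Yes"))
predicted <- factor(c("Yes", "No", "Yes", "No", "No", "Yes", "No", "No", "Yes", "Yes"),
                    levels = c("No", "Yes"))

# Rows are predictions, columns are actual values
conf.mat <- table(predicted = predicted, actual = actual)

# Index by label name so the result does not depend on row/column order
tp <- conf.mat["Yes", "Yes"]  # predicted Yes, actually Yes
fp <- conf.mat["Yes", "No"]   # predicted Yes, actually No
fn <- conf.mat["No", "Yes"]   # predicted No, actually Yes

precision <- tp / (tp + fp)
recall <- tp / (tp + fn)
f1 <- 2 * precision * recall / (precision + recall)
```

On these toy labels there are 3 true positives, 2 false positives, and 1 false negative, so precision is 0.6, recall is 0.75, and F1 is about 0.667.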

5. ISLR Chapter 5 Exercise 3
