
**Instructions:**

The assignment contains 3 problems worth a total of 100 points, which will count towards
15% of the final mark for the course. If you typeset your assignment neatly with LaTeX and
knitr, you can earn up to a maximum of 0.75% towards the final mark for the course as
extra credit.

Use tables, graphs and concise text explanations to support your answers. Unclear answers
may not be marked, at your own cost. All tables and graphs must be clearly commented
and identified.

No late submission is allowed.

**Data:** In the assignment you will analyze some rainfall data. The dataset is available in .txt
format on the LMS website. To load the data into R you can use the function read.table()
or any command of your choice. You may need to manipulate the data format (data frames
or matrices) depending on the task. The data are separated into a training set and a test set.
The training set contains $p = 365$ explanatory variables $X_1, \ldots, X_p$ and one class membership
($G = 0$ or $1$) for $n_{\text{train}} = 150$ individuals. The test set contains $p = 365$ explanatory variables
$X_1, \ldots, X_p$ and one class membership ($G = 0$ or $1$) for $n_{\text{test}} = 41$ individuals.

In these data, for each individual, $X_1, \ldots, X_p$ correspond to the amount of rainfall on each
of the $p = 365$ days in a year. Each individual in this case is a place in Australia coming either
from the North ($G = 0$) or from the South ($G = 1$) of the country. Thus, the two classes (North
and South) are coded by 0 and 1.

You will use the training data to fit your models or train classifiers. Once you have fitted
your model or trained your classifiers with the training data, you will need to check how well
the fitted models/trained classifiers work on the test data.

The test and training data are placed in different text files: XGtrainRain.txt, which
contains the training X data (values of the $p$ explanatory X-variables) for $n_{\text{train}} = 150$
individuals as well as their class (0 or 1) labels, and XGtestRain.txt, which contains the test X
data (values of the $p$ explanatory X-variables) for $n_{\text{test}} = 41$ individuals as well as their
class (0 or 1) labels. The test class membership is provided to you ONLY TO COMPUTE THE ERROR OF
CLASSIFICATION of your classifier.
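
As a rough illustration of the loading step, the sketch below writes a tiny stand-in file and reads it back with read.table(). The layout is an assumption: the real XGtrainRain.txt may or may not have a header row, and the label column may not be called `G`, so check the file before adapting this.

```r
# Hypothetical stand-in for XGtrainRain.txt: 5 places, 3 "days", plus a label column.
# The real file layout (header row, name of the label column) is an assumption here.
toy <- data.frame(X1 = c(0.0, 1.2, 3.4, 0.1, 2.2),
                  X2 = c(0.5, 0.0, 2.1, 0.0, 1.9),
                  X3 = c(1.1, 0.3, 0.0, 0.2, 0.4),
                  G  = c(0, 1, 1, 0, 1))
path <- tempfile(fileext = ".txt")
write.table(toy, path, row.names = FALSE, quote = FALSE)

# With the real data this would be something like
# read.table("XGtrainRain.txt", header = TRUE)
train <- read.table(path, header = TRUE)

# Separate predictors (numeric matrix) from class labels
X_train <- as.matrix(train[, names(train) != "G"])
G_train <- train$G                      # 0 = North, 1 = South

dim(X_train)
table(G_train)
```

Keeping the predictors as a matrix suits prcomp() and plsr(), while glm() is usually easier to call with a data frame.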

Please include all the necessary R code to answer the questions, but not superfluous
R code that is not relevant. Marks may be taken off for R code that is
poorly presented.

You may take classification error/test error to be the proportion/percentage out of the
41 test samples that are misclassified.
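
The error defined above is just the mean of a logical comparison. A minimal sketch, using made-up label vectors rather than anything from the rainfall data:

```r
# Hypothetical labels for illustration only (not results from the rainfall data)
truth <- c(0, 0, 1, 1, 1, 0, 1, 0)
pred  <- c(0, 1, 1, 1, 0, 0, 1, 0)

test_error <- mean(pred != truth)        # proportion misclassified
test_error                               # 2 of the 8 labels differ, so 0.25

table(Predicted = pred, Actual = truth)  # confusion table
```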

**Problem 1 [60 marks]:**

In this problem you will train the quadratic discriminant analysis (QDA) and logistic regression
classifiers to predict the class labels (0 or 1) in the test set.

(a) Use standard functions in R to train the QDA classifier and the logistic classifier, with all
the $p$ predictors in the training set. What happened? And why did it happen? Do you
recommend using these two classifiers on the test set? (Hint: For the logistic classifier,
use the summary function to take a look at the trained model object.) [10]
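
For part (a), one way to run both fits without the script stopping at the first problem is to capture errors and warnings explicitly. The sketch below uses simulated data with more predictors than observations. It only shows the mechanics of calling qda() from MASS and glm(), and is not a statement about what the rainfall data will produce.

```r
library(MASS)  # provides qda()

set.seed(1)
n <- 30; p <- 40                         # simulated: more predictors than observations
X <- matrix(rnorm(n * p), n, p)
G <- rep(c(0, 1), each = n / 2)
dat <- data.frame(G = G, X)

# qda() may stop with an error; capture the message instead of aborting the script
qda_msg <- tryCatch({ qda(G ~ ., data = dat); "qda fitted without error" },
                    error = function(e) conditionMessage(e))
qda_msg

# glm() usually still returns an object; collect its warnings and inspect the coefficients
glm_warnings <- character(0)
logit_fit <- withCallingHandlers(
  glm(G ~ ., data = dat, family = binomial),
  warning = function(w) {
    glm_warnings <<- c(glm_warnings, conditionMessage(w))
    invokeRestart("muffleWarning")
  })
unique(glm_warnings)
sum(is.na(coef(logit_fit)))              # coefficients that could not be estimated
# summary(logit_fit) prints the full coefficient table
```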

(b) Use the prcomp and plsr (package pls) functions to obtain, respectively, the PCA and
PLS (partial least squares) components of the explanatory variables in the training set.
Here, when considering the covariance maximization problem of PLS, we maximise the
covariance between $X = (X_1, \ldots, X_p)^T$ and $Y = \mathbf{1}\{G = 1\}$, the indicator variable that an
individual belongs to group 1. For each case, you will need to use the "projection matrix"
(i.e., for PCA and for PLS discussed in class) reported by the function to re-compute
the components "manually" to check that you understand how the components are
obtained. [10]
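
The manual re-computation in (b) amounts to centring the data with the stored means and multiplying by the stored matrix. The sketch below checks this on simulated data. It assumes the `rotation` element of a prcomp object and the `projection` and `Xmeans` elements of a plsr fit are the matrices meant in class, which you should confirm against the lecture notes. The PLS part runs only if the pls package is installed.

```r
set.seed(2)
n <- 40; p <- 8
X <- matrix(rnorm(n * p), n, p)
G <- rbinom(n, 1, 0.5)
Y <- as.numeric(G == 1)                  # indicator 1{G = 1}

# PCA: scores are the centred data times the rotation (loading) matrix
pca <- prcomp(X)                         # centres by default, does not rescale
Xc  <- scale(X, center = pca$center, scale = FALSE)
pca_manual <- Xc %*% pca$rotation
max(abs(pca_manual - pca$x))             # should be numerically zero

# PLS: same idea with the projection matrix stored in the fitted object
if (requireNamespace("pls", quietly = TRUE)) {
  pls_fit <- pls::plsr(Y ~ X, ncomp = 3)
  Xc_pls  <- scale(X, center = pls_fit$Xmeans, scale = FALSE)
  pls_manual <- Xc_pls %*% pls_fit$projection
  max(abs(pls_manual - pls_fit$scores))  # should be numerically zero
}
```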

(c) Train a QDA classifier with the PLS components, and another one with the PCA
components. In each case, pick the number of components to use based on leave-one-out
cross-validation (LOOCV); consider up to 50 components. Plot the leave-one-out CV
error against the number of components considered. Report the final chosen number of
components. (Refer to the lab in Week 7 to get some ideas.)

Do the same for the logistic classifier.

(If you want to pick your number of components based on methods other than LOOCV,
please explain your choice in a clear and concise manner.)

[20]
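
A minimal sketch of the LOOCV curve in (c), on simulated data and with PCA components only. It relies on the `CV = TRUE` option of qda() in MASS for leave-one-out predictions and uses an explicit loop for the logistic classifier. As a simplification, the components are computed once on the full training set rather than recomputed in every fold, and the Week 7 lab may do this differently. `K` is set to 10 here to keep the run short.

```r
library(MASS)

set.seed(3)
n <- 60; p <- 100; K <- 10               # K = largest number of components tried
G <- rep(c(0, 1), each = n / 2)
X <- matrix(rnorm(n * p), n, p)
X[G == 1, 1:5] <- X[G == 1, 1:5] + 2      # simulated class difference

scores <- prcomp(X)$x                    # PCA scores of the training data

# QDA: qda(..., CV = TRUE) returns leave-one-out predictions directly
cv_qda <- sapply(1:K, function(k) {
  fit <- qda(scores[, 1:k, drop = FALSE], grouping = G, CV = TRUE)
  mean(fit$class != G)
})

# Logistic regression: explicit leave-one-out loop
cv_logit <- sapply(1:K, function(k) {
  dat <- data.frame(G = G, scores[, 1:k, drop = FALSE])
  wrong <- sapply(1:n, function(i) {
    fit  <- suppressWarnings(glm(G ~ ., data = dat[-i, ], family = binomial))
    prob <- suppressWarnings(predict(fit, newdata = dat[i, , drop = FALSE],
                                     type = "response"))
    (prob > 0.5) != (G[i] == 1)
  })
  mean(wrong)
})

plot(1:K, cv_qda, type = "b", ylim = range(c(cv_qda, cv_logit)),
     xlab = "Number of PCA components", ylab = "LOOCV error")
lines(1:K, cv_logit, type = "b", lty = 2, pch = 2)
legend("topright", legend = c("QDA", "Logistic"), lty = 1:2, pch = 1:2)

which.min(cv_qda)     # smallest LOOCV error for QDA (first minimum if tied)
which.min(cv_logit)   # smallest LOOCV error for logistic regression
```

The same loops apply to PLS components once the score matrix is replaced by the PLS scores.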

(d) For each of the QDA and logistic classifiers, which version (PCA or PLS) do you prefer?
Why? (Answer this question without any knowledge of the test-set results in the next
problem.) [5]

(e) Apply your trained classifiers in (c) to the test set, and report the resulting classification
error (test error). Be careful about how you should center the data in your test set to
produce your prediction. The lab in Week 7 may give you some ideas again. [15]
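
On the centring point in (e), a common approach is to centre the test rows with the training means before projecting, not with the test set's own means. The sketch below illustrates this for PCA on simulated data and compares it with predict() on the prcomp object. The variable names are hypothetical.

```r
set.seed(4)
p <- 6
X_train <- matrix(rnorm(50 * p, mean = 3), 50, p,
                  dimnames = list(NULL, paste0("X", 1:p)))
X_test  <- matrix(rnorm(10 * p, mean = 3), 10, p,
                  dimnames = list(NULL, paste0("X", 1:p)))

pca <- prcomp(X_train)

# Centre the test rows with the TRAINING means, then project onto the training loadings
test_scores_manual <- scale(X_test, center = pca$center, scale = FALSE) %*% pca$rotation

# predict() on a prcomp object applies the same stored centring
test_scores_predict <- predict(pca, newdata = X_test)

all.equal(test_scores_manual, test_scores_predict)
```

The resulting test scores can then be passed to the classifier trained on the training scores, keeping the same number of components.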
