25 November 2023

R Data Assignment Help | INFO411/911: Data Mining and Knowledge Discovery Assignment 2

This is an R data mining programming assignment from Australia.

Questions

1. (3 marks) In this assignment we use the data in creditworthiness.csv, which was used in Task 2 of Assignment 1. As before, we wish to predict the credit rating that would be assigned to each individual. Recall that data on 2500 customers have been collected, and the credit rating for 1962 of them has been assessed as either A, B, or C, coded as 1, 2, or 3, respectively, with the remaining 538 needing to be classified. Write the code to split the dataset into a 50% training set and a 50% test set, including only the data with known ratings.
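A minimal sketch of such a split in base R, using a small made-up data frame in place of creditworthiness.csv. The column name `credit.rating` and the code 0 for customers without a rating are assumptions for illustration and should be adjusted to match the actual file.

```r
# Toy stand-in for creditworthiness.csv. The column names and the use of 0
# for "not yet rated" are assumptions, not taken from the real dataset.
credit <- data.frame(
  income        = rep(1:5, times = 4),
  credit.rating = rep(c(1, 2, 3, 0), times = 5)
)

# Keep only customers whose rating is known (1 = A, 2 = B, 3 = C).
known <- credit[credit$credit.rating %in% 1:3, ]

# Fix the seed so the random split is reproducible.
set.seed(42)

# Draw half of the row positions at random for training; the rest form the test set.
train_idx <- sample(nrow(known), size = floor(nrow(known) / 2))
train <- known[train_idx, ]
test  <- known[-train_idx, ]
```

With an odd number of known ratings, `floor()` puts the extra row in the test set; either rounding choice gives an approximately 50/50 split.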

2. Using default settings, fit a decision tree to the training set to predict the credit ratings of customers using all of the other variables in the dataset.

(a) (2 marks) Report the resulting tree.

(b) (2 marks) Based on this output, predict the credit rating of a hypothetical “median” customer, i.e., one with the attributes listed in Table 1, showing the steps involved.

(c) (2 marks) Produce the confusion matrix for predicting the credit rating from this tree on the test set, and also report the overall accuracy rate.

(d) (5 marks) What is the numerical value of the gain in entropy corresponding to the first split at the top of the tree? (Use logarithms to base 2, and show the details of the calculation rather than just providing a final answer.)

(e) (2 marks) Fit a random forest model to the training set to try to improve prediction. Report the R output.

(f) (2 marks) Produce the confusion matrix for predicting the credit rating from this forest on the test set, and also report the overall accuracy rate.
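For part (d), the gain in entropy of a binary split can be written out as below. The notation is introduced here rather than taken from the assignment: $S$ is the set of training cases at the root, $p_k$ is the proportion of cases in $S$ with rating $k$, and $S_L$, $S_R$ are the left and right child nodes with $n_L$ and $n_R$ of the $n$ cases.

$$
H(S) = -\sum_{k=1}^{3} p_k \log_2 p_k, \qquad
\text{Gain} = H(S) - \frac{n_L}{n} H(S_L) - \frac{n_R}{n} H(S_R)
$$

As an illustration with invented counts (two classes only, not the fitted tree): a root with 4 cases of each class has $H(S) = 1$ bit. If the split sends 3 and 1 cases to the left and 1 and 3 to the right, each child has

$$
H = -\left(\tfrac{3}{4}\log_2\tfrac{3}{4} + \tfrac{1}{4}\log_2\tfrac{1}{4}\right) \approx 0.8113,
\qquad
\text{Gain} \approx 1 - \tfrac{4}{8}(0.8113) - \tfrac{4}{8}(0.8113) \approx 0.1887 .
$$

The same steps apply to the real tree, using the class counts reported at the root and at its two child nodes.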

3. Using default settings for svm() from the e1071 package, fit a support vector machine to predict the credit ratings of customers using all of the other variables in the dataset.

(a) (2 marks) Predict the credit rating of a hypothetical “median” customer, i.e., one with the attributes listed in Table 1. Report decision values as well.

(b) (2 marks) Produce the confusion matrix for predicting the credit rating from this SVM on the test set, and also report the overall accuracy rate.

(c) (2 marks) Automatically or manually tune the SVM to improve prediction over that found in 3b. Report the resulting SVM settings and the resulting confusion matrix for predicting the test set. (Any amount of improvement is acceptable.)
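The confusion matrix and overall accuracy rate requested for each model can be computed in base R with `table()`. The sketch below uses made-up label vectors; in practice `predicted` would come from `predict()` on the fitted model and `actual` from the test set.

```r
# Hypothetical true and predicted ratings for ten test-set customers
# (invented values, not taken from creditworthiness.csv).
actual    <- factor(c(1, 1, 2, 2, 2, 3, 3, 1, 2, 3), levels = 1:3)
predicted <- factor(c(1, 2, 2, 2, 1, 3, 2, 1, 2, 3), levels = 1:3)

# Cross-tabulate: rows are predicted classes, columns are actual classes.
conf_mat <- table(predicted = predicted, actual = actual)
print(conf_mat)

# Overall accuracy: correctly classified cases (the diagonal) over all cases.
accuracy <- sum(diag(conf_mat)) / sum(conf_mat)
print(accuracy)
```

Fixing the factor levels to `1:3` on both vectors keeps the table square even if a model never predicts one of the classes, so `diag()` still lines up with the correct cells.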

4. Fit the Naive Bayes model to predict the credit ratings of customers using all of the other variables in the dataset.

(a) (2 marks) Predict the credit rating of a hypothetical “median” customer, i.e., one with the attributes listed in Table 1. Report predicted probabilities as well.

(b) (2 marks) Reproduce the first 20 or so lines of the R output for the Naive Bayes fit, and use them to explain the steps involved in making this prediction.

(c) Produce the confusion matrix for predicting the credit rating from Naive Bayes on the test set, and also report the overall accuracy rate.
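The steps behind a Naive Bayes prediction follow from Bayes' rule with the attributes treated as conditionally independent given the class. In the notation below (introduced here, not from the assignment), $C$ is the credit rating, $k$ indexes the three ratings, and $x_1, \dots, x_p$ are the customer's attribute values:

$$
P(C = k \mid x_1, \dots, x_p) =
\frac{P(C = k) \prod_{j=1}^{p} P(x_j \mid C = k)}
     {\sum_{l=1}^{3} P(C = l) \prod_{j=1}^{p} P(x_j \mid C = l)}
$$

The prior $P(C = k)$ and each conditional $P(x_j \mid C = k)$ are typically what the fitted model's printed output lists first, so the prediction amounts to multiplying the relevant entries for each rating, normalising, and choosing the rating with the largest posterior.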

5. Based on the confusion matrices reported in the preceding parts,

(a) (2 marks) Which of the classifiers look to be the best? (Be specific, and specify the figures you used to answer this question.)

(b) (2 marks) Which look to be the worst? (Be specific, and specify the figures you used to answer this question.)

6. Consider a simpler problem of predicting whether a customer gets a credit rating of A or not.

(a) (2 marks) Fit a logistic regression model to predict whether a customer gets a credit rating of A using all of the other variables in the dataset, with no interactions.

(b) (2 marks) Report the summary table of the logistic regression model fit.

(d) (2 marks) Fit an SVM model of your choice to the training set.

(e) (3 marks) Produce an ROC chart comparing the logistic regression and the SVM results of predicting the test set. Comment on any differences in their performance.
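One way to sketch the logistic regression and its ROC curve using only base R is shown below. The data are simulated, and the names `is_A`, `x`, and `roc_points` are hypothetical; the real analysis would use the training and test sets built from creditworthiness.csv.

```r
# Simulated stand-in data: one predictor and a binary outcome
# (1 = "rated A", 0 = "not A"). Not the assignment's dataset.
set.seed(1)
n <- 200
x <- rnorm(n)
is_A <- rbinom(n, size = 1, prob = plogis(1.5 * x))
dat <- data.frame(x = x, is_A = is_A)
train <- dat[1:100, ]
test  <- dat[101:200, ]

# Logistic regression on all other variables, with no interactions.
fit <- glm(is_A ~ ., data = train, family = binomial)
score <- predict(fit, newdata = test, type = "response")

# ROC points: for each threshold, classify score >= threshold as "A" and
# record the false positive rate and true positive rate.
roc_points <- function(score, label) {
  thresholds <- sort(unique(c(Inf, score)), decreasing = TRUE)
  tpr <- sapply(thresholds, function(t) mean(score[label == 1] >= t))
  fpr <- sapply(thresholds, function(t) mean(score[label == 0] >= t))
  data.frame(fpr = fpr, tpr = tpr)
}

roc_glm <- roc_points(score, test$is_A)

# Draw the curve, with the diagonal as the "random guess" reference line.
plot(roc_glm$fpr, roc_glm$tpr, type = "s",
     xlab = "False positive rate", ylab = "True positive rate",
     main = "ROC curve (logistic regression)")
abline(0, 1, lty = 2)
```

A second curve for the SVM could be overlaid with `lines()` by passing the SVM's decision values for the test set to the same `roc_points()` helper; the curve that sits closer to the top-left corner indicates better discrimination.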
