Python Assignment Help | SIT 720 – Machine Learning Assessment Task 3 (40 marks)

This Python assignment covers linear regression and logistic regression in machine learning.

SIT 720 – Machine Learning
Assessment Task 3 (40 marks)

Part 1: Linear Regression: (25 marks)
1. Load the dataset and split the data for training and testing: use the data from the last
two years (2015 and 2016) for testing. Then exclude the recording_date_time column from both
the training and test sets. Display the shape of the training and test sets. (3 marks)
In [ ]:
# INSERT your code (or comment) here
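A minimal sketch of a year-based split, assuming the data is loaded into a pandas DataFrame and that recording_date_time can be parsed as a datetime. The tiny frame below is a hypothetical stand-in for the real dataset; only the column names recording_date_time, temperature and pressure come from the task text.

```python
import pandas as pd

# Hypothetical stand-in for the assignment's dataset (values are made up).
df = pd.DataFrame({
    "recording_date_time": pd.to_datetime(
        ["2013-05-01", "2014-07-15", "2015-01-10", "2016-03-20", "2016-11-05"]
    ),
    "temperature": [18.2, 24.5, 3.1, 9.8, 6.4],
    "pressure": [1012.0, 1008.5, 1021.3, 1015.7, 1018.9],
})

# Rows recorded in 2015 or 2016 form the test set; all other rows are training data.
is_test = df["recording_date_time"].dt.year.isin([2015, 2016])

# Drop the timestamp column from both sets after the split has been made.
train = df.loc[~is_test].drop(columns=["recording_date_time"])
test = df.loc[is_test].drop(columns=["recording_date_time"])

print("train shape:", train.shape)
print("test shape:", test.shape)
```

The split is done before dropping the column because the timestamp is what decides which set each row belongs to.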
2. Consider ‘temperature’ as the target. List the insignificant features for predicting
temperature, if any. Explain your findings. (5 marks)
[Hint for students: See the “7.3 Relevance and Covariance among features or
variables” for more information.]
In [ ]:
# INSERT your code (or comment) here
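One possible way to screen for weak features is to look at each feature's Pearson correlation with the target. The sketch below uses synthetic data; the columns humidity and noise_col and the cut-off of 0.1 are assumptions made for illustration, not part of the assignment.

```python
import numpy as np
import pandas as pd

# Synthetic data: 'humidity' drives the target, 'noise_col' is unrelated by construction.
rng = np.random.default_rng(0)
n = 2000
humidity = rng.normal(size=n)
df = pd.DataFrame({
    "humidity": humidity,
    "noise_col": rng.normal(size=n),
    "temperature": -2.0 * humidity + rng.normal(scale=0.5, size=n),
})

# Pearson correlation of every feature with the target column.
corr_with_target = df.corr()["temperature"].drop("temperature")

# Hypothetical cut-off: features with |r| below it are flagged as weak candidates.
threshold = 0.1
weak = corr_with_target[corr_with_target.abs() < threshold].index.tolist()

print(corr_with_target)
print("weakly correlated features:", weak)
```

A low linear correlation only suggests that a feature contributes little to a linear model; it does not rule out a non-linear relationship, so the findings still need to be explained in words.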
3. Now create a linear model considering the ‘temperature’ as the target variable and other
columns as features (you can optionally remove non-contributing features). Show the test
performance (as Mean Absolute Error, MAE) of the model. (5 marks)
In [ ]:
# INSERT your code (or comment) here
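A sketch of fitting a linear model and reporting test MAE with scikit-learn. The feature matrix is synthetic and the row-based split is only a placeholder for the year-based split of the real data.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error

# Synthetic features and target (stand-ins for the weather columns).
rng = np.random.default_rng(42)
X = rng.normal(size=(300, 3))
y = X @ np.array([1.5, -2.0, 0.5]) + 10.0 + rng.normal(scale=0.5, size=300)

# Placeholder split: first 200 rows for training, remaining 100 for testing.
X_train, X_test = X[:200], X[200:]
y_train, y_test = y[:200], y[200:]

# Fit ordinary least squares on the training rows only.
model = LinearRegression().fit(X_train, y_train)

# MAE is the mean of |y_true - y_pred| over the test rows.
mae = mean_absolute_error(y_test, model.predict(X_test))
print(f"Test MAE: {mae:.3f}")
```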
4. Find the feature which shows maximum correlation with “pressure”. Create a linear
regression model to predict temperature using these two features (‘pressure’ and the one
which shows maximum correlation). Compare the performance of this simplified model with
the model developed in the previous question (Q-3). Explain the performance variation, if
any. (6 marks)
In [ ]:
# INSERT your code (or comment) here
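The two-feature model could be sketched as follows. The data is synthetic, the columns humidity and wind_speed are hypothetical, and excluding the target from the correlation search is an assumption about how the question is meant to be read.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error

# Synthetic data in which 'humidity' is strongly tied to 'pressure'.
rng = np.random.default_rng(3)
n = 1000
pressure = rng.normal(size=n)
df = pd.DataFrame({
    "pressure": pressure,
    "humidity": -0.8 * pressure + rng.normal(scale=0.3, size=n),
    "wind_speed": rng.normal(size=n),
})
df["temperature"] = 2.0 * df["humidity"] - df["pressure"] + rng.normal(scale=0.5, size=n)

# Correlation of each candidate feature with 'pressure' (target and 'pressure' itself excluded).
corr = df.drop(columns=["temperature"]).corr()["pressure"].drop("pressure")
best = corr.abs().idxmax()  # feature with the largest absolute correlation

# Fit a model on just the two selected features and score it on held-out rows.
features = ["pressure", best]
train, test = df.iloc[:700], df.iloc[700:]
model = LinearRegression().fit(train[features], train["temperature"])
mae = mean_absolute_error(test["temperature"], model.predict(test[features]))

print("most correlated with pressure:", best)
print(f"two-feature test MAE: {mae:.3f}")
```

Using the absolute value matters here because a strong negative correlation is as informative as a strong positive one.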
5. Apportion the complete dataset into training and test sets, with a 40-60 split. (6 marks)
(a) Train a linear regression model without considering overfitting scenario and report the
test performance.
(b) Create an optimal regularised linear regression model and report the test performance.
(c) Explain the reason behind the performance variation, if any.
In [ ]:
# INSERT your answer in a maximum of five sentences.
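Parts (a) and (b) might be set up as below. This is a sketch under stated assumptions: the split is read as 40% training and 60% testing, ridge (L2) regression is used as the regularised model, and the candidate alpha grid is arbitrary. Lasso or elastic net would be equally valid choices. The data is synthetic.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, RidgeCV
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split

# Synthetic data: only the first two of ten features carry signal.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 10))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=1.0, size=200)

# Assumed reading of "40-60": 40% of rows for training, 60% for testing.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, train_size=0.4, random_state=0
)

# (a) Plain least squares, no regularisation.
plain = LinearRegression().fit(X_train, y_train)
mae_plain = mean_absolute_error(y_test, plain.predict(X_test))

# (b) Ridge regression with the penalty strength chosen by cross-validation
# on the training set from a hypothetical grid of candidate values.
alphas = np.logspace(-3, 3, 13)
ridge = RidgeCV(alphas=alphas).fit(X_train, y_train)
mae_ridge = mean_absolute_error(y_test, ridge.predict(X_test))

print(f"plain MAE: {mae_plain:.3f}")
print(f"ridge MAE: {mae_ridge:.3f} (alpha={ridge.alpha_:.3g})")
```

Selecting alpha by cross-validation on the training rows keeps the test set untouched until the final comparison.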
Part 2: Logistic Regression: (9 marks)
1. Can the same target (temperature, mentioned in Part-1) be used for logistic regression?
Why? (2 marks)
In [ ]:
# INSERT your code (or comment) here
2. Split the dataset 70-30% for training and testing. Create a logistic regression model to
predict ‘precip_type’. Report the model's accuracy in predicting whether
“precip_type” is “rain” or not (use a decision threshold of 0.45). (5 marks)
In [ ]:
# INSERT your code (or comment) here
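Applying a custom decision threshold means working from predicted probabilities rather than from `predict`, which uses 0.5 for binary problems. The sketch below assumes precip_type has been encoded as 1 for "rain" and 0 otherwise; the data and the encoding are illustrative assumptions.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic features; label 1 stands for precip_type == "rain", 0 for anything else.
rng = np.random.default_rng(7)
X = rng.normal(size=(400, 2))
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=400) > 0).astype(int)

# 70-30 split, stratified so both sets keep a similar class balance.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

clf = LogisticRegression().fit(X_train, y_train)

# Probability of the "rain" class: look up its column via clf.classes_.
rain_col = list(clf.classes_).index(1)
proba_rain = clf.predict_proba(X_test)[:, rain_col]

# Predict "rain" whenever the probability reaches the 0.45 threshold.
y_pred = (proba_rain >= 0.45).astype(int)
accuracy = accuracy_score(y_test, y_pred)
print(f"accuracy at threshold 0.45: {accuracy:.3f}")
```

Lowering the threshold below 0.5 makes the model predict "rain" more readily, which tends to raise recall for that class at the cost of precision.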
3. Discuss the test performance using precision, recall and confusion matrix. (2 marks)
In [ ]:
# INSERT your code (or comment) here
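The three metrics can be computed with scikit-learn as sketched below. The label vectors are made up for illustration (1 = "rain", 0 = not rain) and do not come from the assignment data.

```python
from sklearn.metrics import confusion_matrix, precision_score, recall_score

# Hypothetical true and predicted labels for ten test rows.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 1, 0, 1, 0, 0, 1, 0]

# With labels=[0, 1], rows are true classes and columns are predicted classes:
# [[TN, FP],
#  [FN, TP]]
cm = confusion_matrix(y_true, y_pred, labels=[0, 1])
tn, fp, fn, tp = cm.ravel()

# precision = TP / (TP + FP), recall = TP / (TP + FN)
precision = precision_score(y_true, y_pred)
recall = recall_score(y_true, y_pred)

print(cm)
print(f"precision={precision:.2f}, recall={recall:.2f}")
```

Here there are 4 true positives, 1 false positive and 1 false negative, so precision and recall both equal 4/5 = 0.8.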
Part 3: Objective function optimisation: (6 marks)
Consider the line graphs shown below and answer the following questions. [Hint: See
weekly content 7.4-7.10.]
[Figure: line graphs (a) and (b)]
a. Which of the above figures represents the convex objective function and why? (1 mark)
b. Which hyper-parameter helps the optimisation reach the convergence point, and what is the
impact of the value selected for it? (2 marks)
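To illustrate how a step-size choice affects convergence, the sketch below runs plain gradient descent on a simple convex function. It assumes the hyper-parameter of interest is the learning rate; the function f(w) = (w - 3)^2, the starting point and the three rates are arbitrary choices for illustration.

```python
def gradient_descent(learning_rate, steps=50, w=0.0):
    """Minimise f(w) = (w - 3)^2 with fixed-step gradient descent."""
    for _ in range(steps):
        grad = 2.0 * (w - 3.0)      # derivative of (w - 3)^2
        w = w - learning_rate * grad
    return w

# The minimum is at w = 3. Each update scales the error (w - 3) by (1 - 2 * learning_rate).
w_small = gradient_descent(0.01)    # factor 0.98: moves toward 3, but slowly
w_medium = gradient_descent(0.1)    # factor 0.8: converges quickly
w_large = gradient_descent(1.1)     # factor -1.2: overshoots more each step and diverges

for name, w in [("0.01", w_small), ("0.1", w_medium), ("1.1", w_large)]:
    print(f"learning rate {name}: w after 50 steps = {w:.4f}, error = {abs(w - 3.0):.4g}")
```

On this function any rate between 0 and 1 shrinks the error at every step, while a rate above 1 makes it grow; for other objectives the usable range depends on the curvature.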