Code Writing Service | ITO5201 Assessment 2

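This is a coding assignment from Australia covering two areas, latent variable models and neural networks. The details of the assignment are as follows: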

 

Assessment overview

This assessment consists of two parts, which cover latent variable models and neural networks (Modules 4 and 5).

The total marks of this assessment are 70.

Assessment details

Part A. Document Clustering

In this part, you solve a document clustering problem using unsupervised learning algorithms (i.e., soft and hard Expectation Maximization).

Question 1 [EM for Document Clustering, 40 Marks]

I. Derive the Expectation and Maximization steps of the hard-EM algorithm for document clustering, and show your work in your submitted PDF report. In particular, include all model parameters that should be learnt and the exact expressions (using the same math convention that we saw in Module 4) that should be used to update these parameters during the learning process (i.e., E step, M step and assignments).
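As a point of reference for the derivation, the hard-EM updates for a standard mixture-of-multinomials document model can be sketched as follows. The notation is an assumption and may not match the Module 4 convention: $\varphi_k$ is the prior of cluster $k$, $\mu_{k,w}$ is the proportion of word $w$ in cluster $k$, $c(w,d_n)$ is the count of word $w$ in document $d_n$, and $z_n$ is the cluster assigned to document $d_n$.

```latex
% Parameters to learn: \varphi_k and \mu_{k,w}, subject to
% \sum_k \varphi_k = 1 and \sum_w \mu_{k,w} = 1 for every cluster k.
\begin{align*}
\text{E step (hard assignment):}\quad
  z_n &= \arg\max_{k}\Big[\ln\varphi_k + \sum_{w} c(w,d_n)\,\ln\mu_{k,w}\Big] \\
\text{M step:}\quad
  \varphi_k &= \frac{N_k}{N}, \qquad N_k = \sum_{n=1}^{N} \mathbb{1}[z_n = k] \\
  \mu_{k,w} &= \frac{\sum_{n=1}^{N} \mathbb{1}[z_n = k]\, c(w,d_n)}
                   {\sum_{w'} \sum_{n=1}^{N} \mathbb{1}[z_n = k]\, c(w',d_n)}
\end{align*}
```

In the soft-EM counterpart, the indicator $\mathbb{1}[z_n = k]$ is replaced by the posterior responsibility of cluster $k$ for document $d_n$.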

II. Implement the hard-EM (which you derived above) and the soft-EM (derived in Latent variable models for document analysis from Module 4). Please provide enough comments in your submitted code.

Hint: If it helps, feel free to base your code on the provided code for the EM algorithm for GMM in Activity 1, Module 4, or on the codebase provided in Moodle.
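To illustrate how the two variants differ in code, here is a minimal sketch in plain Python of EM for a mixture of multinomials, where a single flag switches between hard and soft assignments. It is not the unit's codebase: the function names, the `smoothing` constant, the random initialisation and the toy count vectors are all assumptions made for illustration.

```python
import math
import random

def e_step(docs, phi, mu, hard):
    """Return one row of cluster responsibilities per document."""
    log_phi = [math.log(p) for p in phi]
    log_mu = [[math.log(m) for m in row] for row in mu]
    resp = []
    for doc in docs:
        # ln(phi_k) + sum_w c(w, d) * ln(mu_kw) for every cluster k
        scores = [lp + sum(c * lm for c, lm in zip(doc, row) if c)
                  for lp, row in zip(log_phi, log_mu)]
        top = max(scores)
        if hard:
            best = scores.index(top)  # winner takes all
            resp.append([1.0 if k == best else 0.0 for k in range(len(phi))])
        else:
            # subtract the maximum before exponentiating to avoid underflow
            weights = [math.exp(s - top) for s in scores]
            total = sum(weights)
            resp.append([w / total for w in weights])
    return resp

def m_step(docs, resp, n_clusters, smoothing=1e-3):
    """Re-estimate phi and mu; `smoothing` (an assumption) keeps them positive."""
    n_words = len(docs[0])
    phi, mu = [], []
    for k in range(n_clusters):
        n_k = sum(r[k] for r in resp)
        phi.append((n_k + smoothing) / (len(docs) + smoothing * n_clusters))
        counts = [smoothing + sum(r[k] * d[w] for r, d in zip(resp, docs))
                  for w in range(n_words)]
        total = sum(counts)
        mu.append([c / total for c in counts])
    return phi, mu

def run_em(docs, n_clusters, hard, n_iter=50, seed=0):
    """Alternate E and M steps for a fixed number of iterations."""
    rng = random.Random(seed)
    n_words = len(docs[0])
    phi = [1.0 / n_clusters] * n_clusters
    mu = []
    for _ in range(n_clusters):
        row = [rng.random() + 0.1 for _ in range(n_words)]
        total = sum(row)
        mu.append([v / total for v in row])
    for _ in range(n_iter):
        resp = e_step(docs, phi, mu, hard)
        phi, mu = m_step(docs, resp, n_clusters)
    return phi, mu, e_step(docs, phi, mu, hard)

# Toy word-count vectors (one row per document), invented for illustration.
docs = [[5, 4, 0, 0], [6, 3, 0, 0], [0, 0, 4, 5], [0, 0, 6, 3]]
phi_hard, mu_hard, resp_hard = run_em(docs, n_clusters=2, hard=True)
phi_soft, mu_soft, resp_soft = run_em(docs, n_clusters=2, hard=False)
```

The only difference between the two runs is the E step: hard-EM commits each document to a single cluster, while soft-EM keeps a probability for every cluster and lets each document contribute fractionally to every M-step update.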

III. Load the Task2A.txt file and the necessary libraries (if needed, perform text preprocessing similar to what we did in Activity 2, Module 4), set the number of clusters K=4, and run both the soft-EM and hard-EM algorithms on the provided data.

IV. Perform a PCA on the clusterings that you get from the hard-EM and soft-EM in the same way we did in Activity 2, Module 4. Then, visualize the obtained clusters with different colors, where the x and y axes are the first two principal components (similar to Activity 2 in Module 4). Attach the plots to your PDF report and, based on your plots, explain how and why the hard-EM and soft-EM results differ.
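The projection onto the first two principal components can be sketched without any numerical library, using power iteration with deflation on the covariance matrix. This is only an illustration under assumptions: the feature rows below are invented, the helper names are hypothetical, and Activity 2 may well use a library routine instead. Plotting is left out; the two score columns would serve as the x and y coordinates, colored by cluster label.

```python
import math
import random

def center(rows):
    """Subtract the column means so every feature has mean zero."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    return [[r[j] - means[j] for j in range(d)] for r in rows]

def covariance(centered):
    """Sample covariance matrix of rows that are already centered."""
    n, d = len(centered), len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
            for i in range(d)]

def mat_vec(m, v):
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def leading_eigenpair(m, n_iter=500, seed=0):
    """Power iteration: largest eigenvalue of m and a unit eigenvector."""
    rng = random.Random(seed)
    v = [rng.gauss(0.0, 1.0) for _ in m]
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    for _ in range(n_iter):
        w = mat_vec(m, v)
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    value = sum(a * b for a, b in zip(v, mat_vec(m, v)))
    return value, v

def first_two_components(rows):
    """Return 2-D scores, the two component vectors and their variances."""
    centered = center(rows)
    cov = covariance(centered)
    val1, pc1 = leading_eigenpair(cov)
    # Deflation: remove the first component so the next run finds the second.
    d = len(cov)
    deflated = [[cov[i][j] - val1 * pc1[i] * pc1[j] for j in range(d)]
                for i in range(d)]
    val2, pc2 = leading_eigenpair(deflated, seed=1)
    scores = [[sum(a * b for a, b in zip(r, pc1)),
               sum(a * b for a, b in zip(r, pc2))] for r in centered]
    return scores, (pc1, pc2), (val1, val2)

# Invented feature rows (for example, normalised word frequencies per document).
rows = [[2.0, 0.0, 1.0], [1.8, 0.2, 0.9], [0.1, 2.0, 1.1],
        [0.0, 1.9, 1.0], [1.0, 1.0, 3.0], [1.1, 0.9, 2.8]]
scores, (pc1, pc2), (val1, val2) = first_two_components(rows)
```

The sign of each component is arbitrary, so two runs can produce mirrored plots that still describe the same structure.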

Part B. Neural Network vs. Perceptron

In this part, you apply a 3-layer Neural Network to synthetically generated data to compare its performance with that of a Perceptron. Here, we are looking for your explanation of the differences between the perceptron and the NN that lead to different results.

Question 2 [Neural Network’s Decision Boundary, 30 Marks]

I. Load the Task2B_train.csv and Task2B_test.csv sets, plot the training data with the classes marked in different colors, and attach the plot to your PDF report.

II. Train two perceptron models on the loaded training data by setting the learning rate η to 0.01 and 0.09 respectively, using the code from Activity 1 in Module 3.

Calculate the test errors of the two models and find the best η and its corresponding model. Then plot the test data with the points colored by their estimated class labels using the best model that you have selected, and attach the plot to your PDF report.

Hint: Note that you must remove NA records from the datasets (using the “complete.cases()” function). You may also choose to change the labels from [0, 1] to [-1, +1] for your convenience. If you decide to use the code from Activity 1 in Module 3, you may need to change some initial settings (e.g., epsilon and tau.max). Finally, remember that the perceptron is sensitive to initial weights. Therefore, we recommend running your code a few times with different initial weights.
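The steps in this hint (dropping incomplete records, remapping labels, training with two learning rates and comparing error rates) can be sketched in plain Python as below. This is an illustration only, not the Activity 1 code: the hint refers to R functions, whereas the helper names, the `max_epochs` stopping rule, the initial weight range and the toy data here are all assumptions.

```python
import random

def drop_incomplete(rows):
    """Keep rows with no missing (None) values, in the spirit of complete.cases()."""
    return [r for r in rows if all(v is not None for v in r)]

def to_signed(label):
    """Map labels from {0, 1} to {-1, +1}."""
    return 1 if label == 1 else -1

def predict(w, x):
    """w[0] is the bias; the remaining entries weight the features."""
    s = w[0] + sum(wj * xj for wj, xj in zip(w[1:], x))
    return 1 if s >= 0 else -1

def train_perceptron(X, y, eta, max_epochs=1000, seed=0):
    """Classic perceptron rule: update only on misclassified points."""
    rng = random.Random(seed)  # a different seed gives different initial weights
    w = [rng.uniform(-0.1, 0.1) for _ in range(len(X[0]) + 1)]
    order = list(range(len(X)))
    for _ in range(max_epochs):
        rng.shuffle(order)
        mistakes = 0
        for i in order:
            if predict(w, X[i]) != y[i]:
                w[0] += eta * y[i]
                for j, xj in enumerate(X[i]):
                    w[j + 1] += eta * y[i] * xj
                mistakes += 1
        if mistakes == 0:  # every training point is classified correctly
            break
    return w

def error_rate(w, X, y):
    """Fraction of points whose predicted label differs from the true one."""
    return sum(predict(w, x) != t for x, t in zip(X, y)) / len(X)

# Invented, linearly separable toy records: (x1, x2, label); one has a missing value.
raw = [(0.0, 0.2, 0), (0.3, 0.1, 0), (0.2, 0.4, 0), (None, 0.5, 1),
       (1.0, 0.9, 1), (0.8, 1.1, 1), (1.2, 0.7, 1)]
rows = drop_incomplete(raw)
X = [list(r[:2]) for r in rows]
y = [to_signed(r[2]) for r in rows]

# Error per learning rate, measured here on the toy training points;
# for the task itself the error would be computed on the test set instead.
errors = {eta: error_rate(train_perceptron(X, y, eta), X, y)
          for eta in (0.01, 0.09)}
best_eta = min(errors, key=errors.get)
```

Changing `seed` is one simple way to rerun training with different initial weights, as the hint recommends.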

III. For each combination of K (i.e., number of units in the hidden layer) in {5, 10, 15, …, 100} and µ (learning rate) in {0.01, 0.09}, run the 3-layer Neural Network given to you in Activity 1, Module 5 and record the testing error for each of them (40 models will be developed, based on all possible combinations). Plot the error against K for µ = 0.01 and µ = 0.09 (one line for each value of µ in a single plot) and attach it to your PDF report. Based on this plot, find the best combination of K and µ and the corresponding model. Then plot the test data with the points colored by their estimated class labels using the best model that you have selected, and attach the plot to your PDF report.

Hint: In case you choose to use the provided examples in Activity 1, Module 5, you may need to transpose the dataset (using the “t()” function) and use different values for the parameter settings (e.g., lambda).
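The bookkeeping for this grid search can be sketched as follows. The code only enumerates the 40 combinations of K and µ and picks the one with the lowest recorded error; it does not train anything. The function names are hypothetical, and `train_and_test_error` in the comment stands for whatever routine wraps the Activity 1, Module 5 network.

```python
def hidden_unit_grid(start=5, stop=100, step=5):
    """K values 5, 10, ..., 100."""
    return list(range(start, stop + 1, step))

def all_combinations(ks, rates):
    """Every (K, mu) pair, grouped by learning rate."""
    return [(k, mu) for mu in rates for k in ks]

def best_combination(errors):
    """errors maps (K, mu) to a test error; return the pair with the lowest one."""
    return min(errors, key=errors.get)

def error_lines(errors, ks, rates):
    """One list of errors per learning rate, ordered by K, ready for plotting."""
    return {mu: [errors[(k, mu)] for k in ks] for mu in rates}

ks = hidden_unit_grid()
rates = [0.01, 0.09]
combos = all_combinations(ks, rates)

# With a hypothetical train_and_test_error(k, mu) routine, the results
# table would be filled like this:
#     errors = {(k, mu): train_and_test_error(k, mu) for k, mu in combos}
#     best_k, best_mu = best_combination(errors)
```

Keeping the errors in a dictionary keyed by (K, µ) makes it straightforward to draw one line per learning rate and to look up the winning model afterwards.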

IV. In your PDF report, explain the reason(s) for the difference between the perceptron and the 3-layer NN by comparing the plots you generated in Steps II and III.

Hint: Look at the plots and think about the model assumptions.

Supporting resources

You will need the following resources in order to complete this assessment item:

  • Assessment Datasets
  • codeBase2A and codeBase2B files. You can use these codebase files to answer the Part A and Part B questions.
  • You may need to review the FIT citation style tutorial to become familiar with appropriate citing and referencing for this assessment. Also, review the demystifying citing and referencing resource for help.

Submission details

The files that you need to submit are:

  1. Jupyter Notebook files containing the code for questions {1,2} with the extension “.ipynb”. The file names should be in the following format:

STUDENTID_assessment_2_qX.ipynb where ‘X=1,2’ is the question number.

For example, the Notebook for Question 2 should be named

STUDENTID_assessment_2_q2.ipynb

  2. You must add enough comments to your code to make it readable and understandable by the tutor.
  3. A PDF file that contains your report. The file name should be in the following format: STUDENTID_assessment_2_report.pdf. You should replace STUDENTID with your own student ID. All files must be submitted via Moodle.
  4. Zip all of your files and submit the archive via Moodle. The name of your file must be in the following format:

STUDENTID_FirstName_LastName_assessment_2_report.zip

where in addition to your student ID, you need to use your first name and last name as well.
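Because misnamed files attract a penalty, it can help to generate the expected names rather than type them by hand. The helper below is a hypothetical convenience, not part of the assessment materials; the student ID and name in the example are placeholders.

```python
def expected_names(student_id, first_name, last_name, questions=(1, 2)):
    """Build the notebook, report and archive names from the stated patterns."""
    notebooks = [f"{student_id}_assessment_2_q{q}.ipynb" for q in questions]
    report = f"{student_id}_assessment_2_report.pdf"
    archive = f"{student_id}_{first_name}_{last_name}_assessment_2_report.zip"
    return notebooks, report, archive

# Placeholder identity, for illustration only.
notebooks, report, archive = expected_names("12345678", "Alex", "Smith")
```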

Assessment marking criteria

The following outlines the criteria which you will be assessed against:

  • Ability to understand the fundamentals of neural networks and latent variable models.
  • Working code: The code executes without errors and produces correct results.
  • Quality of report: Your report should show your understanding of the latent variable models and neural networks by answering the questions in this assessment and attaching the required figures.

Penalties

  • Late submission: students who submit an assessment task after the due date will receive a late penalty of 10% of the available marks in that task per calendar day. An assessment submitted more than 7 calendar days after the due date will receive a mark of zero (0) for that assessment task.
  • Jupyter Notebook file is not properly named (-5%)
  • The report PDF file is not properly named (-5%)
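The late-submission rule can be worked through with a small calculation. The sketch below assumes that lateness is counted in whole calendar days and that a mark cannot drop below zero; neither detail is stated in the brief. The file-naming penalties are left out because the brief does not say what the 5% is taken from.

```python
def mark_after_late_penalty(raw_mark, available_marks, days_late):
    """Apply 10% of the available marks per late day, and zero after 7 days."""
    if days_late <= 0:
        return float(raw_mark)
    if days_late > 7:
        return 0.0
    penalty = available_marks * days_late * 10 / 100
    return max(0.0, raw_mark - penalty)  # floor at zero is an assumption

# With 70 available marks, each late day costs 7 marks.
two_days_late = mark_after_late_penalty(60, 70, 2)
```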