CSC 380: Principles of Data Science Final Examination

This is the final examination for CSC 380 "Principles of Data Science". For full credit you
must answer Questions 1–3. The final Question 4 is optional extra credit. Submit Jupyter
notebooks for the first three questions by the stated deadline. In addition, if you choose to
answer the extra credit question, submit it as a PDF file by the deadline and make sure to
show all work along with your answers.

Problem 1: Data Analysis and Visualization (6 points)

This question asks you to visualize and analyze a dataset of Netflix titles. To complete the
question, refer to the Jupyter notebook under "q1/q1.ipynb".

Problem 2: Linear and Polynomial Regression (7 points)

This question asks you to analyze a synthetic dataset and fit both linear and cubic
polynomial regression models. To complete the question, refer to the Jupyter notebook under
"q2/q2.ipynb".
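
As a rough illustration of what fitting both model types involves, the sketch below fits a degree-1 and a degree-3 polynomial by least squares with NumPy's `polyfit` and compares their training error. The data is made up for this example (a noisy cubic) and is not the exam dataset; the notebook defines the actual data and the required approach.

```python
import numpy as np

# Hypothetical synthetic data: a noisy cubic. Not the exam dataset.
rng = np.random.default_rng(0)
x = np.linspace(-3.0, 3.0, 50)
y = 0.5 * x**3 - x + rng.normal(scale=1.0, size=x.shape)

# Least-squares fits; coefficients are returned highest degree first.
linear_coef = np.polyfit(x, y, deg=1)
cubic_coef = np.polyfit(x, y, deg=3)


def mse(coef):
    """Mean squared error on (x, y) of the polynomial with these coefficients."""
    return float(np.mean((np.polyval(coef, x) - y) ** 2))


linear_mse = mse(linear_coef)
cubic_mse = mse(cubic_coef)
print(f"linear MSE: {linear_mse:.3f}, cubic MSE: {cubic_mse:.3f}")
```

Because the linear model is a special case of the cubic one, the cubic fit's training error cannot be higher; a comparison on held-out data is the more informative check.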

Problem 3: Dimensionality Reduction and K-Means Clustering (8 points)

This question asks you to perform cluster analysis on a dataset of human gene expression
levels. The clustering will first involve projecting the high-dimensional data down to
two dimensions using PCA. To complete the question, refer to the Jupyter notebook under
"q3/q3.ipynb".
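
The two-step pipeline described above (project to two dimensions, then cluster) might look roughly like the following NumPy-only sketch. Everything here is an assumption made for illustration: the data is random stand-in data rather than gene expression levels, the choice of three clusters is arbitrary, and the hypothetical `kmeans` helper is a plain Lloyd's-algorithm loop, not necessarily the implementation the notebook expects.

```python
import numpy as np

# Hypothetical stand-in data: 60 samples x 20 features in three shifted groups.
rng = np.random.default_rng(0)
centers = rng.normal(scale=5.0, size=(3, 20))
X = np.vstack([c + rng.normal(size=(20, 20)) for c in centers])

# PCA via SVD of the centered data; rows of Vt are the principal directions.
X_centered = X - X.mean(axis=0)
_, _, Vt = np.linalg.svd(X_centered, full_matrices=False)
Z = X_centered @ Vt[:2].T  # projection onto the first two components


def kmeans(points, k, n_iter=100, seed=0):
    """Plain Lloyd's algorithm; returns (labels, centroids)."""
    init_rng = np.random.default_rng(seed)
    centroids = points[init_rng.choice(len(points), size=k, replace=False)]
    for _ in range(n_iter):
        # Assign each point to its nearest centroid.
        dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Recompute centroids; keep the old one if a cluster is empty.
        new_centroids = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return labels, centroids


labels, centroids = kmeans(Z, k=3)
print(Z.shape, np.bincount(labels, minlength=3))
```

K-means depends on its initial centroids, so results can vary with the seed; running it from several starting points and keeping the best solution is a common precaution.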

Problem 4: Extra Credit (3 points)

We have been tasked with building a system capable of diagnosing problems in a vehicle’s
fuel system. To do this we will construct a probability model of three key components of the
fuel system: the battery level, fuel level, and the electronic fuel gauge. To simplify things, we
will treat each as a binary random variable with B = 1 denoting the event that the battery
is full, F = 1 that the fuel tank is full, and G = 1 that the fuel gauge reports "full". The
fuel gauge measures the amount of fuel but can be noisy. We model the fuel gauge with the
conditional probabilities:

p(G = 1 | B = 1, F = 1) = 0.8
p(G = 1 | B = 1, F = 0) = 0.2
p(G = 1 | B = 0, F = 1) = 0.2
p(G = 1 | B = 0, F = 0) = 0.1

In other words, if the fuel tank is empty, but the battery is full, there is still a 20% chance
that the fuel gauge will report "full". Probabilities of the remaining events can be computed
from those above. Answer the following questions:

a) Suppose our prior belief is that there is a 90% chance the battery is full, and similarly a
90% chance that the fuel tank is full. Compute the probability that the following condition
occurs: the battery is empty AND the fuel tank is full AND the fuel gauge reports "full".
(0.25 points)

b) Suppose that the fuel gauge reads "full". What is the probability that the tank is actually
full? (0.75 points)

c) Show that the state of the battery does not depend on the state of the fuel tank. Hint: This
is asking you to demonstrate that the random variable B is independent of the random
variable F in the model. (1 point)

d) Show that the state of the battery and state of the fuel tank become dependent when we
observe the fuel gauge. Hint: This question is asking you to show that under this model
B and F are conditionally dependent, given G. (1 point)
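
One way to organize the calculations for this problem is to tabulate the full joint distribution and read every quantity off as a sum over it. The sketch below is not a worked solution. It assumes the joint factorizes as p(B, F, G) = p(B) p(F) p(G | B, F), which the problem statement implies but does not write out, and the helper names `bernoulli` and `prob` are hypothetical.

```python
from itertools import product

# p(G = 1 | B, F) from the problem statement, keyed by (b, f).
p_g1_given_bf = {(1, 1): 0.8, (1, 0): 0.2, (0, 1): 0.2, (0, 0): 0.1}

# Priors from part (a).
p_b1 = 0.9
p_f1 = 0.9


def bernoulli(p_one, value):
    """Probability that a binary variable with P(1) = p_one equals value."""
    return p_one if value == 1 else 1.0 - p_one


# Assumed factorization: p(B, F, G) = p(B) p(F) p(G | B, F).
joint = {
    (b, f, g): bernoulli(p_b1, b)
    * bernoulli(p_f1, f)
    * bernoulli(p_g1_given_bf[(b, f)], g)
    for b, f, g in product((0, 1), repeat=3)
}


def prob(b=None, f=None, g=None):
    """Sum the joint over all states consistent with the fixed values."""
    return sum(
        p
        for (bi, fi, gi), p in joint.items()
        if (b is None or bi == b)
        and (f is None or fi == f)
        and (g is None or gi == g)
    )


# Sanity check: the eight joint probabilities should sum to one.
print(round(prob(), 10))
```

With the table in place, a conditional such as p(F = 1 | G = 1) is the ratio `prob(f=1, g=1) / prob(g=1)`, and independence checks amount to comparing a joint probability with the product of the corresponding marginals.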