This Python assignment implements two clustering algorithms: k-means and agglomerative clustering.

CS 178: Machine Learning: Spring 2020

1. Load the standard Iris dataset, select the first two features, and ignore the class (or target) variables. Plot the data and see for yourself how “clustered” you think it looks. Include the plot, say how many clusters you think exist, and briefly explain why. (There are multiple reasonable answers to this question.) (5 points)

2. Run k-means on the first two features of the Iris data, for k = 2, k = 5, and k = 20. Try multiple (at least 5) different initializations for each k, and check to see whether they find the same solution; if not, pick the one with the best score. For the best clustering for each candidate k, create a plot with the data colored by assignment, and the cluster centers. You can plot the points colored by cluster assignments z using ml.plotClassify2D(None,X,z). (You will need to also plot the cluster centers yourself.) (15 points)
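The restart-and-pick-the-best procedure can be sketched as follows. This is a minimal NumPy version of Lloyd's algorithm, not the course's own k-means routine. It assumes that "best score" means the lowest sum of squared distances to the assigned centers, and it uses synthetic 2-D blobs as a stand-in for the first two Iris features. The names `kmeans`, `best_of_restarts`, and `plot_clustering` are hypothetical helpers introduced for illustration.

```python
import numpy as np

def kmeans(X, k, rng, max_iter=100):
    """Plain Lloyd's algorithm; returns (assignments z, centers c, sum of squared distances)."""
    # Initialize the centers at k distinct, randomly chosen data points.
    c = X[rng.choice(len(X), size=k, replace=False)].copy()
    for _ in range(max_iter):
        # Squared distance from every point to every center, shape (n, k).
        d2 = ((X[:, None, :] - c[None, :, :]) ** 2).sum(axis=2)
        z = d2.argmin(axis=1)
        new_c = c.copy()
        for j in range(k):
            members = X[z == j]
            if len(members) > 0:  # an empty cluster keeps its previous center
                new_c[j] = members.mean(axis=0)
        if np.allclose(new_c, c):
            break
        c = new_c
    # Final assignments and score for the returned centers.
    d2 = ((X[:, None, :] - c[None, :, :]) ** 2).sum(axis=2)
    z = d2.argmin(axis=1)
    return z, c, d2[np.arange(len(X)), z].sum()

def best_of_restarts(X, k, n_init=5, seed=0):
    """Run k-means n_init times; return the lowest-score run and all scores."""
    rng = np.random.default_rng(seed)
    runs = [kmeans(X, k, rng) for _ in range(n_init)]
    scores = [run[2] for run in runs]
    return runs[int(np.argmin(scores))], scores

def plot_clustering(X, z, c):
    """Scatter the points colored by assignment and mark the centers."""
    import matplotlib.pyplot as plt  # imported lazily; only needed for plotting
    plt.scatter(X[:, 0], X[:, 1], c=z, cmap="tab20", s=15)
    plt.scatter(c[:, 0], c[:, 1], c="black", marker="x", s=80)
    plt.show()

# Synthetic stand-in data (assumption): three Gaussian blobs, 150 points in total.
data_rng = np.random.default_rng(0)
X = np.vstack([data_rng.normal(loc=m, scale=0.3, size=(50, 2))
               for m in ([0, 0], [3, 0], [0, 3])])

results = {k: best_of_restarts(X, k) for k in (2, 5, 20)}
for k, ((z, c, score), scores) in results.items():
    print(k, "best score:", score, "all scores:", np.round(scores, 2))
```

Comparing the printed scores across restarts shows whether the initializations converged to the same solution; calling `plot_clustering(X, z, c)` on the best run draws the centers on top of the colored points.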

3. Run agglomerative clustering on the first two features of the Iris data, first using single linkage and then again using complete linkage, using the algorithms implemented in ml.cluster.agglomerative (from cluster.py). For each linkage criterion, plot the data colored by their assignments to 2, 5, and 20 clusters. (Agglomerative clustering does not require an initialization, so there is no need to run methods multiple times.) (20 points)
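To make the difference between the two linkage criteria concrete, here is a minimal NumPy sketch of agglomerative clustering. It is a stand-in written for illustration, not the `ml.cluster.agglomerative` routine, whose signature is not shown here. The function name `agglomerative` and the synthetic two-blob data are assumptions. Single linkage defines the distance between two clusters as the smallest pairwise point distance, complete linkage as the largest.

```python
import numpy as np

def agglomerative(X, k, linkage="single"):
    """Merge the two closest clusters repeatedly until k clusters remain."""
    n = len(X)
    # Pairwise Euclidean distances; D[a, b] is the distance between clusters a and b.
    D = np.sqrt(((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2))
    np.fill_diagonal(D, np.inf)  # a cluster is never merged with itself
    # Single linkage keeps the minimum distance, complete linkage the maximum.
    combine = np.minimum if linkage == "single" else np.maximum
    z = np.arange(n)                 # current cluster label of each point
    active = np.ones(n, dtype=bool)  # which cluster labels are still in use
    for _ in range(n - k):
        # Find the closest pair among the clusters that are still active.
        masked = np.where(active[:, None] & active[None, :], D, np.inf)
        a, b = np.unravel_index(np.argmin(masked), masked.shape)
        # Merge cluster b into cluster a and update a's distances.
        D[a, :] = combine(D[a, :], D[b, :])
        D[:, a] = D[a, :]
        D[a, a] = np.inf
        active[b] = False
        z[z == b] = a
    # Relabel the surviving clusters as 0 .. k-1.
    _, z = np.unique(z, return_inverse=True)
    return z

# Synthetic stand-in data (assumption): two well-separated blobs of 20 points each.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(loc=[0, 0], scale=0.3, size=(20, 2)),
               rng.normal(loc=[10, 10], scale=0.3, size=(20, 2))])

z_single = agglomerative(X, 2, linkage="single")
z_complete = agglomerative(X, 2, linkage="complete")
z_five = agglomerative(X, 5, linkage="complete")
```

Because the merge order is fully determined by the distances, repeated runs give the same result, which is why no restarts are needed.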

4. Briefly discuss similarities and differences in the outputs of the agglomerative clustering and k-means algorithms. (5 points)

Problem 2: EigenFaces (50 points)

In class, we discussed how PCA has been applied to faces, and showed some example results. Here, you’ll explore this representation yourself. First, load the data and display a few faces to better understand the data format:

import numpy as np
import matplotlib.pyplot as plt

X = np.genfromtxt("data/faces.txt", delimiter=None)  # load face dataset
plt.figure()
i = 0  # pick a data point i for display (0 is an arbitrary choice)
img = np.reshape(X[i, :], (24, 24))  # convert vectorized data to 24x24 image patches
plt.imshow(img.T, cmap="gray")  # display image patch; you may have to squint

1. Subtract the mean of the face images (X0 = X − µ) to make your data zero-mean. (The mean should be of the same dimension as a face, 576 pixels.) Plot the mean face as an image. (5 points)

2. Use scipy.linalg.svd to take the SVD of the data, so that

X0 = U · diag(S) · Vh

Since the number of faces is larger than the dimension of each face, there are at most 576 non-zero singular values; use the full_matrices=False argument to avoid using a lot of memory. As in the slides, then compute W = U.dot( np.diag(S) ) so that X0 ≈ W · Vh. Print the shapes of W and Vh. (10 points)
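The centering and SVD steps can be sketched as below. Two assumptions are made for the sake of a self-contained example: random data of shape (700, 576) stands in for the face matrix, and `np.linalg.svd` is used in place of `scipy.linalg.svd` (both accept `full_matrices=False` and return `U`, `S`, `Vh`).

```python
import numpy as np

# Synthetic stand-in (assumption): 700 "faces" of 576 pixels each.
rng = np.random.default_rng(0)
X = rng.random((700, 576))

mu = X.mean(axis=0)  # mean face, one value per pixel
X0 = X - mu          # zero-mean data

# Economy-size SVD: U is (700, 576), S is (576,), Vh is (576, 576).
U, S, Vh = np.linalg.svd(X0, full_matrices=False)

W = U.dot(np.diag(S))  # scale each column of U by its singular value
print(W.shape, Vh.shape)  # (700, 576) (576, 576)
```

Since `S` is a 1-D array, `U * S` gives the same `W` by broadcasting without building the diagonal matrix.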

Homework 5, UC Irvine


3. For K = 1, ..., 30, compute the approximation to X0 given by the first K eigenvectors (or eigenfaces): X̂0 = W[:, :K] · Vh[:K, :]. For each K, compute the mean squared error in the SVD’s approximation, np.mean( (X0 - X̂0)**2 ). Plot these MSE values as a function of K. (10 points)
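A short sketch of the rank-K approximation error, using random data of shape (200, 64) as an assumed stand-in for the centered face matrix. The helper `plot_mse` is hypothetical and imports matplotlib only when called.

```python
import numpy as np

# Synthetic stand-in (assumption): 200 samples with 64 features.
rng = np.random.default_rng(1)
X = rng.random((200, 64))
X0 = X - X.mean(axis=0)

U, S, Vh = np.linalg.svd(X0, full_matrices=False)
W = U * S  # same as U.dot(np.diag(S))

Ks = np.arange(1, 31)
# Mean squared error of the rank-K approximation for each K.
mse = np.array([np.mean((X0 - W[:, :K] @ Vh[:K, :]) ** 2) for K in Ks])

def plot_mse(Ks, mse):
    """Plot the approximation error as a function of K."""
    import matplotlib.pyplot as plt  # imported lazily; only needed for plotting
    plt.plot(Ks, mse, marker="o")
    plt.xlabel("K")
    plt.ylabel("MSE")
    plt.show()
```

The curve never increases with K: the squared error of the rank-K truncation equals the sum of the discarded squared singular values, so `mse[K-1]` should match `np.sum(S[K:]**2) / X0.size`.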

4. Display the first three principal directions of the data, by computing µ + α Vh[j,:] and µ − α Vh[j,:], where α is a scale factor (we suggest setting α to 2*np.median(np.abs(W[:,j])), to match the scale of the data). These should be vectors of length 24² = 576, so you can reshape them and view them as “face images” just like the original data. They should be similar to the images in lecture. (10 points)
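The two images per principal direction can be computed as in this sketch. Random data of shape (200, 576) is an assumed stand-in for the faces, and the principal directions are taken to be the rows of `Vh` from the SVD of the centered data.

```python
import numpy as np

# Synthetic stand-in (assumption): 200 "faces" of 24*24 = 576 pixels.
rng = np.random.default_rng(2)
X = rng.random((200, 576))
mu = X.mean(axis=0)
X0 = X - mu

U, S, Vh = np.linalg.svd(X0, full_matrices=False)
W = U * S

directions = []
for j in range(3):
    # Scale factor suggested in the text, matched to the spread of coefficient j.
    alpha = 2 * np.median(np.abs(W[:, j]))
    plus = (mu + alpha * Vh[j, :]).reshape(24, 24)
    minus = (mu - alpha * Vh[j, :]).reshape(24, 24)
    directions.append((plus, minus))
```

Each pair can then be shown with `plt.imshow(plus.T, cmap="gray")` and `plt.imshow(minus.T, cmap="gray")`, in the same way as the original faces.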

5. Choose any two faces and reconstruct them using the first K principal directions, for K = 5, 10, 50, 100. Plot the reconstructed faces as images. (5 points)
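A possible reconstruction routine is sketched below, again on assumed random stand-in data of shape (200, 576). Since W · Vh approximates the centered data X0, this sketch adds the mean back so the result is comparable to the original face; that step is an assumption about what "reconstruct" means here. The function name `reconstruct` is hypothetical.

```python
import numpy as np

# Synthetic stand-in (assumption): 200 "faces" of 576 pixels.
rng = np.random.default_rng(3)
X = rng.random((200, 576))
mu = X.mean(axis=0)
X0 = X - mu

U, S, Vh = np.linalg.svd(X0, full_matrices=False)
W = U * S

def reconstruct(i, K):
    """Approximate face i from its first K coefficients, adding the mean back."""
    return mu + W[i, :K] @ Vh[:K, :]

faces = [0, 1]  # any two face indices
recon = {(i, K): reconstruct(i, K).reshape(24, 24)
         for i in faces for K in (5, 10, 50, 100)}

# Squared reconstruction error for face 0 at each K.
errors = [np.sum((X[0] - reconstruct(0, K)) ** 2) for K in (5, 10, 50, 100)]
```

Each entry of `recon` can be displayed with `plt.imshow(recon[(i, K)].T, cmap="gray")`; the error for a given face does not increase as K grows.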

6. Methods like PCA are often called “latent space” methods, as the coefficients can be interpreted as a new geometric space in which the data are represented. To visualize this, choose 25 of the faces, and display them as images with the coordinates given by their coefficients on the first two principal components:

idx = ...  # pick some data (randomly or otherwise); an array of integer indices

import mltools as ml  # needed so that the name "ml" used below is defined
import mltools.transforms
coord, params = ml.transforms.rescale(W[:, 0:2])  # normalize scale of "W" locations
plt.figure()
for i in idx:
    # compute where to place image (scaled W values) & size
    loc = (coord[i, 0], coord[i, 0] + 0.5, coord[i, 1], coord[i, 1] + 0.5)
    img = np.reshape(X[i, :], (24, 24))  # reshape to square
    plt.imshow(img.T, cmap="gray", extent=loc)  # draw each image
plt.axis((-3, 3, -3, 3))  # set axis to a reasonable scale

This plot is a good way to gain intuition for what the PCA latent representation captures. (10 points)
