# Machine Learning Assignment Help | COMP9417: A collection of sample exam exercises

Question 1

Note: this question was incorporated into the Ensemble Learning lab, where the code is much improved; refer to that lab for a better approach to this problem. This question requires you to implement the Adaptive Boosting (AdaBoost) algorithm from lectures. Use the following code to generate a toy binary classification dataset:

```python
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.colors import ListedColormap
import warnings
warnings.simplefilter(action='ignore', category=FutureWarning)

from sklearn.tree import DecisionTreeClassifier
from sklearn.datasets import make_blobs

np.random.seed(2)
n_points = 15
X, y = make_blobs(n_points, 2, centers=[(0, 0), (-1, 1)])
y[y == 0] = -1  # use -1 for negative class instead of 0

plt.scatter(*X[y == 1].T, marker="+", s=100, color="red")
plt.scatter(*X[y == -1].T, marker="o", s=100, color="blue")
plt.show()
```

(a) By now, you will be familiar with the scikit-learn DecisionTreeClassifier class. Fit decision trees of increasing maximum depth, for depths ranging from 1 to 9. Plot the decision boundary of each of your models in a 3 × 3 grid. You may find the following helper function useful:

```python
def plotter(classifier, X, y, title, ax=None):
    # plot decision boundary for given classifier
    plot_step = 0.02
    x_min, x_max = X[:, 0].min() - 1, X[:, 0].max() + 1
    y_min, y_max = X[:, 1].min() - 1, X[:, 1].max() + 1
    xx, yy = np.meshgrid(np.arange(x_min, x_max, plot_step),
                         np.arange(y_min, y_max, plot_step))
    Z = classifier.predict(np.c_[xx.ravel(), yy.ravel()])
    Z = Z.reshape(xx.shape)
    if ax:
        ax.contourf(xx, yy, Z, cmap=plt.cm.Paired)
        ax.scatter(X[:, 0], X[:, 1], c=y)
        ax.set_title(title)
    else:
        plt.contourf(xx, yy, Z, cmap=plt.cm.Paired)
        plt.scatter(X[:, 0], X[:, 1], c=y)
        plt.title(title)
```
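One possible sketch for part (a) — not an official solution — that fits the nine trees and lays their boundaries out in a 3 × 3 grid. The dataset generation and a slightly condensed copy of the plotter helper are repeated so the cell runs on its own; the `random_state=0` on the trees is an added assumption to make tie-breaking deterministic:

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless-safe backend so the sketch also runs without a display
import matplotlib.pyplot as plt
from sklearn.datasets import make_blobs
from sklearn.tree import DecisionTreeClassifier

np.random.seed(2)
X, y = make_blobs(15, 2, centers=[(0, 0), (-1, 1)])
y[y == 0] = -1  # negative class labelled -1, as in the data snippet above

def plotter(classifier, X, y, title, ax=None):
    # condensed version of the decision-boundary helper above
    plot_step = 0.02
    x_min, x_max = X[:, 0].min() - 1, X[:, 0].max() + 1
    y_min, y_max = X[:, 1].min() - 1, X[:, 1].max() + 1
    xx, yy = np.meshgrid(np.arange(x_min, x_max, plot_step),
                         np.arange(y_min, y_max, plot_step))
    Z = classifier.predict(np.c_[xx.ravel(), yy.ravel()]).reshape(xx.shape)
    ax = ax or plt.gca()
    ax.contourf(xx, yy, Z, cmap=plt.cm.Paired)
    ax.scatter(X[:, 0], X[:, 1], c=y)
    ax.set_title(title)

fig, axes = plt.subplots(3, 3, figsize=(12, 12))
for depth, ax in zip(range(1, 10), axes.ravel()):
    tree = DecisionTreeClassifier(max_depth=depth, random_state=0).fit(X, y)
    plotter(tree, X, y, f"max_depth = {depth}", ax=ax)
fig.tight_layout()
plt.show()
```

The deeper trees should carve the plane into progressively smaller rectangles around individual points, which is the behaviour part (b) asks you to comment on.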

(b) Comment on your results in (a). What do you notice as you increase the depth of the trees? What
do we mean when we say that trees have low bias and high variance?

(c) We now restrict attention to trees of depth 1. These are the most basic decision trees and are commonly referred to as decision stumps. Consider the adaptive boosting algorithm presented in the ensemble methods lecture notes (slide 50/70). In adaptive boosting, we build a model composed of $T$ weak learners chosen from a set of weak learners. At step $t$, we pick the model from the set of weak learners that minimises the weighted error

$$\varepsilon_t = \sum_{i=1}^{n} w_i^{(t)} \, \mathbb{1}\{y_i \neq h(x_i)\},$$

where $w_i^{(t)}$ is the weight of the $i$-th training point at step $t$ and the weights are normalised to sum to one.
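As a rough sketch of how this could look in code — one standard AdaBoost formulation with depth-1 trees as the weak learners; the exact notation on the lecture slides may differ, and the helper names `adaboost` and `boosted_predict` are made up for illustration:

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.tree import DecisionTreeClassifier

np.random.seed(2)
X, y = make_blobs(15, 2, centers=[(0, 0), (-1, 1)])
y[y == 0] = -1

def adaboost(X, y, T):
    # train T decision stumps, reweighting the data after each round
    n = len(y)
    w = np.full(n, 1 / n)  # start with uniform weights
    stumps, alphas = [], []
    for t in range(T):
        stump = DecisionTreeClassifier(max_depth=1)
        stump.fit(X, y, sample_weight=w)
        pred = stump.predict(X)
        eps = w[pred != y].sum()               # weighted training error
        eps = np.clip(eps, 1e-10, 1 - 1e-10)   # guard against log(0) / division by zero
        alpha = 0.5 * np.log((1 - eps) / eps)  # learner's vote weight
        w *= np.exp(-alpha * y * pred)         # up-weight misclassified points
        w /= w.sum()                           # renormalise
        stumps.append(stump)
        alphas.append(alpha)
    return stumps, alphas

def boosted_predict(stumps, alphas, X):
    # sign of the alpha-weighted vote of all stumps
    scores = sum(a * s.predict(X) for a, s in zip(alphas, stumps))
    return np.sign(scores)

stumps, alphas = adaboost(X, y, T=15)
print("train accuracy:", (boosted_predict(stumps, alphas, X) == y).mean())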

(d) In this question, we will extend our implementation in (c) to be able to use the plotter function from part (a). To do this, we need to implement a boosting model class that has a `predict` method. Once you do this, repeat (c) for T = 2, ..., 17. Plot the decision boundaries of your 16 models in a 4 × 4 grid. The following template may be useful:
```python
class boosted_model:
    def __init__(self, T):
        self.alphas = # your code here
        # your code here

    def predict(self, x):
        # your code here
```
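One way the template might be completed. Note an assumption here: the original template passes only `T` to `__init__` and does not show where the training data enters, so this sketch also passes `X` and `y` in and trains the stumps inside the constructor:

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.tree import DecisionTreeClassifier

np.random.seed(2)
X, y = make_blobs(15, 2, centers=[(0, 0), (-1, 1)])
y[y == 0] = -1

class boosted_model:
    # hypothetical completion: trains T stumps with AdaBoost in __init__
    # and exposes the sklearn-style predict() that plotter() expects
    def __init__(self, X, y, T):
        n = len(y)
        w = np.full(n, 1 / n)
        self.models, self.alphas = [], []
        for _ in range(T):
            stump = DecisionTreeClassifier(max_depth=1)
            stump.fit(X, y, sample_weight=w)
            pred = stump.predict(X)
            eps = np.clip(w[pred != y].sum(), 1e-10, 1 - 1e-10)
            alpha = 0.5 * np.log((1 - eps) / eps)
            w *= np.exp(-alpha * y * pred)
            w /= w.sum()
            self.models.append(stump)
            self.alphas.append(alpha)

    def predict(self, x):
        # alpha-weighted vote of the stumps, thresholded at zero
        scores = sum(a * m.predict(x) for a, m in zip(self.alphas, self.models))
        return np.sign(scores)

model = boosted_model(X, y, T=17)
print("train accuracy:", (model.predict(X) == y).mean())
```

With this class in place, the 4 × 4 grid is a loop in the style of part (a): for each `T` in `range(2, 18)`, construct a `boosted_model` and hand it to `plotter` with its own subplot axis.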

(e) Discuss the differences between bagging and boosting.

E-mail: itcsdx@outlook.com  WeChat: itcsdx