算法代写 | CM3109 Coursework

本次算法代写要求实现模拟退火算法,一共包含4个task。

CM3109 Coursework

Autumn semester 2018

Notes before starting

  • The following programming exercises may be done using a program-ming language of your own choice, though all parts should be done using the same language. If you wish to use anything other than Java or Python then please let me know well in advance so I can make sure I have everything necessary to run your program installed on my com-puter.
  • If you use any external sources for solving your coursework (e.g. pseu-docode from a website), you should add a comment with a reference to these sources and a brief explanation of how they have been used. You are allowed to discuss high-level aspects of your solution with other students, but you should disclose the names of these students in a com-ment.

The aim of this coursework is to use simulated annealing to solve an optimi-sation problem coming from the field of computational social choice – a hot interdisciplinary topic currently receiving a lot of attention from researchers in several di erent fields, including AI. Section 1 provides necessary back-ground to the problem, with the specific coursework tasks coming in Section 2.

 

1    Background

We start by introducing the problem we’re looking at, then we discuss how to represent the input and outputs to this problem, and then we introduce a particular proposal for approaching this problem which we will aim to implement via simulated annealing.

The problem

Suppose we have a competition with n participants, which is decided by playing a tournament in which each participant plays one match against each of the others. (These participants could be football teams, chess players, or anything). After the tournament is finished we know both (i) the winner of each match (or if it was a draw), and (ii) the margin of victory (represented by an integer greater than 0 if it wasn’t a draw). The question is, based on this information, which participant should be judged to be the winner of the tournament, who is the runner-up, who places third, and so on up to the worst? In other words, which ranking of the participants (from best to worst) appropriately reflects the relative quality of the participants? (We do not allow ties in the ranking).

Weighted tournament graphs and rankings

We can picture such a tournament T as a directed graph whose nodes are the participants, and such that there exists a directed edge from i to j if i is the winner of the match “i versus j”. Each such edge is labelled with a weight w(i, j), which is the number representing the margin of victory. One example of such a weighted tournament graph is on the left of Fig. 1. (Note the lack of edge between B and D, indicating their match was drawn.) There is an edge from A to C labelled with 14, indicating that A was the very clear winner (by margin 14) of the match “A versus C”. A also defeated B by a margin of 5, but was defeated by D by a margin of 3. These three results would suggest that D has claims to being declared the best, but D was itself defeated quite convincingly by C (by margin of 9), so it’s not clear-cut.

Another way to represent a weighted tournament graph (more amenable for representation in a computer) is as an n n matrix [aij ] whose rows and columns represent the participants {i | i = 1, . . . , n} and whose (i, j) entry aij equals 0 if i did not defeat j, and gives the weight w(i, j) if i defeated j. For example, the matrix on the right of Fig. 1 is the matrix representation of the tournament graph to its left.

Concerning the output of our problem, i.e., rankings, the easiest way to represent a ranking is as a list [i1, i2, . . . , in] with the first entry i1 representing the best participant, the second i2 representing the second best, and so on. For instance, continuing the above example, the ranking which places D first followed by A, B and C, in that order, would be represented as [D, A, B, C].

Kemeny rankings

In the above problem, we have to choose the ranking that “best fits” the results from a given weighted tournament T . Several di erent proposals have been made for how to precisely formalise what “best fit” means. We are going to focus on just one of them, which is based on the intuition of minimising the disagreements with T . Let’s say a particular ranking R disagrees with T on an edge (x, y) if x defeats y in T but y is ranked above x in R. We can then measure how well R fits T by adding up all the weights of all the edges for which R disagrees with T . This results in the Kemeny score1 of R (with respect to T ), denoted by c(R, T ) and given by the following formula:

Xc(R, T ) =      {w(x, y) | R disagrees with T on (x, y)}.

For example if T is the weighted tournament in Fig. 1 and R = [D, A, B, C] then R disagrees with T on (C, D) and (C, B), which have weights 9 and 2 respectively. Hence c(R, T ) = 9 + 2 = 11. The ranking that best fits T is then taken to be any ranking R that has the lowest possible Kemeny score. Thus we arrive at the following optimisation problem:

Optimisation problem. Given a weighted tournament T , find a ranking that minimises the Kemeny score c(R, T ).

A ranking that minimises the Kemeny score (with respect to a given tournament T ) is called a Kemeny ranking (with respect to T ). Note that some tournaments can have more than one Kemeny ranking. The tournament in Fig. 1 has two Kemeny rankings, namely [A, C, B, D] and [A, C, D, B].

 

2    Coursework questions

The tasks below require you to write a program (in a programming language of your choice) that is able to read a specific weighted tournament from a separate file (to be described below), and return a Kemeny ranking for this tournament – or at least an approximation of one – using simulated anneal-ing (see Lecture 9, slide 13 for the pseudocode of the simulated annealing algorithm).

This coursework will use a particular weighted tournament containing real data (1994 Formula One.wmg) from the 1994 World Formula One Motor Rac-ing Championship, for which there were a total of 46 participants. This file, along with an explanation of its data format (see How to read tournament data.pdf) can be downloaded from Learning Central.

 

First part: Setting SA parameters

As discussed during lecture, simulated annealing has a number of parameters that need to be fixed by the programmer in advance (see Lecture 9, slide 17). Some will be fixed below, but for others you will be free to experiment.

  • initial solution. For the initial solution you must always use the ranking [1, 2, 3, 4, . . . , 45, 46], using the numbering of candidates speci-fied in 1994 Formula One.wmg.
  • cooling schedule. For initial temperature T I and temperature length T L you are free to experiment with di erent values. Your only re-striction regarding cooling ratio f is that f(T ) should be of the form f(T ) = a T , where a is some fixed parameter between 0 and 1 (which you are free to choose).
  • cost function. The cost of a solution (i.e., possible ranking) R is the Kemeny score c(R, T ), defined in Section 1 above, where T is the tournament loaded from 1994 Formula One.wmg.
  • stopping criterion. The algorithm should STOP if a pre-specified number num non improve of solutions have been looked at without im-proving on the current best solution. You are free to experiment with di erent values of num non improve.

The only parameter not yet specified is the neighbourhood function, which brings us to…

TASK 1. Write down a definition of a neighbourhood N that is specific to this problem, i.e., specify the conditions for when any given ranking R2 is a neighbour of another R1 (2 marks), and illustrate your definition with an example (2 marks). Justify your choice of definition by showing how the cost of a neighbouring solution R2 can easily be computed from the cost of current solution R1 (2 marks).

Second part: Implementation

After having set up the parameters, you must now implement the algorithm.

TASK 2. Write a program implementing the simulated annealing algorithm to find a ranking that minimises the Kemeny score with respect to the given weighted tournament T specified in 1994 Formula One.wmg. You must use the parameters in accordance with the first part above, and use the neigh-bourhood you defined in Task 1. Your program should have the following behaviour. Successful completion results in the marks indicated:

  1. It should be executable via the command line, taking the file 1994 Formula One.wmg as an argument. (1 mark)
  2. After running, your program should print to the screen the following information:
  • The best ranking it found (positions 1 to 46), in the form of a table, using the real names of the candidates, which are included in 1994 Formula One.wmg. (1 mark)
  • The Kemeny score of this best ranking. (1 mark)
  • The algorithm’s runtime in milliseconds. (1 mark)
  • The number of uphill moves made. (1 mark)

Your code will be judged on whether the SA algorithm is correctly imple-mented (2 marks) and, given it is correctly implemented, whether the so-lution returned is optimal (or at least close to optimal) (4 marks) and is returned within reasonable runtime (2 marks). You must highlight in your code (e.g., with comments) the part that deals with the actual steps of the SA algorithm. Failure to do so may lead to loss of marks.

 

Third part: Selecting best parameters and discussion

As stated in the first part, you are free to try out di erent values for the parameters T I, T L and the “a” in f(T ) = a T , as well as the number num non improve in the stopping criterion. The final task is about exper-imentally tuning to the best choice of parameters and reflecting on your results.

TASK 3. Give the values of the parameters T L, T I and a (in f(T ) = a T ), as well as num non improve, which seem to give the best solutions for your program. For this “best” choice of parameters provide screenshots of the output of your program (1 mark). Write a short summary (5 marks) (max 300 words) of your results, indicating the range of di erent values you tried for the parameters, which parameters’ variation had the biggest e ect on the output solution, etc. Also o er some speculation on the presence of local optima in this problem. Are there many?

(Total 6 marks)

 

Fourth part: Formulating the problem as a Binary In-teger Program

The problem of finding a Kemeny ranking of a given tournament T can also be posed as a binary integer programming problem by making an appropriate choice of decision variables, objective function and linear constraints. For this part you may assume we are given some general tournament T with n participants and with tournament matrix [aij ] (see Sect. 1).

TASK 4. Formulate the problem as a binary integer program by writing down the decision variables (and what they stand for) (1 mark), objective function (and whether it’s to be maximised or minimised) (1 mark) and constraints (3 marks). [Hint: For the constraints, note that any ranking must satisfy the following properties: (1) for all i, j, k, if i is ranked above j and j is ranked above k then i is ranked above k, (2) for all i, j such that i 6= j exactly one of “i is ranked above j” or “j is ranked above i” must hold.]

(Total 5 marks)