Assignment 2: Association Rule Mining

Task 1 (on paper): Given the dataset below, in which each line represents a transaction, what are the frequent itemsets and rules produced if minsup is set to 0.25 and minconf is set to 0.80? Show your working.

Trans ID    Items
1           A, B, C, D, F, G, H
2           A, B, C, D
3           A, B, C, D
4           A, B, D
5           B, C, E
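
As a reminder of the two measures used in Task 1, here is a minimal Python sketch that computes support and confidence on the five transactions listed above. The helper names `count`, `support`, and `confidence` are illustrative choices, not part of the assignment, and the example deliberately checks only one itemset and one rule rather than working through the whole task.

```python
# The five transactions from the dataset above, as sets of items.
transactions = [
    {"A", "B", "C", "D", "F", "G", "H"},
    {"A", "B", "C", "D"},
    {"A", "B", "C", "D"},
    {"A", "B", "D"},
    {"B", "C", "E"},
]

def count(itemset, transactions):
    """Number of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions)

def support(itemset, transactions):
    """Fraction of transactions that contain `itemset`."""
    return count(itemset, transactions) / len(transactions)

def confidence(antecedent, consequent, transactions):
    """count(antecedent and consequent together) / count(antecedent)."""
    both = set(antecedent) | set(consequent)
    return count(both, transactions) / count(antecedent, transactions)

print(support({"A", "B"}, transactions))       # 0.8  (4 of 5 transactions)
print(confidence({"C"}, {"A"}, transactions))  # 0.75 (3 of the 4 containing C)
```

With five transactions, minsup = 0.25 means an itemset must appear in at least two transactions to count as frequent.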

Task 2 (coding): Your task is to write an Apriori program (in Python, using a Jupyter notebook) that takes as parameters:

  • minsup – minimum support,
  • minconf – minimum confidence,
  • minlift – minimum lift, and
  • the name of the transaction file (in comma-separated values format, like the supermarket.csv file downloaded from Canvas). Each line of the data file represents one transaction, with items separated by commas. Imagine you are shopping at a grocery store: your transaction at the check-out counter would be one line of the file.
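
The file format described above can be read with the standard `csv` module. The sketch below is one possible loader, assuming the file has no header row and that empty fields and blank lines should be ignored; the layout of supermarket.csv itself is not shown here, so adjust as needed. The function name `load_transactions` is hypothetical.

```python
import csv

def load_transactions(path):
    """Read one transaction per line; items are separated by commas.

    Assumption: no header row. Blank lines and empty fields are skipped.
    """
    transactions = []
    with open(path, newline="", encoding="utf-8") as f:
        for row in csv.reader(f):
            # Strip surrounding whitespace and drop empty fields.
            items = {item.strip() for item in row if item.strip()}
            if items:
                transactions.append(items)
    return transactions

# Example call (file name taken from the assignment text):
# transactions = load_transactions("supermarket.csv")
```

Storing each transaction as a set makes the subset test used in support counting a single `<=` comparison.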

The program must output every association rule that can be mined from the transaction file and that satisfies the minimum support, confidence, and lift requirements. The rules should be sorted first by the number of items they contain (in ascending order), then by lift, confidence, and support (all three in descending order). For example:


A -> B 4, 0.9, 0.8

A, C -> B 3.8, 0.8, 0.7

A, D -> B 3.6, 0.8, 0.6
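
The required ordering can be expressed as a single sort key. The sketch below assumes each rule is stored as a tuple `(antecedent, consequent, lift, confidence, support)` and that the three numbers in the sample output appear in that order (lift, confidence, support); the assignment text does not state the column order explicitly, so treat this as an assumption.

```python
# Each rule: (antecedent, consequent, lift, confidence, support).
# The tuple layout is an assumption made for this illustration.
rules = [
    (("A", "D"), ("B",), 3.6, 0.8, 0.6),
    (("A",), ("B",), 4.0, 0.9, 0.8),
    (("A", "C"), ("B",), 3.8, 0.8, 0.7),
]

def sort_key(rule):
    antecedent, consequent, lift, conf, sup = rule
    # Ascending by item count; negating the measures gives descending order.
    return (len(antecedent) + len(consequent), -lift, -conf, -sup)

for antecedent, consequent, lift, conf, sup in sorted(rules, key=sort_key):
    print(f"{', '.join(antecedent)} -> {', '.join(consequent)} "
          f"{lift:g}, {conf:g}, {sup:g}")
```

Negating the numeric measures inside the key avoids chaining several sorts while still mixing ascending and descending criteria.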

You may use libraries such as pandas and NumPy, but you may NOT use any prebuilt Apriori packages.
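
To show what a from-scratch implementation might look like, here is a minimal sketch of the level-wise Apriori search followed by rule generation, using only the standard library. The function names, the return types, and the toy data are illustrative assumptions; the sketch favours readability over speed and would likely need optimising for a large file.

```python
from itertools import combinations

def frequent_itemsets(transactions, minsup):
    """Level-wise (Apriori) search; returns {frozenset: support}."""
    n = len(transactions)
    # Level 1: the candidates are the individual items.
    candidates = {frozenset([item]) for t in transactions for item in t}
    frequent = {}
    k = 1
    while candidates:
        # Count how many transactions contain each candidate.
        counts = {c: sum(c <= t for t in transactions) for c in candidates}
        level = {c: cnt / n for c, cnt in counts.items() if cnt / n >= minsup}
        frequent.update(level)
        # Join step: unions of frequent k-itemsets that have k + 1 items.
        k += 1
        joined = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k - 1)-subset must itself be frequent.
        candidates = {
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
    return frequent

def generate_rules(frequent, minconf, minlift):
    """Return (antecedent, consequent, lift, confidence, support) tuples."""
    rules = []
    for itemset, sup in frequent.items():
        for r in range(1, len(itemset)):
            for antecedent in map(frozenset, combinations(itemset, r)):
                consequent = itemset - antecedent
                # Subsets of a frequent itemset are frequent, so both
                # lookups below always succeed.
                conf = sup / frequent[antecedent]
                lift = conf / frequent[consequent]
                if conf >= minconf and lift >= minlift:
                    rules.append((antecedent, consequent, lift, conf, sup))
    return rules

# Toy data, unrelated to the assignment datasets.
toy = [{"x", "y"}, {"x", "y"}, {"x"}, {"z"}]
freq = frequent_itemsets(toy, minsup=0.5)
for antecedent, consequent, lift, conf, sup in generate_rules(freq, 0.8, 1.0):
    print(sorted(antecedent), "->", sorted(consequent), round(lift, 2), conf, sup)
```

The prune step is what distinguishes Apriori from brute-force enumeration: a candidate is counted only if all of its smaller subsets were already found to be frequent.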

Task 3: Now run your implementation on the data from Task 1 and show that it produces the same output as your Task 1 answer. This can be the output from your Jupyter notebook.

Task 4: Your task is to investigate a dataset and perform an association rule mining task.

  • Run your Apriori code on the data downloaded from Canvas (supermarket.csv). Try different minsup values, e.g. 0.10, 0.15, 0.20. TIP: Please note that this may take a while if your code is inefficient.
  • Generate rules (try different values of minsup, minlift, and minconf to see which give you more useful and interesting results). From the generated rules, select 2 rules and explain why they are interesting to you. TIP: Don't overthink this. Just describe the measures for each rule and why you think it is interesting.