Practicum 4: Text Classification

Lab 6: Association Rules

Evgueni N. Smirnov

June 29, 2013

1. Introduction

In this lab you will consider two applications of association rules. The first uses association-rule mining to learn decision rules. The second uses association-rule mining to analyze a market-basket dataset. For both applications you will use the implementation of the Apriori algorithm provided in Weka. Note that this implementation represents items as attribute-value pairs, which is why you may encounter problems during the market-basket analysis.

2. Decision-Rule Learning Problem

In one of the previous labs you derived a set of decision rules for the weather problem using the JRip decision-rule algorithm. In this part of the lab you will use the Weka implementation of the Apriori algorithm on the same problem. Run the Apriori algorithm on the data file of the weather problem and analyze the resulting association rules. Compare these rules with the rules produced by the JRip algorithm. On the basis of this comparison, derive a simple modification of the Apriori algorithm that makes it applicable to decision-rule learning.

The data file for the weather problem is provided in the Weka installation directory (subfolder data).

3. Market Basket Problem

Given:

  • a set I of 11 items: {fruitveg, freshmeat, dairy, cannedveg, cannedmeat, frozenmeal, beer, wine, softdrink, fish, confectionery}.
  • a database of 1000 transactions T such that T ⊆ I.

Find:

  • interesting association rules that explain customer behaviour.

The data file marketBasket.arff for the market-basket problem is provided on the course website.
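The Introduction warns that an attribute-value representation of items can cause problems in market-basket analysis. The following Python sketch illustrates one possible reason. It assumes, purely for illustration, that each of the 11 items becomes a binary attribute with value t (present) or f (absent); the helper name to_attribute_values and this encoding are hypothetical and are not taken from marketBasket.arff.

```python
# Item names come from the problem statement above; the t/f encoding is an assumption.
ITEMS = ["fruitveg", "freshmeat", "dairy", "cannedveg", "cannedmeat",
         "frozenmeal", "beer", "wine", "softdrink", "fish", "confectionery"]

def to_attribute_values(transaction, absent="f"):
    """Map a transaction (a subset of ITEMS) to one value per attribute."""
    unknown = set(transaction) - set(ITEMS)
    if unknown:
        raise ValueError(f"unknown items: {sorted(unknown)}")
    return {item: ("t" if item in transaction else absent) for item in ITEMS}

# A made-up basket with three items.
basket = {"beer", "frozenmeal", "cannedveg"}
row = to_attribute_values(basket)

# Under an attribute-value view every attribute=value pair is an "item",
# so the absence of a product (e.g. wine=f) becomes an item as well.
pairs = [f"{name}={value}" for name, value in row.items()]
n_absent = sum(1 for value in row.values() if value == "f")
print(pairs)
print("absent-valued pairs:", n_absent, "of", len(pairs))
```

With this encoding most pairs in a typical row describe absent products, so rules about absences can crowd out rules about purchases. Encoding an absent item as a missing value (written ? in ARFF) is one commonly used workaround; whether marketBasket.arff already does this is something to check when you study the file.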

4. Algorithm

As stated above, to mine association rules you will use the implementation of the Apriori algorithm provided in Weka.
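To recall how Apriori finds frequent itemsets before rules are generated from them, here is a minimal Python sketch of the level-wise search. It is not the Weka implementation: the function name apriori, the parameter min_support, and the toy transactions are assumptions made for illustration only.

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Return {frozenset(itemset): support} for all frequent itemsets.

    transactions: list of sets of items
    min_support: minimum fraction of transactions that must contain the itemset
    """
    n = len(transactions)

    def support(itemset):
        return sum(1 for t in transactions if itemset <= t) / n

    # Level 1: frequent single items.
    items = {item for t in transactions for item in t}
    current = {}
    for item in items:
        s = support(frozenset([item]))
        if s >= min_support:
            current[frozenset([item])] = s
    frequent = dict(current)

    k = 2
    while current:
        # Join step: unite pairs of frequent (k-1)-itemsets into k-itemsets.
        prev = list(current)
        candidates = {a | b for a, b in combinations(prev, 2) if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {c for c in candidates
                      if all(frozenset(s) in current for s in combinations(c, k - 1))}
        current = {}
        for c in candidates:
            s = support(c)
            if s >= min_support:
                current[c] = s
        frequent.update(current)
        k += 1
    return frequent

# Toy data, made up for illustration (not taken from marketBasket.arff).
toy = [
    {"beer", "frozenmeal", "cannedveg"},
    {"beer", "frozenmeal"},
    {"wine", "confectionery"},
    {"beer", "cannedveg", "frozenmeal"},
    {"wine", "fish"},
]
result = apriori(toy, min_support=0.4)
for itemset, s in sorted(result.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), round(s, 2))
```

The prune step is what makes the search tractable: an itemset can only be frequent if all of its subsets are frequent, so most candidates are discarded without counting them against the database.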

5. Lab Tasks

A. Run the Apriori algorithm on the data file of the weather problem and analyze the resulting association rules. Compare these rules with the rules produced by the JRip algorithm. On the basis of this comparison, derive a simple modification of the Apriori algorithm that makes it applicable to decision-rule learning.

B. Study the data file marketBasket.arff.

C. Run the Apriori algorithm on the data file marketBasket.arff and try to find interesting association rules. To do so, experiment with the algorithm options support, confidence, lift, and conviction until you find appropriate values.
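When choosing values for these options in Task C, it helps to know what each measure captures. The Python sketch below computes the four measures for a rule X → Y using their standard textbook definitions; the function name rule_metrics and the toy transactions are hypothetical, and Weka may report the measures in a slightly different form (for example, support as a count rather than a fraction).

```python
import math

def rule_metrics(transactions, antecedent, consequent):
    """Support, confidence, lift and conviction of the rule antecedent -> consequent."""
    antecedent, consequent = set(antecedent), set(consequent)
    n = len(transactions)
    n_a = sum(1 for t in transactions if antecedent <= t)
    n_c = sum(1 for t in transactions if consequent <= t)
    n_ac = sum(1 for t in transactions if (antecedent | consequent) <= t)
    if n_a == 0:
        raise ValueError("antecedent never occurs in the data")

    support = n_ac / n            # P(X and Y)
    confidence = n_ac / n_a       # P(Y | X)
    p_c = n_c / n                 # P(Y)
    # Lift compares P(Y | X) with P(Y); 1 means X and Y look independent.
    lift = confidence / p_c if p_c > 0 else math.nan
    # Conviction is (1 - P(Y)) / (1 - confidence); infinite for rules that never fail.
    conviction = (1 - p_c) / (1 - confidence) if confidence < 1 else math.inf
    return {"support": support, "confidence": confidence,
            "lift": lift, "conviction": conviction}

# Toy data, made up for illustration (not taken from marketBasket.arff).
toy = [
    {"beer", "frozenmeal", "cannedveg"},
    {"beer", "frozenmeal"},
    {"wine", "confectionery"},
    {"beer", "cannedveg", "frozenmeal"},
    {"wine", "fish"},
]
m = rule_metrics(toy, {"frozenmeal"}, {"cannedveg"})
m_exact = rule_metrics(toy, {"beer"}, {"frozenmeal"})
print(m)
print(m_exact)
```

A rule with high confidence is not necessarily interesting: if the consequent is bought in almost every transaction, confidence is high for any antecedent. Lift and conviction correct for this by taking the frequency of the consequent into account.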
