TPR3411 Pattern Recognition
Tutorial 6
Part 1 - Theory
1. What is the difference between parametric and non-parametric approaches in pattern recognition?
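The parametric approach assumes that the form of the underlying density is known (for example, Gaussian) and estimates only its parameters. The common parametric forms are unimodal, whereas practical problems usually involve multimodal densities. The non-parametric approach, on the other hand, can be used with arbitrary distributions because it makes no assumption about the form of the underlying densities.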
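The contrast can be illustrated with a minimal sketch. The sample values and the window half-width h below are made up for illustration and are not part of the tutorial. A single Gaussian is fitted to data with two clusters, and its peak is compared with a simple local count that assumes no density form.

```matlab
% Hypothetical 1-D sample with two clusters (illustrative values only)
data = [1.0 1.2 0.8 1.1 0.9 5.0 5.2 4.8 5.1 4.9];

% Parametric: assume a single Gaussian and estimate its two parameters
mu    = mean(data);
sigma = std(data);
gauss = @(t) exp(-(t - mu).^2 ./ (2*sigma^2)) ./ (sigma*sqrt(2*pi));

% Non-parametric: fraction of samples near a query point, no assumed form
h = 1;                                          % window half-width (assumed)
localFrac = @(t) sum(abs(data - t) < h) / numel(data);

% The fitted Gaussian peaks at the mean, yet no sample lies near the mean
fprintf('mean = %.2f, Gaussian density at mean = %.3f\n', mu, gauss(mu));
fprintf('fraction of samples within %g of the mean = %.2f\n', h, localFrac(mu));
fprintf('fraction of samples within %g of 1.0 = %.2f\n', h, localFrac(1.0));
```

The unimodal fit places its maximum between the two clusters, where the local count finds no samples at all.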
2. Briefly explain Parzen windows in classification.
The Parzen window method is a technique for nonparametric density estimation that can also be used for classification. Given a kernel function, it approximates the distribution of the training set by a linear combination of kernels centered on the observed points. For classification, the density of each class is estimated in this way and a new sample is assigned to the class with the largest estimated posterior.
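As a rough sketch of the idea, the code below applies a Parzen estimate with a 2-D Gaussian kernel to the cell data from question 4 and the sample (4, 7). The Gaussian kernel, the window width h = 2, and the use of sample counts as class priors are all assumptions made for this illustration. They are not prescribed by the tutorial.

```matlab
Class1 = [2 4; 3 7; 5 4];          % Malignant training samples
Class2 = [3 8; 5 9; 7 10; 6 8];    % Benign training samples
x = [4 7];                         % sample to classify
h = 2;                             % window width (assumed value)

% Squared Euclidean distance from x to every row of a training matrix
sqDist = @(T) sum((T - repmat(x, size(T,1), 1)).^2, 2);

% Parzen estimate of p(x | class): average of 2-D Gaussian kernels of width w
parzen = @(T, w) mean(exp(-sqDist(T) ./ (2*w^2))) / (2*pi*w^2);

% Class priors estimated from the sample counts (an assumption)
n1 = size(Class1,1);
n2 = size(Class2,1);
score1 = parzen(Class1, h) * n1/(n1 + n2);
score2 = parzen(Class2, h) * n2/(n1 + n2);

if score1 > score2
    label = 'Malignant';
else
    label = 'Benign';
end
fprintf('h = %g: %s (scores %.4f vs %.4f)\n', h, label, score1, score2);
```

The decision depends on the window width. With these data a narrower window such as h = 1 lets the single very close Malignant sample at (3, 7) dominate, and the comparison goes the other way.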
3. What is the goal of the k-nearest neighbor method in classification?
The goal of k-nearest neighbor is to classify a new sample by assigning it the label most frequently represented among its k nearest training samples, that is, by a majority vote.
4. Suppose you are given the task of classifying whether a given cell sample is malignant or benign. You have the following set of training samples for this task.
Feature 1 / Feature 2 / Classification
2 / 4 / Malignant
3 / 8 / Benign
5 / 9 / Benign
3 / 7 / Malignant
7 / 10 / Benign
5 / 4 / Malignant
6 / 8 / Benign
Given a new sample, x = (4, 7), use the k-nearest neighbor method with k = 3 to classify it.
Feature 1 / Feature 2 / Squared Euclidean distance / Rank (minimum distance) / Included in 3-NN? / Category of NN
2 / 4 / 13 / 6 / No / -
3 / 8 / 2 / 2 / Yes / Benign
5 / 9 / 5 / 3 / Yes / Benign
3 / 7 / 1 / 1 / Yes / Malignant
7 / 10 / 18 / 7 / No / -
5 / 4 / 10 / 5 / No / -
6 / 8 / 5 / 4 / No / -
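Two of the three nearest neighbors are Benign and one is Malignant, so we classify the new sample as Benign. The samples (5, 9) and (6, 8) tie for third place at distance 5, but both are Benign, so the result does not depend on how the tie is broken.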
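The distance and rank columns of the table can be checked with a short script. This is a sketch that assumes the new sample is (4, 7) and that the distances are squared Euclidean distances, which is what the tabulated values correspond to. The variable names are chosen for illustration.

```matlab
% Training samples in the same row order as the table
train = [2 4; 3 8; 5 9; 3 7; 7 10; 5 4; 6 8];
x = [4 7];                                   % new sample

% Squared Euclidean distance from x to each training sample
d2 = sum((train - repmat(x, size(train,1), 1)).^2, 2);

% Rank the samples by distance; sort is stable, so tied samples
% keep their original row order
[~, order] = sort(d2);
ranks = zeros(size(d2));
ranks(order) = 1:numel(d2);

% Columns: Feature 1, Feature 2, squared distance, rank
disp([train d2 ranks]);
```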
Part 2 - Practical
Objective: Convert the malignant/benign classification problem above into a MATLAB program.
1. Create matrices for the two classes:
Class1 = [2 4; 3 7; 5 4]          % Malignant samples, one row per sample
Class2 = [3 8; 5 9; 7 10; 6 8]    % Benign samples, one row per sample
2. Plot the data in each class. Your graph should look as follows.
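x1_1 = Class1(:,1);                   % Feature 1 of Class1
x1_2 = Class1(:,2);                   % Feature 2 of Class1
plot(x1_1,x1_2,'*','Color','blue')    % Class1 (Malignant) as blue stars
hold on                               % keep the axes for the next plots
x2_1 = Class2(:,1);                   % Feature 1 of Class2
x2_2 = Class2(:,2);                   % Feature 2 of Class2
plot(x2_1,x2_2,'o','Color','red')     % Class2 (Benign) as red circles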
3. Create a variable for the new sample, x = (4, 7). Plot the new sample in the graph you just created. Your graph should look as follows:
x = [4 7]
plot(x(1),x(2),'x','Color','black','MarkerSize',15)   % new sample as a large black cross
4. Call the k-NN function with k = 3. What is the returned result?
kNN(Class1,Class2,x,3)
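The source of the kNN function is not reproduced in this tutorial. As a minimal sketch of what such a call might compute, the script below assumes squared Euclidean distance and a majority vote, and encodes the classes as 1 (Class1, Malignant) and 2 (Class2, Benign). The variable names and the numeric encoding are assumptions for illustration. The supplied function may differ.

```matlab
Class1 = [2 4; 3 7; 5 4];          % Malignant
Class2 = [3 8; 5 9; 7 10; 6 8];    % Benign
x = [4 7];                         % new sample
k = 3;                             % number of neighbors

% Stack the training data and build a matching label vector (1 or 2)
train  = [Class1; Class2];
labels = [ones(size(Class1,1),1); 2*ones(size(Class2,1),1)];

% Squared Euclidean distance from x to every training sample
d2 = sum((train - repmat(x, size(train,1), 1)).^2, 2);

% Labels of the k closest samples, then a majority vote
[~, order] = sort(d2);
nearest = labels(order(1:k));
result  = mode(nearest);           % on a tied vote, mode returns the smaller label

names = {'Malignant', 'Benign'};
fprintf('k = %d: the new sample is classified as %s\n', k, names{result});
```

Under these assumptions the three nearest samples are (3, 7), (3, 8) and (5, 9), so the vote agrees with the hand calculation in Part 1.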
Note: How to choose k?
· “Rule of thumb”: choose k ≈ sqrt(n), where n is the number of samples.
· For efficiency, choose k = 1
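One widely quoted rule of thumb takes k close to the square root of the number of samples. Assuming that rule, and assuming k is additionally forced to be odd so that a two-class vote cannot tie, the choice for the seven training samples of this tutorial can be sketched as follows.

```matlab
n = 7;                    % number of training samples in this tutorial
k = round(sqrt(n));       % sqrt(7) is about 2.65, which rounds to 3

% For two classes, an odd k rules out tied votes (an added assumption)
if mod(k, 2) == 0
    k = k + 1;
end
fprintf('n = %d samples, suggested k = %d\n', n, k);
```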