Projects

1 Simulate pictures from a photograph in the style of your favorite painter.

Suggested method: Create a database of patches, taken from many paintings, that are typical of the painter. Replace each patch in the photograph with the most similar patch in the database. As a first step, work in grayscale only. Patches are best matched after low-pass filtering. As a second step, try incorporating color. You should also propose a method for stitching patches. If the patches are too small, the photograph will still look very much the same, so think about the patch size (possibly selected adaptively).
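The grayscale first step can be sketched as follows. This is a minimal illustration, not a prescribed implementation: it assumes NumPy, non-overlapping square patches, a simple box filter as the low-pass filter, and a sum-of-squared-differences similarity. The function names (box_blur, extract_patches, stylize) are hypothetical, and stitching is left out entirely.

```python
import numpy as np

def box_blur(img, k=3):
    """Low-pass filter: mean over a k x k window, with edge padding."""
    pad = k // 2
    padded = np.pad(img.astype(float), pad, mode="edge")
    out = np.zeros(img.shape, dtype=float)
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out / (k * k)

def extract_patches(img, size):
    """Cut a grayscale image into non-overlapping size x size patches."""
    h, w = img.shape
    return [img[y:y + size, x:x + size]
            for y in range(0, h - size + 1, size)
            for x in range(0, w - size + 1, size)]

def stylize(photo, painting_patches, size=8, k=3):
    """Replace each photo patch with the database patch that is closest
    after low-pass filtering (assumed similarity: squared differences)."""
    db_sharp = np.stack([p.astype(float) for p in painting_patches])
    db_blur = np.stack([box_blur(p, k).ravel() for p in painting_patches])
    photo_blur = box_blur(photo, k)
    out = photo.astype(float).copy()
    h, w = photo.shape
    for y in range(0, h - size + 1, size):
        for x in range(0, w - size + 1, size):
            query = photo_blur[y:y + size, x:x + size].ravel()
            dists = np.sum((db_blur - query) ** 2, axis=1)
            # Paste the unblurred painting patch, not the blurred one.
            out[y:y + size, x:x + size] = db_sharp[np.argmin(dists)]
    return out
```

A database could be built by calling extract_patches on each grayscale painting with the same size passed to stylize. Matching is done on blurred patches while the pasted patch is the original one, so brush texture is kept in the output.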

2 Classify images of two categories using bag of features with a discriminatively learnt visual vocabulary.

The task in this project is to develop a classifier that can discriminate between images of two classes of objects, for example cats and dogs. Unlike in the other projects, in this one you are not free to choose the method. The goal is to implement and test the novel recognition method described below.

The training set should be divided into two subsets: one for the clustering and SVM learning, and the other for the vocabulary construction.

You can use any interest point detector for patch selection. Use SIFT descriptors as features and K-means for the initial quantization. You can try to compute the clustering on both categories at the same time or on each category separately (check how mixed the classes are in each cluster). The resulting clusters should be purified by learning an SVM classifier on each cluster.
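The initial quantization and the check of class mix per cluster might look like the sketch below. It assumes NumPy, features stored as rows of an array, and class labels of +1 and -1. The plain Lloyd's-iteration K-means and the names kmeans and class_mix are illustrative only, and the per-cluster SVM step is not shown.

```python
import numpy as np

def kmeans(features, k, n_iter=20, seed=0):
    """Plain Lloyd's K-means; returns (centers, assignment of each feature)."""
    rng = np.random.default_rng(seed)
    start = rng.choice(len(features), size=k, replace=False)
    centers = features[start].astype(float)
    for _ in range(n_iter):
        d = ((features[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        assign = d.argmin(axis=1)
        for j in range(k):
            members = features[assign == j]
            if len(members):  # keep the old center if a cluster empties
                centers[j] = members.mean(axis=0)
    # Final assignment against the final centers.
    d = ((features[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return centers, d.argmin(axis=1)

def class_mix(assign, labels, k):
    """Fraction of class +1 features in each cluster (nan if empty)."""
    mix = np.full(k, np.nan)
    for j in range(k):
        members = labels[assign == j]
        if len(members):
            mix[j] = np.mean(members == 1)
    return mix
```

A fraction near 0 or 1 indicates a cluster dominated by one class, while a fraction near 0.5 indicates a strongly mixed cluster.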

Vocabulary construction:

Each cluster will be associated with two bits. Thus the size of the word will be 2*N bits, where N is the number of clusters.

Each training feature will be converted into a binary string according to the following scheme:

1) Find the closest cluster by measuring the distances between the feature and the cluster centers.

2) Apply the SVM to classify the feature into one of the two classes.

3) Update the pair of bits in the word that is associated with the closest cluster: set the first bit of the pair to 1 if the SVM label in (2) was 1, or set the second bit of the pair to 1 if the label was -1 (the other bit of the pair should remain zero).
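The three steps above can be sketched as a single function. This is only one reading of the scheme: it assumes that the SVM applied in step (2) is the one learnt on the closest cluster, and it represents each SVM by a hypothetical callable returning +1 or -1 so that the example needs no external library. The name encode_feature is illustrative.

```python
def encode_feature(feature, centers, classifiers):
    """Convert one feature into a binary word of length 2*N.

    centers:     list of N cluster centers (sequences of numbers)
    classifiers: list of N callables, one per cluster, each returning
                 +1 or -1 (assumed stand-ins for the per-cluster SVMs)
    """
    n = len(centers)
    # Step 1: closest cluster by squared Euclidean distance.
    dists = [sum((f - c) ** 2 for f, c in zip(feature, center))
             for center in centers]
    closest = dists.index(min(dists))
    # Step 2: classify the feature (assumption: SVM of the closest cluster).
    label = classifiers[closest](feature)
    # Step 3: set the first bit of the pair for +1, the second for -1.
    word = [0] * (2 * n)
    word[2 * closest + (0 if label == 1 else 1)] = 1
    return word
```

With two clusters the word has four bits, and a feature closest to the second cluster with label -1 would be encoded as [0, 0, 0, 1].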

The resulting binary codes will form the vocabulary. (If the vocabulary is too large, try using fewer clusters.)

Use nearest neighbor with the Hamming distance for bag-of-features construction. Train an SVM on the bags of features constructed from the training data to classify between the two categories.
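The nearest-neighbor assignment with the Hamming distance could be sketched as below. It assumes that binary codes are lists of 0/1 values of equal length and that the bag of features is a normalized histogram over vocabulary words. The normalization and the names hamming and bag_of_features are assumptions, and the final SVM training is not shown.

```python
def hamming(a, b):
    """Number of positions in which two equal-length binary codes differ."""
    return sum(x != y for x, y in zip(a, b))

def bag_of_features(image_codes, vocabulary):
    """Histogram of nearest vocabulary words for the codes of one image."""
    hist = [0] * len(vocabulary)
    for code in image_codes:
        nearest = min(range(len(vocabulary)),
                      key=lambda i: hamming(code, vocabulary[i]))
        hist[nearest] += 1
    total = sum(hist)
    # Normalize so that images with different feature counts are comparable.
    return [h / total for h in hist] if total else hist
```

Each image then yields one histogram, and these histograms are the inputs to the final two-category SVM.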

The resulting method should be tested empirically and compared to a standard bag of features approach.

Some relevant references:

  • Csurka, G., Dance, C. R., Fan, L., Willamowski, J., Bray, C., 2004. Visual categorization with bags of keypoints. In: Workshop on Statistical Learning in Computer Vision.
  • Zhang, J. G., Marszalek, M., Lazebnik, S., Schmid, C., 2007. Local features and kernels for classification of texture and object categories: A comprehensive study. IJCV 73(2), 213-238.

3 Automatic cast listing for movies or TV shows

Develop a system that takes a movie and automatically determines the key actors and actresses and labels them each time they appear in the show. In addition to the visual cues, consider also exploiting information from the subtitle text, as is done in Everingham et al.

Some relevant references:

  • W. Fitzgibbon and A. Zisserman. On affine invariant clustering and automatic cast listing in movies. ECCV 2002.
  • J. Sivic and A. Zisserman. Video Data Mining Using Configurations of Viewpoint Invariant Regions. CVPR 2004.
  • T. Berg, A. Berg, J. Edwards, D. Forsyth. Who's in the Picture. NIPS 2004.
  • M. Everingham, J. Sivic, A. Zisserman. Hello! My name is... Buffy -- Automatic Naming of Characters in TV Video. British Machine Vision Conference, 2006.

4 Celebrity look-alikes

Design a system that can take a photo of a face and determine the celebrity who looks most similar. (Check out the system from myheritage.com.) You can download a face detection system from the web just to locate a face in the image, but the face recognition algorithms should be implemented by the group.

5 Face alignment – localizing eyes, nose, and mouth

Design a system that finds the main features in a human face: eyes, nose, and mouth. You can use any method you like. The system should be nearly real time and should be tested on a large volume of images (provided).

6 Find the frontal pose of a face in a sequence.

Face recognition systems achieve very impressive recognition results on frontal faces with no or very minor occlusions. The purpose of this project is to build a system that tracks the face of a person in a video and chooses a set of frames where the face appears in a frontal pose and is not occluded. The system should work with a real camera. (A high-resolution webcam will be provided for the project.)

7 Using friends’ faces to make your portrait

In this project you should design a system that takes the face of a person as input and constructs a synthetic image from the faces of other people (or other images). The resulting image should be similar to the input face, but should not include artifacts of the viewing conditions, such as effects of illumination and minor occlusions (glasses, facial hair, etc.).

This project investigates a new patch-based representation of a human face. First, it can be used in secure applications where the identity of the person, specifically their photo, is classified. Second, it can be used in face identification with occlusions, such as glasses, facial hair, etc.

A proposed approach:

Create a database of facial fragments from many people. For each patch in the input face, look for the closest patch in the database, but be careful to find the best identity match rather than the best illumination or alignment match, since these two factors significantly influence some image similarity measures.
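One possible way to reduce the influence of illumination on the match, shown here purely as an assumption and not as the proposed method, is to normalize each patch to zero mean and unit variance before comparing. The sketch below uses NumPy, does not address alignment, and the names normalize_patch and closest_patch are hypothetical.

```python
import numpy as np

def normalize_patch(patch, eps=1e-8):
    """Remove brightness offset and contrast scale from a patch."""
    p = patch.astype(float)
    return (p - p.mean()) / (p.std() + eps)

def closest_patch(query, database):
    """Index of the database patch closest to the query after normalization."""
    q = normalize_patch(query).ravel()
    dists = [np.sum((normalize_patch(p).ravel() - q) ** 2) for p in database]
    return int(np.argmin(dists))
```

With this normalization, a database patch that has the same structure as the query but a different overall brightness is preferred over a patch with similar brightness but different structure.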

For a nicer stitching of patches in the portrait you can use a stitching method proposed in