Advanced Topics in Computer Vision Spring 2010

Topics and Papers


1. Image and video descriptors:
There are currently many papers under this topic; some of them may be removed. Presenters are not expected to go into all of the details, but rather to be familiar with the different types of descriptors.


Image Descriptors:

·  [SIFT] Distinctive image features from scale-invariant keypoints. D. G. Lowe. IJCV, 60(2):91–110, 2004.

·  [GIST] Modeling the shape of the scene: a holistic representation of the spatial envelope. Aude Oliva, Antonio Torralba. International Journal of Computer Vision, Vol. 42(3): 145-175, 2001.

·  [Shape-Context] Shape Matching and Object Recognition Using Shape Contexts, PAMI April 2002.

·  [Geometric Blur] Geometric Blur for Template Matching. Computer Vision and Pattern Recognition (CVPR) 2001, Hawaii, pp. I.607-614.

·  [Local Self-Similarity] Matching Local Self-Similarities across Images and Videos. E. Shechtman and M. Irani. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2007.

·  [SURF] "SURF: Speeded Up Robust Features", Herbert Bay, Andreas Ess, Tinne Tuytelaars, Luc Van Gool. Computer Vision and Image Understanding (CVIU), Vol. 110, No. 3, pp. 346-359, 2008.

·  [LBP] Description of interest regions with local binary patterns. M. Heikkila, M. Pietikainen and C. Schmid. Pattern Recognition, Volume 42, Issue 3, March 2009, Pages 425-436.

Video Descriptors:

·  [Space-time Local Self-Similarity]

·  [Space-time corners] "On Space-Time Interest Points", I. Laptev; International Journal of Computer Vision, vol. 64, number 2/3 (2005), OR I. Laptev and T. Lindeberg; in Proc. ICCV'03, Nice, France, pp. I:432-439.

·  [Space-Time SIFT]

Survey / comparison papers for different applications (recognition / matching):

·  Local features and kernels for classification of texture and object categories: a comprehensive study. J. Zhang, M. Marszalek, S. Lazebnik and C. Schmid.
International Journal of Computer Vision, 73(2):213-238, 2007.

·  A performance evaluation of local descriptors. K. Mikolajczyk and C. Schmid. PAMI, 27(10):1615–1630, 2005.

·  Comparing local feature descriptors in pLSA-based image models. E. Horster, T. Greif, R. Lienhart, and M. Slaney. DAGM, 2008.

2. Exploiting the wealth of huge image libraries for solving Computer Vision problems:

·  James Hays, Alexei A. Efros. Scene Completion Using Millions of Photographs. ACM Transactions on Graphics (SIGGRAPH 2007). August 2007, vol. 26, No. 3.
http://graphics.cs.cmu.edu/projects/scene-completion/scene-completion.pdf

·  Simon, I. and Seitz, S. M. Scene Segmentation Using the Wisdom of Crowds. ECCV 2008.
http://grail.cs.washington.edu/pub/papers/simon08ss.pdf

·  D. Bitouk, N. Kumar, S. Dhillon, P. N. Belhumeur, and S. K. Nayar, "Face Swapping: Automatically Replacing Faces in Photographs," ACM Trans. on Graphics (also Proc. of ACM SIGGRAPH), Aug, 2008.
http://www1.cs.columbia.edu/CAVE/publications/./pdfs/Bitouk_SIGGRAPH08.pdf

·  J. Hays, A. Efros. IM2GPS: estimating geographic information from a single image. IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), 2008.
http://graphics.cs.cmu.edu/projects/im2gps/


3. Efficient search in large imagery databases:

·  A. Torralba, R. Fergus, W. T. Freeman, "80 million tiny images: a large dataset for non-parametric object and scene recognition", PAMI 2008.
http://people.csail.mit.edu/torralba/tmp/tiny.pdf

·  A. Torralba, R. Fergus, Y. Weiss, "Small codes and large databases for recognition", CVPR 2008.
http://people.csail.mit.edu/torralba/publications/cvpr2008.pdf

·  Yair Weiss, Antonio Torralba, Rob Fergus, "Spectral Hashing", NIPS 2008.
http://books.nips.cc/papers/files/nips21/NIPS2008_0806.pdf

·  D. Nister and H. Stewenius. Scalable recognition with a vocabulary tree. IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), pages 2161–2168, 2006.
http://www.vis.uky.edu/~stewe/publications/nister_stewenius_cvpr2006.pdf

·  N. Kumar, P. N. Belhumeur, and S. K. Nayar, "FaceTracer: A Search Engine for Large Collections of Images with Faces," European Conference on Computer Vision (ECCV), pp.340-353, Oct, 2008.
http://www1.cs.columbia.edu/CAVE/publications/./pdfs/Kumar_ECCV08.pdf

4. Statistics of Natural Images:

·  S. C. Zhu and D. Mumford. Prior learning and Gibbs reaction-diffusion. IEEE PAMI, 19(11):1236–1250, 1997.
http://www.stat.ucla.edu/~sczhu/papers/Generic_prior.pdf

·  S. Roth and M. J. Black. Fields of experts: A framework for learning image priors. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2005.
http://www.gris.informatik.tu-darmstadt.de/~sroth/pubs/cvpr05roth.pdf
(There is also a longer and more detailed journal version -- IJCV'09: http://www.gris.informatik.tu-darmstadt.de/~sroth/pubs/foe-ijcv.pdf)

·  Y. Weiss and W. T. Freeman. What makes a good model of natural images. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2007.
http://www.cs.huji.ac.il/~yweiss/foe-final.pdf

5. Blind Deconvolution

·  H. Attias. Independent factor analysis. Neural Computation, 11, 803-851, 1999.
http://www.goldenmetallic.com/research/ifa.pdf

·  R. Fergus, B. Singh, A. Hertzmann, S.T. Roweis, and W.T. Freeman. Removing camera shake from a single photograph. SIGGRAPH, 2006.

·  N. Joshi, R. Szeliski, and D. Kriegman. PSF estimation using sharp edge prediction. In CVPR, 2008.

·  A. Levin, Y. Weiss, F. Durand, and W. Freeman. Understanding and evaluating blind deconvolution algorithms. In CVPR, 2009.


6. Lightfield

·  Marc Levoy and Patrick M. Hanrahan. Light field rendering. In SIGGRAPH, 1996.

·  Steven J. Gortler, Radek Grzeszczuk, Richard Szeliski, and Michael F. Cohen. The lumigraph. SIGGRAPH ’96: Proceedings of the 23rd annual conference on Computer graphics and interactive techniques, pages 43–54, New York, NY, USA, 1996. ACM

·  J. Chai, X. Tong, S. Chan, and H. Shum. Plenoptic sampling. SIGGRAPH, 2000.

·  Aaron Isaksen, Leonard McMillan, and Steven J. Gortler. Dynamically reparameterized light fields. SIGGRAPH, 2000.

·  Ren Ng. Fourier slice photography. SIGGRAPH, 2005.

7. Coded Aperture

·  Zhou, C., Nayar, S.K.: What are Good Apertures for Defocus Deblurring? In: IEEE International Conference on Computational Photography. (2009)

·  Zhou, C., Lin, S., Nayar, S.K.: Coded Aperture Pairs for Depth from Defocus. In: ICCV. (2009)

·  Levin, A., Fergus, R., Durand, F., Freeman, W.: Image and depth from a conventional camera with a coded aperture. SIGGRAPH (2007)

·  Veeraraghavan, A., Raskar, R., Agrawal, A., Mohan, A., Tumblin, J.: Dappled photography: Mask-enhanced cameras for heterodyned light fields and coded aperture refocusing. SIGGRAPH (2007)

8. Compressed sensing

Compressive sensing, matrix completion

·  Emmanuel Candes and Michael Wakin, An introduction to compressive sampling, IEEE Signal Processing Magazine, 25(2), pp. 21-30, March 2008.

·  http://www-stat.stanford.edu/~candes/papers/MatrixCompletion.pdf

·  Also may be of interest
http://www-stat.stanford.edu/~candes/papers/RobustPCA.pdf

·  [SFM with sparsity:]
http://www.maths.lth.se/vision/publdb/reports/pdf/olsson-oskarsson-scia-09.pdf

·  [Face recognition with sparsity:]
http://watt.csl.illinois.edu/~yima/psfile/PAMI-Face.pdf

9. Sparse Representations

Dictionaries for sparse representation modeling:

·  M. Aharon, M. Elad, and A. M. Bruckstein, “The K-SVD: an algorithm for designing of overcomplete dictionaries for sparse representation”, IEEE Trans. Signal Process., vol. 54, no. 11, pp. 4311–4322, 2006.

·  http://www.cs.technion.ac.il/~elad/publications/journals/2006/Review_Paper_SIAM_Review.pdf

·  http://www.cs.technion.ac.il/~elad/publications/journals/2005/34_KSVD_LAA.pdf

·  http://www.cs.technion.ac.il/~elad/publications/journals/2009/IEEE_Proc_Dictionary.pdf

10. Graph-Cut – using higher-order potentials

·  Yuri Boykov, Olga Veksler, Ramin Zabih. Fast Approximate Energy Minimization via Graph Cuts. IEEE Transactions on PAMI, vol. 23, no. 11, pp. 1222-1239, November 2001.
http://www.csd.uwo.ca/faculty/olga/Papers/pami01_final.pdf

·  Pushmeet Kohli, M. Pawan Kumar, Philip Torr. P3 & Beyond: Solving Energies with Higher Order Cliques. CVPR 2007.
http://research.microsoft.com/en-us/um/people/pkohli/papers/cvpr07.pdf

·  Carsten Rother, Pushmeet Kohli, Wei Feng, Jiaya Jia. Minimizing Sparse Higher Order Energy Functions of Discrete Variables. CVPR 2009.
http://research.microsoft.com/en-us/um/people/pkohli/papers/rkfj_cvpr09.pdf

·  Pushmeet Kohli, Lubor Ladicky, Philip Torr. Robust Higher Order Potentials for Enforcing Label Consistency. IJCV 2009.
http://research.microsoft.com/en-us/um/people/pkohli/papers/klt_IJCV09.pdf

11. Graph-Cut – using non-submodular functions

·  Vladimir Kolmogorov and Ramin Zabih. “What Energy Functions can be Minimized via Graph Cuts?” In IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), 26(2):147-159, February 2004.
http://www.cs.ucl.ac.uk/staff/V.Kolmogorov/papers/KZ-PAMI-graph_cuts.pdf

·  Vladimir Kolmogorov and Carsten Rother. “Minimizing non-submodular functions with graph cuts - a review.” In IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), 29(7):1274-1279, July 2007.
http://www.cs.ucl.ac.uk/staff/V.Kolmogorov/papers/KR-PAMI07.pdf

·  "Exact Optimization for Markov Random Fields with Convex Priors" Hiroshi Ishikawa, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 25, no. 10, pp. 1333-1336, Oct. 2003.

12. Action Recognition

Dynamic:

·  I. Laptev, M. Marszałek, C. Schmid. Learning realistic human actions from movies. CVPR 2008.
http://www.irisa.fr/vista/Papers/2008_cvpr_laptev.pdf

·  Lena Gorelick, Moshe Blank, Eli Shechtman, Michal Irani, and Ronen Basri. Actions as Space-Time Shapes. PAMI 2007.
http://www.wisdom.weizmann.ac.il/~vision/VideoAnalysis/Demos/SpaceTimeActions/SpaceTimeActions_pami07.pdf

·  I. Laptev and P. Pérez. Retrieving actions in movies. ICCV 2007.
http://www.irisa.fr/vista/Papers/2007_iccv_laptev.pdf

Static:

·  Abhinav Gupta, Aniruddha Kembhavi, Larry S. Davis. Observing Human-Object Interactions: Using Spatial and Functional Compatibility for Recognition. PAMI, October 2009 (vol. 31, no. 10), pp. 1775-1789.
http://www.computer.org/portal/web/csdl/doi?doc=doi/10.1109/TPAMI.2009.83

·  Li-Jia Li, Li Fei-Fei. What, where and who? Classifying events by scene and object recognition. ICCV 2007.
http://vision.stanford.edu/documents/LiFei-Fei_ICCV07.pdf