AnnoSearch: Image Auto-Annotation by Search

ABSTRACT

Although it has been studied for several years by the computer vision and machine learning communities, image annotation is still far from practical. In this paper, we present AnnoSearch, a novel way to annotate images using search and data mining technologies. Leveraging Web-scale images, we solve this problem in two steps:

1) Searching for semantically and visually similar images on the Web, and

2) Mining annotations from them.

First, at least one accurate keyword is required to enable text-based search for a set of semantically similar images. Then content-based search is performed on this set to retrieve visually similar images. Finally, annotations are mined from the descriptions (titles, URLs, and surrounding text) of these images. It is worth highlighting that, to ensure efficiency, high-dimensional visual features are mapped to hash codes, which significantly speeds up the content-based search process. Our proposed approach enables annotation with an unlimited vocabulary, which is impossible for all existing approaches. Experimental results on real Web images show the effectiveness and efficiency of the proposed algorithm.
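The mining step can be illustrated with a minimal sketch. It is written in Python for brevity rather than in the project's C#.NET, and it stands in simple term-frequency counting for whatever mining method the system actually uses; the function name `mine_annotations`, the stop-word list, and the sample descriptions are all assumptions made for illustration.

```python
import re
from collections import Counter

# Hypothetical stop-word list; a real system would need a fuller one.
STOP_WORDS = {"a", "an", "and", "at", "by", "for", "in", "of", "on",
              "over", "the", "to", "with", "jpg", "com", "www", "http"}

def mine_annotations(descriptions, query_keyword, top_k=2):
    """Return the top_k terms that appear in the most descriptions.

    descriptions: titles, URLs, or surrounding text of retrieved images.
    The query keyword is excluded because it is already known.
    """
    counts = Counter()
    for text in descriptions:
        # Count each term once per description so that one long
        # description cannot dominate the ranking.
        terms = set(re.findall(r"[a-z]+", text.lower()))
        terms -= STOP_WORDS
        terms.discard(query_keyword.lower())
        counts.update(terms)
    # Sort by descending count, then alphabetically for a stable order.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:top_k]]

descriptions = [
    "Sunset over the beach in Hawaii",   # a title
    "hawaii-beach-sunset.jpg",           # a URL fragment
    "Beach sunset with palm trees",      # surrounding text
]
print(mine_annotations(descriptions, "sunset"))  # ['beach', 'hawaii']
```

Because the candidate terms come from the retrieved descriptions rather than from a fixed label set, the output vocabulary is not limited in advance, which is the property the abstract emphasizes.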

ARCHITECTURE

EXISTING SYSTEM

In the existing systems, previous works require a supervised training stage to learn prediction models, which limits their vocabulary. Moreover, most of them rely on manually labeled training images. One reason annotation remains far from practical is that it is still unclear how to model semantic concepts effectively and efficiently.

The other reason is the lack of training data, so the semantic gap cannot be effectively bridged. With the prosperity of the Web, it has become a huge repository of almost all kinds of data and provides solutions to many problems that were believed to be “unsolvable”.

PROPOSED SYSTEM

The proposed system is set against earlier work, which has mainly followed two directions. One finds more representative features to model objects: images are represented as a group of blobs, and a statistical machine translation model is then used to translate the blobs into a set of keywords. The other uses machine learning techniques to learn the joint probabilities or correlations between images and keywords.

A notable advantage of our approach is that no supervised training process is adopted. As a direct result, our method can handle an unlimited vocabulary, which is a clear advantage over the previous works.

MODULES

1. LOGIN MODULE

2. VISUAL SIMILARITY

3. JOINT PROBABILITIES (OR) CORRELATION

4. CONTENT-BASED SEARCH

4a. Hash code-based Image retrieval

MODULES DESCRIPTION

LOGIN MODULE

In this module, login (also called logging in or signing in) is the process by which an individual's access to a computer system is controlled by identifying the user through credentials that the user provides.

A user can log in to a system and then log out (or log off) when access is no longer needed. Logging out may be done explicitly by the user performing some action, such as entering the appropriate command or clicking a website link labeled as such. It can also be done implicitly, such as by powering the machine off, closing a web browser window, leaving a website, or not refreshing a webpage within a defined period.

VISUAL SIMILARITY

This module retrieves images by visual similarity. The retrieved images can be inspected manually to identify what type of image each one is, and that identification is then used for social tagging of the image.

JOINT PROBABILITIES (OR) CORRELATION

What we are interested in is the fact that tags are annotated by different users, and that individual users vary in perspective and vocabulary. Incorporating user information may bring similar benefits to image understanding. On top of visual appearance, images from the same user or tagged by similar users can capture more semantic correlations.
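The text does not say how "similar users" are measured. As one hedged illustration, in Python rather than the project's C#.NET, the sketch below compares users by the overlap of their tag vocabularies using Jaccard similarity. The measure itself, the function names, and the sample tagging events are assumptions, not part of the described system.

```python
def user_vocabularies(tag_events):
    """Group tags by user.

    tag_events: iterable of (user, image, tag) triples.
    Returns a dict mapping each user to the set of tags they have used.
    """
    vocab = {}
    for user, _image, tag in tag_events:
        vocab.setdefault(user, set()).add(tag)
    return vocab

def jaccard(a, b):
    """Jaccard similarity of two sets: |a & b| / |a | b|."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

# Hypothetical tagging events for illustration only.
events = [
    ("alice", "img1", "beach"), ("alice", "img1", "sunset"),
    ("alice", "img2", "sea"),
    ("bob", "img3", "beach"), ("bob", "img3", "sea"),
    ("bob", "img4", "surf"),
    ("carol", "img5", "car"), ("carol", "img5", "road"),
]
vocab = user_vocabularies(events)
print(jaccard(vocab["alice"], vocab["bob"]))    # 0.5
print(jaccard(vocab["alice"], vocab["carol"]))  # 0.0
```

Under this assumption, images tagged by alice and bob would be treated as more likely to share semantics than images tagged by alice and carol.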

CONTENT-BASED SEARCH

Visual features are generally high-dimensional, so similarity-oriented search based on visual features is always an efficiency bottleneck for large-scale image database retrieval.

To overcome this problem, we adopt a hash encoding algorithm to speed up this procedure.
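The document does not specify the hash encoding algorithm, so the following is only a sketch of one common scheme, written in Python rather than the project's C#.NET: each feature dimension contributes one bit, set when the value exceeds that dimension's mean over the database. The function names and the sample feature vectors are assumptions for illustration.

```python
def fit_thresholds(features):
    """Per-dimension mean over a list of equal-length feature vectors."""
    n = len(features)
    dims = len(features[0])
    return [sum(f[d] for f in features) / n for d in range(dims)]

def hash_code(feature, thresholds):
    """Map a feature vector to an integer hash code.

    One bit per dimension: 1 if the value is above the threshold.
    Dimension 0 becomes the highest bit of the code.
    """
    code = 0
    for value, t in zip(feature, thresholds):
        code = (code << 1) | (1 if value > t else 0)
    return code

def hamming(a, b):
    """Number of differing bits between two hash codes."""
    return bin(a ^ b).count("1")

# Hypothetical 4-dimensional features for three images.
features = [[9, 1, 4, 8], [8, 2, 7, 1], [1, 9, 1, 3]]
thresholds = fit_thresholds(features)          # [6.0, 4.0, 4.0, 4.0]
codes = [hash_code(f, thresholds) for f in features]
print(codes)                                   # [9, 10, 4]
print(hamming(codes[0], codes[1]))             # 2
```

The speed-up comes from replacing floating-point distance computations over many dimensions with integer comparisons and XOR operations on short codes.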

4a. Hash code-based Image retrieval

In hash code-based retrieval, the higher bits of the hash codes contain the majority of the energy of an image. Hence, if the higher bits of two hash codes match, the images are likely to be more similar than if only the lower bits match. The following measure is proposed based on this analysis: images whose highest n bits of hash code exactly match those of the query image are kept, and then ranked based on their features.
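The filter-then-rank measure described above can be sketched as follows, in Python rather than the project's C#.NET. The ranking distance is not specified in the text, so Euclidean distance on the original features is assumed here; the names `top_bits` and `search`, the 8-bit code width, and the sample database are likewise hypothetical.

```python
import math

def top_bits(code, total_bits, n):
    """Return the highest n bits of a total_bits-wide hash code."""
    return code >> (total_bits - n)

def search(query, database, total_bits, n):
    """Keep images whose highest n hash bits match the query, then rank.

    query: (hash_code, feature_vector) pair.
    database: dict mapping image name to (hash_code, feature_vector).
    Ranking uses Euclidean distance on the features (an assumption).
    """
    q_code, q_feat = query
    q_prefix = top_bits(q_code, total_bits, n)
    candidates = [
        (name, feat)
        for name, (code, feat) in database.items()
        if top_bits(code, total_bits, n) == q_prefix
    ]
    candidates.sort(key=lambda item: math.dist(q_feat, item[1]))
    return [name for name, _ in candidates]

# Hypothetical database with 8-bit hash codes and 2-D features.
database = {
    "a.jpg": (0b10110010, [0.9, 0.1]),
    "b.jpg": (0b10111111, [0.5, 0.5]),
    "c.jpg": (0b01110010, [0.9, 0.1]),  # same low bits as a.jpg only
    "d.jpg": (0b10100000, [0.8, 0.2]),
}
query = (0b10110000, [0.8, 0.1])
print(search(query, database, total_bits=8, n=4))  # ['a.jpg', 'b.jpg']
```

Note that c.jpg is discarded even though its lower bits and its features match a.jpg: only the higher bits are used for filtering. A larger n gives a smaller candidate set and faster ranking, at the risk of discarding relevant images.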

SYSTEM REQUIREMENT SPECIFICATION

HARDWARE REQUIREMENTS

System: Pentium IV 2.4 GHz.
Hard Disk: 80 GB.
Monitor: 15-inch VGA Color.
Mouse: Logitech.
RAM: 512 MB.

SOFTWARE REQUIREMENTS

Operating System: Windows 7 Ultimate
Front End: Visual Studio 2010
Coding Language: C#.NET
Database: SQL Server 2008