FOUNDIT: Searching for Decoration Designs in Digital Catalogues

E. J. Pauwels and M.J. Huiskes

Centre for Mathematics and Computer Science (CWI),

Kruislaan 413, 1098SJ Amsterdam, The Netherlands

E-mail: {Eric.Pauwels,Mark.Huiskes}@cwi.nl

K. Noonan, P. Bernard and P. Vandenborre

Sophis Systems NV,

Vlamingstraat 19, B-8560 Wevelgem, Belgium
E-mail: {karl,Paul.Bernard,piet}@sophis.be

P. Pianezza and M. de Maddalena

Pianezza Paolo Srl., Italy.

Località Oro 6, 21030 Azzio (VA), Italy
E-mail: {paolo,marco}@pianezza.it

The FOUNDIT project aims to develop a system for content-based image retrieval (CBIR) that allows users to give relevance feedback in a natural and intuitively transparent manner. Although the aim is to develop a generic system, the project focuses its efforts on large and challenging databases provided by the decoration industry.

1.  Introduction

Content-based image retrieval (CBIR) remains a challenging problem, especially when the user’s subjective appreciation is involved (e.g. when browsing databases of decorative designs such as clothes, textiles, wallpaper, etc.). In these applications, the only way to elucidate the user’s preference is by continuously soliciting feedback. This feedback is then harnessed to estimate, for each image in the database, the likelihood of its relevance with respect to the user’s goals, whereupon the most promising candidates are displayed for further inspection and feedback. The most straightforward way to model the fuzzy state of knowledge about the user’s preferences is for the search engine to assign to every image I in the database a relevance probability p(I) that reflects the current estimate of relevance. As more information about the user’s preferences gradually becomes available, the probability measure changes to reflect the reduced uncertainty, and images that are assigned a high relevance become more likely to be sampled for display.

The goal of the FOUNDIT project [1] is to build a CBIR search engine based on the above principles that can handle the requirements typically encountered in decoration-related image and design databases.

The FOUNDIT system comprises the following three modules. The graphically oriented interface (see Fig. 1) allows the user to provide the system with relevance feedback by selecting examples and counter-examples, which are collected in separate bins. This qualitative feedback is then transformed by the relevance inference engine into a probabilistic relevance measure for each image in the database by coupling it to mathematical features. The inference engine therefore relies on the availability of pre-computed features that characterize the visual appearance of the images. This feature database is generated off-line by the feature extraction engine.
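The idea that images with a high relevance probability p(I) are more likely to be sampled for display can be illustrated with a minimal sketch. The function name, the dictionary representation of p(I) and the small weight floor are assumptions made for illustration; the source does not specify how FOUNDIT draws its display set.

```python
import random

def sample_for_display(relevance, k, rng=None):
    """Draw up to k distinct image ids, weighting each draw by p(I).

    relevance: dict mapping image id -> current relevance estimate p(I).
    This is an illustrative scheme, not the FOUNDIT implementation.
    """
    rng = rng or random.Random()
    remaining = dict(relevance)
    chosen = []
    for _ in range(min(k, len(remaining))):
        ids = list(remaining)
        # A tiny floor keeps the draw valid even if every p(I) is zero.
        weights = [max(remaining[i], 1e-12) for i in ids]
        pick = rng.choices(ids, weights=weights, k=1)[0]
        chosen.append(pick)
        del remaining[pick]  # sample without replacement
    return chosen
```

Sampling rather than simply showing the top-k images keeps some low-probability images on screen, which gives the user a chance to correct an estimate that is still uncertain.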

Figure 1. Screenshot of the FOUNDIT prototype interface. Browsing and searching are supported both by manually annotated categories (index on the left) and by relevance feedback. Feedback is supplied by selecting positive and negative examples, which are collected on the display bar at the bottom (positive examples on the left, negative on the right).

2.  Feature Extraction Engine

The Feature Extraction Engine consists of a large collection of algorithms for quantitative image characterization. The routines are not restricted to the computation of low-level features such as global color and texture measures, but attempt to establish a link to the more semantically meaningful categories that humans typically use when making aesthetic judgments on designs.

In recognition of their vital role in capturing the essence of a design, much effort has been directed towards the detection of so-called salient design elements. Two main strategies are followed to this end: (i) figure-ground segregation, and (ii) grouping of primitives.

The figure-ground segregation is based on color-texture region extraction and subsequent region classification based on region property variables such as relative size, connectedness and compactness. Primitive grouping is directed at finding objects by analyzing configurations of primitive image elements (e.g. edges). In this manner we may for instance detect the occurrence and arrangement of homogeneous strips.
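The region property variables named above (relative size, connectedness, compactness) can be made concrete with a small sketch on a binary region mask. The specific definitions used here (4-connectivity, a pixel-edge perimeter, and the isoperimetric ratio 4πA/P² for compactness) are assumptions chosen for illustration; the source does not state which definitions FOUNDIT uses.

```python
import math
from collections import deque

NEIGHBOURS = ((1, 0), (-1, 0), (0, 1), (0, -1))  # 4-connectivity (assumed)

def region_properties(mask):
    """Compute simple properties of a binary mask given as a list of rows."""
    h, w = len(mask), len(mask[0])
    area = sum(1 for row in mask for v in row if v)

    # Perimeter: count pixel edges facing background or the image border.
    perimeter = 0
    for y in range(h):
        for x in range(w):
            if not mask[y][x]:
                continue
            for dy, dx in NEIGHBOURS:
                ny, nx = y + dy, x + dx
                if not (0 <= ny < h and 0 <= nx < w) or not mask[ny][nx]:
                    perimeter += 1

    # Connectedness: number of connected components via breadth-first search.
    seen, components = set(), 0
    for y in range(h):
        for x in range(w):
            if mask[y][x] and (y, x) not in seen:
                components += 1
                seen.add((y, x))
                queue = deque([(y, x)])
                while queue:
                    cy, cx = queue.popleft()
                    for dy, dx in NEIGHBOURS:
                        ny, nx = cy + dy, cx + dx
                        if (0 <= ny < h and 0 <= nx < w
                                and mask[ny][nx] and (ny, nx) not in seen):
                            seen.add((ny, nx))
                            queue.append((ny, nx))

    compactness = 4 * math.pi * area / perimeter ** 2 if perimeter else 0.0
    return {"relative_size": area / (h * w),
            "components": components,
            "compactness": compactness}
```

A region classifier could then take such values as input, for example treating a large, fragmented, low-compactness region as a likely ground candidate; any such rule would itself be an assumption on top of this sketch.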

Based on the decomposition of a design into a ground and one or more salient regions or objects, the feature computation process can be further specialized. For foreground regions and salient objects we compute, among others, features for the following: size, orientation, color and shape (region- and contour-based); and, in the case of several objects, spatial organization, occurrence of periodic patterns and motif variation (color, shape, orientation). For background regions, or images consisting entirely of (color) texture, we characterize color (e.g. dominant color, color structure, color layout) and texture (e.g. regularity, coarseness, direction, edge histograms and granulometries). The Feature Extraction Engine supports the full range of MPEG-7 visual descriptors [2], and further contains routines based on multi-resolution analysis and morphological operators.
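As an illustration of the simplest kind of color feature mentioned above, the following sketch estimates a dominant color by coarse quantization of RGB values. It is a simplified stand-in and not the MPEG-7 Dominant Color descriptor; the function name, the `levels` parameter and the input format (a list of RGB tuples with values 0 to 255) are assumptions.

```python
from collections import Counter

def dominant_color(pixels, levels=4):
    """Return (bin-centre RGB, fraction of pixels) for the most common bin.

    pixels: iterable of (r, g, b) tuples with 0..255 components.
    levels: bins per channel; assumed to divide 256 evenly.
    """
    pixels = list(pixels)
    if not pixels:
        raise ValueError("no pixels given")
    step = 256 // levels
    # Quantize each pixel to a coarse (r, g, b) bin and count occurrences.
    counts = Counter((r // step, g // step, b // step) for r, g, b in pixels)
    bin_index, n = counts.most_common(1)[0]
    centre = tuple(q * step + step // 2 for q in bin_index)
    return centre, n / len(pixels)
```

Applied to the pixels of a background region only, the returned fraction also gives a rough indication of how uniform that background is.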

3.  Relevance Inference Engine

At every stage of the search-history, the user inspects a (small) fraction of the database (displayed on the interface, see Fig. 1) and provides the system with feedback by making a number of positive and negative selections and transferring them into the collection box as examples and counter-examples (see above). This can be formalized by saying that for the images in the collection box we have additional information (i.e. on top of the pre-computed feature values) that is captured in a binary variable based on the interpretation supplied by the user: we assign the value 1 if the image was considered to be an example, and 0 if it was a counter-example.

Within the FOUNDIT framework we have chosen logistic regression as a flexible way to translate the qualitative user feedback (in terms of examples and counter-examples) into quantitative information useful for retrieval. It allows us to express the correlation between the feature values and the binary response variable in a precise parametric model. Furthermore, these regression models yield a principled tool for estimating the efficacy of individual features in gauging the overall relevance of images, by taking into account the goodness-of-fit coefficients. As a consequence, it becomes possible to extract, automatically and adaptively, from the vast collection of pre-recorded image features the small subset that correlates best with the particular search at hand.

4.  Conclusion

To improve the image mining capabilities in large databases of decoration designs, FOUNDIT is developing a CBIR search engine that allows the user to give relevance feedback in a natural and intuitively transparent fashion, thus making this technology more efficient for professional users, and opening it up to the much wider audience of non-expert mainstream users. Among future work we envisage the definition of an XML/MPEG-7 description scheme [3] for the representation of design image interpretations.

Acknowledgments

FOUNDIT is partially supported by the European Commission under the IST Programme of the Fifth Framework (Project nr. IST-2000-28427).

References

1. FOUNDIT Webpage: http://www.cwi.nl/~foundit.

2.  B. Manjunath, P. Salembier, and T. Sikora (Eds.) (2002), Introduction to MPEG-7: Multimedia Content Description Interface, John Wiley and Sons, Ltd.

3.  P. Bernard, H. Derumeaux, M. Huiskes, E. Pauwels, P. Vandenborre, S. Sette, L. Vanlangenhove: An MPEG7-compatible XML-Schema for Semantic Meta-data for Decorative Designs in Textile Industry. Submitted to AUTEX 2003.