Relevance Feedback

Relevance feedback is a powerful tool that has been brought to bear in recent CBIR systems. Briefly, the idea is to involve the user in a loop, whereby images retrieved are used in further rounds of convergence onto correct returns. The usual situation is that the user identifies images as good, bad, or don't care, and weighting systems are updated according to this user guidance.


In the MARS system, weights assigned to feature points are updated by user input. First, the MARS authors suppose that there are several features, i = 1..I of them, such as color, texture, and so on. For each such feature, they further suppose that multiple representations can be used. For example, for color we may use color histograms, color layout, moments of color histograms, dominant colors, and so on. Suppose that, for each feature i, there are j = 1..Ji such representations. Finally, for each representation j of feature i, suppose there is an associated set of k = 1..Kij components of a feature vector. So in the end, we have feature vector components rijk.

Each kind of feature i has importance, or weight, Wi, and weights Wij are associated with each of the representations for the kind of feature i. Weights Wijk are also associated with each component of each representation. Weights are meant to be dynamic, in that they change as further rounds of user feedback are incorporated.

Let F = {fi} be the whole set of features fi. Let R = {rij} be the set of representations for a given feature fi. Then, again just for the current feature i, suppose that M = {mij} is a set of similarity measures used to determine how similar or dissimilar two representations in set R are. That is, different metrics should be used for different representations: a vector-based representation might use Mahalanobis distance for comparing feature vectors, while histogram intersection may be used for comparing color histograms. With set D being the raw image data, the relevance feedback algorithm is expressed as a model (D, F, R, M).
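As a concrete illustration, the (D, F, R, M) bookkeeping for a single image can be laid out as nested structures, indexed by feature i, representation j, and component k. The feature names, representation names, and metric labels below are illustrative assumptions, not the MARS implementation:

```python
# Sketch of per-image feature storage: rijk values nested by
# feature i and representation j. All names are hypothetical.
image_features = {
    "color": {                            # feature f_i
        "histogram": [0.2, 0.5, 0.3],     # representation r_ij, components r_ijk
        "moments":   [0.41, 0.07],
    },
    "texture": {
        "coarseness": [0.8],
    },
}

# M: one similarity measure m_ij per representation r_ij.
metric_for = {
    ("color", "histogram"):    "histogram intersection",
    ("color", "moments"):      "Euclidean distance",
    ("texture", "coarseness"): "Euclidean distance",
}
```

The point of the per-representation metric table is that a histogram and a moment vector should not be compared with the same distance function.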

Then the retrieval process suggested by the MARS authors is as follows:

  1. Initialize all weights as uniform values:

    Wi = 1/I,   Wij = 1/Ji,   Wijk = 1/Kij

    Recall that I is the number of features in set F; Ji is the number of representations for feature fi; and Kij is the length of the representation vector rij.

  2. A database image's similarity to the query is first defined in terms of components: S(rij) = mij(rij, Wijk). Then each representation's similarity values are grouped into a per-feature similarity:

    S(fi) = Σj Wij S(rij)

  3. Finally, the overall similarity S is defined as

    S = Σi Wi S(fi)
  4. The top N images similar to query image Q are then returned.

  5. Each of the retrieved images is marked by the user as highly relevant, relevant, no opinion, nonrelevant, or highly nonrelevant, according to his or her subjective opinion.

  6. Weights are updated, and the process is repeated.
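The similarity computation in steps 2 and 3 can be sketched in a few lines. The two toy metrics and the nested-list data layout are illustrative assumptions, not the MARS code:

```python
import math

# Minimal sketch of the hierarchical weighted similarity
# S = sum_i Wi * sum_j Wij * S(rij).

def weighted_similarity(query, image, metrics, W_i, W_ij):
    """query/image: nested lists indexed [feature i][representation j]."""
    total = 0.0
    for i, reps in enumerate(metrics):          # features f_i
        s_fi = 0.0
        for j, m_ij in enumerate(reps):         # representations r_ij
            s_fi += W_ij[i][j] * m_ij(query[i][j], image[i][j])
        total += W_i[i] * s_fi                  # S = sum_i Wi * S(fi)
    return total

def hist_intersection(q, r):                    # similarity for histograms
    return sum(min(a, b) for a, b in zip(q, r))

def neg_euclidean(q, r):                        # negated distance as similarity
    return -math.sqrt(sum((a - b) ** 2 for a, b in zip(q, r)))

# Two features, one representation each; uniform initial weights.
metrics = [[hist_intersection], [neg_euclidean]]
W_i, W_ij = [0.5, 0.5], [[1.0], [1.0]]
```

Ranking the database by this score and returning the top N images gives step 4 directly.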

Similarities have to be normalized to get a meaningful set of images returned:

  1. Since representations may have different scales, features are normalized, both offline (intranormalization) and online (internormalization).

  2. Intranormalization: the idea here is to normalize the rijk so as to place equal emphasis on each component within a representation vector rij. For each component k, find the mean μk and standard deviation σk over all M images in the database. Then replace that component by its normalized score in the usual fashion from statistics:

    rijk → (rijk - μk) / σk

  3. Internormalization: here we look for equal emphasis on each similarity value S(rij) within the overall measure S. We find the mean μij and standard deviation σij of the similarity values S(rij) over all database images.

  4. Then, online, for any new query Q, we replace the raw similarity between Q and a database image m by its normalized version:

    S(rij) → (S(rij) - μij) / σij
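Both normalization steps amount to z-scoring. Here is a minimal sketch; the function names and the fallback for zero variance are my own assumptions:

```python
import statistics

def intra_normalize(vectors):
    """Offline: z-score each component over all M database vectors,
    so every component within rij carries equal emphasis."""
    K = len(vectors[0])
    mu = [statistics.mean(v[k] for v in vectors) for k in range(K)]
    sd = [statistics.pstdev(v[k] for v in vectors) or 1.0 for k in range(K)]
    return [[(v[k] - mu[k]) / sd[k] for k in range(K)] for v in vectors]

def inter_normalize(raw_s, mu_ij, sigma_ij):
    """Online: standardize a raw similarity S(rij) for a new query,
    given mu_ij and sigma_ij estimated over the database."""
    return (raw_s - mu_ij) / sigma_ij
```

Intranormalization runs once over the whole database; internormalization is applied per query, which is why the mean and standard deviation of each S(rij) must be precomputed.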

Finally, the weight update process is as follows:

  1. Scores of {3, 1, 0, -1, -3} are assigned to user opinions from "highly relevant" to "highly nonrelevant".

  2. Weights are updated as

    Wij → Wij + Score

    for images viewed by the user. Then weights are normalized by

    Wij → Wij / Σj Wij

  3. The inverse of the standard deviation of feature component rijk, taken over the relevant retrieved images, is assigned to the component weight Wijk:

    Wijk = 1 / σijk

    That is, the smaller the variance, the larger the weight.

  4. Finally, these weights are also normalized:

    Wijk → Wijk / Σk Wijk
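One update round can be sketched as follows. How scores are credited to individual representations, and the fallback for zero variance, are illustrative assumptions:

```python
import statistics

# Scores attached to the five user opinions.
SCORES = {"highly relevant": 3, "relevant": 1, "no opinion": 0,
          "nonrelevant": -1, "highly nonrelevant": -3}

def update_representation_weights(W_ij, credited_scores):
    """Wij -> Wij + Score, then normalize so the weights sum to 1."""
    W = [w + s for w, s in zip(W_ij, credited_scores)]
    total = sum(W)
    return [w / total for w in W]

def component_weights(relevant_vectors):
    """Wijk = 1 / sigma_ijk over the relevant images, then normalized."""
    K = len(relevant_vectors[0])
    W = [1.0 / (statistics.pstdev(v[k] for v in relevant_vectors) or 1.0)
         for k in range(K)]
    total = sum(W)
    return [w / total for w in W]
```

The component weights reward stability: a component that varies little across the images the user marked relevant is treated as more discriminative.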

The basic advantage of putting the user into the loop via relevance feedback is that the user need not provide a completely accurate initial query. Relevance feedback establishes a more accurate link between low-level features and high-level concepts, somewhat closing the semantic gap. Of course, the retrieval performance of CBIR systems improves as a result.


An experimental system that explicitly uses relevance feedback in image retrieval is the Microsoft Research system iFind. This approach attempts to go beyond low-level image features by addressing the semantic content of images. Images are associated with keywords, and a semantic net for image access is built from these, integrated with low-level features. Keywords have links to images in the database, with a weight assigned to each link. The weight, representing the degree of relevance, is updated on each relevance feedback round.

Clearly, an image can be associated with multiple keywords, each with a different degree of relevance. Where do the keywords come from? They can be generated manually or retrieved from the ALT HTML tag associated with an image, using a web crawler.
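Harvesting keywords from ALT attributes is straightforward with a standard HTML parser; the class and field names below are my own assumptions, not part of iFind:

```python
from html.parser import HTMLParser

class AltKeywordParser(HTMLParser):
    """Collect candidate keywords for each image from <img> ALT text,
    as a crawler might while indexing pages."""

    def __init__(self):
        super().__init__()
        self.keywords = {}                 # image src -> list of ALT words

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            d = dict(attrs)
            if d.get("src") and d.get("alt"):
                self.keywords[d["src"]] = d["alt"].lower().split()

parser = AltKeywordParser()
parser.feed('<img src="beach.jpg" alt="Sunset Beach Hawaii">')
```

Each harvested word would then become a weighted keyword-to-image link in the semantic net, with the weight refined by later feedback rounds.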
