Document Relevance Evaluation via Term Distribution Analysis Using Fourier Series Expansion

Galeas, Patricio
Kretschmer, Ralph
Freisleben, Bernd
In addition to the frequency of terms in a document collection, the distribution of terms plays an important role in determining the relevance of documents for a given search query. This paper introduces term distribution analysis using Fourier series expansion, a novel approach for calculating an abstract representation of term positions in a document corpus. Based on this approach, two methods for improving the evaluation of document relevance are proposed: (a) a function-based ranking optimization representing a user-defined document region, and (b) a query expansion technique based on overlapping the term distributions in the top-ranked documents. Experimental results demonstrate the effectiveness of the proposed approach in providing new possibilities for optimizing the retrieval process.
Comment: 9 pages, submitted to proceedings of JCDL-2009
Computer Science - Information Retrieval, H.3.3
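
The abstract does not give the formulas used, so the following is only a minimal sketch of one way the idea could be realized, not the authors' implementation. It assumes that term positions are normalized to the interval [0, 1), that each occurrence is treated as a unit impulse, and that the series is truncated at a fixed order. All function names (term_positions, fourier_coefficients, region_weight) and the choice of order 8 are hypothetical.

```python
import math


def term_positions(tokens, term):
    """Normalized positions in [0, 1) at which `term` occurs in `tokens`."""
    n = len(tokens)
    return [i / n for i, tok in enumerate(tokens) if tok == term]


def fourier_coefficients(positions, order):
    """Coefficients (a, b) of the truncated Fourier series of a sum of
    unit impulses located at `positions` on [0, 1).

    Assumption: a_k = 2 * sum(cos(2*pi*k*p)), b_k = 2 * sum(sin(2*pi*k*p)),
    so that a[0] / 2 equals the number of occurrences.
    """
    a = [2.0 * sum(math.cos(2 * math.pi * k * p) for p in positions)
         for k in range(order + 1)]
    b = [2.0 * sum(math.sin(2 * math.pi * k * p) for p in positions)
         for k in range(order + 1)]
    return a, b


def region_weight(a, b, lo, hi):
    """Closed-form integral of the truncated series over the region [lo, hi].

    This stands in for a user-defined document region: a larger value means
    more of the term's (smoothed) mass falls inside that region.
    """
    total = a[0] / 2.0 * (hi - lo)
    for k in range(1, len(a)):
        w = 2 * math.pi * k
        total += a[k] * (math.sin(w * hi) - math.sin(w * lo)) / w
        total -= b[k] * (math.cos(w * hi) - math.cos(w * lo)) / w
    return total


# Toy document: the term occurs three times, all in the first half.
tokens = ["x"] * 4 + ["fourier"] * 3 + ["x"] * 13
a, b = fourier_coefficients(term_positions(tokens, "fourier"), order=8)
front = region_weight(a, b, 0.0, 0.5)
back = region_weight(a, b, 0.5, 1.0)
```

In this toy document, front comes out larger than back, and the two sum to the number of occurrences. Because the series is truncated, the per-region values are smoothed estimates and can slightly overshoot or dip below zero. A query expansion step in the spirit of method (b) could compare such coefficient vectors across terms in the top-ranked documents, but the abstract does not specify how the overlap is measured.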