IR Models: Vector Space Model

 

[Figure: Simplistic Term Vector Space Model]

In the Vector Space Model, each search term defines a dimension, and queries and documents are expressed as vectors (Chu, 2010, p. 115). A vector consists of values that indicate each term’s importance. Term weights in the Vector Space Model can be based on term frequency or on the user’s perception of a term’s importance.
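As a minimal sketch of this idea, a piece of text can be turned into a vector with one dimension per vocabulary term. The whitespace tokenizer, the use of raw term frequency as the weight, and the sample vocabulary and document are all assumptions made for illustration; they do not come from Chu (2010).

```python
from collections import Counter


def term_vector(text, vocabulary):
    """Return one term-frequency weight per vocabulary term."""
    # Assumption: lowercase whitespace tokenization is good enough here.
    counts = Counter(text.lower().split())
    # Terms that never occur in the text get a weight of 0.
    return [counts[term] for term in vocabulary]


# Hypothetical vocabulary and document, for illustration only.
vocabulary = ["information", "retrieval", "vector", "boolean"]
document = "Vector models rank information retrieval results by vector similarity"

doc_vector = term_vector(document, vocabulary)
print(doc_vector)
```

A query can be passed through the same function, which puts queries and documents in the same space and makes them directly comparable.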

Strengths of the Vector Space Model include: Boolean logic is not required to perform a search, terms are weighted by their relative importance, a user can limit the size of the retrieval output, and relevance feedback can improve retrieval performance (Chu, 2010, p. 116).
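Two of these strengths, searching without Boolean operators and limiting the size of the output, can be sketched in a few lines. Cosine similarity is assumed here as the measure of closeness between query and document vectors; it is a common choice, but the source does not name a specific measure. The function names, the `limit` parameter, and the sample documents are hypothetical.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank(query, documents, limit):
    """Return up to `limit` (score, document) pairs, best match first."""
    q = Counter(query.lower().split())
    scored = [(cosine(q, Counter(d.lower().split())), d) for d in documents]
    # Drop documents that share no terms with the query.
    scored = [pair for pair in scored if pair[0] > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:limit]


# Hypothetical collection, for illustration only.
documents = [
    "boolean logic search",
    "vector space retrieval model",
    "vector space model for information retrieval",
]

results = rank("vector space model", documents, limit=2)
for score, doc in results:
    print(round(score, 3), doc)
```

The query is a plain list of terms with no operators, and `limit` caps how many ranked results come back.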

Weaknesses of the Vector Space Model include: terms are assumed to be independent of each other, phrasal relationships are difficult to express because Boolean logic is absent, the weighting mechanism is subjective and complex, and several terms are needed to represent a query for good information retrieval.
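The independence assumption and the difficulty with phrases can be seen in a small sketch. It assumes a simple bag-of-words representation with whitespace tokenization, and the two sample sentences are hypothetical.

```python
from collections import Counter


def bag_of_words(text):
    """Term-frequency vector that ignores word order entirely."""
    return Counter(text.lower().split())


# Two sentences with opposite meanings but identical terms.
first = bag_of_words("dog bites man")
second = bag_of_words("man bites dog")

# Each term is counted on its own, so the two vectors are indistinguishable.
same_vector = first == second
print(same_vector)
```

Because each term is weighted on its own, the representation cannot tell the two sentences apart, and a phrase in a query is treated as a set of unrelated terms.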

While the Vector Space Model offers several capabilities that the Boolean model cannot, such as ranking retrieved documents by relevance and supporting relevance feedback, “such systems, however, have not been able to markedly outperform systems based on the Boolean logic model” (Chu, 2010, p. 124).

 

References

Chu, H. (2010). Information representation and retrieval in the digital age (2nd ed.). Medford, NJ: Published for the American Society for Information Science and Technology by Information Today.

Simplistic Term Vector Space Model [Graph]. (n.d.). Retrieved from http://moz.com/blog/lda-and-googles-rankings-well-correlated
