Motivate the need for Inverse Document Frequency.
The feature vectors contain large weights for terms that occur frequently in a document, even when those terms also occur frequently in most documents in the corpus. Such terms do little to distinguish a particular document from the rest of the corpus. They can be thought of as corpus-specific stop words, and they may not be useful for calculating the similarity of documents.
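
As a rough illustration of the problem, the sketch below builds raw term-count vectors for a tiny corpus and counts how many documents contain each term. The corpus, the whitespace tokenization, and the variable names (tf_vectors, doc_freq, ubiquitous) are assumptions made up for this example and are not taken from the original material.

```python
from collections import Counter

# Hypothetical toy corpus, invented purely for illustration.
corpus = [
    "the patient reported chest pain and the patient was admitted",
    "the patient reported mild headache",
    "the patient was discharged after the patient recovered",
]

# Raw term-count vector for each document (naive whitespace tokenization).
tf_vectors = [Counter(doc.split()) for doc in corpus]

# Document frequency: the number of documents that contain each term.
# Iterating over a Counter yields each distinct term once per document.
doc_freq = Counter(term for tf in tf_vectors for term in tf)

# Terms that appear in every document behave like corpus-specific stop words.
ubiquitous = sorted(t for t, df in doc_freq.items() if df == len(corpus))

for i, tf in enumerate(tf_vectors):
    print(f"doc {i}: {dict(tf)}")
print("terms in every document:", ubiquitous)
```

In this toy corpus, "patient" receives a weight of 2 in the first document while "chest" and "pain" receive only 1, even though "patient" appears in every document and so says nothing about what sets the first document apart. Down-weighting terms with a high document frequency is what Inverse Document Frequency is meant to do.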