TF-IDF

TF-IDF Term weighting

In a large text corpus, some words (for example "the", "a", "is" in English) occur very frequently yet carry little information about the actual content of a document.

If we fed the raw count data directly to a classifier, these very frequent terms would overshadow the counts of rarer, more informative words.

The TF-IDF transform reweights the count features so that terms appearing in many documents count for less.

TF = term frequency; TF-IDF = term frequency times inverse document frequency.
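
To make the reweighting concrete, here is a minimal sketch in plain Python. It assumes the textbook form tf(t, d) * log(N / df(t)), where N is the number of documents and df(t) is the number of documents containing term t. The toy corpus, the whitespace tokenization, and the `tf_idf` helper are hypothetical and for illustration only. Libraries often use smoothed or normalized variants of this formula, so their numbers may differ.

```python
import math
from collections import Counter

# Hypothetical toy corpus, already lowercased; tokens are split on whitespace.
docs = [
    "the cat sat on the mat",
    "the dog sat on the log",
    "the cat chased the dog",
]

def tf_idf(corpus):
    """Return one {term: weight} dict per document, using tf * log(N / df)."""
    tokenized = [doc.split() for doc in corpus]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)  # raw term frequency within this document
        weights.append(
            {term: count * math.log(n_docs / df[term]) for term, count in counts.items()}
        )
    return weights

weights = tf_idf(docs)
for term, weight in sorted(weights[0].items(), key=lambda item: -item[1]):
    print(f"{term:>4}  {weight:.3f}")
```

In the first document, "the" is the most frequent token, but because it occurs in every document its weight drops to zero, while "mat", which occurs in only one document, receives the highest weight.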