Fake News Detection
Content analysis and classification on the BuzzFeed news dataset
Problem
Misinformation is difficult to detect at scale. Understanding what distinguishes fake news from real news at a content level is a critical challenge for media literacy and platform integrity.
Approach
Performed text mining, exploratory data analysis, and statistical testing on the BuzzFeed news dataset. Identified patterns that discriminate between fake and real news (title length, word usage, and source characteristics). Built predictive classification models to detect fake news articles.
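As a rough illustration of the two steps above (testing whether title length differs between classes, then classifying from word usage), here is a minimal standard-library sketch. It is not the project's actual code: the headlines are invented toy data rather than rows from the dataset, and the permutation test and multinomial Naive Bayes classifier are stand-ins chosen for brevity, since the specific tests and models used are not named here. All function names are hypothetical.

```python
import math
import random
from collections import Counter

# Invented toy headlines for illustration only (NOT taken from the dataset).
FAKE = [
    "you will not believe what this senator said about the secret plan",
    "shocking leaked video proves the election was rigged by insiders",
    "breaking doctors stunned by the secret cure they hid from you",
    "secret memo reveals shocking plan to cancel the election this year",
]
REAL = [
    "senate passes budget bill",
    "court rules on election law",
    "governor signs health bill",
    "senate committee holds budget hearing",
]


def title_length(title):
    """Title length measured in whitespace-separated words."""
    return len(title.split())


def permutation_test(a, b, n_iter=5000, seed=0):
    """Two-sided Monte Carlo permutation test on the difference in means."""
    rng = random.Random(seed)
    observed = abs(sum(a) / len(a) - sum(b) / len(b))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_iter):
        rng.shuffle(pooled)
        pa, pb = pooled[:len(a)], pooled[len(a):]
        if abs(sum(pa) / len(pa) - sum(pb) / len(pb)) >= observed:
            hits += 1
    # Add-one correction keeps the estimated p-value strictly above zero.
    return (hits + 1) / (n_iter + 1)


def train_naive_bayes(docs_by_label):
    """Fit a multinomial Naive Bayes model with Laplace (add-one) smoothing."""
    vocab = {w for docs in docs_by_label.values() for d in docs for w in d.split()}
    total_docs = sum(len(docs) for docs in docs_by_label.values())
    model = {}
    for label, docs in docs_by_label.items():
        counts = Counter(w for d in docs for w in d.split())
        denom = sum(counts.values()) + len(vocab)
        model[label] = {
            "log_prior": math.log(len(docs) / total_docs),
            "log_like": {w: math.log((counts[w] + 1) / denom) for w in vocab},
        }
    return model


def predict(model, title):
    """Return the label with the highest log-posterior score."""
    scores = {}
    for label, params in model.items():
        # Words never seen during training are skipped.
        word_scores = (
            params["log_like"][w] for w in title.split() if w in params["log_like"]
        )
        scores[label] = params["log_prior"] + sum(word_scores)
    return max(scores, key=scores.get)


fake_lengths = [title_length(t) for t in FAKE]
real_lengths = [title_length(t) for t in REAL]
p_value = permutation_test(fake_lengths, real_lengths)
print("title-length permutation p-value:", p_value)

model = train_naive_bayes({"fake": FAKE, "real": REAL})
print("prediction:", predict(model, "shocking secret plan"))
```

On real data the same pattern would apply to the dataset's title column, with a held-out split to evaluate the classifier. Eight toy headlines are far too few to support any conclusion.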
Impact
31k+ views on Kaggle. Demonstrated practical NLP techniques for misinformation detection and content analysis.