Harshit Kumar

R

Statistical analysis and data exploration with R, covering data wrangling, visualisation, and the application of statistical methods to real-world datasets.

4 posts

  • Simpson's paradox
    Data Science

    Simpson's paradox, in which a trend seen within groups reverses when the data is aggregated, explained through UC Berkeley's 1973 admissions data.

    Sep 01, 2017 · 6 min read
  • Email spam filtering: Text analysis in R
    Data Science

    Building and evaluating an email spam filter using text analytics and machine learning in R.

    Aug 25, 2017 · 63 min read
  • Moneyball: Why no prediction can be made for the baseball champion
    Data Science

    Using logistic regression in R to explore why ML cannot reliably predict the baseball World Series champion.

    Aug 04, 2017 · 27 min read
  • Moneyball: How linear regression changed baseball
    Data Science

    How the Oakland A's used linear regression in R to identify undervalued players and compete despite a limited budget.

    Jul 28, 2017 · 16 min read
Harshit Kumar 2026 · Creative Commons License