Madhavan Mukund



Data Mining and Machine Learning,
Jan-Apr 2025


Administrative details

  • Teaching assistants: Alok Dhar Dubey, Ankan Kar, Rohit Roy

  • Evaluation:

    • Assignments 30-40%, quizzes and midsemester exam 20-30%, final exam 40%

    • Copying is fatal

  • Course outline (tentative)

    • Supervised learning: Association rules, regression, decision trees, naive Bayes, SVM, classifier evaluation, expectation maximization, ensemble classifiers.

    • Unsupervised learning: Clustering, outlier detection, dimensionality reduction.

    • Text mining: Basic ideas from information retrieval, TF-IDF model, PageRank, HITS.

    • Other topics (if time permits): Probabilistic graphical models, Bayesian networks, Markov models, neural networks, ranking and social choice, …

  • Text and reference books:

    • Web Data Mining by Bing Liu, 2nd edition, Springer (2011).

    • Foundations of Data Science by Avrim Blum, John Hopcroft and Ravi Kannan

    • Machine Learning by Tom Mitchell.

    • C4.5: Programs for Machine Learning by Ross Quinlan.

    • Artificial Intelligence: A Modern Approach by Stuart J Russell and Peter Norvig, 3rd edition, Pearson (2016).

    • Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow by Aurélien Géron, 3rd edition, O'Reilly (2022).

    • Reinforcement Learning: An Introduction by Richard S. Sutton and Andrew G. Barto, 2nd edition, MIT Press (2018).


Assignments

  • TBA


Lecture summary

  • Lecture 1: 7 Jan 2025
    (Class Notes (pdf))

    Introduction to supervised and unsupervised learning

  • Lecture 2: 16 Jan 2025
    (Class Notes (pdf))

    Market-basket analysis, frequent itemsets, Apriori algorithm

    • Liu, Chapter 2.1, 2.2, 2.2.1
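
As a rough illustration of the Lecture 2 topic, the level-wise idea behind Apriori can be sketched in a few lines of Python. This is a minimal sketch, not the version from the class notes or from Liu: the function name `apriori`, the sample `baskets`, and the use of an absolute support count (rather than a fractional minimum support) are all assumptions made for the example.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {frozenset(itemset): support count} for all frequent itemsets.

    min_support is an absolute count (an assumption of this sketch).
    """
    transactions = [frozenset(t) for t in transactions]

    # Level 1: count individual items.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    frequent = {s: c for s, c in counts.items() if c >= min_support}
    result = dict(frequent)

    k = 2
    while frequent:
        prev = set(frequent)
        # Join step: unions of frequent (k-1)-itemsets that have exactly k items.
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in prev for s in combinations(c, k - 1))
        }
        # Count support of the surviving candidates with one pass over the data.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        frequent = {s: c for s, c in counts.items() if c >= min_support}
        result.update(frequent)
        k += 1
    return result


# Hypothetical market-basket data, invented for this example.
baskets = [
    {"bread", "milk"},
    {"bread", "diapers", "beer", "eggs"},
    {"milk", "diapers", "beer", "cola"},
    {"bread", "milk", "diapers", "beer"},
    {"bread", "milk", "diapers", "cola"},
]
freq = apriori(baskets, min_support=3)
for itemset, count in sorted(freq.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), count)
```

The prune step is what distinguishes Apriori from brute-force enumeration: a candidate is counted only if all of its smaller subsets were already found frequent, so most infrequent combinations never reach the counting pass.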