Data Mining
Description
The goal of the course is to teach students how to think like a data miner. Intuitively, this means having the mindset and skills to find practical solutions to the common problems you encounter when extracting knowledge, patterns, and models from large data sets. To make such solutions effective, you must understand both the underlying intuition and the mathematical foundation. The field of data mining contains far too many such solutions to teach in one course. We therefore focus on core techniques that demonstrate some of the magic behind modern data mining solutions:
- matrix decomposition
- sketching and hashing
- embeddings and distances
We cover the basic mathematical skills required to use these techniques effectively. You demonstrate mastery by developing solutions from scratch in three large lab assignments:
- Anomaly detection in system logs
- Recommender systems for profile matching
- Clustering in social networks
Data from these different domains often needs special (pre)processing before machine learning or data mining methods can be applied. We discuss the main (pre)processing methods and the considerations behind them. Importantly, different distance measures and their effect on the mining outcome are a recurring theme. Special consideration is also given to handling huge datasets through smart approximations, and the ethical considerations of data mining are discussed. Across these topics, the course covers key algorithms for similar-item retrieval, dimension reduction, large-scale clustering, collaborative filtering, locality-sensitive hashing, outlier detection, profiling, and graph mining.
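As a small illustration of why the choice of distance measure can change a mining outcome, the sketch below runs the same nearest-neighbour query under Euclidean distance and under cosine distance. The vectors, the names `query`, `a`, and `b`, and the helper functions are hypothetical toy examples chosen for this illustration; they are not taken from the course material or the lab assignments.

```python
import math

def euclidean(u, v):
    # Straight-line distance: sensitive to vector magnitude.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))

def cosine_distance(u, v):
    # 1 - cosine similarity: depends only on direction, not magnitude.
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return 1.0 - dot / (norm_u * norm_v)

def nearest(query, candidates, dist):
    # Return the name of the candidate with the smallest distance to the query.
    return min(candidates, key=lambda name: dist(query, candidates[name]))

# Toy data (assumed for illustration): "a" points in the same direction as
# the query but is far away; "b" is close by but points elsewhere.
query = (1.0, 1.0)
candidates = {"a": (10.0, 10.0), "b": (2.0, 0.0)}

print(nearest(query, candidates, euclidean))        # nearest under Euclidean distance
print(nearest(query, candidates, cosine_distance))  # nearest under cosine distance
```

Under Euclidean distance the nearby point `b` is selected, while under cosine distance the point `a`, which has the same direction as the query, is selected. The same data and the same query therefore give different "most similar" items depending on the measure.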