Data Mining
Materials: There is no textbook. However, if you have the second edition of Database Systems: The Complete Book (Garcia-Molina, Ullman, Widom), you will find Section 20.2 and Chapters 22 and 23 relevant. Slides from the lectures will be made available in PPT and PDF formats.
Students will use the Gradiance automated homework system for which a fee will be charged. Note: if you already have Gradiance (GOAL) privileges from CS145 or CS245 within the past year, you should also have access to the CS345A homework without paying an additional fee. Notes and/or slides will be posted on-line.
You can see earlier versions of the notes and slides covering Data Mining. Not all these topics will be covered this year.
Course Information
Instructors: Anand Rajaraman (anand @ kosmix dt com), Jeffrey D. Ullman (ullman @ gmail dt com).
Date | Topic | PowerPoint Slides | PDF Document |
---|---|---|---|
1/7 | Introductory Remarks (JDU) | PPT | |
1/7 | Introductory Remarks (AR) | PPT | |
1/12 | Map-Reduce | PPT | |
1/14 | Frequent Itemsets 1 | PPT | |
1/14-1/21 | Frequent Itemsets 2 | PPT | |
1/16 | Peter Pawlowski's Talk on Aster Data | PPTX | |
1/16 | Nanda Kishore's Talk on ShareThis | PPT | |
1/26 | Recommendation Systems | PPT | |
1/28 | Shingling, Minhashing, Locality-Sensitive Hashing | PPT | |
2/2 | Applications and Variants of LSH | PPT | |
2/2-2/4 | Distance Measures, Generalizations of Minhashing and LSH | PPT | |
2/4 | High-Similarity Algorithms | PPT | |
2/9 | PageRank | PPT | |
2/11 | Link Spam, Hubs & Authorities | PPT | |
2/18 | Generalization of Map-Reduce | PPT | |
2/18-2/23 | Clustering | PPT | |
2/23 | Streaming Data | PPT | |
2/25 | Relation Extraction | PPT | |
3/2 | On-Line Algorithms, Advertising Optimization | PPT | |
3/4 | Algorithms on Streams | PPT | |