NOC: Scalable Data Science


Lecture 1 - Background: Introduction


Lecture 2 - Probability: Concentration inequalities


Lecture 3 - Linear algebra: PCA, SVD


Lecture 4 - Optimization: Basics, Convex, GD


Lecture 5 - Machine Learning: Supervised, generalization, feature learning, clustering


Lecture 6 - Memory-efficient data structures: Hash functions, universal / perfect hash families


Lecture 7 - Bloom filters


Lecture 8 - Sketches for distinct count


Lecture 9 - Sketches for distinct count (Continued...)


Lecture 10 - Misra-Gries sketch


Lecture 11 - Frequent Element: Space Saving and Count Min


Lecture 12 - Frequent Element: Count Sketch


Lecture 13 - Near Neighbors


Lecture 14 - Locality Sensitive Hashing


Lecture 15 - Building LSH Tables


Lecture 16 - Approximate near neighbors search: Extensions, e.g. multi-probe, b-bit hashing, data-dependent variants


Lecture 17 - Approximate near neighbors search: Extensions, e.g. multi-probe, b-bit hashing, data-dependent variants (Continued...)


Lecture 18 - Approximate near neighbors search: Extensions, e.g. multi-probe, b-bit hashing, data-dependent variants (Continued...)


Lecture 19 - Randomized Numerical Linear Algebra: Random projection


Lecture 20 - Randomized Numerical Linear Algebra: Random projection (Continued...)


Lecture 21 - Randomized Numerical Linear Algebra: a) Matrix multiplication + QB decomposition


Lecture 22 - Randomized Numerical Linear Algebra: b) CUR + CX


Lecture 23 - Randomized Numerical Linear Algebra: a) L2 regression using RP


Lecture 24 - Randomized Numerical Linear Algebra: b) Leverage scores


Lecture 25 - Randomized Numerical Linear Algebra: c) Hash Kernels + Kitchen Sink


Lecture 26 - Map-reduce and Hadoop


Lecture 27 - Hadoop System


Lecture 28 - Hadoop System (Continued...)


Lecture 29 - Hadoop System (Continued...)


Lecture 30 - Spark


Lecture 31 - Spark (Continued...)


Lecture 32 - Spark (Continued...)


Lecture 33 - Distributed Machine Learning and Optimization: Introduction


Lecture 34 - SGD + Proof


Lecture 35 - SGD + Proof (Continued...)


Lecture 36 - Distributed Machine Learning and Optimization: ADMM + applications


Lecture 37 - Distributed Machine Learning and Optimization: ADMM + applications (Continued...)


Lecture 38 - Clustering


Lecture 39 - Clustering (Continued...)


Lecture 40 - Conclusion