#Data Mining
##Knowledge Discovery in Databases
- Types:
- Association Rules
- Causality (Interestingness, Conviction)
- Clustering
- Classification
- Sequential Patterns
- Association Rules
- Support
- Confidence
- Apriori Algorithm
- Hierarchical Itemsets
- Quantitative Fields
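
The support, confidence, and Apriori items above can be sketched as follows. This is a minimal illustration, not a tuned implementation: the function names, the toy transactions, and the `min_support` threshold of 0.7 are all assumptions made for the example.

```python
from itertools import combinations

def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = frozenset(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(lhs, rhs, transactions):
    """support(lhs | rhs) / support(lhs) for the rule lhs => rhs."""
    return support(set(lhs) | set(rhs), transactions) / support(lhs, transactions)

def apriori(transactions, min_support):
    """Return {itemset: support} for every frequent itemset."""
    items = {item for t in transactions for item in t}
    level = {frozenset([i]) for i in items}
    frequent = {}
    k = 1
    while level:
        # Keep only the candidates that meet the support threshold.
        kept = {}
        for c in level:
            s = support(c, transactions)
            if s >= min_support:
                kept[c] = s
        frequent.update(kept)
        k += 1
        # Join step: unions of frequent (k-1)-itemsets that have size k.
        candidates = {a | b for a in kept for b in kept if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must be frequent.
        level = {c for c in candidates
                 if all(frozenset(s) in kept for s in combinations(c, k - 1))}
    return frequent

# Hypothetical market-basket data, one frozenset per transaction.
transactions = [frozenset(t) for t in (
    {"pen", "ink", "milk", "juice"},
    {"pen", "ink", "milk"},
    {"pen", "milk"},
    {"pen", "ink", "juice"},
)]
frequent = apriori(transactions, min_support=0.7)
```

Here `{ink, milk}` appears in only half of the transactions, so the prune step discards the candidate `{pen, ink, milk}` without ever counting it.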
##OLAP
- Decision Support System
- Data Warehouse, ETL, Metadata Repository
- Dimensions, facts, measures
- OLAP, OLTP
- Roll-up / Drill-down, Pivoting, Cross Tabulation, Slicing, Dicing
- MOLAP
- ROLAP
- Schema
- Star
- Fact Constellation
- Snowflake
- Performance Considerations
- Bitmap Index – one bit vector per distinct value, suited to sparse columns (few distinct values)
- Join Index
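
As a rough illustration of the bitmap index idea above, the sketch below keeps one bit vector per distinct column value, using a Python `int` as the vector. The table, column names, and helper names are hypothetical.

```python
def build_bitmap_index(rows, column):
    """Map each distinct value of column to an int used as a bit vector.

    Bit i of the vector is 1 when row i holds that value.
    """
    index = {}
    for i, row in enumerate(rows):
        index[row[column]] = index.get(row[column], 0) | (1 << i)
    return index

def matching_rows(bits):
    """Row positions whose bit is set in bits."""
    return [i for i in range(bits.bit_length()) if (bits >> i) & 1]

# Hypothetical fact rows with two low-cardinality columns.
rows = [
    {"gender": "M", "rating": 3},
    {"gender": "M", "rating": 5},
    {"gender": "F", "rating": 5},
    {"gender": "M", "rating": 4},
]
by_gender = build_bitmap_index(rows, "gender")
by_rating = build_bitmap_index(rows, "rating")

# A conjunctive selection becomes a bitwise AND of two bit vectors.
male_and_5 = by_gender["M"] & by_rating[5]
```

Combining predicates with bitwise operations is what makes this layout attractive for columns with few distinct values; with many distinct values the number of vectors grows accordingly.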
##Time-Series
####Similarity
- Euclidean Distance
- Cross-correlation measure
- Dynamic Time Warping distance
- Properties: Continuity, Boundary Constraint, Monotonicity
- γ(i,j) = d(q_i, c_j) + min{ γ(i-1,j-1), γ(i-1,j), γ(i,j-1) }
- Global constraints – Sakoe-Chiba Band, Itakura Parallelogram
- Symbolic Aggregate Approximation (SAX)
- Lower bound for Euclidean distance and DTW (unclear, to revisit)
- Piecewise Aggregate Approximation (PAA)
- PAA -> Symbols
- Feature-Based Similarity (μ, σ, kurtosis, skew, …)
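
The distance measures and the PAA-to-symbols step above can be sketched as follows. Several choices here are assumptions rather than anything fixed by these notes: the point cost d is taken as the squared difference with a square root at the end, the Sakoe-Chiba band is given as a half-width `window`, and SAX uses a 4-letter alphabet with the standard-normal quartile breakpoints.

```python
import math

def euclidean(q, c):
    """Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(q, c)))

def dtw(q, c, window=None):
    """DTW distance with an optional Sakoe-Chiba band of half-width window.

    gamma[i][j] = d(q_i, c_j) + min(gamma[i-1][j-1], gamma[i-1][j], gamma[i][j-1])
    """
    n, m = len(q), len(c)
    # The band must be at least |n - m| wide or the end cell is unreachable.
    w = max(n, m) if window is None else max(window, abs(n - m))
    inf = float("inf")
    gamma = [[inf] * (m + 1) for _ in range(n + 1)]
    gamma[0][0] = 0.0  # boundary constraint: the path starts at (1, 1)
    for i in range(1, n + 1):
        for j in range(max(1, i - w), min(m, i + w) + 1):
            cost = (q[i - 1] - c[j - 1]) ** 2
            gamma[i][j] = cost + min(gamma[i - 1][j - 1],
                                     gamma[i - 1][j],
                                     gamma[i][j - 1])
    return math.sqrt(gamma[n][m])

def paa(series, segments):
    """Mean of each of `segments` equal-width frames (length must divide evenly)."""
    width = len(series) // segments
    return [sum(series[k * width:(k + 1) * width]) / width
            for k in range(segments)]

# Breakpoints that cut a standard normal into 4 equiprobable regions.
BREAKPOINTS = (-0.6745, 0.0, 0.6745)

def sax(series, segments, alphabet="abcd"):
    """PAA followed by symbol lookup; expects an already z-normalized series."""
    return "".join(alphabet[sum(v > b for b in BREAKPOINTS)]
                   for v in paa(series, segments))
```

With `window=0` the warping path is forced onto the diagonal, so for equal-length inputs `dtw` reduces to `euclidean`; widening the band lets the path absorb local time shifts.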
####Pre-Processing (Removing Distortions)
- Offset Translation (Mean)
- Amplitude Scaling (Standard Deviation)
- Linear Trend
- Noise
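
One possible reading of the four distortion removals above, as a small sketch. The function names are illustrative; the linear trend is removed with an ordinary least-squares fit against the sample index, and noise is reduced with a simple moving average, both of which are assumptions about the intended technique.

```python
import statistics

def remove_offset(series):
    """Offset translation: subtract the mean."""
    m = statistics.fmean(series)
    return [x - m for x in series]

def scale_amplitude(series):
    """Subtract the mean, then divide by the standard deviation (z-normalization).

    Assumes the series is not constant, otherwise the deviation is zero.
    """
    sd = statistics.pstdev(series)
    return [x / sd for x in remove_offset(series)]

def remove_linear_trend(series):
    """Subtract the least-squares line fitted against the index 0..n-1."""
    n = len(series)
    t_mean = (n - 1) / 2
    y_mean = statistics.fmean(series)
    slope = (sum((t - t_mean) * (y - y_mean) for t, y in enumerate(series))
             / sum((t - t_mean) ** 2 for t in range(n)))
    return [y - (y_mean + slope * (t - t_mean)) for t, y in enumerate(series)]

def smooth(series, k=1):
    """Moving average over up to 2k+1 neighbors (shorter windows at the edges)."""
    out = []
    for i in range(len(series)):
        window = series[max(0, i - k):i + k + 1]
        out.append(sum(window) / len(window))
    return out
```

Applying `scale_amplitude` to both series before computing a distance makes the comparison insensitive to differences in level and scale.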
####Clustering
- Hierarchical Clustering
- K-Means (Partitioning)
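
A minimal sketch of the K-Means entry above, treating each series (or point) as a tuple of coordinates. Initial centroids are passed in explicitly to keep the example deterministic; that choice, the Euclidean distance, and the names are assumptions for illustration.

```python
import math

def kmeans(points, centroids, iterations=10):
    """Assign each point to its nearest centroid, then recompute the means."""
    centroids = [tuple(c) for c in centroids]
    clusters = []
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda k: math.dist(p, centroids[k]))
            clusters[nearest].append(p)
        # An empty cluster keeps its previous centroid.
        new = [tuple(sum(xs) / len(xs) for xs in zip(*cl)) if cl else centroids[k]
               for k, cl in enumerate(clusters)]
        if new == centroids:
            break  # assignments are stable
        centroids = new
    return centroids, clusters

points = [(0, 0), (0, 1), (10, 10), (10, 11)]
centroids, clusters = kmeans(points, centroids=[(0, 0), (10, 10)])
```

Unlike hierarchical clustering, this partitioning approach needs the number of clusters up front, here implied by the number of initial centroids.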
####Classification
- Nearest Neighbor Classification
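
Nearest neighbor classification can be sketched in a few lines: the query takes the label of the closest training series. The labeled series below are made up, and Euclidean distance is only the default; any distance function with the same signature could be passed instead.

```python
import math

def euclidean(q, c):
    """Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(q, c)))

def nearest_neighbor_label(query, training, distance=euclidean):
    """Return the label of the training series closest to query."""
    _, label = min(training, key=lambda pair: distance(query, pair[0]))
    return label

# Hypothetical labeled training series.
training = [
    ([0.0, 0.1, 0.0, -0.1], "flat"),
    ([0.0, 1.0, 2.0, 3.0], "rising"),
    ([3.0, 2.0, 1.0, 0.0], "falling"),
]
label = nearest_neighbor_label([0.2, 0.9, 2.1, 2.8], training)
```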
##Spatial Data Management
- Data: Point Data, Raster Data, Region Data, Vector Data
- Queries: Range Queries, Nearest Neighbor Queries, Spatial Join Queries
- B+ Trees Index vs Spatial Index
####Space Filling Curves
- Point Data: Z-Ordering and B+ Trees
- Region Data: Region Quad Trees, Z-Ordering and B+ Trees - 2^k^ regions
- Querying – range, nearest neighbor, join
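
To make the Z-ordering idea concrete, the sketch below interleaves the bits of x and y into a single key and answers a range query by scanning a sorted list of keys. The sorted list stands in for the leaf level of a B+ tree, and the x-bit-first interleaving order and all names are assumptions for the example.

```python
import bisect

def z_value(x, y, bits):
    """Interleave the low `bits` bits of x and y, x bit first."""
    z = 0
    for i in reversed(range(bits)):
        z = (z << 1) | ((x >> i) & 1)
        z = (z << 1) | ((y >> i) & 1)
    return z

def range_query(sorted_entries, x_lo, x_hi, y_lo, y_hi, bits):
    """sorted_entries is a list of (z, (x, y)) sorted by z."""
    # Every point in the rectangle has a z-value between those of its
    # lower-left and upper-right corners, so only that key range is scanned.
    z_lo = z_value(x_lo, y_lo, bits)
    z_hi = z_value(x_hi, y_hi, bits)
    start = bisect.bisect_left(sorted_entries, (z_lo,))
    hits = []
    for z, (x, y) in sorted_entries[start:]:
        if z > z_hi:
            break
        # The key range also covers points outside the rectangle; filter them.
        if x_lo <= x <= x_hi and y_lo <= y <= y_hi:
            hits.append((x, y))
    return hits

points = [(0, 0), (1, 3), (2, 2), (3, 1), (3, 3)]
entries = sorted((z_value(x, y, 2), (x, y)) for x, y in points)
hits = range_query(entries, 2, 3, 1, 3, bits=2)
```

The filtering step matters because nearness in z-value does not always mean nearness in space, which is one reason a one-dimensional index is only an approximation of a true spatial index.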
####Grid Files
- Grid Directory, Linear Scale
- Querying – point, range, nearest neighbor
- Creation / Insertion of points (Page Capacity, Splitting Policy)
- Deletion – Convexity Requirement
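
A minimal sketch of grid file lookups, assuming two dimensions, one linear scale per dimension, and an in-memory directory that maps each cell to a page. Insertion, splitting, and deletion are left out, and the class and attribute names are hypothetical.

```python
import bisect

class GridFile:
    def __init__(self, x_scale, y_scale, directory, pages):
        self.x_scale = x_scale      # sorted split points along x
        self.y_scale = y_scale      # sorted split points along y
        self.directory = directory  # directory[ix][iy] -> page id
        self.pages = pages          # page id -> list of points

    def cell(self, x, y):
        """Locate the directory cell for a point using the linear scales."""
        return (bisect.bisect_right(self.x_scale, x),
                bisect.bisect_right(self.y_scale, y))

    def point_query(self, x, y):
        ix, iy = self.cell(x, y)
        return (x, y) in self.pages[self.directory[ix][iy]]

    def range_query(self, x_lo, x_hi, y_lo, y_hi):
        ix_lo, iy_lo = self.cell(x_lo, y_lo)
        ix_hi, iy_hi = self.cell(x_hi, y_hi)
        # Several cells may share a page, so collect distinct page ids first.
        page_ids = {self.directory[ix][iy]
                    for ix in range(ix_lo, ix_hi + 1)
                    for iy in range(iy_lo, iy_hi + 1)}
        return sorted(p for pid in page_ids for p in self.pages[pid]
                      if x_lo <= p[0] <= x_hi and y_lo <= p[1] <= y_hi)

# Cells (0, 0) and (0, 1) share page "A"; together they form a rectangle.
grid = GridFile(
    x_scale=[10], y_scale=[10],
    directory=[["A", "A"], ["B", "C"]],
    pages={"A": [(2, 3), (4, 15)], "B": [(12, 5)], "C": [(15, 15)]},
)
```

In this example page "A" covers a rectangular (convex) union of cells, which is the shape the convexity requirement asks merged regions to keep after deletions.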
####R-Trees
- Bounding Box
- Querying – point, range, nearest neighbor
- Insertion and Deletion, Optimal Splits
- R* Trees
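
The bounding-box operations behind R-tree search and insertion can be sketched as below. Boxes are `(x1, y1, x2, y2)` tuples and nodes are plain `(kind, entries)` tuples; both representations are assumptions for the example, and node splitting is not shown. The subtree choice uses the least-enlargement heuristic with ties broken by smaller area.

```python
def mbr(boxes):
    """Smallest rectangle (x1, y1, x2, y2) enclosing all given rectangles."""
    x1s, y1s, x2s, y2s = zip(*boxes)
    return min(x1s), min(y1s), max(x2s), max(y2s)

def intersects(a, b):
    """True when rectangles a and b overlap or touch."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]

def area(box):
    return (box[2] - box[0]) * (box[3] - box[1])

def enlargement(node_box, new_box):
    """Extra area node_box needs in order to also cover new_box."""
    return area(mbr([node_box, new_box])) - area(node_box)

def choose_subtree(children, new_box):
    """Pick the child box needing least enlargement, then smallest area."""
    return min(children, key=lambda c: (enlargement(c, new_box), area(c)))

def range_search(node, query):
    """node is ("leaf", [(box, obj), ...]) or ("inner", [(box, child), ...])."""
    kind, entries = node
    hits = []
    for box, payload in entries:
        if intersects(box, query):
            if kind == "leaf":
                hits.append(payload)
            else:
                # Descend only into children whose bounding box overlaps.
                hits.extend(range_search(payload, query))
    return hits

tree = ("inner", [
    ((0, 0, 2, 2), ("leaf", [((0, 0, 1, 1), "a"), ((1, 1, 2, 2), "b")])),
    ((10, 10, 12, 12), ("leaf", [((10, 10, 12, 12), "c")])),
])
```

Because sibling bounding boxes may overlap, a search can descend into more than one subtree; split policies such as the one in R* trees aim to keep that overlap small.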