IT414
Course Name:
Data Warehousing And Data Mining (IT414)
Programme:
Category:
Credits (L-T-P):
Content:
Introduction to data mining: motivation and significance of data mining, data mining functionalities, interestingness measures, classification of data mining systems, major issues in data mining.
Data pre-processing: need, data summarization, data cleaning, data integration and transformation, data reduction techniques, data discretization and concept hierarchy generation.
Data warehouse and OLAP technology: multidimensional data models, data warehouse architecture, OLAP server types, data warehouse implementation, on-line analytical processing and mining.
Data cube computation and data generalization: efficient methods for data cube computation, discovery-driven exploration of data cubes, complex aggregation, attribute-oriented induction for data generalization.
Mining frequent patterns, associations and correlations: basic concepts, efficient and scalable frequent itemset mining algorithms, mining various kinds of association rules (multilevel and multidimensional), association rule mining versus correlation analysis, constraint-based association mining.
Classification and prediction: definition, decision tree induction, Bayesian classification, rule-based classification, support vector machines, associative classification, lazy learners, prediction, accuracy and error measures.
Cluster analysis: definition, clustering algorithms (partitioning, hierarchical, density-based, grid-based and model-based), clustering high-dimensional data, constraint-based cluster analysis.
Data mining on complex data and applications: algorithms for mining spatial, multimedia and text data, data mining applications, social impacts of data mining, trends in data mining.