Module Code
SOR3008
Introduction to Data Mining; Exploratory Data Analysis; Cluster analysis; Classification including Probabilistic Modelling, Bayesian Networks, Decision tree analysis; Prediction including Regression trees, Random Forests, Neural nets.
On completion of the module, it is intended that students will be able to: demonstrate understanding of the field of data mining, how it has developed and the need for data mining techniques in today’s society; demonstrate familiarity with data warehouses, webhouses and data marts, the various forms of storing, managing and maintaining large amounts of data; employ exploratory data analysis techniques for univariate analyses, where a single outcome variable is considered, and for bivariate or multivariate analyses, where more than one variable is considered, covering both quantitative and qualitative data; apply and interpret the results of principal component analysis for multiple variables; demonstrate knowledge of classification and of classification methods including simple linear, nearest neighbour and decision tree models, Bayes classifiers, neural networks and random forests; demonstrate knowledge of the purpose of clustering, and use hierarchical clustering and the non-hierarchical clustering methods of k-means and nearest neighbour on real data sets; understand association rules and apply them to real data sets.
Problem solving and computational skills.
None.
Coursework
40%
Examination
60%
Practical
0%
20
Spring Semester
12 Weeks