By the end of this course, students will be able to:

1. Work with data mining software such as R

2. Define data mining and the steps of knowledge discovery

3. Compute descriptive statistics for a data set

4. Assess the necessity of normalization techniques for a given data set and problem

5. Choose and apply an appropriate pre-processing technique for a given data set and problem

6. Differentiate between supervised and unsupervised learning

7. Describe and apply association rule mining methods including the Apriori and FP-Growth algorithms

8. Describe and apply classification methods including kNN, naïve Bayes and decision trees

9. Differentiate between bias and variance

10. Systematically compare model performance results with those reported in the literature and interpret them

11. Explain neural network architectures and learning algorithms

12. Describe and apply clustering methods including k-means and density-based algorithms

13. Explain the concepts of overfitting and underfitting

14. Describe the characteristics of a time series, such as stationarity, normality and skewness

15. Evaluate the suitability of a data mining technique for a given problem and data set, and apply the selected technique correctly

16. Apply missing data analysis techniques

17. Visualise data sets using plots including box plots, bean plots and scatter plots