University of Padova

Machine Learning course, May 2011

Practical Assignment/Exercise

Aim: To explore the applicability of ML methods to classification and prediction problems. For Masters students the assignment is formative only, i.e. it will not be formally assessed. For PhD students it is summative, and they will be assessed on its completion.

Tasks: Complete Parts A and B as described below. You may use any software available to you, including NeuCom (www.theneucom.com).

Part A. ML model for classification

1. Choose a classification problem and appropriate data. You may use your own data or select data from one of the ML repository web sites, e.g. www.ics.uci.edu or www.kedri.info.

2. Review previous usage of the same data and problem by other researchers.

3. Analyse the data and the problem and select relevant features for modelling.

4. Create at least one global classification model (e.g. linear regression, logistic regression, SVM, MLP or RBF) and at least one local or personalised model (e.g. local regressions, ECF, kNN, wkNN). Validate all created models through K-fold cross-validation (e.g. K=3) to evaluate the models’ accuracy.

5. Compare the models in terms of:

(a) Accuracy of the classification results;

(b) Knowledge obtained/discovered (e.g. formulas, rules, important features).
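Steps 4 and 5(a) of Part A can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not a required method: a nearest-centroid classifier stands in for the global models listed above, kNN is the local model, and the toy data set and all function names are hypothetical.

```python
import math
import random
from collections import Counter


def k_fold_indices(n, k, seed=0):
    """Shuffle indices 0..n-1 and deal them into k nearly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def predict_centroid(X_train, y_train, x):
    """Global model (stand-in): assign x to the class with the closest mean."""
    groups = {}
    for xi, yi in zip(X_train, y_train):
        groups.setdefault(yi, []).append(xi)
    centroids = {c: [sum(col) / len(rows) for col in zip(*rows)]
                 for c, rows in groups.items()}
    return min(centroids, key=lambda c: math.dist(centroids[c], x))


def predict_knn(X_train, y_train, x, k=3):
    """Local model: majority vote among the k training samples nearest to x."""
    nearest = sorted(range(len(X_train)),
                     key=lambda i: math.dist(X_train[i], x))[:k]
    return Counter(y_train[i] for i in nearest).most_common(1)[0][0]


def cross_validate(X, y, predict, k=3, seed=0):
    """Return the mean accuracy of `predict` over k folds."""
    accuracies = []
    for fold in k_fold_indices(len(X), k, seed):
        held_out = set(fold)
        train = [i for i in range(len(X)) if i not in held_out]
        X_tr = [X[i] for i in train]
        y_tr = [y[i] for i in train]
        # The model is rebuilt for every test sample: simple, not efficient.
        hits = sum(predict(X_tr, y_tr, X[i]) == y[i] for i in fold)
        accuracies.append(hits / len(fold))
    return sum(accuracies) / k


# Toy data (hypothetical): two well-separated clusters of 12 points each.
cluster = [(0.1 * i, 0.1 * j) for i in range(4) for j in range(3)]
X = cluster + [(a + 5.0, b + 5.0) for a, b in cluster]
y = [0] * len(cluster) + [1] * len(cluster)

acc_global = cross_validate(X, y, predict_centroid, k=3)
acc_local = cross_validate(X, y, predict_knn, k=3)
print(f"nearest centroid: {acc_global:.2f}, kNN: {acc_local:.2f}")
```

Both models are scored on the same folds (same seed), so the accuracies are directly comparable. On real data the toy clusters would be replaced by the selected features and class labels.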

Part B. ML system for time series prediction

1. Choose a prediction problem and appropriate data. You may use your own data or select data from one of the ML repository web sites, e.g. www.ics.uci.edu or www.kedri.info.

2. Review previous usage of the same data and problem by other researchers.

3. Analyse the data and the problem and select relevant features for modelling.

4. Create at least one global prediction model (e.g. linear regression, logistic regression, SVM, MLP or RBF) and at least one local or personalised model (e.g. local regressions, kNN, wkNN, DENFIS, EFuNN). Validate all created models through K-fold cross-validation (e.g. K=3) to evaluate the models’ accuracy.

5. Compare the models in terms of:

(a) Accuracy of the prediction results;

(b) Knowledge obtained/discovered (e.g. formulas, rules, important features) related to the problem.
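For Part B, a time series first has to be turned into input/target pairs before the models in step 4 can be applied. The sketch below is one possible way to do this in plain Python. It assumes a sliding window of past values as inputs, a least-squares line on the most recent value as the global model, kNN regression as the local model, and contiguous (unshuffled) folds. The sine series and all function names are hypothetical.

```python
import math


def make_lagged(series, n_lags):
    """Turn a series into (inputs, targets): n_lags past values -> next value."""
    X = [series[t - n_lags:t] for t in range(n_lags, len(series))]
    y = [series[t] for t in range(n_lags, len(series))]
    return X, y


def fit_line(X, y):
    """Least-squares fit of y = a * x_last + b, using only the latest lag."""
    xs = [row[-1] for row in X]
    mx, my = sum(xs) / len(xs), sum(y) / len(y)
    sxx = sum((v - mx) ** 2 for v in xs)
    sxy = sum((v - mx) * (t - my) for v, t in zip(xs, y))
    a = sxy / sxx
    return a, my - a * mx


def predict_global(X_train, y_train, x):
    """Global model: one formula fitted to all training windows."""
    a, b = fit_line(X_train, y_train)
    return a * x[-1] + b


def predict_knn(X_train, y_train, x, k=3):
    """Local model: average target of the k lag vectors nearest to x."""
    nearest = sorted(range(len(X_train)),
                     key=lambda i: math.dist(X_train[i], x))[:k]
    return sum(y_train[i] for i in nearest) / len(nearest)


def k_fold_rmse(X, y, predict, k=3):
    """Mean RMSE over k contiguous folds (each fold is one block of time)."""
    n = len(X)
    bounds = [round(i * n / k) for i in range(k + 1)]
    scores = []
    for lo, hi in zip(bounds, bounds[1:]):
        X_tr, y_tr = X[:lo] + X[hi:], y[:lo] + y[hi:]
        sq = [(predict(X_tr, y_tr, X[i]) - y[i]) ** 2 for i in range(lo, hi)]
        scores.append(math.sqrt(sum(sq) / len(sq)))
    return sum(scores) / k


# Toy series (hypothetical): a sampled sine wave.
series = [math.sin(0.3 * t) for t in range(60)]
X, y = make_lagged(series, n_lags=2)

rmse_global = k_fold_rmse(X, y, predict_global, k=3)
rmse_local = k_fold_rmse(X, y, predict_knn, k=3)
print(f"global RMSE: {rmse_global:.4f}, local RMSE: {rmse_local:.4f}")
```

The fitted coefficients `a` and `b` are an example of the "formulas" meant in 5(b): the global model yields one equation for the whole series, while the local model yields a separate neighbourhood for each prediction.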

For PhD students only: You need to write a report (up to 1500 words) on the findings of the assignment, following the points above. Email your report to Prof. Nikola Kasabov (email: nkasabov@aut.ac.nz) by the end of June. The pass mark for the assignment is 50% completion of the required work. You may choose to work on your own data and a problem specified as part of your PhD study. Later you may also want to consider writing a paper based on this report, subject to your supervisor’s advice.