This course introduces researchers and scientists to machine learning techniques in R for the analysis of large data sets. Taught by data analytics experts, it consists of introductory lectures on machine learning techniques and covers basic concepts such as training and test sets, overfitting, bagging, boosting, and error rates. The course also introduces a range of model-based and algorithmic machine learning methods, including clustering, association rules, decision trees, Naive Bayes, and random forests. It also covers practical issues in machine learning, including programming in R, reading data into R, and accessing R packages. Examples of parallel R programming will be shown. Participants will use R for lab exercises on Cineca HPC facilities.
Machine Learning techniques
R parallel packages
Students and researchers with different backgrounds who are looking for technologies and methods to analyze large amounts of data.
Participants must have basic knowledge of statistics; some programming experience (in any language) is recommended. Participants should also be familiar with basic Linux commands, since some of them will be used in the course.