Introduction to R for data analytics


Description:

The purpose of this course is to present researchers and scientists with R implementations of Machine Learning methods. The first part of the course will consist of introductory lectures on popular Machine Learning algorithms, including unsupervised methods (Clustering, Association Rules) and supervised ones (Decision Trees, Naive Bayes, Random Forests and Deep Neural Networks). Basic Machine Learning concepts such as training set, test set, validation set, overfitting, bagging and boosting will be introduced, as well as performance evaluation for supervised and unsupervised methods.
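As a rough illustration of the training set / test set idea mentioned above, here is a minimal sketch in base R. It is not taken from the course material: the built-in mtcars dataset, the 70/30 split, the linear model and the helper name rmse are all assumptions chosen for brevity.

```r
# Hold-out evaluation sketch using only base R and the built-in mtcars data.
set.seed(42)                                  # arbitrary seed, for a reproducible split

n         <- nrow(mtcars)
train_idx <- sample(n, size = floor(0.7 * n)) # about 70% of the rows for training
train     <- mtcars[train_idx, ]
test      <- mtcars[-train_idx, ]

# Fit a simple supervised model on the training set only.
fit <- lm(mpg ~ wt + hp, data = train)

# Root mean squared error (hypothetical helper, not a library function).
rmse <- function(actual, predicted) sqrt(mean((actual - predicted)^2))

train_rmse <- rmse(train$mpg, predict(fit, newdata = train))
test_rmse  <- rmse(test$mpg,  predict(fit, newdata = test))

# A test error much larger than the training error is a hint of overfitting.
cat("train RMSE:", train_rmse, "\n")
cat("test RMSE: ", test_rmse, "\n")
```

The model never sees the test rows during fitting, so the test error gives an estimate of how it might behave on new data.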

The second part will consist of practical exercises such as reading data, using packages and building machine learning applications. Different options for parallel programming will be shown using specific R packages (parallel, h2o,…). For Deep Learning applications, the Keras package will be presented. The examples will cover the analysis of large datasets and image datasets. Participants will use R on Cineca HPC facilities for the practical assignments.
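To give a flavour of the parallel package named above, the following is a minimal sketch, not an excerpt from the course exercises. The function slow_square and the choice of two worker processes are assumptions made for illustration; on an HPC system the number of workers would normally follow the resources allocated to the job.

```r
library(parallel)   # distributed with base R, no installation needed

# Hypothetical stand-in for an expensive computation.
slow_square <- function(i) {
  Sys.sleep(0.1)    # simulate work
  i^2
}

# Start two worker processes (assumed to be available on this machine).
cl <- makeCluster(2)

# Distribute the inputs 1..8 across the workers.
par_result <- parLapply(cl, 1:8, slow_square)

# Always release the workers when done.
stopCluster(cl)

# The sequential equivalent, for comparison.
seq_result <- lapply(1:8, slow_square)

print(identical(par_result, seq_result))
```

parLapply mirrors the interface of lapply, which is why existing sequential code can often be adapted with few changes.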

Skills:

At the end of the course, the student will be expected to have acquired:
    • the ability to perform basic operations on matrices and data frames
    • the ability to manage packages
    • the ability to navigate the RStudio interface
    • a general knowledge of Machine and Deep Learning methods
    • a general knowledge of the most popular packages for Machine and Deep Learning
    • a basic knowledge of different parallel programming techniques
    • the ability to build machine learning applications with large datasets and image datasets

Target audience:

Students and researchers from different backgrounds who are looking for technologies and methods to analyze large amounts of data.

Pre-requisites:

Participants must have a basic knowledge of statistics; some programming experience (in any language) is recommended. Participants should also be familiar with basic Linux usage.

 

Area: Languages, Techniques, Data
Target: Companies, Research Institutions, Universities
Length: 3 days
Minimum number of attendees required: 6


Any questions?

For HPC and computer graphics courses, write to corsi.hpc@cineca.it

About CINECA

Cineca is a non-profit consortium made up of 70 Italian universities, 5 Italian research institutions and the Italian Ministry of Education.

Today it is the largest Italian computing centre and one of the most important worldwide. With more than seven hundred employees, it operates in the technology transfer sector through high performance scientific computing, the management and development of networks and web-based services, and the development of complex information systems for treating large amounts of data.

It develops advanced Information Technology applications and services, acting as a trait d'union between the academic world, the sphere of pure research, and the world of industry and Public Administration.
