Data science with R

Monday, 25 November 2019 00:00 to Wednesday, 27 November 2019 00:00

Registration:

Registration will open about three months before the course/school starts and will normally close 5 days before.
Please note that even if the box above says that registration is closed, this may simply mean that registration has not yet opened.
In this case you will be able to apply at a later date.

This course will be held in ENGLISH.


Coordinating teacher: G. Pedrazzi

Organizer:

Giorgio Pedrazzi

Description:

The purpose of this course is to introduce researchers and scientists to R implementations of Machine Learning methods. The first part of the course will consist of introductory lectures on popular Machine Learning algorithms, including unsupervised methods (Clustering, Association Rules) and supervised ones (Decision Trees, Naive Bayes, Random Forests and Deep Neural Networks). Basic Machine Learning concepts such as training set, test set, validation set, overfitting, bagging and boosting will be introduced, as well as performance evaluation for supervised and unsupervised methods.
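
As an informal illustration of the training set and test set concepts mentioned above (not taken from the course material), the following sketch uses base R only. The built-in mtcars dataset, the 70/30 split, the linear model and the helper name rmse are all assumptions chosen for brevity. A test error clearly larger than the training error is one common symptom of overfitting.

```r
# Hold-out evaluation with base R only (no extra packages assumed).
set.seed(42)                                   # make the random split reproducible
n <- nrow(mtcars)
train_idx <- sample(n, size = floor(0.7 * n))  # about 70% of the rows for training
train <- mtcars[train_idx, ]
test  <- mtcars[-train_idx, ]                  # the remaining rows are held out

# Fit a linear model on the training set only.
fit <- lm(mpg ~ wt + hp, data = train)

# Root mean squared error; 'rmse' is an illustrative helper, not a library function.
rmse <- function(actual, predicted) sqrt(mean((actual - predicted)^2))

train_rmse <- rmse(train$mpg, predict(fit, newdata = train))
test_rmse  <- rmse(test$mpg,  predict(fit, newdata = test))

cat("training RMSE:", train_rmse, "\n")
cat("test RMSE:    ", test_rmse, "\n")
```

A validation set would be obtained in the same way, by splitting the training rows once more before any model tuning.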

The second part will consist of practical exercises such as reading data, using packages and building machine learning applications. Different options for parallel programming will be shown using specific R packages (parallel, h2o,…). For Deep Learning applications, the Keras package will be presented. The examples will cover the analysis of large datasets and image datasets. Participants will use R on Cineca HPC facilities for the practical assignments.
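
To give a rough idea of what the parallel package mentioned above offers, here is a minimal sketch (not taken from the course material) that distributes a bootstrap computation over a small local cluster. The number of workers (2), the number of resamples (200), the seed and the helper name boot_mean are assumptions made for illustration; on an HPC system the cluster would normally be sized according to the resources allocated to the job.

```r
library(parallel)  # distributed with base R, no installation needed

# Illustrative task: the mean of one bootstrap resample of x.
# 'i' is only the replicate index and is not used in the computation.
boot_mean <- function(i, x) mean(sample(x, replace = TRUE))

x <- mtcars$mpg

cl <- makeCluster(2)                   # 2 local workers (an assumption, adjust as needed)
clusterSetRNGStream(cl, iseed = 123)   # reproducible random streams on the workers
boot <- parLapply(cl, 1:200, boot_mean, x = x)
stopCluster(cl)                        # always release the workers

boot <- unlist(boot)
cat("bootstrap standard error of the mean:", sd(boot), "\n")
```

parLapply is a drop-in parallel counterpart of lapply, so the same function can first be tested serially with lapply(1:200, boot_mean, x = x) before moving to a cluster.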

Skills:

At the end of the course, the student will be expected to have acquired:
• the ability to perform basic operations on matrices and dataframes
• the ability to manage packages
• the ability to navigate in the RStudio interface
• a general knowledge of Machine and Deep Learning methods
• a general knowledge of the most popular packages for Machine and Deep Learning
• a basic knowledge of different parallel programming techniques
• the ability to build machine learning applications with large datasets and image datasets

Target audience:

Students and researchers from different backgrounds who are looking for technologies and methods to analyze large amounts of data.

Pre-requisites:

Participants must have a basic knowledge of statistics. They must also be familiar with the basics of Linux and the R language.

Intended for:

Companies

Research Institutions

Universities

Area:

Languages

Techniques

Data

Length:

3 days

Course material and recordings:

https://learn.cineca.it/course/view.php?id=160

Files and attachments:

agenda_datasciencer_rm.pdf

Concluded:

Yes
