High Performance Bioinformatics

Monday, 9 December 2019 00:00 to Wednesday, 11 December 2019 00:00

Wednesday, 04 December 2019 at 00:00
Tiziano Flati, Silvia Gioiosa, Francesco Falciano, Nicola Spallanzani

Registration:

 

This event is sponsored by PRACE. To register, go to the PTC event site:
https://events.prace-ri.eu/event/939

This course will be held in English.

Registration will open about three months before the course/school starts.

Coordinating teacher: T. Castrignanò
Teachers: T. Flati, S. Gioiosa, T. Castrignanò

Organizer: 
Teachers: 
Tiziano Flati, Silvia Gioiosa, Francesco Falciano, Nicola Spallanzani

Description:

This course focuses on the development and execution of bioinformatics pipelines and on their optimization with regard to computing time and disk space. In an era where the data produced per analysis is on the order of terabytes, simple serial bioinformatics pipelines are no longer feasible. Hence the need for scalable, high-performance parallelization and analysis tools that can cope with large-scale datasets. To this end, we will study the common performance bottlenecks that emerge in everyday bioinformatics pipelines and see how to cut execution times for effective data analysis on current and future supercomputers.
As a case study, a transcriptome data analysis will be presented and re-implemented on the CINECA supercomputers in dedicated hands-on sessions that apply the concepts explained in the course.
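
To illustrate the difference between a serial and a parallel pipeline step, here is a minimal sketch in Python. It is not taken from the course material: the function names (`gc_content`, `analyse_serial`, `analyse_parallel`) and the toy sequences are hypothetical, and GC content merely stands in for a real per-sample analysis. The sketch assumes that each sample can be processed independently of the others.

```python
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def gc_content(sequence):
    """Return the fraction of G/C bases in a sequence (0.0 if it is empty)."""
    if not sequence:
        return 0.0
    seq = sequence.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def analyse_serial(samples):
    """Process the samples one after another (hypothetical serial step)."""
    return {name: gc_content(seq) for name, seq in samples.items()}


def analyse_parallel(samples, workers=2, executor_cls=ProcessPoolExecutor):
    """Process independent samples concurrently with a pool of workers.

    executor_cls is an illustrative parameter: a process pool suits
    CPU-bound work, a thread pool suits steps that mostly wait on I/O.
    """
    names = list(samples)
    with executor_cls(max_workers=workers) as pool:
        # map() preserves input order, so results line up with names.
        results = list(pool.map(gc_content, [samples[n] for n in names]))
    return dict(zip(names, results))


if __name__ == "__main__":
    # Toy data, invented for this example only.
    toy_samples = {"sample_a": "ATGCGC", "sample_b": "ATATAT", "sample_c": "GGCC"}
    print(analyse_serial(toy_samples))
    print(analyse_parallel(toy_samples))
```

Both functions return the same dictionary; only the scheduling differs. On real data the per-sample work would have to be far heavier than in this toy case for the parallel version to pay off, since starting worker processes has its own cost.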

Skills:

By the end of the course, each student should be able to:

- Manage the transfer/download of very large datasets and/or large numbers of files between a local computer or public repositories and the Cineca platforms;
- Prepare the software environment to analyse large amounts of biological data on a supercomputer;
- Run bioinformatics software on a supercomputer;
- Combine several bioinformatics applications into automated pipelines on a supercomputer;
- Have an overview of the Python data analysis framework.

Target audience:

Biologists, bioinformaticians and computer scientists interested in approaching large-scale NGS-data analysis for the first time.

Pre-requisites:

Basic knowledge of Python and the shell command line. A very basic knowledge of biology is recommended but not required.

 

Intended for: 
Companies
Research Institutions
Universities
Area: 
Languages
Science
Length: 
3 days
Concluded: 
0

Next courses

No further editions of this course are scheduled.

Any questions?

For HPC and computer graphics courses, write to corsi.hpc@cineca.it

About CINECA

Cineca is a non-profit consortium made up of 102 Italian national institutions: Universities, Italian Research Institutions and the Italian Ministries of Universities and Education.

Today it is the largest Italian computing centre and one of the most important worldwide. With more than seven hundred employees, it operates in the technology transfer sector through high performance scientific computing, the management and development of networks and web-based services, and the development of complex information systems for handling large amounts of data.

It develops advanced Information Technology applications and services, acting as a trait d'union between the academic world, the sphere of pure research and the world of industry and Public Administration.

Visit the Cineca website