Training bioinformaticians in high performance computing

Esteban Pérez-Wohlfeil, Oscar Torreno, Louisa J. Bellis, Pedro L. Fernandes, Brane Leskosek, Oswaldo Trelles

Heliyon, 2018

In the last decade, bioinformatics has become an indispensable branch of modern scientific research, experiencing an explosion in financial support, developed applications and data collection. The growth of the datasets emerging from research laboratories, industry, the health sector, etc., is steadily raising the demand for computing power and storage. Processing biological data at the scale of these datasets often requires the use of High Performance Computing (HPC) resources, especially when dealing with certain types of omics data, such as genomic and metagenomic data. Such computational resources not only require substantial investments, but also involve high maintenance costs. More importantly, in order to secure a good return on these investments, specific training needs to be put in place to ensure that waste is minimized. Furthermore, given that bioinformatics is a highly interdisciplinary field where several other domains intersect (such as biology, chemistry, physics and computer science), researchers from these areas also require bioinformatics-specific training in HPC in order to take full advantage of supercomputing centers. In this document, we describe our experience in training researchers from several different disciplines in HPC as applied to bioinformatics, under the framework of the leading European bioinformatics platform ELIXIR, and analyze both the content and outcomes of the course.