Organizer: de.NBI

Start: Sunday, 12 July 2020 @ 09:00

End: Sunday, 12 July 2020 @ 13:00

Description:

Educators:
René Rahn, Marcel Ehrhardt, Svenja Mehringer (CIBI)

Date:
12.07.2020
9:00 am - 1:00 pm (Eastern Daylight Time)

Location:
ISMB 2020: Virtual Conference
https://www.iscb.org/ismb2020/4399

Contents:
In this half-day tutorial we teach how to use modern C++ and modern C++ libraries to rapidly develop tools and scripts for processing and manipulating large-scale sequencing data.

The high variability and heterogeneity often observed in genomic data are challenging for many standard tools, for example for read alignment and variant calling. These tools are therefore often wrapped in complicated pre- and postprocessing data curation steps to obtain higher-quality results. However, these additional steps add a high maintenance and performance burden to the established workflow and often do not scale to larger data sets. C++ is seldom considered the language of choice for these small processing steps, although it is the main language used in high-performance computing. We are going to show that programming in modern C++ can be as easy as using other modern high-level languages.
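
As a rough illustration of that last point, the following minimal sketch shows two typical small sequence tasks written with the C++17 standard library only. It is not taken from the tutorial material and does not use SeqAn or SDSL; the function names complement, reverse_complement and gc_content are hypothetical and chosen for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <string_view>

// Complement of a single DNA base; anything other than A, C, G, T maps to 'N'.
// (Hypothetical helper for this sketch, not a library function.)
constexpr char complement(char base)
{
    switch (base)
    {
        case 'A': return 'T';
        case 'C': return 'G';
        case 'G': return 'C';
        case 'T': return 'A';
        default:  return 'N';
    }
}

// Reverse complement: walk the input back to front and complement each base.
std::string reverse_complement(std::string_view seq)
{
    std::string result(seq.size(), 'N');
    std::transform(seq.rbegin(), seq.rend(), result.begin(), complement);
    return result;
}

// Fraction of G and C bases; the sequence must not be empty.
double gc_content(std::string_view seq)
{
    assert(!seq.empty());
    auto const gc = std::count_if(seq.begin(), seq.end(),
                                  [](char c) { return c == 'G' || c == 'C'; });
    return static_cast<double>(gc) / static_cast<double>(seq.size());
}
```

With these definitions, reverse_complement("ACGTT") returns "AACGT" and gc_content("ATGC") returns 0.5. The sketch assumes upper-case input; lower-case or ambiguous bases are reported as 'N' by complement and are not counted by gc_content.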

Learning goals:
Students will develop

  • skills in developing an application using the C++ programming language
  • skills in using modern C++ libraries (e.g. SeqAn, SDSL) to query large sequence databases
  • knowledge and understanding of modern C++ features, such as ranges and concepts
  • knowledge and understanding of modern and efficient data structures and algorithms crucial for large-scale genomic sequence analysis
  • knowledge and understanding of how to develop and sustain high-quality software

Prerequisites:
This tutorial is best suited for computational biologists and bioinformaticians with a research focus on sequence analysis (e.g. genomics, metagenomics, proteomics, read alignment, variant detection). Fundamental knowledge of sequencing experiments and the data involved is required. We expect attendees to have intermediate programming knowledge in any high-level programming language, e.g. Python, Java or C++. Some basic C++ knowledge is helpful but not mandatory to successfully complete the course.

This tutorial targets beginner and intermediate C++ developers who want to learn more about modern C++ features like ranges and concepts.

Keywords:
BioC++, modern C++, bioinformatics, SeqAn, FileIO

Tools:
- A simple text editor
- g++ >= 7
- cmake >= 3.12
- git

Event type:
  • Meetings and conferences
BioC++ - solving daily bioinformatic tasks with C++ efficiently - ISMB 2020
https://tess.elixir-europe.org/events/bioc-solving-daily-bioinformatic-tasks-with-c-efficiently-ismb-2020