Automated Workflow Composition in the Life Sciences
Organizers: Jon Ison, Anna-Lena Lamprecht, Magnus Palmblad and Veit Schwämmle
Host institution: Leiden University
Start: Monday, 09 March 2020 @ 09:00
End: Friday, 13 March 2020 @ 17:00
Sponsors: ELIXIR, Lorentz Center, LUMC
Venue: Lorentz Center Oort
Scientific topic: Omics, Workflows
Operations: Analysis
Target audience:
- software developers, bioinformaticians
In the age of computational science, researchers in the life sciences – just as in other domains – regularly need to compose several individual software tools into pipelines or workflows that perform the specific data analysis processes their research requires. For over 20 years now, dedicated scientific workflow management systems have been supporting scientists in this task, and they continue to gain popularity. In fact, recent years have seen significant progress in the functional annotation of bioinformatics software tools, as well as in their virtualization, containerization and assembly into workflows for automatically executing the processes.
The idea of semantics-based automated composition of workflows has been around at least since the rise of the Semantic Web in the early 2000s. It aims to simplify the work with scientific workflows further and to free life science researchers from having to deal with the technicalities of software composition. This would not only save valuable research time, but also reduce errors, allow benchmarking of data analysis pipelines and enable new scientific findings by discovering workflows that researchers would not have thought of themselves. However, despite its obvious potential and appeal, the need for optimizing data analysis workflows, and the efforts of several research groups working on the topic, automated workflow composition has not yet arrived in the daily practice of life science researchers.
The reasons for this are manifold. Some are practical (for example, the lack of automatic composition tools in the commonly used software frameworks); others are more fundamental (such as questions about specification languages, composition algorithms, formal semantics and workflow representations). On one important aspect, namely the semantic annotation of tools on a large scale, the life science community has made significant progress in recent years: the EDAM ontology provides a controlled vocabulary of bioinformatics operations, data types and formats, and the bio.tools registry has become a large collection of bioinformatics tools that are semantically annotated with terms from the EDAM ontology. As demonstrated in a recent Bioinformatics publication (https://academic.oup.com/bioinformatics/article/35/4/656/5060940), this forms a solid basis for performing automated workflow composition in the life sciences domain. Nevertheless, there is still a long way to go before it is used in daily scientific practice.
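To make the idea concrete, the following is a minimal sketch of type-driven composition, assuming each tool is annotated with exactly one input and one output data type. The tool names and type labels are hypothetical placeholders, not real bio.tools entries or EDAM identifiers, and the breadth-first search is a simple stand-in rather than the method used in the publication cited above.

```python
from collections import deque

# Hypothetical annotations: tool name -> (input data type, output data type).
# These labels are illustrative only; they are not real bio.tools entries
# or EDAM terms.
TOOLS = {
    "peak_picker": ("raw spectra", "peak list"),
    "db_search": ("peak list", "peptide identifications"),
    "protein_inference": ("peptide identifications", "protein identifications"),
    "spectrum_viewer": ("raw spectra", "plot"),
}


def compose(start, goal, tools=TOOLS):
    """Return a shortest chain of tool names that turns `start` into `goal`.

    Returns None if no chain of annotated tools connects the two types.
    """
    queue = deque([(start, [])])  # (current data type, tools applied so far)
    seen = {start}
    while queue:
        data_type, path = queue.popleft()
        if data_type == goal:
            return path
        for name, (needs, gives) in tools.items():
            # A tool is applicable when its input type matches what we hold.
            if needs == data_type and gives not in seen:
                seen.add(gives)
                queue.append((gives, path + [name]))
    return None


workflow = compose("raw spectra", "protein identifications")
print(workflow)  # ['peak_picker', 'db_search', 'protein_inference']
```

Even in this toy form, the search only works because every tool carries a machine-readable description of what it consumes and produces, which is the role that large-scale semantic annotation plays in practice.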
This workshop will bring together researchers and practitioners who have been working on different aspects of automated workflow composition in the life sciences. These include life science researchers, tool providers, infrastructure developers, ontologists, algorithmics researchers and many more. They do not normally come together as a group at regular scientific events, so a Lorentz workshop devoted to this topic provides a unique opportunity to join forces and significantly advance the field together.
- Workshops and courses