IFB French Institute of Bioinformatics
The French Institute of Bioinformatics (CNRS IFB) is a national service infrastructure in bioinformatics. IFB’s principal mission is to provide basic services and resources in bioinformatics for scientists and engineers working in the life sciences. IFB is the French node of the European research infrastructure, ELIXIR.
Galaxy is an open-source project. Everyone can contribute to its development, for example through core Galaxy development or by integrating software into the Galaxy environment. Here you will find materials to learn how to contribute to the Galaxy project.
Keywords: Galaxy
Introduction message of the EGDW 2017
Assessing the FAIRness of Training Materials
Keywords: biohackaton 2018
JSON schema validation with ontologies
Keywords: biohackaton 2018
Not available
Keywords: Transcriptomics
DNA-sequence analysis: from raw reads to variant calling within the Galaxy environment.
Keywords: DNA Analysis, Galaxy, Variant calling
"How to do" guidelines about:
- What and why schema.org
- What and why bioschemas.org
- How to select the right profile for your resource
- How to mark up your own resource with Bioschemas
Keywords: Bioschemas, Standards
Galaxy docker integration
Enable Galaxy to use BioContainers (Docker)
Galaxy with Docker swarm
Keywords: Docker, Galaxy
Building a semantic search engine for biology publications using event stream processing
Keywords: biohackaton 2018
The French Institute of Bioinformatics (IFB) has organised in partnership with the Institute of Integrative Biology (I2BC) a training course for bioinformaticians and biostatisticians wishing to implement the "FAIR" principles (Findable, Accessible, Interoperable, Reusable) in their analysis and...
Keywords: FAIR, Reproducible Science, Open science, Data analysis, Data processing
Resource type: Training materials
Goal
The aim is to:
Get familiar with motif analysis of ChIP-seq data.
Learn de novo motif discovery methods.
In practice:
Motif discovery with peak-motifs
Differential analysis
Random controls
Keywords: Chip-seq, NGS, Pattern recognition
Introduction on sequencing: available technologies, library types, applications ...
Keywords: Genomics
This training material has been used in a two-day training workshop on data management organized by ELIXIR and the Wheat Initiative.
It provides an overview of current practices and methods for plant phenotyping data standardization, and how to deal with the variability and heterogeneity...
Scientific topics: Data management, Plant biology, Phenomics
Keywords: Plant Phenotyping, Data-format, data sharing
Resource type: Training materials
Course organised by the French Institute of Bioinformatics (IFB) with the help of the Swiss Institute of Bioinformatics (SIB), in French
- 1st edition: 9-10 July 2019, Institut Pasteur (trainer: Frédéric Schütz)
- 2nd edition: 12-13 September 2019, Institut Pasteur (trainers: Christophe Malabat...
Keywords: R-programming, Reproducible Science
Introduction to statistics with R
Keywords: R
Small RNAseq data analysis for miRNA identification
Keywords: RNA-seq
The WheatIS project aims to build an International Wheat Information System to support the wheat research community. The main objective is to provide a single-access web-based system giving access to the available data resources and bioinformatics tools. The project is...
Scientific topics: Data management
Keywords: wheat, information system, data discovery
Resource type: Webinar, Slides, Video
How to use the administration panel of Galaxy
Keywords: Galaxy
How to handle Users, Groups, and Quotas in Galaxy
Keywords: Galaxy
Some use cases:
Extract a subset of variants (localization, type)
Combine variants from several analyses
Compare obtained variants from several data types (RNA
...
Keywords: Genomics
OmicsPath: Finding Relevant omics datasets using pathway information
Keywords: biohackaton 2018
Problems
Running jobs on the Galaxy server negatively impacts Galaxy UI performance
Even adding one other host helps
Can restart Galaxy without interrupting jobs
Solution:
Connect Galaxy to a computing cluster
Keywords: Cluster, Galaxy
Variant calling practical session
Keywords: Genomics
BioBlend, a Python library for using the Galaxy API
Keywords: API, Galaxy, Python
Biodiversity is now commonly described by DNA-based approaches. Several actors currently use DNA to describe biodiversity, and most of the time they use different genetic markers, which hampers easy sharing of the accumulated knowledge. Taxonomists rely a lot on the DNA Barcoding...
Keywords: metagenomics
CWL support in Galaxy
Keywords: biohackaton 2018
This module is separated in different courses:
MicroScope: General overview, Keyword search and gene cart functionalities
Functional annotation of microbial genomes
Functional annotation of microbial genomes: Prediction of enzymatic functions
Relational...
Keywords: Annotation, Genomics, Metabolomics, Microbial evolution, Transcriptomics
Prototyping the new PSICQUIC 2.0
Keywords: biohackaton 2018
How to use the Microscope Platform to annotate and analyze microbial genomes.
Keywords: Annotation, Genomics, Metabolomics, Microbial evolution, Sequence analysis, Transcriptomics
Prediction of Regions of Genomic Plasticity (RGPs) and CoDing Sequences (CDSs) and visualization
Keywords: CDS, Data visualization, Genomics, RGP
Global Objective
Given a set of ChIP-seq peaks, annotate them in order to find associated genes, genomic categories and functional terms.
Keywords: Chip-seq, Functional Annotation, NGS
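As an illustration of the "find associated genes" step, here is a minimal Python sketch that assigns each peak to the gene whose transcription start site (TSS) is closest to the peak midpoint. The gene table, the peak coordinates and the function names are all hypothetical, and nearest-TSS assignment is only one possible annotation rule; the course material may use a different one.

```python
from bisect import bisect_left

# Hypothetical gene table: (chromosome, TSS position, gene name).
GENES = [
    ("chr1", 1_000, "geneA"),
    ("chr1", 5_000, "geneB"),
    ("chr2", 2_000, "geneC"),
]


def build_index(genes):
    """Group (TSS, name) pairs by chromosome, sorted by TSS position."""
    index = {}
    for chrom, tss, name in genes:
        index.setdefault(chrom, []).append((tss, name))
    for entries in index.values():
        entries.sort()
    return index


def nearest_gene(index, chrom, start, end):
    """Return (gene name, distance) for the TSS closest to the peak midpoint.

    Returns None when the chromosome has no annotated gene.
    """
    entries = index.get(chrom)
    if not entries:
        return None
    mid = (start + end) // 2
    positions = [tss for tss, _ in entries]
    i = bisect_left(positions, mid)
    # Only the TSS just before and just after the midpoint can be nearest.
    candidates = entries[max(i - 1, 0): i + 1]
    tss, name = min(candidates, key=lambda e: abs(e[0] - mid))
    return name, abs(tss - mid)


index = build_index(GENES)
peaks = [("chr1", 900, 1_300), ("chr1", 4_000, 4_400), ("chr3", 10, 20)]
annotations = [nearest_gene(index, *peak) for peak in peaks]
for peak, annotation in zip(peaks, annotations):
    print(peak, "->", annotation)
```

Sorting the TSS positions once per chromosome lets each peak be annotated with a binary search instead of a scan over all genes.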
MG-RAST has been offering metagenomic analyses since 2007. Over 20,000 researchers have submitted data. I will describe the current MG-RAST implementation and demonstrate some of its capabilities. In the course of the presentation I will highlight several metagenomic pitfalls. MG-RAST:...
Keywords: metagenomics
bio.tools, EDAM drop-in hackathon and discussions
Keywords: biohackaton 2018
How to install a local instance of Galaxy
Keywords: Galaxy
Pathway effect prediction for protein targets
Keywords: biohackaton 2018
Transcriptome analysis provides information about the identity and quantity of all RNA molecules
Keywords: Genomics, RNA-seq
Development of a GA4GH-compliant, language-agnostic workflow execution service
Keywords: biohackaton 2018
Variant calling practical session
Keywords: Genomics
EBI metagenomics (EMG, https://www.ebi.ac.uk/metagenomics/) is a freely available hub for the analysis and exploration of metagenomic, metatranscriptomic, amplicon and assembly data. The resource provides rich functional and taxonomic analyses of user-submitted sequences, as well as analysis of...
Keywords: metagenomics
Import workflows into TeSS Concept Maps
Keywords: biohackaton 2018
Large scale metagenomic projects aim to extract biodiversity knowledge between different environmental conditions. Current methods for comparing microbial communities face important limitations. Those based on taxonomical or functional assignation rely on a small subset of the sequences that can...
Keywords: metagenomics
Metagenomic studies have gained increasing popularity in the years since the introduction of next generation sequencing. NGS allows for the production of millions of reads for each sample without the intermediate step of cloning. However, just as in the past, the quality of the data generated by...
Keywords: metagenomics
Enrichment and propagation of metagenomic experimental metadata
Keywords: biohackaton 2018
Meta3C is an experimental and computational approach that exploits the physical contacts experienced by DNA molecules sharing the same cellular compartments. These collisions provide quantitative information that allows interpreting and phasing the genomes present within complex mixes of species...
Keywords: metagenomics
Understand the method used in identifying an unknown sequence.
Understand the limitations of this method
Get to grips with various software (CLUSTALw, SeaView, Phylo_win and Njplot)
Keywords: Phylogenetics
Find Rapidly OTU with Galaxy Solution
Keywords: Galaxy, Metagenomics
Exploring data annotation on the genomics and transcriptomics levels with the MicroScope Platform and its tools
Keywords: Genomics, NGS, RNA-seq, SNP, Transcriptomics
Cardio-metabolic and Nutrition-related diseases (CMDs) represent an enormous burden for health care. They are characterized by very heterogeneous phenotypes progressing with time. It is virtually impossible to predict who will or will not develop cardiovascular comorbidities. There is a clear...
Keywords: metagenomics
How to query databases according to complex taxonomic criteria
Cross-Taxa retrieves gene families that are shared by a given set of taxa, or that are specific to a set of taxa. It is also possible to select gene families which are associated with a certain set of taxa but which are not...
Keywords: genomics
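The two kinds of query described for Cross-Taxa can be pictured with plain set operations. This is a sketch only: it does not use the Cross-Taxa interface, the family and taxon names are made up, and "specific to a set of taxa" is assumed here to mean "present in exactly those taxa and no others".

```python
# Hypothetical data: gene family -> set of taxa in which it occurs.
FAMILIES = {
    "FAM1": {"Homo", "Mus", "Gallus"},
    "FAM2": {"Homo", "Mus"},
    "FAM3": {"Gallus"},
    "FAM4": {"Homo", "Mus", "Danio"},
}


def shared_by(families, taxa):
    """Families found in every taxon of `taxa` (other taxa are allowed)."""
    taxa = set(taxa)
    return sorted(name for name, members in families.items() if taxa <= members)


def specific_to(families, taxa):
    """Families found in exactly the taxa of `taxa` (assumed definition)."""
    taxa = set(taxa)
    return sorted(name for name, members in families.items() if members == taxa)


shared = shared_by(FAMILIES, {"Homo", "Mus"})
specific = specific_to(FAMILIES, {"Homo", "Mus"})
print("shared:", shared)
print("specific:", specific)
```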
Development of BioJS components
Keywords: biohackaton 2018
RNA-seq: Differential gene expression analysis practical session
Keywords: Genomics, RNA-seq
Understand the method behind constructing a phylogenetic tree from the search for sequences to the analysis of the tree.
Get to grips with various bioinformatics software (BLAST, CLUSTALw, SeaView and Phylo_win).
Keywords: Phylogenetics
Exploring Pharmacogenomic LOD for Molecular Explanations of Gene-Drug Relationships
Keywords: biohackaton 2018
Transfer of Research Assets between FAIRDOM SEEKs
Keywords: biohackaton 2018
How to add biological meaning to peaks
Keywords: Annotation, Chip-seq, Data Visualization, NGS
transcriptome from new condition
tissue-specific transcriptome
different development stages
transcriptome from non model organism
cancer cell
RNA maturation mutant
How to manage RNA-Seq data with genes subjected to differential splicing?
Is it possible to discover new isoforms?
Is it...
Keywords: Isoforms, RNA-seq, Transcriptomics
It is a generally accepted characteristic of the biogeochemical nitrogen cycle that nitrification is catalyzed by two distinct clades of microorganisms. First, ammonia-oxidizing bacteria and archaea convert ammonia to nitrite, which subsequently is oxidized to nitrate by nitrite-oxidizing...
Keywords: metagenomics
Learn about and become familiar with phyloseq R package for the analysis of microbial census data
Keywords: Microbiomes, R
Improve Shiny and RStudio integration within Galaxy using Galaxy Interactive Environment
Keywords: biohackaton 2018
Putting structured data into individual entry pages in biological databases
Keywords: biohackaton 2018
Presentation of the workshop (Chairman: Victoria Dominguez Del Angel)
Keywords: biohackaton
Development of a catalog of federated SPARQL queries in the field of Rare Diseases
Keywords: biohackaton 2018
Complex microscopic communities are composed of species belonging to all life realms, from single-cell prokaryotes to multicellular eukaryotes of small size. Each component of a community needs to be studied for a full understanding of the functions performed by the whole assemblage, however...
Keywords: metagenomics
Support tools for rapid adoption of compact identifiers in the publishing process
Keywords: biohackaton 2018
ProtVista (protein annotation viewer) extension using Bioschemas data
Keywords: biohackaton 2018
Galaxy training material improvement and extension
Keywords: biohackaton 2018
What is Docker?
Building an image
BioShadock Orchestration
Keywords: Docker
How to configure your local instance of Galaxy
Keywords: Galaxy
Introduction to RADSeq through STACKS on Galaxy
Keywords: NGS
Soils are highly complex ecosystems and are considered as one of the Earth’s main reservoirs of biological diversity. Bacteria account for a major part of this biodiversity, and it is now clear that such microorganisms have a key role in soil functioning processes. However, environmental factors...
Keywords: metagenomics
Quick Search is dedicated to a quick search for sequences or sequence families in the databases available on the PBIL server. It is an alternative to WWW Query which allows more complex queries. Quick Search allows you to retrieve sequences or sequence families associated to a single word without...
Keywords: genomics, pattern recognition
Research data and their centrality in the research process.
This material is mostly in French.
Keywords: metadata, Data Life Cycle, Reproducibility, Data management plan
Resource type: Slides
The life of data during the project: principles and tools for organising, naming, versioning, storing and archiving my data
Scientific topics: Bioinformatics, Biology, Data management
Keywords: Data preserving, Data storage
Resource type: Slides
Sharing and disseminating data. The legal framework, repositories and data licences
Scientific topics: Data management, Biology, Bioinformatics
Keywords: data sharing, Data publishing, legal framework, data warehouse, licensing, data reuse
Resource type: Slides
Visualizations may be very helpful in understanding data better. There is a whole range of visualizations, from rather simple scatter and barplots up to projections of high dimensional data or even entire genomes. Many of these visualizations often require a lot of tweaking and changes in...
Keywords: Youri Hoogstrate
Visualization of Next Generation Sequencing Data using the Integrative Genomics Viewer (IGV)
Keywords: Genomics
Add meta-information on variants to facilitate interpretation
Keywords: Variant analysis
TEannot can annotate a genome using a library of DNA sequences. This library can be a predicted TE library built by TEdenovo.
Keywords: Annotation, Genomics
Metadata: standards in the field of omics data in biology, and practical dataset annotation sessions
Scientific topics: Data management, Biology, Bioinformatics
Keywords: metadata, data annotation, life science standards, data sharing
Resource type: Slides
Questions:
What is a tool for Galaxy?
How to build a tool/wrapper following good practices?
How to deal with the tool environment?
Objectives:
Discover what a wrapper is and how it is structured
Use the Planemo utilities to develop a good wrapper
Deal with the dependencies
Write functional...
Keywords: Galaxy
Not available
Keywords: RNA-seq
Questions
What is a Galaxy Interactive Tour?
How to create a Galaxy Interactive Tour?
Objectives
Discover what a Galaxy Interactive Tour is
Be able to create a Galaxy Interactive Tour
Be able to add a Galaxy Interactive Tour to a Galaxy instance
Keywords: Galaxy
No description available
Keywords: Copy number, Structural genomics
The first three steps of the TEdenovo pipeline follow this philosophy:
Detection of repeated sequences (potential TE)
Clustering of these sequences
Generation of consensus sequences for each cluster, representing the ancestral TE
Keywords: Annotation, Genomics
From Biotea to Bioschemas: definition of profiles required to represent scholarly publications
Keywords: biohackaton 2018
Cheese ripening is a complex biochemical process driven by microbial communities composed of both eukaryotes and prokaryotes. Surface-ripened cheeses are widely consumed all over the world and are appreciated for their characteristic flavor. Microbial community composition has been studied for a...
Keywords: metagenomics
Quality, normalisation and peak calling
Keywords: Chip-seq, Genomics
Opening an x2go session to the IFBcloud
Keywords: Cloud
Using the Gene-regulation appliance
1.1 Requirements
1.2 Virtual disk creation
1.3 Creation of an instance
1.4 Connection to the device
1.5 Download source data
1.6 Execute workflow
Visualizing results
2.1 Install and run the X2Go client on your host computer
2.2...
Keywords: Cloud Computing, Gene Regulation
Using blockchain in biomedical provenance, the identifiers use case
Keywords: biohackaton 2018
Improve Orphanet disease description knowledge by phenotypic automated recognition using scraping toolkits
Keywords: biohackaton 2018
Data visualization, quality control, normalization, peak calling
Peak annotation
From peaks to motifs
Keywords: Chip-Seq
Objectives:
Mapping the DNA-seq data to the reference genome
Process the alignments for the variant calling
Keywords: Alignment, DNA-seq, Genomics, Variant calling
HOVERGEN is a database containing homologous vertebrate protein and nucleotide sequences. It makes it easy to select similar gene sequences from a wide range of vertebrates. Hence it becomes particularly useful in comparative genomics, phylogeny and evolutionary studies on a molecular level....
Keywords: genomics, proteomics
Questions
Why Docker? What is it?
How to use Docker?
How to integrate Galaxy in Docker to facilitate its deployment?
Objectives
Docker basics
Galaxy Docker image (usage)
Galaxy Docker (internals)
Galaxy flavours
Keywords: Docker, Galaxy
Galaxy II: common tools, quality control, alignment, data management
Keywords: Genomics
Find Rapidly OTU with Galaxy Solution
Keywords: Galaxy, Metagenomics
Questions
How can visualization plugins benefit science?
Objectives
Implement a first Galaxy visualization
Understand the client side vs. server side principle
Keywords: Galaxy
PPF (Prokaryotic Phylogeny on the Fly) is an automated pipeline for computing molecular phylogenies for prokaryotic organisms. It is based on a set of specialized databases devoted to SSU rRNA, the most commonly used marker for bacterial taxonomic identification. Those databases are split...
Keywords: metagenomics
Get started with Docker!
Create a Docker account
Install Docker on your local host
Create shared repositories and download source data
Fetch the Docker image and run it with shared folders
Execute the pipeline
JVH / Mac
Keywords: Docker, Gene regulation
The interpretation of metagenomic data (environmental, microbiome, etc.) usually involves the recognition of sequence similarity with previously identified (micro-)organisms. This is for instance the main approach to taxonomical assignments and a starting point to most diversity analyses....
Keywords: metagenomics
Application of RDF-based models and tools for enhancing interoperable use of biomedical resources
Keywords: biohackaton 2018
Practical session on transcriptome de novo assembly
Keywords: RNA-seq
Introduction
Goal
The aim is to:
Get familiar with motif analysis of ChIP-seq data.
Learn de novo motif discovery methods.
In practice:
Motif discovery with peak-motifs
Differential analysis
Random controls
Keywords: Chip-seq, Motif analysis, NGS, Pattern recognition
The PASTEClassifier (Pseudo Agent System for Transposable Elements Classification) is a transposable element (TE) classifier searching for structural features and similarity to classify TEs (Hoede C. et al. 2014).
Keywords: Genomics, Transposons
The soil microorganisms are responsible for a range of critical functions including those that directly affect our quality of life (e.g., antibiotic production and resistance – human and animal health, nitrogen fixation – agriculture, pollutant degradation – environmental bioremediation)....
Keywords: metagenomics
Read mapping: from raw reads to aligned reads.
Peak calling: from aligned reads to regions/peaks of high read density.
ChIP-seq annotation
Identification of genes related to the peaks.
Profiles of ChIP-seq reads around reference points (TSS, histone marks).
Functional enrichment of the...
Keywords: Chip Seq, Motif Analysis, NGS
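To make the "from aligned reads to regions/peaks of high read density" idea concrete, here is a deliberately naive Python sketch. It counts read start positions in fixed-size windows, keeps windows that reach a threshold, and merges adjacent ones. The window size, the threshold and the read positions are arbitrary assumptions; real peak callers estimate background statistically rather than using a fixed cutoff, and this is not the method taught in the course.

```python
from collections import Counter


def call_peaks(read_starts, window=100, min_reads=3):
    """Return merged (start, end) regions whose windows hold >= min_reads reads.

    `read_starts` are alignment start positions on a single chromosome.
    """
    counts = Counter(pos // window for pos in read_starts)
    dense = sorted(w for w, n in counts.items() if n >= min_reads)
    peaks = []
    for w in dense:
        start, end = w * window, (w + 1) * window
        if peaks and peaks[-1][1] == start:
            # Adjacent dense window: extend the previous peak.
            peaks[-1] = (peaks[-1][0], end)
        else:
            peaks.append((start, end))
    return peaks


# Hypothetical read start positions.
reads = [105, 110, 150, 190, 205, 220, 260, 900, 1500, 1510, 1520]
peaks = call_peaks(reads)
print(peaks)
```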
The aim is to:
Understand how to process reads to obtain peaks (peak-calling).
Become familiar with differential analysis of peaks
In practice:
Obtain dataset from GEO
Analyze mapped reads
Obtain set(s) of peaks, handle replicates
Differential analysis of peaks
Keywords: Chip-seq, NGS, Peak calling
Design, describe, explore and model
Keywords: Genomics, RNA-seq
R and RStudio overview.
Keywords: Graphical analysis, R, Statistics
Data clearinghouse, validation and curation of BioSamples/ENA/Breeding API endpoints/MAR databases
Keywords: biohackaton 2018
Alternative episodes for the 4 Open Source Software (4OSS) lesson focused on different Open Source technologies: GitHub, Docker, Jupyter Notebook and so on
Keywords: biohackaton 2018
Installation and configuration of NGiNX for Galaxy
Keywords: Galaxy, NGiNX
Detection of Copy Number Variations
Keywords: DNA-seq
Intro to built-in datasets
Built-in data hierarchy
Some problems
Data Managers
Keywords: Galaxy
Visualisation of next-gen sequencing data with Integrative Genomics Viewer
Keywords: Data visualization, Genomics, NGS
Adding bioschemas markup to data repository
Keywords: biohackaton
Use cases:
Extract a subset of variants
Combine variants from several analyses
Compare obtained variants from several data types
Identify new variants compared to a reference list
Apply specific filters for Chip Design
Keywords: NGS, Variant calling
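Several of these use cases reduce to set operations once each variant is reduced to a key. The Python sketch below assumes a variant is identified by (chromosome, position, reference allele, alternate allele); the call sets and the reference list are invented for illustration, and in practice the records would be read from VCF files with a dedicated tool.

```python
# Hypothetical call sets; each variant is (chrom, pos, ref, alt).
dna_calls = {
    ("chr17", 100, "A", "G"),
    ("chr17", 250, "C", "T"),
    ("chr2", 40, "G", "GA"),
}
rna_calls = {
    ("chr17", 100, "A", "G"),
    ("chr17", 300, "T", "C"),
}
known = {("chr17", 100, "A", "G")}  # hypothetical reference list


def subset(variants, chrom, start, end):
    """Extract the variants located in chrom:start-end (inclusive)."""
    return {v for v in variants if v[0] == chrom and start <= v[1] <= end}


def is_snv(variant):
    """True for single-nucleotide variants (both alleles one base long)."""
    return len(variant[2]) == 1 and len(variant[3]) == 1


region = subset(dna_calls, "chr17", 1, 200)   # subset by localization
combined = dna_calls | rna_calls              # combine several analyses
common = dna_calls & rna_calls                # compare data types
novel = combined - known                      # new compared to the list
snvs = {v for v in combined if is_snv(v)}     # subset by type

print(len(combined), "combined,", len(common), "common,", len(novel), "novel")
```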
Understanding the interactions between microbial communities and their environment well enough to be able to predict diversity on the basis of physicochemical parameters is a fundamental pursuit of microbial ecology that still eludes us. However, modeling microbial communities is a complicated...
Keywords: metagenomics
Bioconda packaging of the Regulatory Sequence Analysis Tools (RSAT)
Keywords: biohackaton 2018
How to choose a database for Galaxy and configure it
Keywords: Galaxy
Practical work to introduce basic and advanced usage of the IFB cloud
How to launch virtual machines
Managing your data in the cloud
How to connect to your VMs (SSH, web, remote desktop)
Personalizing your VMs (approver, Galaxy, Docker)
Keywords: Cloud computing, Virtual machine
Galaxy is a web application that uses handlers to perform actions.
There are two main types of actions that are carried out by handlers:
Respond to user requests; these actions are carried out by web handlers.
Manage the execution of tools; these actions are performed by job handlers.
By...
Keywords: Galaxy
The last decade witnessed the discovery of four families of giant viruses infecting Acanthamoeba. They have genomes encoding from 500 to 2000 genes, a large fraction of which encode proteins of unknown origin. These unique proteins meant to recognize and manipulate the same building blocks as...
Keywords: metagenomics
We will analyze the copy number variations of a human tumor (parotid gland carcinoma), limited to chr17, from a WES (whole-exome sequencing) experiment. All genomic coordinates correspond to the 2009 build of the reference human genome (hg19 / GRCh37).
Keywords: Copy number, Structural genomics
The aim of this lecture is to present the impact of metagenomics and single-cell genomics on public databases. These new powerful approaches allow us to have access to the diversity of life on our planet. However, care has to be taken when using these data for posterior analyses, such as...
Keywords: metagenomics
In 2010, the MetaHIT consortium published a 3.3M microbiota gene catalog generated by whole genome shotgun metagenomic sequencing, representing a mixture of bacteria, archaea, parasites and viruses coming from 124 human stool metagenomic samples [Qin et al, Nature 2010].
However most of the genes...
Keywords: metagenomics
Docker is free software that automates the deployment of applications in software containers that run in isolation. A Docker container, unlike a traditional virtual machine, neither requires nor provides a separate operating system, but relies instead on the core functionality and uses the...
Keywords: Docker
Questions
What is a Tool Shed?
How to install tools and workflows from a Tool Shed into a Galaxy instance?
What are the Tool Shed repository types?
How to publish with Planemo?
Objectives
Discover what a Tool Shed is
Be able to install tools and workflows from a Tool Shed into a...
Keywords: Galaxy
Global Objective
Given a set of ChIP-seq peaks, annotate them in order to find associated genes, genomic categories and functional terms.
Keywords: Annotation, Chip-seq, Data Visualization, NGS
The Minimal Information About Plant Phenotyping Experiment, MIAPPE (www.miappe.org), has been designed by ELIXIR, EMPHASIS and Bioversity International to guide plant scientists in the management of experimental data. Furthermore, since genetic studies rely on the integration and the linking...
Scientific topics: Data submission, annotation, and curation, Data quality management, Phenomics, Plant biology
Resource type: Video, Slides
Be careful about experimental design: avoid putting all the replicates in the same lane, using the same barcode for the replicates, putting different numbers of samples in lanes, etc.
Non-uniformity of the per-base read distribution (Illumina Random Hexamer Priming bias visible on the 13...
Keywords: Differential Expression, RNA-seq, transcriptomics
Workflow 1: Rules and targets
Workflow 2: Introducing wildcards
Workflow 3: Keywords
Workflow 4: Combining rules
Workflow 5: Configuration file
Workflow 6: Separated files
Keywords: Gene regulation, Snakemake
The application of next-generation sequencing technologies to RNA or DNA directly extracted from a community of organisms yields a mixture of nucleotide fragments. The task to distinguish amongst these and to further categorize the families of ribosomal RNAs (or any other given marker) is an...
Keywords: metagenomics
Shotgun metagenomics provides insights into a larger context of naturally occurring microbial genomes when short reads are assembled into contiguous DNA segments (contigs). Contigs are often orders of magnitude longer than individual sequences, offering improved annotations, and key information...
Keywords: metagenomics
