43 materials found

Content provider: CSC - IT Center for Science or Data Carpentry


Detecting differentially expressed genes with RNA-seq 11.9.2019

Scientific topics: RNA-Seq

Detecting differentially expressed genes with RNA-seq 11.9.2019 https://tess.elixir-europe.org/materials/detecting-differentially-expressed-genes-with-rna-seq This workshop introduces the participants to RNA-seq data analysis methods, tools and file formats. It covers the whole workflow from quality control and alignment to quantification and differential gene expression analysis. The workshop consists of lectures and practical exercises. The free and user-friendly Chipster software is used in the exercises, so no previous knowledge of Unix or R is required, and the workshop is thus suitable for everybody. RNA-Seq
Single cell RNA-seq data analysis using Chipster

Keywords: scRNA-seq

Resource type: Slides, Training materials

Single cell RNA-seq data analysis using Chipster https://tess.elixir-europe.org/materials/single-cell-rna-seq-data-analysis-using-chipster This course introduces single cell RNA-seq data analysis. It covers the processing of transcript counts from quality control and filtering to dimensional reduction, clustering, and differential expression analysis. You will also learn how to do integrated analysis of two samples. We use Seurat v3 tools embedded in the user-friendly Chipster software. scRNA-seq Biologists bioinformaticians
Introduction to using cloud and containers for training - OpenStack and Docker oriented view

Keywords: Cloud computing, Containers

Resource type: Slides, course materials

Introduction to using cloud and containers for training - OpenStack and Docker oriented view https://tess.elixir-europe.org/materials/introduction-to-using-cloud-and-containers-for-training-openstack-and-docker-oriented-view This material is based on CSC's [Pouta cloud course](https://www.csc.fi/en/web/training/-/pouta-cloud-course-2018), which consists of lectures and hands-on [exercises](https://chipster.csc.fi/material/cloud/exercises.pdf) on creating and managing virtual resources in OpenStack (VMs, volumes, networks, security groups, VM snapshots, etc.). It also covers topics such as orchestration with Heat and accessing Pouta Object storage. The original material was produced by Shubham Kapoor and Johan Guldmyr, and it was tailored for trainers by Jarno Laitinen. Eija Korpelainen Cloud computing, Containers Trainers
Single cell RNA-seq data analysis with R

Scientific topics: RNA-Seq

Keywords: RNA-Seq, Single Cell technologies, scRNA-seq

Resource type: course materials

Single cell RNA-seq data analysis with R https://tess.elixir-europe.org/materials/single-cell-rna-seq-data-analysis-with-r This hands-on course introduces the participants to single cell RNA-seq data analysis concepts and popular tools and R packages. It covers the preprocessing steps from raw sequence reads to expression matrix as well as clustering, cell type identification, differential expression analysis and pseudotime analysis. Eija Korpelainen RNA-Seq RNA-Seq, Single Cell technologies, scRNA-seq bioinformaticians Biologists
Single cell RNA-seq data analysis with Chipster

Scientific topics: RNA-Seq

Keywords: RNA-Seq, Single Cell technologies, scRNA-seq

Resource type: course materials, Video

Single cell RNA-seq data analysis with Chipster https://tess.elixir-europe.org/materials/single-cell-rna-seq-data-analysis-with-chipster-6cc8f0fb-1c92-444b-ab19-b04fe6454430 This course introduces single cell RNA-seq data analysis methods, tools and file formats. It covers the preprocessing steps of DropSeq data from raw reads to a digital gene expression matrix (DGE), and how to find sub-populations of cells using clustering with the Seurat tools. You will also learn how to compare two samples and detect conserved cluster markers and differentially expressed genes in them. The user-friendly Chipster software is used in the exercises, so no Unix or R experience is required and the course is thus suitable for everybody. Eija Korpelainen RNA-Seq RNA-Seq, Single Cell technologies, scRNA-seq Biologists bioinformaticians
Python for Social Science Data: Instructor Notes

Python for Social Science Data: Instructor Notes https://tess.elixir-europe.org/materials/python-for-social-science-data-instructor-notes pip is referred to in the text, but it shouldn’t need to be used. It is assumed that Jupyter notebooks will be used for all of the coding. (The shell is used in explaining the REPL.) How to start Jupyter is included in the setup instructions. All of the datasets used have been placed in the data folder. They should be downloaded to the local machine before use.
R for Social Scientists: Instructor Notes

R for Social Scientists: Instructor Notes https://tess.elixir-europe.org/materials/r-for-social-scientists-instructor-notes This lesson uses SAFI_clean.csv. The direct download link for this file is: https://ndownloader.figshare.com/files/11492171. When the time comes in the lesson to use this file, we recommend that the instructors place the download.file() command in the Etherpad, and that the learners copy and paste it into their scripts to download the file directly from figshare into their working directory. If the learners haven’t created the data/ directory and/or are not in the correct working directory, the download.file() command will produce an error. Therefore, it is important to use the stickies at this point. Some learners may have previous R installations. On Mac, if a new install is performed, the learner’s system will create a symbolic link, pointing to the new install as ‘Current.’ Sometimes this process does not occur, and, even though a new R is installed and can be accessed via the R console, RStudio does not find it. The net result of this is that the learner’s RStudio will be running an older R install. This will cause package installations to fail. This can be fixed at the terminal. First, check for the appropriate R installation in the library; we are currently using R 3.x.y. If it isn’t there, they will need to install it. If it is present, you will need to set the symbolic link to Current to point to the 3.x.y directory. Then restart RStudio.
Data Organization in Spreadsheets for Social Scientists: Instructor Notes

Data Organization in Spreadsheets for Social Scientists: Instructor Notes https://tess.elixir-europe.org/materials/data-organization-in-spreadsheets-for-social-scientists-instructor-notes The challenge with this lesson is that the instructor’s version of the spreadsheet software is going to look different than about half the room’s. This makes it challenging to show where to find menu options and how to navigate through them. Instead, discuss the concepts of quality control, and how things like sorting can help you find outliers in your data. Provide information on setting up your environment for learners to view your live coding (increasing text size, changing text color, etc.), as well as general recommendations for working with coding tools to best suit the learning environment. The main challenge with this lesson is that Excel looks very different, and works differently, between Mac and PC and between different versions of Excel. So, the presenter’s environment will only be the same as that of some of the learners. We need better notes and screenshots of how things work on both Mac and PC. But we likely won’t be able to cover all the different versions of Excel.
OpenRefine for Social Science Data: Instructor Notes

OpenRefine for Social Science Data: Instructor Notes https://tess.elixir-europe.org/materials/openrefine-for-social-science-data-instructor-notes There is a separate file for the setup instructions for installing OpenRefine (setup). Introduction; Working with OpenRefine; Filtering and Sorting; Examining Numbers in OpenRefine
Community analysis of amplicon sequencing data (16S rRNA)

Resource type: course materials, Video

Community analysis of amplicon sequencing data (16S rRNA) https://tess.elixir-europe.org/materials/community-analysis-of-amplicon-sequencing-data-16s-rrna This course introduces community analysis of amplicon sequencing data (16S rRNA). It covers preprocessing, taxonomic classification, and statistical analysis for marker gene studies. The user-friendly Chipster software is used in the exercises, so no Unix or R experience is required and the course is thus suitable for everybody.
Virus detection using small RNA-seq

Scientific topics: RNA-Seq

Resource type: course materials, Video

Virus detection using small RNA-seq https://tess.elixir-europe.org/materials/virus-detection-using-small-rna-seq This course introduces the VirusDetect pipeline, covering all the analysis steps and file formats. VirusDetect allows you to detect known viruses and identify new ones by sequencing small RNAs (siRNA) in host samples. siRNA sequences are assembled into contigs and compared to known virus sequences. The user-friendly Chipster software is used in the exercises, so no Unix or R experience is required and the course is thus suitable for everybody. Eija Korpelainen RNA-Seq
RNA-seq data analysis

Scientific topics: RNA-Seq

Resource type: course materials, Video

RNA-seq data analysis https://tess.elixir-europe.org/materials/rna-seq-data-analysis-with-chipster This course introduces RNA-seq data analysis methods, tools and file formats. It covers all the steps from quality control and alignment to quantification and differential expression analysis, and experimental design is also discussed. The user-friendly Chipster software is used in the exercises, so no Unix or R experience is required and the course is thus suitable for everybody. Eija Korpelainen RNA-Seq
Python for Ecologists: Glossary

Python for Ecologists: Glossary https://tess.elixir-europe.org/materials/python-for-ecologists-glossary The SciPy ecosystem for Python provides the tools necessary for scientific computing. Jupyter Notebook and the Spyder IDE are great tools to code in and interact with Python. With its large community, it is easy to find help on the internet.
Intro to Geospatial Data with R

Intro to Geospatial Data with R https://tess.elixir-europe.org/materials/intro-to-geospatial-data-with-r A single raster file can contain multiple bands or layers. Spatial objects in sf are similar to standard data frames except for a geometry list-column. It is important to know the projection (if any) of your point data prior to converting to a spatial object. CRAN Spatial Task View; Geocomputation with R
R for Social Scientists: Glossary

R for Social Scientists: Glossary https://tess.elixir-europe.org/materials/r-for-social-scientists-glossary Use install.packages() to install packages (libraries). Access individual values by location using []. Access slices of data using [low:high]. Access arbitrary sets of data using [c(...)]. Use logical operations and logical vectors to access subsets of data.
Python for Social Science Data: Glossary

Python for Social Science Data: Glossary https://tess.elixir-europe.org/materials/python-for-social-science-data-glossary The REPL (Read-Eval-Print Loop) allows rapid development and testing of code segments. Jupyter notebooks build on the REPL concepts and allow code results and documentation to be maintained together and shared. Jupyter notebooks are a complete IDE (Integrated Development Environment). The Jupyter environment can be used to write code segments and display results. Data types in Python are implicit, based on variable values.
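The last glossary point, that Python data types follow from the values assigned, can be illustrated with a short sketch. The variable names and values below are made up for illustration and are not taken from the lesson.

```python
# Python infers a variable's type from the value assigned to it;
# no type declaration is written. All names here are illustrative only.
count = 3                # an int
ratio = 3 / 2            # a float (true division yields a float)
label = "survey"         # a str
flags = [True, False]    # a list

for name, value in [("count", count), ("ratio", ratio),
                    ("label", label), ("flags", flags)]:
    print(f"{name}: {type(value).__name__}")

# Rebinding the same name to a different kind of value changes its type.
count = "three"
print(f"count is now: {type(count).__name__}")
```

Running this in a Jupyter cell or at the REPL prints the inferred type name for each variable.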
SQL for Social Science Data: Glossary

SQL for Social Science Data: Glossary https://tess.elixir-europe.org/materials/sql-for-social-science-data-glossary SQL (Structured Query Language) is used to extract data from the tables. A schema for a table has to be created before data can be added. The schema can be used to provide some data validation on input. The DB Browser for SQLite application allows you to connect to an existing database or create a new database. When connected to a database, you can create new tables.
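As a rough illustration of the schema points above, the sketch below uses Python's built-in sqlite3 module rather than DB Browser for SQLite. The table name, columns and constraints are hypothetical, chosen only to show a schema being created first and then rejecting invalid rows.

```python
import sqlite3

# Throwaway in-memory database; nothing is written to disk.
conn = sqlite3.connect(":memory:")

# The schema has to exist before any row can be inserted.
# Table and column names are hypothetical.
conn.execute(
    """
    CREATE TABLE households (
        id INTEGER PRIMARY KEY,
        village TEXT NOT NULL,
        members INTEGER CHECK (members > 0)
    )
    """
)

conn.execute(
    "INSERT INTO households (village, members) VALUES (?, ?)",
    ("Village A", 4),
)

# Rows that violate the schema's constraints are rejected on input.
rejected = []
for row in [(None, 3), ("Village B", 0)]:
    try:
        conn.execute(
            "INSERT INTO households (village, members) VALUES (?, ?)", row
        )
    except sqlite3.IntegrityError as err:
        rejected.append((row, str(err)))

row_count = conn.execute("SELECT COUNT(*) FROM households").fetchone()[0]
print(row_count, "row stored;", len(rejected), "rows rejected")
```

Here the NOT NULL and CHECK constraints stand in for the "data validation on input" that the glossary mentions; a real survey table would need constraints suited to its own data.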
OpenRefine for Social Science Data: Glossary

OpenRefine for Social Science Data: Glossary https://tess.elixir-europe.org/materials/openrefine-for-social-science-data-glossary OpenRefine will automatically track any steps, allowing you to backtrack as needed and providing a record of all work done. OpenRefine can import a variety of file types. OpenRefine can be used to explore data using filters. Clustering in OpenRefine can help to identify different values that might mean the same thing. OpenRefine can transform the values of a column.
Data Organization in Spreadsheets for Social Scientists

Data Organization in Spreadsheets for Social Scientists https://tess.elixir-europe.org/materials/lesson-title-glossary Never modify your raw data. Always make a copy before making any changes. Keep track of all of the steps you take to clean your data. Organize your data according to tidy data principles. Record metadata in a separate plain text file. Avoid using multiple tables within one spreadsheet.
Metagenomics data analysis

Scientific topics: Metagenomics

Resource type: Video, course materials

Metagenomics data analysis https://tess.elixir-europe.org/materials/metagenomics-data-analysis This course covers metagenomics analysis from quality control, filtering and assembly to taxonomic classification, functional assignment and comparative metagenomics. In addition to covering the analysis of whole genome shotgun sequencing data, the course also has a module on community analysis of amplicon sequencing data (16S rRNA). Finally, international databases and standards for storing the data are introduced. The course material includes slides, exercises and lecture videos. The workshop is organized in collaboration with the ELIXIR EXCELERATE project and PRACE, and it is part of the PRACE Advanced Training Centre activity. Eija Korpelainen Metagenomics
Cloud Genomics: Instructor Notes; Cloud Genomics Pre-Workshop; During the workshop

Cloud Genomics: Instructor Notes; Cloud Genomics Pre-Workshop; During the workshop https://tess.elixir-europe.org/materials/cloud-genomics-instructor-notescloud-genomics-pre-workshopduring-the-workshop VM Image Directories: A high-level listing of the directory tree from the dcuser account is shown below. Please note that it may be subject to change over time, but we’ll try to remember to update this doc. We had a couple of instances die as we were going through our workshop.
Wrangling Genomics: Glossary

Wrangling Genomics: Glossary https://tess.elixir-europe.org/materials/wrangling-genomics-glossary for loops let you perform the same set of operations on multiple files with a single command. The options you set for the command-line tools you use are important! Data cleaning is an essential step in a genomics workflow. Bioinformatics command line tools are collections of commands that can be used to carry out bioinformatics analyses. To use most powerful bioinformatics tools, you’ll need to use the command line.
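The lesson teaches this pattern with shell for loops. As an assumed translation into Python, and not the lesson's own code, the sketch below applies one operation to every matching file. The file names, the placeholder records and the count_reads helper are all hypothetical.

```python
from pathlib import Path
import tempfile


def count_reads(fastq_path: Path) -> int:
    """Count records in a FASTQ file, assuming four lines per record."""
    with fastq_path.open() as handle:
        return sum(1 for _ in handle) // 4


# Build two tiny placeholder files so the sketch runs anywhere.
workdir = Path(tempfile.mkdtemp())
record = "@read\nACGT\n+\nIIII\n"
(workdir / "sample_a.fastq").write_text(record * 2)
(workdir / "sample_b.fastq").write_text(record * 3)

# One loop applies the same operation to every matching file.
read_counts = {}
for path in sorted(workdir.glob("*.fastq")):
    read_counts[path.name] = count_reads(path)

print(read_counts)
```

The same shape (a glob pattern, a loop, one operation per file) carries over to real tools, where the loop body would call the command of interest with carefully chosen options.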
Cloud Genomics: Glossary

Cloud Genomics: Glossary https://tess.elixir-europe.org/materials/cloud-genomics-glossary You can use one set of log-in credentials for many instances. Logging off an instance is not the same as turning off an instance. Always check a new instance to verify it started correctly. Using a program like tmux can keep your work going even if your internet connection is bad. No matter which way you want to move data, it’s easier to start the transfer from your local machine.
Shell Genomics: Instructor Notes

Shell Genomics: Instructor Notes https://tess.elixir-europe.org/materials/shell-genomics-instructor-notes This lesson will introduce learners to fundamental skills needed for working with their computers through a command-line interface (using the bash shell). They will learn how to navigate their file system, computationally manipulate their files (e.g. copying, moving, renaming), search files, redirect output and write shell scripts. By the end of the lesson, learners will be prepared to move on to using more advanced bioinformatic command line tools (see the lesson on Data Wrangling and Processing). This lesson is meant to be taught in its entirety. For novice learners, schedule around 4 hours for this lesson. If your learners are already somewhat familiar with the bash shell, the earlier episodes can be condensed. This lesson uses data hosted on an Amazon Machine Image (AMI). Instructors will be sent information on how to log in to the AMI by the workshop coordinator a few days before the workshop. If you are running a self-organized workshop, register the workshop with our self-organized workshop form and send us an email at team@datacarpentry.org with information on how many people you expect to have at the workshop, and we’ll start instances for you to use in the workshop. The day before the workshop, we’ll send you the login information for your learners. Learners will work through an Amazon Web Services (AWS) instance for this lesson. The workshop coordinator will set up AWS instances for your workshop a few days ahead of time. Put the links for all instances on your workshop Etherpad and have learners put their name next to the instance they will use. This prevents learners from accidentally messing up another learner’s filesystem. The workshop coordinator usually sets up more AWS instances than needed for the registered learners. If a learner accidentally deletes or overwrites data files, you can have them change to a different AWS instance.
Genomics Organization: Instructor Notes

Genomics Organization: Instructor Notes https://tess.elixir-europe.org/materials/genomics-organization-instructor-notes Discussions can happen between neighbors in a workshop. After the paired discussion, there can be a short general discussion of the types of things that came up. You could also have people enter responses to the discussion in the workshop Etherpad, or capture the general responses there. The Etherpad is then a resource for learners after the workshop.
Shell Genomics: Glossary

Shell Genomics: Glossary https://tess.elixir-europe.org/materials/shell-genomics-glossary Useful commands for navigating your file system include: ls, pwd, and cd. Most commands take options (flags) which begin with a -. Tab completion can reduce errors from mistyping and make work more efficient in the shell. The /, ~, and .. characters represent important navigational shortcuts. Hidden files and directories start with . and can be viewed using ls -a.
Genomics Organization: Glossary

Genomics Organization: Glossary https://tess.elixir-europe.org/materials/genomics-organization-glossary Tabular data needs to be structured to be able to work with it effectively. Data being sent to a sequencing center also needs to be structured so you can use it. Raw sequencing data should be kept raw somewhere, so you can always go back to the original files. Public data repositories are a great source of genomic data.
SQL for Ecology: Glossary

SQL for Ecology: Glossary https://tess.elixir-europe.org/materials/sql-for-ecology-glossary A relational database is made up of tables which are related to each other by shared keys. Different database management systems (DBMS) use slightly different vocabulary, but they are all based on the same ideas. It is useful to apply conventions when writing SQL queries to aid readability. Use logical connectors such as AND or OR to create more complex queries. Calculations using mathematical symbols can also be performed in SQL queries.
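The points about logical connectors and in-query calculations can be sketched with Python's built-in sqlite3 module. The table, columns and rows below are invented for illustration and are not the lesson's dataset.

```python
import sqlite3

# Throwaway in-memory database with a hypothetical table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE surveys (species_id TEXT, year INTEGER, weight REAL)")
conn.executemany(
    "INSERT INTO surveys VALUES (?, ?, ?)",
    [
        ("DM", 1999, 42.0),
        ("DO", 2001, 51.0),
        ("PE", 2001, 20.0),
        ("DM", 2002, 45.0),
    ],
)

# Keywords in capitals, one clause per line: a common readability convention.
# AND / OR combine conditions; weight / 1000.0 is a calculation in the query.
query = """
    SELECT species_id, year, weight / 1000.0 AS weight_kg
    FROM surveys
    WHERE year >= 2000 AND (species_id = 'DM' OR species_id = 'DO')
    ORDER BY year
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

The parentheses around the OR condition matter: without them, AND binds more tightly than OR and the year filter would apply only to the first species.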
Open Refine for Ecology: Glossary

Open Refine for Ecology: Glossary https://tess.elixir-europe.org/materials/open-refine-for-ecology-glossary OpenRefine will automatically track any steps you take in working with your data. Faceting and clustering approaches can identify errors or outliers in data. OpenRefine provides a way to sort and filter data without affecting the raw data. OpenRefine also provides ways to get overviews of numerical data. All changes are being tracked in OpenRefine, and this information can be used for scripts for future analyses or reproducing an analysis.
Data Organization in Spreadsheets: Glossary

Data Organization in Spreadsheets: Glossary https://tess.elixir-europe.org/materials/data-organization-in-spreadsheets-glossary Never modify your raw data. Always make a copy before making any changes. Keep track of all of the steps you take to clean your data. Organize your data according to tidy data principles. Avoid using multiple tables within one spreadsheet. Avoid spreading data across multiple tabs (but do use a new tab to record data cleaning or manipulations).