hands-on tutorial

Sequence data submission to ENA

DNA sequencing has become one of the key technologies in molecular biology, with applications in diagnostics, evolutionary biology, drug discovery, forensics and much more. Falling sequencing costs and breakthroughs in sequencing technologies have led to increasing use of sequencing as a research tool, and it now features in thousands of life-science publications every year.

Before publication, many journals and funders require authors to submit their raw sequence data to one of the three INSDC member databases (ENA, NCBI or DDBJ), between which data is synchronised daily. INSDC is the core infrastructure for sharing nucleotide sequence data and metadata in the public domain. Data in INSDC member databases is available permanently, free of charge and with unrestricted access. Each submitted sequence is issued a unique accession number, which can be reported in the publication.

The three databases have different submission methods. If your database of choice is ENA and you need to submit data stored on a remote server, you are in the right place. This tutorial covers how to find your way around the ENA Webin portal to upload raw sequencing read data and the accompanying metadata, and how to use cURL to copy read files to ENA’s FTP server.
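As a rough illustration of the file-transfer step, the sketch below uploads read files over FTP with TLS and records an MD5 checksum for each file, which can then be quoted when the files are registered. It uses Python's standard `ftplib` rather than the cURL commands the tutorial itself teaches. The host name `webin2.ebi.ac.uk`, the function names and the choice to upload into the account's top-level directory are assumptions that do not come from this text; check the current ENA documentation before relying on them.

```python
import hashlib
from ftplib import FTP_TLS
from pathlib import Path

# Assumption: this host name is not given in the text above; confirm it
# against the current ENA documentation before use.
ENA_FTP_HOST = "webin2.ebi.ac.uk"


def md5_of(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Chunked reads keep memory use flat for large FASTQ/BAM files.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def upload_reads(paths, user, password, host=ENA_FTP_HOST):
    """Upload read files over FTP with TLS; return {file name: md5}.

    Hypothetical helper: files are stored under their own names in the
    directory the server places the account in after login.
    """
    checksums = {}
    with FTP_TLS(host) as ftp:
        ftp.login(user, password)
        ftp.prot_p()  # protect the data channel as well as the login
        for path in map(Path, paths):
            checksums[path.name] = md5_of(path)
            with open(path, "rb") as handle:
                ftp.storbinary(f"STOR {path.name}", handle)
    return checksums
```

A call such as `upload_reads(["sample_1.fastq.gz"], "Webin-XXXXX", "password")` (placeholder file name and credentials) would return a dictionary mapping each uploaded file name to its checksum.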

If you would like to use Galaxy tools for submission to ENA, you may find the "Submitting sequence data to ENA" tutorial helpful.

DOI: https://gxy.io/GTN:T00369

Licence: Creative Commons Attribution 4.0 International

Keywords: FAIR, ENA, sequence datasets

Resource type: hands-on tutorial

Version: 1

Status: Active

Learning objectives:

To populate ENA metadata objects through the Webin portal

To submit raw reads to ENA using FTP

Date created: 2023-11-01

Date modified: 2023-11-01

Date published: 2023-11-01

Authors: Sonal Henson

Contributors: Katarzyna Kamieniecka, Munazah Andrabi, Krzysztof Poterlowicz

