A Critical Guide to the neXtProt knowledgebase: querying using SPARQL

This Critical Guide in the Introduction to Bioinformatics series briefly outlines how to explore the neXtProt human protein database using SPARQL. While text indexing has made database contents more accessible, combining search criteria for specific content permits more powerful querying and provides a means to mine the information stored in databases. This Guide illustrates the use of the SPARQL semantic query language to interrogate neXtProt and other databases that provide SPARQL endpoints.
Specifically, the Guide introduces the concept of database ‘semantic triples’, and examines features of the neXtProt data model. On reading this Guide, and completing the exercises, users will be able to: i) identify key entities within the neXtProt data model; ii) explain what these entities represent, what information they contain and what the information is used for; iii) identify key SPARQL syntax elements; iv) understand SPARQL tutorial examples; and v) write a SPARQL query to retrieve entries matching specific criteria.
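As a rough illustration of the 'semantic triple' idea, the sketch below stores facts as (subject, predicate, object) tuples and answers a query by matching several triple patterns that share a variable, which is broadly how a SPARQL WHERE clause combines search criteria. It is a toy in plain Python, not the Guide's material and not the neXtProt data model: the `ex:` predicates, the `entry:` identifiers and the helper names `match` and `query` are all hypothetical.

```python
# Toy triple store: each fact is a (subject, predicate, object) tuple.
# All identifiers are hypothetical; they are NOT real neXtProt terms.
TRIPLES = [
    ("entry:P1", "rdf:type", "ex:Entry"),
    ("entry:P1", "ex:gene", "GENE_A"),
    ("entry:P1", "ex:locatedIn", "nucleus"),
    ("entry:P2", "rdf:type", "ex:Entry"),
    ("entry:P2", "ex:gene", "GENE_B"),
    ("entry:P2", "ex:locatedIn", "membrane"),
]


def match(pattern, triple, bindings):
    """Match one triple pattern against one triple.

    Terms starting with '?' are variables. Returns the extended
    bindings on success, or None if the triple does not fit.
    """
    result = dict(bindings)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            # Bind the variable, or check it agrees with an earlier binding.
            if result.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return result


def query(patterns, triples=TRIPLES):
    """Return every set of variable bindings that satisfies all patterns."""
    solutions = [{}]
    for pattern in patterns:
        next_solutions = []
        for bindings in solutions:
            for triple in triples:
                extended = match(pattern, triple, bindings)
                if extended is not None:
                    next_solutions.append(extended)
        solutions = next_solutions
    return solutions


# Roughly analogous to this SPARQL (hypothetical vocabulary):
#   SELECT ?entry ?gene WHERE {
#     ?entry a ex:Entry ;
#            ex:locatedIn "nucleus" ;
#            ex:gene ?gene .
#   }
results = query([
    ("?entry", "rdf:type", "ex:Entry"),
    ("?entry", "ex:locatedIn", "nucleus"),
    ("?entry", "ex:gene", "?gene"),
])
print(results)
```

Because `?entry` appears in all three patterns, only subjects that satisfy every criterion survive, which is the sense in which combined criteria are more selective than a single text search.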

Scientific topics: Database management

Keywords: Human protein database, Introduction bioinformatics, Introduction neXtProt, neXtProt data model, RDF triples, Semantic triples, SPARQL queries, SPARQL syntax, Training material

Target audience: Beginners

Authors: Terri Attwood

Remote created date: 2019-06-06

URL: https://tess.elixir-europe.org/materials/a-critical-guide-to-the-nextprot-knowledgebase-querying-using-sparql