Date: 8 January 2026 @ 09:00 - 12:00

Timezone: Brussels

Duration: 3 hours

Language of instruction: English

Researchers often spend a significant amount of time on data wrangling tasks, such as reformatting, cleaning, and integrating data from different sources. Despite the availability of software tools, they often end up with workflows that are difficult to reuse and that require manual steps.

Omnipy is a new Python library that offers a systematic and scalable approach to building data pipelines. Through specification of data models/parsers, Omnipy allows researchers to import data in various formats and wrangle their data through stepwise transformations.

For automation with large data, Omnipy seamlessly scales up for deployment on remote infrastructures. This workshop will provide down-to-earth tutorials and examples to help data scientists from any field make use of Omnipy to wrangle real-world datasets into shape.

The workshop is divided into three parts:

The first part will introduce the concepts of models, datasets and tasks in Omnipy through small examples. We will also touch upon Python type hints and pydantic models as needed, as these are important building blocks for Omnipy.
In the second part, the participants will be provided with a rough example dataset that requires cleaning. As a hands-on exercise, they will carry out stepwise parsing and shaping of the data to make it comply with a specified metadata schema.
In the last part, the participants will be introduced to the metadata mapping functionalities in Omnipy and will be led through another hands-on exercise to set up a transformation that maps the data from one metadata schema to another.
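
As a rough illustration of the ideas behind the first two parts, the sketch below uses plain pydantic and Python type hints to parse messy records into a typed model. It does not use Omnipy's own API, and the `Sample` model, its fields, the `parse_records` helper and the example rows are all hypothetical.

```python
from typing import List, Optional, Tuple

from pydantic import BaseModel, ValidationError


class Sample(BaseModel):
    """Hypothetical target schema for one cleaned record."""

    sample_id: str
    age: int
    weight_kg: Optional[float] = None


def parse_records(rows: List[dict]) -> Tuple[List[Sample], List[dict]]:
    """Split raw rows into validated samples and rejected rows."""
    valid: List[Sample] = []
    rejected: List[dict] = []
    for row in rows:
        try:
            # pydantic converts compatible values (e.g. "34" to 34)
            # and raises ValidationError for values it cannot parse.
            valid.append(Sample(**row))
        except ValidationError:
            rejected.append(row)
    return valid, rejected


# Hypothetical rough input, as it might arrive from a spreadsheet export.
raw_rows = [
    {"sample_id": "S1", "age": "34", "weight_kg": "70.5"},  # numbers as strings
    {"sample_id": "S2", "age": "unknown"},                  # unparseable age
    {"sample_id": "S3", "age": 51},                         # weight missing
]

samples, rejected = parse_records(raw_rows)
```

Keeping the rejected rows, rather than discarding them, makes it possible to inspect and fix them in a later step.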

Contact: digitalscholarship@ub.uio.no

Venue: Moltke Moes vei 39

City: Oslo

Region: Oslo kommune

Country: Norway

Postcode: 0851

Prerequisites:

The participant should have some experience with Python programming/scripting. We will not spend time explaining basic syntax and concepts, other than what is related to type hints. Experience with type hints in Python is useful, but not required.
Participants should bring a laptop.
No software installation is required other than a modern browser.
We will use JupyterLab for the hands-on exercises. An online JupyterLab service will be made available, but participants can also install JupyterLab locally on their laptops if they prefer.

Learning objectives:

Introduction to Python type hints and pydantic models
How to use type hints to define models, datasets and tasks in Omnipy
How to wrangle a rough dataset into the shape required by a metadata schema
How to set up an executable mapping of data from one metadata schema to another
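
To give a sense of what an executable mapping between two metadata schemas can look like, here is a minimal sketch in plain Python with type hints. It does not use Omnipy's mapping functionality; both schemas, their field names and the date format are assumptions made up for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SourceRecord:
    """Hypothetical record in the source metadata schema."""

    id: str
    organism: str
    collected: str  # date written as "D.M.YYYY"


@dataclass
class TargetRecord:
    """Hypothetical record in the target metadata schema."""

    sample_id: str
    species: str
    collection_date: str  # ISO 8601 date, "YYYY-MM-DD"


def to_iso_date(value: str) -> str:
    """Convert a "D.M.YYYY" date string to "YYYY-MM-DD"."""
    day, month, year = value.split(".")
    return f"{year}-{month.zfill(2)}-{day.zfill(2)}"


def map_record(src: SourceRecord) -> TargetRecord:
    """Rename fields and normalise values from source to target schema."""
    return TargetRecord(
        sample_id=src.id,
        species=src.organism.strip().capitalize(),
        collection_date=to_iso_date(src.collected),
    )


source: List[SourceRecord] = [SourceRecord("S1", " homo sapiens ", "8.1.2026")]
mapped: List[TargetRecord] = [map_record(record) for record in source]
```

Writing the mapping as a typed function keeps the field renames and value conversions in one place, so the same transformation can be rerun whenever the source data changes.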

Organizer: The workshop is provided by the Oslo node of ELIXIR Norway as part of an extended event organised by Digital Scholarship Center (DSC), Carpentry@UiO, IFI (UiO), dScience, NAIC, Humit, PSI, ISS, USIT, NORRN, CodeRefinery, ELIXIR Oslo, University of Oslo Library and BærUt!

Host institutions: University of Oslo

Target audience: PhD, Postdoctoral Fellows, Researchers, Engineers

Capacity: 20

Event types:

Workshops and courses

Cost basis: Free to all

Sponsors: Digital Scholarship Centre (University of Oslo)

Scientific topics: Data curation and archival, Data identity and mapping, Data quality management, Data governance, Workflows

Operations: Data handling

External resources:

omnipy

Building Scalable and Maintainable Data Pipelines with Omnipy (Part 1)
