Foundational Aspects of Machine Learning using Python
Abstract
Machine Learning is a subset of artificial intelligence in which algorithms are trained to learn patterns from data and to make predictions or decisions without being explicitly programmed for the task. It is now widely applied in fields ranging from healthcare and finance to autonomous vehicles and natural language processing.
About This Material
This is a hands-on tutorial from the Galaxy Training Network (GTN), usable either for individual self-study or as teaching material in a classroom.
Questions this will address
- How can we use Machine Learning to make more generalizable models?
- What are the key components of a supervised learning problem, and how do they influence model performance?
- How do classification and regression tasks differ in supervised learning, and what types of models are suitable for each?
- What strategies can we employ to ensure our Machine Learning models generalize well to unseen data?
- How can we use Machine Learning to make more generalizable models that perform well on diverse datasets?
- What are some practical steps for applying Machine Learning to real-world datasets, such as the transcriptomics dataset for predicting potato coloration?
Learning Objectives
- Understand and apply the general syntax and functions of the scikit-learn library to implement basic Machine Learning models in Python.
- Identify and explain the concepts of overfitting and underfitting in Machine Learning models, and discuss their implications for model performance.
- Analyze the need for regularization techniques and justify their importance in preventing overfitting and improving model generalization.
- Evaluate the effectiveness of cross-validation and test sets in assessing model performance and implement these techniques using scikit-learn.
- Compare different evaluation metrics and select appropriate metrics for imbalanced datasets, ensuring accurate and meaningful model assessment.
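As a rough illustration of how the objectives above fit together, the following minimal sketch combines a held-out test set, cross-validation, a regularized model, and a metric suited to imbalanced classes. It is not taken from the tutorial: the synthetic dataset, the choice of logistic regression, and the values of `C`, `cv`, and `test_size` are all assumptions made for the example.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import balanced_accuracy_score
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic, imbalanced toy data (roughly 90% / 10%); an assumption for
# illustration, not the tutorial's transcriptomics dataset.
X, y = make_classification(
    n_samples=500, n_features=20, n_informative=5,
    weights=[0.9, 0.1], random_state=0,
)

# Hold out a test set; stratify so both splits keep the class ratio.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Logistic regression is L2-regularized by default; a smaller C means
# stronger regularization. class_weight="balanced" reweights the rare class.
model = make_pipeline(
    StandardScaler(),
    LogisticRegression(C=0.1, class_weight="balanced", max_iter=1000),
)

# 5-fold cross-validation on the training data only, scored with balanced
# accuracy, which is less misleading than plain accuracy on imbalanced data.
cv_scores = cross_val_score(
    model, X_train, y_train, cv=5, scoring="balanced_accuracy"
)

# Final check on the untouched test set.
model.fit(X_train, y_train)
test_score = balanced_accuracy_score(y_test, model.predict(X_test))

print("CV balanced accuracy per fold:", cv_scores.round(3))
print("Test balanced accuracy:", round(test_score, 3))
```

A cross-validated score that is much higher than the test score would hint at overfitting; in that case, lowering `C` (stronger regularization) is one option to try.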
Licence: Creative Commons Attribution 4.0 International
Keywords: Statistics and machine learning, ai-ml, elixir, jupyter-notebook
Target audience: Students
Resource type: e-learning
Version: 3
Status: Active
Prerequisites:
- Introduction to Python
- Python - Warm-up for statistics and machine learning
Date modified: 2025-05-19
Date published: 2025-03-11
Contributors: Anup Kumar, Bérénice Batut, Saskia Hiltemann, Wandrille Duchemin
Scientific topics: Statistics and probability