Recorded webinar

LLM generated summaries for protein classification at InterPro

This webinar will explore how Large Language Models (LLMs) can accelerate protein classification by automatically generating descriptive annotations for previously unannotated protein families. Traditionally, curating protein family descriptions relies on manual literature review and expert knowledge, a time-consuming approach that often delays integration into biological databases. In this session, we will discuss our workflow, which uses LLMs to synthesise functional summaries from existing curated data, thereby streamlining the annotation process. We will also present a comparative evaluation of a state-of-the-art GPT model and a fine-tuned local model, demonstrating that smaller, cost-effective LLMs can produce high-quality descriptions that support rapid protein classification.

Resource type: Recorded webinar

Scientific topics: Machine learning

