Recorded webinar
LLM-generated summaries for protein classification at InterPro
This webinar will explore how Large Language Models (LLMs) can accelerate protein classification by automatically generating descriptive annotations for previously unannotated protein families. Traditionally, curating protein family descriptions relies on manual literature review and expert knowledge, a time-consuming approach that often delays integration into biological databases. In this session, we will discuss our workflow, which uses LLMs to synthesise functional summaries from existing curated data and so streamlines the annotation process. We will also present a comparative evaluation of a state-of-the-art GPT model and a fine-tuned local model, demonstrating that smaller, cost-effective LLMs can produce high-quality descriptions that support rapid protein classification.
Resource type: Recorded webinar
Scientific topics: Machine learning