Lab Workshop: Multiple Models as Judges to Construct Labeled Datasets

How can researchers and students establish ground truth about the performance of language models on specific tasks?

This workshop demonstrates how LLM-as-a-Judge can turn large, unstructured collections of business documents into structured datasets. Participants will explore new Quinlan-built language models that extract and label workplace skills and tasks from natural language, review sample code and real-world use cases, and learn how to interpret outputs and compile summary reports using multiple LLM-as-a-Judge assessments of ground truth.
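As a rough illustration of combining multiple LLM-as-a-Judge assessments into one label, here is a minimal sketch that assumes each judge model has already returned a single label for a document. The function name `aggregate_judgments`, the judge names, and the majority-vote rule are hypothetical choices for this example, not the workshop's actual code or method.

```python
from collections import Counter


def aggregate_judgments(judgments):
    """Combine labels from several judges by majority vote.

    judgments: dict mapping a (hypothetical) judge name to the label it assigned.
    Returns (consensus_label, agreement), where agreement is the share of
    judges that voted for the most common label. A tie for first place
    yields a consensus_label of None so the item can be sent for review.
    """
    if not judgments:
        raise ValueError("at least one judgment is required")

    counts = Counter(judgments.values())
    ranked = counts.most_common()
    top_label, top_votes = ranked[0]
    agreement = top_votes / len(judgments)

    # A tie between the two most common labels means there is no consensus.
    if len(ranked) > 1 and ranked[1][1] == top_votes:
        return None, agreement
    return top_label, agreement


# Example: three hypothetical judges label one extracted phrase.
label, agreement = aggregate_judgments(
    {"judge_a": "skill", "judge_b": "skill", "judge_c": "task"}
)
print(label, round(agreement, 2))
```

Items with low agreement or no consensus are natural candidates for human review before they enter a labeled dataset.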

Event info:

Date: Wednesday, October 29, 2025
Time: Noon-1:00 p.m. (CT)
Format: Online

Speakers:

Research Paper: 
