ZurichNLP #12
Tue, 10 Sept | Zürich
Yannic Kilcher on OpenAssistant and Afra Amini on LLM alignment algorithms.
Time & Location
10 Sept 2024, 18:00 – 20:00
Zürich, OAT ETH Zurich (14th floor), Andreasstrasse 5, 8050 Zürich, Switzerland
About the Event
Yannic Kilcher (from DeepJudge) will discuss the OpenAssistant project, which collected over 10,000 answers and preference labels from volunteer annotators for fine-tuning LLMs.
Afra Amini (from ETH Zurich) will talk about Variational Best-of-N alignment:
"Best-of-N (BoN) is an effective and straightforward LLM algorithm for aligning language models to human preferences: at inference time, N samples are drawn from the language model, and the sample with the highest reward, as judged by a reward model, is returned as the output. Despite its effectiveness, BoN is computationally expensive; it reduces sampling throughput by a factor of N. In this talk, we will explore how to approximate the BoN algorithm by fine-tuning a language model. The goal is to make sampling a single output from the fine-tuned model equivalent to performing BoN on the reference model before alignment."