ZurichNLP #18
Mon 24 Nov | Zürich
Philippe Bich (Huawei) on calibration-free low-precision LLMs, and Alejandro Hernández Cano (EPFL) on Apertus, the Swiss open LLM.


Time & Location
24 Nov 2025, 18:00 – 20:00
Zürich, OAT ETH Zurich (14th floor), Andreasstrasse 5, 8050 Zürich, Switzerland
About the Event
Philippe Bich (Huawei): SINQ: Sinkhorn-Normalized Quantization for calibration-free low-precision LLM weights
Quantization is a key technique for making Large Language Models (LLMs) and Vision-Language Models (VLMs) faster and more memory-efficient, but maintaining accuracy at low precision remains challenging. Outlier weights often dominate shared quantization scales, leading to large errors and degraded performance. We introduce SINQ, our novel open-source approach that makes post-training quantization both simpler and more accurate.
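To make the outlier problem concrete, here is a minimal numeric sketch (our illustration of standard absmax quantization, not material from the talk): a single large weight sets the shared scale, and every other weight sharing that scale collapses to zero.

```python
import numpy as np

# Toy illustration: one outlier in a weight row dominates a shared
# absmax scale under symmetric 4-bit quantization (levels -7..7).
w = np.array([0.5, -0.3, 0.8, -0.6, 100.0])  # 100.0 is the outlier

scale = np.abs(w).max() / 7                  # ~14.3, set entirely by the outlier
q = np.clip(np.round(w / scale), -7, 7)      # quantized integer codes
w_hat = q * scale                            # dequantized weights

print(q)      # [ 0. -0.  0. -0.  7.] -- every small weight rounds to 0
print(w_hat)  # [ 0. -0.  0. -0. 100.] -- large error on all non-outlier weights
```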
SINQ adds a second normalization axis and applies a fast Sinkhorn-Knopp-style algorithm to balance per-row and per-column variances in weight matrices. Tested on Qwen3 and DeepSeek-V2.5, SINQ improves perplexity on WikiText2 and C4 benchmarks without calibration data or model-specific tuning. It can quantize billion-parameter models in just a few seconds while achieving accuracy comparable to, or better than, existing methods.
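The abstract does not spell out the algorithm, but as a rough sketch of the idea it describes, the following assumes an alternating scheme that absorbs row and column standard deviations into per-row and per-column scale vectors. The function name and iteration details are our assumptions, not the official implementation; see the open-source SINQ release for the real thing.

```python
import numpy as np

def balance_variances(W, iters=20, eps=1e-8):
    """Hypothetical Sinkhorn-Knopp-style sketch: alternately absorb row
    and column standard deviations into scale vectors until the row and
    column variances of the rescaled matrix are balanced."""
    r = np.ones(W.shape[0])               # per-row scales
    c = np.ones(W.shape[1])               # per-column scales (the second axis)
    for _ in range(iters):
        r *= (W / np.outer(r, c)).std(axis=1) + eps
        c *= (W / np.outer(r, c)).std(axis=0) + eps
    return W / np.outer(r, c), r, c       # balanced matrix + scales to undo it

W = np.random.randn(64, 64)
W[0, 0] = 50.0                            # plant an outlier
Wb, r, c = balance_variances(W)
# Row and column stds of Wb are now nearly uniform, so a standard
# quantizer applied to Wb is no longer dominated by the outlier;
# r and c are kept to rescale the weights back at inference time.
print(Wb.std(axis=1).min(), Wb.std(axis=1).max())  # nearly equal
```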
Alejandro Hernández Cano (EPFL): Apertus, the Swiss open LLM
Abstract TBD