ZurichNLP #18
Mon 24 Nov | Zürich
Philippe Bich (Huawei) on calibration-free low-precision LLMs, and Alejandro Hernández Cano (EPFL) on Apertus, the Swiss open LLM.


Time & Location
24 Nov 2025, 18:00 – 20:00
Zürich, OAT ETH Zurich (14th floor), Andreasstrasse 5, 8050 Zürich, Switzerland
About the Event
Philippe Bich (Huawei): SINQ: Sinkhorn-Normalized Quantization for calibration-free low-precision LLM weights
Quantization is a key technique for making Large Language Models (LLMs) and Vision-Language Models (VLMs) faster and more memory-efficient, but maintaining accuracy at low precision remains challenging. Outlier weights often dominate shared quantization scales, leading to large errors and degraded performance. We introduce SINQ, our novel open-source approach that makes post-training quantization both simpler and more accurate.
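To make the outlier problem concrete, here is a minimal numeric sketch (our illustration of standard absmax quantization, not material from the talk): a single large weight sets the shared scale, and every other weight sharing that scale collapses to zero.

```python
import numpy as np

# Toy illustration: one outlier in a weight row dominates a shared
# absmax scale under symmetric 4-bit quantization (levels -7..7).
w = np.array([0.5, -0.3, 0.8, -0.6, 100.0])  # 100.0 is the outlier

scale = np.abs(w).max() / 7                  # ~14.3, set entirely by the outlier
q = np.clip(np.round(w / scale), -7, 7)      # quantized integer codes
w_hat = q * scale                            # dequantized weights

print(q)      # [ 0. -0.  0. -0.  7.] -- every small weight rounds to 0
print(w_hat)  # [ 0. -0.  0. -0. 100.] -- large error on all non-outlier weights
```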
SINQ adds a second normalization axis and applies a fast Sinkhorn-Knopp-style algorithm to balance per-row and per-column variances in weight matrices. Tested on Qwen3 and DeepSeek-V2.5, SINQ improves perplexity on WikiText2 and C4 benchmarks without calibration data or model-specific tuning. It can quantize billion-parameter models in just a few seconds while achieving accuracy comparable to, or better than, existing methods.
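The abstract does not spell out the algorithm, but as a rough sketch of the idea it describes, the following assumes an alternating scheme that absorbs row and column standard deviations into per-row and per-column scale vectors. The function name and iteration details are our assumptions, not the official implementation; see the open-source SINQ release for the real thing.

```python
import numpy as np

def balance_variances(W, iters=20, eps=1e-8):
    """Hypothetical Sinkhorn-Knopp-style sketch: alternately absorb row
    and column standard deviations into scale vectors until the row and
    column variances of the rescaled matrix are balanced."""
    r = np.ones(W.shape[0])               # per-row scales
    c = np.ones(W.shape[1])               # per-column scales (the second axis)
    for _ in range(iters):
        r *= (W / np.outer(r, c)).std(axis=1) + eps
        c *= (W / np.outer(r, c)).std(axis=0) + eps
    return W / np.outer(r, c), r, c       # balanced matrix + scales to undo it

W = np.random.randn(64, 64)
W[0, 0] = 50.0                            # plant an outlier
Wb, r, c = balance_variances(W)
# Row and column stds of Wb are now nearly uniform, so a standard
# quantizer applied to Wb is no longer dominated by the outlier;
# r and c are kept to rescale the weights back at inference time.
print(Wb.std(axis=1).min(), Wb.std(axis=1).max())  # nearly equal
```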
Alejandro Hernández Cano (EPFL): Apertus, the Swiss open LLM
Abstract TBD