quantization
Recover-LoRA: Restoring Accuracy in 2-Bit LLMs
A new technique called Recover-LoRA uses low-rank adaptation and knowledge distillation on synthetic data to reclaim accuracy lost during aggressive 2-bit quantization of language models, enabling far more efficient deployment.