Together AI
Together AI Open-Sources OSCAR for 2-Bit KV Cache
Together AI has open-sourced OSCAR, an attention-aware 2-bit KV cache quantization system that slashes the memory footprint of the KV cache in long-context LLM serving while preserving accuracy on reasoning and retrieval benchmarks.
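To give a rough sense of the scale involved, the sketch below estimates KV cache size at 16 bits versus 2 bits per stored value. It is back-of-the-envelope arithmetic only, not OSCAR's implementation: the model shape, context length, group size, and per-group metadata layout are all hypothetical values chosen for illustration, and the announcement does not specify any of them.

```python
def kv_cache_bytes(num_layers, num_kv_heads, head_dim, seq_len, bits_per_value):
    """Bytes needed to store keys and values for one sequence."""
    # Factor of 2 covers both the key tensor and the value tensor.
    values = 2 * num_layers * num_kv_heads * head_dim * seq_len
    return values * bits_per_value / 8


def effective_bits(payload_bits, group_size, metadata_bits_per_group):
    """Average bits per value once per-group metadata is amortized in."""
    return payload_bits + metadata_bits_per_group / group_size


# Hypothetical model shape and context length (assumptions, not from the source).
config = dict(num_layers=32, num_kv_heads=8, head_dim=128, seq_len=131072)

fp16_bytes = kv_cache_bytes(**config, bits_per_value=16)

# Assumed generic scheme: 2-bit codes in groups of 64 values, each group
# carrying a 16-bit scale and a 16-bit zero point (32 metadata bits).
two_bit = effective_bits(payload_bits=2, group_size=64, metadata_bits_per_group=32)
quant_bytes = kv_cache_bytes(**config, bits_per_value=two_bit)

print(f"fp16 cache:  {fp16_bytes / 2**30:.2f} GiB")
print(f"2-bit cache: {quant_bytes / 2**30:.2f} GiB ({two_bit} bits/value)")
print(f"reduction:   {fp16_bytes / quant_bytes:.1f}x")
```

Under these assumptions the cache for a single 131,072-token sequence drops from 16.00 GiB to 2.50 GiB, a 6.4x reduction rather than the ideal 8x, because the scale and zero-point metadata add 0.5 bits per value. Actual savings depend on the quantization layout a given system uses.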