LLM Infrastructure
FPGA-Based CXL Memory Architecture Tackles LLM KV-Cache Bottleneck
New research proposes CXL-SpecKV, a disaggregated FPGA architecture using CXL memory pooling and speculative prefetching to overcome memory bottlenecks in large language model inference at datacenter scale.