New KV cache compaction technique cuts LLM memory 50x without accuracy loss (venturebeat.com)
15 points by mellosouls 3 days ago | 2 comments
westurner 2 days ago
ScholarlyArticle: "Fast KV Compaction via Attention Matching" (2025)
https://arxiv.org/abs/2602.16284
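The linked paper's title suggests pruning a KV cache by matching attention patterns. As a rough illustration of the general idea (not the paper's actual algorithm), here is a minimal sketch of attention-score-based KV eviction: score each cached entry by the attention mass a query gives it, then keep only the top few percent. The function name and `keep_ratio` parameter are assumptions for illustration; a 2% keep ratio corresponds to the headline 50x memory reduction.

```python
import numpy as np

def compact_kv_cache(keys, values, query, keep_ratio=0.02):
    """Keep only the KV entries receiving the most attention mass.

    keys, values: (T, d) arrays for T cached tokens; query: (d,) vector.
    Illustrative eviction heuristic only -- NOT the paper's method.
    """
    d = keys.shape[1]
    logits = keys @ query / np.sqrt(d)        # scaled dot-product scores, (T,)
    weights = np.exp(logits - logits.max())
    weights /= weights.sum()                  # softmax attention weights
    k = max(1, int(len(keys) * keep_ratio))   # e.g. keep 2% of entries (~50x)
    keep = np.sort(np.argsort(weights)[-k:])  # top-k positions, order preserved
    return keys[keep], values[keep], keep
```

In a real decoder the scoring would aggregate attention across many queries and heads, and accuracy would depend heavily on how eviction interacts with later tokens; the sketch only shows the shape of the bookkeeping.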
androiddrew 2 days ago
I hope this is real.