Posts tagged attention
-
MiniMax Sparse Attention: Teaching Long-Context Models to Use an Index
MiniMax Sparse Attention turns long context into searchable memory: a learned index selects relevant key-value blocks, then exact softmax attention reads only those blocks.