Transformers

The attention-based neural-network architecture behind nearly all modern large language models (LLMs).

A full explainer for this concept is being written. In the meantime, here's what's in the news.

In the news

Accelerating Text-to-Video Generation with Calibrated Sparse Attention

Apple ML Research · Jul 21, 2026

RayRoPE: Projective Ray Positional Encoding for Multi-View Attention

Apple ML Research · Jul 20, 2026

Profiling in PyTorch (Part 3): Attention is all you profile

Hugging Face · Jul 10, 2026

transformers-explained — Transformer architecture explained step by step - the full architecture, every attention variant, positional embeddings,

GitHub · Jul 8, 2026

Native-speed vLLM transformers modeling backend

Hugging Face · Jul 8, 2026

Scaling works. These researchers are betting billions it isn't enough

Transformer · Jul 7, 2026

Subtext — To know what models don't say out loud.

GitHub · Jul 6, 2026

MemoryLLM: Plug-n-Play Interpretable Feed-Forward Memory for Transformers

Apple ML Research · Jul 2, 2026