Tag: roofline

All the articles with the tag "roofline".

The roofline model for LLM inference

12 Jun, 2026

Why single-stream LLM decode uses ~0.3% of an H100's tensor-core throughput, and how the roofline model explains nearly every inference optimization that matters.
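A rough back-of-the-envelope sketch of where a figure like ~0.3% can come from. The hardware numbers are assumptions taken from NVIDIA's published H100 SXM specifications (about 989 TFLOP/s dense FP16/BF16 tensor-core throughput and about 3.35 TB/s HBM3 bandwidth), not from the article itself. The model of decode is also simplified: it assumes FP16 weights, batch size 1, and that every weight is read once per token for one multiply-add, ignoring KV-cache and activation traffic. All names are illustrative.

```python
# Assumed H100 SXM datasheet figures (not taken from the article).
PEAK_FLOPS = 989e12   # dense FP16/BF16 tensor-core throughput, FLOP/s
HBM_BW = 3.35e12      # HBM3 memory bandwidth, bytes/s


def attainable_flops(intensity, peak=PEAK_FLOPS, bandwidth=HBM_BW):
    """Roofline bound: the lower of the compute roof and the memory roof."""
    return min(peak, bandwidth * intensity)


# Simplified batch-1 decode: each FP16 weight (2 bytes) is read once per
# token and used for one multiply-add (2 FLOPs).
FLOPS_PER_PARAM = 2
BYTES_PER_PARAM = 2
intensity = FLOPS_PER_PARAM / BYTES_PER_PARAM   # FLOP per byte moved

# Intensity at which the memory roof meets the compute roof.
ridge_point = PEAK_FLOPS / HBM_BW

# Fraction of peak tensor-core throughput reachable at this intensity.
utilization = attainable_flops(intensity) / PEAK_FLOPS

print(f"arithmetic intensity: {intensity:.1f} FLOP/byte")
print(f"ridge point:          {ridge_point:.0f} FLOP/byte")
print(f"peak utilization:     {utilization:.2%}")
```

Under these assumptions decode sits far to the left of the ridge point, so memory bandwidth rather than compute sets the ceiling. Raising the FLOPs done per byte moved, for example by batching requests, moves the workload toward the compute roof.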