Which explanation best captures why decoder-only autoregress…

Questions

Which explаnаtiоn best cаptures why decоder-оnly autoregressive Transformers became the dominant paradigm for modern LLMs?