Seed Diffusion: A Large-Scale Diffusion Language Model with High-Speed Inference

1 ByteDance Seed  2 Institute for AI Industry Research (AIR), Tsinghua University  3 SIA-Lab of Tsinghua AIR and ByteDance Seed

Abstract

We present Seed Diffusion Preview, a large-scale language model based on discrete-state diffusion,
offering remarkably fast inference speed. Thanks to non-sequential, parallel generation, discrete diffusion models provide a notable speedup to mitigate the inherent latency of token-by-token decoding, as demonstrated recently (e.g., Mercury Coder [1], Gemini Diffusion [2]). Seed Diffusion Preview achieves an inference
speed of 2,146 token/s over H20 GPUs while maintaining competitive performance across a sweep of standard code evaluation benchmarks, significantly faster than contemporary Mercury and Gemini, establishing a new state of the art on the speed-quality Pareto frontier for code models. Demo is available at https://studio.seed.ai/exp/seed_diffusion/.

Date: August 5, 2025
Correspondence:
Project Page: https://

[Figure 1 chart: inference speed (token/s, axis marks from 400 to 2000) and scores on LiveCodeBench*, BigCodeBench, MBPP, and HumanEval for Seed Diffusion Preview, Gemini Diffusion, Mercury Coder (small), Mercury Coder (mini), and Seed Coder Instruct.]

Figure 1  Seed Diffusion's inference speed is measured over H20 GPUs across eight open code benchmarks. Direct comparison with baselines is challenging due to differing test conditions: Mercury Coder was evaluated on a proprietary dataset with H100s, while Gemini Diffusion's speed was averaged over a mixed-task benchmark using unknown hardware. Furthermore, reported speeds on these benchmarks can benefit from format-constraining system prompts. LiveCodeBench results are specifically on the 1055 problems from
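The latency argument in the abstract reduces to the number of sequential forward passes: an autoregressive decoder needs one pass per emitted token, while a discrete-diffusion decoder commits several tokens per denoising step. The toy sketch below is purely illustrative and not the paper's implementation; `model_call`, the `MASK` placeholder, and the fixed `tokens_per_step` schedule are assumptions standing in for a real transformer and its sampling schedule.

```python
# Toy, self-contained sketch (hypothetical names: model_call, MASK, tokens_per_step).
# It only counts sequential "forward passes"; no real network is involved.
import random

MASK = "<mask>"
VOCAB = ["a", "b", "c", "d"]


def model_call(tokens):
    # Stand-in for one transformer forward pass: propose a token for every masked slot.
    return {i: random.choice(VOCAB) for i, t in enumerate(tokens) if t == MASK}


def autoregressive_decode(length):
    seq, calls = [], 0
    for _ in range(length):
        proposals = model_call(seq + [MASK])  # one sequential call per emitted token
        seq.append(proposals[len(seq)])
        calls += 1
    return seq, calls


def diffusion_style_decode(length, tokens_per_step=8):
    seq, calls = [MASK] * length, 0
    while MASK in seq:
        proposals = model_call(seq)                  # one call covers all masked positions
        calls += 1
        for i in list(proposals)[:tokens_per_step]:  # commit a block per denoising step
            seq[i] = proposals[i]
    return seq, calls


if __name__ == "__main__":
    _, ar = autoregressive_decode(128)
    _, diff = diffusion_style_decode(128, tokens_per_step=8)
    print(f"autoregressive: {ar} sequential calls")    # 128
    print(f"diffusion-style: {diff} sequential calls") # 16
```

With a length of 128 and eight tokens committed per step, the sketch reports 128 sequential calls for the autoregressive loop versus 16 for the diffusion-style loop; the speedups reported in the paper depend on the trained model and its actual sampling schedule, which this toy does not capture.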