Seedream 3.0 Technical Report

ByteDance Seed

Abstract

We present Seedream 3.0, a high-performance Chinese-English bilingual image generation foundation model. We develop several technical improvements to address existing challenges in Seedream 2.0, including alignment with complicated prompts, fine-grained typography generation, suboptimal visual aesthetics and fidelity, and limited image resolutions. Specifically, the advancements of Seedream 3.0 stem from improvements across the entire pipeline, from data construction to model deployment. At the data stratum, we double the dataset using a defect-aware training paradigm and a dual-axis collaborative data-sampling framework. Furthermore, we adopt several effective techniques such as mixed-resolution training, cross-modality RoPE, representation alignment loss, and resolution-aware timestep sampling in the pre-training phase. During the post-training stage, we utilize diversified aesthetic captions in SFT, and a VLM-based reward model with scaling, thereby achieving outputs that well align with human preferences. Furthermore, Seedream 3.0 pioneers a novel acceleration paradigm. By employing consistent noise expectation and importance-aware timestep sampling, we achieve a 4 to 8 times speedup while maintaining image quality. Seedream 3.0 demonstrates significant improvements over Seedream 2.0: it enhances overall capabilities, in particular for text-rendering in complicated Chinese characters which is important to professional typography generation. In addition, it provides native high-resolution output (up to 2K), allowing it to generate images with high visual quality.

Official Page: https:/

[Figure 1 legend: Seedream 2.0, Imagen 3, Ideogram 3.0, Midjourney v6.1, FLUX1.1 Pro, Seedream 3.0]

Figure 1 Seedream 3.0 demonstrates outstanding performance across all evaluation aspects. Due to missing data, the Portra