Stable Diffusion - Model Versions


The Stable Diffusion model has improved significantly since its release, with each version building on the lessons of the previous one. This chapter compares the capabilities of the different Stable Diffusion versions.

Stable Diffusion 1.x

The first generation of Stable Diffusion models, known as the 1.x series, includes versions 1.1, 1.2, 1.3, 1.4, and 1.5. These models can generate a wide range of styles and need relatively little computational power and resources.

Stable Diffusion 2.x

The 2.x series includes versions 2.0 and 2.1. This series was developed to generate higher-resolution images and to interpret more expressive and complex prompts.

Stable Diffusion XL 1.0

Stable Diffusion XL 1.0 (SDXL 1.0) is a widely used open-source version that generates high-resolution images with improved color grading and composition. It also understands complex prompts and concepts better than the earlier series.

Stable Diffusion XL Turbo (SDXL Turbo) is a distilled variant of SDXL 1.0 designed for rapid image generation. It can produce an image in as little as a single sampling step.

Stable Diffusion 3

Stable Diffusion 3 was announced by Stability AI in February 2024, with improvements in prompt interpretation, image quality and resolution, and the spelling of text rendered inside images. At the time of the announcement, the model was in an early preview stage and not yet generally available to the public.

Comparing Stable Diffusion Models

The following table summarizes the features and improvements across the versions of Stable Diffusion −

Features	SD 1.5	SD 2.0	SD 2.1	SD XL 1.0
Release Date	October 2022	November 2022	December 2022	July 2023
Native Resolution	512x512	768x768	768x768	1024x1024
Text Encoder	OpenAI's CLIP ViT-L/14	LAION's OpenCLIP-ViT/H	LAION's OpenCLIP-ViT/H	OpenCLIP-ViT/G and CLIP-ViT/L
Strengths	Beginner friendly, better performance on landscape and architectural subjects	Improved handling and interpretation of complex prompts, better image resolution	Improved conceptual understanding, better color grading, and image quality	Better portraits, high resolution and image quality, shorter prompts
Limitations	Poor prompt interpretation	More restrictive in generations, NSFW filtering	More "censored," especially when generating celebrities and art styles	Requires substantial computational resources to run locally
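
The table above can also be kept as a small lookup structure, which is handy when a script has to pick a version by resolution. The following is a minimal sketch only: the names SDVersion, VERSIONS, and versions_for_resolution are hypothetical helpers made up for this illustration and do not belong to any Stable Diffusion library. The values are copied from the table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SDVersion:
    """One column of the comparison table (hypothetical helper type)."""
    name: str
    released: str           # month and year, as listed in the table
    native_resolution: int  # side length in pixels of the square native size
    text_encoders: tuple    # text encoder names, as listed in the table


# Values copied from the comparison table; nothing here is queried from a model.
VERSIONS = (
    SDVersion("SD 1.5", "October 2022", 512, ("CLIP ViT-L/14",)),
    SDVersion("SD 2.0", "November 2022", 768, ("OpenCLIP-ViT/H",)),
    SDVersion("SD 2.1", "December 2022", 768, ("OpenCLIP-ViT/H",)),
    SDVersion("SD XL 1.0", "July 2023", 1024, ("OpenCLIP-ViT/G", "CLIP-ViT/L")),
)


def versions_for_resolution(min_side: int) -> list[str]:
    """Return the names of versions whose native resolution is at least min_side."""
    return [v.name for v in VERSIONS if v.native_resolution >= min_side]


if __name__ == "__main__":
    # List the versions that natively target at least 768x768 output.
    print(versions_for_resolution(768))
```

Filtering on native resolution is only one possible criterion. The same structure could be filtered on the text encoder or the release date if those matter more for a given project.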
