This article is for complete beginners. It explains how Stable Diffusion works in the simplest way possible, using analogies. The analogies are not entirely accurate, but they are the easiest way to understand what is going on.
Comfy UI and Web UI are like two different storefronts of the same factory, selling the same products. Comfy UI is better optimized and generally faster, but the products you get are the same.
Basically, a simple Comfy UI workflow is just the Web UI components broken apart and laid out separately as nodes. Comfy UI also allows a lot of customization (community-developed plugins), at the cost of a less straightforward interface. Beginners can start with Web UI to get familiar with the terms.
Once you have Web UI installed, you’ll see the following screen (the default language is English; I have installed a translation).
Here are the translations and explanations of the terms:
Checkpoint = Large Model = Model = Artist: Choose the artist you want. Different artists excel in different styles and interpret the same word differently. For example, if you type “a girl,” one artist might draw in a realistic style and another in an anime style; one might draw a 10-year-old and another a 30-year-old. If you type a less common term, some artists might not understand it and will draw something random. For example, if I type “golden ship,” some might draw a horse, some a beast-eared girl, and some an actual golden ship. As for SD1.5 and SDXL, you can think of them as different generations of artists. SD1.5 is like an artist trained by the original developers of Stable Diffusion; other people then train their own versions of SD1.5 according to their preferences. Note that a VAE or LORA made for one generation of artists will not work with another generation.
LORA = Small Model = Module: Think of it as a game mod or a guide for the artist. For example, if you type “a girl,” your checkpoint might randomly give you various girls, but if you use a Hatsune Miku LORA, the girl drawn will look like Hatsune Miku. Note that you can set the strength of LORA. The higher the strength, the more it will follow your guide, but if it’s too high, it will limit the artist’s creativity, and the result will be poor.
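For the curious, the strength setting can be pictured as a simple multiplier. The sketch below is a toy illustration in plain Python, not the code Web UI actually runs. It assumes a LORA stores two small matrices (called `up` and `down` here; the names and numbers are made up) whose product is added to one of the checkpoint’s weight matrices, scaled by the strength.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def apply_lora(weight, up, down, strength):
    """Return weight + strength * (up @ down) without modifying the inputs."""
    delta = matmul(up, down)
    return [
        [w + strength * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(weight, delta)
    ]


# Toy numbers only: a 2x2 "checkpoint" weight and a rank-1 LORA.
weight = [[1.0, 0.0], [0.0, 1.0]]
up = [[1.0], [2.0]]    # 2x1 matrix
down = [[0.5, 0.5]]    # 1x2 matrix

print(apply_lora(weight, up, down, 0.0))  # [[1.0, 0.0], [0.0, 1.0]]: checkpoint unchanged
print(apply_lora(weight, up, down, 1.0))  # [[1.5, 0.5], [1.0, 2.0]]: full LORA effect
```

At strength 0 the artist ignores the guide completely; larger values move the weights further from what the checkpoint originally learned, which is why very high strengths tend to give poor results.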
Clip Skip = Clip Stop Layer: You don’t need to know what this is. Just know that some checkpoints require Clip Skip 2. In Web UI, set Clip Skip to 2; in Comfy UI, the same setting is written as -2. If it is set wrong, the result will be poor or even broken.
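As a rough picture of what this setting does: the text reader (CLIP) passes your prompt through a stack of layers, and Clip Skip chooses which layer’s output is handed to the artist. Web UI writes the setting as a positive number (2) while Comfy UI writes it as a negative one (-2). The toy sketch below uses made-up layer names instead of a real model, and the helper names are hypothetical; it only shows that the two ways of writing it point at the same layer.

```python
def pick_layer_output(layer_outputs, clip_skip):
    """Return the output Clip Skip would select, using the Web UI convention.

    clip_skip=1 means the last layer, clip_skip=2 the second to last, and so on.
    """
    if not 1 <= clip_skip <= len(layer_outputs):
        raise ValueError("clip_skip must be between 1 and the number of layers")
    return layer_outputs[-clip_skip]


def webui_to_comfy(clip_skip):
    """Convert a Web UI Clip Skip value to the negative number Comfy UI shows."""
    return -clip_skip


# Stand-ins for the outputs of a 12-layer text encoder (not real data).
layers = [f"layer_{i}" for i in range(1, 13)]

print(pick_layer_output(layers, 1))  # layer_12
print(pick_layer_output(layers, 2))  # layer_11
print(webui_to_comfy(2))             # -2
```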
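Prompt = Client’s Request: What you want your artist to draw. Just type your request in the box. There is usually a second box (the negative prompt) for what you don’t want. In Comfy UI, this box is the CLIP Text Encode (Prompt) node.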
Latent = Canvas: Basically, you choose how wide and how tall you want your drawing to be and how many canvases you want the artist to draw on at once.
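The canvas the artist actually works on is smaller than the final picture. For SD1.5 and SDXL, the latent is 8 times smaller in width and height and has 4 channels, and it is enlarged back into a full image at the very end. A minimal sketch of that arithmetic (the function name is made up for illustration):

```python
def latent_shape(width, height, batch_size=1, downscale=8, channels=4):
    """Return (batch, channels, latent_height, latent_width) for an image size.

    The defaults (8x downscale, 4 channels) match SD1.5 and SDXL;
    other model families may use different values.
    """
    if width % downscale or height % downscale:
        raise ValueError(f"width and height should be multiples of {downscale}")
    return (batch_size, channels, height // downscale, width // downscale)


print(latent_shape(512, 512))                  # (1, 4, 64, 64)
print(latent_shape(1024, 1024, batch_size=4))  # (4, 4, 128, 128)
```

This is also why the width and height sliders move in fixed increments rather than single pixels.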
Sampler = Sampling Method = Artist’s Drawing Technique: Basically, you tell your artist (checkpoint) which technique to use. You need to check reviews or test for yourself which one suits you. Some checkpoints require specific samplers. If you don’t know which one to choose, Euler a is usually a safe bet. There are also special accelerated methods such as LCM, Turbo, and Lightning. These require specific checkpoints and other components, so look up how to use them online.
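Scheduler: In older versions of Web UI it is integrated into the sampler name (for example “DPM++ 2M Karras”); newer versions and Comfy UI show it as a separate option. It’s a sub-setting of the sampler and not very important. Choosing exponential will make the result a bit blurry; the other options give similar results. If you don’t know which one to choose, go with karras.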
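In less analogical terms, a scheduler decides how much noise is left at each step. As an illustration, the karras option is named after a spacing formula from a research paper (Karras et al., 2022). The sketch below is a plain-Python version of that formula; the default values for `sigma_min`, `sigma_max`, and `rho` are illustrative assumptions, not necessarily what your UI uses.

```python
def karras_sigmas(steps, sigma_min=0.03, sigma_max=14.6, rho=7.0):
    """Return `steps` noise levels, from sigma_max down to sigma_min."""
    if steps < 2:
        raise ValueError("steps must be at least 2")
    max_inv = sigma_max ** (1 / rho)
    min_inv = sigma_min ** (1 / rho)
    sigmas = []
    for i in range(steps):
        t = i / (steps - 1)  # goes from 0.0 (first step) to 1.0 (last step)
        sigmas.append((max_inv + t * (min_inv - max_inv)) ** rho)
    return sigmas


# Print the noise level at each of 10 steps.
for sigma in karras_sigmas(10):
    print(round(sigma, 3))
```

The noise drops quickly at first and then in smaller and smaller amounts, so more of the steps are spent on fine detail.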
CFG = Client’s Request Strictness: The higher the CFG number, the more the generated image will follow your request, but with less variation. Conversely, the lower the CFG, the more freedom the artist has, resulting in more variation but less adherence to your instructions.
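Under the hood, the artist makes two guesses at every step: one with your prompt and one without it (or with the negative prompt). CFG pushes the result away from the second guess and towards the first. Below is a toy sketch of the standard classifier-free guidance formula, using short lists of made-up numbers in place of real model outputs:

```python
def apply_cfg(uncond, cond, cfg_scale):
    """Blend two predictions: uncond + cfg_scale * (cond - uncond)."""
    return [u + cfg_scale * (c - u) for u, c in zip(uncond, cond)]


uncond = [0.0, 1.0, 2.0]  # guess without the prompt (toy values)
cond = [1.0, 1.0, 0.0]    # guess with the prompt (toy values)

print(apply_cfg(uncond, cond, 1.0))  # [1.0, 1.0, 0.0], same as cond
print(apply_cfg(uncond, cond, 7.0))  # [7.0, 1.0, -12.0], pushed well past cond
```

With a very high CFG the numbers are pushed far beyond what the artist originally guessed, which is one way to picture why extreme values tend to look harsh or overcooked.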
Step = Number of Steps = Artist’s Effort: In theory, the more steps, the better the detail, but this is limited by the checkpoint’s capability. Usually, beyond 40 steps, it’s hard to see any difference, so there is no need to go higher.
Seed = Random Seed: A random number added to make each image different. If you want to compare checkpoints, prompts, or samplers under the same conditions, fix the seed. Otherwise, random is fine.
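The seed decides the random noise the artist starts from. The sketch below uses Python’s built-in `random` module as a stand-in for the real noise generator (the UIs use their own generators, so the numbers here are purely illustrative):

```python
import random


def starting_noise(seed, size=4):
    """Return a small list of pseudo-random numbers determined by the seed."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(size)]


same_a = starting_noise(1234)
same_b = starting_noise(1234)
different = starting_noise(5678)

print(same_a == same_b)     # True: same seed, same starting noise
print(same_a == different)  # False: a new seed gives new noise
```

This is why fixing the seed makes comparisons fair: everything starts from the same noise, so any difference in the result comes from the setting you changed.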
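VAE = Colorist: Imagine an artist finishes a drawing, and you’re not satisfied with the colors, so you ask a colorist to recolor it. Sounds a bit redundant, right? That’s why newer checkpoints usually come with a VAE built in, so you often don’t need to choose one separately. If the colors look gray or washed out, the checkpoint probably needs a separate VAE.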
Once you’ve selected all these, you can start. The basic Stable Diffusion process is:
Choose your artist (checkpoint) >> Choose the LORA you want to use (optional) >> Set Clip Skip (optional) >> Tell the artist what you want and don’t want (prompt) >> Tell them how strictly to follow the request, how much effort to put in, and which technique to use (CFG, steps, sampler) >> Give them the canvas (latent) >> Add seed >> Start drawing >> Ask the colorist to adjust the colors (VAE) >> Save the result.
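The same process can be written down as a single list of settings. The sketch below assumes the Web UI is AUTOMATIC1111’s Stable Diffusion Web UI started with the `--api` option. As far as I know, the field names match its `/sdapi/v1/txt2img` interface, but check the `/docs` page of your own installation. The function name, checkpoint name, and LORA name are placeholders made up for this example, and nothing is actually sent anywhere.

```python
import json


def build_txt2img_request(checkpoint, prompt, negative_prompt="", lora=None,
                          lora_strength=0.8, clip_skip=1, sampler="Euler a",
                          steps=25, cfg=7.0, width=512, height=512,
                          batch_size=1, seed=-1):
    """Collect the settings from this article into one request dictionary."""
    if lora:
        # Web UI applies a LORA through a tag inside the prompt text.
        prompt = f"{prompt} <lora:{lora}:{lora_strength}>"
    return {
        "prompt": prompt,                    # client's request
        "negative_prompt": negative_prompt,  # what you don't want
        "sampler_name": sampler,             # drawing technique
        "steps": steps,                      # artist's effort
        "cfg_scale": cfg,                    # request strictness
        "width": width,                      # canvas size
        "height": height,
        "batch_size": batch_size,            # number of canvases
        "seed": seed,                        # -1 asks for a random seed
        "override_settings": {
            "sd_model_checkpoint": checkpoint,      # the artist
            "CLIP_stop_at_last_layers": clip_skip,  # Clip Skip
        },
    }


request = build_txt2img_request(
    checkpoint="my_anime_checkpoint",  # placeholder name
    prompt="a girl",
    negative_prompt="blurry",
    lora="my_character_lora",          # placeholder name
    clip_skip=2,
    seed=1234,
)
print(json.dumps(request, indent=2))
```

Sending this dictionary as JSON to a running Web UI should produce the same kind of image as clicking through the interface; the colorist (VAE) step happens automatically at the end.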
Other things I didn’t mention can be left alone for now. If you’re interested, you can find tutorials online. This article is just for beginners.