Even a Monkey Can Understand Stable Diffusion and Comfy UI

This article is for complete beginners and explains how Stable Diffusion works in the simplest way possible. The analogies are not entirely accurate, but they are the easiest way to build an intuition.

Comfy UI and Web UI are like two different storefronts of the same factory, selling the same products. Comfy UI is better optimized and generally faster, but the products you get are the same.

Basically, a simple Comfy UI workflow is just the Web UI components broken apart and laid out as separate pieces. Comfy UI allows far more customization (including community-developed plugins), at the cost of a less straightforward interface. Beginners can start with Web UI to get familiar with the terms.

Once you have Web UI installed, you’ll see the following screen (the default language is English; I have installed a translation).

Here are the translations and explanations of the terms:

  1. Checkpoint = Large Model = Model = Artist: Choose the artist you want. Different artists excel in different styles and interpret the same word differently. For example, if you type “a girl,” one artist might draw in a realistic style and another in an anime style; one might draw a 10-year-old, another a 30-year-old. If you type a less common term, some artists might not understand it and will draw something random. For example, if I type “golden ship,” some might draw a horse, some a beast-eared girl, and some a golden ship. As for SD1.5 and SDXL, you can think of them as different generations of artists. SD1.5 is like a base artist trained by the original developers of Stable Diffusion, and others then train their own versions of SD1.5 according to their preferences. Note that VAEs and LORAs are tied to a generation: one made for SD1.5 will not work with SDXL, and vice versa.

  2. LORA = Small Model = Module: Think of it as a game mod or a guide for the artist. For example, if you type “a girl,” your checkpoint might randomly give you various girls, but if you use a Hatsune Miku LORA, the girl drawn will look like Hatsune Miku. Note that you can set the strength of LORA. The higher the strength, the more it will follow your guide, but if it’s too high, it will limit the artist’s creativity, and the result will be poor.

  3. Clip Skip = Clip Stop Layer: You don’t need to know what this is. Just know that some checkpoints require Clip Skip 2. In Web UI, set Clip skip to 2; in Comfy UI, the equivalent setting on the CLIP Set Last Layer node is -2. Otherwise the result will be poor or even broken.

  4. Prompt = Client’s Request: What you want your artist to draw. Just type your request in the box. In Comfy UI, this box is the CLIP Text Encode node.

  5. Latent = Canvas: You choose how wide and tall the drawing should be and how many canvases the artist should fill in one go (the batch size).

  6. Sampler = Sampling Method = Artist’s Drawing Technique: You tell your artist (checkpoint) which technique to use. Check reviews or run your own tests to see which one suits you. Some checkpoints require specific samplers. If you don’t know which one to choose, Euler a is usually a safe bet. There are also special accelerated methods such as LCM, turbo, and lightning. These require specific checkpoints and other components, so look up how to use them online.

  7. Scheduler: A sub-item of the sampler, and not very important. Depending on the Web UI version, it is either built into the sampler name or offered as a separate option. Choosing exponential will make the result a bit blurry; the other options give similar results. If you don’t know which one to choose, go with karras.

  8. CFG = Client’s Request Strictness: The higher the CFG number, the more the generated image will follow your request, but with less variation. Conversely, the lower the CFG, the more freedom the artist has, resulting in more variation but less adherence to your instructions.

  9. Step = Number of Steps = Artist’s Effort: In theory, the more steps, the better the detail, but only up to the limit of the checkpoint’s capability. Beyond about 40 steps it is usually hard to see any difference, so there is no need to go higher.

  10. Seed = Random Seed: A number that sets the random starting noise, so that each image comes out different. If you want to compare checkpoints, prompts, or samplers under the same conditions, fix the seed. Otherwise, random is fine.

  11. VAE = Colorist: Imagine an artist finishes a drawing, and you’re not satisfied with the colors, so you ask a colorist to recolor it. Sounds a bit redundant, right? So newer checkpoints often come with a VAE built in and don’t need a separate one.
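
To make two of the settings above more concrete, here is a minimal Python sketch. It is a toy illustration, not Web UI or Comfy UI code: the function names `apply_cfg` and `starting_noise` are made up for this example, and the "predictions" are plain lists of numbers rather than real image data. The CFG line follows the standard classifier-free guidance formula, in which the prediction made without the prompt is pushed toward the prediction made with the prompt.

```python
import random

def apply_cfg(uncond, cond, cfg_scale):
    """Classifier-free guidance: start from the prediction made without the
    prompt and move toward the prompted prediction by a factor of cfg_scale."""
    return [u + cfg_scale * (c - u) for u, c in zip(uncond, cond)]

def starting_noise(seed, size):
    """Hypothetical stand-in for the random canvas noise: the same seed
    always yields the same numbers."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(size)]

# Toy predictions: what the artist would draw without and with the prompt.
uncond = [0.0, 1.0]
cond = [1.0, 3.0]

print(apply_cfg(uncond, cond, 1.0))  # CFG 1: exactly the prompted prediction
print(apply_cfg(uncond, cond, 7.0))  # CFG 7: pushed much further toward the prompt

# Fixed seed: identical noise, which is why a fixed seed makes tests comparable.
print(starting_noise(42, 3) == starting_noise(42, 3))
print(starting_noise(42, 3) == starting_noise(43, 3))
```

With CFG at 1 the result is simply what the prompt asked for; larger values exaggerate the difference between "with prompt" and "without prompt," which matches the "stricter client" analogy.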

Once you’ve selected all these, you can start. The basic Stable Diffusion process is:

Choose your artist (checkpoint) >> Choose the LORA you want to use (optional) >> Set Clip Skip (optional) >> Tell the artist what you want and don’t want (prompt) >> Tell them how much effort and which technique to use (steps and sampler) >> Give them the canvas (latent) >> Add seed >> Start drawing >> Ask the colorist to adjust the colors (VAE) >> Save the result.
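
The order above can be written down as a small Python sketch. Nothing here calls Stable Diffusion: `GenerationSettings` and `describe_pipeline` are hypothetical names used only to show which setting feeds which step, and the default values and the checkpoint and LORA names are placeholders, not recommendations.

```python
from dataclasses import dataclass, field

@dataclass
class GenerationSettings:
    checkpoint: str                            # the artist
    prompt: str                                # what you want
    negative_prompt: str = ""                  # what you don't want
    loras: dict = field(default_factory=dict)  # optional: LORA name -> strength
    clip_skip: int = 1                         # optional: some checkpoints ask for 2
    sampler: str = "Euler a"
    scheduler: str = "karras"
    steps: int = 25
    cfg: float = 7.0
    width: int = 512
    height: int = 512
    batch_size: int = 1
    seed: int = 0

def describe_pipeline(s: GenerationSettings) -> list:
    """Return the steps in the order the article lists them."""
    steps = [f"load checkpoint {s.checkpoint}"]
    for name, strength in s.loras.items():
        steps.append(f"apply LORA {name} at strength {strength}")
    steps.append(f"set clip skip to {s.clip_skip}")
    steps.append(f"encode prompt {s.prompt!r} and negative prompt {s.negative_prompt!r}")
    steps.append(f"sample with {s.sampler} ({s.scheduler}), {s.steps} steps, CFG {s.cfg}")
    steps.append(f"create {s.batch_size} canvas(es) of {s.width}x{s.height}")
    steps.append(f"use seed {s.seed}")
    steps.append("start drawing")
    steps.append("decode with VAE")
    steps.append("save the result")
    return steps

# Placeholder names; substitute whatever checkpoint and LORA you actually use.
settings = GenerationSettings(
    checkpoint="my_checkpoint",
    prompt="a girl",
    loras={"my_lora": 0.8},
    seed=1234,
)
for number, step in enumerate(describe_pipeline(settings), 1):
    print(number, step)
```

In Comfy UI, each of these steps roughly corresponds to a separate node that you wire together yourself; in Web UI, they are all fields on one page.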

Other things I didn’t mention can be left alone for now. If you’re interested, you can find tutorials online. This article is just for beginners. 

Ethical Considerations and the Future Landscape of AIGC

Artificial Intelligence-Generated Content (AIGC) has bridged the gap between emerging technology and human creativity, charting new territory in the digital creative sphere. However, this progress brings not just opportunities but also ethical conundrums that must be addressed. This article examines the ethical concerns surrounding AIGC and looks ahead to the future landscape of this rapidly evolving domain.



With AIGC, content creation reaches an unprecedented scale and speed. Nonetheless, such advancements also prompt important ethical questions:

  • Authorship and Ownership: Who holds the rights to AI-generated work? As AI systems can create content reflective of a particular style or mimic an existing artist, it complicates the issues of originality and copyright.
  • Transparency: Consumers have the right to know whether the content they're consuming is generated by AI. It is vital to maintain transparency about the origins of content to ensure informed engagement.
  • Bias and Discrimination: AI algorithms, inherently influenced by their training data, can perpetuate and amplify societal biases. This can manifest across content types, from written articles to visual arts, and has significant implications.
  • Job Displacement: There is an ongoing concern that AIGC might supplant human jobs in creative fields. It’s imperative to assess how these technologies can serve as a tool rather than a replacement for human creativity.
  • Accountability and Content Moderation: When AI creates objectionable or harmful content, determining accountability becomes complex. Effective content moderation mechanisms need to be in place to safeguard against misuse.

Ethical Frameworks for AIGC


As the AIGC landscape evolves, developing ethical frameworks to guide its use is critical. Some initiatives propose:

  • Establishing AIGC Ethics Committees: Composed of artists, ethicists, technologists, and legal experts to oversee the responsible development of AIGC tools.
  • Creating Standards for Fair Use: Guidelines that define fair practices for using AI to generate content, ensuring it doesn’t infringe upon individual creativity and rights.
  • Promoting AI Literacy: Educating creators and consumers about AI capabilities and limitations can cultivate a culture of ethical AIGC usage.

The Future Landscape of AIGC


Looking ahead, the AIGC landscape is expected to keep expanding, bringing greater efficiency to creative tasks while preserving the collaborative relationship between humans and AI. We’ll likely see:

  • Regulatory Evolution: Laws will evolve to cover the novel legal challenges posed by AIGC, including copyright disputes and ethical content generation.
  • Advancements in AI Personalization: AI may offer personalized and adaptive content creation tools that align closely with individual users' styles and preferences.
  • New Creative Employment Models: AIGC might lead to new job roles and markets that leverage the strength of AI while valuing the irreplaceable human element.

In Conclusion


The ethical considerations and the future of AIGC are closely tied. Addressing the ethical issues will build trust in the technology and foster a future where AIGC enhances human creativity rather than undermines it. Careful implementation of AIGC, paired with ongoing discussion of its ethical boundaries, can help human creativity and AI-generated content advance hand in hand, paving the way for a responsible and inspiring digital renaissance.

Composing with AI: How Technology is Reshaping Music

The harmonious blend of music and technology has reached a crescendo with the advent of Artificial Intelligence (AI) in the realm of composition. The once unimaginable idea that machines could not only perform but also create music is today's reality. In this chapter, we explore the burgeoning field of AI musicianship and how technology is redefining the art of composing music.


The Roots of AI in Music Composition


The integration of AI in music isn't entirely new. For decades, algorithms have played a role in generating simple melodies and aiding in sound synthesis. However, recent advances in AI have led to more sophisticated applications that can analyze musical structures, learn styles, and compose complex pieces that resonate with human emotions. These advances are not just changing how music is made but also expanding the definition of the composer.

The AI Composition Process


The process of composing music with AI involves Machine Learning (ML), particularly deep learning networks that have been fed large datasets of musical scores, recordings, and styles. AI music platforms like AIVA (Artificial Intelligence Virtual Artist) and IBM's Watson Beat can now create music across various genres, from classical to contemporary pop.

  1. Training: AIs are trained using vast collections of music to understand patterns, harmonies, rhythms, and styles.
  2. Generation: Based on this training, AIs can generate original compositions by interpolating existing musical knowledge and producing novel combinations of musical elements.
  3. Refinement: These generated pieces can then be refined by human composers, who adjust nuances to better align with creative visions or specific emotional tones.
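
The three steps above can be illustrated with a deliberately tiny Python sketch. It uses a first-order Markov chain instead of the deep learning networks described here, and the two training melodies are invented for this example, so treat it as an analogy for the train, generate, refine loop rather than a picture of how AIVA or Watson Beat work.

```python
import random
from collections import defaultdict

def train(melodies):
    """Training: record which note tends to follow which."""
    transitions = defaultdict(list)
    for melody in melodies:
        for current, following in zip(melody, melody[1:]):
            transitions[current].append(following)
    return transitions

def generate(transitions, start, length, seed=0):
    """Generation: walk the learned transitions to produce a new sequence."""
    rng = random.Random(seed)
    melody = [start]
    while len(melody) < length:
        options = transitions.get(melody[-1])
        if not options:  # dead end: no note was ever seen after this one
            break
        melody.append(rng.choice(options))
    return melody

def refine(melody, final_note):
    """Refinement: a stand-in for the human edit, here replacing the last note."""
    return melody[:-1] + [final_note]

# Invented toy corpus of note names.
corpus = [
    ["C", "D", "E", "C", "E", "G"],
    ["E", "D", "C", "D", "E", "E"],
]
model = train(corpus)
draft = generate(model, start="C", length=8)
print(draft)
print(refine(draft, "C"))
```

Every note-to-note move in the draft was seen somewhere in the training melodies, yet the sequence as a whole need not match either of them, which is the sense in which such systems recombine what they have learned.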

The Symphony of Opportunities


AI music platforms present a multitude of opportunities:

  • Accessibility: AI opens the door for non-musicians to compose music, democratizing the ability to create and express through this medium.
  • Collaboration: AI as a collaborative tool can offer suggestions, variations, and enhancements to a musician's existing work, leading to unexpected and innovative artistic results.
  • Customization: From personalized soundtracks to adaptive game music, AI can tailor compositions to individual preferences and scenarios.

Harmony and Discord: Ethical Considerations


While the benefits of AI in music are manifold, ethical issues must be addressed:

  • Authorship and Copyright: Defining ownership of AI-generated music and addressing copyright implications are critical considerations in this digital age.
  • Preservation of Artistic Identity: Ensuring that AI does not dilute the artist's unique voice is important for maintaining the integrity of the human element in music.
  • Bias and Diversity: Like all AI applications, AI music systems can inherit biases from the datasets they are trained on, which calls for careful curation and oversight.

The Resounding Future of Music with AI


The future of AI in music promises even more nuanced compositions as technology continues to evolve. Upcoming AI systems will likely synergize with Virtual Reality (VR) and Augmented Reality (AR) for immersive musical experiences, offering new platforms for performance and storytelling.

In the crescendo of AI's role in music, the point is not machines usurping the artist's seat but orchestrating a new wave of collaborative creativity. As we advance, AI-enabled music composition is bound to play a significant part in the concert of future artistic expression, where human and digital creativity harmonize in an exciting new symphony.