Best Practice Workflow for Automatic 1111 – Stable Diffusion

AIKnowledge2Go
26 Jun 2023 07:59

TLDR: This video outlines a recommended workflow for Stable Diffusion in Automatic 1111, built around the semi-realistic ReV Animated model. The tutorial covers enabling clip skip, adjusting UI settings, and using the Euler a sampler for prompt engineering. It advises on image dimensions that avoid deformations and suggests a batch size of 8 for image selection. The video then walks through the image-to-image process: changing the sampler to DPM++ 2M Karras, resizing the image, and adjusting denoising strength. It concludes with tips on fixing image errors and upscaling with the R-ESRGAN 4x+ Anime6B upscaler for enhanced detail.

Takeaways

  • 🎨 **Best Workflow for Stable Diffusion**: The video outlines a recommended workflow for using Stable Diffusion in Automatic 1111.
  • 🖼️ **Model Selection**: The ReV Animated model is highlighted for its semi-realistic renderings and suggested as worth trying.
  • ⚙️ **Settings Adjustment**: 'Clip Skip' is set to 2, and the UI needs to be restarted for the setting to take effect.
  • 🚀 **Prompt Preparation**: A detailed prompt involving a female astronaut and an exploding space station is provided.
  • 📐 **Resolution Settings**: The video recommends a width of 768, to avoid deformations, and a height of 432, giving a 16:9 aspect ratio for screen compatibility.
  • 🔎 **Sampling Method**: Euler a is recommended for prompt engineering and experimenting, while DPM++ 2M Karras is chosen for detail enhancement.
  • 🔧 **Denoising Strength**: A denoising strength between 0.4 and 0.7 is suggested, depending on the desired level of image alteration.
  • 🖥️ **Batch Processing**: A batch size of 8 is set to allow selection from multiple renders.
  • 👩‍🎨 **Image Selection**: The video discusses choosing the best image from a batch based on criteria such as the detail of the explosion.
  • 🛠️ **Image Refinement**: 'Send to img2img' is used to maintain the composition while refining details.
  • 📈 **Upscaling**: The video recommends the 'R-ESRGAN 4x+ Anime6B' upscaler for semi-realistic images rendered with ReV Animated.

Q & A

  • What is the recommended workflow for using Stable Diffusion in Automatic 1111?

    -The recommended workflow uses the ReV Animated model with clip skip set to 2, the Euler a sampler for prompt engineering, a width of 768 and a height of 432 for a 16:9 resolution, and a batch size of 8 so that there are several images to choose from.

  • How do you enable the clip skip feature in Automatic 1111?

    -To enable clip skip, go to Settings, then User interface, and click into the Quicksettings list. Add 'CLIP_stop_at_last_layers', apply the settings, and restart your UI so that the clip skip slider appears.

  • What is the significance of setting the width and height to 768 and 432?

    -Setting the width and height to 768 and 432 ensures that the image resolution is within the maximum most models can handle without causing deformations.

  • Why is Euler a recommended for experimenting with the model?

    -Euler a is recommended for experimenting because it is fast and efficient for prompt engineering.

  • What is the purpose of increasing the batch size to 8 during rendering?

    -Increasing the batch size to 8 allows for the selection of multiple images, providing a variety of options to choose from.

  • Why is the sampler changed to DPM++ 2m in the workflow?

    -The sampler is changed to DPM++ 2M Karras for the image-to-image pass to bring out finer details. The image-to-image step itself is what maintains the composition while introducing some changes to the details of the image.

  • What is the recommended denoising strength value when using DPM++ 2m?

    -The recommended denoising strength value is between 0.4 and 0.7, with 0.7 for more changes and 0.4 for minimal changes.

  • How does the process of downscaling and upscaling affect the image quality?

    -The image is scaled down temporarily for the fix-up pass, which according to the video avoids losing too much detail. It is scaled back up afterwards so that the final image is crisp and detailed.

  • What is the role of the mask in the image editing process?

    -The mask is used to focus on specific areas of the image, such as the face, without affecting the rest of the image during the editing process.

  • Why is it important to check if the mask has been deleted before proceeding with edits?

    -Checking if the mask has been deleted is crucial because if it hasn't, it will affect the edits, potentially causing errors or unwanted changes to the image.

  • What upscaler is recommended for images rendered with ReV Animated?

    -The R-ESRGAN 4x+ Anime6B upscaler is recommended for images rendered with ReV Animated because it suits the model's semi-realistic look.

Outlines

00:00

🎨 'Best Workflow for Stable Diffusion'

The speaker introduces their preferred workflow for Stable Diffusion with the ReV Animated model, known for its semi-realistic renderings, and shares tips for using it effectively. The tutorial covers settings such as clip skip, resolution, and batch size. A prompt featuring a female astronaut and an exploding space station is prepared and will be shared on the speaker's Patreon page. The speaker recommends the Euler a sampler for prompt engineering and a 16:9 resolution for screen compatibility. They also discuss choosing the best image from a batch render and sending it to image-to-image, which preserves the composition while introducing changes to the details.
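The first-pass settings described above can also be written down as a request body for the web UI's optional HTTP API. This is only a sketch under assumptions: the video works in the browser UI, the prompt text is a placeholder for the one shared on Patreon, and `build_txt2img_payload` is a hypothetical helper. The field names are those of the `/sdapi/v1/txt2img` endpoint as I understand it, so check them against your installed version.

```python
import json


def build_txt2img_payload(prompt, negative_prompt=""):
    """Collect the first-pass settings from the video in one dict.

    Hypothetical helper; the keys mirror the web UI's txt2img API fields.
    """
    return {
        "prompt": prompt,
        "negative_prompt": negative_prompt,
        "sampler_name": "Euler a",   # fast sampler for prompt experiments
        "width": 768,                # widest size the video considers safe
        "height": 432,               # 16:9 together with width 768
        "batch_size": 8,             # eight candidates to choose from
        # Same option the clip skip slider controls in the UI.
        "override_settings": {"CLIP_stop_at_last_layers": 2},
    }


# Placeholder prompt, not the presenter's actual prompt.
payload = build_txt2img_payload("female astronaut, exploding space station")
print(json.dumps(payload, indent=2))
```

With the web UI started with the `--api` flag, a dict like this would be sent as JSON in a POST request. Nothing here contacts a server.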

05:02

🚀 'Refining and Enhancing the Image'

The second part focuses on refining the generated image by addressing issues such as a missing leg on the astronaut. The speaker shows how to use the Inpaint feature to fix such anomalies. They discuss scaling the image down for the denoising pass to preserve detail and then scaling it back up for a crisp image. The speaker also covers using denoising strength to control the level of change and batch count to produce several candidates. They conclude by selecting the most satisfactory image from the batch and describing the next step: using the R-ESRGAN 4x+ Anime6B upscaler to enhance the image's details. The speaker also mentions upcoming videos, including a tutorial on inpainting hands and one on common mistakes to avoid with Stable Diffusion.
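To make the role of the inpainting mask concrete, here is a conceptual sketch in plain Python. It is not how the web UI is implemented: real inpainting works on full images and blurs the mask edge. The tiny 2x3 "images" and the `composite` function are illustrative assumptions that show only the core idea, namely that pixels change where the mask is set and stay untouched everywhere else.

```python
def composite(original, regenerated, mask):
    """Take regenerated pixels where mask is 1, keep original pixels where it is 0."""
    return [
        [new if m else old for old, new, m in zip(o_row, n_row, m_row)]
        for o_row, n_row, m_row in zip(original, regenerated, mask)
    ]


# Toy grayscale data: 10 marks an original pixel, 99 a regenerated one.
original = [[10, 10, 10], [10, 10, 10]]
regenerated = [[99, 99, 99], [99, 99, 99]]

# Mask painted over the area to fix (for example, the missing leg).
mask = [[0, 1, 0], [0, 1, 1]]

result = composite(original, regenerated, mask)
print(result)
```

The same logic explains why a leftover mask from an earlier fix matters: any pixel still marked in the mask is regenerated again, whether or not that was intended.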

Keywords

💡Stable Diffusion

Stable Diffusion is a type of deep learning model used for generating images from textual descriptions. In the context of the video, it is the primary tool used for creating AI art. The presenter discusses a workflow that optimizes the use of Stable Diffusion for generating images, emphasizing its capabilities and how to get the best results from it.

💡Automatic 1111

Automatic 1111 (AUTOMATIC1111) is the widely used open-source web interface for running Stable Diffusion, not a version of the model itself. The video presents a workflow tailored to this interface, so the settings and menu names it mentions refer to the Automatic 1111 UI.

💡ReV Animated

ReV Animated is a semi-realistic Stable Diffusion model (checkpoint) that can produce beautiful renderings. The video uses this model to demonstrate the workflow, showcasing its ability to create detailed and aesthetically pleasing images, such as the scene with a female astronaut and a space station.

💡Clip Skip

Clip Skip is a setting that stops the CLIP text encoder a given number of layers before its end, so the prompt is interpreted from an earlier layer's output rather than the final one. The video explains how to add the Clip Skip slider to the user interface and suggests setting it to 2, which is part of the presenter's recommended workflow for optimizing image results.
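As a rough illustration of what the Clip Skip value selects, the sketch below treats the text encoder as a list of per-layer outputs. It assumes a 12-layer encoder, as in Stable Diffusion 1.x models, and uses strings in place of real tensors. `select_text_encoder_output` is a hypothetical name, not a function from the web UI.

```python
def select_text_encoder_output(hidden_states, clip_skip=1):
    """Return the layer output that a given clip skip value would use.

    clip_skip=1 means the last layer, clip_skip=2 the second to last, and so on.
    """
    if not 1 <= clip_skip <= len(hidden_states):
        raise ValueError("clip_skip must be between 1 and the number of layers")
    return hidden_states[-clip_skip]


# Stand-ins for the outputs of a 12-layer text encoder.
layers = [f"layer_{i}_output" for i in range(1, 13)]

default_output = select_text_encoder_output(layers, clip_skip=1)
video_setting = select_text_encoder_output(layers, clip_skip=2)
print(default_output, video_setting)
```

With the setting of 2 recommended in the video, the prompt embedding comes from the second-to-last layer in this simplified picture.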

💡Euler a

Euler a (Euler ancestral) is the sampling method the video recommends for prompt engineering, the process of refining textual prompts to guide the AI toward specific types of images. It is recommended for its speed when experimenting with different prompts, as it allows for quicker feedback and iteration.

💡Resolution

Resolution refers to the dimensions of the generated images. The video specifies a width of 768, described as the maximum most models can handle without causing deformations, and a height of 432. Together these give a 16:9 aspect ratio, chosen for screen compatibility.
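The arithmetic behind 768 by 432 can be checked with a few lines of Python. The helper below is a sketch, not something from the video: `height_for_width` is a hypothetical name, and rounding to a multiple of 8 is an assumption based on the size sliders in the web UI moving in steps of 8.

```python
def height_for_width(width, ratio_w=16, ratio_h=9, multiple=8):
    """Height matching the aspect ratio, rounded to the nearest multiple."""
    exact = width * ratio_h / ratio_w
    return int(round(exact / multiple)) * multiple


height = height_for_width(768)
print(768, height)  # 768 * 9 / 16 = 432
```

Because 768 * 9 / 16 is exactly 432, no rounding is needed for the size used in the video.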

💡Batch Size

Batch Size is the number of images generated in one rendering process. The video suggests increasing the batch size to 8 to have more options to choose from when selecting the best images from a render.

💡Denoising Strength

Denoising Strength is a parameter that controls the amount of change introduced during the image refinement process. The video recommends setting it between 0.4 and 0.7, with 0.7 resulting in more changes and 0.4 resulting in minimal changes, to find a balance between detail preservation and enhancement.

💡Image to Image

Image to Image is a mode within the Stable Diffusion workflow where the generated image is based on a provided image rather than being completely new. The video uses this mode to maintain the composition of the original image while introducing some changes, as controlled by the denoising strength.
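The refinement pass can be sketched as an image-to-image request body in the same spirit. Everything here is an assumption layered on the summary: the video uses the browser UI, `build_img2img_payload` is a hypothetical helper, the image bytes are a placeholder, and the keys are those of the `/sdapi/v1/img2img` endpoint as I understand it. The sampler name is the one used by web UI versions from the time of the video, and newer versions may list the scheduler separately.

```python
import base64


def build_img2img_payload(image_bytes, prompt, denoising_strength=0.5, scale=2,
                          base_width=768, base_height=432):
    """Build a refinement request: same picture, doubled size, partial re-noising."""
    if not 0.0 <= denoising_strength <= 1.0:
        raise ValueError("denoising_strength must be between 0 and 1")
    return {
        # The API expects the source image as a base64 string.
        "init_images": [base64.b64encode(image_bytes).decode("ascii")],
        "prompt": prompt,
        "sampler_name": "DPM++ 2M Karras",
        # 0.4 keeps the image close to the original, 0.7 allows more change.
        "denoising_strength": denoising_strength,
        "width": base_width * scale,
        "height": base_height * scale,
    }


# Placeholder bytes stand in for the PNG chosen from the batch.
img2img_payload = build_img2img_payload(b"placeholder image bytes",
                                        "female astronaut", 0.5)
print(img2img_payload["width"], img2img_payload["height"])
```

Keeping the denoising strength inside the 0.4 to 0.7 range from the video is left to the caller. The function only rejects values outside 0 to 1.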

💡DPM++ 2M Karras

DPM++ 2M is a sampler from the DPM-Solver++ family of sampling methods, and 'Karras' refers to the noise schedule it is paired with. The video suggests using DPM++ 2M Karras as the sampler to enhance image details like facial features.

💡Upscaling

Upscaling is the process of increasing the resolution of an image. The video mentions using the 'R-ESRGAN 4x+ Anime6B' upscaler to enhance the details of the final image, turning it into a high-resolution artwork suitable for display or further use.
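To see how the image size grows through the workflow, the sketch below follows the resolution from the first render to the final upscale. The doubling step comes from the video's resize-by-two in image-to-image. The factor of 4 for the last step is an assumption based on the upscaler's name, since the summary does not state which factor the presenter chose.

```python
def scaled_size(width, height, factor):
    """Multiply both dimensions by the same factor."""
    return width * factor, height * factor


base = (768, 432)                 # first render (txt2img)
refined = scaled_size(*base, 2)   # image-to-image, resized by two
final = scaled_size(*refined, 4)  # assumed 4x upscale
print(base, refined, final)
```

The aspect ratio stays 16:9 at every stage because both dimensions are always scaled by the same factor.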

Highlights

Introduction to the best workflow for Stable Diffusion in Automatic 1111.

Use of the ReV Animated model for semi-realistic renderings.

Setting clip skip to 2.

How to enable the clip skip slider in the settings.

Preparing a prompt featuring a female astronaut and an exploding space station.

Using Euler a for prompt engineering and experimentation.

Setting the width to 768 and the height to 432 to avoid deformations.

Choosing a 16:9 resolution for screen compatibility.

Increasing batch size to 8 for image selection.

Selecting the best image based on the explosion detail.

Changing the sampler to DPM++ 2M Karras for image refinement.

Resizing the image by two for better detail enhancement.

Setting denoising strength between 0.4 and 0.7 for controlled changes.

Choosing one image from the batch for further refinement.

Fixing image issues like the missing leg using Inpaint.

Downscaling and upscaling the image for crisp details.

Using a denoising strength of 0.45 for minor changes.

Finalizing the image with the correct leg and boot details.

Using the R-ESRGAN 4x+ Anime6B upscaler for final image enhancement.

Upcoming tutorial videos on inpainting hands and common mistakes with Automatic 1111.