Bridging the gap between complex scientific research and the curious minds eager to explore it.

Computer Science, Computer Vision and Pattern Recognition

Learning to Redraw Non-Standard Hands in Stable Diffusion Generated Images


In this article, we explore the challenge of generating realistic hands in images produced by Stable Diffusion, a popular text-to-image diffusion model. The problem arises when generated hands do not resemble human hands closely enough, producing an uncanny valley effect. To address this, we propose a method for detecting and restoring non-standard hands in Stable Diffusion images.
Our approach uses control images to guide the regeneration of realistic hands. These control images are created by redrawing samples from a dataset so that the hand's pose and position are accurately represented. We also supply negative prompts to steer the model away from unwanted outcomes, such as deformed hands or low-quality output.
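To make the guidance step concrete, here is a minimal sketch of how a control image, an inpainting mask, and a negative prompt might be bundled into a single request for a diffusion inpainting call. The function name and the exact negative-prompt wording are illustrative assumptions, not taken from the paper:

```python
def build_redraw_request(prompt, control_image, hand_mask):
    """Assemble the inputs for redrawing a non-standard hand.

    `control_image` is assumed to encode the corrected hand pose,
    and `hand_mask` marks the region to be regenerated; both names
    are hypothetical placeholders for this sketch.
    """
    # A negative prompt discourages common hand artifacts; the exact
    # terms here are illustrative, not the paper's list.
    negative_prompt = "deformed hands, extra fingers, blurry, low quality"
    return {
        "prompt": prompt,
        "negative_prompt": negative_prompt,
        "control_image": control_image,  # guides hand pose/position
        "mask": hand_mask,               # region the model will redraw
    }

request = build_redraw_request("a person waving", "pose.png", "mask.png")
print(request["negative_prompt"])
```

In a real system this dictionary would be passed to a diffusion inpainting pipeline that accepts control images (e.g., a ControlNet-style conditioning path); the sketch only shows how the three guidance signals fit together.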
To visualize the process, Figure 11 shows samples from the control image generation together with the resulting union mask, which highlights where the detected non-standard hands differ from realistic ones.
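A union mask of this kind can be built by taking the pixelwise OR of the individual binary hand masks. The following NumPy sketch is an assumption about the implementation, since the article does not give code:

```python
import numpy as np

def union_mask(masks):
    """Combine per-hand binary masks into one union mask.

    Each element of `masks` is a 2-D array where nonzero pixels
    mark a detected non-standard hand region.
    """
    out = np.zeros_like(masks[0], dtype=bool)
    for m in masks:
        out |= m.astype(bool)  # pixelwise OR accumulates all regions
    return out.astype(np.uint8)

# Two toy 2x2 masks, each flagging one pixel
a = np.array([[1, 0], [0, 0]], dtype=np.uint8)
b = np.array([[0, 0], [0, 1]], dtype=np.uint8)
print(union_mask([a, b]))  # → [[1 0]
                           #    [0 1]]
```

The union mask then serves as the inpainting region, so that every flagged hand is redrawn in a single pass.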
Our proposed method improves the accuracy and realism of hand representation in Stable Diffusion images, making them more suitable for applications such as augmented reality, virtual reality, and gaming. By detecting and restoring non-standard hands, we can create a more seamless and realistic experience for users, avoiding the uncanny valley effect.
In summary, this article presents a method for improving hand representation in Stable Diffusion images by detecting non-standard hands and redrawing them under the guidance of control images and negative prompts. This has significant implications for applications that depend on realistic hands, such as AR, VR, and gaming.