Computer Science, Sound

Evaluating Generative Models’ Quality Without Training Data: Perceptual Metrics as an Alternative

Generative models are increasingly deployed in production systems, but evaluating the quality of their output remains difficult. The standard approach is to collect human ratings, which is slow and expensive. This survey looks for objective metrics that track human judgments of quality for generative models in vision and audio.

Objective Metrics

The authors argue that traditional training objectives such as mean squared error correlate poorly with human quality ratings. Instead, they propose perceptual metrics, which either exploit structural information common to natural signals or mimic the neurological pathways involved in perception. One example is the normalized Laplacian pyramid, which splits an image into band-pass layers at several scales and normalizes each layer by its local contrast, giving a compact representation that loosely models the early stages of human vision.
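To make the normalized Laplacian pyramid more concrete, here is a minimal sketch in Python with NumPy. It is an illustration under simplifying assumptions, not the authors' implementation: the function names, the 2x2 block averaging used as a low-pass filter, the nearest-neighbour upsampling, and the constant `eps` are all hypothetical choices made for brevity, whereas published versions use carefully designed filters and fitted normalization parameters.

```python
import numpy as np

def downsample(img):
    """Halve resolution by averaging 2x2 blocks (crude low-pass + decimation)."""
    h, w = img.shape
    img = img[: h - h % 2, : w - w % 2]
    return img.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def upsample(img, shape):
    """Nearest-neighbour upsampling to `shape`, edge-padding odd sizes."""
    up = img.repeat(2, axis=0).repeat(2, axis=1)
    pad = ((0, shape[0] - up.shape[0]), (0, shape[1] - up.shape[1]))
    return np.pad(up, pad, mode="edge")

def normalized_laplacian_pyramid(img, levels=3, eps=0.1):
    """Return `levels` normalized band-pass layers plus a low-pass residual."""
    current = np.asarray(img, dtype=np.float64)
    bands = []
    for _ in range(levels):
        low = downsample(current)
        # Laplacian layer: what the coarser scale fails to explain.
        band = current - upsample(low, current.shape)
        # Divisive normalization (assumed form): scale each coefficient by
        # the average magnitude of its 2x2 neighbourhood.
        local_mag = upsample(downsample(np.abs(band)), band.shape)
        bands.append(band / (eps + local_mag))
        current = low
    bands.append(current)  # low-pass residual, left unnormalized here
    return bands

def pyramid_distance(a, b, levels=3):
    """Average root-mean-square difference between corresponding layers."""
    pa = normalized_laplacian_pyramid(a, levels)
    pb = normalized_laplacian_pyramid(b, levels)
    return float(np.mean([np.sqrt(np.mean((x - y) ** 2)) for x, y in zip(pa, pb)]))

# Toy usage on synthetic data (not real images).
rng = np.random.default_rng(0)
reference = rng.random((32, 32))
noisy = reference + 0.1 * rng.standard_normal((32, 32))
print(pyramid_distance(reference, reference))  # 0.0
print(pyramid_distance(reference, noisy) > 0)  # True
```

The distance is computed in the normalized pyramid domain rather than on raw pixels, which is the point of the approach: errors are weighted relative to the local contrast around them instead of being counted equally everywhere.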

Perceptual Metrics

Two widely used perceptual metrics illustrate the idea. Structural similarity (SSIM) compares two images in terms of luminance, contrast, and structure rather than raw pixel differences. Multiscale structural similarity (MS-SSIM) applies the same comparison at several resolutions and combines the results, which tends to match human ratings of image quality more closely than a single-scale comparison.
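The sketch below illustrates structural similarity and its multiscale variant, and shows a case where mean squared error cannot tell two distortions apart. It is a simplified illustration, not the survey's code: the function names are hypothetical, the statistics are computed over the whole image instead of the local sliding windows used in standard SSIM, downsampling is plain 2x2 averaging, and negative terms are clamped to zero before exponentiation. The constants 0.01 and 0.03 and the five scale weights are the values commonly quoted for SSIM and MS-SSIM.

```python
import numpy as np

MS_WEIGHTS = (0.0448, 0.2856, 0.3001, 0.2363, 0.1333)  # commonly used MS-SSIM weights

def ssim_terms(x, y, data_range=1.0):
    """Return the luminance term and the combined contrast-structure term."""
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2
    mu_x, mu_y = x.mean(), y.mean()
    cov = ((x - mu_x) * (y - mu_y)).mean()
    luminance = (2 * mu_x * mu_y + c1) / (mu_x**2 + mu_y**2 + c1)
    contrast_structure = (2 * cov + c2) / (x.var() + y.var() + c2)
    return luminance, contrast_structure

def global_ssim(x, y, data_range=1.0):
    """SSIM from whole-image statistics (assumption: no local windows)."""
    luminance, contrast_structure = ssim_terms(x, y, data_range)
    return float(luminance * contrast_structure)

def downsample(img):
    """Halve resolution by averaging 2x2 blocks."""
    h, w = img.shape
    img = img[: h - h % 2, : w - w % 2]
    return img.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def global_ms_ssim(x, y, data_range=1.0, weights=MS_WEIGHTS):
    """Weighted product of contrast-structure terms across scales,
    with the luminance term included only at the coarsest scale."""
    score = 1.0
    for i, weight in enumerate(weights):
        luminance, contrast_structure = ssim_terms(x, y, data_range)
        last = i == len(weights) - 1
        term = luminance * contrast_structure if last else contrast_structure
        score *= max(term, 0.0) ** weight  # clamp negatives (assumption)
        if not last:
            x, y = downsample(x), downsample(y)
    return float(score)

def mse(a, b):
    return float(np.mean((a - b) ** 2))

# Two distortions of a synthetic pattern, built to have the same MSE.
rng = np.random.default_rng(0)
yy, xx = np.mgrid[0:64, 0:64]
reference = 0.5 + 0.25 * np.sin(xx / 4.0) * np.cos(yy / 6.0)
shift = 0.1
brighter = reference + shift                      # uniform brightness change
noise = rng.standard_normal(reference.shape)
noise -= noise.mean()
noise *= shift / np.sqrt(np.mean(noise**2))       # rescale to RMS == shift
noisy = reference + noise                         # additive noise

print(np.isclose(mse(reference, brighter), mse(reference, noisy)))       # True
print(global_ssim(reference, brighter) > global_ssim(reference, noisy))  # True
```

Both distorted images are equally far from the reference in mean squared error, yet the structural metrics score the uniform brightness change as much closer to the original than the noisy version, because the brightness change leaves contrast and structure untouched. For real images, a tested library implementation such as `skimage.metrics.structural_similarity` in scikit-image is a better choice than this sketch.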

Acknowledgments

The authors acknowledge the support of various funding agencies and thank their colleagues for their contributions to the field.

Conclusion

In conclusion, evaluating generative models for vision and audio is crucial in today’s production systems. Human ratings remain the reference point, but objective metrics offer a faster and cheaper way to assess these models. Perceptual metrics, which build on structural information common across natural signals or on the neurological pathways used in perception, agree with human judgments more closely than objectives such as mean squared error, which makes them a practical alternative for evaluating generative models.