Bridging the gap between complex scientific research and the curious minds eager to explore it.

Computer Science, Social and Information Networks

Improved Accuracy in Emotion Recognition through Automatic Segmentation and Transcription: A Comparative Study

In this article, the authors analyze short videos on social media platforms using multimodal emotion analysis. They define selection criteria for the videos: each clip must feature one or two main characters, contain clear speech in a single language, and run under three minutes. The audio segments are then fed into a Whisper model for transcription, and the resulting text is analyzed for emotion with a multimodal emotion analysis method.
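The selection step described above can be sketched in code. The class and field names below are illustrative assumptions rather than the authors' implementation, and the transcription step is shown only as a commented-out usage of the openai-whisper Python package.

```python
from dataclasses import dataclass

# Illustrative sketch of the paper's selection criteria; the field names
# are assumptions, not taken from the authors' implementation.
@dataclass
class ShortVideo:
    num_main_characters: int   # main characters appearing in the clip
    single_language: bool      # speech is clear and in one language
    duration_seconds: float    # total clip length in seconds

def passes_selection_criteria(v: ShortVideo) -> bool:
    """Keep clips with one or two main characters, clear speech in a
    single language, and a duration under three minutes."""
    return (
        1 <= v.num_main_characters <= 2
        and v.single_language
        and v.duration_seconds < 180
    )

# Transcription step (hypothetical usage of the openai-whisper package):
#   import whisper
#   model = whisper.load_model("base")
#   text = model.transcribe("clip_audio.wav")["text"]

print(passes_selection_criteria(ShortVideo(2, True, 95.0)))   # True
print(passes_selection_criteria(ShortVideo(3, True, 95.0)))   # False
```

Clips that pass the filter would then have their audio extracted and transcribed before the emotion analysis stage.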
The authors explain that short videos tend to convey simple but strong emotions, a consequence of their fragmented mode of transmission and their creators' incentive to attract likes and comments. They argue that multimodal emotion analysis on short videos can therefore yield more accurate results than on traditional videos, making short videos a valuable resource for understanding public attitudes and anticipating future opinions.
The authors introduce the concept of multimodal data, which combines modalities such as audio, video, and text for emotion analysis. They emphasize the importance of considering the context in which short videos are created and disseminated on social media platforms.
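The article does not detail how the modalities are combined, but a common baseline for multimodal emotion analysis is late fusion, where each modality is scored separately and the scores are merged by weighted averaging. The sketch below is a minimal illustration under that assumption, not the authors' method.

```python
def late_fusion(scores_by_modality, weights=None):
    """Combine per-modality emotion score dicts (e.g. from audio, video,
    and text classifiers) into one distribution by weighted averaging.
    This is a generic late-fusion baseline, not the paper's method."""
    modalities = list(scores_by_modality)
    if weights is None:
        # Default: weight every modality equally.
        weights = {m: 1.0 / len(modalities) for m in modalities}
    emotions = {e for scores in scores_by_modality.values() for e in scores}
    return {
        e: sum(weights[m] * scores_by_modality[m].get(e, 0.0)
               for m in modalities)
        for e in emotions
    }

# Hypothetical per-modality outputs for one short video:
fused = late_fusion({
    "audio": {"joy": 0.6, "anger": 0.4},
    "video": {"joy": 0.8, "anger": 0.2},
    "text":  {"joy": 0.7, "anger": 0.3},
})
# With equal weights, fused["joy"] is the mean of 0.6, 0.8, 0.7 (≈ 0.7).
```

More sophisticated fusion schemes (e.g. learned attention over modalities) follow the same idea: the text transcribed by Whisper contributes one stream of evidence alongside the audio and visual streams.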
In summary, the article presents a new approach to analyzing short videos on social media through multimodal emotion analysis. By combining audio, video, and text data, the method can deliver more accurate results than traditional video analysis, making it a valuable tool for understanding public attitudes and anticipating future opinions.