Bridging the gap between complex scientific research and the curious minds eager to explore it.

Computation and Language, Computer Science

Experimental Evaluation of Dynamic Attention in Text Generation

The article examines how pre-trained T5 models behave on two text generation tasks, machine translation and summarization, when they are fed adversarial inputs. The authors propose a new approach that adapts adversarial attacks to generation: they modify the objectives of TextBugger and TextFooler, which were originally designed for classification. Instead of flipping a predicted label, the adapted attack perturbs the input text so as to minimize the BLEU score of the model's output, measured against the output the original static model produces for the clean text, which serves as the reference translation or summary. A sketch of this objective appears below.

Adversarial texts are crafted against the original static model using the TED Talk dataset (for translation) and the Gigaword dataset (for summarization), and are then passed through both the original static model and three dynamic models, the dynamic-attention variants under evaluation. The evaluation spans two popular task families: classification, using the Amazon (sentiment analysis), Twitter (toxic comment detection), Enron (spam detection), and Yelp (business reviews) datasets, and text generation, using TED Talk (talk transcripts) and Gigaword (headline generation). The results show that the adapted attacks outperform the baselines on both task families, demonstrating that adversarial attacks can be carried over effectively from classification to text generation.
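To make the adapted objective concrete, here is a minimal sketch of a greedy, BLEU-minimizing attack loop. This is an illustration of the general idea rather than the authors' implementation: the names `greedy_bleu_attack`, `translate`, and `perturb` are hypothetical, the search is a simple greedy word-substitution loop rather than the full TextBugger/TextFooler machinery, and scoring assumes the `sacrebleu` package.

```python
import sacrebleu


def bleu(hyp: str, ref: str) -> float:
    """Score one hypothesis against one reference with sacrebleu."""
    return sacrebleu.corpus_bleu([hyp], [[ref]]).score


def greedy_bleu_attack(source, translate, perturb, max_edits=3):
    """Greedily edit `source` so the victim model's output drifts away
    (in BLEU) from its own output on the clean input.

    translate: callable str -> str, the victim generation model
    perturb:   callable (tokens, i) -> list of candidate replacements
               for tokens[i] (synonym swaps, character typos, ...)
    """
    reference = translate(source)   # clean output doubles as the reference
    tokens = source.split()
    best_score = 100.0              # identical strings score BLEU 100

    for _ in range(max_edits):
        best_swap = None
        for i in range(len(tokens)):
            for cand in perturb(tokens, i):
                trial = tokens[:i] + [cand] + tokens[i + 1:]
                score = bleu(translate(" ".join(trial)), reference)
                if score < best_score:
                    best_score, best_swap = score, (i, cand)
        if best_swap is None:       # no single swap lowers BLEU further
            break
        i, cand = best_swap
        tokens[i] = cand            # commit the most damaging swap

    return " ".join(tokens)
```

The same loop works for summarization by swapping in a summarization model for `translate`; the real attacks additionally constrain the edits so the adversarial text stays close to the original.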
In summary, the article evaluates pre-trained T5 models on translation and summarization and proposes a new approach that adapts classification-oriented adversarial attacks to text generation. Tested across a range of datasets, the adapted attacks show promising results on both classification and text generation tasks, and the comparison between the static model and its dynamic counterparts sheds light on how dynamic attention holds up under attack. For readers who want to experiment, a minimal generation setup follows.
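The snippet below shows how a single pre-trained T5 checkpoint handles both tasks through task prefixes. It assumes the Hugging Face `transformers` library and the public `t5-small` checkpoint; the paper's exact model sizes and decoding settings are not given here, so treat this as a starting point rather than the authors' configuration.

```python
from transformers import pipeline

# One T5 checkpoint serves both tasks; the task is selected by a text prefix.
generator = pipeline("text2text-generation", model="t5-small")

translation = generator("translate English to German: The talk was inspiring.")
summary = generator(
    "summarize: Adversarial attacks built for text classifiers can be "
    "repurposed to degrade translation and summarization quality, using "
    "BLEU against the clean output as the attack target."
)

print(translation[0]["generated_text"])
print(summary[0]["generated_text"])
```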