Bridging the gap between complex scientific research and the curious minds eager to explore it.

Computation and Language, Computer Science

DP-Prompt: Private Document Paraphrasing via Differentially Private Text Generation


In this article, we explore de-anonymization attacks on text, in which an adversary recovers an author's identity from their writing, and propose DP-Prompt, a differentially private (DP) approach to defending against them. By prompting a large language model fine-tuned for paraphrasing, we generate sanitized versions of documents that conceal the author's identity while preserving the documents' meaning. The method combines the strengths of word-level and sentence-level sanitization techniques and overcomes their limitations by taking contextual information into account.
We begin by discussing why de-anonymization attacks are hard to defend against and where existing DP text-sanitization methods fall short, in particular that word-level mechanisms operate without the surrounding context. We then introduce DP-Prompt, which fine-tunes a language model for paraphrasing and uses it to generate sanitized versions of whole documents, incorporating contextual information to produce more accurate paraphrases.
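To make the generation step concrete, here is a minimal Python sketch of one common way to make next-token sampling differentially private: clip the model's logits to a bounded range and sample with a temperature, an instance of the exponential mechanism. The article does not spell out DP-Prompt's exact mechanism, so the clipping bounds, the per-token epsilon accounting, and names such as `dp_sample_token` below are illustrative assumptions rather than the paper's implementation; in practice the logits would come from the paraphrasing model at each decoding step.

```python
import torch

def dp_sample_token(logits: torch.Tensor, temperature: float,
                    clip_min: float = -5.0, clip_max: float = 5.0) -> int:
    """Sample the next token from clipped, temperature-scaled logits.

    Clipping bounds how much any single input document can shift a logit,
    so temperature sampling over the clipped scores behaves like an
    exponential-mechanism release of the next token.
    """
    clipped = logits.clamp(clip_min, clip_max)
    probs = torch.softmax(clipped / temperature, dim=-1)
    return int(torch.multinomial(probs, num_samples=1).item())

def per_token_epsilon(clip_min: float, clip_max: float, temperature: float) -> float:
    # Exponential-mechanism style accounting (illustrative): epsilon grows
    # with the logit range and shrinks as the temperature increases.
    return 2.0 * (clip_max - clip_min) / temperature

if __name__ == "__main__":
    vocab_size = 32_000
    fake_logits = torch.randn(vocab_size)  # stand-in for the paraphraser's logits
    token_id = dp_sample_token(fake_logits, temperature=1.5)
    print(token_id, per_token_epsilon(-5.0, 5.0, 1.5))
```

Under this view, raising the temperature strengthens privacy (smaller per-token epsilon) but makes the paraphrase noisier, which is exactly the privacy-utility trade-off the evaluation below measures.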
We evaluate DP-Prompt against several state-of-the-art methods, varying the privacy level, the amount of fine-tuning, and the generation settings. Our results show that DP-Prompt outperforms existing techniques in terms of both accuracy and privacy guarantees. A further series of experiments analyzes the approach in different scenarios and shows that it generates sanitized documents that remain accurate while protecting user privacy.
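As a rough illustration of how empirical privacy can be measured in this setting, the sketch below trains an authorship-attribution attacker on the original documents and checks how often it still identifies the author from the sanitized paraphrases; lower attacker accuracy indicates stronger anonymization. This is a generic evaluation protocol, assuming scikit-learn; the attacker choice and the name `deanonymization_accuracy` are illustrative and not taken from the paper.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

def deanonymization_accuracy(train_docs, train_authors,
                             sanitized_docs, sanitized_authors):
    """Train an authorship-attribution attacker on original documents and
    score it on sanitized paraphrases of held-out documents.

    A lower score means the paraphrases conceal authorship more effectively.
    """
    attacker = make_pipeline(
        TfidfVectorizer(ngram_range=(1, 2), min_df=2),  # word n-gram style features
        LogisticRegression(max_iter=1000),
    )
    attacker.fit(train_docs, train_authors)
    return attacker.score(sanitized_docs, sanitized_authors)
```

Utility can be checked symmetrically, for example by how well a downstream classifier trained on original documents performs on the sanitized ones.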
In conclusion, DP-Prompt offers a promising defense against de-anonymization attacks. By prompting a large language model fine-tuned for paraphrasing, it combines the strengths of word-level and sentence-level techniques while taking context into account, and it outperforms existing approaches on both accuracy and privacy.