Secure Data Deduplication in Cloud Storage Using Secret Sharing Schemes

In this article, we propose FASTEN, a novel cloud storage system designed to address the limitations of traditional cloud storage solutions. By combining network coding with deduplication, FASTEN provides a fault-tolerant and storage-efficient solution for data replication. The system balances replication against deduplication to minimize storage overhead while maintaining high availability. We present a detailed analysis of the system architecture and design principles, highlighting the key components and their functions. FASTEN offers several advantages over existing cloud storage systems, including improved reliability, scalability, and efficiency.

Introduction

Cloud computing has become an essential part of modern computing, offering benefits such as on-demand access, scalability, and cost-effectiveness. However, a significant challenge in cloud storage is ensuring data availability and integrity, particularly in the face of failures or attacks. To address this challenge, we propose FASTEN, a novel cloud storage system that combines network coding and deduplication to provide a fault-tolerant and storage-efficient solution.

System Architecture

FASTEN is organized as a hierarchy of layers, each with its own function and responsibilities (Fig. 1). The architecture has three main components: i) the Index Server, ii) the Data Node, and iii) Network Coding.

Index Server

The Index Server is the core component of FASTEN, responsible for managing the system’s metadata and ensuring data availability. It stores a hash table of all files in the system, along with their corresponding locations on the Data Nodes. The Index Server also maintains a subset of the most recent files to facilitate faster data retrieval (Fig. 2).
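
The lookup structure described above can be illustrated with a minimal sketch. It assumes the hash table maps a file hash to a list of Data Node identifiers, and that the "subset of the most recent files" behaves like a small recency-ordered cache. The class name IndexServer, its methods, and the recent_capacity parameter are hypothetical and do not come from the FASTEN paper.

```python
from collections import OrderedDict


class IndexServer:
    """Toy metadata index: file hash -> Data Node ids, plus a small recent-file subset."""

    def __init__(self, recent_capacity=2):
        self.locations = {}          # file hash -> list of Data Node ids
        self.recent = OrderedDict()  # recently used hashes, oldest first (assumed policy)
        self.recent_capacity = recent_capacity

    def _touch(self, file_hash):
        # Mark a hash as most recently used and evict the oldest entries if needed.
        self.recent[file_hash] = self.locations[file_hash]
        self.recent.move_to_end(file_hash)
        while len(self.recent) > self.recent_capacity:
            self.recent.popitem(last=False)

    def register(self, file_hash, nodes):
        """Record where a file lives; return False if the hash is already indexed."""
        is_new = file_hash not in self.locations
        if is_new:
            self.locations[file_hash] = list(nodes)
        self._touch(file_hash)
        return is_new

    def lookup(self, file_hash):
        """Return the Data Nodes holding the file, or None if it is unknown."""
        if file_hash in self.recent:
            self.recent.move_to_end(file_hash)
            return self.recent[file_hash]
        nodes = self.locations.get(file_hash)
        if nodes is not None:
            self._touch(file_hash)
        return nodes
```

In this sketch, a second register call with an already known hash returns False, which is the point at which a deduplicating system could skip storing the file again.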

Data Node

The Data Node is responsible for storing and managing the actual data files. It fragments the data into smaller blocks and stores them in a distributed manner across multiple nodes to ensure fault tolerance. The Data Node also maintains a memorization table to store the differences between the current and previous hashes of each block (Fig. 3).
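
As a rough illustration of the fragmentation step, the sketch below splits a byte string into fixed-size blocks and hashes each one. The changed_blocks helper is one possible reading of "differences between the current and previous hashes": it reports which block positions have a new hash. The function names are hypothetical, and the use of SHA-256 for block hashes here is an assumption.

```python
import hashlib


def fragment(data, block_size):
    """Split data into consecutive blocks of at most block_size bytes."""
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def block_hashes(blocks):
    """Hex SHA-256 digest of every block, in order."""
    return [hashlib.sha256(block).hexdigest() for block in blocks]


def changed_blocks(previous, current):
    """Indices whose hash differs from the previous version, or that are new."""
    return [
        i for i, digest in enumerate(current)
        if i >= len(previous) or previous[i] != digest
    ]
```

For example, comparing the hashes of an old and a new version of a file tells the node which blocks need to be rewritten, while unchanged blocks can be left in place.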

Network Coding

Network coding is used to generate redundant data blocks, which are stored on different Data Nodes. This technique ensures that if one Data Node fails, the other nodes can still recover the lost data by using the redundant information (Fig. 4).
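
The recovery idea can be shown with the simplest possible coded block: a single XOR combination of equal-length data blocks. The summary does not state which code FASTEN actually uses, so this is only a sketch of the principle, and the function names are hypothetical.

```python
def xor_blocks(a, b):
    """Byte-wise XOR of two equal-length blocks."""
    if len(a) != len(b):
        raise ValueError("blocks must have equal length")
    return bytes(x ^ y for x, y in zip(a, b))


def encode(blocks):
    """Return the data blocks followed by one XOR-coded redundant block."""
    coded = blocks[0]
    for block in blocks[1:]:
        coded = xor_blocks(coded, block)
    return list(blocks) + [coded]


def recover(stored, lost_index):
    """Rebuild the block at lost_index by XOR-ing all surviving blocks."""
    survivors = [b for i, b in enumerate(stored) if i != lost_index]
    rebuilt = survivors[0]
    for block in survivors[1:]:
        rebuilt = xor_blocks(rebuilt, block)
    return rebuilt
```

If each element of the encoded list is placed on a different Data Node, the loss of any one node can be repaired from the others. Tolerating more than one simultaneous failure would require additional, independent coded blocks.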

Data Processing

FASTEN utilizes several techniques for data processing, including i) convergent encryption, ii) AES-256 symmetric cryptography, iii) file fragmentation based on a specified block size, and iv) authentication tag generation using the SHA-256 algorithm. These techniques ensure data privacy, integrity, and authenticity (Fig. 5).
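
The four steps can be strung together as in the sketch below, under several assumptions: the convergent key is the SHA-256 digest of the plaintext, encryption happens before fragmentation, and each tag is the SHA-256 digest of a ciphertext block. Because the Python standard library has no AES implementation, the cipher here is a clearly labeled stand-in (an XOR keystream built from SHA-256) that is not secure and is not what FASTEN uses; a real system would call an AES-256 implementation at that point. All function names are hypothetical.

```python
import hashlib


def convergent_key(plaintext):
    """Derive a 256-bit key from the content itself (convergent encryption)."""
    return hashlib.sha256(plaintext).digest()


def toy_stream_cipher(key, data):
    """STAND-IN for AES-256, NOT secure: XOR with a SHA-256 counter keystream."""
    out = bytearray()
    for offset in range(0, len(data), 32):
        counter = (offset // 32).to_bytes(8, "big")
        pad = hashlib.sha256(key + counter).digest()
        out.extend(c ^ p for c, p in zip(data[offset:offset + 32], pad))
    return bytes(out)


def process(plaintext, block_size):
    """Encrypt, fragment, and tag a file; returns (key, blocks, tags)."""
    key = convergent_key(plaintext)
    ciphertext = toy_stream_cipher(key, plaintext)
    blocks = [ciphertext[i:i + block_size]
              for i in range(0, len(ciphertext), block_size)]
    tags = [hashlib.sha256(block).hexdigest() for block in blocks]
    return key, blocks, tags
```

The property that matters for deduplication is that identical plaintexts yield identical keys, ciphertext blocks, and tags, so the storage system can detect duplicates without seeing the plaintext.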

Redundancy

To minimize storage overhead while ensuring high availability, FASTEN employs a redundancy scheme that uses the hashes of data blocks to place them on multiple Data Nodes. The system also maintains a memorization table to store the differences between the current and previous hashes of each block (Fig. 6).
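
One way to picture hash-driven placement is sketched below. It assumes the block hash selects a starting node and that copies go to the next nodes in a fixed ring order; a block whose hash is already present on a node is not written again. The place and store functions, the replicas parameter, and the ring policy are illustrative assumptions, not details from the FASTEN paper.

```python
import hashlib


def place(block_hash, nodes, replicas):
    """Pick `replicas` distinct nodes deterministically from a hex block hash."""
    if replicas > len(nodes):
        raise ValueError("not enough nodes for the requested replicas")
    start = int(block_hash, 16) % len(nodes)
    return [nodes[(start + i) % len(nodes)] for i in range(replicas)]


def store(blocks, nodes, replicas, cluster):
    """Write blocks into cluster (node -> {hash: block}); return copies written."""
    written = 0
    for block in blocks:
        digest = hashlib.sha256(block).hexdigest()
        for node in place(digest, nodes, replicas):
            if digest not in cluster[node]:  # duplicate block: skip the write
                cluster[node][digest] = block
                written += 1
    return written
```

With two replicas, storing the blocks x, y, x writes four copies in total rather than six, because the repeated block maps to the same nodes and is detected as a duplicate.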

System Design Principles

FASTEN is designed based on several principles, including i) distribution, ii) fault tolerance, iii) data privacy, iv) scalability, v) efficiency, and vi) simplicity. By incorporating these design principles, FASTEN provides a robust and reliable cloud storage solution (Fig. 7).

Conclusion

In this article, we proposed FASTEN, a novel cloud storage system that combines network coding and deduplication to provide a fault-tolerant and storage-efficient solution for data replication. By combining these techniques, FASTEN minimizes storage overhead while ensuring high availability and data integrity. The system offers several advantages over existing cloud storage systems, including improved reliability, scalability, and efficiency.