The different types of distributed storage system architectures

Distributed storage architectures are numerous and span widely different scales, from a few servers to tens of thousands. Some storage systems are composed of loosely connected anonymous servers, while others are centralized organizations with a single master node that has full knowledge of and authority over the cluster. This section proposes a classification of storage clusters into five main families. This categorization is by no means the only possible one, and some systems may fit several families, but it has the merit of underlining the strengths and limitations of each family. Because the storage systems adopting these architectures are varied (filesystems, SQL databases, key-value stores, object stores, …), for the remainder of this chapter we use the term data object for the basic piece of data on which these systems operate (a file, a database entry, a key-value pair, an object, …).

Network Attached Storage (NAS) and Storage Area Network (SAN)

A NAS is the most basic form of distributed storage. It is simply a storage device accessible through a Local Area Network (LAN), usually through an operating system running on a host server. It is commonly used to share data among users or computers, and as a storage backend for devices with little local storage. The Network File System (NFS) [35], Server Message Block (SMB) [36] and Apple Filing Protocol (AFP) [37] are the most popular protocols used to interact with a NAS. Even though the first version of NFS was developed in 1984, these protocols are still widely used today. Clients typically access a NAS as a remote filesystem exposed through a local mount point. More intricate multi-NAS systems include the Andrew File System (AFS) [38] and the filesystems it inspired, such as OpenAFS [39], the Coda File System [40], Intermezzo [41], etc. While most NAS are regularly backed up, thus offering some reliability in the face of hardware failure, they usually consist of a single storage device or server, which is a throughput and capacity bottleneck. Furthermore, data transfers take place on the LAN and, depending on the use case, can consume a large share of the available throughput.
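To illustrate this transparency, here is a minimal Python sketch that writes and reads a file on a NAS through its mount point exactly as it would on a local disk. The mount point /mnt/nas, the export name and the file name are hypothetical and chosen for illustration only; they are not taken from any particular deployment.

    import os

    # Once an NFS export is mounted, e.g. (as root)
    #   mount -t nfs nas.example.org:/export /mnt/nas
    # the NAS appears as an ordinary directory; "nas.example.org" and
    # "/mnt/nas" are hypothetical names used only for this sketch.
    NAS_MOUNT = "/mnt/nas"

    def write_shared_file(name: str, data: bytes) -> None:
        """Write a file on the NAS as if it were local storage; the NFS
        client in the kernel forwards the operations over the LAN."""
        with open(os.path.join(NAS_MOUNT, name), "wb") as f:
            f.write(data)

    def read_shared_file(name: str) -> bytes:
        """Read the file back; any machine mounting the same export sees it."""
        with open(os.path.join(NAS_MOUNT, name), "rb") as f:
            return f.read()

    if __name__ == "__main__":
        write_shared_file("report.txt", b"shared via the NAS\n")
        print(read_shared_file("report.txt"))

The point of the sketch is that no NAS-specific API is involved: the distribution is handled entirely below the filesystem interface.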
These limitations led, in the 1990s, to the emergence of Storage Area Network (SAN) technologies. A SAN is a network purely dedicated to storage. It is composed of several storage devices interconnected through dedicated links, most often Fibre Channel or Ethernet, with the servers running the Fibre Channel Protocol (FCP) or the Internet Small Computer System Interface (iSCSI) protocol. While SANs offer high performance, they are not flexible and require a dedicated infrastructure that is hard to scale. Unlike a NAS, a SAN operates at the block level rather than at the filesystem level.
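To make the block-level distinction concrete, the following sketch reads one block by offset from a hypothetical iSCSI-attached volume, which the host sees as a raw block device. The device path /dev/sdb and the 4 KB block size are illustrative assumptions; any notion of files or directories has to come from a filesystem that the host layers on top of the device.

    import os

    # An iSCSI LUN or Fibre Channel volume typically appears on the host as a
    # raw block device such as /dev/sdb (hypothetical path for this sketch).
    BLOCK_DEVICE = "/dev/sdb"
    BLOCK_SIZE = 4096  # the SAN only ships fixed-size blocks, not files

    def read_block(block_index: int) -> bytes:
        """Read one block at the given index directly from the device."""
        fd = os.open(BLOCK_DEVICE, os.O_RDONLY)
        try:
            os.lseek(fd, block_index * BLOCK_SIZE, os.SEEK_SET)
            return os.read(fd, BLOCK_SIZE)
        finally:
            os.close(fd)

    # A local filesystem (ext4, XFS, ...) created on top of the device is what
    # turns these raw blocks back into files; the SAN itself is unaware of them.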

Master-Slaves architectures

Where DHTs provide a distributed and consistent way to assign data objects to servers, master-slaves architectures put the burden of data placement and indexing on a master server. Requests to write or read data from the system must go through this master, either to decide where the data will be written or to find out where it is stored. Because data placement is not tied to consistent hashing, it can be flexible and allow for more fine-grained policies than DHTs. Furthermore, having a single master reduces the complexity of concurrent operations, since the master can keep track of who is interacting with which data object. For this reason, distributed filesystems often rely on master-slaves architectures, since operations on files can range from simple file creation to random writes at different places in a file. The Google File System (GFS) [24] and its open-source counterpart, the Hadoop Distributed File System (HDFS) [29], adopt this architecture, as shown in figure 1.6. However, these single-master architectures have an obvious scalability limitation. While efforts can be made to increase the number of concurrent clients or the amount of data that a single master server can handle, as in GFS where numerous optimisations were made (such as very large filesystem blocks – several MB instead of the traditional 4 KB – and aggressive prefetching), there comes a point where the master cannot handle any more clients. Moreover, the single master is a Single Point Of Failure (SPOF). For this reason the master is often replicated on inactive masters ready to serve as fallbacks, but the transition period when a master fails can lead to cluster inactivity. These limitations have been observed and discussed both for GFS and HDFS [30, 31], pushing Hadoop to develop a distributed master layer for HDFS [53] and Google to develop a new distributed filesystem named Colossus [54] – on which there is no public information. There are many other single-master filesystems, such as QFS [55], GPFS [56], the Global File System [57]…
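The simplified Python sketch below illustrates the read path of a generic master-slaves design in the spirit of GFS and HDFS, using invented class and method names rather than either system's actual API: the client first asks the master where a data object is located, then fetches the data directly from a storage node, so that bulk transfers do not go through the master.

    from dataclasses import dataclass, field
    from typing import Dict, List

    @dataclass
    class Master:
        """Single master holding the namespace: object name -> storage nodes.
        (Illustrative only; real systems also track chunk IDs, replicas,
        leases, etc.)"""
        index: Dict[str, List[str]] = field(default_factory=dict)

        def locate(self, name: str) -> List[str]:
            # Every read or write first consults the master for placement,
            # which is why it can become a bottleneck and a SPOF.
            return self.index[name]

    @dataclass
    class StorageNode:
        """A slave node that actually stores the data objects."""
        address: str
        store: Dict[str, bytes] = field(default_factory=dict)

    def read_object(master: Master, nodes: Dict[str, StorageNode], name: str) -> bytes:
        locations = master.locate(name)          # 1) metadata path, via the master
        return nodes[locations[0]].store[name]   # 2) data path, direct to a slave

    # Minimal usage: one master, two storage nodes, one replicated object.
    nodes = {addr: StorageNode(addr) for addr in ("10.0.0.1", "10.0.0.2")}
    for n in nodes.values():
        n.store["movie.bin"] = b"...chunk data..."
    master = Master(index={"movie.bin": ["10.0.0.1", "10.0.0.2"]})
    assert read_object(master, nodes, "movie.bin") == b"...chunk data..."

Keeping the data path off the master is what allows such systems to scale in throughput even though the metadata path remains centralized.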

Table of contents :

Declaration of Authorship
Abstract
Acknowledgements
1 What you should know about distributed storage systems 
1.1 The different types of distributed storage system architectures
1.1.1 Network Attached Storage (NAS) and Storage Area Network (SAN)
1.1.2 Peer-to-Peer (P2P) networks
1.1.3 Distributed Hash Tables (DHTs)
1.1.4 Master-Slaves architectures
1.1.5 Summary
1.2 Reliability in distributed storage systems
1.2.1 Mirroring
1.2.2 Replication
1.2.3 Erasure Codes
1.2.4 Erasure codes and replication: what is the trade-off?
1.3 Consistency and consensus
1.3.1 Theoretical frameworks
Consistency and Availability: the CAP theorem
Database characteristics: ACID and BASE
Client-centric and data-centric consistency models
1.3.2 Consensus and consistency: how to reach it
Consensus algorithms: Paxos and Raft
Latency and Consistency, the (N,W,R) quorum model
1.4 Examples of distributed storage systems
2 6Stor 
2.1 Why we built 6Stor from scratch
2.1.1 Software layering
2.1.2 Architectural reasons
2.1.3 Ceph
2.1.4 GFS
2.1.5 Scaling the metadata layer and embracing the heterogeneity
2.2 6Stor architecture
2.2.1 Architecture Description
2.2.2 Attributing IPv6 prefixes to MNs
2.2.3 6Stor: An IPv6-centric architecture
2.2.4 Description of basic operations
2.2.5 Consistency
2.3 Expanding or shrinking the cluster without impacting the cluster’s performance
2.3.1 Storage Nodes
2.3.2 Metadata Nodes
2.3.3 Availability and data transfer
2.4 Coping with failures: reliability and repair model
2.4.1 Reliability
2.4.2 Reacting to failures
Short failure
Definitive failure
Voluntary shutdown and maintenance
Maintaining reliability
2.5 Considerations on the Architecture
2.5.1 Client and Cluster Configuration
2.5.2 Layer of Indirection
2.5.3 Scalability
2.5.4 Metrology and Analytics
2.5.5 Limitations
2.6 Experimental Evaluation
2.6.1 Rationale
2.6.2 Setup and Protocol
2.6.3 Results
2.6.4 Get Tests
2.6.5 Post Tests
2.6.6 CPU consumption analysis
2.6.7 Performance impact of HTTP
Protocol
Results
2.7 Conclusion
3 6Stor extensions 
3.1 Building a block device on 6Stor
3.1.1 Different implementations
3.1.2 A note on caching and consistency
3.1.3 Performance benchmark
3.2 Adapting 6LB to 6Stor
3.2.1 Load balancing in distributed storage systems
3.2.2 Segment-routing load-balancing
3.2.3 Adapting 6LB to 6Stor
3.2.4 Consequences on consistency
3.3 Conclusion
4 Request Scheduler for Storage Systems (RS3) 
4.1 Related work
4.1.1 Packet scheduling
4.1.2 I/O scheduling
4.1.3 System-wide scheduling
4.2 Designing RS3
4.2.1 Typical storage server implementation
4.2.2 RS3’s rationales
4.2.3 RS3’s batch budget allocation algorithm
4.3 First evaluation and analysis
4.3.1 Experimental protocol
4.3.2 Throughput fairness results
4.3.3 Response time results
4.3.4 Throughput results
4.4 Using Linux filesystem mechanisms to improve RS3
4.4.1 Sending hints to the kernel
4.4.2 Response time and throughput results
4.5 Going further with RS3
4.5.1 Evaluating batch budget’s impact on RS3’s performance
4.5.2 Tweaking RS3 to enforce policies: Weighted-RS3
4.5.3 Considerations on RS3 and its current implementation
4.6 Conclusion
4.6.1 Going further
5 Caching erasure-coded objects 
5.1 Related Work
5.2 Caching and Popularity In Distributed Storage Systems
5.2.1 System Architecture
5.2.2 Object Caching
5.3 Theoretical Evaluation
5.3.1 Popularity Model
5.3.2 System Model
5.3.3 Performance Evaluation
5.3.4 Results and evaluation
5.4 Experimental Evaluation
5.4.1 Experimental setup
5.4.2 Results and Evaluation
5.5 Conclusion
A Predictive Container Image Prefetching 
A.1 Motivations
A.2 Storage and containers
A.3 Some statistics about popular container images
A.4 Optimized Predictive Container Image Storage System (OPCISS)
B Vectorizing TCP data handling for file servers 
B.1 Motivations
B.2 State of the art
B.3 Segment-oriented TCP in VPP
B.4 Zero-copy file server
Bibliography
