Spatial Coupling for Distributed Storage and Diversity Applications

Get Complete Project Material File(s) Now! »

Erasure channel essential

Erasure channel is the simplest non-trivial channel model. This model has for each transmitted input two possible outcomes: either it is correctly received or it is known to be lost, no corruption is possible. The decoding problem is to nd the values of the information erased given the locations of erasures and the non-erased part of the codeword. The transmitter and the receiver know the time t, i.e. they are both synchronized. The channel input at time t is denoted by Xt and the corresponding output Yt. The Binary Erasure Channel (BEC) was introduced by Peter Elias in 1954 as a toy example. In the case of the binary erasure channel the channel input is binary and Xt 2 f0; 1g. The corresponding output Yt is ternary and takes on values in the alphabet fXt; ?g, where ? indicates an erasure. One transmitted bit is erased with probability , i.e. P(Yt =?) = and P(Yt = 0) = P(Yt = 1) = 1 􀀀 . Erasure occurs for each t independently, therefore the channel is said to be memoryless. The Binary Erasure Channel, BEC() is illustrated in Figure 1.1.
In this thesis, we consider also non-binary erasure channel. For this channel model, input Xt is q-ary and belongs to the nite eld Fq. The integer q is the number of elements of the nite eld Fq. Respectively, Yt belongs to Fq [ f?g. We can notice that binary erasure channel is a particular case where q = 2.
Our interest is focused on three erasure channels:
The symbol erasure channel: SEC(q; ). The N symbols of a codeword are independently erased. A symbol is erased with a probability and is correctly received with a probability 1 􀀀 . The erasure events are independent from one symbol to another.
This channel is said to be ergodic. Ergodicity of a channel is dened as follows: Denition 1.1. A channel is said to be ergodic if the time average of the signal is equal to the collective average. In other words, the randomness of the channel gain can be averaged out over time.
In a similar way we can dene a non-ergodic channel:
Corollary 1.1. A channel is non-ergodic if its channel gain is a random variable and is assumed to be invariant for some time period. The channel gain process is stationary but not ergodic, i.e., the time average is not equal to the ensemble average. In other words, the randomness of the channel gain cannot be averaged out over time. So long-term constant bit rates cannot be supported.
The block erasure channel: CEC(q; ). The N symbols of a codeword are partitioned into M blocks, each block contain symbols. M is the number of degrees of freedom of the channel. The channel erases a block with a probability . The block is correctly received with a probability 1 􀀀 . The erasure events are independent from one block to another. From Corollary 1.1, the block erasure channel is a non-ergodic channel. In this context, non-ergodicity means that the transmitted codeword spans only a nite number M of independent realizations of the channel independently from its length. Block erasure channel is a particular case of non-ergodic (quasi-static) fading
channels, where the fading coecient belongs to f0;+1g.

Reed-Solomon codes

Reed{Solomon (RS) codes are block-based error-correcting codes that were introduced by Irving S. Reed and Gustave Solomon in 1960 [84]. They are applied in all areas requiring reliable data. The most prominent applications are in digital communications and storage. These codes were dened for noisy channel where errors, i.e. the data transmitted is modied during the transmission, occur during transmission. Reed-Solomon codes can also be used to correct burst of errors. A burst of errors is a series of bits in the codeword which are received in error.
RS codes are linear block codes and belong to the family of BCH (Bose, Ray- Chaudhuri et Hocquenghem) codes, which are a class of cyclic error-correcting codes constructed using nite elds. For more details see [46], [12], [10], [65].
An RS code is a cyclic linear block code over the nite eld Fq of length n, with n a divisor of q 􀀀 1. When n = q 􀀀 1 the RS code is said to be primitive. Reed-Solomon codes are maximum distance separable codes as they attain the Singleton bound in Definition 1.8. In the sequel of this thesis, we only consider non-trivial non-binary RS codes where q > n > 2.

Low-density parity-check codes

Low-Density Parity-Check (LDPC) codes are linear error-correcting codes based on parity-check matrices. LDPC codes were developed by Robert G. Gallager in his doctoral dissertation in 1963 [36], hence so-called Gallager codes. In [101], Tanner proposed a bipartite graph representation of LDPC codes. Moreover, LDPC codes belong to the family of sparse graph codes, which are not MDS codes. However, some existing practical constructions of sparse graph codes allow to approach very closely the Shannon limit over a symmetric memoryless channel, in particular over the BEC. Thus, they are named capacity-approaching codes. Using iterative belief propagation techniques, LDPC codes can be decoded in time linear to their block-length. All these advantages explain that LDPC codes are widely used in applications requiring reliable and highly ecient information transfer over noisy channels.
Graph representation of codes Any linear code can be represented as a graph. We start by giving the general denition of a graph [11].
Denition 1.11. A graph G is an ordered pair of disjoint sets (V;E) such that E is a subset of the set of unordered pairs of V . The set V is the set of vertices and the set E is the set of edges.
LDPC codes are represented by a bipartite graph also called Tanner graph [101]. It is dened as follows.
Denition 1.12. A graph G is a bipartite graph with vertex classes V1 and V2 if V (G) = V1 [ V2, V1 \ V2 = ; and every edge joins a vertex of V1 to a vertex of V2, i.e. G has a bipartition (V1; V2).
An LDPC code shall be represented by a bipartite graph G = (V1; V2;E). V1 is the set representing the symbols from the nite eld Fq, V2 is the set representing the constraints to be applied on these symbols, and E is the set of edges. We adopt the following representation:
Symbols also called variable nodes are represented by circles on the left of the bipartite graph.
Constraints on symbols also called check nodes are represented by squares on the right of the bipartite graph.
The incident matrix of G is a matrix that shows the relationship between the two sets of nodes. The matrix has one row for each element of V2 and one column for each element of V1. The entry in row i and column j is 1 if i and j are linked by one edge in the graph and 0 if they are not. The incident matrix of G is a parity-check matrix H of the LDPC code. A check node in the Tanner graph corresponds to a row in H, i.e. it represents a parity-check equation (SPC) code such as c1 + c2 + c3 = 0 shown inFigure 1.3 .

READ Experiences of the boundary between work and home life

Table of contents :

Acknowledgments
Abstract
Table of contents
List of gures
List of tables
List of abbreviations
List of notations
Resume Detaille de la These
Introduction
1. Channel Coding Background and Recent Advances
1.1. Erasure channel essential
1.2. Linear block codes
1.3. Reed-Solomon codes
1.4. Product codes
1.5. Low-density parity-check codes
1.6. Fountain codes
1.7. Spatially coupled codes
1.8. Polar codes
1.9. Reed-Muller codes
1.10. Conclusions
2. Coding for Distributed Storage
2.1. Distributed storage parameters and constraints
2.2. Today’s methods: Replication and erasure codes
2.3. Regenerating codes
2.4. Locally repairable codes
2.5. Repairable Fountain codes
2.6. Conclusions
3. Spatial Coupling for Codes and Diversity
3.1. State of the art: uniform spatial coupling
3.2. Forward-layered LDPC coupling
3.2.1. Construction of the forward-layered LDPC coupling
3.2.2. Non-uniform chain of spatial coupling
3.2.3. Spatially-variant spatial coupling
3.3. Spatial coupling of Root-LDPC ensembles: Parity bits doping
3.3.1. Main properties of Root-LDPC ensembles
3.3.2. Spatial coupling of Root-LDPC ensembles
3.4. Spatial Coupling for Distributed Storage and Diversity Applications
3.4.1. Channel coding model
3.4.2. Uncoupled rate-3=4 Root-LDPC ensemble
3.4.3. Double diversity Root-LDPC for distributed storage applications .
3.5. Conclusions
4. Edge Coloring in Product Code Graphs
4.1. Mathematical notation and Terminology
4.2. Graph representations for diversity
4.2.1. Non-compact graph
4.2.2. Compact graph
4.2.3. Diversity and codes on graphs
4.2.4. Rootcheck nodes and root symbols
4.2.5. The rootcheck order in product codes
4.3. Stopping sets for MDS components
4.3.1. Decoding erasures
4.3.2. Stopping set denition
4.3.3. Stopping sets and subgraphs of product codes
4.3.4. Enumeration of stopping sets
4.4. Edge coloring algorithm under constraints
4.4.1. Hand-made edge coloring and its limitations
4.4.2. The algorithm
4.4.3. Applications
4.4.4. Random edge coloring
4.5. Code performance in presence of erasures
4.5.1. Block erasures
4.5.2. Independent erasures
4.5.3. Unequal probability erasures
4.6. Conclusions
Conclusion and perspectives
Appendices
4.A. Proof of Theorem
4.A.1. Minimum distances satisfying d2 < 2d1
4.A.2. Minimum distances satisfying d2 = 2d1
4.A.3. Minimum distances satisfying d2 > 2d1
Bibliography