
The turbo message passing schedule

The turbo decoding message passing schedule introduced by Mansour [43, 42] is also known as the shuffled iterative decoder [70], staggered decoding or Gauss-Seidel iteration. Turbo decoding applies a different message schedule from the two-phase flooding schedule. In the horizontal shuffled schedule, the basic idea is to process the CNs one by one and to pass the newly calculated messages immediately to the corresponding VNs. The VNs update their outgoing messages within the same sub-iteration, so the next CN receives freshly updated messages, which improves the convergence speed. Figure 1.6 shows the Probability Density Function (PDF) of the number of iterations needed to decode a codeword, for the flooding schedule and for the horizontal shuffled schedule. The simulation uses N = 16200 bits, a code rate of 1/2 and a constant Eb/No = 1 dB. Note that the average number of iterations (itavr) needed to find a codeword is about two times smaller for the shuffled schedule than for the flooding schedule. The same observation holds for the maximum number of iterations (itmax).
However, the main drawback of this schedule is that the CNs are updated serially, one by one, which leads to a low throughput. The next subsection explains how this serial schedule can be partially parallelized.
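To make the difference between the two schedules concrete, here is a minimal Python sketch of Min-Sum decoding under both the flooding and the horizontal shuffled schedule. The (7,4) Hamming parity-check matrix and the `decode_min_sum` helper are illustrative assumptions for this sketch, not the decoder architecture of this work; the only schedule-dependent line is which SO vector the CNs read from.

```python
import numpy as np

# Toy example: parity-check matrix of the (7,4) Hamming code.
H = np.array([[1, 1, 0, 1, 1, 0, 0],
              [1, 0, 1, 1, 0, 1, 0],
              [0, 1, 1, 1, 0, 0, 1]])

def decode_min_sum(H, llr, schedule="shuffled", max_iter=50):
    """Min-Sum decoding with the flooding or the horizontal shuffled
    schedule. Returns (hard decisions, iterations used)."""
    m, n = H.shape
    Mcv = np.zeros((m, n))            # CN -> VN messages
    SO = llr.astype(float)            # a-posteriori LLRs
    hard = (SO < 0).astype(int)
    for it in range(1, max_iter + 1):
        snapshot = SO.copy()          # flooding: CNs read old SO values
        for c in range(m):            # CNs processed one by one
            vs = np.flatnonzero(H[c])
            src = snapshot if schedule == "flooding" else SO
            Mvc = src[vs] - Mcv[c, vs]              # VN -> CN messages
            new = np.empty(len(vs))
            for i in range(len(vs)):
                others = np.delete(Mvc, i)
                new[i] = np.prod(np.sign(others)) * np.abs(others).min()
            SO[vs] += new - Mcv[c, vs]  # shuffled: next CN sees this update
            Mcv[c, vs] = new
        hard = (SO < 0).astype(int)
        if not np.any(H @ hard % 2):    # all parity checks satisfied
            return hard, it
    return hard, max_iter
```

For example, with channel LLRs `[2.0, -0.5, 3.0, 2.5, 2.0, 1.5, 2.0]` (the all-zero codeword with one weakly unreliable bit), both calls recover the all-zero codeword; on long codes, the shuffled variant typically needs about half as many iterations, which is the behaviour reported above.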

Soft Output (SO) centric decoder

In this subsection we explain how the soft output (SO) based check node processor (CNP) architecture is deduced. From (1.6) and (1.7), we can derive the new equation:

SO_v = M_{v→c} + M_{c→v} (1.14)

The update of the VNs connected to a given CN is done serially in three steps. First, the message from a VN to a CN (M_{v→c}) is calculated as:

M_{v→c} = SO_v − M^{old}_{c→v} (1.15)

The second step is the serial M_{c→v} update, where M_{c→v} is the message from the CN to the VN, also called the extrinsic message. Let v_c be the set of all the VNs connected to CN c, and v_c/v be v_c without v. For implementation convenience, the sign and the absolute value of the messages M^{new}_{c→v} are updated separately:

sign(M^{new}_{c→v}) = ∏_{v′∈v_c/v} sign(M_{v′→c}) (1.16)

|M^{new}_{c→v}| = f( Σ_{v′∈v_c/v} f(|M_{v′→c}|) ) (1.17)

where f(x) = −ln tanh(x/2). The third step is the calculation of the new SO value:

SO^{new}_v = M_{v→c} + M^{new}_{c→v} (1.18)
The updated SO^{new}_v value can be used within the same iteration by the next sub-iteration, leading to a convergence that is twice as fast as with the flooding schedule [42].
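The three steps above can be sketched as follows for a single CN, using the sum-product kernel f(x) = −ln tanh(x/2) of equations (1.15) to (1.18). The dictionary-based `so_centric_cn_update` helper is a hypothetical name introduced for this sketch, and message magnitudes are assumed strictly positive (f is undefined at 0):

```python
import math

def f(x):
    # f(x) = -ln tanh(x/2); f is its own inverse on (0, +inf).
    # x is assumed strictly positive (a message magnitude).
    return -math.log(math.tanh(x / 2.0))

def so_centric_cn_update(SO, Mcv, vs):
    """One CN c updated in the three SO-centric steps.
    SO: VN index -> a-posteriori LLR; Mcv: VN index -> old M_c->v;
    vs: the set v_c of VNs connected to CN c."""
    # Step 1: M_v->c = SO_v - M^old_c->v                    (1.15)
    Mvc = {v: SO[v] - Mcv[v] for v in vs}
    # Step 2: sign and magnitude of M^new_c->v, separately  (1.16)-(1.17)
    new = {}
    for v in vs:
        others = [Mvc[u] for u in vs if u != v]   # v_c without v
        sign = 1.0
        for x in others:
            if x < 0:
                sign = -sign
        new[v] = sign * f(sum(f(abs(x)) for x in others))
    # Step 3: SO^new_v = M_v->c + M^new_c->v               (1.18)
    for v in vs:
        Mcv[v] = new[v]
        SO[v] = Mvc[v] + new[v]
    return SO, Mcv
```

Note that only the SO values and the extrinsic Mc→v messages need to be stored between sub-iterations; the Mv→c messages are recomputed on the fly from (1.15), which is the memory saving that motivates the SO-centric architecture.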

The DVB-S2, -T2 and -C2 standards

With the establishment of the European Digital Video Broadcasting (DVB) standard and the American Advanced Television Systems Committee (ATSC) standard, digital TV (DTV) broadcasting is now a reality in several countries. Terrestrial broadcasting, or Digital Terrestrial Television (DTT), known in France as the "Télévision numérique terrestre" (TNT), is defined by the DVB-T standard [19]. Even if DTT requires a set-top box or a specific chip in the TV, it offers many advantages. Among them, thanks to Forward Error Correction (FEC), DTT provides an optimal picture where an analogue tuner would only deliver a poor-quality one. DTT also makes high-definition television (HDTV) possible, in the 1920 by 1080 pixels/frame format interlaced at 60 fields per second.

The LDPC code in the DVB-S2, -T2 and -C2 standards

The DVB-S2, -T2 and -C2 standards feature variable coding and modulation to optimize bandwidth utilization based on the priority of the input data; e.g., SDTV can be delivered with a more robust setting than the corresponding HDTV service. These DVB standards also feature adaptive coding and modulation, which allows the transmission parameters to be flexibly adapted to the reception conditions of the terminals, e.g., switching to a lower code rate during fading.

Code rates

The DVB-S2, -T2 and -C2 standards [20, 22, 21] are characterized by a wide range of code rates (from 1/4 up to 9/10), as shown in Table 1.2. Furthermore, a FEC frame may have either 64800 bits (normal frame) or 16200 bits (short frame). Each combination of code rate and frame length corresponds to an LDPC matrix: 21 matrices for the DVB-S2 standard, 13 for the DVB-T2 standard and 11 for the DVB-C2 standard. The matrix construction is identical for the three standards, so the same LDPC decoder can be used for all of them. Since the decoder is identical for the three standards, DVB-X2 hereafter refers to the DVB-S2, -T2 and -C2 standards.
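As a small illustration of the nominal frame arithmetic, the sketch below computes the information-block size K = N × R for a given frame length and nominal code rate. The `info_bits` helper is hypothetical, and note that for some short-frame entries the standards' effective rates deviate slightly from the nominal values, so this only reproduces the nominal arithmetic:

```python
from fractions import Fraction

def info_bits(frame_len, rate):
    """Nominal information-block size of a DVB-X2 FEC frame:
    K = N * R, for N the frame length and R the nominal code rate."""
    assert frame_len in (64800, 16200), "frames are normal or short"
    k = frame_len * Fraction(rate)    # exact rational arithmetic
    assert k.denominator == 1         # K must be an integer number of bits
    return int(k)

print(info_bits(64800, "1/2"))   # -> 32400
print(info_bits(16200, "1/4"))   # -> 4050
```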


Table of contents:

1 Background 
1.1 Basic concepts
1.1.1 Digital communication
1.1.2 Channel decoders
1.1.3 Linear block codes
1.1.4 LDPC codes
1.1.5 Standard Belief Propagation LDPC decoding
1.2 Sub-optimal algorithms
1.2.1 The normalized Min-Sum algorithm and other related algorithms
1.2.2 Serial implementation of the NMS algorithm
1.3 LDPC Layered decoder
1.3.1 The turbo message passing schedule
1.3.2 Structured matrices
1.3.3 Soft Output (SO) centric decoder
1.3.4 Architecture overview
1.4 The DVB-S2, -T2 and -C2 standards
1.4.1 Digital Terrestrial Television
1.4.2 DVB group
1.4.3 The LDPC code in the DVB-S2, -T2 and -C2 standards
1.4.4 State-of-the-art on DVB-S2 LDPC implementation
1.5 Testing the performance of a decoder
1.5.1 Software and hardware simulation
1.5.2 Test with all-zero codewords
1.5.3 Test of a communication model
1.5.4 Channel emulator
1.5.5 Interpreting results
1.5.6 Standard requirements
2 Memory update conflicts 
2.1 Conflicts due to the matrix structure
2.1.1 State-of-the-art
2.2 Conflict resolution by group splitting
2.2.1 Construction of the sub-matrices
2.2.2 DDSM in DVB-X2 and simulation results
2.3 Parity check matrix equivalent
2.3.1 Principle of the split-extend process
2.3.2 Simulation results
2.3.3 Performance improvement
2.4 Conflict Resolution by Layer duplication
2.4.1 Conflict resolution by Write Disabling the memory
2.4.2 Scheduling of the layers
2.4.3 Write disabling in the Mc→v memory
2.4.4 Write disabling the Mc→v memory when a Min-Sum algorithm is used
2.4.5 Simulations and memory size results
2.4.6 The write-disable architecture
2.4.7 Synthesis results on FPGA
2.4.8 Conclusion
2.5 Memory update conflicts due to pipeline
2.5.1 Non pipelined CNP
2.5.2 Pipelined CNP
2.5.3 The problem of memory update conflicts
2.5.4 Conflict reduction by group splitting
2.5.5 Conflict resolution by scheduling
2.5.6 Conclusion
2.6 Combining layers duplication and scheduling
2.7 Conclusion
3 Memory optimization 
3.1 Saturation of the stored values
3.1.1 Channel LLR saturation
3.1.2 SO saturation
3.1.3 Saturation of the extrinsic messages
3.1.4 Combining the saturation processes
3.1.5 Saturation optimization conclusion
3.2 Optimizing the size of the extrinsic memory
3.2.1 Extrinsic memory size requirements
3.2.2 Optimization principle
3.2.3 Results of optimization
3.2.4 Case of the sum-product algorithm
3.2.5 Mc→v memory optimization conclusion
3.3 Finite precision architecture of the layered decoder
3.4 Results of memory optimization
3.4.1 Monte-Carlo Simulations results
3.4.2 Synthesis results on FPGA
3.4.3 Memory capacity comparison
3.5 A single port RAM architecture
3.5.1 Single port RAM, dual port RAM, pseudo dual port RAM and dual port RAM
3.5.2 Memories in ASICs and FPGA
3.5.3 Implementation of dual port RAM with single Port
3.5.4 FIFO memory with single port memory modules
3.5.5 Single port memories banks for the SO memories
3.6 Layer scheduling for single port RAM
3.6.1 An example with two memory banks
3.6.2 Generalization for DVB-X2 matrices
3.6.3 Genetic algorithm to solve the scheduling problem
3.7 Conclusion
4 Multi Stream LDPC decoder 
4.1 Introduction
4.2 The parallelism option
4.2.1 Area saving compared with x decoders and conclusion
4.3 Share resources in a dual stream decoder
4.3.1 Sharing principle
4.3.2 Advantages, drawbacks and conclusion
4.4 Use of a buffer
4.4.1 FIFO buffer principle
4.4.2 Preemptive buffer control
4.4.3 Variable iterative decoder
4.4.4 FIFO Buffer size
4.4.5 Advantages and drawbacks
4.4.6 Implementation issue
4.5 Conclusion
5 Conclusion 
5.1 Produced work
5.2 Perspectives
A DVB-S2 matrices construction 
A.1 Standard matrices construction
A.2 Matrix permutations for layered structure
B Hardware Discrete Channel Emulator 
B.1 Introduction
B.2 Linear Feedback Shift Register
B.3 The alias method algorithm
B.4 The HDCE architecture
B.5 Resulting distribution
C Résumé étendu 
C.1 Introduction
C.2 Pré-requis
C.3 Les conflits de mise à jour de la mémoire
C.3.1 Conflits dus à la structure
C.3.2 Conflits dus au pipelining
C.4 Optimisation de la taille mémoire
C.4.1 Optimisation de la taille des mots
C.4.2 Optimisation des bancs mémoire des extrinsèques
C.4.3 Utilisation de RAM simple port
C.5 Un décodeur de flux multiple
C.5.1 Parallélisme
C.5.2 Partage des ressources
C.5.3 Addition d'un buffer à un décodeur itératif variable
C.6 Conclusion
C.6.1 Applications
C.6.2 Perspectives
List of figures
List of tables
