ASIP Design Methodology and State of the Art in Channel Decoder Design

Get Complete Project Material File(s) Now! »

 Recursive Systematic Convolutional codes

Convolutional codes have been widely used in applications such as space and satellite communications, cellular mobile, digital broadcasting, etc. Their popularity is due to their simple structure and easily implementable maximum likelihood soft decision decoding methods. A convolutional code of rate r = k=n is a linear function which, at every instant i, transforms an input symbol di of k bits into an output coded symbol ci of n bits (k > n). A code is called Systematic if a part of the output is made up of systematic bits si = di and the rest of the bits (n 􀀀 k) are made up of parity bits. The output bits are the linear combination of the current and the previous input bits. The linear function is usually given by generator polynomials. There are three main types of convolutional codes namely Non-Systematic (NSC), Systematic (SC) and Recursive Systematic (RSC) convolutional codes. The only difference in the encoding of the SC and NSC is that, in NSC the output does not contain input systematic bits as shown in Figure 1.5a. This figure illustrates too how a convolutional encoder can be described using shift registers and modulo two adders. The generator polynomials of the convolutional codes examples of Figure 1.5 are given as: Si = di = g1(x) = 1 + x1 + x2.

Turbo Code Interleaver

Interleavers in a digital communication system are used to temporally disperse the data. The primary interest of them in concatenated codes is to put two copies of same symbol (coming to two encoders) at different interval of time. This enables to retrieve at least one copy of a symbol in a situation where the other one has been destroyed by the channel. An interleaver ( Q ) satisfying this property can be verified by studying the dispersion factor S given by the minimum distance between two symbols in natural order and interleaved order: S = min i;j (ji 􀀀 jj + j(i) 􀀀 (j)j) (1.2) The design of interleavers respecting a dispersion factor can be reasonably achieved through the S-random algorithm proposed in [9]. However, even if this kind of interleaver can be sufficient to validate the performance in the convergence zone of a code, it does not achieve a good asymptotic performance. Therefore to improve the latter, the design of the interleaver must also take into account the nature of component encoders. Complexity of the hardware implementation should, in addition, be taken into account. In fact, the recent wireless standards specify performance and hardware aware interleavling laws for each supported frame length. In following sections the interleaving functions associated to Turbo codes for WiMAX and LTE standards are described.

Almost regular permutation (ARP)

The ARP interleaver is used in double binary Turbo codes for both standards IEEE 802.16e WiMAX . It can be described by the function Q (j) which provides the interleaved address of each double-binary symbol of index j, where j = 0; 1; :::N 􀀀1 and N is the number of symbols in the frame. Y (j) = (P0 j + P + 1) mod N

Max-Log-MAP for Turbo decoding

Considering the above description of Max-Log-MAP algorithm and an 8-state double binary Turbo code, i.e. m = 2 bits per symbol, we can summarize the steps for Turbo decoding as follows: The SISO0 process bit LLRs in the natural order corresponding to (S0; S1; P0; P1) while the SISO1 process in the interleaved sequence (S0 0 ; S1 0 ; P0 0 ; P1 0 ) corresponding to ith symbol. Where (S0; S1) and (S00; S10) are the systematic bits in natural and interleaved order respectively. Similarly, (P0; P1) and (P00; P10) are the parity bits in natural and interleaved order respectively. Since the trellis is double binary, the input bit LLR’s (j ) to the decoder are converted to systematic ( sys i ) and parity symbol LLR’s ( par i )given by the equations (1.27) to (1.30) below. The ith systematic symbol sys00 i (s0; s) can be calculated as the negative of sys11 i (s0; s). Similarly, sys01 i (s0; s), par01 i (s0; s) and par00 i (s0; s) can be calculated as the negative of sys10 i (s0; s), par10 i (s0; s) and par11 i (s0; s) respectively.

READ  Comparison with previous visible Doppler velocimetry measurements

Table of contents :

1 Background: Channel Codes and Decoding Algorithms 
1.1 Communication system overview
1.2 Turbo codes
1.2.1 Recursive Systematic Convolutional codes
1.2.2 Turbo Code Interleaver Almost regular permutation (ARP) Quadratic polynomial permutation (QPP)
1.3 Turbo decoding
1.3.1 Maximum Aposteriori Probability (MAP) algorithm
1.3.2 Max-Log-MAP approximation
1.3.3 Max-Log-MAP for Turbo decoding
1.3.4 Parallelism in Turbo decoding Metric level parallelism SISO decoder level parallelism Turbo decoder level parallelism
1.4 Low Density Parity Check codes
1.4.1 Linear block codes
1.4.2 QC-LDPC codes
1.4.3 LDPC in WiFi and WiMAX standard
1.5 Low Density Parity Check decoding
1.5.1 LDPC decoding algorithm: Normalized Min-Sum (NMS)
1.5.2 Scheduling
1.5.3 Modified NMS formulation for implementation
1.6 Summary
2 ASIP Design Methodology and State of the Art in Channel Decoder Design 
2.1 Customizable embedded processors
2.1.1 Application-Specific Instruction-set Processors
2.1.2 ADL-based design tool: Processor designer
2.1.3 Classical ASIP design flow
2.2 State of the art in channel decoder design
2.2.1 Turbo decoding architectures
2.2.2 LDPC decoding architectures
2.2.3 Multi-code channel decoding architectures
2.3 Initial ASIP architecture for flexible Turbo decoding
2.3.1 Overview of the TurbASIP architecture
2.3.2 TurbASIP pipeline
2.3.3 Max and modulo operators
2.3.4 TurbASIP: sample assembly code
2.3.5 Memory partitions
2.3.6 ASIC synthesis results
2.4 Summary
3 DecASIP: Flexible Turbo/LDPC Decoder 
3.1 Design motivations
3.1.1 Architecture Efficiency
3.1.2 Quantization analysis
3.2 DecASIPv1
3.2.1 System architecture
3.2.2 Turbo mode Memory architecture Processing schedule Pipeline architecture Interleave/deinterleave address generation NoC messages Assembly code
3.2.3 LDPC mode Proposed scheduling illustrated with simple example using 2- DecASIPv1 architecture Proposed scheduling with 8-DecASIPv1 architecture Memory architecture NoC messages Pipeline architecture and assembly Code
3.2.4 ASIC synthesis results
3.3 DecASIPv2
3.3.1 System architecture
3.3.2 Turbo mode
3.3.3 LDPC mode NoC messages and NoC schedule LDPC assembly code
3.3.4 Configuration memory
3.3.5 ASIC synthesis results
3.3.6 Discussions and analysis of recent related implementations
3.4 Summary
4 FPGA and ASIC Prototyping of DecASIP 
4.1 Overview of the proposed FPGA 4-DecASIP system prototype
4.2 Flexible channel encoder
4.2.1 Flexible Turbo encoder
4.2.2 Flexible LDPC encoder
4.3 Flexible Turbo/LDPC decoder
4.4 Other blocks of the system prototype
4.4.1 Pseudo random generator
4.4.2 Flexible channel model
4.4.3 Global input interface
4.4.4 Error counter
4.4.5 Configuration module
4.4.6 Global system controller
4.4.7 Graphical User Interface (GUI)
4.4.8 USB interface
4.5 Results of the FPGA prototype
4.5.1 FPGA synthesis results
4.5.2 Speed of reconfiguration between different decoding modes
4.5.3 Scalability and throughput
4.5.4 Performance results
4.6 ASIC integration of DecASIP
4.6.1 MAG3D chip from CEA-LETI
4.6.2 Integration constraints
4.6.3 ASIC integration results
4.7 Summary
5 TDecASIP: Parameterized Turbo Decoder
5.1 Proposed design flow for parameterized cores
5.2 Design choices and TDecASIP decoder architecture
5.2.1 Design choices
5.2.2 TDecASIP decoder architecture Pipeline control finite state machine Pipeline architecture
5.2.3 Memory organization
5.3 FPGA prototype and synthesis results
5.4 ASIC synthesis results
5.5 Summary
6 LDecASIP: LDPC Decoder 
6.1 Design motivations and LDecASIP decoder architecture
6.2 Prototype and incremental feature addition
6.3 FPGA and ASIC synthesis results
6.4 Summary
Conclusions and Perspectives
R´esum´e en Franc¸ais
List of publications


Related Posts