Some RNS Implementations of Asymmetric Cryptosystems on FPGA from the Literature

Get Complete Project Material File(s) Now! »

Montgomery Reduction

Montgomery [Mon85] introduces a MR algorithm that avoids computing the costly naive division. The idea is to use an auxiliary number r by which the MR and the division are straightforward. For example, in binary representation, mod 2l reduction and (integer) division by 2l are respectively performed by selecting the l least significant bits (LSBs) and by taking all the bits except these l LSBs of the considered numbers. The MR idea is described in Algorithm 7. Note that the output of the algorithm on input x is xr≠1 mod p and not x mod p. Besides its efficiency, Algorithm 7 is consistent with multiplications and additions/subtractions. Indeed, xr≠1 ◊ xÕr≠1 = (xxÕr≠1)r≠1 mod p and xr≠1 + xÕr≠1 = (x + xÕ)r≠1 mod p.

On Pseudo-Mersenne Numbers

Numbers of the form p = 2w ≠ c where 0 Æ c < 2w/2 are said to be pseudo-Mersenne, in reference to Mersenne numbers (of the form p = 2w ≠ 1). Reductions modulo pseudo-Mersenne numbers are efficiently performed. The main idea is to exploit the form of these numbers and use bit selection and multiplication by a small constant to perform the reduction. Let x be a number of binary size 2w (say, the product of two numbers of size w). Start by remarking that x can be written x = x1 ◊ 2w + x0, (1.7) where x0 and x1 are less than 2w. Note also that 2w © c (mod p). Therefore, x mod p © x1c + x0 (mod p).

Overview of the Residue Number System

This subsection presents a simplified description of RNS, first proposed in [Val56, Gar59]. The reader interested in more details about RNS is referred to [ST67]. The concept of congruence will be used throughout the remaining of this thesis, and therefore introduced at the start.
Definition 3 (Congruence). Let m > 0 be an integer. Two integers x and y are congruent modulo m if x ≠ y is a multiple of m. We write x © y (mod m) (1.11) and we say that y is the residue of x modulo m. The integer m is called the modulus. This definition still holds when the modulus m < 0. In the remaining of this document all the moduli are assumed positive for convenience.

RNS Inherent Properties and Complexity Measurement

In RNS, the basic operations multiplication, addition and subtraction are held independently on each modulus of the RNS base. This independence means the mentioned computations are carry-free between the channels. Consequently, multiplications, additions and subtractions can be performed in parallel.
RNS is a nonpositional representation system, that means, the value of a number does not change if its residues in a given base are reordered differently. We emphasize that a reordering of the residues, if done, must be followed by a similar reordering in the used RNS base. For example, (2, 3, 5) in the RNS base {5, 6, 7} and (3, 2, 5) in the (same but reordered) RNS base {6, 5, 7} represent the exact same number 117. However, in the RNS base {5, 6, 7} the tuple (3, 2, 5) represents the number 68. The major drawback of RNS is the difficulty to perform position-related operations such as comparisons, divisions by integers not coprime with the range M of the RNS base, and MRs. This difficulty results from that of evaluating the order of magnitude of operands in nonpositional representation systems. For example, comparisons are difficult because we cannot just rely on the position of digits (as in a usual positional system) of two (or more) integers in RNS to compare them. An alternative is to convert the to-becompared integers into a positional system such as mixed-radix system (MRS) before the comparison. These conversions are costly as we see in Subsection 1.3.3.
Exact divisions by integers not coprime with M are also difficult in RNS for the same reason. Exact divisions by such integers might require to convert those integers to their positional representation before performing the divisions. As mentioned, those conversions are costly. Another way to perform divisions the reader might think of is to subtract the divisor iteratively. However this method requires a comparison per subtraction to determine the end of the division process. As explained, the comparisons are difficult. Another difficult operation in RNS is the mod p reduction, where p is a (large) integer not coprime with the range M of the RNS base. Since operations in protocols of asymmetric cryptographic applications are usually performed in rings or fields, the mod p reduction is a crucial operation. Some ways to perform MRs from the literature are presented in Section 1.2.

RNS Modular Reduction

The importance of MRs at the field level is mentioned in Section 1.2. Moreover, RNS by definition introduces MRs at another level, the various reductions mod mi. In this section we present some algorithms from the literature that efficiently perform the RNS MRs.

RNS Montgomery Reduction

The efficiency of the Montgomery reduction presented in Algorithm 7 motivates Posch and Posch [PP95] and later Kawamura et al. [KKSS00] to propose an RNS adapted version of the algorithm. Their algorithm is presented in Algorithm 10. Let M and MÕ be two RNS bases such that each modulus of MÕ is coprime with all moduli of M. Let also assume x < MMÕ (for example, the product of two numbers). It is important that x be represented in the two bases because of its range.

Table of contents :

List of Figures
List of Tables
List of Algorithms
List of Abbreviations
Introduction
1 State of the Art
1.1 Asymmetric Cryptography
1.1.1 Overview of Asymmetric Cryptography
1.1.2 Elliptic Curve Cryptography
1.2 Modular Reduction
1.2.1 Montgomery Reduction
1.2.2 Barrett Reduction
1.2.3 On Pseudo-Mersenne Numbers
1.3 Residue Number System
1.3.1 Overview of the Residue Number System
1.3.2 RNS Inherent Properties and Complexity Measurement
1.3.3 Base Extension
1.3.4 RNS Modular Reduction
1.4 Field Programmable Gate Arrays
1.4.1 Overview of FPGAs
1.4.2 Some RNS Implementations of Asymmetric Cryptosystems on FPGA from the Literature
1.4.3 High-Level Synthesis
1.5 Conclusion
2 Hierarchical Base Extension
2.1 Notations
2.2 Kawamura Base Extension
2.2.1 Overview of KBE
2.2.2 KBE Algorithm
2.2.3 The Cox-Rower Architecture
2.3 Hierarchical Base Extension
2.3.1 Overview of HBE
2.3.2 HBE Algorithm
2.3.3 A Cox-Rower Architecture Adapted for HBE
2.4 Application to RNS Modular Multiplications
2.5 FPGA Implementation Results
2.6 Conclusion
3 RNS-Flexible Hardware Accelerators for ECC
3.1 Notations and Definitions
3.2 Flexible Kawamura Base Extension
3.2.1 Algorithmical Description of the Flexible KBE
3.2.2 Architecture of the Flexible KBE
3.2.3 FPGA Implementation Results
3.3 Flexible Hierarchical Base Extension
3.3.1 Algorithmical Description of the Flexible HBE
3.3.2 Architecture of the Flexible HBE
3.3.3 FPGA Implementation Results
3.4 Flexible Elliptic Curve Scalar Multiplication
3.4.1 Overview of the Flexible ECSM
3.4.2 FPGA Implementation Results
3.5 Conclusion
Conclusion
A Comparaison d’algorithmes de réduction modulaire en HLS sur FPGA
B Generalized Weierstraß Equation of Elliptic Curves
C Proof of the Chinese Remainder Theorem
Résumé substantiel en français
Bibliography