A Hardware Based Solution: TCAM
Ternary Content Addressable Memories (TCAM) are the most popular hardware-based technique in networking devices. TCAMs store rules as a W-bit field (value, bitmask). For example, if W=4, a prefix 01** will be stored as (0100,1100) inside TCAM memory. An input key is compared with all stored patterns simultaneously. A TCAM gives us a result when-ever a key matches a pattern or not in one cycle. At the same time, other types of memory like Static Random-Access Memory (SRAM) needs more cycles resulting in slower network switching applications.
In Table 2.2 we have four addresses, each one with a diﬀerent port. Based on the address, each of the packets will be directed to one of the four ports. In standard cases, and without TCAMs, if a packet has an address of 0101, according to Table 2.2 Rules 2 and 3 will be triggered. To solve this problem, standard memory like SRAM uses Longest Prefix Match (LPM), where a packet is matched by the most specific entry in the rule table. In this case, Rule 2 is more specific, and the packet will be sent to port b.
An eﬃcient solution to deal with the high number of TCAMs entries is range encoding. It consists in mapping each of the ranges to a short ternary format sequence. Ranges in TCAMs are represented by this sequence without the need to expand them, thus, improving the storage capacity of TCAMs and aﬀecting positively the overall performance.
In , Liu and al. propose a technique that requires less TCAM storage space and a deterministic execution time. In their approach, a key is generated for each of the rules and stored alongside the address in TCAM. The key is an n-bit vector V = fv1; v2:::; vng, where n is the number of distinct ranges. Each of the vi in this case corresponds to a range with vi = 1 if and only if v 2 Ri otherwise vi will be set to 0. Although this technique can reduce the number of entries in TCAM by relying on ternary format instead of prefixes, it requires extra bits to represent keys in the TCAM memory. An algorithm called SRGE is introduced in  to handle the rule expansion problem by encoding ranges with Gray codes. In this binary encoding, two adjacent numbers diﬀer by one bit only. This encoding improves the maximal rule expansion to 2 w 4 prefixes for each range.
From a topological view of the TCAM re-encoding process, the au-thors in  propose a technique to optimize packet classifiers with domain compression to eliminate as many redundant rules as possible, and prefix alignment to reduce range expansion.
A TCAM-based algorithm is presented in  called parallel packet classification (P 2 C). In this approach, each header field is re-encoded using fewer bits, and then all fields are concatenated to form a TCAM entry. Although we will end up with the same number of rules, the total bit size is smaller. This method is based on a primitive-range hierarchy, where diﬀerent ranges are mapped to a diﬀerent layer. Disjoint rangers are put in the same layers and represented by code-words. An approach that relies on inclusion tree (i-tree) called prefix inclusion coding (P IC) is developed in  to overcome some drawbacks of the P 2C approach, especially w.r.t. the update cost where all ranges in a layer will be aﬀected by an insertion while in P IC an insertion will aﬀect a small part of ranges in that layer. DRES  is a bit-map-based scheme based on a greedy algorithm for range encoding in TCAM co-processors. In DRES, ranges are encoded by the algorithm from . Ranges with the highest prefix expansion will be assigned extra bits. In DRES, decoding is done inside TCAMs, meaning that a modification is necessary and existing network devices cannot be employed directly. To handle the range expansion,  proposes a Non-Adjacent Form or NAF. In this approach, a range is re-encoded with a set of signed prefixes. While the encoding reduces ranges, signed representations complicate the matching implementation and performance and add overheads.
The approach Flow Table Reduction Scheme has been introduced in  to minimize the number of flow entries in SDN. This paper focuses on reducing the number of entries by using a new representation for IP ranges since reducing the number of entries can improve the power consumption of TCAM while respecting the capacity constraint.
The DNF (disjunctive normal form) has also been applied to compute the minimal Boolean expression for a range in linear time  and to prove the 2w 4 upper bound. Table 2.4 shows a summary of some range encoding techniques already presented.
SDN decouples the control plane from the data plane as shown in Fig. 2.4. Switches and routers in SDN perform forwarding packets only. However, the controller in the control plane controls how and where each packet will be handled. This level of separation between the control and data plane builds a more flexible environment. In addition, the controller has an overview of all the network. The controller oversees switches in the data plane via a well-known protocol called OpenFlow . Switches running OpenFlow have one or more tables that can store rules called flow tables. OpenFlow switches can have TCAM or SRAM memory. When a packet arrives at a switch for the first time, a packet_in message is sent to the controller if no match is found. Then, the controller sends back a rule to be installed in the switch’s flow table with the proper action that needs to be taken for every packet with the same header field as the first received packet. Since switches in SDN apply the matching process based on rules sent from the controller, they can behave like a router, a firewall, a switch, etc.
The controller in SDN called NOS is a software-based platform that runs on a server. The network is programmable through a software ap-plication running on top of NOS . In this thesis we consider that any packet classification approach needs to run on the controller before sending rules to the switches. For example, if a new classifier is computed in the controller based on an old one but with the same semantics, new rules of the new classifier will be sent to the switches. Since the number of rules in the controller is smaller after applying diﬀerent techniques, then the number of rules in switches will also be smaller. If switches in SDN use TCAM memory, then the classification process becomes faster. The com-munication between controller and switches uses the OpenFlow protocol implemented in both the data and control plane. In the next part, we will describe this protocol and its operations.
OpenFlow is the most widely used communication protocol in southbound API. Started from an academic project in campus networks , this pro-tocol quickly spread between giant companies like Facebook, Google, Mi-crosoft, etc. In addition, OpenFlow-enabled switches have been produced by many vendors, including HP, NetGear, and IBM . While SDN gives an abstraction of the entire network, OpenFlow provides an abstraction for components in the network. Therefore it is important to decouple the two from each other. Even though OpenFlow protocol and SDN can complete each other, they are not tied together. For example, SDN can use other communication protocols for the southbound API like POF , OpFlex  OpenState , Open vSwitch , PAD .
One key element of OpenFlow is the centralization of the control plane. One controller can be connected to multiple switches, and any decision can be based on the global view of the network instead of having limited knowledge of the network. This centralization can be helpful in case of network failure since, in traditional network architecture, a new path needs to be computed at each switch. However, in SDN, the controller can locally compute a new route and send new rules for each aﬀected switch.
OpenFlow can also be used to analyze traﬃc in real-time. Each rule in the routing table has a counter that indicates how many times this rule has been matched. This information can be sent to the controller, where it can be analyzed. To detect a denial of service attack (DDoS), authors in  propose a new method that uses self-organizing maps to classify network traﬃc as malicious or not. In  a new method for source address validation mechanism is used with OpenFlow where a new rule received by the controller for the first time will be analyzed based on the destination source. OpenFlow is also used in all sort of network application from traﬃc engineering [66; 67; 68; 69], mobility and wireless [70; 71; 72; 73], monitoring [74; 75; 76], data centers [77; 78; 79], security [80; 81; 82] and many other.
Rulesets Decomposition and Placement
To decompose and install rules in a network, the authors in  pro-pose a an approach named One Big Switch (OBS), for rules with a two-dimensional endpoint policy. Each section of the two-dimensional repre-sentation corresponds to a node in the tree (a cover). With each cut, two nodes are created, each one with its own rules. OBS tries to solve the problem separately for each path. OBS generates forward rules to avoid the processing of packets al-ready matched by rules in previous switches. The duplication of rules with lower priority and high overlapping with other rules can increase the overhead since, according to their approach, they must be installed in mul-tiple switches. The number of forwarding rules generated can aﬀect the overhead of switches if the set of rules is poorly chosen.
Table of contents :
Chapter 1 General Introduction
1.2 Problem Statement
1.3 Our Contribution
1.4 Thesis Organization
Chapter 2 Related work
2.2 Packet Classification Approaches
2.2.1 A Hardware Based Solution: TCAM
2.2.2 A Software Based Solution: Decision Tree
2.3 Minimization of Packet Classification Rules
2.3.1 Range Encoding
2.3.2 Classifier Minimization
2.4 Software Defined Networking
2.4.1 SDN Architecture
2.4.2 OpenFlow Protocol
2.5 Rules Placement
2.5.1 Rulesets Decomposition and Placement
2.5.2 Rules Caching and Swapping
2.5.3 Path Based Rules Placement
2.5.4 Rules Update
Chapter 3 Filtering Rules Compression with Double Masks
3.2 Double Mask Technique
3.2.2 Double Mask Representation
3.3 Double Mask Computation Algorithms
3.3.1 Naive Algorithm
3.3.2 Linear Time Algorithm
3.4 Evaluation by Simulation
3.4.1 Simulation Setup
3.4.2 Real-world IP Ruleset
3.4.3 Synthetically Generated Rulesets
3.5 Experimental Evaluation
3.5.1 Setup and Parameters
3.5.2 Implementation and Integration
3.5.3 Experiments and Results
Chapter 4 Rules Distribution Over a Single Path Topology
4.2 Problem Statement
4.2.1 Problem Definition
4.3 Distribution Over a Single Path
4.3.1 Rules Representation
4.3.2 Forward Rules Generation
4.3.3 Distribution Algorithm
4.3.4 Algorithmic Complexity
4.4.1 Simulation Setup
4.4.2 Simulation Results
Chapter 5 Rules Distribution Over a Graph
5.2 Two-Terminal Series-Parallel Graph
5.2.1 Distribution Algorithm
5.2.2 Algorithmic Complexity
5.2.3 Generalization to St-Dags
5.3 Two-Tier Distribution Approach
5.3.1 Distribution of Multi-Fields Rulesets
5.3.2 Multi-level Distribution
5.4 Evaluation of Two-Tier Approach
5.4.1 Experimental Setup
5.4.3 Bandwidth and Latency
5.4.4 Multiple Destinations
5.5 Rulesets Update Strategy
5.5.1 Update Strategy With Generated Forward Rules
5.5.2 Update Strategy With Two-Tier Approach
5.5.4 Network Topology Update
Chapter 6 General Conclusion
6.3 Future Work