Hardware monitoring applied to traffic classification


Supported data rate

This is the most obvious feature, common to all traffic monitoring applications. Whatever the application, the development platform has a maximum supported data rate. On the receiving side, it is the speed at which the platform can receive data and make it ready for processing, without any actual processing happening. On the sending side, it is the speed at which it can send data that is already ready to send. The most important figure is the sustained rate, that is to say a rate that can be supported indefinitely.
There is no guarantee that an application developed on a given platform will support the maximum rate, because the processing part can be the bottleneck. But no application can support more than the maximum rate of the platform. This data rate is not always the speed supported by the network interfaces, for example 10 Gb/s for 10 Gigabit Ethernet: it is not enough to receive data on the interface, it must also be made available to the processing units. The maximum data rate should be supported for any kind of packet. On Ethernet, this is usually more challenging for the smallest packets (46 bytes of payload), because a processing overhead is associated with each packet, and smaller packets mean more packets per second at the same data rate.
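To make the small-packet constraint concrete, the following minimal sketch computes the maximum number of frames per second that a link can carry for a given payload size. It assumes untagged Ethernet frames (no VLAN tag) and the standard per-frame overhead on the wire; the function name max_packet_rate is chosen here for illustration only.

```python
# Ethernet per-frame overhead on the wire, in bytes (untagged frames assumed).
PREAMBLE_AND_SFD = 8   # preamble (7) + start-of-frame delimiter (1)
HEADER_AND_FCS = 18    # MAC header (14) + frame check sequence (4)
INTERFRAME_GAP = 12    # minimum idle time between two frames


def max_packet_rate(link_bps: float, payload_bytes: int) -> float:
    """Return the maximum number of frames per second on the link."""
    wire_bytes = (PREAMBLE_AND_SFD + HEADER_AND_FCS
                  + payload_bytes + INTERFRAME_GAP)
    return link_bps / (wire_bytes * 8)


if __name__ == "__main__":
    # Smallest and largest standard payloads on a 10 Gb/s link.
    for payload in (46, 1500):
        rate = max_packet_rate(10e9, payload)
        print(f"{payload:>4} B payload: {rate / 1e6:.2f} Mpackets/s")
```

Under these assumptions, a 10 Gb/s link carries about 14.88 million minimum-size frames per second, against about 0.81 million frames with a 1500-byte payload, so any per-packet overhead is paid roughly 18 times more often in the worst case.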

Computation power

Once the packets are received and sent fast enough, they must be processed fast enough too. This is the role of the processing units. There can be many of them, of different kinds: a Central Processing Unit (CPU), a GPU, an NPU or an FPGA. Each kind of processing unit provides processing power with important differences in:
the frequency: all processing units work in discretized time, and the frequency is the number of clock cycles that can be run in one second.
the parallelism: the number of operations that can be run in parallel during one clock cycle.
the variety and complexity of each operation.
Depending on the application, the need for computation power can vary widely. If the goal is only to count the number of received packets and bytes, to estimate the quantity of data received, almost no computation is needed. If the goal is to classify traffic using Deep Packet Inspection (DPI), a large number of regular expressions have to be checked against each packet [AFK+12], which requires a lot of computing power.
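One way to relate frequency and parallelism to a target data rate is to compute the budget of clock cycles available for each packet. The sketch below is only an illustration: the clock frequencies and the number of parallel units are hypothetical values, not measurements of any platform, and the function name cycles_per_packet is an assumption made for this example.

```python
def cycles_per_packet(clock_hz: float, packets_per_second: float,
                      parallel_units: int = 1) -> float:
    """Clock cycles available per packet, summed over all parallel units."""
    return clock_hz * parallel_units / packets_per_second


if __name__ == "__main__":
    # Worst case on 10 Gb/s Ethernet: minimum-size frames occupy
    # 84 bytes (672 bits) on the wire, preamble and gap included.
    worst_case_pps = 10e9 / 672

    # Hypothetical processing units (illustrative values only).
    print(cycles_per_packet(3e9, worst_case_pps))        # one 3 GHz core
    print(cycles_per_packet(200e6, worst_case_pps, 16))  # 16 units at 200 MHz
```

The budget only counts cycles. It says nothing about how much work one cycle can do, which is where the variety and complexity of the available operations come in.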

Flexibility

The computation power alone is not enough to quantify the acceleration factor a platform will bring to a specific application. Depending on the task, the development platform that brings the most acceleration is not always the same. For example, one platform may have very fast memory accesses, which makes it very good at handling large lookup tables, while another may have very fast floating-point computation units, which makes it well suited to complex mathematical operations. This is particularly true for platforms that are specialized for certain tasks and have highly optimized units dedicated to these tasks. An application relying heavily on these tasks will be a perfect fit for such a platform, but some applications may not use these tasks at all. So it is important to identify which tasks might become bottlenecks limiting the maximum supported data rate of an application. Development platforms should be evaluated on these tasks in particular. If the application is not yet well defined, the best choice will be a less specialized platform.

Reliability

To run on a real network, a traffic monitoring application has to be reliable. Reliability has several aspects:
The supported data rate of the application must be reliable: for example, if the computation speed depends on external factors (other applications on the same machine), this can slow down the packet processing and lower the supported data rate during certain periods of time.
The measurements made on the received traffic must be reliable. This is a particular problem for the timestamp that indicates when a packet was received. Some applications rely on this timing information, like applications that measure the Quality of Service (QoS) on a network [OGI+12], or some traffic classification applications [DdDPSR08]. If the time to process a packet is variable and unknown, the timestamp cannot be obtained accurately.
Reliability is mainly linked to the ability of the processing units to support real-time applications, that is to say applications that must perform some computations within a bounded delay.
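The effect of a variable processing delay on timestamps can be sketched as follows. This is a simplified illustration, assuming that the timestamp is taken after the packet has waited for some delay: a constant delay can be subtracted afterwards, while a variable one leaves an uncertainty on the reception time. The function name and the delay values are hypothetical.

```python
from statistics import mean


def timestamp_error(delays_us):
    """Summarize the error on timestamps taken after the given delays."""
    lowest, highest = min(delays_us), max(delays_us)
    return {
        "mean_offset": mean(delays_us),   # correctable if it is known
        "uncertainty": highest - lowest,  # cannot be corrected afterwards
    }


if __name__ == "__main__":
    # Hypothetical delays, in microseconds, between packet reception
    # and the moment the timestamp is actually taken.
    constant_delays = [5, 5, 5, 5]
    variable_delays = [2, 9, 4, 30]
    print(timestamp_error(constant_delays))
    print(timestamp_error(variable_delays))
```

With the constant delays the uncertainty is zero, so subtracting the offset recovers the exact reception time. With the variable delays, the reception time is only known to within the spread of the delays.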

Security

No development platform can guarantee the security of any application. But it is easier to develop secure applications on some platforms. A perfectly secure application would work reliably even if attackers tried to disrupt it, and would not expose data that should remain private. For some applications, security may be crucial. A firewall, for example, is an obvious target for anyone attacking a network. For other applications, like passive probes that only collect statistics about the traffic, security is not the most important feature.

Platform openness

This factor is often overlooked, but the openness of a platform has a significant impact on the development process. Different parameters can make a platform more or less open:
The availability of open-source projects using the platform. Studying a well-documented open-source project is a very easy way to learn how to develop for a specific platform. The presence of an online support community is a big help too [SH13].
The possibility to study and modify the code of the development framework. If this code is open-source and well documented, developers will be able to understand faster how to develop new applications.
The possibility to reuse the code developed for this platform on other similar platforms. It may be important not to be locked in to specific hardware or a specific vendor, so as to be able to switch to a new, cheaper or more powerful option. This depends mainly on the language and framework used. Some frameworks focus only on working on a specific platform, while others focus on interoperability. This difference can be seen for example in frameworks for the development of scientific applications on GPUs: CUDA works only on Nvidia GPUs, while OpenCL is made to work on all GPUs and on other kinds of processing units like FPGAs [DWL+12].

Table of contents:

A Abstract
B Summary in French
B.1 Introduction
B.2 Choosing a development platform
B.3 Software monitoring for security
B.4 Hardware monitoring for traffic classification
B.5 Test platform with hardware acceleration
B.6 Conclusion
1 Introduction
1.1 Context
1.2 Objectives
1.3 Traffic monitoring
1.3.1 Topology
1.3.2 Time constraints
1.3.3 Traffic features
1.3.4 Detection technique
1.3.5 Calibration
1.4 Acceleration challenges
1.4.1 Large data storage
1.4.2 Test conditions
1.5 Thesis structure
2 Choosing a development platform
2.1 Criteria
2.1.1 Supported data rate
2.1.2 Computation power
2.1.3 Flexibility
2.1.4 Reliability
2.1.5 Security
2.1.6 Platform openness
2.1.7 Development time
2.1.8 Update simplicity
2.1.9 Future scalability
2.1.10 Hardware cost
2.2 Commodity hardware
2.2.1 Handling traffic
2.2.2 CPU computation
2.2.3 GPU computation
2.3 Network processors
2.3.1 Principles
2.3.2 Development platforms
2.3.3 Use cases
2.4 FPGAs
2.4.1 Composition of an FPGA
2.4.2 Boards for traffic monitoring
2.4.3 Development principles
2.5 Conclusion
3 Software monitoring applied to security
3.1 State of the art on DDoS detection implementation
3.1.1 Monitoring platforms
3.1.2 DDoS attacks
3.1.3 DDoS detection algorithms
3.2 Flexible anomaly detection
3.2.1 Problem statement
3.2.2 Algorithm for DDoS detection
3.3 A flexible framework: BlockMon
3.3.1 Principles
3.3.2 Performance mechanisms
3.3.3 Base blocks and compositions
3.4 Implementing DDoS detection in BlockMon
3.4.1 Algorithm libraries
3.4.2 Single-node detector implementation
3.4.3 Alternative compositions
3.5 Results
3.5.1 Accuracy
3.5.2 Performance
3.5.3 Going further
3.6 Conclusion
4 Hardware monitoring applied to traffic classification
4.1 State of the art on traffic classification
4.1.1 Port-based classification
4.1.2 Deep Packet Inspection (DPI)
4.1.3 Statistical classification
4.1.4 Behavioral classification
4.2 Using SVM for traffic classification
4.2.1 Proposed solution
4.2.2 Background on Support Vector Machine (SVM)
4.2.3 Accuracy of the SVM algorithm
4.3 SVM classification implementation
4.3.1 Requirements
4.3.2 The SVM classification algorithm
4.3.3 Parallelism
4.4 Adaptation to hardware
4.4.1 Architecture
4.4.2 Flow reconstruction
4.4.3 The RBF kernel
4.4.4 The CORDIC algorithm
4.4.5 Comparing the two kernels
4.5 Performance of the hardware-accelerated traffic classifier
4.5.1 Synthesis results
4.5.2 Implementation validation
4.6 Conclusion
5 Hardware-accelerated test platform
5.1 State of the art on traffic generation
5.1.1 Traffic models
5.1.2 Commercial generators
5.1.3 Software-based generators
5.1.4 Hardware-accelerated generators
5.2 An open-source FPGA traffic generator
5.2.1 Requirements
5.2.2 Technical constraints
5.2.3 Global specifications
5.3 Software architecture
5.3.1 The configuration interface
5.3.2 The configuration format
5.3.3 The control tool
5.4 Hardware architecture
5.4.1 Main components
5.4.2 Inside the stream generator
5.5 Generator use cases
5.5.1 Design of a new modifier
5.5.2 Synthesis on the FPGA
5.5.3 Performance of the traffic generator
5.6 Conclusion
6 Conclusion
6.1 Main contributions
6.1.1 Development platform
6.1.2 Software monitoring applied to security
6.1.3 Hardware monitoring applied to traffic classification
6.1.4 Hardware-accelerated test platform
6.2 Acceleration solutions comparison
6.3 Perspectives
Glossary
Bibliography