Security Protocols and Related Protocols


Voice over IP Overview

Voice over IP has become an active research area within telecommunications in recent years, largely because of its low call costs. This report examines its integration into wireless communications, given the limited bandwidth and other constraints of that environment. Providing suitable security support is the aim of many researchers today. This section briefly introduces the reader to VoIP technology.

Introduction

Voice over IP (also referred to as Voice over Packet, Voice over Internet Protocol, or simply VoIP) consists of several interconnected components that convert a voice signal into a stream of packets on a packet network, and vice versa. Thus, VoIP can be defined as the ability to make phone calls (i.e., to do everything we can do today with the Public Switched Telephone Network, PSTN) and to send faxes over data networks with a suitable quality of service and a much better cost/benefit ratio.
VoIP technology has brought a rich new set of advantages and possibilities. Since data traffic has been growing much faster than telephone traffic in recent years, there has been considerable interest in transporting voice over data networks (allowing voice and fax traffic to travel concurrently with data traffic over a packet data network), rather than carrying data over voice networks as was traditional. This offers the existing telephone capabilities at a significantly lower « total cost of operation ». For end users, a significant example is the cost savings on long-distance telephone calls, obtained without additional constraints being imposed on them. In turn, the increase in traffic volume is very attractive for Internet Service Providers (ISPs), and equipment producers now have an opportunity to innovate and compete.

Components, Protocols, and Standards

Figure 2.1 depicts the infrastructure of a VoIP system. A significant component of the model is the Gateway. The Gateway is in charge of converting the media provided in one type of network to the format required for another type of network.
The voice packets are transported using IP in compliance with a specification for transmitting multimedia (voice, fax, video, and data) across a network. There are several specifications, recommendations, and standards for performing this transmission:
• ITU−T H.323
• Media Gateway Control Protocol (MGCP), from Level 3, Bellcore, Cisco, and Nortel
• IETF Session Initiation Protocol (SIP)
• ITU−T T.38
• Skinny, from Cisco
SIP is nowadays of special interest. SIP is an IETF standard specified in Request For Comments (RFC) 3261[3], and defines a signaling protocol for creating, modifying, and terminating sessions. These sessions are exchanges of data between participants, and include Internet telephone calls, multimedia distributions, and multimedia conferences. For the data transport itself, the most important protocol is the Real-time Transport Protocol (RTP)[2].
As far as quality of service (QoS) and performance are concerned, VoIP is a delay-sensitive application, so a well-engineered, end-to-end network is necessary. Issues such as delay, jitter, congestion, packet loss, and misordered packet arrival must be carefully handled.
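Packet loss and misordered arrival are typically detected from a per-packet sequence number; RTP, for instance, carries a 16-bit one that wraps around. The following Python sketch shows one way a receiver might classify each arriving packet. The function names and the half-range rule for telling "newer" from "older" are illustrative assumptions, not taken from any particular implementation.

```python
def seq_delta(previous, current):
    """Signed distance from previous to current on the 16-bit sequence circle."""
    delta = (current - previous) & 0xFFFF
    # Assumption: distances above half the range mean "current is older".
    return delta - 0x10000 if delta >= 0x8000 else delta


def classify(previous, current):
    """Describe how the packet numbered `current` relates to `previous`."""
    delta = seq_delta(previous, current)
    if delta == 1:
        return "in order"
    if delta > 1:
        return "gap of %d packet(s)" % (delta - 1)
    if delta == 0:
        return "duplicate"
    return "late or reordered"
```

For example, `classify(65535, 0)` reports an in-order packet because the counter has simply wrapped, while `classify(10, 13)` reports a gap of two packets.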
The following list summarizes some examples of services provided by a VoIP network according to market requirements:
• Phone−to−phone
• PC−to−phone and phone−to−PC
• fax−to−fax
• fax−to−email and email−to−fax
• Wireless Connectivity
• PC−to−PC
This study is concerned with the PC−to−PC VoIP service, assuming the users have the appropriate software and hardware installed on their PCs (user agents, sound card, headsets, etc.).

Introduction to SIP and RTP

This section provides a short introduction to the Session Initiation Protocol (SIP) and the Real-time Transport Protocol (RTP), both widely used in VoIP technology. SIP is the most commonly used protocol to create and manage VoIP media sessions, while RTP is the transport protocol in charge of transmitting the data in a VoIP session.

Session Initiation Protocol (SIP)

Introduction

SIP is a signaling protocol used for establishing, modifying, and terminating sessions between users. SIP is defined by the IETF as a standard in RFC 3261.
Many Internet applications require the creation and management of sessions, such as the multimedia real-time exchanges with which this project is concerned. Various protocols are designed to carry this real-time data (voice, video, etc.), such as the Real-time Transport Protocol (RTP), and SIP works in concert with these protocols by establishing, managing, and terminating these exchanges.
As described in the SIP standard (RFC 3261), SIP is an application-layer control protocol with the ability to manage multimedia sessions, such as Internet telephone calls, which makes it suitable for use in VoIP.
SIP supports five aspects regarding the establishment and termination of communications sessions:
• User location: Determination of the destination end system.
• User availability: Determination of the willingness of the called party to accept a call on this device.
• User capabilities: Negotiation of the session parameters.
• Session setup: Establishment of the session.
• Session management: Modification and termination of the session.
Finally, two important ideas to keep in mind are that SIP does not provide services, but rather provides primitives that can be used to implement these services; and that SIP works with either IPv4 or IPv6.
SIP makes use of an offerer/answerer model, in which the caller represents the offerer and the called party represents the answerer.
The purpose of this section is to introduce the SIP protocol. Topics such as the security issues related to SIP (one of the goals of this project) are described in detail later in this document.

Functionality

This subsection presents a simple example of the use of SIP between two end users. This example is related to the SIP Trapezoid and it only shows a simple SIP message exchange. The SIP Trapezoid is depicted in figure 9.1 and described in section 9.1.
In this example, one user (the offerer, referred to as « Alice » for simplicity) calls another user (the answerer, referred to as « Bob ») using his SIP identity, a type of Uniform Resource Identifier (URI) called a SIP URI. A SIP URI is similar to an email address and contains the user name and the host identifier. Alice sends a request called INVITE (an example of a SIP method; these methods are described in section 3.1.3) to Bob’s provider’s SIP server (proxy) via her own provider’s SIP server (proxy). If Bob accepts the call, the media session is established. Figure 3.1 depicts this process. Section 3.1.3 briefly describes the requests and responses.
There are other aspects of SIP functionality besides the establishment and termination of the call. For instance, another important issue is the registration of users with their provider’s servers. When a SIP-based device (called a User Agent) comes online, it must first register with a SIP Registration Server (called a Registrar). This is done by sending a REGISTER message. Registrations are not normally permanent; they bind the user’s ID to an IP address at which the user can be contacted. A brief description of the REGISTER message is given in the next subsection.
The following list enumerates the main abilities SIP has in the VoIP context:
• Registering a user with a system
• Inviting users to join a session
• Negotiating the terms and conditions of a session
• Establishing the media stream between two or more end points
• Terminating sessions
More information and details about SIP can be found in the SIP standard (RFC 3261).
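The registration step described above, in which a user's ID is bound to a contact address for a limited time, can be illustrated with a small Python sketch. This is a toy in-memory model, not the behavior of any real SIP server: the `Registrar` class, its method names, the one-hour default lifetime, and the example addresses are all assumptions made for illustration.

```python
import time


class Registrar:
    """Toy location service: maps a SIP URI to a contact address with an expiry."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock      # injectable clock so expiry can be exercised
        self._bindings = {}      # SIP URI -> (contact address, expiry time)

    def register(self, sip_uri, contact, expires=3600):
        """Handle a REGISTER: bind the user's ID to where it can be reached."""
        if expires <= 0:
            # Assumption: a non-positive lifetime removes the binding.
            self._bindings.pop(sip_uri, None)
            return
        self._bindings[sip_uri] = (contact, self._clock() + expires)

    def lookup(self, sip_uri):
        """Return the current contact for a user, or None if unknown or expired."""
        entry = self._bindings.get(sip_uri)
        if entry is None:
            return None
        contact, expiry = entry
        if self._clock() >= expiry:
            del self._bindings[sip_uri]   # binding lapsed; user must re-register
            return None
        return contact
```

A proxy handling an incoming INVITE would call `lookup` on the callee's SIP URI; a result of `None` corresponds to a user who never registered or whose registration has lapsed.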

SIP Requests and Responses

As seen in Figure 3.1, SIP is based on an HTTP-like request/response model (also referred to as offer/answer). The SIP specification defines a set of request messages (which in turn invoke SIP methods) and responses to those requests.
The most important method in SIP is the INVITE method, which is used to establish a session between participants (these participants are supposed to have previously registered with their respective provider’s SIP Registrars). As an example, the following paragraph shows how the first INVITE message shown in Figure 3.1 looks:
Via: SIP/2.0/UDP;branch=z9hG4bK776asdhds
Max-Forwards: 70
To: Bob <>
From: Alice <>;tag=1928301774
CSeq: 314159 INVITE
Contact: <>
Content-Type: application/sdp
Content-Length: 142
(Alice’s SDP not shown)
The first line identifies the method name, and the following lines are a minimum required set of fields of the INVITE message header. These fields are briefly described below:
• Via contains the address at which Alice expects to receive response to her request. The branch parameter identifies the transaction.
• To contains a display name and the SIP URI to which the request was directed.
• From identifies the originator of the request by his/her display name and his/her SIP URI. The tag parameter is used for identification purposes.
• Call-ID is a globally unique identifier for this call.
• CSeq stands for Command Sequence and it is an integer used as a traditional sequence number.
• Contact is a SIP URI that represents a direct route to contact Alice.
• Max-Forwards limits the number of hops to the destination.
• Content-Type describes the message body (the body is not shown).
• Content-Length defines the length of the message body.
Section 20 of the SIP standard describes the complete set of header fields.
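To make the header fields described above concrete, the following Python sketch splits a SIP request into its method, request URI, and header fields. The message text is the INVITE example from Section 4 of RFC 3261, used here as a stand-in, so its addresses are those of the RFC and not necessarily those in Figure 3.1. The body is omitted and Content-Length is set to 0 accordingly. The function names are illustrative, and the parser ignores details a real implementation must handle (repeated headers, compact header forms, line folding).

```python
# Example INVITE from RFC 3261, Section 4 (body omitted, hence Content-Length: 0).
INVITE = (
    "INVITE sip:bob@biloxi.com SIP/2.0\r\n"
    "Via: SIP/2.0/UDP pc33.atlanta.com;branch=z9hG4bK776asdhds\r\n"
    "Max-Forwards: 70\r\n"
    "To: Bob <sip:bob@biloxi.com>\r\n"
    "From: Alice <sip:alice@atlanta.com>;tag=1928301774\r\n"
    "Call-ID: a84b4c76e66710@pc33.atlanta.com\r\n"
    "CSeq: 314159 INVITE\r\n"
    "Contact: <sip:alice@pc33.atlanta.com>\r\n"
    "Content-Type: application/sdp\r\n"
    "Content-Length: 0\r\n"
    "\r\n"
)


def parse_request(text):
    """Split a SIP request into (method, request URI, header dict)."""
    head, _, _body = text.partition("\r\n\r\n")
    lines = head.split("\r\n")
    method, request_uri, version = lines[0].split(" ")
    if version != "SIP/2.0":
        raise ValueError("unsupported SIP version: " + version)
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        # Header names are case-insensitive; this sketch keeps one value per name.
        headers[name.strip().lower()] = value.strip()
    return method, request_uri, headers


def split_sip_uri(uri):
    """Return (user, host) from a simple 'sip:user@host' URI."""
    scheme, _, rest = uri.partition(":")
    if scheme != "sip":
        raise ValueError("not a SIP URI: " + uri)
    user, _, host = rest.partition("@")
    return user, host
```

Calling `parse_request(INVITE)` yields the method name from the first line and a dictionary keyed by lower-cased header name, so `headers["cseq"]` holds the sequence number and method of the request.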
The details of the session to be established are not described by SIP itself; they are carried in the SIP message body, encoded using another protocol, typically the Session Description Protocol (SDP)[8].
Another important SIP method is REGISTER. As said above, this method is used to register a device address with a system (via SIP Registration Server or Registrar). It is necessary for a device to perform the registration in order to provide location information to permit incoming calls.
Other SIP methods are:
• ACK: Confirms that the client has received a final response to an INVITE request.
• BYE: Indicates that the user wants to terminate a session. This message may be sent by either the originator of the call or the receiver.
• CANCEL: Cancels a previous request message (see note 3).
There are many different responses to these methods carried by request messages, all of them divided into six different groups [7]:
• 1xx Responses: Informational Responses (e.g. 180 Ringing and 100 Trying).
• 2xx Responses: Successful Responses (e.g. 200 OK).
• 3xx Responses: Redirection Responses (e.g. 302 Moved Temporarily).
• 4xx Responses: Request Failure Responses (e.g. 404 Not Found).
• 5xx Responses: Server Failure Responses (e.g. 503 Service Unavailable).
• 6xx Responses: Global Failure Responses (e.g. 600 Busy Everywhere).
The complete list and description of the SIP requests and responses can be found in the SIP standard.
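Because the response group is determined by the first digit of the status code, a client can classify any response with simple integer arithmetic. The Python sketch below illustrates this; the function names are assumptions, and the group labels follow the list above.

```python
RESPONSE_CLASSES = {
    1: "Informational",
    2: "Successful",
    3: "Redirection",
    4: "Request Failure",
    5: "Server Failure",
    6: "Global Failure",
}


def response_class(status_code):
    """Map a SIP status code (100-699) to the name of its response group."""
    if not 100 <= status_code <= 699:
        raise ValueError("not a SIP status code: %r" % status_code)
    return RESPONSE_CLASSES[status_code // 100]


def is_final(status_code):
    """1xx responses are provisional; every other group is a final response."""
    return status_code >= 200
```

For instance, `response_class(180)` gives "Informational" and `is_final(180)` is false, whereas 200 OK is a final response, which the caller then confirms with ACK.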


Real-time Transport Protocol (RTP)

Introduction

The Real-time Transport Protocol (RTP) has been an IETF standard since 1996, specified in Request For Comments (RFC) 1889. RTP is a transport protocol for real-time applications that provides end-to-end network functions and services suitable for transmitting real-time data, such as audio, video, or simulation data, over unicast or multicast network services. RTP runs on top of an unreliable transport protocol, such as UDP, to make use of its multiplexing and checksum services.
RTP also provides a control protocol called RTP Control Protocol (RTCP), used for monitoring data delivery and to provide minimal control and identification functionality.
The services provided by RTP for real-time data delivery include sequence numbering, payload type identification (such as audio samples or compressed video data), timestamping, and delivery monitoring. Security services for RTP and RTCP may be provided in several different ways, such as IPSec encapsulation over Virtual Private Networks (VPNs). The RTP standard also presents some mechanisms to provide this security. However, a powerful alternative is the RTP profile Secure Real-time Transport Protocol (SRTP)[15]. RTP security issues and solutions to secure the RTP and RTCP traffic (one of the goals of this project) are described in detail later in this document.
3. It is important not to confuse the CANCEL message with the BYE message. See Chapter 7, p. 164 in [7] for clarification.

Terminology and definitions

Of special interest for us is the definition of an RTP Session given in RFC 1889:
« RTP session: The association among a set of participants communicating with RTP. For each participant, the session is defined by a particular pair of destination transport addresses (one network address plus a port pair for RTP and RTCP). The destination transport address pair may be common for all participants, as in the case of IP multicast, or may be different for each, as in the case of individual unicast network addresses plus a common port pair. In a multimedia session, each medium is carried in a separate RTP session with its own RTCP packets. The multiple RTP sessions are distinguished by different port number pairs and/or different multicast addresses »[2].
Other significant definitions are summarized as follows:
• Synchronization Source (SSRC): The source of a stream of RTP packets identified by a 32−bit numeric SSRC identifier carried in the RTP header, so as not to be dependent upon the network address. The RTP sender is an example of such a source. More information can be found in [2].
• Contributing Source (CSRC): A source of a stream of RTP packets that has contributed to the combined stream produced by an RTP mixer. The list of these sources is called the CSRC list. More information about RTP mixers can be found in [2].
• End system: An application that generates the content to be sent in RTP packets and/or consumes the content of received RTP packets. An end system can act as one or more synchronization sources in a particular RTP session, but typically acts as only one (see [2]).

RTP Packet Format

The RTP packet consists of a fixed header, a possibly empty list of contributing sources (unicast transmission), and a payload. The payload contains the real−time application data, such as audio or video data. Detailed information about the payload types is given in the RTP standard (RFC 1889).
The RTP header is depicted in Figure 3.2, and the fixed part has the following fields:
• Version (V): 2 bits. This field identifies the version of RTP. By default it is set to the value 2 for the RFC 1889 RTP specification.
• Padding (P): 1 bit. Set to the value 1 if padding has been applied to this packet.
• Extension (X): 1 bit. If the extension bit is set, the header is followed by exactly one extension field. Detailed information about the RTP extensions is given in section 5.3.1 in [2].
• CSRC count (CC): 4 bits. This field contains the number of CSRC identifiers that follow the RTP header.
• Marker (M): 1 bit. The interpretation of this field is defined by an RTP profile. See section 5.3 in [2] for further information about RTP profiles.
• Payload Type (PT): 7 bits. This field identifies the format of the RTP payload and determines its interpretation by the real−time application.
• Sequence Number: 16 bits. This field increments by one for each RTP packet sent. It may be used by the receiver to detect packet loss. The initial value of this field is random.
• Timestamp: 32 bits. This value reflects the sampling instant of the first octet in the RTP packet. As for the sequence number, the initial value is random.
• SSRC: 32 bits. This field identifies the synchronization source.
The CSRC list (zero to fifteen items, each 32 bits in length) identifies all the contributing sources for the payload of the packet. As noted above, the CC field in the fixed header contains the number of sources identified.
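The field layout listed above can be made concrete with a short Python sketch that packs and parses the fixed header plus the CSRC list, using the bit widths from the list (12 bytes for the fixed part, 4 bytes per CSRC). The function names and dictionary keys are illustrative assumptions; padding, header extensions, and the payload are left out.

```python
import struct


def pack_rtp_header(payload_type, sequence, timestamp, ssrc,
                    marker=False, csrcs=()):
    """Build the fixed RTP header followed by the CSRC list (no payload)."""
    if len(csrcs) > 15:
        raise ValueError("the 4-bit CC field allows at most 15 CSRC identifiers")
    version, padding, extension = 2, 0, 0
    # First byte: V (2 bits) | P (1 bit) | X (1 bit) | CC (4 bits)
    byte0 = (version << 6) | (padding << 5) | (extension << 4) | len(csrcs)
    # Second byte: M (1 bit) | PT (7 bits)
    byte1 = (int(marker) << 7) | (payload_type & 0x7F)
    header = struct.pack("!BBHII", byte0, byte1, sequence & 0xFFFF,
                         timestamp & 0xFFFFFFFF, ssrc & 0xFFFFFFFF)
    return header + b"".join(struct.pack("!I", c) for c in csrcs)


def parse_rtp_header(packet):
    """Decode the fixed header and CSRC list from the start of a packet."""
    if len(packet) < 12:
        raise ValueError("shorter than the 12-byte fixed RTP header")
    byte0, byte1, sequence, timestamp, ssrc = struct.unpack("!BBHII", packet[:12])
    cc = byte0 & 0x0F
    if len(packet) < 12 + 4 * cc:
        raise ValueError("packet truncated inside the CSRC list")
    csrcs = struct.unpack("!%dI" % cc, packet[12:12 + 4 * cc])
    return {
        "version": byte0 >> 6,
        "padding": bool(byte0 & 0x20),
        "extension": bool(byte0 & 0x10),
        "csrc_count": cc,
        "marker": bool(byte1 & 0x80),
        "payload_type": byte1 & 0x7F,
        "sequence": sequence,
        "timestamp": timestamp,
        "ssrc": ssrc,
        "csrcs": csrcs,
    }
```

Network byte order (`!`) is used throughout, so a header built by `pack_rtp_header` and read back by `parse_rtp_header` returns the same field values.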

Table of contents :

1. Introduction
2. Voice Over IP Overview
2.1 Introduction
2.2 Components, Protocols, and Standards
3. Introduction to SIP and RTP
3.1 Session Initiation Protocol (SIP)
3.1.1 Introduction
3.1.2 Functionality
3.1.3 SIP Requests and Responses
3.2 Real−Time Protocol (RTP)
3.2.1 Introduction
3.2.2 Terminology and Definitions
3.2.3 RTP Packet Format
3.2.4 RTCP Packet Format
4. Security Services
4.1 Security Attacks
4.2 Authentication
4.3 Access Control
4.4 Data Confidentiality
4.5 Data Integrity
4.6 Non−Repudiation
4.7 Availability
5. Cryptography Overview
5.1 Basic Knowledge
5.1.1 Introduction
5.1.2 Symmetric Cryptography
5.1.3 Asymmetric Cryptography
5.1.4 Symmetric Cryptography vs. Asymmetric Cryptography
5.1.5 Cryptanalysis
5.1.6 One−way Hash Functions and MACs
5.1.7 Overview of the Hash-based Message Authentication Code: HMAC
5.1.8 Certificates
5.1.9 Location of encryption devices
5.2 Basic Algorithm and Methods
5.2.1 Advanced Encryption Standard (AES): AES History; Overview of the Algorithm
5.2.2 Data Encryption Standard (DES)
5.2.3 Secure Hash Algorithm (SHA): SHA History; Overview of the Algorithm
5.2.4 Hash−Based Message Authentication Code
6. Public−Key Infrastructures
6.1 Introduction and Terminology
6.2 X.509 Certification Infrastructure
6.2.1 Chaining
6.2.2 Revocation of Certificates and Certificate Revocation Lists (CRLs)
6.3 Certification Infrastructure Models
7. Introduction to Security Protocols and Related Protocols
7.1 Internet Protocol Security (IPSec)
7.1.1 Introduction, Applications, and Benefits of IPSec
7.1.2 IPSec Architecture: IPSec Transport and Tunnel Modes; Authentication Header (AH); Encapsulating Security Payload (ESP)
7.2 Transport Layer Security (TLS)
7.2.1 Introduction, Applications, and Benefits
7.2.2 SSL/TLS Architecture: SSL/TLS Record Protocol; SSL/TLS Handshake Protocol
7.3 Key Management Protocols
7.3.1 Introduction
7.3.2 IKE / ISAKMP
7.3.3 Simple Diffie−Hellman Key Exchange
8. Objective: Enabling a Secure Mobile VoIP call
9. Mobile Voice over IP: The Model and its Components
9.1 Significant Components
9.1.1 Mobile Nodes
9.1.2 SIP Servers
9.1.3 DNS Servers
9.2 The SIP Trapezoid
9.3 The SIP Registration
9.4 The RTP Session
9.5 Other Components
9.5.1 Home Agents
9.5.2 AAA Servers
9.5.3 Access Points
10. Alternative Solutions for Secure Mobile Voice over IP
10.1 Security Requirements of the Model
10.2 Securing SIP
10.2.1 Using SSL/TLS in a PKI
10.2.2 Using IPSec
10.2.3 Securing SDP Bodies and SIP Headers
10.2.4 Securing the DNS look−up
10.2.5 Conclusions
10.3 Securing the media stream
10.3.1 Secure Transport Protocol
10.3.2 Key Management
11. A Secure Model for Mobile Voice over IP
11.1 Overview of the Model
11.2 Interoperation of the Components
11.3 Rationale
11.3.1 TLS supported by a PKI
11.3.2 DNSSEC
11.3.3 The User Agent: MINISIP
11.3.4 SRTP vs. IPSec and VPNs
11.3.5 MIKEY
12. SIP Security
12.1 Background
12.2 TLS within SIP
12.3 A First Approach
13. Secure Real−Time Protocol
13.1 SRTP Description
13.1.1 SRTP Packet
13.1.2 SRTCP Packet
13.1.3 Message Authentication and Integrity
13.1.4 Key Derivation
13.1.5 Cryptographic Context
13.1.6 Packet Processing
13.1.7 Predefined Algorithms: Encryption; Message Authentication and Integrity
13.2 SRTP Implementation: MINIsrtp
13.2.1 Introduction
13.2.2 Tools
13.2.3 Features
13.2.4 Description Classes Algorithm SRtpPacket Class Methods CryptoContext Class Methods Bug Information License
14. Multimedia Internet KEYing (MIKEY)
14.1 Overview
14.2 MIKEY Framework for Secure Mobile VoIP
14.2.1 Terminology Relationship
14.2.2 MIKEY within SIP
14.2.3 MIKEY Integration into SDP
14.2.4 Error Handling
14.2.5 MIKEY Over an Unreliable Transport Protocol
14.2.6 MIKEY Payloads
14.2.7 MIKEY Interface
14.2.8 MIKEY Exchange Method: Signed Diffie−Hellman
15. Description of the Implementation of the Model and its Analysis
15.1 Implementation
15.1.1 MINIsrtp Development
15.1.2 Integration of MINIsrtp into MINISIP User Agent
15.1.3 Setting up of the SIP Servers
15.2 Analysis and Validation of the Model
15.2.1 MINIsrtp Correctness
15.2.2 Performance Measurements on MINIsrtp
16. Conclusions and Future Work
16.1 Conclusions
16.2 Future Work in this Area
Appendix A: MINIsrtp Source Code
Appendix B: A First Approach to a MIKEY Messages Implementation
Appendix C: Acronyms
Appendix D: Notation
Appendix E: Glossary
Figures and Tables Index

