Information Retrieval in Telecommunication Databases
Telecommunication networks store vast repositories of data, which provide the opportunity to use information extracted from these databases to optimise the network, improve network quality and, more importantly from the network operator's point of view, maximise the return on investment (ROI) from the network. With the explosion of the information age, the growth of and access to information across all disciplines of industry has been exponential. However, the explosive growth of many databases has far outpaced the ability to interpret and digest this data. With the large volumes of data generated on a daily basis, there is a growing need to extract information that can be considered useful knowledge. Owing to the constant growth in demand for capacity in the network, a systematic approach to cellular network planning is crucial for maintaining effective growth and returns on investment. Given the availability of relatively accurate data from a cellular network, network planning that takes teletraffic issues into consideration is vital for the long-term characterisation of subscriber behaviour.
Given the drastically varying socio-economic status of the various sectors of a typical developing country, the traffic trends generated in different sectors of a network can vary tremendously. By determining the types of traffic classes that contribute to the traffic loads in a given network, long-term traffic trends can be predicted for the purposes of capacity planning. As highlighted previously, one of the objectives of this study is to develop a classification mechanism that categorises the various sectors of a mobile subscriber market into traffic classes, and then to use the identified traffic classes for the benefit of a mobile network provider, for example in its network planning strategy. With the advent of database technologies over the past few decades, most modern enterprises have accumulated years of data in databases that store information about their customers, products and the various services they provide. Modern economies have become highly competitive, with a focus on customers and services. Given the large amounts of data available in a mobile network, mechanisms that can extract this data and convert it into useful information would be beneficial in determining the growth in demand for services in the network, as well as in detecting demand anomalies that may arise in typical networks. The following section highlights the impact of subscriber classes on mobile networks.
Impact of Subscriber Classes in Mobile Networks
While mobile network coverage in most urban areas of developing countries is adequate, coverage in many rural areas has been much lower. The rise in mobile network coverage has largely been driven by growth in coverage in rural areas, where universal access and services (UAS) mechanisms have been lacking. Regression analysis comparing mobile subscriptions per 100 inhabitants with gross national income per capita (in US$) has indicated that income plays a role in penetration levels. Competition has clearly been a key driver in reducing call tariffs in many markets. Innovative business models such as the prepaid system have led to greater stimulation of markets across Africa. Prices that vary with call type and with the time of day (peak, off-peak) also affect call usage by subscribers. In addition, a reduction in the cost of handsets could further stimulate the market, especially among low-income groups. According to Gartner, 428 million mobile communication devices were sold worldwide in the first quarter of 2011, a 19% increase year-on-year. With most rural communities being dependent on voice-based services, and with innovative applications such as the M-PESA system, launched by the Kenyan mobile operator Safaricom in March 2007 and now being extended to various other markets in Africa, mobile devices have become an integral part of most people's everyday lives. However, the consequence is increased pressure on mobile network planners to meet the rising demand for capacity. In addition, catering for different subscriber classes in the network is a particular challenge in many developing countries because of varying service demands.
Customer Segmentation in the Telecommunication Industry
Businesses in the service industry need a better understanding of their customers if they are to provide better services to them. By extracting information about customers, companies are able to make major decisions regarding the reorganisation of the business, service offerings, marketing and so on. Market segmentation has shown various benefits over mass marketing strategies. Customer segmentation can be defined as the process of dividing customers into homogeneous groups on the basis of common attributes. Information about a customer provides the opportunity to expand into specific markets and to increase profitability in them. A common method for segmentation is the use of clustering techniques. Clustering algorithms can be employed to discover groups of customers with common attributes. Clustering is appropriate when very little is known about the categorisation structure of the underlying data. However, the type of data used and the selection of meaningful attributes play a critical role in the performance of the clustering process.
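As an illustration of how a clustering algorithm can group customers with common attributes, the following is a minimal sketch of k-means (Lloyd's algorithm) in plain Python. It is not the method used in this study; the function name `kmeans`, the two usage attributes (monthly voice minutes and SMS count) and the sample values are hypothetical. In practice the attributes would normally be scaled to comparable ranges before clustering.

```python
import math
import random


def kmeans(points, k, iterations=50, seed=0):
    """Cluster points (lists of floats) into k groups with Lloyd's algorithm."""
    rng = random.Random(seed)
    # Initialise centroids from k distinct, randomly chosen points.
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return labels, centroids


# Hypothetical subscribers: [monthly voice minutes, monthly SMS count].
subscribers = [
    [30.0, 5.0], [45.0, 8.0], [38.0, 4.0],        # low-usage subscribers
    [600.0, 40.0], [640.0, 55.0], [580.0, 35.0],  # high-usage subscribers
]
labels, centroids = kmeans(subscribers, k=2)
```

The returned `labels` assign each subscriber to a segment, and `centroids` summarise the typical usage of each segment. As noted above, the result depends strongly on which attributes are chosen.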
In the telecommunications industry, it is often useful to profile customers based on their patterns of phone usage. These profiles can then be used for marketing and/or forecasting purposes. Various applications for segmentation have been proposed. By separating subscribers into customer groups with common behaviour, the provision of better calling rates to such a group, for example, could encourage greater usage of services. Identifying subscribers who are likely to adopt newer services is also useful in determining how best to offer attractive packages to such segments. Mazzoni et al. investigate the characteristics of Italian cell phone users. A multi-dimensional segmentation approach is used firstly to determine whether differences exist among Italian mobile phone users that lead to the identification of different market segments, and secondly to determine whether these segments can be described using the multi-dimensional approach. Kianmehr et al. conduct cluster analysis to identify calling communities using information derived from call data records (CDR). In other work, factor analysis, clustering and quantitative association are used to find the service adoption patterns of segmented groups. From a network planning and optimisation point of view, the network is not homogeneous but heterogeneous, comprising subscribers that behave differently. Once market segments have been identified, the network planning strategies employed for cells deployed in these segments have to be managed more effectively.
Various kinds of data are generated in a typical telecommunication network, including CDR data, network data describing the state of hardware and software components in the network, and customer data that provide information on customers. A general overview of the Global System for Mobile Communications (GSM) is provided in Appendix A. The operation and maintenance centre (OMC) within such networks provides various measurements that are generated by different counters. These may be sorted into four domains: handover, quality of service, resource availability and usage, and traffic load. Traffic load can be measured from the counters that record when each available traffic channel (TCH) is busy. In this study, traffic data extracted from the OMC of a typical mobile network is considered. As a basis for proposing a suitable approach to feature extraction and classification using this data, the following section provides an overview of related work on feature extraction and classification.
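To make the link between TCH busy time and traffic load concrete, the sketch below applies the standard teletraffic definition of carried traffic in Erlangs: total channel busy time divided by the length of the observation period. The function names and the busy-time readings are hypothetical. Real OMC counter names, formats and granularity vary by vendor and are not specified here.

```python
def carried_traffic_erlangs(busy_seconds_per_tch, period_seconds):
    """Carried traffic in Erlangs for one cell over one observation period.

    busy_seconds_per_tch holds the busy time of each TCH during the period
    (hypothetical counter values, one entry per channel).
    """
    if period_seconds <= 0:
        raise ValueError("period_seconds must be positive")
    return sum(busy_seconds_per_tch) / period_seconds


def tch_utilisation(busy_seconds_per_tch, period_seconds):
    """Fraction of the available TCH capacity that was in use."""
    channels = len(busy_seconds_per_tch)
    if channels == 0:
        raise ValueError("at least one TCH is required")
    return carried_traffic_erlangs(busy_seconds_per_tch, period_seconds) / channels


# One hour of made-up busy-time readings (in seconds) for a cell with 7 TCHs.
busy = [1800, 2700, 900, 3600, 0, 1350, 450]
load = carried_traffic_erlangs(busy, 3600)
utilisation = tch_utilisation(busy, 3600)
```

Repeating this calculation for every measurement period gives a traffic time series per cell, which is the kind of signal that feature extraction approaches operate on.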
Related Work on Feature Extraction and Classification Approaches
Signal processing approaches have been widely used for feature extraction and classification. However, one of the problems with traditional signal processing approaches is their inefficiency in providing sufficient information in the time domain. Approaches such as the wavelet transform have shown the ability to handle signals over short time intervals for high-frequency components as well as over long time intervals for low-frequency components. Wavelet-based multi-resolution analysis technique (WMAT) approaches have also been shown to be useful in denoising multi-dimensional spatial/temporal signals containing steady/unsteady noise. The empirical mode decomposition (EMD) approach has been shown to be a powerful tool for analysing composite, non-linear and non-stationary signals. The difference histogram approach, a relatively new approach to feature extraction, has been shown to have benefits in terms of computational complexity and suitability for real-time applications.
The choice of a suitable feature extraction approach depends on the ability of the method to handle the characteristics of the input signal. Various approaches for extracting features for particular applications have been demonstrated in the literature. Eristi et al. use the wavelet transform and a support vector machine (SVM) to extract features from the impulse test response of a transformer in the time-frequency domain and to classify the patterns inherent in the extracted features. Uyar et al. use a wavelet-based extraction approach based on norm entropy, together with a classifier based on a multi-layer perceptron, for power quality (PQ) disturbance classification. It is shown that a smaller feature vector is achieved through multi-resolution analysis, together with an increase in classification accuracy. Gaouda et al. use WMAT to detect and localise different power quality problems. Standard deviation curves are introduced at different resolution levels to classify different power quality problems. Peilin et al. use a combination of the wavelet transform (WT) and the SOLAR system (a sparsely connected multi-layer information theory based system) to achieve PQ classification. It is shown that the combination of the WT and the SOLAR system can achieve good PQ classification performance. Hu et al. use wavelet packet energy entropy and a weighted SVM to automatically detect and classify PQ disturbances. The wavelet packet transform (WPT) is used to denoise the digital signals and to decompose the signals to obtain common features. The SVM is then trained on these features to make decisions regarding the type of disturbance. Wu et al. develop an automotive generator fault diagnosis system using the discrete wavelet transform (DWT) and an artificial neural network (ANN) for classification. He et al. propose the use of WMAT for reducing noise induced by complex uncertainty in data cleansing and parameter estimation in river water quality simulation. Results show that the WMAT does not distort clean data and can effectively reduce noise in the polluted data. The DWT is used to reduce the complexity of the feature vectors.
Ekici et al. use the WPT and an ANN for estimating fault locations in transmission lines. Energy and entropy criteria are applied to the wavelet packet coefficients to decrease the size of the feature vectors. It is shown that the approach provides a reliable method for reducing the size of data sets and enables the estimation process to be quick and accurate. Youn et al. propose the use of the discrete wavelet packet transform (DWPT) and an infinite impulse response (IIR) polyphase filtering scheme for a fast spectrum sensing algorithm for cognitive radios. The DWPT is used to analyse frequency bands of interest based on multi-resolution analysis, while the IIR scheme is used to reduce the complexity of the DWPT implementation. Reeve et al. use the WPT to investigate the multi-scale temporal variability of beach profiles. Vatansever et al. perform power parameter calculations based on the WPT. From the above, wavelet-based signal decomposition is seen as a suitable approach for feature extraction, especially in applications where signal behaviour prompts the use of multi-scale techniques.
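The idea of reducing wavelet packet coefficients to a short energy-based feature vector can be sketched as follows. This is an illustrative example only: it assumes the Haar wavelet for simplicity (the works cited above use their own choices of wavelet and criteria), and the function names and the hourly traffic values are hypothetical. Terminal nodes are returned in natural (filter-bank) order rather than strict frequency order.

```python
import math


def haar_step(signal):
    """One orthonormal Haar analysis step: returns (approximation, detail)."""
    s = 1.0 / math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) * s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) * s for i in range(0, len(signal), 2)]
    return approx, detail


def wavelet_packet_energies(signal, levels):
    """Relative energy of each terminal node of a full Haar packet tree."""
    if levels < 1 or len(signal) % (2 ** levels) != 0:
        raise ValueError("signal length must be divisible by 2**levels")
    nodes = [list(signal)]
    for _ in range(levels):
        next_nodes = []
        for node in nodes:
            approx, detail = haar_step(node)
            # A packet transform splits BOTH branches at every level,
            # unlike the plain DWT, which only splits the approximation.
            next_nodes.extend([approx, detail])
        nodes = next_nodes
    energies = [sum(x * x for x in node) for node in nodes]
    total = sum(energies)
    return [e / total for e in energies] if total > 0 else energies


# Made-up hourly traffic (Erlangs) for one cell over 24 hours.
hourly_traffic = [2, 1, 1, 1, 2, 4, 9, 15, 18, 17, 16, 17,
                  18, 17, 16, 17, 19, 21, 20, 16, 12, 8, 5, 3]
features = wavelet_packet_energies(hourly_traffic, levels=3)
```

Here 24 samples are reduced to 8 relative sub-band energies. Because the Haar step is orthonormal, the sub-band energies sum to the energy of the original signal, so the normalised features sum to one.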
Bao et al. use the EMD approach to extract modulation information from signals contaminated with noise. It is shown that the EMD method is capable of adaptively recovering the amplitude-modulated components from strong background noise. Junsheng et al. propose a fault feature extraction approach based on EMD and an auto-regressive (AR) model for roller bearing fault diagnosis. The EMD method is used to decompose the non-stationary vibration signal of a roller bearing into a number of intrinsic mode functions (IMFs); an AR model is then established for each IMF, and these models are regarded as feature vectors. Lin et al. propose an improved EMD method for signal feature extraction. In the improved scheme, the optimal envelope mean is obtained by an inverse EMD filter, and a new sifting stop criterion is proposed to guarantee orthogonality of the sifting results. As an alternative to wavelet-based approaches, EMD-based decomposition is thus seen as a suitable approach for feature extraction, especially in applications where signals are contaminated with noise.
The concept of a difference histogram (DH) approach has been introduced in the literature, where it is applied to a two rolling element bearing time series classification problem. The primary benefits of the approach are firstly its ease of implementation and secondly its reduced complexity. This could reduce the complexity of processing large amounts of data, as is the case in this study. It has been shown that, in some cases, the DH approach performs better than alternative approaches to feature extraction and classification. As a result, the difference histogram approach is also considered in this study to determine how well it performs on network traffic data and to assess its feature extraction capability.
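The exact definition of the difference histogram is given in the cited work and is not reproduced in this section. Purely to illustrate why a histogram-based feature is cheap to compute, the sketch below assumes one simple reading: a normalised histogram of successive differences of a time series. This assumption may differ from the published DH definition, and the function name, bin settings and sample series are hypothetical.

```python
def first_difference_histogram(series, bins, low, high):
    """Normalised histogram of successive differences x[t+1] - x[t].

    ASSUMPTION: this is an illustrative stand-in, not necessarily the
    difference histogram defined in the cited work. Differences that fall
    outside [low, high) are clipped into the first or last bin.
    """
    if bins < 1 or high <= low or len(series) < 2:
        raise ValueError("need bins >= 1, high > low and at least 2 samples")
    width = (high - low) / bins
    counts = [0] * bins
    for prev, curr in zip(series, series[1:]):
        diff = curr - prev
        index = int((diff - low) // width)
        index = max(0, min(bins - 1, index))  # clip out-of-range values
        counts[index] += 1
    total = len(series) - 1
    return [c / total for c in counts]


# Made-up traffic samples; the differences are 2, 0, -3, 2, 0, 3.
series = [2, 4, 4, 1, 3, 3, 6]
hist = first_difference_histogram(series, bins=4, low=-4, high=4)
```

The computation is a single pass over the data with one subtraction and one bin update per sample, which is consistent with the low complexity attributed to histogram-style features above. The fixed-length `hist` vector can be passed directly to a classifier or clustering step.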
Mobile Network Planning and Optimisation
The second aspect of this study deals with the optimisation of radio resources in a mobile network based on features extracted from traffic data in the network. The strategy employed for network evolution to meet rising demand should ideally minimise the cost impact of the changes that need to be made to the network. In a staged manner, the first step would be to improve coverage to meet rising demand. Once maximum coverage has been reached, capacity limits would be encountered, leading to a capacity-limited network. To overcome this situation, the hardware and software aspects of the network are adapted to extend its capacity. The first strategy is to begin splitting BTSs into macro and micro base stations. The next strategy is to utilise dual bands to meet the capacity requirements through the implementation of a new frequency band. This would be the last option, as it is costly and affects the frequency planning of the network. A summary of the network evolution is shown in Figure 2.4.
A number of studies have been conducted with a focus on improving and optimising the development of cellular networks. Some of these methods focus on strategies that make use of various algorithms to improve or optimise the development of networks. The overall goal of any planning strategy is to minimise the overall cost of the network design. Once an area has been identified for service provision, the objective of any network design would be to develop a network that is optimised in terms of the number of transmitters (base stations), capacity (number of subscribers) and frequency. Optimum base station selection is the focus of several studies. The aim of adaptive base station estimation is to generate an optimised number of required base stations and to determine optimum base station locations depending on subscriber densities that would establish a confluence to a specific location.
Table of contents:
1.2.1 The Global Growth of the Mobile Cellular Market
1.2.2 Drivers for Mobile Subscriber Growth
1.2.3 Traditional Radio Network Planning
1.2.4 Subscriber Segmentation and Time Series Data in Mobile Networks
1.3 Problem Statement
Sub-Problem 1
Sub-Problem 2
1.4 Benefits of Study
1.7 Research Methodology
1.8 Contributions and Outputs of Study
1.9 Outline of Thesis
2 Literature Review
2.2 Impact of Growth on Mobile Networks
2.3 Information Retrieval in Telecommunication Databases
2.3.1 Impact of Subscriber Classes in Mobile Networks
2.3.2 Customer Segmentation in the Telecommunication Industry
2.4 Related Work on Feature Extraction and Classification Approaches
2.5 Mobile Network Planning and Optimisation
2.5.1 Related work on Network Planning and Optimisation
2.6 Channel Allocation Strategies in Mobile Networks
2.6.1 Related Work on Channel Allocation Approaches
2.7 Approaches for Solving Combinatorial Optimisation Problems
2.7.1 Related work on the use of Combinatorial Optimisation Solvers
3 Feature Extraction for Subscriber Classification in Mobile Cellular Networks
3.2 Selecting a Suitable Feature Extraction Approach
3.2.1 Multi-resolution Based Approaches
3.2.2 Time Series Based Analysis
3.3 Background on Feature Extraction Approaches Considered in this Study
3.3.1 The Empirical Mode Decomposition Approach
3.3.2 Wavelet Based Approaches
Choice of Analysing Wavelet
3.3.3 The Difference Histogram Approach
3.3.4 Dimensionality Reduction for the Improvement of Classification
3.4 First-Phase Development of a Feature Extraction and Mobile Cellular Network Subscriber Classification Approach
3.4.1 Mobile Cellular Network Traffic Data Characteristics
3.5 Signal Decomposition and Feature Extraction for Traffic Data Sets
3.5.1 Cluster Analysis of Extracted Features
3.6 Detailed Analysis of Feature Extraction and Classification Approach
3.6.1 Analysis of the Proposed Subscriber Classification Approach using Selected Feature Extraction Approaches
3.6.2 Analysis Based on the EMD Approach
3.6.3 Analysis Based on the DWPT Approach
3.6.4 Analysis using the Difference Histogram Approaches
3.7 Cluster Analysis Results
3.7.1 Cluster Analysis using the EMD Approach
3.7.2 Cluster Analysis using the DWPT Approach
3.7.3 Cluster Analysis using the Difference Histogram Approaches
4 A Two-Level Hybrid Channel Allocation Approach for Mobile Cellular Networks
4.2 Background of the Problem
4.3 Solving Combinatorial Optimisation Problems
4.4 The Mathematical Programming Approach for Solving Linear Problems
4.4.1 Representation of a Linear Programming Model
4.4.2 The Capacitated Facility Location Problem
4.4.3 Facility Location Problem with Incremental Costs
4.5 Solving Mixed-Integer Linear Problems
4.5.1 The Branch-and-Bound Approach
4.5.2 Relaxation and Duality
4.5.3 Lagrangian Relaxation in Capacitated Facility Location Problems
4.6 Formulation of the Two-Level Hybrid Channel Allocation Approach
4.7 Simulation and Performance Evaluation
4.7.1 First Scenario: Fixed Channel Cost per Frequency
4.7.2 Second Scenario: Increased Cost per Frequency Allocated
4.8 Summary of Results
5 Conclusions and Recommendations
5.2 Stated Objectives and Achievements of Study
5.3 Benefits of Study
5.4 Contributions of Work
5.5 Recommendations for Future Study
5.6 Final Conclusions