# Discrete Probability Distribution


## Theory

The purpose of this chapter is to explain the underlying theory behind the methodology in this paper.

### Qualitative or Quantitative Approach

Any research approach can broadly be classified as either qualitative or quantitative. The qualitative approach has the advantage of allowing a more in-depth analysis (Gill, Stewart, Treasure and Chadwick, 2008), which often generates soft-value results. This paper applies a quantitative approach, which covers a larger number of participants and typically applies statistical techniques. A drawback of the qualitative approach to risk analysis is that it requires a lot of “guesswork”, which makes the estimates less reliable (Bakker, 2004); the quantitative approach avoids this problem. However, the quantitative approach requires a larger amount of statistical data, and such data is scarce for certain types of operational risk, such as IT risk.

### Regulatory Framework

When dealing with risk one typically deals with estimated costs and probabilities. All probabilities are conditioned on the background information (and knowledge) that we have at the time we quantify our uncertainty (Aven, 2003). Many operational losses happen frequently and do not result in major damage. These include everything from small data entry mistakes to minor system failures. However, banks (as well as other financial institutions) can also suffer from operational risk events that cause major losses, which are of great concern to a risk manager. It is therefore paramount for banks to protect themselves from losses due to operational risk and to show the range and magnitude of this risk (Keller and Bayraksan, 2011), and this includes IT risk. Since Basel II was finalized in June 2006, banks have been required to calculate the capital needed to cover losses due to operational risk. The Basel II accord allows three ways of calculating operational risk: the Basic Indicator Approach, the Standardized Approach, and the Advanced Measurement Approach. The Advanced Measurement Approach, henceforth denoted AMA, allows banks to develop their own model for estimating their operational risk exposure. AMA models are usually more complex than the Basic Indicator or Standardized approaches, but they typically yield better estimates of risk (Keller and Bayraksan, 2011). A bank must first have its in-house model approved by the respective authority. The Loss Distribution Approach is one of the most commonly used models under the AMA according to multiple studies, including Keller and Bayraksan (2011), Frachot, Georges and Roncalli (2001), and Shevchenko's book Modelling Operational Risk Using Bayesian Inference (Shevchenko, 2011). The Loss Distribution Approach, or LDA, is the model this paper will use when quantifying the IT risk for the banking industry.
The Bank for International Settlements (BIS) is actively working to withdraw the option of using the Advanced Measurement Approach to calculate a bank's capital requirement for operational risk (BIS, 2016). However, this paper uses the LDA to calculate an industry's exposure to a certain type of operational risk, not to suggest a capital requirement for any specific actor in this industry. The LDA method therefore remains relevant and should still generate a good estimate of the risk.

### Loss Distribution Approach Model

The LDA model needs statistical data on a risk in the form of the yearly frequency with which an event occurs and the monetary value of the losses (severity) given that an event occurs. These two are assumed to be independent of each other and are modeled separately (Svensson, 2015). A suitable distribution is fitted to the yearly frequency and another to the loss severity, and these are in turn used as inputs to calculate the aggregate loss distribution. The aggregate loss distribution is commonly obtained with a Monte Carlo simulation. This paper uses the LDA method accordingly: the yearly frequency and the losses given an incident are measured and fitted to appropriate distributions, and a Monte Carlo simulation then uses these distributions as input to generate an aggregate loss distribution for this risk at the industry level.
The LDA is used to model IT risk in this paper. It was chosen because it is a quantitative approach, one that would have been hard for an individual bank to apply to this risk because of the previously mentioned problem of scarce data (see section 1.5.2: Scarce Data). The reason this paper uses the LDA rather than another quantitative method is that it is one of the most popular methods under the AMA (Shevchenko, 2011). The AMA allows a bank to build its own in-house model for quantifying its operational risk exposure, and since the LDA is one of the most widely used methods among banks that create their own models, it is probably the best-suited model for quantifying this IT risk.
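
The aggregation described in this section can be written compactly as a compound sum. The notation below ($N$, $X_i$, $S$) is chosen here for illustration and is not taken from the cited sources; it assumes, as stated above, that frequency and severity are independent and that the individual losses are identically distributed.

$$
S = \sum_{i=1}^{N} X_i, \qquad \mathbb{E}[S] = \mathbb{E}[N]\,\mathbb{E}[X]
$$

Here $N$ is the number of incidents in a year (the frequency), $X_i$ is the loss from the $i$-th incident (the severity), and $S$ is the aggregate yearly loss. The expectation has a simple closed form, but the full distribution of $S$ generally does not, which is why it is usually approximated by simulation.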

#### Simulation Method and Distribution

This paper uses historical incident data from the banking industry to find and fit appropriate frequency and severity distributions. These distributions are used as input to a Monte Carlo simulation in order to estimate the aggregate loss distribution of the IT risk. The frequency distribution corresponds to the number of incidents that occur in a given year and therefore follows a discrete distribution. The severity distribution corresponds to the size of the losses experienced by the industry given that an incident occurs and therefore follows a continuous distribution. The distributions are fitted to historical data using the risk analysis software @Risk from Palisade. How well the distributions fit the data is determined by the Akaike information criterion (AIC) and the Bayesian information criterion (BIC).
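
As a rough illustration of what fitting a frequency and a severity distribution involves, the sketch below computes maximum likelihood estimates by hand. It is not the procedure used by @Risk. The choice of a Poisson frequency and a lognormal severity, the function names, and the data values are all assumptions made for this example.

```python
import math
import statistics


def fit_poisson(yearly_counts):
    """Maximum likelihood estimate of the Poisson rate: the sample mean."""
    return statistics.fmean(yearly_counts)


def fit_lognormal(losses):
    """Maximum likelihood estimates (mu, sigma) of a lognormal distribution.

    mu is the mean of the log-losses and sigma their standard deviation
    with divisor n (the MLE), not the n - 1 sample version.
    """
    logs = [math.log(x) for x in losses]
    return statistics.fmean(logs), statistics.pstdev(logs)


# Hypothetical data: incidents per year and loss per incident (in millions).
yearly_counts = [3, 5, 2, 4, 6]
losses = [1.2, 0.4, 3.5, 0.9, 2.1, 7.8]

lam = fit_poisson(yearly_counts)
mu, sigma = fit_lognormal(losses)
print(f"Poisson rate: {lam:.2f}, lognormal mu: {mu:.3f}, sigma: {sigma:.3f}")
```

In practice several candidate distributions would be fitted this way and then ranked by an information criterion.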

##### Akaike Information Criterion (AIC)

AIC is a measure of the relative quality of a statistical distribution for a given set of data, and it is something the risk analysis software helps determine. The measure is based on information theory and indicates how much information is lost from the data if the given distribution is assumed, relative to the other candidate models. The best model is therefore the one that minimizes the AIC score (Liddle, 2008). AIC is calculated according to the following formula:

$$\mathrm{AIC} = -2 \ln \mathcal{L}_{\max} + 2k$$

where $\mathcal{L}_{\max}$ is the maximum likelihood achievable by the model and $k$ is the number of parameters in the model (Liddle, 2008).
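
A minimal sketch of how an AIC ranking could be computed, using AIC = -2 ln(L_max) + 2k. The two candidate severity distributions (exponential and lognormal), the helper names, and the loss values are assumptions for illustration only; the paper itself relies on the risk analysis software for this step.

```python
import math


def aic(max_log_likelihood, k):
    """AIC = -2 ln(L_max) + 2k, with k the number of fitted parameters."""
    return -2.0 * max_log_likelihood + 2.0 * k


def exponential_max_loglik(xs):
    """Log-likelihood of an exponential fit at its MLE (rate = 1 / mean)."""
    n = len(xs)
    mean = sum(xs) / n
    return -n * (math.log(mean) + 1.0)


def lognormal_max_loglik(xs):
    """Log-likelihood of a lognormal fit at its MLE (mu, sigma^2 with divisor n)."""
    n = len(xs)
    logs = [math.log(x) for x in xs]
    mu = sum(logs) / n
    var = sum((v - mu) ** 2 for v in logs) / n
    return -sum(logs) - 0.5 * n * math.log(2.0 * math.pi * var) - 0.5 * n


# Hypothetical losses per incident (in millions).
losses = [1.2, 0.4, 3.5, 0.9, 2.1, 7.8, 0.6, 1.5]

scores = {
    "exponential": aic(exponential_max_loglik(losses), k=1),
    "lognormal": aic(lognormal_max_loglik(losses), k=2),
}
best = min(scores, key=scores.get)  # the lowest AIC is preferred
print(scores, "->", best)
```

Only the differences between scores matter; the absolute AIC value of a single model carries no meaning on its own.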

##### Bayesian Information Criterion (BIC)

The Bayesian information criterion, or BIC, was introduced by Schwarz and assumes that the data points are independent and identically distributed (Liddle, 2008). BIC works in the same way as AIC: it ranks the fitted distributions by a BIC score, where the lowest value indicates the best-fitting distribution. According to the website standfordphd.com, BIC has a stronger preference than AIC for simpler models with fewer parameters (Standfordphd.com, n.d.). BIC is calculated according to the following formula:

$$\mathrm{BIC} = -2 \ln \mathcal{L}_{\max} + k \ln N$$

where $\mathcal{L}_{\max}$ is the maximum likelihood achievable by the model, $k$ is the number of parameters in the model, and $N$ is the number of data points used in the fit.
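
The sketch below uses the standard form BIC = -2 ln(L_max) + k ln(N), where N is the number of data points, and contrasts it with AIC = -2 ln(L_max) + 2k. The two candidate models and their log-likelihood values are hypothetical numbers chosen to show a case where the criteria disagree.

```python
import math


def aic(max_log_likelihood, k):
    """AIC = -2 ln(L_max) + 2k."""
    return -2.0 * max_log_likelihood + 2.0 * k


def bic(max_log_likelihood, k, n):
    """BIC = -2 ln(L_max) + k ln(n), with n the number of data points."""
    return -2.0 * max_log_likelihood + k * math.log(n)


# Hypothetical fits: (maximized log-likelihood, number of parameters).
candidates = {
    "one-parameter": (-52.0, 1),
    "three-parameter": (-49.5, 3),
}
n_points = 40  # hypothetical sample size

aic_scores = {name: aic(ll, k) for name, (ll, k) in candidates.items()}
bic_scores = {name: bic(ll, k, n_points) for name, (ll, k) in candidates.items()}

best_by_aic = min(aic_scores, key=aic_scores.get)
best_by_bic = min(bic_scores, key=bic_scores.get)
print("AIC prefers:", best_by_aic)
print("BIC prefers:", best_by_bic)
```

The per-parameter penalty is ln(N) for BIC and 2 for AIC, so BIC penalizes extra parameters more heavily as soon as the sample has at least eight observations (ln 8 ≈ 2.08).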

##### Monte Carlo Simulation

The distributions found to be a good fit for the historical losses are later used in a model. One distribution is used for modelling frequency, while one or two distributions are used for modelling severity. These two or three distributions are used as inputs to Monte Carlo simulations. A Monte Carlo simulation is an open-form solution that can be carried out in multiple ways but that involves solving analytical formulas using a large quantity of randomly generated numbers (Navarrete, 2006).
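
To make the simulation step concrete, here is a minimal sketch of a Monte Carlo estimate of an aggregate yearly loss distribution. The Poisson frequency, the lognormal severity, the parameter values, and all function names are assumptions for illustration and do not reproduce the paper's fitted model.

```python
import math
import random


def sample_poisson(rng, lam):
    """Draw a Poisson variate with Knuth's multiplication method (fine for small lam)."""
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def simulate_aggregate_losses(lam, mu, sigma, n_years, seed=0):
    """Simulate n_years of total losses and return them sorted ascending."""
    rng = random.Random(seed)
    totals = []
    for _ in range(n_years):
        n_events = sample_poisson(rng, lam)  # frequency draw
        # Sum one severity draw per simulated incident.
        totals.append(sum(rng.lognormvariate(mu, sigma) for _ in range(n_events)))
    return sorted(totals)


def quantile(sorted_values, q):
    """Simple empirical quantile by index lookup on sorted data."""
    index = min(len(sorted_values) - 1, int(q * len(sorted_values)))
    return sorted_values[index]


# Hypothetical parameters: 4 incidents per year on average, lognormal(0, 1) losses.
totals = simulate_aggregate_losses(lam=4.0, mu=0.0, sigma=1.0, n_years=20000)
mean_loss = sum(totals) / len(totals)
q999 = quantile(totals, 0.999)
print(f"mean yearly loss: {mean_loss:.2f}, 99.9% quantile: {q999:.2f}")
```

Each simulated year draws a number of incidents and then one loss per incident, so the sorted totals approximate the aggregate loss distribution, from which a mean or a high quantile can be read off.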

### Probability Distribution

Probability distributions are defined by a probability function that assigns probabilities to the possible values of a random variable (Jones, 2017). Hence, a probability distribution lists the possible outcomes of a random variable together with their corresponding probabilities.
In the most general terms, a probability distribution is either a discrete probability distribution or a continuous probability distribution. Which of the two applies is determined by the values that the random variable can assume, and this distinction is a central subject of probability theory (Andale, 2017). If a random variable can only assume a finite or countably infinite number of distinct values, such as a count of incidents, its distribution is discrete; if it can assume any value in an interval, such as a monetary loss, its distribution is continuous. There are, however, other ways of categorizing distributions, and one common way is by looking at their parameters. Many distributions are not a single distribution but a family of distributions (Handbook Engineering Statistics, 2017), depending on whether the distribution has one or more shape parameters. A shape parameter allows a distribution to take on a variety of shapes, depending on the value of this parameter (Handbook Engineering Statistics, 2017). A family of distributions consists of distributions that share some properties or characteristics. Some common families of distributions are used when describing the distributions in this paper. The exponential family is one of the most common families and includes many of the commonly used distributions; many of the distributions used in this paper belong to it. Clark and Thayer (2004) introduce the exponential family in their paper and explain why it is suitable for aggregate loss models. However, some distributions included in this paper do not belong to the exponential family and belong to other families instead. An example of a less common family of distributions is the Pearson family, which is characterized by two quantities usually referred to as β1 and β2 (Lahcene, 2013).
This paper seeks to model IT risk using certain distributions to explain the data. This was done with analytical software in which many different distributions were included. However, only a few distributions were suggested and later implemented in the models. The theoretical background of the distributions that were included in the models of this paper is explained later in this chapter under the subheadings “Discrete Probability Distribution” and “Continuous Probability Distribution”.
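
As a small illustration of the discrete and continuous cases discussed in this section, the sketch below evaluates a Poisson probability function (discrete) and an exponential distribution function (continuous). Both distributions and all parameter values are chosen here only as examples and are not claimed to be the ones fitted in the paper.

```python
import math


def poisson_pmf(k, lam):
    """P(N = k) for a Poisson random variable: probability of an exact count."""
    return math.exp(-lam) * lam**k / math.factorial(k)


def exponential_cdf(x, rate):
    """P(X <= x) for an exponential random variable."""
    return 1.0 - math.exp(-rate * x) if x > 0 else 0.0


lam = 4.0  # hypothetical mean number of incidents per year

# Discrete case: individual values carry probability, and the
# probabilities over the possible counts add up to (almost exactly) one.
total_mass = sum(poisson_pmf(k, lam) for k in range(50))

# Continuous case: probability is assigned to intervals, here 1 < X <= 2.
p_between = exponential_cdf(2.0, rate=0.5) - exponential_cdf(1.0, rate=0.5)

print(f"sum of Poisson probabilities for k < 50: {total_mass:.12f}")
print(f"P(1 < X <= 2) for exponential(rate=0.5): {p_between:.4f}")
```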

1. Introduction
1.1 Background
1.2 Definition
1.3 Problem Statement
1.4 Purpose
1.5 Problem Description and Relevance
2. Literature Review
2.1 Most Relevant Literature
2.2 Other Relevant Literature
3. Theory
3.1 Qualitative or Quantitative Approach
3.2 Regulatory Framework
3.3 Loss Distribution Approach Model
3.4 Simulation Method and Distribution
3.5 Probability Distribution
3.6 Discrete Probability Distribution
3.7 Continuous Probability Distribution
4. Data
4.1 External Data
4.2 Characteristics of the Data
5. Method
5.1 Fitting the Model
5.2 Aggregated Loss Distribution
6. Result
6.1 Method
6.2 Method Two
7. Analysis
7.1 Method One
7.2 Method Two
7.3 Discussion
8. Conclusion
9. References