Current Discussions and Findings
How should acquired wealth be invested? Portfolio theory offers an answer based on two principles:
maximizing the expected return and minimizing the risk of the portfolio.
These two principles are somewhat at odds, because riskier assets have higher expected returns: investors demand a reward for bearing greater risk. The difference between the expected return of a risky asset and the risk-free rate of return is called the risk premium. Without risk premiums, few investors would be willing to invest in risky assets.
Markowitz (1959) Mean-Variance (MV) optimization has been, and probably still is, the standard for efficient portfolio construction for over 50 years. Almost every portfolio program or asset-allocation optimizer is based on some variation of the Markowitz procedure, and in modern finance the Markowitz method is theoretically indispensable. However, MV optimization is sensitive to the uncertainty in the estimates of the expected returns and of the variance-covariance matrix of the returns, which characteristically results in ambiguous portfolio optimality and poor out-of-sample¹ performance. Tests have shown that equal weighting outperforms unbounded MV-optimized portfolios, which essentially have no practical investment value. DeMiguel et al. (2007) tested the performance of 14 models of unbounded MV portfolio choice in the presence of estimation error. They found, as did Jobson and Korkie (1981), that none of the models consistently improves on equal weighting. In practice, this has led to MV optimization being used primarily as a convenient frame of reference for imposing provisional or specific constraints and as a scientific guise for marketing purposes.
The forerunners of a statistical treatment of parameter uncertainty in MV optimization include Frost and Savarino (1986, 1988), Jorion (1986, 1992) and Jobson and Korkie (1980, 1981). Interestingly, most traditional research in this area has focused on utility maximization or unbounded in-sample MV optimization while overlooking the uncertainty of the estimates and out-of-sample performance. One example is the Black and Litterman (1992) technique. While an in-sample analysis may be tractable and convenient, unbounded MV optimization that ignores estimation error often leads to distorted or irrelevant trading and constraint policies, which can adversely affect large sums of money under portfolio management.
¹ The terms in-sample and out-of-sample are typically used for optimization or fitting methods. Running an approach over a particular data set (say, the last three years) is said to be done in-sample: the results are evaluated over the same sample used to fit the approach, which should technically give the best possible result. If the approach is instead evaluated over, say, the following two months, the evaluation is out-of-sample, because it is made over a time period different from the one the approach was optimized on. In other words, in-sample refers to the data available, and out-of-sample to the data one wants to forecast or estimate.
As mentioned earlier, academic research on estimation error has mostly concentrated on unbounded in-sample expected utility maximization. An early reference is Bawa et al. (1979); more recent explorations include Jorion (1986) and DeMiguel et al. (2007). However, portfolio optimizers typically prefer MV efficiency for defining portfolio optimality, as investors feel more at ease estimating asset returns and risks than the parameters of a utility function. Levy and Markowitz (1979) have shown that, in practice, portfolios on the MV efficient frontier are often good approximations of portfolios that maximize expected utility for many utility functions and return-generating processes.
Practical hindrances and its implication/effect
Every model, however well regarded, comes with limitations. Markowitz’s theory does not in itself harm the investment outcome: the model points to the right investments if and only if the correct inputs are known and used. The more serious problem is estimation error, that is, the parameter uncertainty in the required optimization inputs. In practice, risk-return estimates are highly uncertain, and the sensitivity of portfolio optimization to changes in its inputs leads to ambiguity in portfolio optimality.
In practice, the expected returns and the covariance matrix of the returns are unknown and must be estimated from historical data. This introduces three well-known problems:
Financial data is characteristically non-stationary² over extended periods of time, which limits the amount of data that can usefully be used to estimate the mean and covariance of the returns. At the same time, the sample covariance has numerous parameters and so requires large amounts of data to estimate. For a portfolio of n = 1000 stocks, the sample covariance matrix has n(n + 1)/2 = 500,500 distinct entries, so approximately 500,000 parameters need to be estimated. This calls for alternative estimators of the covariance.
The estimated means and covariances have a significant influence on the optimal portfolio weights, which are computed by replacing the mean vector and the covariance matrix of the returns with their estimates. A small change in the estimates may therefore lead to extreme changes in the portfolio weights.
The optimal portfolio tends to amplify estimation errors in certain directions. If the variance of an asset is significantly underestimated, the asset appears to have low risk and the optimal portfolio assigns a significant weight to it. Similarly, a considerable weight is allocated to an asset whose mean return appears large because it has been significantly overestimated. As a result, the risk of the estimated optimal portfolio is typically under-predicted and its return over-predicted.
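The sensitivity described in the second problem can be illustrated with a minimal Python sketch. It assumes two assets, unconstrained MV weights proportional to the inverse covariance matrix times the mean vector, rescaled to sum to one, and purely hypothetical input numbers (20% volatility each, correlation 0.95); none of these values come from the thesis data.

```python
def mv_weights(mu, cov):
    """Fully invested two-asset weights proportional to inverse(cov) @ mu."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    # Explicit inverse of a 2x2 matrix applied to the mean vector.
    raw = [(d * mu[0] - b * mu[1]) / det, (-c * mu[0] + a * mu[1]) / det]
    total = sum(raw)
    return [x / total for x in raw]

# Hypothetical inputs: two highly correlated assets (correlation 0.95),
# each with variance 0.04 (20% volatility).
cov = [[0.04, 0.038], [0.038, 0.04]]

base = mv_weights([0.080, 0.080], cov)    # identical expected returns
bumped = mv_weights([0.085, 0.080], cov)  # asset 1 mean raised by 0.5 points

print("base weights:  ", [round(w, 3) for w in base])
print("bumped weights:", [round(w, 3) for w in bumped])
```

With equal means the portfolio is split evenly; after the small change in one mean, the sketch puts more than 100% in the first asset and shorts the second, which is the kind of extreme shift the text describes.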
² Non-stationary here means that the underlying probability distribution, and with it the mean and variance, varies with time. Stationarity, on the other hand, means that the distribution does not change with time.
Estimating the expected returns and the covariance matrix of stock returns has always been one of the thorniest points. The problem is not new, and a substantial literature discusses it, including:
Stein (1955), who shows that the usual sample statistics are not suitable for multivariate problems.
Michaud (1989) and Barry (1974), among a series of papers that describe the issue in detail.
Chopra and Ziemba (1993), whose empirical tests show that errors in return estimates dominate errors in risk estimates.
Jorion (1992) and Broadie (1993), who use Monte Carlo simulations to estimate the magnitude of the problem.
Notwithstanding this, the standard statistical method for finding these parameters is to gather a history of past stock returns and compute their sample mean and sample covariance matrix. Unfortunately, this process creates problems that have been well covered by Jobson and Korkie (1980). When the number of stocks under study is large relative to the number of historical return observations available, which is usually the case, the sample covariance matrix is estimated with much error. The most extreme coefficients in the estimated matrix then tend to take on extreme values not because this is the truth but because they contain an extreme amount of error. As a rule, any MV optimizer will latch onto these extreme values and place its biggest bets on the coefficients that are the most unreliable. Michaud (1989) has called this phenomenon “error maximization.”
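A small simulation can make the "extreme values come from extreme error" point concrete. The Python sketch below is illustrative only: it assumes 100 simulated assets that all share the same true variance of 1.0 and only 24 observations each (both numbers are hypothetical), and looks at the spread of the estimated variances.

```python
import random
import statistics

random.seed(0)
N_ASSETS, T_OBS = 100, 24  # hypothetical: many assets, short return history

# Every simulated asset has the same true variance of 1.0, so any
# difference between the estimates is pure estimation error.
sample_vars = [
    statistics.variance([random.gauss(0.0, 1.0) for _ in range(T_OBS)])
    for _ in range(N_ASSETS)
]

lowest, highest = min(sample_vars), max(sample_vars)
print(f"true variance 1.00, estimates range from {lowest:.2f} to {highest:.2f}")
```

The asset that looks least risky here is simply the one with the largest negative estimation error, yet it is exactly the asset a variance-minimizing optimizer would favour.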
The key parameters in the Mean-Variance Portfolio (MVP) model are the mean vector and the covariance matrix of the returns of the securities in the portfolio under consideration. The model assumes these parameters are known; in practice, they must be estimated from observed market data, and the estimates are the key inputs of the optimization program. A long history of returns is required to obtain good statistical estimates. However, as stated in Section 1.3, financial data is characteristically non-stationary over long periods of time, so only a limited amount of recent historical data is relevant. A long period here could be 30 years or more, and a shorter span between 10 and 15 years.
The objective is to investigate procedures or estimators that reduce, or ideally correct, the estimation errors and the problems that come with them, in order to provide a better framework for investment decisions. There are several approaches to remedying these issues; this project considers two. The first addresses the statistical instability that arises from poor estimates of the key parameters: it uses Monte Carlo simulation or resampling (bootstrap) together with optimization to find a range of optimal portfolios and so deal with the bias in the weights of the efficient portfolio. The second uses shrinkage estimates of the covariance matrix, which seek to strike a compromise between the instability of the sample covariance estimator and the biases introduced by model-based estimators.
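The two remedies can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the procedure used in the thesis: it uses two simulated assets, the two-asset minimum-variance weight as the "optimization", a scaled-identity shrinkage target chosen for brevity (rather than a single-index target), and a fixed shrinkage intensity of 0.5 instead of an estimated optimal one. All function names and numbers are hypothetical.

```python
import random
import statistics

def sample_cov(x, y):
    """Sample covariance matrix of two return series as a 2x2 nested list."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - 1)
    return [[statistics.variance(x), sxy], [sxy, statistics.variance(y)]]

def min_var_weight(cov):
    """Weight of asset 1 in the fully invested two-asset minimum-variance portfolio."""
    return (cov[1][1] - cov[0][1]) / (cov[0][0] + cov[1][1] - 2 * cov[0][1])

def shrink(cov, target, delta):
    """Convex combination delta * target + (1 - delta) * cov, with 0 <= delta <= 1."""
    return [[delta * t + (1 - delta) * s for s, t in zip(row_s, row_t)]
            for row_s, row_t in zip(cov, target)]

def resampled_weight(x, y, n_boot, rng):
    """Average minimum-variance weight over bootstrap resamples of the return rows."""
    n = len(x)
    weights = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # draw months with replacement
        weights.append(min_var_weight(
            sample_cov([x[i] for i in idx], [y[i] for i in idx])))
    return statistics.fmean(weights)

rng = random.Random(1)
x = [rng.gauss(0.01, 0.05) for _ in range(60)]  # hypothetical monthly returns
y = [rng.gauss(0.01, 0.08) for _ in range(60)]

S = sample_cov(x, y)
avg_var = (S[0][0] + S[1][1]) / 2
target = [[avg_var, 0.0], [0.0, avg_var]]       # assumed scaled-identity target
shrunk = shrink(S, target, 0.5)                 # assumed fixed intensity

w_sample = min_var_weight(S)
w_shrunk = min_var_weight(shrunk)
w_boot = resampled_weight(x, y, 200, rng)
print(round(w_sample, 3), round(w_shrunk, 3), round(w_boot, 3))
```

Resampling averages the optimizer's output over many plausible data sets, while shrinkage stabilizes the optimizer's input; both trade a little bias for lower variance in the resulting weights.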
Reviewing Fundamental Concepts
In this chapter, we review some mathematical and financial concepts needed later.
The aim of investing is to make a profit. The revenue from investing (or the loss, which is negative revenue) depends on the price changes and on the amounts of the assets held. Investors are interested in revenues that are high relative to the size of their initial investments, and returns measure exactly this. Returns depend on the time unit, such as days or months, but not on monetary units such as dollars or euros. Other types of returns are discussed next.
Random Walk Synopsis
Stock prices play a pivotal role in this project, so an understanding of their behaviour is important. One aspect not to be overlooked is their random nature, which brings to mind the random walk hypothesis. The random walk hypothesis states that changes in stock prices are independent of each other and identically distributed, which implies that the past movement or trend of a stock price cannot be used to predict its future movement.
Most work in mathematical finance assumes that stock prices follow a log-normal geometric random walk or geometric Brownian motion. Is this assumption true? The log-normal geometric random walk makes two assumptions:
Log returns are normally distributed, N(μ, σ²).
Log returns are mutually independent.
Since a sum of independent normal random variables is also normal, normality of single-period log returns implies normality of multiple-period log returns.
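The additivity behind this argument can be checked numerically. The Python sketch below simulates a geometric random walk under the two assumptions, with hypothetical monthly parameters, and confirms that the multi-period log return is the sum of the single-period log returns (so that, under the assumptions, a T-period log return is N(Tμ, Tσ²)).

```python
import math
import random

rng = random.Random(42)
MU, SIGMA, P0, T = 0.005, 0.06, 100.0, 12  # hypothetical monthly parameters

# Independent, normally distributed single-period log returns.
log_returns = [rng.gauss(MU, SIGMA) for _ in range(T)]

# Geometric random walk: P_t = P_0 * exp(r_1 + ... + r_t).
prices = [P0]
for r in log_returns:
    prices.append(prices[-1] * math.exp(r))

# The T-period log return equals the sum of the single-period log returns.
multi_period = math.log(prices[-1] / prices[0])
print(math.isclose(multi_period, sum(log_returns), abs_tol=1e-9))
```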
Are the assumptions stated above true? The answer is “no.” Examining the marginal distributions of numerous series of log returns shows that the return density is bell-shaped, somewhat like a normal density; Chapter 3 illustrates this using one of the stocks. However, the tails of the log-return distributions are heavier than those of a normal distribution.
Heavy-tailed distributions play a fundamental role in finance, because stock returns and other changes in market prices usually have heavy tails. That is, extreme values occur more often than a normal distribution would predict. In financial applications, a heavy-tailed return or log-return distribution is cause for concern: it allows for an enormous negative return, which could in an extreme case entirely deplete a company’s capital reserves. Large positive returns are also a concern when short selling.
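One common diagnostic for heavy tails, not used explicitly in the text above, is the sample excess kurtosis, which is zero for a normal distribution and positive when extreme values are over-represented. The Python sketch below is a minimal illustration on made-up monthly log returns; the data and the function name are hypothetical.

```python
import statistics

def excess_kurtosis(xs):
    """Excess kurtosis m4 / m2**2 - 3, using population (1/n) central moments."""
    mean = statistics.fmean(xs)
    m2 = statistics.fmean([(x - mean) ** 2 for x in xs])
    m4 = statistics.fmean([(x - mean) ** 4 for x in xs])
    return m4 / m2 ** 2 - 3.0

# Hypothetical monthly log returns: small moves only, then the same
# series with a single crash month appended.
calm = [0.01, -0.01, 0.02, -0.02, 0.01, -0.01, 0.02, -0.02]
with_crash = calm + [-0.30]

print(round(excess_kurtosis(calm), 2), round(excess_kurtosis(with_crash), 2))
```

A single extreme month is enough to turn the statistic from negative to clearly positive, which is why one crash can dominate risk estimates built on a normality assumption.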
Boxplots, also known as box-and-whisker plots, are a useful graphical means of comparing data and can be displayed vertically or horizontally. They display patterns in quantitative data based on five measures: minimum, maximum, median (Q2), first quartile (Q1) and third quartile (Q3). From these, the range and the interquartile range (IQR), both common measures of spread, can be found. Figure 2.7.1 shows these measures. The IQR is the difference between the third and first quartiles, as seen in Figure 2.7.1. The “whiskers” show the location of the maximum and minimum points. Some datasets display surprisingly high maximums or surprisingly low minimums, called outliers.
Boxplots also provide information about the shape of the data: whether it is skewed to the right, skewed to the left, or symmetric (Figure 2.7.2).
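The quantities behind a boxplot can be computed directly. The Python sketch below uses the standard library's `statistics.quantiles` with the "inclusive" method; the 1.5 × IQR fences used to flag outliers are the usual Tukey convention and are an assumption here, since the text does not state which rule the figures use. The data are made up.

```python
import statistics

def box_summary(data):
    """Five-number summary, IQR and outliers flagged by 1.5 * IQR fences."""
    xs = sorted(data)
    q1, q2, q3 = statistics.quantiles(xs, n=4, method="inclusive")
    iqr = q3 - q1
    lo_fence, hi_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr  # assumed Tukey rule
    return {
        "min": xs[0], "q1": q1, "median": q2, "q3": q3, "max": xs[-1],
        "iqr": iqr,
        "outliers": [x for x in xs if x < lo_fence or x > hi_fence],
    }

# Hypothetical data with one surprisingly high maximum.
summary = box_summary([1, 2, 3, 4, 5, 6, 7, 8, 9, 100])
print(summary)
```

Here the median sits much closer to Q1 and the minimum than to the maximum, the pattern a right-skewed boxplot shows.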
The portfolio under consideration is based on ten (10) industries listed and ranked in the Forbes World’s Biggest Public Companies list for 2015, with a focus on industries in the United States of America (USA). From each industry on the list, the four (4) top-ranked companies are chosen, giving a total of 40 stocks.
The selected industries are Gas and Oil Operations, Discount Stores, Software Programming and Computer Hardware, Industrial Conglomerates, Diversified Insurance, Investment Services, Pharmaceuticals, Telecommunication Services, Major Banks and Household & Personal Care. Table 3.0.1 on page 16 shows the stocks under consideration in this portfolio.
The period under consideration is the 10 years from 31st December 2005 to 31st December 2015. Historical stock prices were obtained from Yahoo! Finance (www.finance.yahoo.com), where they are readily available. The benchmark is the S&P 500 stock market index (ticker ^GSPC), as it focuses primarily on stocks issued and traded on American stock exchanges.
The risk-free rate used is 0.03118 (approximately 3.12%), based on the monthly 10-year US Treasury rate (http://www.multpl.com/10-year-treasury-rate/table/by-month). It was obtained by averaging the monthly risk-free rates over the period under review.
With a 10-year period in focus, each stock has 120 monthly price observations. The historical prices are converted to monthly log returns. The Adjusted Closing Price is used, so dividends have already been accounted for. Each return is computed from an initial investment value and an ending investment value, that is, from two consecutive prices. As a result, one observation is lost, leaving return series of length T = 119.
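The conversion from prices to log returns, and the loss of one observation, can be sketched in Python as follows. The price series is synthetic (a hypothetical stock growing 1% per month), not the Yahoo! Finance data, and the function name is illustrative.

```python
import math

def log_returns(prices):
    """Log returns r_t = log(P_t / P_{t-1}) from consecutive adjusted closing prices."""
    return [math.log(curr / prev) for prev, curr in zip(prices, prices[1:])]

# Hypothetical adjusted closing prices for 120 consecutive months.
prices = [100.0 * 1.01 ** t for t in range(120)]
returns = log_returns(prices)

print(len(prices), len(returns))  # 120 prices give 119 returns
```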
Using Goldman Sachs (GS) as a demonstration, the distributions of both the prices and the log returns, together with a scatter plot, are shown in Figure 3.0.1 on page 17. The scatter plot visualizes how randomly the prices behave, and the histogram of the prices indicates that stock prices are not normally distributed. The distribution of the log returns, by contrast, appears bell-shaped; a normal curve generated with the R programming language is overlaid to support this. As mentioned earlier, the normality assumption therefore remains fairly reasonable in this context.
Table 3.1.1 shows basic descriptive statistics of the monthly log returns for the stocks under consideration.
A boxplot gives a more compact overview of the distribution of the data, as discussed in Section 2.7. It displays patterns in quantitative data such as the median, the quartiles and the interquartile range.
Figure 3.1.1 shows that stocks such as LVLT, VLO and C have relatively wide dispersion (a broad interquartile range), while stocks such as KMB, CL, PG and BRKB have relatively low dispersion (a narrow interquartile range). The outliers to the right indicate that more of the log returns are positive than negative, with most of the data skewed to the right.
Table of contents:
1.2 Current Discussions and Findings
1.3 Practical hindrances and its implication/effect
1.4 Thesis Objective
2 Reviewing Fundamental Concepts
2.2 Net Returns
2.3 Multi-period returns
2.4 Log Returns
2.5 Adjusting for Dividends
2.6 Random Walk Synopsis
3.1 Descriptive Statistics
4 The Language of Portfolio
4.1 Risk – Return
4.2 Feasible Portfolio Opportunities
4.3 The Sharpe Ratio
4.4 The Efficient Frontier
4.5 Generating the Efficient Frontier
4.6 Fundamental Limitations of the MVE
4.7 Concept Demonstration: The Efficient Frontier
4.8 Short Selling Scenario
4.9 No Short Selling Scenario
4.10 Tangency Portfolio
5 Estimation Error In Essence
5.1 Estimation Error
5.2 Finding Estimators To Reduce Estimation Error
5.3 Portfolio Resampling
5.4 Bootstrap Resampling Process – Generating Resampled Efficient Frontier
6 Covariance Estimation
6.1 Estimation by Shrinkage; Ledoit-Wolf Procedure
6.2 Sharpe’s Single-Index Model
6.3 Shrinkage Estimator of the Ledoit-Wolf Covariance Matrix
6.3.1 The Sample Estimator: Sample Covariance Matrix
6.3.2 The Target Estimator: Single-Index Covariance Matrix Estimator
6.3.3 The Optimal Shrinkage Intensity
Fulfilment of Thesis Objective
A Attributable Factors
A.1 Quadratic Utility Approach
A.2 Rank of a Matrix
A.3 Positive Semi-Definite Matrix
B R Programming Language Codes
B.1 Efficient Frontier – Short Selling Allowed
B.2 Efficient Frontier – Short Selling NOT Allowed