Programming in R: Modelling Investment Portfolios with Matrix Algebra

Nasdaq

Investment portfolios are a collection of investments.  These investments can be anything including real estate, merchandise inventory, or a collection of businesses in a multinational corporation. However, the term is most commonly used to describe an investment in stocks and bonds in financial markets. Whatever the context may be, a portfolio is a collection of assets purchased at a certain price, held for a certain time, may provide income/cost during the holding period, and then are sold for profit/loss.

Matrix algebra is a branch of mathematics that is often used to model investment portfolios.  The goal of this post is to introduce the used of matrix algebra via the programming language R to solve commonly asked questions about investment portfolios in stocks.   The expected return and the riskiness of the portfolio will be analyzed both analytically and computationally.

1. Vectors and Matrix Definition

The following example is for 3 assets but could easily be extended to a many asset model representation of the portfolio problem. The following notation is used to represent the asset returns, their joint normal distributions, expected returns, variance of returns, and the covariance of returns.

ScreenHunter_04 Dec. 28 11.34R represents the asset return for investments A, B, C. The returns are distributed as a multivariate normal, mu subscript i is the expected return for asset i, sigma squared subscript i is the variance or returns for asset i, and sigma subscript ij is the covariance of returns between asset i and j. The share of wealth invested in assets A, B, and C (not pictured) is represented by x subscript i. The following notation uses the definitions above to construct vectors and the variance-covariance matrix used to further define the model for portfolio returns.

ScreenHunter_04 Dec. 28 11.45

2. Portfolio Returns and Expected Returns

A portfolio’s returns is the weighted average of the individual returns.  The weights are the share of wealth invested each asset which is then multiplied by the return on that asset. One can represent the returns and the expected returns with vectors in the following way:

ScreenHunter_04 Dec. 28 11.55R subscript p,x represents the returns to portfolio p given a certain allocation of wealth x among the assets in portfolio p. Mu represents the expected portfolio returns given the same assets and allocation of wealth.

3. Portfolio risk or variance

The variance of a portfolio can also be written in vector/matrix notation. Recall that when multiplying a vector with itself one must transpose the second multiple. The matrix calculation is shown below with vectors and matrices (bold) and then the familiar variance formula is written out in non-matrix form.

ScreenHunter_04 Dec. 28 12.11

2. Modelling in R

The next step is to programmatically represent the model in R using vectors and matrices.  The following code creates a vector that contains 3 sets of assets with returns of 1%, 4%, and 2%.

ScreenHunter_04 Dec. 28 13.06

Next the variance covariance matrix sigma and the share of wealth invested in assets needs to be programmed into the system.

ScreenHunter_05 Dec. 28 13.12

The final step is to calculate the expected portfolio returns and the variance of the portfolio us cross products, matrix transposes, and matrix multiplication.

ScreenHunter_05 Dec. 28 13.18

These calculations suggested that the expected return on this portfolio is 2.3% with a variance of .0048.  Using the code above one can experiment on how different assets and asset allocation effect the risk and rewards of the portfolio.  In later post we will see how to obtain minimum variance portfolios by allocating shares in a way that reduces risk while maximizing profit, as well as other optimization techniques.

Evaluating Economic Policy While Considering Worker’s Price Expectations: Another Application of Matrix Algebra

This posts develops a model of the macroeconomy in matrix algebra form.  The model consist of  markets for goods, money, and labor in addition to a general economic production function.  In accordance with the adaptive expectations hypothesis, asymmetrical information about the general price level in the economy will be available to producers and consumers.   More specifically, the model derived here assumes that workers supply their labor by considering the expected future price level, while producers decide how many workers to hire based on correct price expectations.

THE MACROECONOMIC MODEL

The Goods Market

The goods market will consist of demand and supply of output.  The demand for output will be denoted by consumption (C), investment (I), and government spending (G).  The supply of output is represented by the letter Y and will be determined by a production function.  The nature of this production function will be described in the production function section of the model.

Consumption is a function of output (aka income), lump sum taxes, and output based taxes.  Investments are a function of the real interest rate and government spending in completely exogenous, or determined outside of the model.

In order to understand the comparative statistics of the goods market, and the entire model for that matter, the total differential of every equation will be derived.  The total differential of the goods market is:

After taking the first differential and moving the endogenous and exogenous variables to the left and right hand side respectively.  The third equation represents the total differential of the good market equation.  The total differential of the production function and the other markets in this model will be derived in a similar way.

The Money Market

The money market equilibrium consist of the intersection between supply and demand for money.  The supply of money is completely exogenous to the model and will be considered in real terms be accounting for the general price level in the economy. The demand for money is a function of output (aka income) and real interest rates.

Taking the total differential and separating the endogenous from exogenous variables yields the equation that will be part of the matrix system.

The Production Function

The production of all goods in the economy will be a function of labor and capital.  Labor and capital are both homogeneous inputs and their derivatives with respect to the output function are both positive.  The total differential of the production function is

The Labor Market

The labor market consist of a supply and demand function of labor.  The supply of labor is given by the workers in the labor market and firms demand their services.  The interactions of both supply and demand yield the equilibrium quantity of labor and the market wage.

Workers supply their labor based on a comparison between the nominal wage and the expected price level.  It is assumed that producers have more information about market conditions so their labor demand decisions are based on the current price level. Workers expectations of future prices depend on current prices.  Their expectations and the total differential of their expectations is as follows…

This result will be an important substitute into the general labor market equilibrium and total differential

Using the result about the total differential about workers price expectations we arrive the following equation after combining like terms…

THE MACROECONOMIC MODEL IN MATRIX FORM


Once the model is in matrix notation it is fairly straight forward to use Cramer’s to solve for different the system.  The Jacobian of the endogenous matrix, that containing the output, real interest rate, price level, and labor market equilibrium coefficients is less than zero. This can be verified by taking the determinant of the first matrix above. The Jacobian will be denoted by |J| from this point forward.

COMPARATIVE STATICS AND POLICY EVALUATION

Expansion in Government Expenditure and Output

An increase in government expenditure can increase the total output in the economy if labor demand does not equal labor supply and workers have incorrect price expectations.  If either one of these conditions aren’t meet than an expansion of government expenditure will not have an impact of total output.



Increase in the Money Supply and Interest Rates



The only ambiguous term in this equation is the 3rd term.  Assuming that the demand of labor equals the supply ( f = g) and that the ratio of the derivative of the expected price level and the actual price level is between zero and one we get that an increase in the money supply decrease the real interest rates. If price expectations are correct then monetary policy will be ineffective else increase in the money supply decrease interest rates.


Clustering to Improve Merchandise Allocation, Testing, and Forecasting: An Application of the K-Medians Algorithm

BACKGROUND

Localization of merchandise assortment has become popular among retailers in recent years.  The recognition that store population’s heterogeneity is an important consideration in the testing, allocating and pricing of merchandise  has boosted the profits when the theory is correctly put to practice.  The issues concerning localization revolve around what stores should be grouped together?  What variables and methodology should be used to group stores to ensure the minimization of differences for stores within a group for all groups created?

Dr. Fisher and Dr. Rajaram from the University of Pennsylvania and UCLA found that using a k-medians clustering methodology based on sales, minimized testing and stock out costs and improved the accuracy of chain-wide forecasts based on small store sample merchandise testing.   The paper they wrote appeared in the Journal of Marketing Science Vol. 19, No. 3, in the Summer of 2000 and its titled, “Accurate Retail Testing of Fashion Merchandise:  Methodology and Application”. In their research they found that when it came to merchandise testing, clustering based strictly on sales was superior to clustering based on sales, location, average temperature, and ethnicity of of the neighborhood in the stores general area.  Less variables was more when it came to clustering for merchandise testing and forecasting.   Although, they did find that from all of these descriptor variables, climate proved to be an important factor in assessing merchandise testing, but not significant enough to beat out clustering based strictly on sales.  This analysis was conducted on a national retail chain that specialized in women’s specialty apparel and two other national shoe retailers with over 100o stores; one of the shoe retailers was Nine West.

The objective of this post is to use k-medians clustering to segment stores into volume groups.  Volume groups that divide store populations based on sales or projected sales data to drive the allocation of merchandise.  The type of k-medians clustering used in this post will be slightly different that that used by Fisher in Rajarma.  The objective of their clustering was to find stores to conduct tests, so they clustered stores in fairly even groups much like the picture above.  Clustering used for allocation should not have an equal number of stores, but instead create clusters that identify the differences and similarities among stores with an emphasis on the top performing stores.    If there is a high performing store that is a large outlier then it should be in a group by itself, luckily STATA has an option for k-medians clustering that does this kind of grouping after doing some simple sorting.

MATHEMATICAL DESCRIPTION OF ALGORITHM

The objective of k-median clustering is to partition data into k-clusters which are less than or equal to the n observations to minimize the within cluster sum of squares for every k cluster created.  In the retail example, there are n-observations of d-dimensional vectors would be analogous to saying there are d-store descriptors (climate, sales, ethnicity in location…) for a fleet of n (or 1000) stores.

Objective

The Greek letter “mu” above represents the median of each cluster.  The internal sum represent the sum of squares of the difference between observation x in cluster s and the median of cluster s.  The outer sum indicates that the sum for each cluster  from i to k is totaled to get a single number which the is be minimized.

Algorithm

1. Assignment Step

The initial step is to assign the initial k-medians to the data.  First the sales data are ranked from highest to lowest then the initial k-medians are assigned to the top k stores.  In the assignment step each observation is grouped according to a cluster based on its distance to the median at time t, this means that in the first step all stores are grouped to the kth median:

2.  Update Median Step

Once an every observation x has been assigned the kth median which minimizes the distance between the  observations and medians in each potential cluster.  The next step is to update and calculate a new median from the newly formed clusters, which mathematically is equivalent to:

The bars around the cluster S indexed with i at time t represent the norm of the vector.  In the case of a one dimensional vector this would be analogous to the sum of the variable used to cluster divided by the number of observations within the cluster.  The algorithm would then calculate the objective function and repeat this process until the minimization problem is solved.  This problem only has a solution if the cluster sum of squares is convergent to a minimum value. In orther words there should be no cycling or saddle points that throw a wrench in the optimization algorithm.

EMPIRICAL EXAMPLE-STATA

Using an ARIMA forecast of a national retailers projected unit sales a k-medians clustering will be devised that minimizes the difference among the stores in the group. This clustering can be used to allocate inventories across the nation to reduce the markdown cost related to overstocking and the missed sales from stockouts.

1) First copy and past store level data (sorted from largest to smallest)  into STATA when go to the following drop down menus Statistics->Multivariate Analysis->Cluster Analysis->Cluster data->k-medians (not shown)

2) Then you will be presented with a screen like the one shown below;  select the sales variable which in this example is named annualized retail 2009.  Next choose “k” which corresponds to the number of clusters that are required, in this case the data will be clustered into 7 groups. Leave the (Dis)similarity measure as Euclidean since that is what the researchers did in the paper “Accurate Retail Testing of Fashion Merchandise:  Methodology and Application”. Hit the Options button…

3) In the options tab, select “First K observations”, since the data is sorted in ascending (descending) order the natural breaks and grouping of stores will occur by selecting this option. Then hit “OK”…

STATA then breaks up the stores into neat clusters based on minimizing the dissimilarity withing groups while adhering to the natural breaks in the data.  The table below summarize the number of stores per cluster the mean, standard deviation, minimum and maximum annualized retail volume of stores within the clusters.

Equilibrium in the Goods and Money Markets: Graphical Approach to the IS/LM Model

The IS/LM model combines the goods and money market equilibrium’s to form an aggregate model that describes a general equilibrium setting in the macroeconomy.  This post will use a graphical approach to establish the intuition behind the building of the IS/LM framework which describes the money and goods markets.  A later post will develop the IS/LM model by using matrix algebra and derivatives to analyze the interactions between the money and the goods markets in the macroeconomy.

GENERATING THE IS CURVE GRAPHICALLY

The IS curve graphically shows the relationship between the interest rate and aggregate output from the demand side of the model.  The derivation of the IS curve can be taken from the dynamics of the aggregate demand function  Y = C + I + G, where output (Y) is equal to consumption (C), (I) represents Income (I) and government expenditures are denoted by (G).  To incorporate the interest rate into this model a slight modification will have to be made to the equation above which entails making consumption and investments a function the interest rates; Y = C(r) + I(r) + G.  The graph below represents current consumption on the Y-axis and current income on the X-axis along with a graphical representation of the equation  Y = C(r) + I(r) + G.

An increase in the interest rate reduces consumption as consumers put more of their disposable income into savings to take advantage of higher returns.  Like consumers, businesses reduce their investment on plant and equipment with as the interest rate increases.  Equilibrium in the graph above is achieved when the current income equals current consumption or Y = C(r) + I(r) + G.  As the graph shows, an increase in interest rates reduces current income/output as both consumers and business reduce consumption and investment.  This relationship can be visualized by plotting the real interest rate and current income, this relationship is the IS curve.

GENERATING THE LM CURVE GRAPHICALLY

The assumption is that the money supply is a fixed quantity in the short-run and is determined by the government.  The demand for money is a function of prices, income/output, and the real interest rate.  The money market is in equilibrium when the money supply equals money demand under the assumption of equilibrium.  Using the equilibrium condition in the money market the LM curve can be derived graphically which shows the relationship between interest rates and output.

 

 

The graph above shows how an increase in income (Y) increases the demand for money.  In order to restore equilibrium to the money market the interest rate increases.  Similar to the IS curve, this relationship can be captured by plotting income (Y) on the X-axis and interest rate on the Y-axis This relationship derived from the money market is called the LM curve and is picture below as an increasing function of the real interest rate.

 

 

COMBINING TO GET THE IS-LM MODEL

The graph above represent the goods and money markets in the economy.  The intersection of the IS and LM curves represents the macroeconomic equilibrium in the goods and money market.  If either the real interest rate or output deviate from this equilibrium market forces will drive the both variables to back to their equilibrium.  In a subsequent post the dynamics and interaction of the money market and the goods market will be examined through more graphical analysis.  Once the intuition of the dynamics of the ISLM model has been established the topic will advance to understanding this model via matrix algebra and differential calculus.

 

 

 

 

 

 

 


Greece’s Financial Tragedy: Comparative Statics Using The Implicit Function Theorem and Cramer’s Rule

At the time of these writings Greece is going through a severe sovereign debt crisis which is severely damaging the value of the Euro in relation to the dollar.  A mathematical model can be created to determine how a reduction in Greek government spending would impact its Gross Domestic Product, domestic interest rates, and the Exchange rate between the dollar that can shed some insight into the workings of the Greek economy.

Equilibrium in the Goods Market:

The equation above represents the national income equation where total output is defined as consumption as a function of income, investment as a function of the real interest rate, exports as a function the exchange rates and imports as a function of the exchange rates and income.  Government spending is exogenous and is pre-determined, but in this case it will be our variable of interest which will represent the policy variable in our system.  We are assuming that the partial derivatives of the national income equation represent the following:

  • Consumption is an increasing function of Income
  • Investment is the economy is inversely related to the Domestic Interest Rate
  • Exports are positively related to an increase in the exchange rate (depreciation of the Greek currency)
  • Imports are an increasing function of income and a decreasing function of the exchange rate (depreciation of Greek currency)

The next equation represents equilibrium in the Money market;

The equation above represents equilibrium in the money market where money demand is a function of income and domestic interest rates.  Money supply is exogenous and is determined by the European Central Bank.  We are assuming that the partial derivatives of the money demand equation follow these  basic properties:

  • Demand for money (L) is an increasing function of income
  • Domestic interest rates and the demand for money are inversely related

The next equation represent the balance of payments which asserts that net exports must be balanced by a net capital outflow and vice versa;

The equation above represents next exports or the capital account and the capital account.  The partial derivatives of the exports and imports are identical to the national income equation and the capital account has the following directional derivatives:

  • Capital will tend to flow into Greece if its’ domestic interest rate increases
  • Increases in the world interest rates will cause an outflow of capital in Greece

Writing the system described by the three equations above in implicit form we get:

In order to ensure that this system will have a solution which exist we need to assume that all partial derivatives are continuous and that the determinant of the Jacobian is non-zero.  A non-zero Jacobian rules out linear independence in the system so that there is a unique solution as demonstrated below:


Taking the differential of the system is the second step to solving this system:

This system can be written in matrix notation in the following form after dividing by the exogenous variable for the differential of government spending:

Finally, using Cramers’ we can solve for the change in Greeks gross domestic product, domestic interest rates, and the exchange rate given an increase in government expenditures:

In conclusion, as the model above shows, and increase (decrease) in Government spending by Greece will cause a temporary increase (decrease)in its gross domestic product.  This (contractionary) expansionary fiscal policy will also (decrease) increase the Greek/U.S. exchange rate (E) which essentially means (appreciating) depreciating their currency.  In Greece this mechanism has had a delayed effect, but the countrys’ reckless fiscal policy is now helping to push down the value of the Euro as speculators try and identify how to fix this problem within the Eurozone.  The fact that the European Union is not as homogeneous as it should be will cause further tough decisions to be made about their policies with countries such as Portugal, Italy, Ireland, Greece and Spain whose fiscal discipline has much to be desired.  There is not clause in the EU contract for separating a member nation and if this were to happen it would be quiet messy for all of the EU so in the mean time the bailouts will be coming for Greece, but clearly fiscal discipline must be instilled if they want to remain a respectable member of the European Union.

Supply and Demand Partial Equilibrium Market Model in Matrix Form

The basic supply and demand model is the workhorse of microeconomic and macroeconomic models.  The model is not without its problems and shortcomings, many of which have been discussed in previous post:  equilibrium assumption, excluding non-linearities, and a host of others, but the predictive power of the model remains one of its greatest properties.

The model consists of a system of linear equations which we are going to set up in its most general form with the equilibrium equation that supply equals demand and two behavioral equations for both the consumers and producers of a generic good.  The law of demand and supply are reflected in the coefficients and matrix algebra will be used to solve the system.

The supply and demand system below; all constants are positive.

The matrix form of the system above can be found by placing all constants on one side and the variables on the other side:

The system above can be solved by taking the inverse of the A matrix using the adjoint which is the transpose of the cofactor matrix.  The solution is given below after the non-zero determinant condition is verified to eliminate linear dependence between equations:

Supply equals demand and the relationships between the variables can be verified to be consistent with what we see happening in markets between consumers and producers.  An increase in price encourages production and discourages consumption holding all other things constant.  Increasing fixed cost for the supply function reduces the quantity supplied and increases the equilibrium price in the market.  There are many other relationships that can be verified as intuitively correct for what we see in terms of market dynamics.  This can be done by taking partial derivatives of the demand, supply, and solution results with respect to the 4 constants in the model: a, b, c, and d.  The following Excel diagram is a pretty good interactive spreadsheet to create that will by a great way of to understand supply and demand dynamics.

Simplex Algorithm and How Dating Web Sites Match Singles

dating

Mathematical Background:

The simplex method was created by George Dantzig (1914-2005)  who was Professor Emeritus of Transportation Sciences and Professor of Operations Research and Computer Science at Stanford University.  George Dantzig created the simplex method as an algorithm for solving a system of linear equations.  The algorithm is used in linear programming to find optimum solutions to these equations.  The equations typically consist of one objective function which you are trying to minimize (i.e. cost) or the dual problem which would consist of an equation your are trying to maximize(i.e. profit). In addititon to the objective function you will have some sort of constraints which will limit your optimum solution.  These constraints could be in the form of storage capacity, capital budgets, time constraints, labor expenses, substitutability of inputs and a host of other factors can play a part in defining the set of constraints.

The original applications to the simplex method were to linear programming.  Linear programming was used during WWII was as a way of minimizing cost to the army and increase losses to the enemy through superior planning and utilization of resources. Examples of the uses of the simplex method and linear programming include the transportation problem where the algorithm minimizes the cost of shipping between n number of warehouses and m number of destinations.  The diet problem is another application where the nutritional needs of an army are taken into account as we try and minimize the combination of foods that will yield the minimal nutritional value with the lowest cost.

Finally, one of the most interesting and recent applicatons of the simplex algorithm has been to the assingment problem in on-line dating services.  The objective of the simplex method in an on-line dating site would be to maximize the matchings of male and females based on the numerical value of potential matches.  In turns out that in this case it is as much an art as it is a science since the value of a match must be determined based on characteristics of the individuals, then and only then can the simplex method be used to maximize the matches.

Example: (Adapted from Dr. Reese, Pepperdine University Graziano Graduate School of Business and Economics 146 at UCLA 2009)

There are 4 single females interested in dating 4 single males. Female i has a value of   ScreenHunter_01 Aug. 14 21.28 for a date with male j.  We make the assumption that the value of the date has a lower bound of zero. An optimal dating equilibrium consist of a pairing of couples such that there is no other allocation of females to males that are feasible.  In other words the best allocation is one where people get the best partner they can and are able to get.

Here is the matrix which represents the value derived from matching a female with a male.   Note:  These values are subjective and they are calculated using compatability questionaires and are in no way meant to be objective.

ScreenHunter_13 Aug. 14 23.44

The matrix represents the value of of the match which can be represented by the difference between the score of a female and the score of a male; based on attractiveness, personality, sense of humor etc. The numbers in the matrix can be represented by the following formula

ScreenHunter_05 Aug. 14 22.10. Females are trying to maximize this formula because it means that they were matched up with a male whose score is relatively large in comparison to their own scores, which means they got a good “catch” .  Males are trying to minimize this formula because that means that they are matched with a person whose relative attractiveness far exceeds their own based on some agreed upon criteria. This is very much like a supply and demand model but with discrete instead of continuous variables and where the “suppliers” are the females and the “consumers” are the males.  In essence what females are trying to do is maximize this formula:

ScreenHunter_07 Aug. 14 22.49

Excel:

This kind of optimization problem can be solved by using the Excel solver in the following steps:

1)  The Excel formlas that correspond to the constraints are given by the following matrix

ScreenHunter_09 Aug. 14 23.00

Assuming that your value matrix started was at   B4:E7  and that your objective matrix was at B10: E13

2) Use the solver with the following settings to get the matches:

ScreenHunter_10 Aug. 14 23.07

ScreenHunter_11 Aug. 14 23.08

The matches should be (Female 1, Male 1), (Female 2, Male 2), (Female 3, Male 4), and (Female 4, Male3)

ScreenHunter_12 Aug. 14 23.09