Tuesday, November 13, 2012

Using The Gaussian Kernel Density Estimator (GKDE) to assess Risk.

B. Hathi, 18 April 2011

In several walks of life, we come across the common problem of having to assess risk given a certain amount of data. The word ‘risk’ can be used in many contexts and has different meanings: for example, we may refer to risk in qualitative terms as being ‘high’ or ‘low’, or we may assign a quantitative measure such as some function of the standard deviation (or variance) about an ‘Expectation Value’. In this article, by ‘risk’ I mean the probability of an unfavourable outcome.


To calculate a probability, we start off with a data set, plot a normalised histogram, fit some distribution function (the normal distribution is a good starting choice amongst the many available) and finally evaluate the fitted Probability Density Function (PDF) or its integral, the Cumulative Distribution Function (CDF). However, we may find that none of the standard distributions fit our data set well enough to inspire confidence. Take the Gbp/Usd data set for instance: Figures 2 & 3 of my previous article show that, despite best attempts, none of the fits describe the data set well enough.

Now there are two routes available to tackle the problem. One is to break the problem down into many components: in this case, we would assess the Pound’s valuation against [US and UK] inflation, GDP, trade balances etc., try to find relationships between them (for example via a covariance matrix), fit univariate distributions with respect to each parameter, and combine them using a method such as a Copula (which basically relates a joint distribution to its individual marginal distributions through a dependence structure, such as a correlation matrix in the Gaussian case). The second approach is not to try to untangle the mess, but to take the distribution at face value and either fit a bimodal (or even multimodal) mixture of normal distributions, or apply the GKDE.

Both methods, the Copula and the GKDE, have their merits. One difference I can see is that the GKDE is a great way to quickly assess a probability, while it is less convenient if you want to calculate the Expectation Value under a continuous-distribution assumption, since the fitted curve is a sum of many Gaussian kernels rather than a single parametric equation, which makes integrals over it awkward to work with by hand. I am sure some readers can add more details.
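As a rough illustration of the first steps described above (fit a normal distribution, then read a probability off the CDF), here is a minimal sketch. The data are synthetic and merely stand in for the Gbp/Usd series, and the names `rates` and `threshold` are my own assumptions for the example, not part of the original software.

```python
import numpy as np
from scipy import stats

# Synthetic two-regime sample standing in for the real exchange-rate data (assumption).
rng = np.random.default_rng(0)
rates = np.concatenate([rng.normal(1.6, 0.10, 600), rng.normal(2.3, 0.15, 200)])

# Fit a single normal distribution by maximum likelihood.
mu, sigma = stats.norm.fit(rates)

# The CDF is the integral of the PDF, so it gives P(rate <= threshold) directly.
threshold = 1.5  # hypothetical level regarded as an unfavourable outcome
p_below_normal = stats.norm.cdf(threshold, loc=mu, scale=sigma)

# Fraction of observations at or below the threshold, as a sanity check on the fit.
p_below_empirical = np.mean(rates <= threshold)

print(f"normal fit: mu={mu:.3f}, sigma={sigma:.3f}")
print(f"P(rate <= {threshold}): normal fit {p_below_normal:.3f}, empirical {p_below_empirical:.3f}")
```

When the data are bimodal, as in this synthetic sample, the two printed probabilities can differ noticeably, which is the kind of poor fit that motivates the alternatives discussed here.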

Figure 1 below shows the GKDE fit to the same data set as in my previous article, the Gbp/Usd. The red line shows the result of the GKDE fit and, using a box integrator, we can easily compute the area under the red curve.

Figure 1. The yellow histogram bars show Gbp/Usd valuations since 1971, while the overarching red line is the GKDE fit. Since the data are normalised, the area under the curve equals 1. Probabilities can be found by using the box integrator function to take the area to the left or right of the current price.
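The fit and the box integration described for Figure 1 can be sketched with SciPy's `gaussian_kde` class and its `integrate_box_1d` method. This is only a sketch under stated assumptions: the sample is synthetic, the default bandwidth is used, and `current` is a hypothetical current price, so it is not a reproduction of the actual figure.

```python
import numpy as np
from scipy import stats

# Synthetic two-regime sample standing in for the real exchange-rate data (assumption).
rng = np.random.default_rng(0)
rates = np.concatenate([rng.normal(1.6, 0.10, 600), rng.normal(2.3, 0.15, 200)])

# Gaussian kernel density estimate; the bandwidth defaults to Scott's rule.
kde = stats.gaussian_kde(rates)

current = 1.55  # hypothetical current price

# Area under the KDE curve to the left and to the right of the current price.
p_left = kde.integrate_box_1d(-np.inf, current)
p_right = kde.integrate_box_1d(current, np.inf)

# The whole curve should integrate to 1.
total = kde.integrate_box_1d(-np.inf, np.inf)

print(f"P(rate < current) = {p_left:.3f}")
print(f"P(rate > current) = {p_right:.3f}")
print(f"total area        = {total:.3f}")
```

To draw the red curve itself, the estimator can be evaluated on a grid, for example `kde(np.linspace(rates.min(), rates.max(), 200))`.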

I developed this software in Python, using the stats module of SciPy, which contains the kde module. Additionally, I have introduced functionality to calculate the area under the curve over a window that stretches equally in both directions about a current value (based on some multiple of the standard deviation) and returns the probability on each side. Clearly, since the full GKDE curve integrates to 1, partial computations of probability, such as over current value +/- 1 standard deviation, will not add up to 1 and we need to normalise them.
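One possible reading of the windowed calculation is sketched below: integrate the KDE from the current value to k standard deviations on each side, then divide by the window total so that the two sides add up to 1. The function name `window_probabilities`, the synthetic data and the choice of normalisation are my assumptions for illustration; the original software may do this differently.

```python
import numpy as np
from scipy import stats


def window_probabilities(kde, current, width):
    """Return (raw_left, raw_right, norm_left, norm_right) for current +/- width."""
    raw_left = kde.integrate_box_1d(current - width, current)
    raw_right = kde.integrate_box_1d(current, current + width)
    window = raw_left + raw_right  # less than 1, since the window is only part of the curve
    return raw_left, raw_right, raw_left / window, raw_right / window


# Synthetic two-regime sample standing in for the real exchange-rate data (assumption).
rng = np.random.default_rng(0)
rates = np.concatenate([rng.normal(1.6, 0.10, 600), rng.normal(2.3, 0.15, 200)])

kde = stats.gaussian_kde(rates)
current = 1.55  # hypothetical current price
k = 1.0         # window half-width as a multiple of the sample standard deviation

raw_left, raw_right, norm_left, norm_right = window_probabilities(
    kde, current, k * rates.std()
)

print(f"raw:        left={raw_left:.3f}, right={raw_right:.3f}")
print(f"normalised: left={norm_left:.3f}, right={norm_right:.3f}")
```

The normalised values are conditional probabilities: given that the price stays inside the window, they describe how likely it is to lie below or above the current value.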
Source | The Currency Forecasting Blog (CFB) | http://currency-forecasting.blogspot.com/2011/04/using-gaussian-kernel-density-estimator.html
