|
|
|
Brainstorms
by Stephen W. PhilbrickSimple Tricks Solve Spreadsheet Simulation Problems
Actuaries have used simulation techniques for many years. Despite the development of sophisticated algorithms for more direct calculation of aggregate loss amounts (Panjer, Heckman/Meyers, and others), there are situations where generating claim counts and amounts is still the optimal approach. A simulation approach can be used to gain greater insight into a problem and to provide a superior tool for explaining results to others. Even where direct calculations are possible, a simulation approach can provide a check on the direct calculations.
I recently needed to determine the likelihood that the second largest claim in a year would exceed a per occurrence retention. I was provided with the probability that the largest claim in a year would exceed the retention, but the severity distribution was not available, so I could not directly calculate the probability that an arbitrary claim would exceed the retention. Although I solved this problem analytically, I also decided to build a simulation model to find the answer and to test the analytical solution. In the process of building this model, I used two techniques that may be helpful to others building a simulation model in a spreadsheet environment.
The straightforward way to build a simulation model is to generate a random claim count, then generate a random loss amount for each claim. (See some of Glenn Meyers papers for examples.) However, in a spreadsheet environment such as Excel, generating the number of claims is not trivial (unless you have an add-in like @Risk). The built-in Excel functions include distributions, of course, such as Poisson and negative binomial, but there are no inverse functions. Fortunately, there is a solution that actually speeds up the calculation and reduces the parameter risk. Assuming we know in advance how many total scenarios we intend to generate, we can calculate precisely how many scenarios should have zero claims, one claim, and so on. For example, if we want to use the Poisson process for generating claims, with annual mean of four claims, and 10,000 simulated years, we can calculate that exactly 183 years should have zero claims, 733 should have one claim, and so on. We can build a spreadsheet with exactly the right number of claims for each scenario. After that, we can generate the severity using a built-in function such as LOGINV, the inverse of the lognormal, or create our own formula.
Some will be concerned that we have "cheated." After all, if we generate all the years with zero claims first, then all the years with one claim, we won’t be generating a sequence of years that models reality. However, depending on what result of the model we use, this approach may be acceptable. If the desired end product is the aggregate loss distribution, then this is equivalent to saying that we will sort the years by the size of the aggregate claim amount, so it doesn’t matter whether we generate the individual years in a random order or in some organized order. On the other hand, if we are interested in information that does depend on the order, for example, if we are interested in the likelihood that consecutive years are loss free, or that consecutive years have a loss in excess of a retention, then the order does matter. In this case, we have to invoke the second technique.
At the same time each of the losses for a particular year is generated, also generate a random value between zero and one. (In Excel, this can be done with the RANDOM function.) After all the years have been generated, with all the losses for one year on a single row, sort the rows using the column containing the random value. This "sort" will effectively scramble the individual years, as if they had been generated by a random process.
To download a documented spreadsheet illustrating how to do this, click here. The spreadsheet was written using Excel 97, so you will need Excel 97 on your machine to download this file.