Back-Testing the ODP Bootstrap & Mack Bootstrap Models

Abstract

Motivation. Distributions of unpaid claims are gaining importance within the actuarial community as management, regulators, and others look to the actuarial profession for a quantitative approach to evaluating risk. Actuaries have historically applied their judgment to determine if a best estimate is reasonable, but how do we know if the models used to produce distributions are reasonable? Determining whether a distribution is reasonable is a much more complex task than doing so for a point estimate. Is the model producing a reasonable estimate at the 95th percentile? Is it producing reasonable distribution shapes? In effect, actuarial judgment shifts focus from a single point estimate to the entire distribution, and we must rely, at least in part, on the proposition that “if the theory is acceptable then the distribution is acceptable.” Therefore, the purpose of this paper is to determine if the theory really holds up in practice.

There are five objectives of this research. First, by greatly expanding the database used to back-test models, the testing can provide more evidence to validate (or not) prior research and address any weaknesses in the prior research. Second, all of the prior research focused only on the estimate of a single outcome (i.e., the ultimate for the current accident year), so this research expands the testing to every possible estimate, e.g., each accident period, each calendar period, each incremental cell, etc. Third, more models were tested and some of the model assumptions were tested in order to expand our understanding of the predictive value of different models. Fourth, recent proposals to address model weaknesses were examined to assess their viability. Fifth, a new proposal for using this research to benchmark unpaid claim estimates is put forth.

Method. The estimated distribution of possible outcomes for each of various models based on the ODP Bootstrap model and the Mack Bootstrap model is saved and compared to the actual outcome up to 9 years later (i.e., a single back-test). While the result from a single data set is not indicative of the quality of the original estimate, comparing results for a large number of data sets does provide an indication of the quality of the model.
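As a rough illustration of why many back-tests are informative when a single one is not, the sketch below locates each actual outcome as a percentile of its estimated distribution and then tallies those percentiles by decile. If a model were well calibrated, the tallies would be roughly flat; a model that is too narrow piles actual outcomes into the lowest and highest deciles. Everything here is an assumption made for illustration and is not taken from the paper: the function names, the use of deciles, the normally distributed synthetic data, and the chosen standard deviations.

```python
import random
from bisect import bisect_right


def outcome_percentile(simulated, actual):
    """Fraction of simulated outcomes at or below the actual outcome."""
    ordered = sorted(simulated)
    return bisect_right(ordered, actual) / len(ordered)


def decile_counts(percentiles):
    """Count how many back-test percentiles fall into each decile bucket."""
    counts = [0] * 10
    for p in percentiles:
        # A percentile of exactly 1.0 is placed in the top bucket.
        counts[min(int(p * 10), 9)] += 1
    return counts


def run_synthetic_backtest(n_datasets=2000, n_sims=500,
                           true_sd=1.5, model_sd=1.0, seed=0):
    """Back-test a deliberately too-narrow model on synthetic data.

    Hypothetical setup: each "data set" has a true outcome drawn with
    standard deviation true_sd, while the model's simulated distribution
    uses the smaller model_sd.
    """
    rng = random.Random(seed)
    percentiles = []
    for _ in range(n_datasets):
        simulated = [rng.gauss(0.0, model_sd) for _ in range(n_sims)]
        actual = rng.gauss(0.0, true_sd)
        percentiles.append(outcome_percentile(simulated, actual))
    return decile_counts(percentiles)


counts = run_synthetic_backtest()
for bucket, count in enumerate(counts):
    print(f"{bucket * 10:3d}-{bucket * 10 + 10:3d}%: {count}")
```

In this synthetic setup the first and last buckets collect far more outcomes than the middle ones, which is the pattern one would expect from a model that underestimates the width of the distribution.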

Results. Based on the back-testing, all tested models appear to underestimate the width of the “true” distribution, but some of the models come closer to the “true” distribution than others. The tested adjustments to the model assumptions seem to improve the results, which is a desirable quality. Another key result shows how the insurance underwriting cycle affects the results of the back-testing.

Conclusions. The major results from similar prior research are confirmed, but the volume of this research has led to a new approach to benchmarking both deterministic and stochastic unpaid claim estimates in practice.

Keywords. Back-test, benchmark, bootstrap, chain ladder, Mack model, over-dispersed Poisson, reserve variability, systemic risk, underwriting cycle

Volume
Winter, 2019
Page
1-50
Year
2019
Categories
Financial and Statistical Methods
Statistical Models and Methods
Boot-Strapping and Resampling Methods
Actuarial Applications and Methodologies
Reserving
Reserve Variability
Publications
Casualty Actuarial Society E-Forum
Authors
Mark R Shapland