GLM I: INTRODUCTION TO GENERALIZED LINEAR MODELS
This session will provide a basic introduction to linear and generalized linear models. The session will emphasize regression-based methods as the basis of more advanced modeling techniques. It will begin with a basic review of common statistical methods that have been in use by actuaries and other analysts for many years. No familiarity with modeling methods will be assumed. Topics will include introductions to regression, ANOVA, and discriminant analysis. Methods of performing goodness-of-fit and diagnostic tests to check the underlying assumptions of the model will be presented. As groundwork for discussing generalized linear models, the demonstrated topics will include regression and classification. Where possible, the session will illustrate the statistical principles with applications that can be performed in Excel.
PANELIST:
Louise A. Francis, Consulting Principal, Francis Analytics and Actuarial Data Mining Inc.
GLM II: BASIC MODELING STRATEGY
Predictive modeling is gaining significant momentum within U.S. personal lines companies, with generalized linear modeling being the most commonly used technique. Generalized linear modeling is enabling companies to develop more accurate rates more efficiently than ever before.
GLM II is intended to be a practical session outlining basic modeling strategy associated with GLMs. The discussion will cover topics such as selecting an appropriate error structure, simplifying the GLM (i.e., excluding variables, grouping levels, fitting curves), and complicating the GLM (i.e., interactions). The session will discuss diagnostics that help test the selections made.
PANELIST:
Geoffrey Todd Werner, Senior Consultant, EMB America LLC
MODERATOR:
Gaetan R. Veilleux, Associate Actuary, SAFECO Insurance Companies
GLM III: ADVANCED MODELING STRATEGY
GLM III will cover additional refinements such as the pros and cons of modeling directly on loss costs using the Tweedie distribution, the use of splines within a GLM framework, and the use of the offset term to constrain models. This session will demonstrate how to use binomial models or offset terms to use model results from a stable claim type when measuring a related but more volatile claim type, how to combine GLMs across multiple claim types, and how to investigate the appropriateness of a multiplicative model structure.
PANELIST:
Duncan Anderson, Partner, Watson Wyatt LLP
CUSTOMER RETENTION AND NEW BUSINESS CONVERSION ANALYSIS - DISCRETE MODELING ISSUES
Actuaries have produced various methods to measure loss costs, but historically there has been less attention placed on measuring policyholder behavior - a striking contrast to other financial services industries. This session will present multivariate analysis as a viable method for measuring policyholder retention and new business conversion. The discussion will include model considerations and practical tips, as well as illustrative output. Focus will be placed on the benefits and applications of modeling policyholder behavior, including the selection of rate changes that optimize corporate objectives.
PANELISTS:
Claudine H. Modlin, Consulting Actuary, Watson Wyatt Insurance & Financial Services Inc.
Robert J. Walling, Principal and Consulting Actuary, Pinnacle Actuarial Resources Inc.
GENERALIZED LINEAR MODELS OR NON-LINEAR MODELS WITH APPLICATIONS TO AUTO
As generalized linear models become more widely accepted and used throughout the insurance industry as a sophisticated pricing tool, there are still many concerns regarding the nature of the statistical solution. The major issue surrounds the structural design of the GLM solution as a purely additive or multiplicative construct and its application to the insurer's rating algorithm. This session will discuss techniques available to build mixed multiplicative-additive solutions using standard GLMs. Panelists will focus on ways to expand the GLM construct to include more nonlinear statistical solutions. Diagnostic methods that explore the validity of using more complex structural solutions versus the standard GLM solution will also be discussed.
PANELISTS:
James Tanser, Watson Wyatt LLP
Karl B. Murphy, Partner, EMB Consultancy
HOMEOWNERS MODELING: REFINING TRADITIONAL RATING VARIABLES AND MAXIMIZING THE VALUE OF EXTERNAL DATA
Many GLM ratemaking applications have focused on private passenger auto examples. This session will discuss how the nature of some homeowners variables affects a predictive modeling analysis. These include both traditional rating variables (such as amount of insurance, deductible, and policy form) and external variables related to demographics or weather. The typical indivisible premium approach for analyzing homeowners data does not lend itself well to proper investigation of these explanatory variables; therefore, the presentation will outline a case for modeling homeowners separately by peril.
PANELISTS:
A. David Cummings, Actuary, State Farm Insurance Companies
Claudine H. Modlin, Consulting Actuary, Watson Wyatt Insurance & Financial Services Inc.
VEHICLE SYMBOLS: DEFINED AND REDEFINED
Vehicle symbols are among the most widespread and established rating factors in the insurer's rating algorithm; they are also among the more challenging factors to price. Vehicle symbol groups tend to be heavily correlated with other rating variables; therefore, one-way analysis of this variable is very susceptible to distortions. In addition, the high dimensionality of vehicle types, together with group definitions that vary by type of coverage, produces additional challenges when studying vehicle similarities. Thus, it is imperative that symbols be analyzed and defined within the context of a flexible multivariate framework.
This session will present ratemaking techniques within an appropriate multivariate framework and discuss diagnostics that can be used to determine the best symbol groups possible.
PANELISTS:
Thomas E. Hettinger, Managing Director, EMB America LLC
Richard B. Moncher, Vice President & Actuary, Bristol West Holdings, Inc.
INCORPORATING EXTERNAL INFORMATION INTO THE PREDICTIVE MODELING PROCESS
Predictive models rely on accurate information detailing the characteristics and behavior of customers. While many companies collect data internally for pricing, underwriting, or marketing purposes, historically they have rarely collected the quantity and diversity of data needed to build predictive models across multiple elements of their operations. Leveraging the knowledge obtained from external data, either purchased from vendors or gathered from publicly available government databases, can be the deciding ingredient that turns an appropriate predictive model used to underwrite policies into a truly competitive tool for pricing, underwriting, and marketing policies. The panelists in this session will present key issues surrounding external data, including 1) how to identify, research, and append external data to an insurance database to achieve certain quality and match-rate goals; and 2) how to use external data to better leverage predictive models by including external response data, addressing reject biases, and translating the knowledge learned from the process into target marketing opportunities.
PANELISTS:
Gary T. Ciardiello, Senior Consulting Actuary, Ernst & Young LLP
Brett M. Nunes, Actuarial Analyst, Ernst & Young LLP
Richard Vlasimsky, Co-Founder and Chief Technical Officer, Valen Technologies
SPECIAL CHALLENGES WITH LARGE DATA MINING PROJECTS
Predictive modeling projects are exciting because of their potential to create competitive advantage by developing insight that your competitors do not have. Predictive modeling projects are also extremely challenging. Two of the most significant challenges are project management and data management.
This session will present the issues that a project manager will likely face on a predictive modeling project. The focus will be on how to address these issues in order to bring a modeling project to a successful completion.
In working with large databases, a key challenge is merging a large number of external data sources into a company's internal data. The panelists will also present a project case study where numerous external databases were merged into a company's internal data. They will discuss some of the unique data integrity and data quality issues that were faced and how they were addressed.
PANELISTS:
Gary T. Ciardiello, Senior Consulting Actuary, Ernst & Young LLP
Beth E. Fitzgerald, Assistant Vice President, ISO
Jeremy M. Stanley, Actuarial Analyst, Ernst & Young LLP
PUTTING YOUR COMPANY ON THE MAP
Geographical risk classifications, such as fire protection classes, rating territories, and zones, have traditionally been defined based on physical surveys, engineering studies, and data analysis. Rate factors applied to these risk classifications are often determined using one-way analysis techniques, even though geographical risk tends to be highly correlated with other risk factors. This session will present modeling techniques used to determine appropriate geographical risk classification boundaries and rate factors.
MODERATOR:
Michael J. Miller, Consulting Actuary, Towers Perrin
PANELISTS:
Serhat Guven, Senior Consultant, EMB America LLC
Gregory L. Hayward, Assistant Vice President & Actuary, State Farm Insurance Companies
Klayton Southwood, Consultant, Towers Perrin
FRAUD-FIGHTING ACTUARIES: MATHEMATICAL MODELS FOR INSURANCE FRAUD DETECTION
Criminal fraud is distinguished from systematic abuse by exaggerated losses and calls for different detection and handling strategies. Most estimates of the extent of the problem involve large aggregate dollars in losses and loss adjustment expense. This session will give an overview of the insurance claim fraud problem, emphasizing claims processing and fraudulent and abusive claims detection.
Panelists will review a wide variety of statistical methods for detecting and deterring fraud, including:
- Network analysis for detecting staged auto accidents;
- Geographic models for detecting false medical bills;
- Statistical scoring for selecting IRS audits;
- Sequential decision models for identifying fraudulent claims;
- Pattern analysis for detecting fraudulent billing; and
- Text mining for detecting fraud in medical reports.
Recent research results using Massachusetts auto injury data and models for claim severity, claim classification, and the liability settlement process in the presence of fraud and buildup will be covered. An expanded role for the actuary will also be discussed.
MODERATOR/PANELISTS:
Richard A. Derrig, President, OPAL Consulting LLC and Visiting Scholar, Wharton School, University of Pennsylvania
Daniel Finnegan, President and CEO, Quality Planning Corporation
DIMENSION REDUCTION FOR WORKERS COMPENSATION
Workers compensation insurers historically have relied on a simple class plan consisting of only one rating variable: class code. With the help of predictive modeling, companies are able to identify many potential rating variables with predictive power, but bridging the gap between a complex class plan and a simple class plan requires dimension reduction techniques.
The first part of this session will offer guidance to practitioners attempting a classification plan review. Practical dimension reduction techniques using workers compensation data will be discussed. The second part of this session will present the major kinds of dimension reduction (row reduction or clustering, and column or variable reduction) using linear methods such as factor analysis and principal components and more advanced approaches such as neural networks and projection pursuit regression. These methods will be applied to workers compensation underwriting applications and claims analysis examples such as fraud detection.
PANELISTS:
David J. Otto, Managing Director, EMB America LLC
Louise A. Francis, Consulting Principal, Francis Analytics and Actuarial Data Mining Inc.
MODERATOR:
Jennifer Marie Oglenski, Actuarial Analyst, Accident Fund Insurance Company of America
IMPLEMENTATION: MAKING PREDICTIVE MODELS "COME ALIVE"
Predictive models for underwriting can help generate significant financial and operational benefits for an insurance company. However, models alone will do very little for a company. Rather, they must be strategically inserted into core business processes and used in targeted ways for their business value to be realized. This session will focus discussion on end-to-end implementation of predictive models, some methods and techniques for success, and why implementation is the most important part of every predictive modeling project.
PANELISTS:
John Lucker, Principal, Deloitte & Touche LLP
Michele Yeagley, Assistant Vice President, Harleysville Group
CLAIMS/AGENCY METRICS AND OTHER NONTRADITIONAL APPLICATIONS OF PREDICTIVE MODELING
Many of the results produced by predictive modeling analyses have applications that can be of direct and powerful value outside of the pricing function. Because predictive modeling attempts to take available data and explain a process better, its value can be applied in a number of different areas. Specifically, predictive models can provide great insights into claim frequency and severity differences that exist between claim offices and adjusters, an evaluation of the benefit of certain claim service providers (e.g., preferred provider network, preferred glass shop), and a determination of the true value added by an agency or producer. In this session, the panel will also discuss real examples of how these nontraditional models are applied, including using predictive modeling in legislative costing.
PANELISTS:
Shawna S. Ackerman, Principal and Consulting Actuary, Pinnacle Actuarial Resources Inc.
Roosevelt C. Mosley Jr., Senior Consulting Actuary, Pinnacle Actuarial Resources Inc.
OTHER MODELING TECHNIQUES
Consider these special characteristics of insurance loss data: (1) it is highly skewed; (2) it is not fully developed; and (3) it has a nonlinear response to potential predictor variables. Standard procedures such as GLM impose some restrictions on the link function and the response distribution that may not be appropriate for insurance loss data with these characteristics. This session discusses special techniques such as generalized additive models (GAM) and maximum likelihood procedures to model data in these cases.
PANELISTS:
Keith D. Holler, Vice President, The Travelers Insurance Company
Glenn G. Meyers, Chief Actuary, ISO Innovative Analytics
CREDIT STUDIES AND REGULATOR'S VIEWS
Many companies use credit information for pricing and underwriting purposes. Multivariate analysis of credit-based data and traditional characteristics is now common. Several regulatory bodies have collected information to allow their own analyses, including a look at whether the use of credit has different impacts on various racial or ethnic groups. This session will present a discussion of published studies of the loss propensity and the potential disproportionate impact of credit-related insurance scores as applied to personal auto and homeowners insurance from the viewpoint of the regulator.
MODERATOR:
Michael J. Miller, Consulting Actuary, Towers Perrin
PANELISTS:
Jesse B. Leary, Deputy Assistant Director, Division of Consumer Protection, Bureau of Economics, US Federal Trade Commission
Philip O. Presley, Chief Actuary, Texas Department of Insurance
Chester J. Szczepanski, Vice President and Chief Actuary, Donegal Insurance Group
PREDICTIVE MODELING FOR SMALLER COMPANIES
Many larger companies have embraced the idea of predictive modeling and have had the volume of data to produce credible results. Small- to medium-sized companies may wonder how predictive modeling can help them, especially when they do not have the data volume of their larger competitors. This session will discuss why predictive modeling has become more important for small- to medium-sized insurers, and how these insurers address some of the unique issues they face when developing models. The session will also cover some of the results obtained when applying these techniques and some of the unique advantages smaller companies have when approaching the predictive modeling process.
PANELISTS:
Richard A. Smith, Consultant, Towers Perrin
Roosevelt C. Mosley Jr., Senior Consulting Actuary, Pinnacle Actuarial Resources Inc.
MODEL VALIDATION AND DIAGNOSTICS: USING OUT-OF-SAMPLE DATA
Modern statistics has greatly enriched the actuary's ability to select the optimal predictive model (diagnostics) and accurately estimate a model's predictive power (validation). In this session, panelists will introduce the various diagnostic and validation (including cross-validation) techniques. The key goal of these straightforward techniques is to avoid models that "overfit" the data and thus will not generalize well to future data. While avoiding overfit models is the main goal, the panel will also make suggestions on what to do when all reasonable models seem overfit and on how validation techniques can be used in cost/benefit decisions. Validation techniques are useful no matter what you do: They can be applied to the simplest credibility computation or to the most sophisticated neural network. Their results can be summarized in charts that can convey a model's predictive power to nontechnical business partners.
PANELISTS:
James Christopher Guszcza, Senior Manager, Deloitte & Touche LLP
Christopher J. Monsour, Consulting Actuary, Towers Perrin
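As a rough illustration of the out-of-sample idea in the session description above (not the panelists' material), the sketch below compares in-sample error with k-fold cross-validated error for a one-variable least-squares line. The data are synthetic, and the names `fit_line`, `mse`, and `k_fold_mse` are hypothetical helpers chosen for this example.

```python
import random

def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def mse(xs, ys, slope, intercept):
    """Mean squared prediction error of the fitted line on (xs, ys)."""
    return sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys)) / len(xs)

def k_fold_mse(xs, ys, k=5, seed=0):
    """Average squared error on records held out of the fit, over k folds."""
    indices = list(range(len(xs)))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    squared_errors = []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in indices if i not in held]
        slope, intercept = fit_line([xs[i] for i in train], [ys[i] for i in train])
        squared_errors.extend(
            (ys[i] - (slope * xs[i] + intercept)) ** 2 for i in held_out
        )
    return sum(squared_errors) / len(squared_errors)

# Synthetic data (an assumption for illustration): a linear trend plus noise.
noise = random.Random(1)
xs = [float(x) for x in range(30)]
ys = [2.0 * x + 1.0 + noise.gauss(0.0, 3.0) for x in xs]

in_sample = mse(xs, ys, *fit_line(xs, ys))
out_of_sample = k_fold_mse(xs, ys, k=5)
print(f"in-sample MSE: {in_sample:.3f}, 5-fold CV MSE: {out_of_sample:.3f}")
```

For a least-squares fit the held-out error is never smaller than the in-sample error, which is why the in-sample figure alone tends to overstate predictive power.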
HOW SURE ARE YOU OF THAT ESTIMATE? BOOTSTRAPPING AND ITS COUSINS
Actuaries routinely calculate point estimates of various quantities, such as the average loss ratio for certain segments or loss reserve estimates by line of business. Calculating variances or confidence intervals around such estimates has traditionally been a challenge. This session will introduce "bootstrapping," an intuitive, simulation-based method that can be used to calculate the variance of most estimators. Bootstrapping also appears at the core of bagging and boosting techniques designed to improve the quality of the estimates themselves. This session will present the relevant theory of bootstrapping, discuss potential pitfalls, and give practical examples of bootstrap analysis run on various types of insurance data.
PANELISTS:
James Christopher Guszcza, Senior Manager, Deloitte & Touche LLP
Christopher J. Monsour, Consulting Actuary, Towers Perrin
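To make the bootstrap idea in the session description above concrete, here is a minimal sketch of the nonparametric bootstrap: resample the data with replacement, recompute the estimator each time, and use the spread of the recomputed values as a standard error. The loss ratios are made-up numbers, and `bootstrap_estimates` and `percentile_interval` are hypothetical helper names, not anything taken from the session.

```python
import random
import statistics

def bootstrap_estimates(data, estimator, n_resamples=2000, seed=0):
    """Recompute `estimator` on resamples drawn with replacement from `data`."""
    rng = random.Random(seed)
    n = len(data)
    estimates = []
    for _ in range(n_resamples):
        resample = [data[rng.randrange(n)] for _ in range(n)]
        estimates.append(estimator(resample))
    return estimates

def percentile_interval(estimates, level=0.95):
    """Simple percentile interval read off the sorted bootstrap estimates."""
    ordered = sorted(estimates)
    lower_index = int((1 - level) / 2 * len(ordered))
    upper_index = int((1 + level) / 2 * len(ordered)) - 1
    return ordered[lower_index], ordered[upper_index]

# Hypothetical loss ratios for ten segments (illustrative values only).
loss_ratios = [0.52, 0.61, 0.48, 0.95, 0.70, 0.66, 0.58, 1.20, 0.63, 0.55]

estimates = bootstrap_estimates(loss_ratios, statistics.mean)
standard_error = statistics.stdev(estimates)
low, high = percentile_interval(estimates)
print(f"mean loss ratio: {statistics.mean(loss_ratios):.3f}")
print(f"bootstrap standard error: {standard_error:.3f}")
print(f"95% percentile interval: ({low:.3f}, {high:.3f})")
```

Passing a different function as `estimator` (for example `statistics.median`) reuses the same machinery, which is the sense in which the method applies to most estimators. The percentile interval is only one of several bootstrap interval constructions.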
PREDICTIVE MODELING FOR COMMERCIAL RISKS
In an attempt to replicate the successful applications in the personal lines industry, the commercial lines industry is speeding up its adoption of predictive modeling. This session will review the current predictive modeling development for commercial lines applications. The session will also discuss the underwriting challenges for commercial risks, especially small commercial risks, and how predictive modeling, such as scoring models, can address these needs. The efforts and challenges involved in building a scoring model will be described, including data issues and analysis of models.
PANELISTS:
Beth E. Fitzgerald, Assistant Vice President, ISO
Cheng-Sheng Peter Wu, Director, Deloitte & Touche LLP
WHAT TO DO WHEN YOU CANNOT USE CREDIT
Credit scoring has been widely used in both the personal and commercial lines industries to underwrite and price risks. While credit scoring has proven to be a powerful tool for segmenting insurance risks, it has also faced increasing regulatory restriction and public scrutiny. This session will discuss the development and application of alternative scoring models that do not use credit information.
MODERATOR:
Daniel D. Blau, Director of Applied Research, The Hartford
PANELISTS:
Gary Haith, Lead Scientist, Valen Technologies
Cheng-Sheng Peter Wu, Director, Deloitte & Touche LLP
CLAIMS APPLICATIONS OF GLM
Many of the results produced by GLM analyses have applications that can be of direct and powerful value to claims departments. Specifically, GLMs provide great insights into claim frequency and severity differences that exist between claim offices and adjusters, claim characteristics (e.g., medical-only versus lost-time workers compensation claims), and rating variables (e.g., physical damage severities by vehicle symbol). This session will examine how GLM output can be used to provide valuable information that helps claims departments improve their approaches to such issues as statistical or formula case reserves, claims audits and training, and utilization review of third parties (e.g., attorney, auto glass, physician, or fraud investigation networks).
PANELISTS:
Robert J. Walling, Principal and Consulting Actuary, Pinnacle Actuarial Resources Inc.
TBD
DECISION TREES FOR DATA EXPLORATION AND MODELING
Many "advanced" modeling techniques such as neural networks, MARS, and generalized additive models can be viewed as extensions of the linear regression framework that allow one to automatically model non-linear patterns. Classification and regression trees (CART), on the other hand, exemplify a different paradigm for multivariate data analysis. CART, like other tree-based modeling techniques, is based on the principle of recursive partitioning. It is a brute-force, non-parametric technique for selecting variables and variable values in a way that optimally partitions one's data into homogeneous segments. CART can be viewed as a competitor to linear regression and its cousins, but it can equally be viewed as an exploratory tool that complements other techniques. This session will sketch the basic concepts of CART and also suggest a variety of actuarial applications. Discussion will emphasize the applications of exploratory data analysis and variable search, and the processes of searching for variable interactions, creating model-based decision rules, and analyzing complex models.
MODERATOR:
Margaret A. Brinkman, Actuary, Allstate Research & Planning Center
PANELISTS:
James Christopher Guszcza, Senior Manager, Deloitte Consulting
Daniel Steinberg, CEO, Salford Systems
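The recursive partitioning principle mentioned in the CART session description above can be sketched in a few lines. This is a simplified, single-predictor regression tree that chooses each split by brute-force search for the lowest within-segment sum of squared errors; it omits pruning and the other refinements of full CART, and the function names and data are hypothetical.

```python
def sse(values):
    """Sum of squared deviations from the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)

def best_split(xs, ys):
    """Try every candidate threshold and keep the one with the lowest total SSE."""
    best = None
    for threshold in sorted(set(xs))[:-1]:
        left = [y for x, y in zip(xs, ys) if x <= threshold]
        right = [y for x, y in zip(xs, ys) if x > threshold]
        cost = sse(left) + sse(right)
        if best is None or cost < best[1]:
            best = (threshold, cost)
    return best  # None when every x is identical

def grow_tree(xs, ys, depth=0, max_depth=2, min_size=2):
    """Recursively partition the data until a depth or size limit is reached."""
    split = None
    if depth < max_depth and len(ys) >= 2 * min_size:
        split = best_split(xs, ys)
    if split is None:
        return {"prediction": sum(ys) / len(ys), "n": len(ys)}
    threshold, _ = split
    left = [(x, y) for x, y in zip(xs, ys) if x <= threshold]
    right = [(x, y) for x, y in zip(xs, ys) if x > threshold]
    return {
        "threshold": threshold,
        "left": grow_tree(*zip(*left), depth + 1, max_depth, min_size),
        "right": grow_tree(*zip(*right), depth + 1, max_depth, min_size),
    }

# Hypothetical data: driver age versus observed claim frequency.
ages = [18, 19, 21, 24, 30, 35, 42, 50, 58, 65]
frequencies = [0.30, 0.28, 0.25, 0.18, 0.10, 0.09, 0.08, 0.08, 0.09, 0.12]
print(grow_tree(ages, frequencies))
```

Each leaf of the returned dictionary is a segment with its own average response, which is how a tree doubles as an exploratory tool for finding natural groupings and candidate interactions.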