What Is Random Forest Modeling?
Random forest modeling is the technique that Richard Berk, working with NIJ-funded researchers Geoffrey Barnes and Jordan
Hyatt, used to build the risk prediction tool for Philadelphia's Adult Probation and Parole Department. A random forest
is best described as an ensemble of hundreds of individual decision trees.
In the simplest statistical terms, here is how it works: Data are organized using a technique called "classification and regression
trees." The computer then runs an algorithm that selects predictors at random and repeats this process to build
several hundred trees. The predictions of the individual trees are then combined, by vote or by averaging, into a single outcome. In
the case of the Philadelphia tool, this outcome was assignment to one of three risk categories (high, moderate or low) for
The final NIJ report (pdf, 64 pages) describes random forest modeling, and the fine-tuning that the research partnership went through as it built three iterations
of the risk prediction tool, in much more detail.
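The procedure described above can be sketched in a few dozen lines of Python. This is a minimal illustration, not the Philadelphia tool: the three predictors (age, prior arrests, years since last offense), the rule used to label the synthetic cases, and all function names are hypothetical, and the data are randomly generated rather than real. Each tree is grown on a bootstrap sample and considers only a random subset of predictors at each split; the forest's forecast is the category that receives the most votes.

```python
import random
from collections import Counter

def gini(labels):
    """Gini impurity of a list of category labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def majority(labels):
    return Counter(labels).most_common(1)[0][0]

def build_tree(rows, labels, n_try, rng, depth=0, max_depth=4):
    """Grow one classification tree; a leaf is a label, a split is a tuple."""
    if depth == max_depth or len(set(labels)) == 1:
        return majority(labels)
    best = None
    # Consider only a random subset of predictors at this split.
    for f in rng.sample(range(len(rows[0])), n_try):
        for t in sorted(set(r[f] for r in rows)):
            left = [i for i, r in enumerate(rows) if r[f] <= t]
            right = [i for i, r in enumerate(rows) if r[f] > t]
            if not left or not right:
                continue
            # Size-weighted impurity of the two child nodes (lower is better).
            score = (len(left) * gini([labels[i] for i in left])
                     + len(right) * gini([labels[i] for i in right]))
            if best is None or score < best[0]:
                best = (score, f, t, left, right)
    if best is None:
        return majority(labels)
    _, f, t, left, right = best
    grow = lambda idx: build_tree([rows[i] for i in idx], [labels[i] for i in idx],
                                  n_try, rng, depth + 1, max_depth)
    return (f, t, grow(left), grow(right))

def predict_tree(node, row):
    while isinstance(node, tuple):
        f, t, left, right = node
        node = left if row[f] <= t else right
    return node

def build_forest(rows, labels, n_trees=200, n_try=2, seed=0):
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        # Bootstrap sample: draw cases at random, with replacement.
        idx = [rng.randrange(len(rows)) for _ in rows]
        forest.append(build_tree([rows[i] for i in idx],
                                 [labels[i] for i in idx], n_try, rng))
    return forest

def predict_forest(forest, row):
    """Return the category with the most votes, plus the full vote count."""
    votes = Counter(predict_tree(tree, row) for tree in forest)
    return votes.most_common(1)[0][0], votes

# Synthetic, made-up cases: (age, prior arrests, years since last offense).
data_rng = random.Random(1)
rows, labels = [], []
for _ in range(100):
    age = data_rng.randint(18, 60)
    priors = data_rng.randint(0, 10)
    years = data_rng.randint(0, 10)
    score = priors - (age - 18) / 10  # hypothetical labeling rule
    rows.append((age, priors, years))
    labels.append("high" if score >= 5 else "moderate" if score >= 1 else "low")

forest = build_forest(rows, labels)
category, votes = predict_forest(forest, (20, 10, 0))
print(category, dict(votes))
```

Because every tree sees a different sample of cases and a different subset of predictors, no single tree decides the outcome; the vote count also gives a rough sense of how strongly the forest agrees on a given case.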
Is random forest modeling an improvement over more traditional actuarial prediction analyses? Barnes and Hyatt say yes.
"It allows for the inclusion of a large number of predictors, the use of a variety of data sources, the expansion of assessments
beyond binary outcomes, and taking the costs of different types of forecasting errors into account," Barnes said.
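One general way to take unequal error costs into account is to choose the category with the lowest expected cost rather than the one with the most votes. The sketch below illustrates that general idea only; it is not necessarily how the Philadelphia tool handles costs, and the vote shares and cost figures are made-up numbers chosen for illustration.

```python
CATEGORIES = ("low", "moderate", "high")

# COSTS[actual][forecast]: hypothetical penalty for forecasting `forecast`
# when the person's actual category is `actual`. Underestimating risk is
# assumed here to be far more costly than overestimating it.
COSTS = {
    "low":      {"low": 0,  "moderate": 1, "high": 2},
    "moderate": {"low": 3,  "moderate": 0, "high": 1},
    "high":     {"low": 10, "moderate": 5, "high": 0},
}

def least_cost_category(vote_shares, costs):
    """Pick the forecast whose expected cost is lowest, treating the
    forest's vote shares as rough probabilities of each actual category."""
    def expected_cost(forecast):
        return sum(vote_shares[actual] * costs[actual][forecast]
                   for actual in CATEGORIES)
    return min(CATEGORIES, key=expected_cost)

# Hypothetical share of trees voting for each category for one case.
vote_shares = {"low": 0.5, "moderate": 0.3, "high": 0.2}

plurality = max(vote_shares, key=vote_shares.get)
cost_aware = least_cost_category(vote_shares, COSTS)
print(plurality, cost_aware)
```

With these made-up numbers the plurality vote is "low," but the cost-aware choice is "high," because wrongly labeling a high-risk case as low is assumed to carry a much larger penalty than the reverse.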
Date Created: February 27, 2013