What Is Random Forest Modeling?

Sidebar to the article Predicting Recidivism Risk: New Tool in Philadelphia Shows Great Promise by Nancy Ritter

Random forest modeling is the technique that Richard Berk, working with NIJ-funded researchers Geoffrey Barnes and Jordan Hyatt, used to build the risk prediction tool for Philadelphia's Adult Probation and Parole Department. A random forest is best described as a collection of hundreds of individual decision trees whose predictions are combined.

In the simplest statistical terms, here is how it works: The data are modeled using a technique called "classification and regression trees." The computer grows each tree from a random sample of the cases and, at each split in the tree, considers only a randomly selected subset of the predictors. It repeats this process to build several hundred trees, and the trees' individual predictions are then combined, by vote or by averaging, into a single outcome. In the case of the Philadelphia tool, this outcome was assignment to one of three risk categories (high, moderate or low) for probation-supervision purposes.
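
For readers who want to see the mechanics, the process described above can be sketched in a few dozen lines of Python. This is a minimal illustration under stated assumptions, not the Philadelphia tool or the researchers' code: the data are synthetic, the four numeric predictors and the rule that labels each case are invented for the example, and the helper names (make_synthetic, build_tree, build_forest, predict_forest) are hypothetical. Only the three outcome categories come from the article.

```python
import random
from collections import Counter

RISK_LEVELS = ("low", "moderate", "high")  # the three categories named in the article


def make_synthetic(n, seed):
    """Invented data: 4 numeric predictors; only the first two drive the label."""
    rng = random.Random(seed)
    rows, labels = [], []
    for _ in range(n):
        row = [rng.random() for _ in range(4)]
        score = row[0] + row[1]  # made-up labeling rule, for illustration only
        labels.append("low" if score < 0.7 else "moderate" if score < 1.3 else "high")
        rows.append(row)
    return rows, labels


def gini(labels):
    """Gini impurity of a list of class labels (0 means perfectly pure)."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def build_tree(rows, labels, n_try, rng, depth=0, max_depth=4):
    """Grow one classification tree, trying only n_try random predictors per split."""
    majority = Counter(labels).most_common(1)[0][0]
    if depth == max_depth or len(set(labels)) == 1:
        return majority  # leaf: most common class among the cases that reach it
    best = None
    for f in rng.sample(range(len(rows[0])), n_try):
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if best is None or score < best[0]:
                best = (score, f, t)
    if best is None:
        return majority  # no usable split among the sampled predictors
    _, f, t = best
    lo = [(r, y) for r, y in zip(rows, labels) if r[f] <= t]
    hi = [(r, y) for r, y in zip(rows, labels) if r[f] > t]
    return (
        f,
        t,
        build_tree([r for r, _ in lo], [y for _, y in lo], n_try, rng, depth + 1, max_depth),
        build_tree([r for r, _ in hi], [y for _, y in hi], n_try, rng, depth + 1, max_depth),
    )


def predict_tree(node, row):
    """Follow the splits down to a leaf and return that leaf's class."""
    while isinstance(node, tuple):
        f, t, left, right = node
        node = left if row[f] <= t else right
    return node


def build_forest(rows, labels, n_trees=200, n_try=2, seed=0):
    """Grow n_trees trees, each on its own bootstrap sample of the cases."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]  # sample cases with replacement
        forest.append(build_tree([rows[i] for i in idx], [labels[i] for i in idx], n_try, rng))
    return forest


def predict_forest(forest, row):
    """Each tree casts one vote; the forest returns the most common category."""
    votes = Counter(predict_tree(tree, row) for tree in forest)
    return votes.most_common(1)[0][0]


rows, labels = make_synthetic(150, seed=1)
forest = build_forest(rows, labels, n_trees=50, n_try=2, seed=2)
print(predict_forest(forest, [0.05, 0.05, 0.5, 0.5]))
```

The sketch uses a simple majority vote and treats every kind of error as equally costly. A production tool would typically rely on an established statistical library rather than hand-written trees.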

The final NIJ report (pdf, 64 pages) describes random forest modeling — and the fine-tuning that the research partnership went through as they built three iterations of the risk prediction tool — in much more detail.

Is random forest modeling an improvement over more traditional actuarial prediction analyses? Barnes and Hyatt say yes.

"It allows for the inclusion of a large number of predictors, the use of a variety of data sources, the expansion of assessments beyond binary outcomes, and taking the costs of different types of forecasting errors into account," Barnes said.

Date Created: February 27, 2013