Working Paper
Experts' abilities to make accurate probabilistic forecasts are often evaluated with proper scoring rules. In practice, the Brier score is one of the most commonly used scoring rules when the target outcomes are categorical. However, given that an event with more outcomes is typically more difficult to forecast, it is unfair to compare the Brier scores of experts who forecast events with different numbers of outcomes.
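The sketch below is not the authors' fair skill adjustment, which the paper develops; it is only the standard multi-category Brier score together with a uniform "no information" baseline, illustrating why raw scores are not comparable across events with different numbers of outcomes K (a uniform forecast scores (K - 1) / K, which grows with K).

```python
import numpy as np

def brier_score(forecast, outcome_index):
    """Multi-category Brier score: sum of squared differences between
    the forecast probability vector and the one-hot realized outcome."""
    forecast = np.asarray(forecast, dtype=float)
    outcome = np.zeros(len(forecast))
    outcome[outcome_index] = 1.0
    return float(np.sum((forecast - outcome) ** 2))

# A maximally uninformative (uniform) forecast scores (K - 1) / K,
# so the same "skill level" looks worse as the number of outcomes grows.
for K in (2, 3, 5, 10):
    uniform = np.full(K, 1.0 / K)
    print(K, round(brier_score(uniform, 0), 3))  # 0.5, 0.667, 0.8, 0.9
```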
In this paper, the authors introduce a simple fair skill adjustment to the Brier score to refine such comparisons. To demonstrate their adjustment, they introduce a behavioral model of experts' probabilistic forecasts of one-off events with different numbers of categorical outcomes. Under this model, the authors show that the fair skill Brier score is a more reliable measure of experts' forecasting skill as long as the outcomes are not close to being certain.
Finally, the authors apply the fair skill Brier score to experimental data from two different studies by the US intelligence community and find empirical support for their theoretical results.