Distributionally-Informed Recommender System Evaluation


Michael D. Ekstrand, Ben Carterette, and Fernando Diaz. 2024. Distributionally-Informed Recommender System Evaluation. Transactions on Recommender Systems 2(1) (March 7th, 2024), 6:1–27. DOI 10.1145/3613455. arXiv:2309.05892 [cs.IR]. NSF PAR 10461937.

Table 1 from the paper, showing a table with multiple statistics and KDE sparklines representing the whole distribution.


Current practice for evaluating recommender systems typically focuses on point estimates of user-oriented effectiveness metrics or business metrics, sometimes combined with additional metrics for considerations such as diversity and novelty. In this paper, we argue for the need for researchers and practitioners to attend more closely to various distributions that arise from a recommender system (or other information access system) and the sources of uncertainty that lead to these distributions. One immediate implication of our argument is that both researchers and practitioners must report and examine more thoroughly the distribution of utility between and within different stakeholder groups, but distributions of various forms arise in many more aspects of the recommender systems experimental process, and distributional thinking has substantial ramifications for how we design, evaluate, and present recommender systems evaluation and research results. Leveraging and emphasizing distributions in the evaluation of recommender systems is a necessary step to ensure that the systems provide appropriate and equitably-distributed benefit to the people they affect.

