Lake Wobegon and the Panopticon: a simulation of real-world reputation systems

Author

Tom Slee

Published

November 8, 2015

Note

This page has been migrated from an earlier version of this site. Links and images may be broken.

For some time I have been working on a simulation of reputation systems: a computational model I can use to think through some of the issues they raise. A first pass at the model, together with a fairly long document describing how it works and some results, is now available on GitHub as a Jupyter notebook [here](https://github.com/tomslee/provider-reputation/blob/master/provider-reputation.ipynb).

I was particularly interested in a seeming paradox in what we have learned about real-world reputation systems. As I say in the introduction:

In the few years since they have become widespread, reputation systems have shown two seemingly contradictory characteristics:

(Lake Wobegon effect) Most ratings are very high. While ratings of Netflix movies peak around 3.5 out of 5, ratings on sharing-economy websites are almost all positive (mostly five stars out of five). The oldest and most widely studied reputation system is eBay's, in which well over 90% of ratings are positive; on other systems, such as BlaBlaCar, over 95% of ratings are “five out of five”.
(Panopticon effect) Service providers live in fear of a bad rating. They are apprehensive that ratings given for the most frivolous of reasons, by a customer they will never see again (and may not be able to identify), may wreck their earnings opportunities, either through outright removal from a platform or by pushing them down the rankings in search recommendations. Yelp restaurant owners rail at “drive-by reviewers” who damage their reputation; Uber drivers fear being “deactivated” (fired), which can happen if their rating slips below 4.6 out of 5 (a rating that would be stellar for a movie).
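The arithmetic behind the Panopticon effect is worth making concrete. The 4.6 threshold is Uber's reported cutoff; the rating mix below is my own illustrative assumption, not data from any platform. If nearly all customers give five stars and the rest give one star, a surprisingly small share of low ratings is enough to pull a provider's average below the line:

```python
# Illustrative sketch (my numbers, not from the post): how few low ratings
# it takes to drag a mean rating below the reported 4.6 deactivation cutoff.

def mean_rating(share_low, low=1, high=5):
    """Average rating when a fraction `share_low` of customers give `low`
    and everyone else gives `high`."""
    return share_low * low + (1 - share_low) * high

# With everyone else giving five stars, anything over 10% one-star
# ratings pushes the average below 4.6.
for share in (0.05, 0.10, 0.15):
    print(f"{share:.0%} one-star ratings -> mean = {mean_rating(share):.2f}")
```

So a driver who displeases just over one rider in ten is, on these assumptions, at risk of deactivation even though almost all of their ratings are perfect, which is why near-universal five-star ratings and pervasive fear can coexist.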

So are reputation systems effective or not? Here’s the seeming contradiction:

The Lake Wobegon effect suggests that reputation systems are useless: they fail to discriminate between good and bad service providers (my take on this from a couple of years ago is here). This suggestion is supported by quite a bit of recent empirical research, which I have summarized in MY NEW BOOK! Customers treat reviews as a courtesy rather than as an opportunity for objective assessment: rather like a guest book, they leave nice comments or say nothing at all.
The Panopticon effect suggests that rating systems are extremely effective in controlling the behaviour of service-providers, leading them to be customer-pleasing (sometimes extravagantly so) in order to avoid a damaging bad review.

If you are not a fan of computer models, or just have better things to do, here are my main conclusions, paraphrased: