Wok edited this page Mar 29, 2021 · 25 revisions

Welcome to the Steam-Bayesian-Average wiki!

Ideas

Acclaimed

The idea behind the concept of an "acclaimed" entity (developer or publisher) is that:

  • the distinction between different games is omitted,
  • the reviews for all the games by this entity are aggregated as if we considered a single game.

Reliable

The idea behind the concept of a "reliable" entity is that:

  • the distinction between different games is taken into account,
  • the game scores are aggregated.
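The difference between the two aggregation modes can be sketched with made-up numbers. This is only an illustration: the review counts are hypothetical, and the raw ratio of positive reviews stands in for the game score, which is a simplification.

```python
# Hypothetical review counts for one developer: (positive, total) per game.
games = [(9_000, 10_000), (30, 100)]


def pooled_ratio(games):
    """'Acclaimed' view: pool all reviews as if there were a single game."""
    positive = sum(p for p, _ in games)
    total = sum(t for _, t in games)
    return positive / total


def per_game_scores(games):
    """'Reliable' view: keep one score per game before aggregating."""
    return [p / t for p, t in games]


pooled = pooled_ratio(games)
scores = per_game_scores(games)
mean_score = sum(scores) / len(scores)

print(f"pooled ratio:       {pooled:.3f}")
print(f"mean of game scores: {mean_score:.3f}")
```

With these numbers, the pooled ratio is dominated by the game with many reviews, whereas the mean of game scores gives both games the same say.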

Established

The idea behind the concept of an "established" entity is that:

  • ranks have some inertia so that an entity does not fall in the ranking every time a new game is released (with few reviews),
  • to obtain this inertia and smooth the rank changes, game scores are weighted by their share of the entity's total number of reviews,
  • other than the inertia mentioned above, "established" and "reliable" are similar concepts.

Vocabulary

Acclaimed

The higher the ratio of positive reviews, and the more reviews, the more likely a game, a developer or a publisher is acclaimed.

Bayesian Average formula:

$$\bar{x} = \frac{C \, m + \sum_{i=1}^{n} x_i}{C + n}$$

where:

  • $x_i$ is 1 if review i is positive, 0 otherwise,
  • n is the number of reviews,
  • m is the prior mean, defined as the average of values for $x_i$ over all the data,
  • C is the prior dataset size, defined as the average of values for n over all the data.
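A minimal sketch of this computation, assuming the usual Bayesian average form (C m + sum of $x_i$) / (C + n). The review counts are made up, and the way the priors are derived below (m as the overall share of positive reviews, C as the average number of reviews per game) is one reading of the definitions above, not necessarily what the repository does.

```python
def bayesian_average(sum_x, n, prior_mean, prior_size):
    """Return (C * m + sum of x_i) / (C + n)."""
    return (prior_size * prior_mean + sum_x) / (prior_size + n)


# Hypothetical data: game -> (positive reviews, total reviews).
reviews = {"Game A": (95, 100), "Game B": (10, 10), "Game C": (700, 1000)}

total_positive = sum(p for p, _ in reviews.values())
total_reviews = sum(n for _, n in reviews.values())

# Assumed reading of the priors: m is the overall share of positive reviews,
# C is the average number of reviews per game.
prior_mean = total_positive / total_reviews
prior_size = total_reviews / len(reviews)

scores = {
    name: bayesian_average(p, n, prior_mean, prior_size)
    for name, (p, n) in reviews.items()
}
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
```

Game B has only positive reviews, but there are just 10 of them, so its score is pulled toward the prior mean and it ranks below Game A.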

Reliable

The higher the game scores, and the more released games, the more likely a developer or a publisher is reliable.

Bayesian Average formula:

$$\bar{x} = \frac{C \, m + \sum_{i=1}^{n} x_i}{C + n}$$

where:

  • $x_i$ is the score for game i,
  • n is the number of released games,
  • m is the prior mean, defined w.r.t. the sampled values for $x_i$,
  • C is the prior dataset size, defined w.r.t. the sampled values for n.
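As a worked example with made-up numbers (not taken from the rankings), assume the usual Bayesian average form (C m + sum of $x_i$) / (C + n), a developer with n = 2 games scored 0.9 and 0.8, a prior mean m = 0.7 and a prior size C = 3:

$$\bar{x} = \frac{3 \times 0.7 + 0.9 + 0.8}{3 + 2} = \frac{3.8}{5} = 0.76$$

The plain mean of the two scores would be 0.85. With only two released games, the estimate stays closer to the prior mean.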

Established

The more acclaimed its most-reviewed games, and the more reliable it is, the more likely a developer or a publisher is established.

Bayesian Average formula with weights:

$$\bar{x} = \frac{C \, m + \sum_{i=1}^{n} w_i \, x_i}{C + n}$$

where:

  • $x_i$ is the score for game i,
  • n is the number of released games,
  • m is the prior mean, defined w.r.t. the sampled values for $w_i x_i$,
  • C is the prior dataset size, defined w.r.t. the sampled values for n,
  • $w_i$ is the proportion of reviews which are tied to game i, scaled to n.

Weight formula:

$$w_i = n \, \frac{N_i}{\sum_{j=1}^{n} N_j}$$

where:

  • $N_i$ is the number of reviews for game i.
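The weighting can be sketched as follows, assuming the weights are $w_i = n N_i / \sum_j N_j$ (so that they sum to n) and that they enter the usual Bayesian average as (C m + sum of $w_i x_i$) / (C + n). The scores, review counts and prior values are hypothetical, and the priors are passed in rather than estimated from data.

```python
def review_weights(review_counts):
    """w_i = n * N_i / sum_j N_j, so that the weights sum to n."""
    n = len(review_counts)
    total = sum(review_counts)
    return [n * count / total for count in review_counts]


def weighted_bayesian_average(scores, review_counts, prior_mean, prior_size):
    """Return (C * m + sum of w_i * x_i) / (C + n)."""
    weights = review_weights(review_counts)
    n = len(scores)
    weighted_sum = sum(w * x for w, x in zip(weights, scores))
    return (prior_size * prior_mean + weighted_sum) / (prior_size + n)


# Assumed prior values, chosen only for illustration.
prior_mean, prior_size = 0.7, 3.0

# Hypothetical developer: one established hit, then a new release with few reviews.
before = weighted_bayesian_average([0.9], [10_000], prior_mean, prior_size)
after = weighted_bayesian_average([0.9, 0.5], [10_000, 50], prior_mean, prior_size)

# Same situation without weights, for comparison.
unweighted_after = (prior_size * prior_mean + 0.9 + 0.5) / (prior_size + 2)

print(f"before: {before:.3f}, after: {after:.3f}, unweighted: {unweighted_after:.3f}")
```

In this example the new game holds about 0.5% of the reviews, so its low score barely moves the weighted result, whereas the unweighted average drops noticeably. This is the inertia that the "established" ranking is meant to provide.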

Rankings

Here are rankings for:

FAQ

_Aseprite_ appears in the game ranking, although it is not a game. Is software supposed to be in this ranking?

I rely on the SteamSpy API, whose most generic request does not distinguish between games and software.

To check whether an app is a game, I would have to query the Steam API, which is rate-limited, or send a specific request to the SteamSpy API to determine the genres and categories of each app. I have done it before, but the cost-benefit analysis is usually terrible, and it would be overkill for this GitHub repository.

For a ranking, it would be easier to check only the entries at the top, say the top 250, just before displaying it. I might add this layer of complexity, but I am not too keen on doing so.
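The "check only the top of the ranking" idea could look like the sketch below. It is not part of the repository: `filter_top_entries` and `is_game` are hypothetical names, and `is_game` stands for whatever rate-limited lookup would be used in practice. The point is that the lookup is called only until enough entries have been kept.

```python
def filter_top_entries(ranking, is_game, top_k=250):
    """Keep the first top_k entries of the ranking that pass is_game.

    Entries are checked in ranking order, and checking stops as soon as
    top_k entries have been kept, which limits the number of lookups.
    """
    kept = []
    for app_id in ranking:
        if len(kept) == top_k:
            break
        if is_game(app_id):
            kept.append(app_id)
    return kept


# Stand-in for a real lookup: app 20 plays the role of a non-game.
checked = []


def fake_is_game(app_id):
    checked.append(app_id)
    return app_id != 20


top = filter_top_entries([10, 20, 30, 40, 50], fake_is_game, top_k=3)
print(top, checked)
```

Here only four of the five apps are looked up, because the third valid entry is found before the end of the list.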

I have not heard of devs at the top of the most acclaimed list.

"The most acclaimed" devs are more or less devs of "the most acclaimed" games. It is not directly computed that way, but both lists should be very similar. When there is one game with many reviews and an overwhelmingly positive appreciation overall, the dev of this game will most likely be near the top of the "acclaimed" ranking.

I believe we are more interested in reliability for devs and publishers: if there is a new game developed by this studio, how likely is it that it is good? Is the studio consistently releasing good games (with respect to what their customers expect)?

For developer links you could use the 'developer' parameter for the search, instead of the 'term' parameter. That way you avoid a bunch of extraneous hits.

I have followed your advice. However, the reason why I was using the more generic search form is that there are technical issues with some developers and publishers: SteamSpy provides the list of developers and publishers as a comma-separated string, so whenever a dev has a comma in its name, e.g. 'CAPCOM CO., LTD.', the suffix gets removed, and I don't have the correct URL.
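The comma issue can be reproduced in a few lines. The input string and the `search_url` helper are hypothetical, and the search URL pattern is an assumption based on the 'developer' parameter mentioned in the question.

```python
from urllib.parse import urlencode

# Hypothetical comma-separated field listing two developers.
raw = "CAPCOM CO., LTD., Valve"

# Naive split: the comma inside the first name is treated as a separator.
naive = [name.strip() for name in raw.split(",")]
print(naive)


def search_url(developer):
    """Build a store search URL with the 'developer' parameter (assumed pattern)."""
    return "https://store.steampowered.com/search/?" + urlencode({"developer": developer})


# The truncated name yields a URL for 'CAPCOM CO.' instead of 'CAPCOM CO., LTD.'.
print(search_url(naive[0]))
print(search_url("CAPCOM CO., LTD."))
```

The split produces three names instead of two, and the first one has lost its 'LTD.' suffix, so the developer link no longer matches the exact developer name.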

I feel like Steam reviews are almost irrelevant now: Steam reviews only take into account purchases on the Steam store. If I buy a game from Humble Store and activate it on Steam, then my review won't be taken into account. We rate games, not just the games bought on a certain store.

This is wrong. Here is a counter-example.

Only the scores shown at the top and at the bottom of a store page on Steam exclude reviews tied to other stores. The reviews still exist and are provided by both the Steam API and the SteamSpy API. So the data used for these rankings includes all the reviews.

[Figure omitted, panels: Top, Bottom]