Cookie CATS is a hugely popular mobile puzzle game developed by Tactile Entertainment. It's a classic "connect three"-style puzzle game where the player must connect tiles of the same color to clear the board and win the level. It also features singing cats. We're not kidding! Check out this short demo:
As players progress through the levels of the game, they will occasionally encounter gates that force them to wait a non-trivial amount of time or make an in-app purchase to progress. In addition to driving in-app purchases, these gates serve the important purpose of giving players an enforced break from playing the game, hopefully resulting in that the player's enjoyment of the game being increased and prolonged.
But where should the gates be placed? Initially the first gate was placed at level 30, but in this project we're going to analyze an AB-test where we moved the first gate in Cookie Cats from level 30 to level 40. In particular, we will look at the impact on player retention.
The data we have is from 90,189 players that installed the game while the AB-test was running. The variables are:
userid
- a unique number that identifies each player.version
- whether the player was put in the control group (gate_30
- a gate at level 30) or the group with the moved gate (gate_40
- a gate at level 40).sum_gamerounds
- the number of game rounds played by the player during the first 14 days after install.retention_1
- did the player come back and play 1 day after installing?retention_7
- did the player come back and play 7 days after installing?
When a player installed the game, he or she was randomly assigned to either gate_30
or gate_40
. As a sanity check, let's see if there are roughly the same number of players in each AB group.
It appears that there was a slight decrease in 1-day retention when the gate was moved to level 40 (44.2%) compared to the control when it was at level 30 (44.8%). It's a small change, but even small changes in retention can have a large impact. But while we are certain of the difference in the data, how certain should we be that a gate at level 40 will be worse in the future?
There are a couple of ways we can get at the certainty of these retention numbers. Here we will use bootstrapping: We will repeatedly re-sample our dataset (with replacement) and calculate 1-day retention for those samples. The variation in 1-day retention will give us an indication of how uncertain the retention numbers are: