Skip to content

Simplify moran mutate function #1460

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 2 commits into from

Conversation

gaffney2010
Copy link
Member

Makes a few changes for readability:

  1. Handle the non-mutation case as an early exit.
  2. Don't handle the 0-mutation-rate case separately, as this will simply always result in an early exit.
  3. Draw directly from the initial population, as commented, instead of making mutation_targets in the init.

Many tests are failing because we've changed how many random numbers get drawn and when. If the code looks good to a reviewer, I'll update the tests.

Comment on lines -221 to +207
if self.mutation_rate > 0:
# Choose another strategy at random from the initial population
r = self._random.random()
if r < self.mutation_rate:
s = str(self.players[index])
j = self._random.randrange(0, len(self.mutation_targets[s]))
p = self.mutation_targets[s][j]
return p.clone()
# Just clone the player
return self.players[index].clone()
if self._random.random() > self.mutation_rate:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe but I might be remembering incorrectly that we had it this way to avoid sampling a random number if no such sample was needed. I think this helped with performance.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did some quick tests, and I'm seeing that drawing a random number takes 1.4 microseconds, while an if takes 2.1 microseconds. We actually reduce an if when mutation_rate = 0, so we may be saving time. But moreover, it doesn't seem like a meaningful optimization.

I'm not sure, though. Should we add a performance test?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure, though. Should we add a performance test?

Yeah sounds like a good idea to be sure. Running a reasonably sized tournament perhaps? (I have played around with https://pypi.org/project/pytest-benchmark/ which is a neat way to get benchmarks but I have no idea if this is the sort of thing you were thinking: whatever you think is best :) -- no need to do a huge amount for this PR)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that the class now uses the bulk random number generator, which should be faster at scale. This came later and may have altered the tradeoffs.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will add a benchmark in another change, coming soon.

@@ -122,20 +122,6 @@ def __init__(
self._random = RandomGenerator(seed=seed)
self._bulk_random = BulkRandomGenerator(self._random.random_seed_int())
self.set_players()
# Build the set of mutation targets
Copy link
Member

@marcharper marcharper Jan 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code was first added here. The idea is that a player can't mutate into the same type if the random pull dictates a mutation (that's the case if the if clause fails). Typically in the literature for the Moran process when mutating one only chooses from the set of initial types (regardless of their prevalence). So the point of the mutation_targets dict is to (1) remove duplicate types from the initial population, and (2) remove the current type. Later, when you want to mutate a player, you can just choose uniformly at random from the mutation dictionary's possibilities, i.e. we've cached all the logic for the choices. (Note this assumes that no new player types are formed, that's a different kind of mutation.)

This is not well explained in the comments, and also mutation_targets might be better named mutation_possibilities or something else. In any case I think the logic of the current code is correct.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, I missed a subtlety that this was deduping initial_strategies. I changed the code to create a single list of mutation_targets and leave the responsibility of choosing a different type to the mutate function. A little closer now to the original code, but still (imo) more easily to understand.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But isn't this less efficient? If the number of initial types is small (which is commonly the case) then the updated code may call for a lot of extra random numbers. The existing code removes the need to check for the same type. Agreed that it could use some cleaning up and clarification though.

The algorithm is supposed to, for given mutation rate \mu, chose another type at random with probability \mu / (n-1) for the other (n-1) types and (1-\mu) for the same type. Maybe there's a simpler/clearer way to do this without sacrificing efficiency.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At most it would double (on avg) the number of randoms drawn. My gut says this is insignificant. I will add a performance test to see.

@gaffney2010
Copy link
Member Author

Top table before change, bottom table after change.
image

Looks like there is significant performance regression. I will close the PR. Will follow-up with:

  • Performance script
  • Potentially some comments here.

@gaffney2010 gaffney2010 closed this Jan 8, 2025
@marcharper
Copy link
Member

Thanks for looking into it, let's definitely update the comments to better explain what's happening and why. And maybe exiting early on mutation_rate == 0 will improve the readability a bit.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants