Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimize PIDSet performance #839

Merged
merged 1 commit into from
Apr 14, 2023
Merged

Conversation

nbokovoy
Copy link
Contributor

This PR introduces several changes to the PIDSet implementation, resulting in improved performance and reduced memory allocations.

  • The lookup map in the PIDSet struct now uses a pidKey struct as a key instead of a string. This change eliminates the need for string formatting and thus reduces memory allocations.
  • The indexOf method now uses the lookup map to find the index of the specified PID directly, making it more efficient.
  • The Remove method has been optimized to avoid the creation of a new slice. Instead, it replaces the removed PID with the last PID in the pids slice and updates the lookup map accordingly.
  • The pidset_test.go file has been updated with a new test for the Remove method and more comprehensive benchmarks that cover different set sizes.

These changes should improve the overall performance of the PIDSet and reduce memory allocations, making the library more efficient and lightweight.

Benchmarks

Original implementation

goos: darwin
goarch: amd64
pkg: github.com/asynkron/protoactor-go/actor
cpu: Intel(R) Core(TM) i7-8750H CPU @ 2.20GHz
BenchmarkPIDSet
BenchmarkPIDSet/len_1
BenchmarkPIDSet/len_1-12         	 1375432	       740.3 ns/op	     144 B/op	       9 allocs/op
BenchmarkPIDSet/len_5
BenchmarkPIDSet/len_5-12         	 1594579	       721.9 ns/op	     144 B/op	       9 allocs/op
BenchmarkPIDSet/len_20
BenchmarkPIDSet/len_20-12        	 1581361	       732.4 ns/op	     144 B/op	       9 allocs/op
BenchmarkPIDSet/len_500
BenchmarkPIDSet/len_500-12       	  923178	      1169 ns/op	     144 B/op	       8 allocs/op
BenchmarkPIDSet/len_1000
BenchmarkPIDSet/len_1000-12      	  914745	      1112 ns/op	     144 B/op	       8 allocs/op
BenchmarkPIDSet/len_10000
BenchmarkPIDSet/len_10000-12     	   44992	     32307 ns/op	     142 B/op	       8 allocs/op
BenchmarkPIDSet/len_100000
BenchmarkPIDSet/len_100000-12    	    4071	    343884 ns/op	      96 B/op	       6 allocs/op

Updated implementation

goos: darwin
goarch: amd64
pkg: github.com/asynkron/protoactor-go/actor
cpu: Intel(R) Core(TM) i7-8750H CPU @ 2.20GHz
BenchmarkPIDSet
BenchmarkPIDSet/len_1
BenchmarkPIDSet/len_1-12         	 4889827	       247.1 ns/op	       0 B/op	       0 allocs/op
BenchmarkPIDSet/len_5
BenchmarkPIDSet/len_5-12         	 4651048	       258.0 ns/op	       0 B/op	       0 allocs/op
BenchmarkPIDSet/len_20
BenchmarkPIDSet/len_20-12        	 4052954	       287.8 ns/op	       0 B/op	       0 allocs/op
BenchmarkPIDSet/len_500
BenchmarkPIDSet/len_500-12       	 4052511	       312.5 ns/op	       0 B/op	       0 allocs/op
BenchmarkPIDSet/len_1000
BenchmarkPIDSet/len_1000-12      	 3820195	       273.4 ns/op	       0 B/op	       0 allocs/op
BenchmarkPIDSet/len_10000
BenchmarkPIDSet/len_10000-12     	 3478447	       301.7 ns/op	       0 B/op	       0 allocs/op
BenchmarkPIDSet/len_100000
BenchmarkPIDSet/len_100000-12    	 3048384	       379.0 ns/op	       0 B/op	       0 allocs/op

The downside of these changes is that Remove() method now changes the order of elements in the set. It can affect distribution for the round-robin router. But as I can see router usually has a fixed set of routees so It should be fine.

I need this fix because I have a registry actor which can have up to 100k children that spawns and destroys dynamically. And method removeChild causes a significant load.

CPU Profile of my production application image

* Replace string keys with pidKey struct to save memory
* Improve indexOf efficiency using the lookup map
* Refactor Remove to avoid new slice, update lookup map
* Add Remove method test in pidset_test.go
* Expand benchmarks in pidset_test.go for varied set sizes
These changes enhance performance and lower memory usage in PIDSet.
@rogeralsing
Copy link
Collaborator

Looks great!

@rogeralsing rogeralsing merged commit 22ab527 into asynkron:dev Apr 14, 2023
@nbokovoy nbokovoy deleted the pidset-lookup branch April 14, 2023 12:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants