Reduce unnecessary GreedyPerfPartitioner calls from MemoryBalancedPartitioner (#2914)
Summary:
Pull Request resolved: #2914
MemoryBalancedPartitioner works by adjusting the max memory on devices and calling GreedyPerfPartitioner repeatedly. The max memory is adjusted with a binary search procedure to identify a more memory efficient plan than what GreedyPerfPartitioner gives by default.
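The loop described above can be sketched roughly as follows. This is a toy illustration, not the TorchRec implementation: `first_fit_partition` is a hypothetical stand-in for `GreedyPerfPartitioner` (it simply packs shards onto the first device with room), and `memory_balanced_search`, `lo`, `hi`, and `search_count` as used here are assumed names for the sketch.

```python
from typing import List, Optional


def first_fit_partition(
    shard_hbm: List[float], world_size: int, max_hbm: float
) -> Optional[List[float]]:
    """Hypothetical stand-in for a greedy partitioner.

    Places shards (largest first) on the first device that still has room
    under max_hbm. Returns per-device usage, or None if a shard cannot fit.
    """
    usage = [0.0] * world_size
    for size in sorted(shard_hbm, reverse=True):
        for rank in range(world_size):
            if usage[rank] + size <= max_hbm:
                usage[rank] += size
                break
        else:
            return None  # no device can hold this shard under the cap
    return usage


def memory_balanced_search(
    shard_hbm: List[float],
    world_size: int,
    lo: float,
    hi: float,
    search_count: int,
) -> Optional[List[float]]:
    """Binary-search the per-device memory cap and keep the plan with the
    lowest max usage. Each iteration is one call to the greedy partitioner."""
    best = first_fit_partition(shard_hbm, world_size, hi)  # default plan
    for _ in range(search_count):
        mid = (lo + hi) / 2
        plan = first_fit_partition(shard_hbm, world_size, mid)
        if plan is None:
            lo = mid  # cap too tight: raise the lower boundary
        else:
            hi = mid  # feasible: try an even tighter cap
            if best is None or max(plan) < max(best):
                best = plan
    return best


if __name__ == "__main__":
    plan = memory_balanced_search([40, 10, 10, 20], 2, lo=40, hi=80, search_count=10)
    print(plan)
```

In this toy setup each search step costs one partitioner call, which is why tighter starting boundaries translate directly into either better plans for the same `search_count` or the same plans with fewer calls.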
The search boundaries used by the binary search were looser than necessary; this diff tightens them.
* **Upper bound:**
  * **Before:** Max device HBM (e.g. 80 GB)
  * **After:** Max HBM usage of the default plan, since there is no point in searching for plans that use more max memory than the default plan does.
* **Lower bound:**
  * **Before:** [Avg. HBM per Device] = [Total HBM Needed Across All Shards] / [World Size]
  * **After:** max([Avg. HBM per Device], [Max HBM Needed Across All Shards]). A feasible solution requires at least the HBM that the biggest shard needs, so there is no point in searching below that.
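As a rough sketch of the boundary computation described in this list (assumed names throughout; `search_bounds`, `shard_hbm`, and `default_plan_device_hbm` are hypothetical and not taken from the TorchRec code):

```python
from typing import List, Tuple


def search_bounds(
    shard_hbm: List[float],
    world_size: int,
    default_plan_device_hbm: List[float],
) -> Tuple[float, float]:
    """Return (lower, upper) boundaries for the max-memory binary search.

    shard_hbm: HBM needed by each shard (hypothetical input).
    default_plan_device_hbm: per-device HBM usage of the default plan.
    """
    avg_hbm_per_device = sum(shard_hbm) / world_size
    # No feasible plan can cap a device below the biggest shard's needs.
    lower = max(avg_hbm_per_device, max(shard_hbm))
    # No point searching above what the default plan already uses.
    upper = max(default_plan_device_hbm)
    return lower, upper


if __name__ == "__main__":
    # Four shards on four devices; the default plan peaks at 50 on one device.
    print(search_bounds([40, 10, 10, 20], 4, [50, 30, 0, 0]))
```

With these inputs the average is 20 but the biggest shard needs 40, so the old lower boundary of 20 would have spent search steps on caps that can never be feasible.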
These changes can help in two ways:
1. The search procedure is more efficient, leading to plans with lower max memory.
2. We can reduce `search_count` and still get plans comparable to before, while calling `GreedyPerfPartitioner` fewer times from `MemoryBalancedPartitioner`.
Without further changes, the default impact comes from item 1 and should be a marginal improvement in max memory.
Reviewed By: iamzainhuda
Differential Revision: D73598477
fbshipit-source-id: 64b001de5a84e5f24afec9684b4602bcbe694e59