@@ -208,70 +208,51 @@ limit, I propose the following calculation:
- Where ![`O_M`](48409/inl3.png) is (per `runtime/metrics` memory names)
-
- ```
- /memory/classes/metadata/mcache/free:bytes +
- /memory/classes/metadata/mcache/inuse:bytes +
- /memory/classes/metadata/mspan/free:bytes +
- /memory/classes/metadata/mspan/inuse:bytes +
- /memory/classes/metadata/other:bytes +
- /memory/classes/os-stacks:bytes +
- /memory/classes/other:bytes +
- /memory/classes/profiling/buckets:bytes
- ```
-
- and ![`O_I`](48409/inl4.png) is the maximum of
- `/memory/classes/heap/unused:bytes + /memory/classes/heap/free:bytes` over the
- last GC cycle.
-
- These terms (called ![`O`](48409/inl5.png), for "overheads") account for all
- memory that is not accounted for by the GC pacer (from the [new pacer
- proposal](https://github.com/golang/proposal/blob/329650d4723a558c2b76b81b4995fc5c267e6bc1/design/44167-gc-pacer-redesign.md#heap-goal)).
+ `T` is the total amount of memory mapped by the Go runtime.
+ `F` is the amount of free and unscavenged memory the Go
+ runtime is holding.
+ `A` is the number of bytes in allocated heap objects at the
+ time `\hat{L}` is computed.
+
+ The second term, `(T - F - A)`, represents the sum of
+ non-heap overheads.
+ Free and unscavenged memory is specifically excluded because this is memory that
+ the runtime might use in the near future, and the scavenger is specifically
+ instructed to leave the memory up to the heap goal unscavenged.
+ Failing to exclude free and unscavenged memory could lead to a very poor
+ accounting of non-heap overheads.
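Putting the pieces above together, and writing `L` for the configured memory limit, the conversion can be summarized as follows. This is a sketch of the intended relationship between the quantities defined above, not a quotation of the proposal's exact equation:

```
\hat{L} = L - (T - F - A)
```

In words: the heap limit is the memory limit minus every mapped byte that is neither free-and-unscavenged memory nor a live heap object, that is, minus the non-heap overheads.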
With `\hat{L}` fully defined, our heap goal for cycle
- ![`n`](48409/inl6.png) (![`N_n`](48409/inl7.png)) is a straightforward extension
+ `n` (`N_n`) is a straightforward extension
of the existing one.

Where
- * ![`M_n`](48409/inl8.png) is equal to bytes marked at the end of GC n's mark
+ * `M_n` is equal to bytes marked at the end of GC n's mark
  phase
- * ![`S_n`](48409/inl9.png) is equal to stack bytes at the beginning of GC n's
+ * `S_n` is equal to stack bytes at the beginning of GC n's
  mark phase
- * ![`G_n`](48409/inl10.png) is equal to bytes of globals at the beginning of GC
+ * `G_n` is equal to bytes of globals at the beginning of GC
  n's mark phase
- * ![`\gamma`](48409/inl11.png) is equal to
- ![`1+\frac{GOGC}{100}`](48409/inl12.png)
+ * `\gamma` is equal to
+ `1+\frac{GOGC}{100}`

then
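As a sketch of the resulting goal, assuming the existing goal keeps the shape described in the pacer redesign (the live heap plus GOGC percent of the live heap, stacks, and globals), the extension simply caps that value at `\hat{L}`:

```
N_n = \min\bigl(M_{n-1} + (\gamma - 1)(M_{n-1} + S_n + G_n),\; \hat{L}\bigr)
```

Whenever the GOGC-derived term would exceed the room left under the memory limit, the limit wins and the GC runs earlier.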
- Over the course of a GC cycle ![`O_M`](48409/inl3.png) remains stable because it
- increases monotonically.
- There's only one situation where ![`O_M`](48409/inl3.png) can grow tremendously
- (relative to active heap objects) in a short period of time (< 1 GC cycle), and
- that's when `GOMAXPROCS` increases.
- So, I also propose recomputing this value at that time.
-
- Meanwhile ![`O_I`](48409/inl4.png) stays relatively stable (and doesn't have a
- sawtooth pattern, as one might expect from a sum of idle heap memory) because
- object sweeping occurs incrementally, specifically proportionally to how fast
- the application is allocating.
- Furthermore, this value is guaranteed to stay relatively stable across a single
- GC cycle, because the total size of the heap for one GC cycle is bounded by the
- heap goal.
- Taking the high-water mark of this value places a conservative upper bound on the
- total impact of this memory, so the heap goal stays safe from major changes.
-
- One concern with the above definition of ![`\hat{L}`](48409/inl1.png) is that it
- is fragile to changes to the Go GC.
- In the past, seemingly unrelated changes to the Go runtime have impacted the
- GC's pacer, usually due to an unforeseen influence on the accounting that the
- pacer relies on.
- To minimize the impact of these accidents on the conversion function, I propose
- centralizing and categorizing all the variables used in accounting, and writing
- tests to ensure that expected properties of the accounting remain intact.
+ Over the course of a GC cycle, non-heap overheads remain stable because they
+ mostly increase monotonically.
+ However, the GC needs to be responsive to any change in non-heap overheads.
+ Therefore, I propose a more heavy-weight recomputation of the heap goal every
+ time it's needed, as opposed to computing it only once per cycle.
+ This also means the GC trigger point needs to be dynamically recomputable.
+ This recomputation will create additional overheads, but they're likely to be
+ low, as the GC's internal statistics are updated only on slow paths.
+
+ The nice thing about this definition of `\hat{L}` is that
+ it's fairly robust to changes to the Go GC, since total mapped memory, free and
+ unscavenged memory, and bytes allocated in objects are fairly fundamental
+ properties (especially of any tracing GC design).
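To illustrate what recomputing the goal on demand might look like, here is a hypothetical sketch in Go. The struct, field, and function names are invented for illustration and do not correspond to the runtime's actual internals; the GOGC-based term assumes the goal shape from the pacer redesign.

```go
package pacer

// gcState collects the statistics the pacer would need to recompute the
// heap goal on demand. All names here are illustrative, not the runtime's.
type gcState struct {
	memoryLimit          uint64 // user-configured soft memory limit, in bytes
	gogc                 uint64 // value of GOGC
	heapMarked           uint64 // bytes marked live by the previous GC
	stackBytes           uint64 // stack bytes at the start of the mark phase
	globalsBytes         uint64 // global variable bytes at the start of the mark phase
	mappedBytes          uint64 // total memory mapped by the runtime
	freeUnscavengedBytes uint64 // free and unscavenged memory held by the runtime
	heapObjectBytes      uint64 // bytes in allocated heap objects
}

// heapGoal recomputes the effective heap goal from current statistics every
// time it is called, instead of caching it once per cycle, so that changes
// in non-heap overheads are reflected promptly.
func (c *gcState) heapGoal() uint64 {
	// Goal implied by GOGC: live heap plus GOGC% of live heap, stacks, and globals.
	gammaGoal := c.heapMarked + (c.heapMarked+c.stackBytes+c.globalsBytes)*c.gogc/100

	// Non-heap overheads: everything mapped that is neither free-and-unscavenged
	// memory nor a heap object.
	overheads := c.mappedBytes - c.freeUnscavengedBytes - c.heapObjectBytes

	// Goal implied by the memory limit: the room left after overheads.
	limitGoal := uint64(0)
	if c.memoryLimit > overheads {
		limitGoal = c.memoryLimit - overheads
	}

	if limitGoal < gammaGoal {
		return limitGoal
	}
	return gammaGoal
}
```

Because nothing here is cached across calls, a change in non-heap overheads or in `GOMAXPROCS` shows up the next time the goal (or the trigger derived from it) is consulted.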
#### Death spirals
@@ -322,7 +303,7 @@ large enough to accommodate worst-case pause times but not too large such that a
more than about a second.
1 CPU-second per `GOMAXPROCS` seems like a reasonable place to start.

- Unfortunately, 50% is not a reasonable choice for small values of `GOGC`.
+ Unfortunately, 50% is not always a reasonable choice for small values of `GOGC`.
Consider an application running with `GOGC=10`: an overall 50% GC CPU
utilization limit for `GOGC=10` is likely going to be always active, leading to
significant overshoot.
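To make the leaky bucket concrete, here is a hypothetical sketch of such an accounting structure with a 50% utilization target and a capacity of 1 CPU-second per `GOMAXPROCS`. The names and the exact fill/drain rule are illustrative assumptions, not the runtime's implementation.

```go
package gclimit

// cpuLimiter is an illustrative leaky-bucket accounting of GC CPU time.
type cpuLimiter struct {
	fill     int64 // nanoseconds of "unpaid" GC CPU time currently in the bucket
	capacity int64 // bucket capacity: 1 CPU-second per GOMAXPROCS, in nanoseconds
}

func newCPULimiter(gomaxprocs int) *cpuLimiter {
	return &cpuLimiter{capacity: int64(gomaxprocs) * 1_000_000_000}
}

// accumulate records gcNanos of GC CPU time and mutatorNanos of mutator CPU
// time. With a 50% utilization target, each nanosecond of mutator time "pays
// for" one nanosecond of GC time, so the bucket fills with the difference.
func (l *cpuLimiter) accumulate(gcNanos, mutatorNanos int64) {
	l.fill += gcNanos - mutatorNanos
	if l.fill < 0 {
		l.fill = 0
	}
	if l.fill > l.capacity {
		l.fill = l.capacity
	}
}

// limiting reports whether the bucket is full, meaning GC CPU usage has been
// pinned at the 50% target for roughly the last CPU-second per P.
func (l *cpuLimiter) limiting() bool {
	return l.fill >= l.capacity
}
```

When `limiting` reports true, the GC would stop taking CPU beyond its share and let memory use float above the goal instead, which is exactly the overshoot risk discussed above for small `GOGC` values.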
@@ -359,22 +340,13 @@ use approaches the limit.
I propose it does so using a proportional-integral controller whose input is the
difference between the memory limit and the memory used by Go, and whose output
is the CPU utilization target of the background scavenger.
- The output will be clamped at a minimum of 1% and a maximum of 10% overall CPU
- utilization.
- Note that the 10% is chosen arbitrarily; in general, returning memory to the
- platform is nowhere near as costly as the GC, but the number must be chosen such
- that the mutator still has plenty of room to make progress (thus, I assert that
- 40% of CPU time is enough).
- In order to make the scavenger scale to overall CPU utilization effectively, it
- requires some improvements to avoid the aforementioned locking issues it deals
- with today.
-
- Any CPU time spent in the scavenger should also be accounted for in the leaky
- bucket algorithm described in the [Death spirals](#death-spirals) section as GC
- time; however, I don't think it should be throttled in the same way.
- The intuition behind that is that returning memory to the platform is generally
- going to be more immediately fruitful than spending more time in garbage
- collection.
+ This will make the background scavenger more reliable.
+
+ However, the background scavenger likely won't return memory to the OS promptly
+ enough to maintain the memory limit, so in addition, I propose having span
+ allocations eagerly return memory to the OS to stay under the limit.
+ The time a goroutine spends on this eager return will also count toward the 50%
+ GC CPU limit described in the [Death spirals](#death-spirals) section.
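As an illustration of the controller described above, here is a hypothetical proportional-integral controller in Go whose input is the gap between memory used and the memory limit and whose output is a CPU utilization target for the background scavenger. The gains, bounds, and names are placeholders, not values from the proposal.

```go
package scavenge

// piController is an illustrative proportional-integral controller.
type piController struct {
	kp, ki   float64 // proportional and integral gains (to be tuned)
	integral float64 // accumulated error term
	min, max float64 // bounds on the output utilization target
}

// next takes the current memory used by Go and the memory limit (in bytes)
// and returns the background scavenger's CPU utilization target, e.g. 0.01
// for 1% of total CPU time.
func (c *piController) next(memoryUsed, memoryLimit uint64) float64 {
	// Positive error means we are over the limit; negative means under it.
	err := (float64(memoryUsed) - float64(memoryLimit)) / float64(memoryLimit)

	out := c.kp*err + c.ki*(c.integral+err)

	// Clamp the output so the scavenger always makes a little progress but
	// can never crowd out the mutator. Only accumulate the integral term
	// when the output is not saturated, as a simple anti-windup measure.
	switch {
	case out < c.min:
		out = c.min
	case out > c.max:
		out = c.max
	default:
		c.integral += err
	}
	return out
}
```

The overall shape is what matters: as memory use closes in on and crosses the limit, the error rises and so does the scavenger's CPU target, up to its maximum.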
#### Alternative approaches considered
@@ -418,38 +390,13 @@ go beyond the spans already in-use.
##### Returning memory to the platform
- A potential issue with the proposed design is that because the scavenger is
- running in the background, it may not react readily to spikes in memory use that
- exceed the limit.
-
- In contrast, [TCMalloc](#tcmalloc) searches for memory to return eagerly if an
- allocation would exceed the limit.
- In the Go 1.13 cycle, I attempted a similar policy when first implementing the
- scavenger, and found that it could cause unacceptable tail latency increases in
- some applications.
- While that policy certainly tried to return memory back to the platform
- significantly more often than it would be in this case, it still has a couple of
- downsides:
- 1. It introduces latency.
- The background scavenger can be more efficiently time-sliced in between other
- work, so it generally should only impact throughput.
- 1. It's much more complicated to bound the total amount of time spent searching
- for and returning memory to the platform during an allocation.
-
- The key insight as to why this policy works just fine for TCMalloc and won't
- work for Go comes from a fundamental difference in design.
- Manual memory allocators are typically designed to have a LIFO-style memory
- reuse pattern.
- Once an allocation is freed, it is immediately available for reallocation.
- In contrast, most efficient tracing garbage collection algorithms require a
- FIFO-style memory reuse pattern, since allocations are freed in bulk.
- The result is that the page allocator in a garbage-collected memory allocator is
- accessed far more frequently than in a manual memory allocator, so this path
- will be hit a lot harder.
-
- For the purposes of this design, I don't believe the benefits of eager return
- outweigh the costs, and I do believe that the proposed design is good enough for
- most cases.
+ If returning memory to the OS eagerly becomes a significant performance issue, a
+ reasonable alternative could be to crank up the background scavenger's CPU usage
+ in response to growing memory pressure.
+ This needs more thought, but given that the scavenger would now be governed by a
+ controller, its CPU usage will be more reliable, and this is an option we can
+ keep in mind.
+ One benefit of this option is that it may have a less pronounced impact on latency.
### Documentation
@@ -513,6 +460,11 @@ decides to shrink the heap space used; more recent implementations (e.g. G1) do
so more rarely, except when [the application is
idle](https://openjdk.java.net/jeps/346).
+ Some JVMs are "container aware" and read the memory limits of their containers
+ to stay under the limit.
+ This behavior is closer to what is proposed in this document, but I do not
+ believe that limit is directly configurable in the way the one proposed here is.
+
### SetMaxHeap

For nearly 4 years, the Go project has been trialing an experimental API in the