Description
Some programming languages allow the user to enumerate all heap objects or a subset of them.
Examples
Ruby
# Enumerate all objects and print their classes
ObjectSpace.each_object() {|x| puts x.class }
# Enumerate all strings and print them
ObjectSpace.each_object(String) {|x| puts x }
# Return all reachable objects from obj
ObjectSpace.reachable_objects_from(obj)
# Return all reachable objects from roots
ObjectSpace.reachable_objects_from_root
(See: https://docs.ruby-lang.org/en/master/ObjectSpace.html#method-c-each_object )
JVM TI
FollowReference
(visit reachable objects from a given object)IterateThroughHeap
(traverse the whole heap)GetObjectsWithTags
(return objects that have a given tag)
Can full-heap traversal visit "dead" objects?
Ruby's ObjectSpace.each_object
Ruby's documentation for ObjectSpace.each_object
says
Calls the block once for each living, nonimmediate object in this Ruby process.
But here "living" seems to mean "not yet collected by the GC" because a mere function call is not able to determine whether the object is reachable from any roots. The following program shows that the Foo
instance can still be enumerated until GC is triggered.
puts "Hello!"
class Foo
def initialize(x)
@x = x
end
attr_accessor :x
end
a = Foo.new(42)
puts "Set!"
ObjectSpace.each_object(Foo) {|f| puts f.x}
a = nil
puts "Cleared!"
ObjectSpace.each_object(Foo) {|f| puts f.x}
GC.start
puts "Garbage collected!"
ObjectSpace.each_object(Foo) {|f| puts f.x}
puts "Goodbye!"
result:
Hello!
Set!
42
Cleared!
42
Garbage collected!
Goodbye!
JVM TI
IterateThroughHeap
may reach dead but not reclaimed objects, too. JVM TI doc for IterateThroughHeap
:
Initiate an iteration over all objects in the heap. This includes both reachable and unreachable objects. Objects are visited in no particular order.
What happens if objects are allocated during traversal?
Ruby
The interaction is mysterious. If any object of the same type
is created in the block of ObjectSpace.each_object(type) {|x| ... }
, the newly created objects may or may not be visited.
Example:
puts "Hello!"
class Foo
def initialize(x)
@x = x
end
attr_accessor :x
end
upper_bound = ARGV[0].to_i # command-line argument
(1..upper_bound).each do |i|
Foo.new(i)
end
puts "Objects created."
ObjectSpace.each_object(Foo) {|f| puts f.x}
puts "Iterate and create new objects..."
i = -1
ObjectSpace.each_object(Foo) do |f|
puts f.x
puts "Triggering GC..."
c = Foo.new(i)
d = Foo.new(i - 1)
i = i - 2
end
puts "Goodbye!"
Given the same command line argument, the result may even vary between consecutive executions of the program.
JVM TI
JVM TI guarantees the heap state (including objects and field values) is not changed during traversal. The following paragraph exists in the JVM TI documentation of both FollowReference
and IterateThroughHeap
.
During the execution of this function the state of the heap does not change: no objects are allocated, no objects are garbage collected, and the state of objects (including held values) does not change. As a result, threads executing Java programming language code, threads attempting to resume the execution of Java programming language code, and threads attempting to execute JNI functions are typically stalled.
What happens if GC is triggered during iteration?
Ruby
It is undocumented. ObjectSpace.each_object
does not prevent GC.start
to be called in the block. But it seems that calling GC.start
will remove dead objects immediately. As a result, the ObjectSpace.each_object
method will only visit some dead objects but not others.
puts "Hello!"
class Foo
def initialize(x)
@x = x
end
attr_accessor :x
end
live_root1 = Foo.new("This one lives")
100.times do |j|
Foo.new(j)
end
live_root2 = Foo.new("This one lives, too")
i = 1
ObjectSpace.each_object(Foo) do |obj|
puts "#{i}: #{obj.x}"
if i % 5 == 0
GC.start
end
i = i + 1
end
puts "Goodbye!"
GC will be triggered after visiting 5 objects. The program will visit 5 to 7 objects, depending on whether the two Foo
held by live roots are visited in the first five iterations.
JVM TI
As previously mentioned, the heap state does not change during iteration.
The call-back function of FollowReference
and IterateThroughHeap
is not allowed to call any JNI functions. The JVM TI function ForceGarbageCollection
is not "Callback Safe", either. So it is impossible to trigger GC in the callback.
The GetObjectsWithTags
function does not have any call-backs. It returns an array of object references.
Implementation
One obvious way to implement this feature is using the valid-object (VO) bit to scan each space. However, if a space is sparse, it may be helpful to narrow down the region of scanning by using space-specific metadata. In this way, we only need to scan regions (blocks or lines) actually occupied by objects.
PR #1174 implements heap traversal by scanning the VO bits. For block-based spaces, it only scans blocks occupied by objects. For LOS, objects are simply enumerated from the treadmill.