API for enumerating objects #795

Open
@wks

Some programming languages allow the user to enumerate all heap objects or a subset of them.

Examples

Ruby

# Enumerate all objects and print their classes
ObjectSpace.each_object() {|x| puts x.class }

# Enumerate all strings and print them
ObjectSpace.each_object(String) {|x| puts x }

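# Return all reachable objects from obj
# (this method and the next one are MRI-specific and need the objspace extension)
require 'objspace'
ObjectSpace.reachable_objects_from(obj)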

# Return all reachable objects from roots
ObjectSpace.reachable_objects_from_root

(See: https://docs.ruby-lang.org/en/master/ObjectSpace.html#method-c-each_object )

JVM TI

  • FollowReferences (visit reachable objects from a given object)
  • IterateThroughHeap (traverse the whole heap)
  • GetObjectsWithTags (return objects that have a given tag)

Can full-heap traversal visit "dead" objects?

Ruby's ObjectSpace.each_object

Ruby's documentation for ObjectSpace.each_object says

Calls the block once for each living, nonimmediate object in this Ruby process.

But here "living" seems to mean "not yet collected by the GC", because a mere method call cannot determine whether an object is reachable from any root. The following program shows that the Foo instance can still be enumerated after it becomes unreachable, until GC is triggered.

puts "Hello!"

class Foo
  def initialize(x)
    @x = x
  end
  attr_accessor :x
end

a = Foo.new(42)
puts "Set!"
ObjectSpace.each_object(Foo) {|f| puts f.x}

a = nil
puts "Cleared!"
ObjectSpace.each_object(Foo) {|f| puts f.x}

GC.start
puts "Garbage collected!"
ObjectSpace.each_object(Foo) {|f| puts f.x}

puts "Goodbye!"

Result:

Hello!
Set!
42
Cleared!
42
Garbage collected!
Goodbye!

JVM TI

IterateThroughHeap may also visit objects that are dead but not yet reclaimed. The JVM TI documentation for IterateThroughHeap says:

Initiate an iteration over all objects in the heap. This includes both reachable and unreachable objects. Objects are visited in no particular order.

What happens if objects are allocated during traversal?

Ruby

The behavior is unpredictable. If objects of the same type are created inside the block of ObjectSpace.each_object(type) {|x| ... }, the newly created objects may or may not be visited.

Example:

puts "Hello!"

class Foo
  def initialize(x)
    @x = x
  end
  attr_accessor :x
end

upper_bound = ARGV[0].to_i # command-line argument

(1..upper_bound).each do |i|
  Foo.new(i)
end

puts "Objects created."

ObjectSpace.each_object(Foo) {|f| puts f.x}

puts "Iterate and create new objects..."

i = -1

ObjectSpace.each_object(Foo) do |f|
  puts f.x
  puts "Allocating new objects..."
  c = Foo.new(i)
  d = Foo.new(i - 1)
  i = i - 2
end

puts "Goodbye!"

Given the same command-line argument, the output may vary even between consecutive runs of the program.
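One way to sidestep this unpredictability in user code is to copy the enumerated objects into an array first and then iterate over the array. The following is a minimal sketch under that assumption, not a recommendation from the Ruby documentation. It relies on ObjectSpace.each_object returning an enumerator when called without a block. The names keep, snapshot and visited are made up for illustration.

```ruby
class Foo
  attr_reader :x

  def initialize(x)
    @x = x
  end
end

# Hold references so that these instances stay live.
keep = (1..5).map { |i| Foo.new(i) }

# Copy the current set of Foo instances into an array before doing any work.
# The array also keeps every snapshotted object alive while we iterate.
snapshot = ObjectSpace.each_object(Foo).to_a

visited = 0
snapshot.each do |f|
  visited += 1
  Foo.new(-f.x) # allocating here cannot change what `snapshot` contains
end

puts "snapshot: #{snapshot.size}, visited: #{visited}"
```

The snapshot may still contain dead objects that had not been reclaimed when it was taken, so its size can differ between runs. What becomes stable is that the loop visits exactly the objects in the snapshot, regardless of allocation inside the loop.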

JVM TI

JVM TI guarantees that the heap state (including objects and field values) does not change during traversal. The following paragraph appears in the JVM TI documentation of both FollowReferences and IterateThroughHeap.

During the execution of this function the state of the heap does not change: no objects are allocated, no objects are garbage collected, and the state of objects (including held values) does not change. As a result, threads executing Java programming language code, threads attempting to resume the execution of Java programming language code, and threads attempting to execute JNI functions are typically stalled.

What happens if GC is triggered during iteration?

Ruby

It is undocumented. ObjectSpace.each_object does not prevent GC.start from being called in the block, and calling GC.start seems to remove dead objects immediately. As a result, ObjectSpace.each_object visits some dead objects but not others.

puts "Hello!"

class Foo
  def initialize(x)
    @x = x
  end
  attr_accessor :x
end

live_root1 = Foo.new("This one lives")

100.times do |j|
  Foo.new(j)
end

live_root2 = Foo.new("This one lives, too")

i = 1

ObjectSpace.each_object(Foo) do |obj|
  puts "#{i}: #{obj.x}"
  if i % 5 == 0
    GC.start
  end
  i = i + 1
end

puts "Goodbye!"

GC is first triggered when the fifth object is visited. The program visits 5 to 7 objects in total, depending on whether the two Foo instances held by live roots are visited in the first five iterations.
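Apart from explicit GC.start calls, a collection can also be triggered implicitly by allocation inside the block. The sketch below shows one possible mitigation for that case only: switching automatic GC off for the duration of the enumeration. The helper each_object_without_auto_gc is hypothetical and not part of Ruby. It relies on GC.disable returning true when GC was already disabled. To my understanding it does not stop an explicit GC.start in the block, which Ruby documents as running even when GC is disabled.

```ruby
class Foo
  attr_reader :x

  def initialize(x)
    @x = x
  end
end

# Hypothetical helper: enumerate instances of klass with automatic GC off,
# then restore the previous GC setting even if the block raises.
def each_object_without_auto_gc(klass)
  was_disabled = GC.disable # true if GC had already been disabled by the caller
  ObjectSpace.each_object(klass) { |obj| yield obj }
ensure
  GC.enable unless was_disabled
end

keep = (1..3).map { |i| Foo.new(i) } # live objects that must be visited

visited = 0
each_object_without_auto_gc(Foo) do |_foo|
  visited += 1
  _garbage = "x" * 1024 # allocation that could otherwise trigger a GC
end

puts "visited #{visited} Foo instances"
```

Dead Foo instances that were not yet reclaimed are still visited, so the count is at least the number of live instances but is not otherwise fixed.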

JVM TI

As previously mentioned, the heap state does not change during iteration.

The callback functions of FollowReferences and IterateThroughHeap are not allowed to call any JNI functions. The JVM TI function ForceGarbageCollection is not "Callback Safe", either. So it is impossible to trigger GC in the callback.

The GetObjectsWithTags function does not have any call-backs. It returns an array of object references.

Implementation

One obvious way to implement this feature is to scan each space using the valid-object (VO) bits. However, if a space is sparse, it may help to narrow down the scanned region using space-specific metadata, so that we only scan the regions (blocks or lines) actually occupied by objects.

PR #1174 implements heap traversal by scanning the VO bits. For block-based spaces, it only scans blocks occupied by objects. For LOS, objects are simply enumerated from the treadmill.
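The idea of combining VO bits with per-block metadata can be illustrated with a toy model. This is only a sketch of the technique written in Ruby for consistency with the examples above. It is not the code from the PR, and every name in it (ToySpace, BLOCK_WORDS, allocate, each_object) is hypothetical. It assumes one VO bit per word and one "occupied" flag per block.

```ruby
# Toy model of a block-based space. Addresses are plain word indices.
class ToySpace
  BLOCK_WORDS = 8 # words per block (illustrative value)

  def initialize(num_blocks)
    @vo_bits = Array.new(num_blocks * BLOCK_WORDS, false) # one VO bit per word
    @block_occupied = Array.new(num_blocks, false)         # space-specific metadata
  end

  # Record that an object starts at the given word address.
  def allocate(addr)
    @vo_bits[addr] = true
    @block_occupied[addr / BLOCK_WORDS] = true
  end

  # Yield the address of every object, skipping blocks known to be empty.
  def each_object
    return enum_for(:each_object) unless block_given?

    @block_occupied.each_with_index do |occupied, block|
      next unless occupied # sparse space: do not scan empty blocks

      start = block * BLOCK_WORDS
      (start...(start + BLOCK_WORDS)).each do |addr|
        yield addr if @vo_bits[addr]
      end
    end
  end
end

space = ToySpace.new(4)
[1, 5, 26].each { |addr| space.allocate(addr) }
p space.each_object.to_a # prints [1, 5, 26]
```

In this model blocks 1 and 2 are never scanned because their occupied flags are unset, which is the saving the space-specific metadata is meant to provide.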

Labels: A-meta (Area: Meta issues for the repository), P-normal (Priority: Normal)
