Skip to content

Commit 508d3bc

Browse files
committed
[GR-40874] Native Image Debuginfo Perf support #4811.
PullRequest: graal/12617
2 parents 33ae152 + a81c5cf commit 508d3bc

25 files changed

+1443
-990
lines changed

docs/reference-manual/native-image/DebugInfo.md

Lines changed: 181 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -18,6 +18,8 @@ The resulting image will contain debug records in a format the GNU Debugger (GDB
1818
Additionally, you can pass `-O0` to the builder which specifies that no compiler optimizations should be performed.
1919
Disabling all optimizations is not required, but in general it makes the debugging experience better.
2020

21+
Debug information is not just useful to the debugger. It can also be used by the Linux performance profiling tools `perf` and `valgrind` to correlate execution statistics such as CPU utilization or cache misses with specific, named Java methods and even link them to individual lines of Java code in the original Java source file.
22+
2123
By default, debug info will only include details of some of the values of parameters and local variables.
2224
This means that the debugger will report many parameters and local variables as being undefined. If you pass `-O0` to the builder then full debug information will be included.
2325
If you
@@ -35,8 +37,22 @@ However, it can significantly increase the size of the generated image on disk.
3537
and local variable information by passing flag `-H:+SourceLevelDebug` can cause a program to be compiled
3638
slightly differently and for some applications this can slow down execution.
3739

40+
The basic `perf report` command, which displays a histogram showing percentage execution time in each Java method, only requires passing flags `-g` and `-H:+SourceLevelDebug` to the `native-image` command.
41+
However, more sophisticated uses of `perf` (i.e. `perf annotate`) and use of
42+
`valgrind` requires debug info to be supplemented with linkage symbols identifying compiled Java methods.
43+
Java method symbols are omitted from the generated native image by default but they can be retained achieved by passing one extra flag to the `native-image` command
44+
45+
```shell
46+
native-image -g -H:+SourceLevelDebug -H:-DeleteLocalSymbols Hello
47+
```
48+
49+
Use of this flag will result in a small increase in the size of the
50+
resulting image file.
51+
3852
> Note: Native Image debugging currently works on Linux with initial support for macOS. The feature is experimental.
3953
54+
> Note: Debug info support for `perf` and `valgrind` on Linux is an experimental feature.
55+
4056
### Table of Contents
4157

4258
- [Source File Caching](#source-file-caching)
@@ -46,6 +62,7 @@ slightly differently and for some applications this can slow down execution.
4662
- [Checking Debug Info on Linux](#checking-debug-info-on-linux)
4763
- [Debugging with Isolates](#debugging-with-isolates)
4864
- [Debugging Helper Methods](#debugging-helper-methods)
65+
- [Special Considerations for using perf and valgrind](#using perf and valgrind)
4966

5067
## Source File Caching
5168

@@ -772,4 +789,167 @@ For example, calling the method below prints high-level information about the Na
772789

773790
### Further Reading
774791

775-
- [Debugging Native Image in VS Code](Debugging.md)
792+
- [Debugging Native Image in VS Code](Debugging.md)
793+
794+
## Special Considerations for using perf and valgrind
795+
796+
Debug info includes details of address ranges for top level and
797+
inlined compiled method code as well as mappings from code addresses
798+
to the corresponding source files and lines.
799+
`perf` and `valgrind` are able to use this information for some of
800+
their recording and reporting operations.
801+
For example, `perf report` is able to associate code adresses sampled
802+
during a `perf record` session with Java methods and print the
803+
DWARF-derived method name for the method in its output histogram.
804+
805+
```
806+
. . .
807+
68.18% 0.00% dirtest dirtest [.] _start
808+
|
809+
---_start
810+
__libc_start_main_alias_2 (inlined)
811+
|
812+
|--65.21%--__libc_start_call_main
813+
| com.oracle.svm.core.code.IsolateEnterStub::JavaMainWrapper_run_5087f5482cc9a6abc971913ece43acb471d2631b (inlined)
814+
| com.oracle.svm.core.JavaMainWrapper::run (inlined)
815+
| |
816+
| |--55.84%--com.oracle.svm.core.JavaMainWrapper::runCore (inlined)
817+
| | com.oracle.svm.core.JavaMainWrapper::runCore0 (inlined)
818+
| | |
819+
| | |--55.25%--DirTest::main (inlined)
820+
| | | |
821+
| | | --54.91%--DirTest::listAll (inlined)
822+
. . .
823+
```
824+
825+
Unfortunately, other operations require Java methods to be identified
826+
by an ELF (local) function symbol table entry locating the start of
827+
the compiled method code.
828+
In particular, assembly code dumps provided by both tools identify
829+
branch and call targets using an offset from the nearest symbol.
830+
Omitting Java method symbols means that offsets are generally
831+
displayed relative to some unrelated global symbol, usually the entry
832+
point for a method exported for invocation by C code.
833+
834+
As an illustration of the problem, the following excerpted output from
835+
`perf annotate` displays the first few annotated instructions of the
836+
compiled code for method `java.lang.String::String()`.
837+
838+
```
839+
. . .
840+
: 501 java.lang.String::String():
841+
: 521 public String(byte[] bytes, int offset, int length, Charset charset) {
842+
0.00 : 519d50: sub $0x68,%rsp
843+
0.00 : 519d54: mov %rdi,0x38(%rsp)
844+
0.00 : 519d59: mov %rsi,0x30(%rsp)
845+
0.00 : 519d5e: mov %edx,0x64(%rsp)
846+
0.00 : 519d62: mov %ecx,0x60(%rsp)
847+
0.00 : 519d66: mov %r8,0x28(%rsp)
848+
0.00 : 519d6b: cmp 0x8(%r15),%rsp
849+
0.00 : 519d6f: jbe 51ae1a <graal_vm_locator_symbol+0xe26ba>
850+
0.00 : 519d75: nop
851+
0.00 : 519d76: nop
852+
: 522 Objects.requireNonNull(charset);
853+
0.00 : 519d77: nop
854+
: 524 java.util.Objects::requireNonNull():
855+
: 207 if (obj == null)
856+
0.00 : 519d78: nop
857+
0.00 : 519d79: nop
858+
: 209 return obj;
859+
. . .
860+
```
861+
862+
The leftmost column shows percentages for the amount of time recorded
863+
at each instruction in samples obtained during the `perf record` run.
864+
Each instruction is prefaced with it's address in the program's code
865+
section.
866+
The disassembly interleaves the source lines from which the code is
867+
derived, 521-524 for the top level code and 207-209 for the code
868+
inlined from from `Objects.requireNonNull()`.
869+
Also, the start of the method is labelled with the name defined in the
870+
DWARF debug info, `java.lang.String::String()`.
871+
However, the branch instruction `jbe` at address `0x519d6f` uses a
872+
very large offset from `graal_vm_locator_symbol`.
873+
The printed offset does identify the correct address relative to the
874+
location of the symbol.
875+
However, this fails to make clear that the target address actually
876+
lies within the compiled code range for method `String::String()` i.e. that thsi is a method-local branch.
877+
878+
Readability of the tool output is significantly improved if
879+
option `-H-DeleteLocalSymbols` is passed to the `native-image`
880+
command.
881+
The equivalent `perf annotate` output with this option enabled is as
882+
follows:
883+
884+
```
885+
. . .
886+
: 5 000000000051aac0 <String_constructor_f60263d569497f1facccd5467ef60532e990f75d>:
887+
: 6 java.lang.String::String():
888+
: 521 * {@code offset} is greater than {@code bytes.length - length}
889+
: 522 *
890+
: 523 * @since 1.6
891+
: 524 */
892+
: 525 @SuppressWarnings("removal")
893+
: 526 public String(byte[] bytes, int offset, int length, Charset charset) {
894+
0.00 : 51aac0: sub $0x68,%rsp
895+
0.00 : 51aac4: mov %rdi,0x38(%rsp)
896+
0.00 : 51aac9: mov %rsi,0x30(%rsp)
897+
0.00 : 51aace: mov %edx,0x64(%rsp)
898+
0.00 : 51aad2: mov %ecx,0x60(%rsp)
899+
0.00 : 51aad6: mov %r8,0x28(%rsp)
900+
0.00 : 51aadb: cmp 0x8(%r15),%rsp
901+
0.00 : 51aadf: jbe 51bbc1 <String_constructor_f60263d569497f1facccd5467ef60532e990f75d+0x1101>
902+
0.00 : 51aae5: nop
903+
0.00 : 51aae6: nop
904+
: 522 Objects.requireNonNull(charset);
905+
0.00 : 51aae7: nop
906+
: 524 java.util.Objects::requireNonNull():
907+
: 207 * @param <T> the type of the reference
908+
: 208 * @return {@code obj} if not {@code null}
909+
: 209 * @throws NullPointerException if {@code obj} is {@code null}
910+
: 210 */
911+
: 211 public static <T> T requireNonNull(T obj) {
912+
: 212 if (obj == null)
913+
0.00 : 51aae8: nop
914+
0.00 : 51aae9: nop
915+
: 209 throw new NullPointerException();
916+
: 210 return obj;
917+
. . .
918+
```
919+
920+
In this version the start address of the method is now labelled with
921+
the mangled symbol name `String_constructor_f60263d569497f1facccd5467ef60532e990f75d`
922+
as well as the DWARF name.
923+
The branch target is now printed using an offset from that start
924+
symbol.
925+
926+
Unfortunately, `perf` and `valgrind` do not correctly understand the
927+
mangling algorithm employed by GraalVM, nor are they currently able to
928+
replace the mangled name with the DWARF name in the disassembly even
929+
though both symbol and DWARF function data are known to identify code
930+
starting at the same address.
931+
So, the branch instruction still prints its target using a symbol plus
932+
offset but it is at least using the method symbol this time.
933+
934+
Also, because address `51aac0` is now recognized as a method start,
935+
`perf` has preceded the first line of the method with 5 context lines,
936+
which list the tail end of the method's javadoc comment.
937+
Unfortunately, perf has numbered these lines incorrectly, labelling
938+
the first comment with 521 rather than 516.
939+
940+
Executing command `perf annotate` will provide a disassembly listing
941+
for all methods and C functions in the image.
942+
It is possible to annotate a specific method by passing it's name as
943+
an argument to the perf annotate command.
944+
Note, however, that `perf` requries the mangled symbol name as
945+
argument rather than the DWARF name.
946+
So, in order to annotate method `java.lang.String::String()` it is
947+
necessary to run command `perf annotate
948+
String_constructor_f60263d569497f1facccd5467ef60532e990f75d`.
949+
950+
The `valgrind` tool `callgrind` also requires local symbols to be
951+
retained in order to provide high quality output.
952+
When `callgrind` is used in combination with a viewer like
953+
`kcachegrind` it is possible to identify a great deal of valuable
954+
information about native image execution aand relate it back to
955+
specific source code lines.

docs/reference-manual/native-image/DebuggingAndDiagnostics.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -8,7 +8,7 @@ permalink: /reference-manual/native-image/debugging-and-diagnostics/
88
# Debugging and Diagnostics
99

1010
Native Image provides utilities for debugging and inspecting the produced binary:
11-
- For debugging produced binaries, see [Debug Information](DebugInfo.md)
11+
- For debugging produced binaries and obtaining performance profile statistics, see [Debug Information](DebugInfo.md)
1212
- For Java-like debugging of native executables in a running state, see [Debugging Native Executables](Debugging.md)
1313
- For generating heap dumps, see [Heap Dump Support](guides/create-heap-dump-from-native-executable.md)
1414
- For JFR events recording, see [JDK Flight Recorder (JFR)](JFR.md)

substratevm/mx.substratevm/mx_substratevm.py

Lines changed: 11 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -1274,6 +1274,17 @@ def debuginfotest(args, config=None):
12741274
config=config
12751275
)
12761276

1277+
@mx.command(suite_name=suite.name, command_name='debuginfotestshared', usage_msg='[options]')
1278+
def debuginfotestshared(args, config=None):
1279+
"""
1280+
builds a debuginfo ctutorial image but does not yet test it with gdb"
1281+
"""
1282+
# set an explicit path to the source tree for the tutorial code
1283+
sourcepath = mx.project('com.oracle.svm.tutorial').source_dirs()[0]
1284+
all_args = ['-H:GenerateDebugInfo=1', '-H:+SourceLevelDebug', '-H:DebugInfoSourceSearchPath=' + sourcepath, '-H:-DeleteLocalSymbols'] + args
1285+
# build and run the native image using debug info
1286+
# ideally we ought to script a gdb run
1287+
native_image_context_run(_cinterfacetutorial, all_args)
12771288

12781289
@mx.command(suite_name=suite.name, command_name='helloworld', usage_msg='[options]')
12791290
def helloworld(args, config=None):

0 commit comments

Comments
 (0)