reliable line numbers in code info printing #29893

vtjnash · 2018-11-01T18:02:39Z

This implements a uniform and flexible printing of line number and inlining information across all of our compiler introspection passes (*see comparison screen shots at the bottom). The goal here is to expose all information which may be necessary to the user, but to also provide a uniform framework for letting the user augment or limit that information. As such, all code_ functions and macros now take a debuginfo argument.

This PR utilizes a flexible callback-based design (loosely based on the LLVM design of the same that has proved very useful to us) so that users can create new designs which render less or more information than the default (and to also set those to the default(*), if they wish). Currently the only modes for :debuginfo that are implemented are :source (showing only line number information) and :none (showing no debug information). Also, all provide :default (user-customizable(*)). In the future, I’m hoping users will implement new modes to suit other specific interests. Such examples, ordered by verbosity, include:

none (“just give me the code” mode—implemented here)
oneliner (only show information if it fits on one line—e.g. the approximate printer we have now on v1.0, but needs Keno to port this over)
source (fully accurate line info printer capable of handling inlining—implemented here)
types (like source, but also show specialization information on everything)
metadata (like types, but also show all metadata that the compiler has)
variables (like metadata, but also represent further information about where everything came from—e.g. what happened to the original variables or fields)

Example snippets below. This function was chosen because it has unusually complicated line information, even at the source level, because of the macro. This is not true of most functions: typically unoptimized code will not contain any inlining bar delineating any statements as belonging to some foreign frame. I’m also hopeful to see if someone—maybe me—can move the bar to the right, while still preserving the accuracy that I've worked hard over the years to ensure.) Note: these were taken using the Solarized Dark theme, which sometimes shifts colors pretty far from their defaults to much more muted tones, so YMMV.

input

code_lowered

code_warntype

code_typed

code_llvm

code_native

* user-customization is handled via an open dictionary Base.IRShow.debuginfo that maps symbol name (such as :default) to a callback annotation printer. Users can alter or add mappings in their .juliarc.jl files and create new packages to create novel new modes of their own!

Keno · 2018-11-01T18:11:55Z

This seems nice, but can we leave the default non-verbose? I'm worried about new users trying to use these and getting a mess of line info thrown at them (this is already happening with code_native as implemented right now).

JeffBezanson · 2018-11-01T18:12:40Z

This is an absurd bikeshed comment, but the brackets look weird to me. The closing ┘ characters should be └.

vtjnash · 2018-11-26T15:52:31Z

oneliner mode (only show information if it fits on one line)

This just needs someone to adapt it to the framework here. If it had gotten done by now by someone, I was figuring that we could merge it into this PR. But we can merge that on master separately whenever it's ready. For now, I’m most interested in finally recovering the capability for this code to present information completely and correctly.

I'm worried about new users trying to use these and getting a mess of line info thrown at them

I have great confidence in users abilities—they aren’t stupid. Regardless, I expect that I'll appreciate having this information available by default in bug reports, in much the same way that I want backtraces to be overly information-laden by default.
I'm trying very hard not to comment on the current situation (hence my very slow rate of replies), but stating this opinion as a fact is really not helpful. As it is now, I don't understand most of the line information getting thrown at me. I'm somewhat sure I've come across bugs, but the output is so intentionally inaccurate that I've been unable to open new issues to explain the problems. I think that new users will appreciate having the information shown in a format that attempts to simply and accurately represent the source of each ssavalue. And regardless, I know that some users will already be happy that this PR allows them to eliminate "extraneous" output, even though that wasn't exactly my goal.

Btw, @Keno you mentioned that you'd like the bars on the right. I'm generally for that too, since otherwise the right-hand-size seems to be under-utilized, and I don't yet know what else we would need to put there. Once this general framework is merged, I'm hopeful that we can find some way to get that changed where supported.

@JeffBezanson sure, done. It was a tossup for me whether it was drawing a strict grouping or some more abstract margin line design. Going with the bracket interpretation seems good.

Keno · 2018-11-26T16:03:33Z

May I suggest for the purpose of getting this merged, you switch the default to non-verbose, since everybody seems to be in favor of that part of this change, and then we have a separate debate on that part of the change? I don't think coupling the two is necessary.

Keno · 2018-11-26T16:09:32Z

And FWIW, I'd be fine if "non-verbose" means "none" here until we re-implement the RHS information.

vtjnash · 2018-11-26T16:54:22Z

Can we just recommend that those folks edit their .juliarc.jl to set their personal preference (and no, not everyone agrees with you)? I think for casual and initial interaction with this infrastructure, we may want to present users with a hint at the wider range of information what we can provide. I expect that most users aren't familiar with ssa-form, and would find the extent of our lowering pretty confusing—where these line numbers may help to anchor them. But then as they dig deeper into their problem and have isolated it to one section of the code, they can start to turn off (or perhaps sometimes increase) the irrelevant verbosity around that snippet.

True, that doesn't really need to block merging, since I can just set my default too, so I've switched that default in my latest commit:

Base.IRShow.debuginfo[:default] = Base.IRShow.debuginfo[:source] # or :none

Keno

Ok with me.

StefanKarpinski · 2018-11-26T23:09:05Z

Should we call this verbosity instead of debuginfo? Or just info? I can't seem to find where the different allowable values are. What does :source do as compared to :default?

vtjnash · 2018-11-27T16:49:06Z

I can't seem to find where the different allowable values are. What does :source do as compared to :default?

Maybe not too visible in the diff, but the currently allowed values are explained in every doc-string for each code_* function and macro (and observations about possible future work to add more values is mentioned in the PR text above). These doc-string additions ended up somewhere near to the lower 3/4th of the github changes view.

Looking at the doc-string, it could also be called comments. But you don't say why those proposals would be better.

What does :source do as compared to :default?

The alias :default can be decided to be set to anything else, without affecting the meaning of :source. It's the representation of "not-specified".

StefanKarpinski · 2018-11-27T17:06:30Z

Perhaps I'm missing it but I don't see :source mentioned in any doc string or otherwise documented.

My objection is that the name debuginfo is long, redundant and slightly awkward. We also already use underscores in the names of other code reflection options, so this should be debug_info for consistency. I think that comments or verbose would be better names. I also find the symbol values a bit awkward. Perhaps an integer verbosity level with 0 for no debug comments, 1 for a medium amount (roughly what we used to print), and 2 for all the information that's available.

vtjnash · 2018-11-27T17:41:04Z

In prose, I write the option name without the leading colon.

My objection is that the name debuginfo is long, redundant and slightly awkward. We also already use underscores in the names of other code reflection options

Seriously, you're worried that it's a single character too long? What is it redundant with? If you're going to bike-shed the name, it'd be helpful to at least have a more concrete reason than "comments=2" is "better and less awkward". Most of the options are single words, and avoid the underscore question. But typically across Julia, we
seem to be attempting to avoid them (ref, Underscore Audit. Aside: there's also a great typo on that page: "strive to eliminate all names where underscores are necessary"). I guess I gravitated towards debuginfo from prior exposure (In LLVM, this information is called DebugInfo or DI), although "comments" seems an acceptable compromise to me.

Perhaps an integer verbosity level with 0

We could permit integers as aliases, for ease of typing, but even C has had enums for ages for better readability.

jekbradbury · 2018-11-28T19:55:57Z

On this branch, if you set Base.IRShow.debuginfo[:default] = Base.IRShow.debuginfo[:none], the debuginfo=:source kwarg doesn't seem to work for me; the reverse is fine for code_typed and code_llvm but not code_native.

vtjnash · 2018-11-28T20:59:03Z

It works fine, but specifies something slightly different. In particular, they can independently specify a choice to filter out available debug information. We may synchronize those options in more cases in the future. But right now, master doesn't give you any choice, so this is at least an improvement.

vchuravy

LGTM. I am quite looking forward to this.

stdlib/InteractiveUtils/src/codeview.jl

Use box-drawing characters and indentation to make the output readable more rapidly.

…_native For consistency of user experience, reduce the variance in our IR printing across formats. This also now shows inlining and line number information even if the output device might not support color, which was previously impossible (a regression since v0.6).

Here we make the observation that it's somewhat common to have chains of methods of one function that recursively handle arguments in some fashion (for example, map-tuple or +). However, since they all have the same method name, it's possible to represent these on a single line.

These levels are intended to roughly correspond approximately to compiler options -g0 (none) through -g2 (including local variables). Currently we do not implement the ability to represent local variable information, so the default is to only show line number information suitable for identifying the source line of a given statement (but not the types).

Only for Julia IR, for now. Later, this should be threaded into code_llvm and enabled there also. I chose green for new nodes (like new life), and used yellow for metadata (because the strong contrast helps to hide it visually, and similar to sticky notes), and used grey for basic blocks (because they are minor information), and used cyan/gray/Base.warn_color() for type information (because that's what we've always used).

StefanKarpinski · 2018-11-30T14:51:07Z

Is it just me or does this option not have any effect?

julia> function collatz(n)
           k = 0
           while n > 1
               n = iseven(n) ? n >> 1 : 3n + 1
               k += 1
           end
           return k
       end
collatz (generic function with 1 method)

julia> @code_native debuginfo=:none collatz(100)
	.section	__TEXT,__text,regular,pure_instructions
; ┌ @ REPL[9]:2 within `collatz'
	xorl	%eax, %eax
; │ @ REPL[9]:3 within `collatz'
; │┌ @ operators.jl:286 within `>'
; ││┌ @ int.jl:49 within `<'
	cmpq	$2, %rdi
; │└└
	jl	L45
	nopl	(%rax,%rax)
; │ @ REPL[9]:4 within `collatz'
; │┌ @ int.jl:448 within `>>' @ int.jl:441
L16:
	movq	%rdi, %rcx
	sarq	%rcx
; │└
	testb	$1, %dil
; │┌ @ int.jl:53 within `+'
	leaq	1(%rdi,%rdi,2), %rdi
; │└
	cmoveq	%rcx, %rdi
; │ @ REPL[9]:5 within `collatz'
; │┌ @ int.jl:53 within `+'
	addq	$1, %rax
; │└
; │ @ REPL[9]:3 within `collatz'
; │┌ @ operators.jl:286 within `>'
; ││┌ @ int.jl:49 within `<'
	cmpq	$2, %rdi
; │└└
	jge	L16
; │ @ REPL[9]:7 within `collatz'
L45:
	retq
	nop
; └

KristofferC · 2018-11-30T15:32:14Z

It works for code_llvm but not for code_native, it seems.

KristofferC · 2018-11-30T15:34:02Z

code_typed doesnt seem to have any line annotations?

julia> @code_typed collatz(3)
CodeInfo(
1 ─       nothing::Nothing
2 ┄ %2  = φ (#1 => 0, #6 => %18)::Int64
│   %3  = φ (#1 => _2, #6 => %17)::Int64
│   %4  = (Base.slt_int)(1, %3)::Bool
└──       goto #7 if not %4
3 ─ %6  = (Base.checked_srem_int)(%3, 2)::Int64
│   %7  = (%6 === 0)::Bool
│   %8  = (Base.not_int)(%7)::Bool
│   %9  = (Base.not_int)(%8)::Bool
└──       goto #5 if not %9
4 ─ %11 = (Base.ashr_int)(%3, 0x0000000000000001)::Int64
│   %12 = (Base.shl_int)(%3, 0xffffffffffffffff)::Int64
│   %13 = (Base.ifelse)(true, %11, %12)::Int64
└──       goto #6
5 ─ %15 = (Base.mul_int)(3, %3)::Int64
└── %16 = (Base.add_int)(%15, 1)::Int64
6 ┄ %17 = φ (#4 => %13, #5 => %16)::Int64
│   %18 = (Base.add_int)(%2, 1)::Int64
└──       goto #2
7 ─       return %2
) => Int64

KristofferC · 2018-11-30T15:40:11Z

Something seems wrong with the lines to the left? They are cut everywhere there are instructions.

KristofferC · 2018-11-30T15:46:10Z

Wasn't the intention to default to not showing all this line info?

StefanKarpinski · 2018-11-30T15:56:39Z

@vtjnash does not want to do that even though there does seem to be fairly broad consensus behind it. Personally, my preferred state of affairs would be having three levels of debug info:

none: just show the code
some: show a bit of debug info but not all of it
all: show all available debug info

Where some is the default. As I've stated above, I also feel that the debuginfo is not the best name, and that :source, :none are not especially clear as values for this option. However, I don't get the impression that feedback, discussion or consensus are welcomed here.

vtjnash · 2018-11-30T16:17:02Z

@KristofferC It's intentional: LLVM really does not not want us printing on the same line (so that the output is valid code). We can hack around it as follows, but then it's not a valid file anymore, so on balance it doesn't seem to be worthwhile.

diff --git a/src/disasm.cpp b/src/disasm.cpp
index 418643d568..04f6465059 100644
--- a/src/disasm.cpp
+++ b/src/disasm.cpp
@@ -806,6 +806,7 @@ static void jl_dump_asm_internal(
     // start out with a raw stream, and create formatted stream from it here.
     // LLVM will desctruct the formatted stream, and we keep the raw stream.
     auto ustream = llvm::make_unique<formatted_raw_ostream>(rstream);
+    llvm::formatted_raw_ostream *private_ustream = ustream.get();
     Streamer.reset(TheTarget->createAsmStreamer(Ctx, std::move(ustream), /*asmverbose*/true,
                                                 /*useDwarfDirectory*/ true,
                                                 IP, CE, MAB, /*ShowInst*/ false));
@@ -860,7 +861,8 @@ static void jl_dump_asm_internal(
         uint64_t nextLineAddr = -1;
         DILineInfoTable::iterator di_lineIter = di_lineinfo.begin();
         DILineInfoTable::iterator di_lineEnd = di_lineinfo.end();
-        DILineInfoPrinter dbgctx{"; ", true};
+        DILineInfoPrinter dbgctx{"", true};
+        dbgctx.SetVerbosity(debuginfo);
         if (pass != 0) {
             if (di_ctx && di_lineIter != di_lineEnd) {
                 // Set up the line info
@@ -907,6 +909,7 @@ static void jl_dump_asm_internal(
                 if (symbol)
                     Streamer->EmitLabel(symbol);
             }
+            *private_ustream << dbgctx.inlining_indent("│");
 
             MCInst Inst;
             MCDisassembler::DecodeStatus S;

Wasn't the intention to default to not showing all this line info?

No. The intention of the PR was to improve formatting. If you want to change the defaults, make your own PR.

maxbennedich · 2019-01-31T11:19:00Z

I came here trying to figure out why debuginfo is not working for code_native. Every time I do code_native, I find myself manually removing the comments, and frankly I'm getting quite tired of it. So I typed up this hacky macro, and I'm sharing it here if anyone else is in the same situation. It uses Suppressor.jl.

using Suppressor

macro no_comments(ex)
    quote
        full_output = split((@capture_out $(esc(ex))), '\n')
        without_comments = filter(s -> !startswith(s, ';'), full_output)
        foreach(println, without_comments)
    end
end

Before:

julia> code_native(x -> sqrt(x), (Float64,); syntax=:intel)
	.section	__TEXT,__text,regular,pure_instructions
; ┌ @ REPL[6]:1 within `#17'
; │┌ @ math.jl:492 within `sqrt'
; ││┌ @ REPL[6]:1 within `<'
	push	rax
	vxorps	xmm1, xmm1, xmm1
	vucomisd	xmm1, xmm0
; ││└
	ja	L17
; ││ @ math.jl:493 within `sqrt'
	vsqrtsd	xmm0, xmm0, xmm0
; │└
	pop	rax
	ret
; │┌ @ math.jl:492 within `sqrt'
L17:
	movabs	rax, offset throw_complex_domainerror
	movabs	rdi, 4363115776
	call	rax
	ud2
	nop	dword ptr [rax]
; └└

After:

julia> @no_comments code_native(x -> sqrt(x), (Float64,); syntax=:intel)
	.section	__TEXT,__text,regular,pure_instructions
	push	rax
	vxorps	xmm1, xmm1, xmm1
	vucomisd	xmm1, xmm0
	ja	L17
	vsqrtsd	xmm0, xmm0, xmm0
	pop	rax
	ret
L17:
	movabs	rax, offset throw_complex_domainerror
	movabs	rdi, 4363115776
	call	rax
	ud2
	nop	dword ptr [rax]

¡A blissful sight!

StefanKarpinski · 2019-01-31T16:13:42Z

@vtjnash: It's pretty ridiculous that we still cannot suppress all this junk in the output of @code_* macros. There's even debuginfo option but it doesn't do anything. Wah, wah, wah...

vchuravy · 2019-01-31T17:29:10Z

This is fixed by #30223

KristofferC · 2019-05-28T10:18:43Z

doc/src/devdocs/reflection.md

@@ -122,11 +122,11 @@ calls and expand argument types automatically:
 ```julia-repl
 julia> @code_llvm +(1,1)

-; Function Attrs: sspreq
-define i64 @"julia_+_130862"(i64, i64) #0 {
+;  @ int.jl:53 within `+'


What's the reason for printing this as `+' instead of `+`? It messes up syntax highlighting sometimes when pasting output.

@vtjnash

I'd say just fix it to be paired backticks, which, imo looks better anyway.

jrevels mentioned this pull request Nov 2, 2018

Const-prop through tuple JuliaLabs/Cassette.jl#71

Closed

vtjnash force-pushed the jn/codeinfo-lineno branch from c2bda2c to 8cab827 Compare November 13, 2018 05:51

vtjnash force-pushed the jn/codeinfo-lineno branch from 8cab827 to 9f72610 Compare November 26, 2018 15:54

Keno approved these changes Nov 26, 2018

View reviewed changes

vtjnash force-pushed the jn/codeinfo-lineno branch from 86ed2fa to a0885fb Compare November 26, 2018 19:20

vchuravy approved these changes Nov 29, 2018

View reviewed changes

stdlib/InteractiveUtils/src/codeview.jl Outdated Show resolved Hide resolved

vtjnash added 7 commits November 29, 2018 12:00

code_llvm annotations: format prettier

3fd593d

Use box-drawing characters and indentation to make the output readable more rapidly.

s/┘/└/g

d4bdcd9

this is why we can't have nice things

4c13c65

vtjnash force-pushed the jn/codeinfo-lineno branch from a0885fb to 4c13c65 Compare November 29, 2018 17:00

Merge branch 'master' into jn/codeinfo-lineno

8ae8304

vchuravy merged commit b3cf152 into master Nov 30, 2018

vchuravy deleted the jn/codeinfo-lineno branch November 30, 2018 14:24

KristofferC reviewed May 28, 2019

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

reliable line numbers in code info printing #29893

reliable line numbers in code info printing #29893

vtjnash commented Nov 1, 2018 •

edited

Loading

Keno commented Nov 1, 2018

JeffBezanson commented Nov 1, 2018

vtjnash commented Nov 26, 2018

Keno commented Nov 26, 2018

Keno commented Nov 26, 2018

vtjnash commented Nov 26, 2018

Keno left a comment

StefanKarpinski commented Nov 26, 2018

vtjnash commented Nov 27, 2018

StefanKarpinski commented Nov 27, 2018

vtjnash commented Nov 27, 2018

jekbradbury commented Nov 28, 2018 •

edited

Loading

vtjnash commented Nov 28, 2018

vchuravy left a comment

StefanKarpinski commented Nov 30, 2018

KristofferC commented Nov 30, 2018

KristofferC commented Nov 30, 2018

KristofferC commented Nov 30, 2018

KristofferC commented Nov 30, 2018

StefanKarpinski commented Nov 30, 2018 •

edited

Loading

vtjnash commented Nov 30, 2018

maxbennedich commented Jan 31, 2019

StefanKarpinski commented Jan 31, 2019

vchuravy commented Jan 31, 2019

KristofferC May 28, 2019

StefanKarpinski May 28, 2019

reliable line numbers in code info printing #29893

reliable line numbers in code info printing #29893

Conversation

vtjnash commented Nov 1, 2018 • edited Loading

Keno commented Nov 1, 2018

JeffBezanson commented Nov 1, 2018

vtjnash commented Nov 26, 2018

Keno commented Nov 26, 2018

Keno commented Nov 26, 2018

vtjnash commented Nov 26, 2018

Keno left a comment

Choose a reason for hiding this comment

StefanKarpinski commented Nov 26, 2018

vtjnash commented Nov 27, 2018

StefanKarpinski commented Nov 27, 2018

vtjnash commented Nov 27, 2018

jekbradbury commented Nov 28, 2018 • edited Loading

vtjnash commented Nov 28, 2018

vchuravy left a comment

Choose a reason for hiding this comment

StefanKarpinski commented Nov 30, 2018

KristofferC commented Nov 30, 2018

KristofferC commented Nov 30, 2018

KristofferC commented Nov 30, 2018

KristofferC commented Nov 30, 2018

StefanKarpinski commented Nov 30, 2018 • edited Loading

vtjnash commented Nov 30, 2018

maxbennedich commented Jan 31, 2019

StefanKarpinski commented Jan 31, 2019

vchuravy commented Jan 31, 2019

KristofferC May 28, 2019

Choose a reason for hiding this comment

StefanKarpinski May 28, 2019

Choose a reason for hiding this comment

vtjnash commented Nov 1, 2018 •

edited

Loading

jekbradbury commented Nov 28, 2018 •

edited

Loading

StefanKarpinski commented Nov 30, 2018 •

edited

Loading