<p>Release notes from dwarves (https://github.com/acmel/dwarves/releases)</p>
<p>v1.30 (2025-04-18T15:25:09Z):</p>
<p>CI testing:</p>
<ul>
<li>support for github CI tests to build pahole with gcc<br>
and LLVM.</li>
<li>support for github CI tests that build pahole, build a kernel<br>
with BTF using that pahole, and run tests.</li>
<li>tests can also be run standalone; see toplevel README<br>
for details.</li>
</ul>
<p>DWARF loader:</p>
<ul>
<li>better detection of abort during thread processing.</li>
</ul>
<p>BTF encoder:</p>
<ul>
<li>
<p>pahole now uses an improved scheme to detect presence of<br>
newer libbpf functions for cases where pahole is built with<br>
a non-embedded libbpf. A local weak declaration is added,<br>
and if the function is non-NULL - indicating it is present -<br>
the associated feature is available. BTF feature detection<br>
makes use of this now and BTF features declared in pahole<br>
can provide a feature check function.</p>
</li>
<li>
<p>Type tags are now emitted for bpf_arena pointers if the<br>
attributes btf_feature is specified.</p>
</li>
<li>
<p>kfunc tagging has been refactored into btf_encoder__collect_kfuncs<br>
to simplify from the previous two-stage collect/tag process.</p>
</li>
<li>
<p>To support global variables other than per-CPU variables, code<br>
was added to match a variable with the relevant section. However<br>
variables in to-be-discarded sections have address value 0 and<br>
appeared to be in the per-CPU section (since it starts at 0).<br>
Add checks to ensure the variable really is in the relevant<br>
ELF section.</p>
</li>
<li>
<p>To avoid expensive variable address checking in the above case,<br>
filter out variables prefixed by __gendwarfksyms_ptr_, which are<br>
present when CONFIG_GENDWARFKSYMS is set.</p>
</li>
<li>
<p>Memory access bugs reported by address sanitizer were also fixed.</p>
</li>
</ul>
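<p>The weak-declaration scheme described in the BTF encoder notes above can be sketched as follows. This is only an illustration, assuming an ELF target built with GCC or Clang; <code>hypothetical_new_libbpf_call</code> is a made-up name and not a real libbpf function, and the helper names are not taken from pahole.</p>

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Weak declarations: when nothing provides the symbol at link time, its
 * address resolves to NULL instead of causing a link error.
 * hypothetical_new_libbpf_call is a made-up name used only for illustration.
 */
extern int hypothetical_new_libbpf_call(int arg) __attribute__((weak));

/* puts() is provided by libc, so this weak reference resolves normally. */
extern int puts(const char *s) __attribute__((weak));

/* Feature check in the style the notes describe: available iff non-NULL. */
static bool have_hypothetical_call(void)
{
	return hypothetical_new_libbpf_call != NULL;
}

static bool have_puts(void)
{
	return puts != NULL;
}
```

<p>A caller would test <code>have_hypothetical_call()</code> before using the function, and fall back or disable the associated feature when it returns false.</p>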
<p>Signed-off-by: Alan Maguire <a href="mailto:alan.maguire@oracle.com">alan.maguire@oracle.com</a><br>
Signed-off-by: Arnaldo Carvalho de Melo <a href="mailto:acme@redhat.com">acme@redhat.com</a></p>
<p>Author: acmel</p>
<p>v1.29 (2025-01-21T15:01:32Z):</p>
<p>DWARF loader:</p>
<ul>
<li>Multithreading is now contained in the DWARF loader using a jobs queue and a<br>
pool of worker threads.</li>
</ul>
<p>BTF encoder:</p>
<ul>
<li>
<p>The parallel reproducible BTF generation done using the new DWARF loader<br>
multithreading model is as fast as the old non-reproducible one and thus is<br>
now always performed, making the "reproducible_build" flag moot.</p>
<p>The memory consumption is now greatly reduced as well.</p>
</li>
</ul>
<p>BTF loader:</p>
<ul>
<li>
<p>Support for multiple BTF_DECL_TAGs pointing to same tag.</p>
<p>Example:</p>
<div class="snippet-clipboard-content notranslate position-relative overflow-auto" data-snippet-clipboard-copy-content="$ pfunct vmlinux -F btf -f bpf_rdonly_cast
bpf_kfunc bpf_fastcall void *bpf_rdonly_cast(const void *obj__ign, u32 btf_id__k);
$"><pre class="notranslate"><code>$ pfunct vmlinux -F btf -f bpf_rdonly_cast
bpf_kfunc bpf_fastcall void *bpf_rdonly_cast(const void *obj__ign, u32 btf_id__k);
$
</code></pre></div>
</li>
</ul>
<p>Regression tests:</p>
<ul>
<li>Verify that pfunct prints btf_decl_tags read from BTF.</li>
</ul>
<p>pfunct:</p>
<ul>
<li>Don't print functions twice when using -f.</li>
</ul>
<p>Signed-off-by: Arnaldo Carvalho de Melo <a href="mailto:acme@redhat.com">acme@redhat.com</a></p>
<p>v1.28 (2024-12-07T14:06:35Z):</p>
<p>pahole:</p>
<ul>
<li>
<p>Various improvements to reduce the memory footprint of pahole, notably when<br>
doing BTF encoding.</p>
</li>
<li>
<p>Show flexible array statistics: pahole detects them at the end of member types,<br>
in the middle, etc. This should help with the efforts to spot problematic<br>
usage of flexible arrays in the kernel sources, examples:</p>
<p><a href="https://git.kernel.org/pub/scm/devel/pahole/pahole.git/commit/?id=6ab5318f536927cb" rel="nofollow">https://git.kernel.org/pub/scm/devel/pahole/pahole.git/commit/?id=6ab5318f536927cb</a></p>
</li>
<li>
<p>Introduce --with_embedded_flexible_array option.</p>
</li>
<li>
<p>Add '--padding N' to show only structs with N bytes of padding.</p>
</li>
<li>
<p>Add '--padding_ge N' to show only structs with at least N bytes of padding.</p>
</li>
<li>
<p>Introduce --running_kernel_vmlinux to find a vmlinux that matches the<br>
build-id of the running kernel, e.g.:</p>
<p>$ pahole --running_kernel_vmlinux<br>
/usr/lib/debug/lib/modules/6.11.7-200.fc40.x86_64/vmlinux<br>
$ rpm -qf /usr/lib/debug/lib/modules/6.11.7-200.fc40.x86_64/vmlinux<br>
kernel-debuginfo-6.11.7-200.fc40.x86_64<br>
$</p>
<p>This is a shortcut to find the right vmlinux to use for the running kernel<br>
and helps with regression tests.</p>
</li>
</ul>
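<p>To illustrate what a flexible array "in the middle" means in the flexible array statistics above, here is a minimal sketch. The struct names are hypothetical, and embedding a struct that ends in a flexible array is a GCC/Clang extension rather than ISO C.</p>

```c
#include <stddef.h>

/* A flexible array member is only well defined as the last member. */
struct msg {
	int  len;
	char data[];	/* flexible array at the end of the struct */
};

/*
 * Embedding struct msg before another member places the flexible array in
 * the middle of the outer struct. GCC and Clang accept this as an
 * extension; ISO C does not allow it.
 */
struct wrapper {
	struct msg m;
	int        trailer;	/* starts where m.data[] starts */
};

/* Offset where m.data[] starts inside struct wrapper. */
static size_t embedded_data_offset(void)
{
	return offsetof(struct wrapper, m) + offsetof(struct msg, data);
}

/* Offset of the member that follows the embedded struct. */
static size_t trailer_offset(void)
{
	return offsetof(struct wrapper, trailer);
}
```

<p>Because both offsets are equal, any write to <code>m.data[]</code> lands on <code>trailer</code>, which is the kind of layout these statistics are meant to help spot.</p>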
<p>pfunct:</p>
<ul>
<li>Don't stop at the first function that matches a filter, show all of them.</li>
</ul>
<p>BTF Encoder:</p>
<ul>
<li>
<p>Allow encoding data about all global variables, not just per CPU ones.</p>
<p>Type information for all global variables is useful in the kernel for several<br>
reasons, including use by drgn without DWARF and return types for __ksym BPF<br>
programs.</p>
<p>This is not enabled by default; experiment with it using 'pahole --btf_features=+global_var'.</p>
</li>
<li>
<p>Handle .BTF_ids section endianness, allowing for cross builds involving<br>
machines with different endianness to work.</p>
<p>For instance, encoding BTF info on an s390 vmlinux file on an x86_64 workstation.</p>
</li>
<li>
<p>Generate decl tags for bpf_fastcall for eligible kfuncs.</p>
</li>
<li>
<p>Add "distilled_base" BTF feature to split BTF generation.</p>
</li>
<li>
<p>Use the ELF_C_READ_MMAP mode with libelf, reducing peak memory utilization.</p>
</li>
</ul>
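<p>The endianness handling mentioned above for the .BTF_ids section comes down to byte-swapping 32-bit values when the target's byte order differs from the host's. A minimal sketch, with hypothetical helper names that are not taken from pahole:</p>

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Reverse the byte order of a 32-bit value. */
static uint32_t swap_u32(uint32_t v)
{
	return (v >> 24) | ((v >> 8) & 0x0000ff00u) |
	       ((v << 8) & 0x00ff0000u) | (v << 24);
}

/* Detect the host byte order at run time. */
static bool host_is_little_endian(void)
{
	const uint16_t probe = 1;

	return *(const uint8_t *)&probe == 1;
}

/* Convert IDs read from a target ELF section into host byte order. */
static void ids_to_host_order(uint32_t *ids, size_t n, bool target_is_little_endian)
{
	if (target_is_little_endian == host_is_little_endian())
		return;	/* same byte order, nothing to do */

	for (size_t i = 0; i < n; i++)
		ids[i] = swap_u32(ids[i]);
}
```

<p>In the s390-on-x86_64 case from the notes, the target is big endian and the host little endian, so every ID would be swapped.</p>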
<p>BTF Loader:</p>
<ul>
<li>Allow overriding /sys/kernel/btf/vmlinux with some other file, for testing,<br>
via the PAHOLE_VMLINUX_BTF_FILENAME environment variable.</li>
</ul>
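<p>An environment-variable override like the one above is typically a lookup with a fallback. The sketch below shows that pattern; the function names are hypothetical and the exact handling inside pahole (for example, of an empty value) is not confirmed by these notes.</p>

```c
#include <stdlib.h>

#define DEFAULT_VMLINUX_BTF "/sys/kernel/btf/vmlinux"

/* Use the override when one is given, otherwise the default sysfs path. */
static const char *pick_vmlinux_btf(const char *override)
{
	return override ? override : DEFAULT_VMLINUX_BTF;
}

/* Resolve the path using the environment variable named in the notes. */
static const char *vmlinux_btf_path(void)
{
	return pick_vmlinux_btf(getenv("PAHOLE_VMLINUX_BTF_FILENAME"));
}
```

<p>Keeping the choice in a function that takes the override as a parameter makes the fallback easy to test without touching the process environment.</p>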
<p>DWARF loader:</p>
<ul>
<li>
<p>Allow setting the list of compile units produced from languages to skip via<br>
the PAHOLE_LANG_EXCLUDE environment variable.</p>
</li>
<li>
<p>Serialize access to elfutils dwarf_getlocation() to avoid elfutils internal<br>
data structure corruption when running multithreaded pahole.</p>
</li>
<li>
<p>Honour --lang_exclude when merging LTO built CUs.</p>
</li>
<li>
<p>Add the debuginfod client cache directory to the vmlinux search path.</p>
</li>
<li>
<p>Print the CU's language when a tag isn't supported.</p>
</li>
<li>
<p>Initial support for the DW_TAG_GNU_formal_parameter_pack,<br>
DW_TAG_GNU_template_parameter_pack, DW_TAG_template_value_param and<br>
DW_TAG_template_type_param DWARF tags.</p>
</li>
<li>
<p>Improve the parameter parsing by checking DW_OP_[GNU_]entry_value, which<br>
makes more functions eligible for the BTF encoder, for instance<br>
perf_event_read() in the 6.11 kernel.</p>
</li>
</ul>
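<p>Serializing access to a non-thread-safe library call, as done above for dwarf_getlocation(), usually means wrapping it in a mutex. A minimal sketch of the pattern, where <code>unsafe_lookup()</code> is a hypothetical stand-in and not an elfutils function:</p>

```c
#include <pthread.h>

static pthread_mutex_t lookup_lock = PTHREAD_MUTEX_INITIALIZER;

/* Hypothetical stand-in for a library call that mutates shared state. */
static int call_count;

static int unsafe_lookup(int key)
{
	call_count++;	/* unsynchronized shared state */
	return key * 2;
}

/* All threads go through this wrapper, so only one call runs at a time. */
static int locked_lookup(int key)
{
	int value;

	pthread_mutex_lock(&lookup_lock);
	value = unsafe_lookup(key);
	pthread_mutex_unlock(&lookup_lock);
	return value;
}
```

<p>The trade-off is that the wrapped call becomes a serialization point, so it is best kept as short as possible.</p>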
<p>Core:</p>
<ul>
<li>Use pahole to help in reorganizing its data structures to reduce its memory<br>
footprint.</li>
</ul>
<p>Regression tests:</p>
<ul>
<li>
<p>Introduce a tests/ directory for adding regression tests, run it with:</p>
<p>$ tests/tests</p>
<p>Or run the individual tests directly.</p>
</li>
<li>
<p>Add a regression test for the reproducible build feature that establishes<br>
as a baseline a detached BTF file without asking for a reproducible build and<br>
then compares the output of 'bpftool btf dump file' for this file with the one<br>
from BTF reproducible build encodings done with a growing number of threads.</p>
</li>
<li>
<p>Add a regression test for the flexible arrays features, checking if the various<br>
comments about flexible arrays match the statistics at the end of the pahole<br>
pretty print output.</p>
</li>
<li>
<p>Add a test that checks if pahole fails when running on a BTF system and BTF was<br>
requested, previously it was falling back to DWARF silently.</p>
</li>
<li>
<p>Add a test validating BTF encoding and the reasons we skip functions: it checks<br>
that DWARF functions that made it into BTF have matching signatures, that the<br>
functions we say we skipped were indeed skipped in BTF encoding, and that it was<br>
correct to skip them.</p>
</li>
<li>
<p>Add regression test for 'pahole --prettify' that uses perf to record a simple<br>
workload and then pretty print the resulting perf.data file to check that what<br>
is produced are the expected records for such a file.</p>
</li>
</ul>
<p>Link: <a href="https://lore.kernel.org/all/Z0jVLcpgyENlGg6E@x1/" rel="nofollow">https://lore.kernel.org/all/Z0jVLcpgyENlGg6E@x1/</a><br>
Tested-by: Alan Maguire <a href="mailto:alan.maguire@oracle.com">alan.maguire@oracle.com</a><br>
Tested-by: Jiri Olsa <a href="mailto:jolsa@kernel.org">jolsa@kernel.org</a><br>
Signed-off-by: Arnaldo Carvalho de Melo <a href="mailto:acme@redhat.com">acme@redhat.com</a></p>
<p>v1.27 (2024-06-11T19:33:03Z):</p>
<p>BTF encoder:</p>
<ul>
<li>
<p>Inject kfunc decl tags into BTF from the BTF IDs ELF section in the Linux<br>
kernel vmlinux file.</p>
<p>This allows tools such as bpftool and pfunct to enumerate the available kfuncs<br>
and to get their function signatures: the return type and the types of the<br>
arguments. See the example in the BTF loader changes description, below.</p>
</li>
<li>
<p>Support parallel reproducible builds, where it doesn't matter how many<br>
threads are used, the end BTF encoding result is the same.</p>
</li>
<li>
<p>Sanitize unsupported DWARF int types larger than 16 bytes, as BTF doesn't<br>
support them.</p>
</li>
</ul>
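<p>One common way to make parallel output independent of the number of threads is to tag each unit of work with its position in the input and emit results in that order. The sketch below shows that idea only; it is an assumption for illustration and not necessarily how pahole implements reproducible builds. All names are hypothetical.</p>

```c
#include <stdlib.h>

/* Result of processing one compile unit on some worker thread. */
struct cu_result {
	unsigned int input_index;	/* position of the CU in the input */
	unsigned int payload;		/* hypothetical per-CU output */
};

/* qsort() comparator: ascending by input position. */
static int by_input_index(const void *a, const void *b)
{
	const struct cu_result *ra = a, *rb = b;

	return (ra->input_index > rb->input_index) -
	       (ra->input_index < rb->input_index);
}

/* Put results that completed in arbitrary order back into input order. */
static void order_results(struct cu_result *results, size_t n)
{
	qsort(results, n, sizeof(*results), by_input_index);
}
```

<p>Whatever order the workers finish in, the sequence handed to the encoder is then the same.</p>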
<p>BTF loader:</p>
<ul>
<li>
<p>Initial support for BTF_KIND_DECL_TAG:</p>
<p>$ pfunct --prototypes -F btf vmlinux.btf.decl_tag,decl_tag_kfuncs | grep ^bpf_kfunc | head<br>
bpf_kfunc void cubictcp_init(struct sock * sk);<br>
bpf_kfunc void cubictcp_cwnd_event(struct sock * sk, enum tcp_ca_event event);<br>
bpf_kfunc void cubictcp_cong_avoid(struct sock * sk, u32 ack, u32 acked);<br>
bpf_kfunc u32 cubictcp_recalc_ssthresh(struct sock * sk);<br>
bpf_kfunc void cubictcp_state(struct sock * sk, u8 new_state);<br>
bpf_kfunc void cubictcp_acked(struct sock * sk, const struct ack_sample * sample);<br>
bpf_kfunc int bpf_iter_css_new(struct bpf_iter_css * it, struct cgroup_subsys_state * start, unsigned int flags);<br>
bpf_kfunc struct cgroup_subsys_state * bpf_iter_css_next(struct bpf_iter_css * it);<br>
bpf_kfunc void bpf_iter_css_destroy(struct bpf_iter_css * it);<br>
bpf_kfunc s64 bpf_map_sum_elem_count(const struct bpf_map * map);<br>
$ pfunct --prototypes -F btf vmlinux.btf.decl_tag,decl_tag_kfuncs | grep ^bpf_kfunc | wc -l<br>
116<br>
$</p>
</li>
</ul>
<p>pretty printing:</p>
<ul>
<li>Fix hole discovery with inheritance in C++.</li>
</ul>
<p>Tested-by: Alan Maguire <a href="mailto:alan.maguire@oracle.com">alan.maguire@oracle.com</a><br>
Tested-by: Daniel Xu <a href="mailto:dxu@dxuuu.xyz">dxu@dxuuu.xyz</a><br>
Tested-by: Jiri Olsa <a href="mailto:olsajiri@gmail.com">olsajiri@gmail.com</a><br>
Link: <a href="https://lore.kernel.org/all/ZmIXxgbfIJGWmXer@x1/T/#u" rel="nofollow">https://lore.kernel.org/all/ZmIXxgbfIJGWmXer@x1/T/#u</a><br>
Signed-off-by: Arnaldo Carvalho de Melo <a href="mailto:acme@redhat.com">acme@redhat.com</a></p>
<p>v1.26 (2024-02-28T19:25:31Z):</p>
<p>pahole:</p>
<ul>
<li>
<p>When expanding types using 'pahole -E' do it for union and struct typedefs and for enums too.</p>
<p>E.g: that 'state' field in 'struct module':</p>
<pre class="notranslate"><code>$ pahole module | head
struct module {
	enum module_state state; /* 0 4 */

	/* XXX 4 bytes hole, try to pack */
	struct list_head list; /* 8 16 */
	char name[56]; /* 24 56 */
	/* --- cacheline 1 boundary (64 bytes) was 16 bytes ago --- */
	struct module_kobject mkobj; /* 80 96 */
	/* --- cacheline 2 boundary (128 bytes) was 48 bytes ago --- */
$
</code></pre>
<p>now gets expanded:</p>
<pre class="notranslate"><code>$ pahole -E module | head
struct module {
	enum module_state {
		MODULE_STATE_LIVE = 0,
		MODULE_STATE_COMING = 1,
		MODULE_STATE_GOING = 2,
		MODULE_STATE_UNFORMED = 3,
	} state; /* 0 4 */

	/* XXX 4 bytes hole, try to pack */
$
</code></pre>
</li>
<li>
<p>Print number of holes, bit holes and bit paddings in class member types.</p>
<p>Doing this recursively to show how much waste a complex data structure has<br>
is something that still needs to be done; these were the low-hanging fruit<br>
on the path to having that feature.</p>
<p>For instance, for 'struct task_struct' in the Linux kernel we get this<br>
extra info:</p>
<pre class="notranslate"><code>--- task_struct.before.c 2024-02-09 11:38:39.249638750 -0300
+++ task_struct.after.c 2024-02-09 16:19:34.221134835 -0300
@@ -29,6 +29,12 @@
 	/* --- cacheline 2 boundary (128 bytes) --- */
 	struct sched_entity se; /* 128 256 */
+
+	/* XXX last struct has 3 holes */
+
 	/* --- cacheline 6 boundary (384 bytes) --- */
 	struct sched_rt_entity rt; /* 384 48 */
 	struct sched_dl_entity dl; /* 432 224 */
+
+	/* XXX last struct has 1 bit hole */
+
 	/* --- cacheline 10 boundary (640 bytes) was 16 bytes ago --- */
 	const struct sched_class * sched_class; /* 656 8 */
 	struct rb_node core_node; /* 664 24 */
@@ -100,6 +103,9 @@
 	/* --- cacheline 35 boundary (2240 bytes) was 16 bytes ago --- */
 	struct list_head tasks; /* 2256 16 */
 	struct plist_node pushable_tasks; /* 2272 40 */
+
+	/* XXX last struct has 1 hole */
+
 	/* --- cacheline 36 boundary (2304 bytes) was 8 bytes ago --- */
 	struct rb_node pushable_dl_tasks; /* 2312 24 */
 	struct mm_struct * mm; /* 2336 8 */
@@ -172,6 +178,9 @@
 	/* XXX last struct has 4 bytes of padding */
 	struct vtime vtime; /* 2744 48 */
+
+	/* XXX last struct has 1 hole */
+
 	/* --- cacheline 43 boundary (2752 bytes) was 40 bytes ago --- */
 	atomic_t tick_dep_mask; /* 2792 4 */
@@ -396,9 +405,12 @@
 	/* --- cacheline 145 boundary (9280 bytes) --- */
 	struct thread_struct thread __attribute__((__aligned__(64))); /* 9280 4416 */
+	/* XXX last struct has 1 hole, 1 bit hole */
+
 	/* size: 13696, cachelines: 214, members: 262 */
 	/* sum members: 13518, holes: 21, sum holes: 162 */
 	/* sum bitfield members: 82 bits, bit holes: 2, sum bit holes: 46 bits */
 	/* member types with holes: 4, total: 6, bit holes: 2, total: 2 */
 	/* paddings: 6, sum paddings: 49 */
 	/* forced alignments: 2, forced holes: 2, sum forced holes: 88 */
 };
</code></pre>
</li>
<li>
<p>Introduce --contains_enumerator=ENUMERATOR_NAME:</p>
<p>E.g.:</p>
<div class="snippet-clipboard-content notranslate position-relative overflow-auto" data-snippet-clipboard-copy-content="$ pahole --contains_enumerator S_VERSION
enum file_time_flags {
S_ATIME = 1,
S_MTIME = 2,
S_CTIME = 4,
S_VERSION = 8,
}
$"><pre class="notranslate"><code>$ pahole --contains_enumerator S_VERSION
enum file_time_flags {
S_ATIME = 1,
S_MTIME = 2,
S_CTIME = 4,
S_VERSION = 8,
}
$
</code></pre></div>
<p>The shorter form --contains_enum is also accepted.</p>
</li>
<li>
<p>Fix pretty printing when using DWARF, where sometimes the class (-C) and a specified "type_enum"<br>
may not be present in the same CU, so wait until both are found.</p>
<p>Now this example that reads the 'struct perf_event_header' and 'enum perf_event_type'<br>
from the DWARF info in ~/bin/perf to pretty print records in the perf.data file works<br>
just like when using type info from BTF in ~/bin/perf:</p>
<div class="snippet-clipboard-content notranslate position-relative overflow-auto" data-snippet-clipboard-copy-content="$ pahole -F dwarf -V ~/bin/perf \
--header=perf_file_header \
--seek_bytes '$header.data.offset' \
--size_bytes='$header.data.size' \
-C 'perf_event_header(sizeof,type,type_enum=perf_event_type,filter=type==PERF_RECORD_MMAP2)' \
--prettify perf.data --count 1
pahole: sizeof_operator for 'perf_event_header' is 'size'
pahole: type member for 'perf_event_header' is 'type'
pahole: type enum for 'perf_event_header' is 'perf_event_type'
pahole: filter for 'perf_event_header' is 'type==PERF_RECORD_MMAP2'
pahole: seek bytes evaluated from --seek_bytes=$header.data.offset is 0x3f0
pahole: size bytes evaluated from --size_bytes=$header.data.size is 0xd10
// type=perf_event_header, offset=0xc20, sizeof=8, real_sizeof=112
{
.header = {
.type = PERF_RECORD_MMAP2,
.misc = 2,
.size = 112,
},
.pid = 1533617,
.tid = 1533617,
.start = 94667542700032,
.len = 90112,
.pgoff = 16384,{
.maj = 0,
.min = 33,
.ino = 35914923,
.ino_generation = 26870,
},{
.build_id_size = 0,
.__reserved_1 = 0,
.__reserved_2 = 0,
.build_id = { 33, 0, 0, 0, -85, 4, 36, 2, 0, 0, 0, 0, -10, 104, 0, 0, 0, 0, 0, 0 },
},
.prot = 5,
.flags = 2,
.filename = "/usr/bin/ls",
},
$"><pre class="notranslate"><code>$ pahole -F dwarf -V ~/bin/perf \
--header=perf_file_header \
--seek_bytes '$header.data.offset' \
--size_bytes='$header.data.size' \
-C 'perf_event_header(sizeof,type,type_enum=perf_event_type,filter=type==PERF_RECORD_MMAP2)' \
--prettify perf.data --count 1
pahole: sizeof_operator for 'perf_event_header' is 'size'
pahole: type member for 'perf_event_header' is 'type'
pahole: type enum for 'perf_event_header' is 'perf_event_type'
pahole: filter for 'perf_event_header' is 'type==PERF_RECORD_MMAP2'
pahole: seek bytes evaluated from --seek_bytes=$header.data.offset is 0x3f0
pahole: size bytes evaluated from --size_bytes=$header.data.size is 0xd10
// type=perf_event_header, offset=0xc20, sizeof=8, real_sizeof=112
{
.header = {
.type = PERF_RECORD_MMAP2,
.misc = 2,
.size = 112,
},
.pid = 1533617,
.tid = 1533617,
.start = 94667542700032,
.len = 90112,
.pgoff = 16384,{
.maj = 0,
.min = 33,
.ino = 35914923,
.ino_generation = 26870,
},{
.build_id_size = 0,
.__reserved_1 = 0,
.__reserved_2 = 0,
.build_id = { 33, 0, 0, 0, -85, 4, 36, 2, 0, 0, 0, 0, -10, 104, 0, 0, 0, 0, 0, 0 },
},
.prot = 5,
.flags = 2,
.filename = "/usr/bin/ls",
},
$
</code></pre></div>
</li>
</ul>
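<p>The hole reporting shown in the 'struct module' output above can be reproduced from member offsets and sizes alone. The sketch below is not pahole's implementation; it uses hypothetical names and takes the four (offset, size) pairs from that output as sample data.</p>

```c
#include <stddef.h>

struct member {
	size_t offset;	/* byte offset within the struct */
	size_t size;	/* size in bytes */
};

struct hole_stats {
	unsigned int holes;	/* number of gaps between members */
	size_t sum_holes;	/* total bytes lost to those gaps */
};

/* (offset, size) pairs for state, list, name and mkobj, from the output above. */
static const struct member module_head[] = {
	{ 0, 4 }, { 8, 16 }, { 24, 56 }, { 80, 96 },
};

/* Members must be sorted by offset; a hole is a gap before the next member. */
static struct hole_stats count_holes(const struct member *m, size_t n)
{
	struct hole_stats st = { 0, 0 };

	for (size_t i = 0; i + 1 < n; i++) {
		size_t end = m[i].offset + m[i].size;

		if (m[i + 1].offset > end) {
			st.holes++;
			st.sum_holes += m[i + 1].offset - end;
		}
	}
	return st;
}
```

<p>For the sample data this finds the single 4 byte hole after <code>state</code>, matching the "XXX 4 bytes hole, try to pack" comment.</p>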
<p>DWARF loader:</p>
<ul>
<li>
<p>Add support for DW_TAG_constant, first seen in Go DWARF.</p>
</li>
<li>
<p>Fix loading DW_TAG_subroutine_type generated by the Go compiler, where it may<br>
have a DW_AT_byte_size, and pretty print it as if<br>
it were from C. This helped in writing BPF programs that attach to Go binaries using<br>
uprobes.</p>
</li>
</ul>
<p>BTF loader:</p>
<ul>
<li>Fix loading of 32-bit signed enums.</li>
</ul>
<p>BTF encoder:</p>
<ul>
<li>
<p>Add 'pahole --btf_features' to allow consumers to specify an opt-in set of<br>
features they want to use in BTF encoding.</p>
<p>Supported features are a comma-separated combination of</p>
<div class="snippet-clipboard-content notranslate position-relative overflow-auto" data-snippet-clipboard-copy-content=" encode_force Ignore invalid symbols when encoding BTF.
var Encode variables using BTF_KIND_VAR in BTF.
float Encode floating-point types in BTF.
decl_tag Encode declaration tags using BTF_KIND_DECL_TAG.
type_tag Encode type tags using BTF_KIND_TYPE_TAG.
enum64 Encode enum64 values with BTF_KIND_ENUM64.
optimized_func Encode representations of optimized functions
with suffixes like ".isra.0" etc
consistent_func Avoid encoding inconsistent static functions.
These occur when a parameter is optimized out
in some CUs and not others, or when the same
function name has inconsistent BTF descriptions
in different CUs."><pre class="notranslate"><code> encode_force Ignore invalid symbols when encoding BTF.
var Encode variables using BTF_KIND_VAR in BTF.
float Encode floating-point types in BTF.
decl_tag Encode declaration tags using BTF_KIND_DECL_TAG.
type_tag Encode type tags using BTF_KIND_TYPE_TAG.
enum64 Encode enum64 values with BTF_KIND_ENUM64.
optimized_func Encode representations of optimized functions
with suffixes like ".isra.0" etc
consistent_func Avoid encoding inconsistent static functions.
These occur when a parameter is optimized out
in some CUs and not others, or when the same
function name has inconsistent BTF descriptions
in different CUs.
</code></pre></div>
<p>Specifying "--btf_features=all" is equivalent to setting all of the<br>
above. If pahole does not know about a feature specified in<br>
--btf_features it silently ignores it.</p>
<p>The --btf_features can either be specified via a single comma-separated<br>
list<br>
--btf_features=enum64,float</p>
<p>...or via multiple --btf_features values</p>
<div class="snippet-clipboard-content notranslate position-relative overflow-auto" data-snippet-clipboard-copy-content=" --btf_features=enum64 --btf_features=float"><pre class="notranslate"><code> --btf_features=enum64 --btf_features=float
</code></pre></div>
<p>These properties allow us to use the --btf_features option in the kernel<br>
scripts/pahole_flags.sh script to specify the desired set of BTF<br>
features.</p>
<p>If a feature named in --btf_features is not present in the version of<br>
pahole used, BTF encoding will not complain. This is desired because it<br>
means we no longer have to tie new features to a specific pahole<br>
version.</p>
<p>Use --btf_features_strict to change that behaviour and bail out if one of<br>
the requested features isn't present.</p>
<p>To see the supported features, use:</p>
<p>$ pahole --supported_btf_features<br>
encode_force,var,float,decl_tag,type_tag,enum64,optimized_func,consistent_func<br>
$</p>
</li>
</ul>
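<p>The --btf_features semantics described above (comma-separated names, repeated options that accumulate, "all", unknown names ignored unless strict) can be sketched as a small parser. This is an illustration with hypothetical function names, not pahole's actual code; the feature names are the ones listed in these notes.</p>

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

static const char *const known_features[] = {
	"encode_force", "var", "float", "decl_tag", "type_tag",
	"enum64", "optimized_func", "consistent_func",
};
#define NR_FEATURES (sizeof(known_features) / sizeof(known_features[0]))

/* Return the bit for a feature name of length len, or 0 if it is unknown. */
static unsigned int feature_bit(const char *name, size_t len)
{
	for (size_t i = 0; i < NR_FEATURES; i++)
		if (strlen(known_features[i]) == len &&
		    strncmp(known_features[i], name, len) == 0)
			return 1u << i;
	return 0;
}

/*
 * Parse a comma-separated list, OR-ing into *mask so that repeated options
 * accumulate. Unknown names are skipped, unless strict is set, in which
 * case -1 is returned.
 */
static int parse_features(const char *list, bool strict, unsigned int *mask)
{
	while (*list) {
		size_t len = strcspn(list, ",");

		if (len == 3 && strncmp(list, "all", 3) == 0) {
			*mask |= (1u << NR_FEATURES) - 1;
		} else if (len > 0) {
			unsigned int bit = feature_bit(list, len);

			if (bit)
				*mask |= bit;
			else if (strict)
				return -1;
		}
		list += len;
		if (*list == ',')
			list++;
	}
	return 0;
}
```

<p>Calling <code>parse_features()</code> once with "enum64,float" or twice with "enum64" and then "float" yields the same mask, which mirrors the two equivalent command line forms shown above.</p>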
<p>btfdiff:</p>
<ul>
<li>
<p>Parallelize loading BTF and DWARF, speeding it up a bit.</p>
</li>
<li>
<p>Do type expansion to cover "private" types and enumerations.</p>
</li>
</ul>
<p>Signed-off-by: Arnaldo Carvalho de Melo <a href="mailto:acme@redhat.com">acme@redhat.com</a></p>
<p>v1.24 (2022-08-22T23:29:04Z):</p>
<p>BTF encoder:</p>
<ul>
<li>
<p>Add support for BTF_KIND_ENUM64 to represent enumeration entries<br>
with more than 32 bits.</p>
</li>
<li>
<p>Support multithreaded encoding, in addition to DWARF<br>
multithreaded loading, speeding up the process.</p>
<p>Selected just like DWARF multithreaded loading, using the<br>
'pahole -j' option.</p>
</li>
<li>
<p>Encode 'char' type as signed.</p>
</li>
</ul>
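<p>BTF_KIND_ENUM64 stores each enumerator value as two 32-bit halves. The sketch below mirrors that value layout with a local struct so it stays self-contained; the helper names are hypothetical and only unsigned values are handled.</p>

```c
#include <stdbool.h>
#include <stdint.h>

/* Local mirror of how a 64-bit enumerator value is stored: two 32-bit halves. */
struct enum64_value {
	uint32_t val_lo32;
	uint32_t val_hi32;
};

/* An unsigned enumerator needs the 64-bit kind once it exceeds 32 bits. */
static bool needs_enum64(uint64_t value)
{
	return value > UINT32_MAX;
}

/* Split a 64-bit value into its low and high halves. */
static struct enum64_value split_enum64(uint64_t value)
{
	struct enum64_value v = { (uint32_t)value, (uint32_t)(value >> 32) };

	return v;
}

/* Reassemble the original value, as a loader would. */
static uint64_t join_enum64(struct enum64_value v)
{
	return ((uint64_t)v.val_hi32 << 32) | v.val_lo32;
}
```

<p>Splitting and joining are inverses, so a value that goes through the encoder and the loader comes back unchanged.</p>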
<p>BTF Loader:</p>
<ul>
<li>Add support for BTF_KIND_ENUM64.</li>
</ul>
<p>pahole:</p>
<ul>
<li>
<p>Introduce --lang and --lang_exclude to select or filter out DWARF compile<br>
units by the language they were produced from.</p>
<p>Use case is to exclude Rust compile units while aspects of the<br>
DWARF generated for it get sorted out in a way that the kernel<br>
BPF verifier doesn't refuse loading the BTF generated from them.</p>
</li>
<li>
<p>Introduce --compile to generate compilable code in a similar fashion to:</p>
<p>bpftool btf dump file vmlinux format c > vmlinux.h</p>
<p>As with 'bpftool', this will notice type shadowing, i.e. multiple types<br>
with the same name and will disambiguate by adding a suffix.</p>
</li>
<li>
<p>Don't segfault when processing bogus files.</p>
</li>
</ul>
<p>Signed-off-by: Arnaldo Carvalho de Melo <a href="mailto:acme@redhat.com">acme@redhat.com</a></p>
<p>v1.22 (2021-08-23T13:10:10Z):</p>
<p>pahole:</p>
<ul>
<li>
<p>Allow encoding BTF to a separate BTF file (detached) instead of to a new<br>
".BTF" ELF section in the file being encoded (vmlinux usually).</p>
</li>
<li>
<p>Introduce -j/--jobs option to specify the number of threads to use. Without<br>
arguments means one thread per CPU. So far used for the DWARF loader, will<br>
be used as well for the BTF encoder.</p>
</li>
<li>
<p>Show all different types with the same name, not just the first one found.</p>
</li>
<li>
<p>Introduce sorted type output (--sort), needed with multithreaded DWARF loading,<br>
to use with things like 'btfdiff' that expects the output from DWARF and BTF<br>
types to be comparable using 'diff'.</p>
</li>
<li>
<p>Stop assuming that reading from stdin means pretty printing, as this broke<br>
pre-existing scripts; introduce an explicit --prettify command line option.</p>
</li>
<li>
<p>Improve type resolution for the --header command line option.</p>
</li>
<li>
<p>Disable incomplete CTF encoder, this needs to be done using the external<br>
libctf library.</p>
</li>
<li>
<p>Do not consider the ftrace filter when encoding BTF for kernel functions.</p>
</li>
<li>
<p>Add --kabi_prefix to avoid deduplication woes when using _RH_KABI_REPLACE().</p>
</li>
<li>
<p>Add --with_flexible_array to show just types with flexible arrays.</p>
</li>
</ul>
<p>DWARF Loader:</p>
<ul>
<li>
<p>Multithreaded loading, requires elfutils >= 0.178.</p>
</li>
<li>
<p>Lock calls to non-thread safe elfutils' libdw functions (dwarf_decl_file()<br>
and dwarf_decl_line())</p>
</li>
<li>
<p>Change hash table size to one that performs better with current typical<br>
vmlinux files.</p>
</li>
<li>
<p>Allow tweaking the hash table size from the command line.</p>
</li>
<li>
<p>Stop allocating memory for strings obtained from libdw, just defer freeing<br>
the Dwfl handler so that references to its strings can be safely kept.</p>
</li>
<li>
<p>Use a frontend cache for the latest lookup result.</p>
</li>
<li>
<p>Allow ignoring some DWARF tags when loading for encoding BTF, as BTF doesn't<br>
have equivalents for things like DW_TAG_inline_expansion and DW_TAG_label.</p>
</li>
<li>
<p>Allow ignoring some DWARF tag attributes, such as DW_AT_alignment, not used<br>
when encoding BTF.</p>
</li>
<li>
<p>Do not query for non-C attributes when loading a C language CU (compilation unit).</p>
</li>
</ul>
<p>BTF encoder:</p>
<ul>
<li>Preparatory work for multithreaded encoding, the focus for 1.23.</li>
</ul>
<p>btfdiff:</p>
<ul>
<li>
<p>Support diffing against a detached BTF file, e.g.: 'btfdiff vmlinux vmlinux.btf'</p>
</li>
<li>
<p>Support multithreaded DWARF loading, using the new pahole --sort option to have<br>
the output from both BTF and DWARF sorted and thus comparable via 'diff'.</p>
</li>
</ul>
<p>Build:</p>
<ul>
<li>
<p>Support building with libc libraries lacking either obstacks or argp, such<br>
as Alpine Linux's musl libc.</p>
</li>
<li>
<p>Support systems without getconf() to obtain the data cacheline size, such<br>
as musl libc.</p>
</li>
<li>
<p>Add a buildcmd.sh for test builds, tested using the same set of containers<br>
used for testing the Linux kernel perf tools.</p>
</li>
<li>
<p>Enable selecting building with a shared libdwarves library or statically.</p>
</li>
<li>
<p>Allow using the libbpf package found in distributions instead of the<br>
accompanying libbpf git submodule.</p>
</li>
</ul>
<p>Cleanups:</p>
<ul>
<li>
<p>Address lots of compiler warnings accumulated by not using -Wextra, it'll<br>
be added in the next release after allowing not to use it to build libbpf.</p>
</li>
<li>
<p>Address covscan report issues.</p>
</li>
</ul>
<p>Documentation:</p>
<ul>
<li>
<p>Improve the --nr_methods/-m pahole man page entry.</p>
</li>
<li>
<p>Clarify that currently --nr_methods doesn't work together with -C.</p>
</li>
</ul>
<p>Signed-off-by: Arnaldo Carvalho de Melo <a href="mailto:acme@redhat.com">acme@redhat.com</a></p>
<p>v1.21 (2021-04-12T14:53:29Z):</p>
<p>DWARF loader:</p>
<ul>
<li>
<p>Handle DWARF5 DW_OP_addrx properly</p>
<p>Part of the effort to support the subset of DWARF5 that is generated when building the kernel.</p>
</li>
<li>
<p>Handle subprogram ret type with abstract_origin properly</p>
<p>Adds a second pass to resolve abstract origin DWARF description of functions to aid<br>
the BTF encoder in getting the right return type.</p>
</li>
<li>
<p>Check .notes section for LTO build info</p>
<p>When LTO is used, currently only with clang, we need to do extra steps to handle references<br>
from one object (compile unit, aka CU) to another, a way for DWARF to avoid duplicating<br>
information.</p>
</li>
<li>
<p>Check .debug_abbrev for cross-CU references</p>
<p>When the kernel build process doesn't add an ELF note in vmlinux indicating that LTO was<br>
used, we need to look at DWARF's .debug_abbrev ELF section to figure out if cross-CU<br>
references are present, since they require a more expensive way to resolve types and thus<br>
to encode BTF.</p>
</li>
<li>
<p>Permit merging all DWARF CU's for clang LTO built binary</p>
<p>Allow not throwing away previously processed, supposedly self-contained compile units<br>
(objects, aka CUs), as they have type descriptions that will<br>
be used in later CUs.</p>
</li>
<li>
<p>Permit a flexible HASHTAGS__BITS</p>
<p>So that we can use a more expensive algorithm when we need to keep previously processed<br>
compile units that will then be referenced by later ones to resolve types.</p>
</li>
<li>
<p>Use a better hashing function, from libbpf</p>
<p>Enabling patch to combine compile units when using LTO.</p>
</li>
</ul>
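<p>The flexible hash table size and the better hashing function mentioned above can be pictured with a minimal sketch of multiplicative hashing, where the number of bucket bits is a runtime parameter. The function name and the choice of constant are illustrative assumptions, not pahole's or libbpf's actual code.</p>

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: map a 64-bit key (e.g. a DWARF offset) to a bucket
 * index that is 'bits' wide. The key is multiplied by 2^64 divided by the
 * golden ratio and the upper 'bits' bits of the product are kept, so the
 * table size can be picked at runtime instead of being a compile-time
 * constant.
 */
static inline uint32_t hash_to_bucket(uint64_t key, unsigned int bits)
{
	assert(bits > 0 && bits <= 32);
	return (uint32_t)((key * UINT64_C(11400714819323198485)) >> (64 - bits));
}
```

<p>A caller that has to keep all compile units around could pass a larger <code>bits</code> value than one that discards each unit after processing it.</p>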
<p>BTF encoder:</p>
<ul>
<li>
<p>Add --btf_gen_all flag</p>
<p>A new command line option that asks for the generation of all BTF encodings, so that we<br>
can stop adding new command line options to enable new encodings in the kernel Makefile.</p>
</li>
<li>
<p>Match ftrace addresses within ELF functions</p>
<p>To cope with differences in how DWARF and ftrace describe function boundaries.</p>
</li>
<li>
<p>Funnel ELF error reporting through a macro</p>
<p>To use libelf's elf_error() function, improving error messages.</p>
</li>
<li>
<p>Sanitize non-regular int base type</p>
<p>Cope with clang's DWARF5 non-regular int base types. Tricky stuff, see yhs's<br>
full explanation in the relevant cset.</p>
</li>
<li>
<p>Add support for the floating-point types</p>
<p>S/390 has floats'n'doubles in its arch specific linux headers, cope with that.</p>
</li>
</ul>
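<p>Matching ftrace addresses within ELF functions, as described above, amounts to a range check rather than an exact comparison with the function's start address. The following is a minimal sketch under that assumption. The struct and function names are hypothetical and the sorted address array stands in for whatever pahole actually reads from the kernel image.</p>

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of a function taken from the ELF symbol table. */
struct func_range {
	uint64_t start; /* st_value */
	uint64_t size;  /* st_size  */
};

/* True if addr falls inside [start, start + size). */
static bool addr_in_func(const struct func_range *f, uint64_t addr)
{
	return addr >= f->start && addr - f->start < f->size;
}

/*
 * addrs must be sorted in ascending order. Binary search for the first
 * address that is not below the function start, then check that it is
 * still inside the function.
 */
static bool func_has_ftrace_addr(const struct func_range *f,
				 const uint64_t *addrs, size_t nr)
{
	size_t lo = 0, hi = nr;

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (addrs[mid] < f->start)
			lo = mid + 1;
		else
			hi = mid;
	}
	return lo < nr && addr_in_func(f, addrs[lo]);
}
```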
<p>Pretty printer:</p>
<ul>
<li>
<p>Honour conf_fprintf.hex when printing enumerations</p>
<p>If the user specifies --hex in the command line, honour it when printing enumerations.</p>
</li>
</ul>
<p>Signed-off-by: Arnaldo Carvalho de Melo <a href="mailto:acme@redhat.com">acme@redhat.com</a></p>acmeltag:github.com,2008:Repository/2717477/v1.202021-02-04T21:34:13Zv1.20:<p>BTF encoder:</p>
<ul>
<li>
<p>Improve ELF error reporting using elf_errmsg(elf_errno()).</p>
</li>
<li>
<p>Improve objcopy error handling.</p>
</li>
<li>
<p>Fix handling of 'restrict' qualifier, that was being treated as a 'const'.</p>
</li>
<li>
<p>Support SHN_XINDEX in st_shndx symbol indexes, to handle ELF objects with<br>
more than 65534 sections, which happens, for instance, with kernels built<br>
with 'KCFLAGS="-ffunction-sections -fdata-sections"'. Other cases may<br>
include using FG-ASLR or LTO.</p>
</li>
<li>
<p>Cope with functions without a name, as seen sometimes when building kernel<br>
images with some versions of clang, which was causing a SEGFAULT.</p>
</li>
<li>
<p>Fix BTF variable generation for kernel modules, not skipping variables at<br>
offset zero.</p>
</li>
<li>
<p>Fix address size to match what is in the ELF file being processed, to fix using<br>
a 64-bit pahole binary to generate BTF for a 32-bit vmlinux image.</p>
</li>
<li>
<p>Use kernel module ftrace addresses when finding which functions to encode,<br>
which increases the number of functions encoded.</p>
</li>
</ul>
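<p>The SHN_XINDEX item above relies on a standard ELF mechanism: when a symbol's 16-bit st_shndx field holds the escape value SHN_XINDEX, the real section index is stored in a separate table of 32-bit entries (the SHT_SYMTAB_SHNDX section), indexed by symbol number. A minimal sketch of that lookup follows. The function name and the way the table is passed in are assumptions for illustration, not pahole's code.</p>

```c
#include <stddef.h>
#include <stdint.h>

/* Normally provided by <elf.h>; defined here to keep the sketch standalone. */
#ifndef SHN_XINDEX
#define SHN_XINDEX 0xffff
#endif

/*
 * Hypothetical helper: return the section index for symbol number sym_idx.
 * xindex_table is the content of the extended section index table, or NULL
 * if the object has none.
 */
static uint32_t sym_section_index(uint16_t st_shndx,
				  const uint32_t *xindex_table,
				  size_t sym_idx)
{
	if (st_shndx == SHN_XINDEX && xindex_table != NULL)
		return xindex_table[sym_idx];
	return st_shndx;
}
```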
<p>libbpf:</p>
<ul>
<li>Allow use of packaged version, for distros wanting to dynamically link with<br>
the system's libbpf package instead of using the libbpf git submodule shipped<br>
in pahole's source code.</li>
</ul>
<p>DWARF loader:</p>
<ul>
<li>
<p>Support DW_AT_data_bit_offset</p>
<p>This appeared in DWARF4 but gcc supports it only with -gdwarf-5.<br>
Support it in a way that makes the output the same in both cases.</p>
<p>$ gcc -gdwarf-5 -c examples/dwarf5/bf.c<br>
$ pahole bf.o<br>
struct pea {<br>
long int a:1; /* 0: 0 8 */<br>
long int b:1; /* 0: 1 8 */<br>
long int c:1; /* 0: 2 8 */</p>
<div class="snippet-clipboard-content notranslate position-relative overflow-auto" data-snippet-clipboard-copy-content=" /* XXX 29 bits hole, try to pack */
/* Bitfield combined with next fields */
int after_bitfield; /* 4 4 */
/* size: 8, cachelines: 1, members: 4 */
/* sum members: 4 */
/* sum bitfield members: 3 bits, bit holes: 1, sum bit holes: 29 bits */
/* last cacheline: 8 bytes */"><pre class="notranslate"><code> /* XXX 29 bits hole, try to pack */
/* Bitfield combined with next fields */
int after_bitfield; /* 4 4 */
/* size: 8, cachelines: 1, members: 4 */
/* sum members: 4 */
/* sum bitfield members: 3 bits, bit holes: 1, sum bit holes: 29 bits */
/* last cacheline: 8 bytes */
</code></pre></div>
<p>};</p>
</li>
<li>
<p>DW_FORM_implicit_const in attr_numeric() and attr_offset()</p>
</li>
<li>
<p>Support DW_TAG_call_site, the standardized rename of the previously supported<br>
DW_TAG_GNU_call_site.</p>
</li>
</ul>
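<p>To illustrate the DW_AT_data_bit_offset item above: the attribute gives a bitfield's position as a bit count from the start of the containing struct, while the pahole output shows a byte offset plus a bit offset within the storage unit. A minimal sketch of one way to convert between the two follows. It assumes the storage unit is aligned to its own size, and the names are hypothetical rather than taken from pahole.</p>

```c
#include <stdint.h>

/* Hypothetical result type: where a bitfield lives inside its struct. */
struct bitfield_pos {
	uint32_t byte_offset; /* offset of the storage unit, in bytes   */
	uint32_t bit_offset;  /* offset inside the storage unit, in bits */
};

/*
 * Split a DW_AT_data_bit_offset value into the byte offset of the storage
 * unit and the bit offset within it. storage_bytes is the size of the
 * bitfield's base type (8 for 'long int' on x86_64).
 */
static struct bitfield_pos from_data_bit_offset(uint64_t data_bit_offset,
						uint32_t storage_bytes)
{
	uint64_t storage_bits = (uint64_t)storage_bytes * 8;
	struct bitfield_pos pos;

	pos.byte_offset = (uint32_t)(data_bit_offset / storage_bits) * storage_bytes;
	pos.bit_offset = (uint32_t)(data_bit_offset % storage_bits);
	return pos;
}
```

<p>For the struct pea example, bit offsets 0, 1 and 2 with an 8-byte storage unit map to byte offset 0 and bit offsets 0, 1 and 2, which is what the "0: 0", "0: 1" and "0: 2" columns show.</p>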
<p>build:</p>
<ul>
<li>Fix compilation on 32-bit architectures.</li>
</ul>
<p>Signed-off-by: Arnaldo Carvalho de Melo <a href="mailto:acme@redhat.com">acme@redhat.com</a></p>acmeltag:github.com,2008:Repository/2717477/v1.192020-11-23T19:21:26Zv1.19:<ul>
<li>
<p>Support split BTF, where a main BTF file, vmlinux, can be used to find types<br>
and then a kernel module, for instance, can have just what is unique to it.</p>
<p>For instance, looking for a type in the main vmlinux BTF info:</p>
<p>$ pahole wmi_notify_handler<br>
pahole: type 'wmi_notify_handler' not found<br>
$</p>
<p>If we look at the 'wmi' module BTF info that is in:</p>
<p>$ ls -la /sys/kernel/btf/wmi<br>
-r--r--r--. 1 root root 2866 Nov 18 13:35 /sys/kernel/btf/wmi<br>
$</p>
<p>$ pahole /sys/kernel/btf/wmi -C wmi_notify_handler<br>
typedef void (*wmi_notify_handler)(u32, void *);<br>
$</p>
<p>'--btf_base=/sys/kernel/btf/vmlinux' was automatically added in this last<br>
example. This option, also introduced in this version, lets types used in<br>
the wmi.ko module but present in vmlinux be found there, so that types are<br>
not duplicated.</p>
</li>
<li>
<p>Update libbpf to get the split BTF support and use some of its functions to<br>
load BTF and speed up DWARF loading and BTF encoding.</p>
</li>
<li>
<p>Support cross-compiled ELF binaries with different endianness</p>
</li>
<li>
<p>Support showing typedefs for anonymous types, like structs, unions and enums.<br>
See the "Align enumerators" entry below for one example. Another:</p>
<p>$ pahole rwlock_t<br>
typedef struct {<br>
arch_rwlock_t raw_lock; /* 0 8 */</p>
<div class="snippet-clipboard-content notranslate position-relative overflow-auto" data-snippet-clipboard-copy-content=" /* size: 8, cachelines: 1, members: 1 */
/* last cacheline: 8 bytes */"><pre class="notranslate"><code> /* size: 8, cachelines: 1, members: 1 */
/* last cacheline: 8 bytes */
</code></pre></div>
<p>} rwlock_t;<br>
$</p>
</li>
<li>
<p>Align enumerators:</p>
<p>$ pahole ZSTD_strategy<br>
typedef enum {<br>
ZSTD_fast = 0,<br>
ZSTD_dfast = 1,<br>
ZSTD_greedy = 2,<br>
ZSTD_lazy = 3,<br>
ZSTD_lazy2 = 4,<br>
ZSTD_btlazy2 = 5,<br>
ZSTD_btopt = 6,<br>
ZSTD_btopt2 = 7,<br>
} ZSTD_strategy;<br>
$</p>
</li>
<li>
<p>Work around bugs in the generation of DWARF records for functions in some gcc<br>
versions that were causing breakage in the encoding of BTF:</p>
<p><a href="https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97060" rel="nofollow">https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97060</a> "Missing DW_AT_declaration=1 in dwarf data"</p>
</li>
<li>
<p>Ignore zero-sized ELF symbols instead of erroring out.</p>
</li>
<li>
<p>Handle union forward declaration properly in the BTF loader.</p>
</li>
<li>
<p>Introduce --numeric_version for use in scripts and Makefiles:</p>
<p>$ pahole --version<br>
v1.19<br>
$ pahole --numeric_version<br>
119<br>
$</p>
<p>To avoid things like this in the kernel's scripts/link-vmlinux.sh:</p>
<p>pahole_ver=$(${PAHOLE} --version | sed -E 's/v([0-9]+)\.([0-9]+)/\1\2/')</p>
</li>
<li>
<p>Try sole pfunct argument as a function name, just like pahole with type names:</p>
<p>$ pfunct tcp_v4_rcv<br>
int tcp_v4_rcv(struct sk_buff * skb);<br>
$</p>
</li>
<li>
<p>Speed up pfunct using some of the load techniques used in pahole.</p>
</li>
<li>
<p>Discard CUs after BTF encoding as they're not used anymore, greatly reducing<br>
memory usage and speeding up vmlinux BTF encoding.</p>
</li>
<li>
<p>Revamp how per-CPU variables are encoded in BTF.</p>
</li>
<li>
<p>Include BTF info for static functions.</p>
</li>
<li>
<p>Use BTF's string APIs for strings management, greatly improving performance<br>
over the tsearch()-based approach.</p>
</li>
<li>
<p>Increase size of DWARF lookup hash table, shaving off about 1 second out of<br>
about 20 seconds total for Linux BTF dedup.</p>
</li>
<li>
<p>Stop BTF encoding when errors are found in some DWARF CU.</p>
</li>
<li>
<p>Implement --packed, to show just packed structures, for instance, here are<br>
the top 5 packed data structures in the Linux kernel:</p>
<p>$ pahole --sizes --packed | sort -k2 -nr | head -5<br>
e820_table 64004 0<br>
boot_params 4096 0<br>
efi_variable 2084 0<br>
snd_soc_tplg_pcm 912 0<br>
ntb_info_regs 800 0<br>
$</p>
<p>And here is one of them:</p>
<p>$ pahole efi_variable<br>
struct efi_variable {<br>
efi_char16_t VariableName[512]; /* 0 1024 */<br>
/* --- cacheline 16 boundary (1024 bytes) --- */<br>
efi_guid_t VendorGuid; /* 1024 16 */<br>
long unsigned int DataSize; /* 1040 8 */<br>
__u8 Data[1024]; /* 1048 1024 */<br>
/* --- cacheline 32 boundary (2048 bytes) was 24 bytes ago --- */<br>
efi_status_t Status; /* 2072 8 */<br>
__u32 Attributes; /* 2080 4 */</p>
<p>/* size: 2084, cachelines: 33, members: 6 */<br>
/* last cacheline: 36 bytes */<br>
} __attribute__((__packed__));<br>
$</p>
</li>
<li>
<p>Fix bug in distros such as OpenSUSE:15.2 where DW_AT_alignment isn't defined.</p>
</li>
</ul>
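<p>The relation between the --version and --numeric_version outputs shown above (v1.19 and 119) can be sketched as follows. This is an illustration only: the helper name is hypothetical, and the major * 100 + minor formula is an assumption that agrees with the example output when the minor number has two digits.</p>

```c
#include <stdio.h>

/*
 * Hypothetical helper: turn a "vMAJOR.MINOR" string into a single integer
 * that is easy to compare in scripts and Makefiles. Returns -1 if the
 * string does not have the expected shape.
 */
static int numeric_version(const char *version)
{
	unsigned int major, minor;

	if (sscanf(version, "v%u.%u", &major, &minor) != 2)
		return -1;
	return (int)(major * 100 + minor);
}
```

<p>With a plain integer, a build script can test for a minimum version with a single numeric comparison instead of parsing the version string itself.</p>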
<p>Signed-off-by: Arnaldo Carvalho de Melo <a href="mailto:acme@redhat.com">acme@redhat.com</a></p>acmel