tag:github.com,2008:https://github.com/acmel/dwarves/releases Release notes from dwarves 2025-04-18T15:25:09Z tag:github.com,2008:Repository/2717477/v1.30 2025-04-18T15:25:09Z v1.30: CI testing: <ul> <li>support for GitHub CI tests to build pahole with gcc<br> and LLVM.</li> <li>support for GitHub CI tests to build pahole, a kernel<br> along with BTF using that pahole and run tests.</li> <li>tests can also be run standalone; see toplevel README<br> for details.</li> </ul> <p>DWARF loader:</p> <ul> <li>better detection of abort during thread processing.</li> </ul> <p>BTF encoder:</p> <ul> <li> <p>pahole now uses an improved scheme to detect the presence of<br> newer libbpf functions for cases where pahole is built with<br> a non-embedded libbpf. A local weak declaration is added,<br> and if the function is non-NULL - indicating it is present -<br> the associated feature is available. BTF feature detection<br> makes use of this now and BTF features declared in pahole<br> can provide a feature check function.</p> </li> <li> <p>Type tags are now emitted for bpf_arena pointers if the<br> attributes btf_feature is specified.</p> </li> <li> <p>kfunc tagging has been refactored into btf_encoder__collect_kfuncs,<br> simplifying the previous two-stage collect/tag process.</p> </li> <li> <p>To support global variables other than per-CPU variables, code<br> was added to match a variable with the relevant section. 
However,<br> variables in to-be-discarded sections have address value 0 and<br> appeared to be in the per-CPU section (since it starts at 0).<br> Checks were added to ensure the variable really is in the relevant<br> ELF section.</p> </li> <li> <p>To avoid expensive variable address checking in the above case,<br> filter out variables prefixed by __gendwarfksyms_ptr_, which are<br> present when CONFIG_GENDWARFKSYMS is set.</p> </li> <li> <p>Memory access bugs reported by address sanitizer were also fixed.</p> </li> </ul> <p>Signed-off-by: Alan Maguire <a href="mailto:alan.maguire@oracle.com">alan.maguire@oracle.com</a><br> Signed-off-by: Arnaldo Carvalho de Melo <a href="mailto:acme@redhat.com">acme@redhat.com</a></p> acmel tag:github.com,2008:Repository/2717477/v1.29 2025-01-21T15:01:32Z v1.29: DWARF loader: <ul> <li>Multithreading is now contained in the DWARF loader using a jobs queue and a<br> pool of worker threads.</li> </ul> <p>BTF encoder:</p> <ul> <li> <p>The parallel reproducible BTF generation done using the new DWARF loader<br> multithreading model is as fast as the old non-reproducible one and thus is<br> now always performed, making the "reproducible_build" flag moot.</p> <p>The memory consumption is now greatly reduced as well.</p> </li> </ul> <p>BTF loader:</p> <ul> <li> <p>Support for multiple BTF_DECL_TAGs pointing to the same tag.</p> <p>Example:</p> <div class="snippet-clipboard-content notranslate position-relative overflow-auto" data-snippet-clipboard-copy-content="$ pfunct vmlinux -F btf -f bpf_rdonly_cast bpf_kfunc bpf_fastcall void *bpf_rdonly_cast(const void *obj__ign, u32 btf_id__k); $"><pre class="notranslate"><code>$ pfunct vmlinux -F btf -f bpf_rdonly_cast bpf_kfunc bpf_fastcall void *bpf_rdonly_cast(const void *obj__ign, u32 btf_id__k); $ </code></pre></div> </li> </ul> <p>Regression tests:</p> <ul> <li>Verify that pfunct prints btf_decl_tags read from BTF.</li> </ul> <p>pfunct:</p> <ul> <li>Don't print functions twice when using -f.</li> 
</ul> <p>Signed-off-by: Arnaldo Carvalho de Melo <a href="mailto:acme@redhat.com">acme@redhat.com</a></p> acmel tag:github.com,2008:Repository/2717477/v1.28 2024-12-07T14:06:35Z v1.28: pahole: <ul> <li> <p>Various improvements to reduce the memory footprint of pahole, notably when<br> doing BTF encoding.</p> </li> <li> <p>Show flexible array statistics, detecting them at the end of member types,<br> in the middle, etc. This should help with the efforts to spot problematic<br> usage of flexible arrays in the kernel sources, for example:</p> <p><a href="https://git.kernel.org/pub/scm/devel/pahole/pahole.git/commit/?id=6ab5318f536927cb" rel="nofollow">https://git.kernel.org/pub/scm/devel/pahole/pahole.git/commit/?id=6ab5318f536927cb</a></p> </li> <li> <p>Introduce --with_embedded_flexible_array option.</p> </li> <li> <p>Add '--padding N' to show only structs with N bytes of padding.</p> </li> <li> <p>Add '--padding_ge N' to show only structs with at least N bytes of padding.</p> </li> <li> <p>Introduce --running_kernel_vmlinux to find a vmlinux that matches the<br> build-id of the running kernel, e.g.:</p> <p>$ pahole --running_kernel_vmlinux<br> /usr/lib/debug/lib/modules/6.11.7-200.fc40.x86_64/vmlinux<br> $ rpm -qf /usr/lib/debug/lib/modules/6.11.7-200.fc40.x86_64/vmlinux<br> kernel-debuginfo-6.11.7-200.fc40.x86_64<br> $</p> <p>This is a shortcut to find the right vmlinux to use for the running kernel<br> and helps with regression tests.</p> </li> </ul> <p>pfunct:</p> <ul> <li>Don't stop at the first function that matches a filter, show all of them.</li> </ul> <p>BTF Encoder:</p> <ul> <li> <p>Allow encoding data about all global variables, not just per-CPU ones.</p> <p>There are several reasons why type information for all global variables is<br> useful in the kernel, including drgn without DWARF and __ksym BPF programs return<br> type.</p> <p>This is not enabled by default; experiment with it using 'pahole --btf_features=+global_var'.</p> </li> <li> <p>Handle .BTF_ids section 
endianness, allowing for cross builds involving<br> machines with different endianness to work.</p> <p>For instance, encoding BTF info for an s390 vmlinux file on an x86_64 workstation.</p> </li> <li> <p>Generate decl tags for bpf_fastcall for eligible kfuncs.</p> </li> <li> <p>Add "distilled_base" BTF feature to split BTF generation.</p> </li> <li> <p>Use the ELF_C_READ_MMAP mode with libelf, reducing peak memory utilization.</p> </li> </ul> <p>BTF Loader:</p> <ul> <li>Allow overriding /sys/kernel/btf/vmlinux with some other file, for testing,<br> via the PAHOLE_VMLINUX_BTF_FILENAME environment variable.</li> </ul> <p>DWARF loader:</p> <ul> <li> <p>Allow setting the list of languages whose compile units should be skipped via<br> the PAHOLE_LANG_EXCLUDE environment variable.</p> </li> <li> <p>Serialize access to elfutils dwarf_getlocation() to avoid elfutils internal<br> data structure corruption when running multithreaded pahole.</p> </li> <li> <p>Honour --lang_exclude when merging LTO built CUs.</p> </li> <li> <p>Add the debuginfod client cache directory to the vmlinux search path.</p> </li> <li> <p>Print the CU's language when a tag isn't supported.</p> </li> <li> <p>Initial support for the DW_TAG_GNU_formal_parameter_pack,<br> DW_TAG_GNU_template_parameter_pack, DW_TAG_template_value_param and<br> DW_TAG_template_type_param DWARF tags.</p> </li> <li> <p>Improve the parameter parsing by checking DW_OP_[GNU_]entry_value, which<br> makes more functions eligible for BTF encoding, for instance<br> perf_event_read() in the 6.11 kernel.</p> </li> </ul> <p>Core:</p> <ul> <li>Use pahole to help in reorganizing its data structures to reduce its memory<br> footprint.</li> </ul> <p>Regression tests:</p> <ul> <li> <p>Introduce a tests/ directory for adding regression tests, run it with:</p> <p>$ tests/tests</p> <p>Or run the individual tests directly.</p> </li> <li> <p>Add a regression test for the reproducible build feature that establishes<br> as a baseline 
a detached BTF file without asking for a reproducible build and<br> then compares the output of 'bpftool btf dump file' for this file with the one<br> from BTF reproducible build encodings done with a growing number of threads.</p> </li> <li> <p>Add a regression test for the flexible array features, checking if the various<br> comments about flexible arrays match the statistics at the end of the pahole<br> pretty print output.</p> </li> <li> <p>Add a test that checks if pahole fails when running on a BTF system and BTF was<br> requested, previously it was falling back to DWARF silently.</p> </li> <li> <p>Add a test validating BTF encoding and the reasons we skip functions: DWARF functions<br> that made it into BTF have matching signatures, functions we say we skipped were<br> indeed skipped in BTF encoding, and it was correct to skip these<br> functions.</p> </li> <li> <p>Add regression test for 'pahole --prettify' that uses perf to record a simple<br> workload and then pretty prints the resulting perf.data file to check that what<br> is produced are the expected records for such a file.</p> </li> </ul> <p>Link: <a href="https://lore.kernel.org/all/Z0jVLcpgyENlGg6E@x1/" rel="nofollow">https://lore.kernel.org/all/Z0jVLcpgyENlGg6E@x1/</a><br> Tested-by: Alan Maguire <a href="mailto:alan.maguire@oracle.com">alan.maguire@oracle.com</a><br> Tested-by: Jiri Olsa <a href="mailto:jolsa@kernel.org">jolsa@kernel.org</a><br> Signed-off-by: Arnaldo Carvalho de Melo <a href="mailto:acme@redhat.com">acme@redhat.com</a></p> acmel tag:github.com,2008:Repository/2717477/v1.27 2024-06-11T19:33:03Z v1.27: BTF encoder: <ul> <li> <p>Inject kfunc decl tags into BTF from the BTF IDs ELF section in the Linux<br> kernel vmlinux file.</p> <p>This allows tools such as bpftool and pfunct to enumerate the available kfuncs<br> and to get their function signatures, that is, their return types and the types of their<br> arguments. 
See the example in the BTF loader changes description, below.</p> </li> <li> <p>Support parallel reproducible builds, where it doesn't matter how many<br> threads are used, the end BTF encoding result is the same.</p> </li> <li> <p>Sanitize unsupported DWARF int type with greater-than-16 byte, as BTF doesn't<br> support it.</p> </li> </ul> <p>BTF loader:</p> <ul> <li> <p>Initial support for BTF_KIND_DECL_TAG:</p> <p>$ pfunct --prototypes -F btf vmlinux.btf.decl_tag,decl_tag_kfuncs | grep ^bpf_kfunc | head<br> bpf_kfunc void cubictcp_init(struct sock * sk);<br> bpf_kfunc void cubictcp_cwnd_event(struct sock * sk, enum tcp_ca_event event);<br> bpf_kfunc void cubictcp_cong_avoid(struct sock * sk, u32 ack, u32 acked);<br> bpf_kfunc u32 cubictcp_recalc_ssthresh(struct sock * sk);<br> bpf_kfunc void cubictcp_state(struct sock * sk, u8 new_state);<br> bpf_kfunc void cubictcp_acked(struct sock * sk, const struct ack_sample * sample);<br> bpf_kfunc int bpf_iter_css_new(struct bpf_iter_css * it, struct cgroup_subsys_state * start, unsigned int flags);<br> bpf_kfunc struct cgroup_subsys_state * bpf_iter_css_next(struct bpf_iter_css * it);<br> bpf_kfunc void bpf_iter_css_destroy(struct bpf_iter_css * it);<br> bpf_kfunc s64 bpf_map_sum_elem_count(const struct bpf_map * map);<br> $ pfunct --prototypes -F btf vmlinux.btf.decl_tag,decl_tag_kfuncs | grep ^bpf_kfunc | wc -l<br> 116<br> $</p> </li> </ul> <p>pretty printing:</p> <ul> <li>Fix hole discovery with inheritance in C++.</li> </ul> <p>Tested-by: Alan Maguire <a href="mailto:alan.maguire@oracle.com">alan.maguire@oracle.com</a><br> Tested-by: Daniel Xu <a href="mailto:dxu@dxuuu.xyz">dxu@dxuuu.xyz</a><br> Tested-by: Jiri Olsa <a href="mailto:olsajiri@gmail.com">olsajiri@gmail.com</a><br> Link: <a href="https://lore.kernel.org/all/ZmIXxgbfIJGWmXer@x1/T/#u" rel="nofollow">https://lore.kernel.org/all/ZmIXxgbfIJGWmXer@x1/T/#u</a><br> Signed-off-by: Arnaldo Carvalho de Melo <a href="mailto:acme@redhat.com">acme@redhat.com</a></p> 
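<p>As a rough illustration of consuming the 'pfunct --prototypes' output shown above, the Python sketch below pulls the function names out of prototype lines tagged with bpf_kfunc. It is not part of pahole: the helper name kfunc_names and the regular expression are assumptions made for this example, and the sample lines are copied from the output above plus one made-up untagged prototype.</p>

```python
import re

# Sample prototypes: the first three are copied from the pfunct output above,
# the last one is a made-up, untagged function used to show the filtering.
SAMPLE = """\
bpf_kfunc void cubictcp_init(struct sock * sk);
bpf_kfunc u32 cubictcp_recalc_ssthresh(struct sock * sk);
bpf_kfunc struct cgroup_subsys_state * bpf_iter_css_next(struct bpf_iter_css * it);
void not_a_kfunc(int x);
"""

# Assumption: the function name is the first identifier that is directly
# followed by an opening parenthesis. Good enough for simple prototypes.
_NAME_BEFORE_PAREN = re.compile(r"([A-Za-z_]\w*)\s*\(")


def kfunc_names(prototypes):
    """Return the names of functions whose prototype line starts with bpf_kfunc."""
    names = []
    for line in prototypes.splitlines():
        if not line.startswith("bpf_kfunc "):
            continue  # not tagged as a kfunc
        match = _NAME_BEFORE_PAREN.search(line)
        if match:
            names.append(match.group(1))
    return names


if __name__ == "__main__":
    for name in kfunc_names(SAMPLE):
        print(name)
```

<p>In practice the input would come from piping pfunct into such a script; prototypes that return function pointers would need a more careful parser than this one.</p>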
acmel tag:github.com,2008:Repository/2717477/v1.26 2024-02-28T19:25:31Z v1.26: pahole: <ul> <li> <p>When expanding types using 'pahole -E' do it for union and struct typedefs and for enums too.</p> <p>E.g: that 'state' field in 'struct module':</p> <p>$ pahole module | head<br> struct module {<br> enum module_state state; /* 0 4 */</p> <div class="snippet-clipboard-content notranslate position-relative overflow-auto" data-snippet-clipboard-copy-content=" /* XXX 4 bytes hole, try to pack */ struct list_head list; /* 8 16 */ char name[56]; /* 24 56 */ /* --- cacheline 1 boundary (64 bytes) was 16 bytes ago --- */ struct module_kobject mkobj; /* 80 96 */ /* --- cacheline 2 boundary (128 bytes) was 48 bytes ago --- */"><pre class="notranslate"><code> /* XXX 4 bytes hole, try to pack */ struct list_head list; /* 8 16 */ char name[56]; /* 24 56 */ /* --- cacheline 1 boundary (64 bytes) was 16 bytes ago --- */ struct module_kobject mkobj; /* 80 96 */ /* --- cacheline 2 boundary (128 bytes) was 48 bytes ago --- */ </code></pre></div> <p>$</p> <p>now gets expanded:</p> <p>$ pahole -E module | head<br> struct module {<br> enum module_state {<br> MODULE_STATE_LIVE = 0,<br> MODULE_STATE_COMING = 1,<br> MODULE_STATE_GOING = 2,<br> MODULE_STATE_UNFORMED = 3,<br> } state; /* 0 4 */</p> <div class="snippet-clipboard-content notranslate position-relative overflow-auto" data-snippet-clipboard-copy-content=" /* XXX 4 bytes hole, try to pack */"><pre class="notranslate"><code> /* XXX 4 bytes hole, try to pack */ </code></pre></div> <p>$</p> </li> <li> <p>Print number of holes, bit holes and bit paddings in class member types.</p> <p>Doing this recursively to show how much waste a complex data structure has<br> is something that still needs to be done, there were the low hanging fruits<br> on the path to having that feature.</p> <p>For instance, for 'struct task_struct' in the Linux kernel we get this<br> extra info:</p> <p>--- task_struct.before.c 2024-02-09 11:38:39.249638750 
-0300<br> +++ task_struct.after.c 2024-02-09 16:19:34.221134835 -0300<br> @@ -29,6 +29,12 @@</p> <div class="snippet-clipboard-content notranslate position-relative overflow-auto" data-snippet-clipboard-copy-content=" /* --- cacheline 2 boundary (128 bytes) --- */ struct sched_entity se; /* 128 256 */"><pre class="notranslate"><code> /* --- cacheline 2 boundary (128 bytes) --- */ struct sched_entity se; /* 128 256 */ </code></pre></div> <ul> <li></li> <li> <div class="snippet-clipboard-content notranslate position-relative overflow-auto" data-snippet-clipboard-copy-content="/* XXX last struct has 3 holes */"><pre class="notranslate"><code>/* XXX last struct has 3 holes */ </code></pre></div> </li> <li> <div class="snippet-clipboard-content notranslate position-relative overflow-auto" data-snippet-clipboard-copy-content="/* --- cacheline 6 boundary (384 bytes) --- */ struct sched_rt_entity rt; /* 384 48 */ struct sched_dl_entity dl; /* 432 224 */"><pre class="notranslate"><code>/* --- cacheline 6 boundary (384 bytes) --- */ struct sched_rt_entity rt; /* 384 48 */ struct sched_dl_entity dl; /* 432 224 */ </code></pre></div> </li> <li></li> <li> <div class="snippet-clipboard-content notranslate position-relative overflow-auto" data-snippet-clipboard-copy-content=" /* XXX last struct has 1 bit hole */"><pre class="notranslate"><code> /* XXX last struct has 1 bit hole */ </code></pre></div> </li> <li> <div class="snippet-clipboard-content notranslate position-relative overflow-auto" data-snippet-clipboard-copy-content="/* --- cacheline 10 boundary (640 bytes) was 16 bytes ago --- */ const struct sched_class * sched_class; /* 656 8 */ struct rb_node core_node; /* 664 24 */"><pre class="notranslate"><code>/* --- cacheline 10 boundary (640 bytes) was 16 bytes ago --- */ const struct sched_class * sched_class; /* 656 8 */ struct rb_node core_node; /* 664 24 */ </code></pre></div> </li> </ul> <p>@@ -100,6 +103,9 @@<br> /* --- cacheline 35 boundary (2240 bytes) was 16 bytes 
ago --- <em>/<br> struct list_head tasks; /</em> 2256 16 <em>/<br> struct plist_node pushable_tasks; /</em> 2272 40 */<br> +</p> <ul> <li> <div class="snippet-clipboard-content notranslate position-relative overflow-auto" data-snippet-clipboard-copy-content="/* XXX last struct has 1 hole */"><pre class="notranslate"><code>/* XXX last struct has 1 hole */ </code></pre></div> </li> <li> <div class="snippet-clipboard-content notranslate position-relative overflow-auto" data-snippet-clipboard-copy-content="/* --- cacheline 36 boundary (2304 bytes) was 8 bytes ago --- */ struct rb_node pushable_dl_tasks; /* 2312 24 */ struct mm_struct * mm; /* 2336 8 */"><pre class="notranslate"><code>/* --- cacheline 36 boundary (2304 bytes) was 8 bytes ago --- */ struct rb_node pushable_dl_tasks; /* 2312 24 */ struct mm_struct * mm; /* 2336 8 */ </code></pre></div> </li> </ul> <p>@@ -172,6 +178,9 @@<br> /* XXX last struct has 4 bytes of padding */</p> <div class="snippet-clipboard-content notranslate position-relative overflow-auto" data-snippet-clipboard-copy-content=" struct vtime vtime; /* 2744 48 */"><pre class="notranslate"><code> struct vtime vtime; /* 2744 48 */ </code></pre></div> <ul> <li></li> <li> <div class="snippet-clipboard-content notranslate position-relative overflow-auto" data-snippet-clipboard-copy-content="/* XXX last struct has 1 hole */"><pre class="notranslate"><code>/* XXX last struct has 1 hole */ </code></pre></div> </li> <li> <div class="snippet-clipboard-content notranslate position-relative overflow-auto" data-snippet-clipboard-copy-content="/* --- cacheline 43 boundary (2752 bytes) was 40 bytes ago --- */ atomic_t tick_dep_mask; /* 2792 4 */"><pre class="notranslate"><code>/* --- cacheline 43 boundary (2752 bytes) was 40 bytes ago --- */ atomic_t tick_dep_mask; /* 2792 4 */ </code></pre></div> </li> </ul> <p>@@ -396,9 +405,12 @@<br> /* --- cacheline 145 boundary (9280 bytes) --- <em>/<br> struct thread_struct thread 
<strong>attribute</strong>((<strong>aligned</strong>(64))); /</em> 9280 4416 */</p> <ul> <li> <div class="snippet-clipboard-content notranslate position-relative overflow-auto" data-snippet-clipboard-copy-content=" /* XXX last struct has 1 hole, 1 bit hole */"><pre class="notranslate"><code> /* XXX last struct has 1 hole, 1 bit hole */ </code></pre></div> </li> <li> <div class="snippet-clipboard-content notranslate position-relative overflow-auto" data-snippet-clipboard-copy-content="/* size: 13696, cachelines: 214, members: 262 */ /* sum members: 13518, holes: 21, sum holes: 162 */ /* sum bitfield members: 82 bits, bit holes: 2, sum bit holes: 46 bits */ /* member types with holes: 4, total: 6, bit holes: 2, total: 2 */ /* paddings: 6, sum paddings: 49 */ /* forced alignments: 2, forced holes: 2, sum forced holes: 88 */"><pre class="notranslate"><code>/* size: 13696, cachelines: 214, members: 262 */ /* sum members: 13518, holes: 21, sum holes: 162 */ /* sum bitfield members: 82 bits, bit holes: 2, sum bit holes: 46 bits */ /* member types with holes: 4, total: 6, bit holes: 2, total: 2 */ /* paddings: 6, sum paddings: 49 */ /* forced alignments: 2, forced holes: 2, sum forced holes: 88 */ </code></pre></div> </li> </ul> <p>};</p> </li> <li> <p>Introduce --contains_enumerator=ENUMERATOR_NAME:</p> <p>E.g.:</p> <div class="snippet-clipboard-content notranslate position-relative overflow-auto" data-snippet-clipboard-copy-content="$ pahole --contains_enumerator S_VERSION enum file_time_flags { S_ATIME = 1, S_MTIME = 2, S_CTIME = 4, S_VERSION = 8, } $"><pre class="notranslate"><code>$ pahole --contains_enumerator S_VERSION enum file_time_flags { S_ATIME = 1, S_MTIME = 2, S_CTIME = 4, S_VERSION = 8, } $ </code></pre></div> <p>The shorter form --contains_enum is also accepted.</p> </li> <li> <p>Fix pretty printing when using DWARF, where sometimes the class (-C) and a specified "type_enum",<br> may not be present on the same CU, so wait till both are found.</p> <p>Now 
this example that reads the 'struct perf_event_header' and 'enum perf_event_type'<br> from the DWARF info in ~/bin/perf to pretty print records in the perf.data file works<br> just like when using type info from BTF in ~/bin/perf:</p> <div class="snippet-clipboard-content notranslate position-relative overflow-auto" data-snippet-clipboard-copy-content="$ pahole -F dwarf -V ~/bin/perf \ --header=perf_file_header \ --seek_bytes '$header.data.offset' \ --size_bytes='$header.data.size' \ -C 'perf_event_header(sizeof,type,type_enum=perf_event_type,filter=type==PERF_RECORD_MMAP2)' \ --prettify perf.data --count 1 pahole: sizeof_operator for 'perf_event_header' is 'size' pahole: type member for 'perf_event_header' is 'type' pahole: type enum for 'perf_event_header' is 'perf_event_type' pahole: filter for 'perf_event_header' is 'type==PERF_RECORD_MMAP2' pahole: seek bytes evaluated from --seek_bytes=$header.data.offset is 0x3f0 pahole: size bytes evaluated from --size_bytes=$header.data.size is 0xd10 // type=perf_event_header, offset=0xc20, sizeof=8, real_sizeof=112 { .header = { .type = PERF_RECORD_MMAP2, .misc = 2, .size = 112, }, .pid = 1533617, .tid = 1533617, .start = 94667542700032, .len = 90112, .pgoff = 16384,{ .maj = 0, .min = 33, .ino = 35914923, .ino_generation = 26870, },{ .build_id_size = 0, .__reserved_1 = 0, .__reserved_2 = 0, .build_id = { 33, 0, 0, 0, -85, 4, 36, 2, 0, 0, 0, 0, -10, 104, 0, 0, 0, 0, 0, 0 }, }, .prot = 5, .flags = 2, .filename = &quot;/usr/bin/ls&quot;, }, $"><pre class="notranslate"><code>$ pahole -F dwarf -V ~/bin/perf \ --header=perf_file_header \ --seek_bytes '$header.data.offset' \ --size_bytes='$header.data.size' \ -C 'perf_event_header(sizeof,type,type_enum=perf_event_type,filter=type==PERF_RECORD_MMAP2)' \ --prettify perf.data --count 1 pahole: sizeof_operator for 'perf_event_header' is 'size' pahole: type member for 'perf_event_header' is 'type' pahole: type enum for 'perf_event_header' is 'perf_event_type' pahole: filter for 
'perf_event_header' is 'type==PERF_RECORD_MMAP2' pahole: seek bytes evaluated from --seek_bytes=$header.data.offset is 0x3f0 pahole: size bytes evaluated from --size_bytes=$header.data.size is 0xd10 // type=perf_event_header, offset=0xc20, sizeof=8, real_sizeof=112 { .header = { .type = PERF_RECORD_MMAP2, .misc = 2, .size = 112, }, .pid = 1533617, .tid = 1533617, .start = 94667542700032, .len = 90112, .pgoff = 16384,{ .maj = 0, .min = 33, .ino = 35914923, .ino_generation = 26870, },{ .build_id_size = 0, .__reserved_1 = 0, .__reserved_2 = 0, .build_id = { 33, 0, 0, 0, -85, 4, 36, 2, 0, 0, 0, 0, -10, 104, 0, 0, 0, 0, 0, 0 }, }, .prot = 5, .flags = 2, .filename = "/usr/bin/ls", }, $ </code></pre></div> </li> </ul> <p>DWARF loader:</p> <ul> <li> <p>Add support for DW_TAG_constant, first seen in Go DWARF.</p> </li> <li> <p>Fix loading DW_TAG_subroutine_type generated by the Go compiler, where it may<br> have a DW_AT_byte_size. Go DWARF. And pretty print it as if<br> it was from C, this helped in writing BPF programs to attach to Go binaries, using<br> uprobes.</p> </li> </ul> <p>BTF loader:</p> <ul> <li>Fix loading of 32-bit signed enums.</li> </ul> <p>BTF encoder:</p> <ul> <li> <p>Add 'pahole --btf_features' to allow consumers to specify an opt-in set of<br> features they want to use in BTF encoding.</p> <p>Supported features are a comma-separated combination of</p> <div class="snippet-clipboard-content notranslate position-relative overflow-auto" data-snippet-clipboard-copy-content=" encode_force Ignore invalid symbols when encoding BTF. var Encode variables using BTF_KIND_VAR in BTF. float Encode floating-point types in BTF. decl_tag Encode declaration tags using BTF_KIND_DECL_TAG. type_tag Encode type tags using BTF_KIND_TYPE_TAG. enum64 Encode enum64 values with BTF_KIND_ENUM64. optimized_func Encode representations of optimized functions with suffixes like &quot;.isra.0&quot; etc consistent_func Avoid encoding inconsistent static functions. 
These occur when a parameter is optimized out in some CUs and not others, or when the same function name has inconsistent BTF descriptions in different CUs."><pre class="notranslate"><code> encode_force Ignore invalid symbols when encoding BTF. var Encode variables using BTF_KIND_VAR in BTF. float Encode floating-point types in BTF. decl_tag Encode declaration tags using BTF_KIND_DECL_TAG. type_tag Encode type tags using BTF_KIND_TYPE_TAG. enum64 Encode enum64 values with BTF_KIND_ENUM64. optimized_func Encode representations of optimized functions with suffixes like ".isra.0" etc consistent_func Avoid encoding inconsistent static functions. These occur when a parameter is optimized out in some CUs and not others, or when the same function name has inconsistent BTF descriptions in different CUs. </code></pre></div> <p>Specifying "--btf_features=all" is the equivalent to setting all of the<br> above. If pahole does not know about a feature specified in<br> --btf_features it silently ignores it.</p> <p>The --btf_features can either be specified via a single comma-separated<br> list<br> --btf_features=enum64,float</p> <p>...or via multiple --btf_features values</p> <div class="snippet-clipboard-content notranslate position-relative overflow-auto" data-snippet-clipboard-copy-content=" --btf_features=enum64 --btf_features=float"><pre class="notranslate"><code> --btf_features=enum64 --btf_features=float </code></pre></div> <p>These properties allow us to use the --btf_features option in the kernel<br> scripts/pahole_flags.sh script to specify the desired set of BTF<br> features.</p> <p>If a feature named in --btf_features is not present in the version of<br> pahole used, BTF encoding will not complain. 
This is desired because it<br> means we no longer have to tie new features to a specific pahole<br> version.</p> <p>Use --btf_features_strict to change that behaviour and bail out if one of<br> the requested features isn't present.</p> <p>To see the supported features, use:</p> <p>$ pahole --supported_btf_features<br> encode_force,var,float,decl_tag,type_tag,enum64,optimized_func,consistent_func<br> $</p> </li> </ul> <p>btfdiff:</p> <ul> <li> <p>Parallelize loading BTF and DWARF, speeding it up a bit.</p> </li> <li> <p>Do type expansion to cover "private" types and enumerations.</p> </li> </ul> <p>Signed-off-by: Arnaldo Carvalho de Melo <a href="mailto:acme@redhat.com">acme@redhat.com</a></p> acmel tag:github.com,2008:Repository/2717477/v1.24 2022-08-22T23:29:04Z v1.24: BTF encoder: <ul> <li> <p>Add support for BTF_KIND_ENUM64 to represent enumeration entries<br> with more than 32 bits.</p> </li> <li> <p>Support multithreaded encoding, in addition to DWARF<br> multithreaded loading, speeding up the process.</p> <p>Selected just like DWARF multithreaded loading, using the<br> 'pahole -j' option.</p> </li> <li> <p>Encode 'char' type as signed.</p> </li> </ul> <p>BTF Loader:</p> <ul> <li>Add support for BTF_KIND_ENUM64.</li> </ul> <p>pahole:</p> <ul> <li> <p>Introduce --lang and --lang_exclude to select or filter out<br> DWARF compile units by the language they originated from.</p> <p>The use case is to exclude Rust compile units while aspects of the<br> DWARF generated for them get sorted out in a way that the kernel<br> BPF verifier doesn't refuse loading the BTF generated from them.</p> </li> <li> <p>Introduce --compile to generate compilable code in a similar fashion to:</p> <p>bpftool btf dump file vmlinux format c &gt; vmlinux.h</p> <p>As with 'bpftool', this will notice type shadowing, i.e. 
multiple types<br> with the same name and will disambiguate by adding a suffix.</p> </li> <li> <p>Don't segfault when processing bogus files.</p> </li> </ul> <p>Signed-off-by: Arnaldo Carvalho de Melo <a href="mailto:acme@redhat.com">acme@redhat.com</a></p> acmel tag:github.com,2008:Repository/2717477/v1.22 2021-08-23T13:10:10Z v1.22: pahole: <ul> <li> <p>Allow encoding BTF to a separate BTF file (detached) instead of to a new<br> ".BTF" ELF section in the file being encoded (vmlinux usually).</p> </li> <li> <p>Introduce -j/--jobs option to specify the number of threads to use. Without<br> an argument, it uses one thread per CPU. So far used for the DWARF loader, it will<br> be used as well for the BTF encoder.</p> </li> <li> <p>Show all different types with the same name, not just the first one found.</p> </li> <li> <p>Introduce sorted type output (--sort), needed with multithreaded DWARF loading,<br> to use with things like 'btfdiff' that expect the output from DWARF and BTF<br> types to be comparable using 'diff'.</p> </li> <li> <p>Stop assuming that reading from stdin means pretty printing, as this broke<br> pre-existing scripts; introduce an explicit --prettify command line option.</p> </li> <li> <p>Improve type resolution for the --header command line option.</p> </li> <li> <p>Disable incomplete CTF encoder, this needs to be done using the external<br> libctf library.</p> </li> <li> <p>Do not consider the ftrace filter when encoding BTF for kernel functions.</p> </li> <li> <p>Add --kabi_prefix to avoid deduplication woes when using _RH_KABI_REPLACE().</p> </li> <li> <p>Add --with_flexible_array to show just types with flexible arrays.</p> </li> </ul> <p>DWARF Loader:</p> <ul> <li> <p>Multithreaded loading, requires elfutils &gt;= 0.178.</p> </li> <li> <p>Lock calls to non-thread safe elfutils' libdw functions (dwarf_decl_file()<br> and dwarf_decl_line())</p> </li> <li> <p>Change hash table size to one that performs better with current typical<br> vmlinux files.</p> 
</li> <li> <p>Allow tweaking the hash table size from the command line.</p> </li> <li> <p>Stop allocating memory for strings obtained from libdw, just defer freeing<br> the Dwfl handler so that references to its strings can be safely kept.</p> </li> <li> <p>Use a frontend cache for the latest lookup result.</p> </li> <li> <p>Allow ignoring some DWARF tags when loading for encoding BTF, as BTF doesn't<br> have equivalents for things like DW_TAG_inline_expansion and DW_TAG_label.</p> </li> <li> <p>Allow ignoring some DWARF tag attributes, such as DW_AT_alignment, not used<br> when encoding BTF.</p> </li> <li> <p>Do not query for non-C attributes when loading a C language CU (compilation unit).</p> </li> </ul> <p>BTF encoder:</p> <ul> <li>Preparatory work for multithreaded encoding, the focus for 1.23.</li> </ul> <p>btfdiff:</p> <ul> <li> <p>Support diffing against a detached BTF file, e.g.: 'btfdiff vmlinux vmlinux.btf'</p> </li> <li> <p>Support multithreaded DWARF loading, using the new pahole --sort option to have<br> the output from both BTF and DWARF sorted and thus comparable via 'diff'.</p> </li> </ul> <p>Build:</p> <ul> <li> <p>Support building with libc libraries lacking either obstacks or argp, such<br> as Alpine Linux's musl libc.</p> </li> <li> <p>Support systems without getconf() to obtain the data cacheline size, such<br> as musl libc.</p> </li> <li> <p>Add a buildcmd.sh for test builds, tested using the same set of containers<br> used for testing the Linux kernel perf tools.</p> </li> <li> <p>Enable selecting building with a shared libdwarves library or statically.</p> </li> <li> <p>Allow to use the libbpf package found in distributions instead of with the<br> accompanying libbpf git submodule.</p> </li> </ul> <p>Cleanups:</p> <ul> <li> <p>Address lots of compiler warnings accumulated by not using -Wextra, it'll<br> be added in the next release after allowing not to use it to build libbpf.</p> </li> <li> <p>Address covscan report issues.</p> </li> </ul> 
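<p>On the --sort option introduced in this release: the notes above say sorted output is needed for 'btfdiff' to compare DWARF and BTF output with 'diff' once DWARF loading is multithreaded, presumably because the order in which types are printed can then differ between runs. The Python sketch below is only a conceptual illustration of that idea, not how pahole or btfdiff are implemented; the function names and the two sample outputs are hypothetical.</p>

```python
import difflib


def sorted_type_blocks(text):
    """Split output into blank-line-separated type definitions and sort them."""
    blocks = [block.strip() for block in text.split("\n\n")]
    return sorted(block for block in blocks if block)


def diff_sorted(a, b):
    """Unified diff of two outputs after sorting their type definitions.

    Returns an empty list when both outputs contain the same definitions,
    regardless of the order in which they were emitted.
    """
    left = "\n\n".join(sorted_type_blocks(a)).splitlines()
    right = "\n\n".join(sorted_type_blocks(b)).splitlines()
    return list(difflib.unified_diff(left, right, lineterm=""))


# Hypothetical outputs: the same two types, emitted in a different order.
FROM_DWARF = "struct b {\n\tint y;\n};\n\nstruct a {\n\tint x;\n};\n"
FROM_BTF = "struct a {\n\tint x;\n};\n\nstruct b {\n\tint y;\n};\n"

if __name__ == "__main__":
    # A plain line diff reports differences caused only by ordering.
    plain = list(difflib.unified_diff(
        FROM_DWARF.splitlines(), FROM_BTF.splitlines(), lineterm=""))
    print("plain diff lines:", len(plain))
    print("sorted diff lines:", len(diff_sorted(FROM_DWARF, FROM_BTF)))
```

<p>Sorting both sides before comparing removes ordering noise, so any remaining diff output points at a real difference between the two type descriptions.</p>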
<p>Documentation:</p> <ul> <li> <p>Improve the --nr_methods/-m pahole man page entry.</p> </li> <li> <p>Clarify that currently --nr_methods doesn't work together with -C.</p> </li> </ul> <p>Signed-off-by: Arnaldo Carvalho de Melo <a href="mailto:acme@redhat.com">acme@redhat.com</a></p> acmel tag:github.com,2008:Repository/2717477/v1.21 2021-04-12T14:53:29Z v1.21: DWARF loader: <ul> <li> <p>Handle DWARF5 DW_OP_addrx properly</p> <p>Part of the effort to support the subset of DWARF5 that is generated when building the kernel.</p> </li> <li> <p>Handle subprogram ret type with abstract_origin properly</p> <p>Adds a second pass to resolve the abstract origin DWARF description of functions to aid<br> the BTF encoder in getting the right return type.</p> </li> <li> <p>Check .notes section for LTO build info</p> <p>When LTO is used, currently only with clang, we need to do extra steps to handle references<br> from one object (compile unit, aka CU) to another, a way for DWARF to avoid duplicating<br> information.</p> </li> <li> <p>Check .debug_abbrev for cross-CU references</p> <p>When the kernel build process doesn't add an ELF note in vmlinux indicating that LTO was<br> used (and thus that cross-CU references are present, requiring a more expensive<br> way to resolve types and to encode BTF), we need to look at DWARF's .debug_abbrev<br> ELF section to figure out if such cross-CU references are present.</p> </li> <li> <p>Permit merging all DWARF CU's for clang LTO built binary</p> <p>Allow not throwing away compile units previously assumed to be self-contained<br> (objects, aka CUs, aka Compile Units) as they have type descriptions that will<br> be used in later CUs.</p> </li> <li> <p>Permit a flexible HASHTAGS__BITS</p> <p>So that we can use a more expensive algorithm when we need to keep previously processed<br> compile units that will then be referenced by later ones to resolve types.</p> </li> <li> <p>Use a better hashing function, from libbpf</p> <p>Enabling 
patch to combine compile units when using LTO.</p> </li> </ul> <p>BTF encoder:</p> <ul> <li> <p>Add --btf_gen_all flag</p> <p>A new command line option to ask for the generation of all BTF encodings, so that we<br> can stop adding new command line options to enable new encodings in the kernel Makefile.</p> </li> <li> <p>Match ftrace addresses within ELF functions</p> <p>To cope with differences in how DWARF and ftrace describe function boundaries.</p> </li> <li> <p>Funnel ELF error reporting through a macro</p> <p>To use libelf's elf_error() function, improving error messages.</p> </li> <li> <p>Sanitize non-regular int base type</p> <p>Cope with the non-regular int base types that clang emits with DWARF5, tricky stuff, see yhs's<br> full explanation in the relevant cset.</p> </li> <li> <p>Add support for the floating-point types</p> <p>S/390 has floats'n'doubles in its arch specific linux headers, cope with that.</p> </li> </ul> <p>Pretty printer:</p> <ul> <li> <p>Honour conf_fprintf.hex when printing enumerations</p> <p>If the user specifies --hex in the command line, honour it when printing enumerations.</p> </li> </ul> <p>Signed-off-by: Arnaldo Carvalho de Melo <a href="mailto:acme@redhat.com">acme@redhat.com</a></p> acmel tag:github.com,2008:Repository/2717477/v1.20 2021-02-04T21:34:13Z v1.20: <p>BTF encoder:</p> <ul> <li> <p>Improve ELF error reporting using elf_errmsg(elf_errno()).</p> </li> <li> <p>Improve objcopy error handling.</p> </li> <li> <p>Fix handling of the 'restrict' qualifier, which was being treated as a 'const'.</p> </li> <li> <p>Support SHN_XINDEX in st_shndx symbol indexes, to handle ELF objects with<br> more than 65534 sections, which happens, for instance, with kernels built<br> with 'KCFLAGS="-ffunction-sections -fdata-sections"'. Other cases may<br> include using FG-ASLR or LTO.</p> </li> <li> <p>Cope with functions without a name, as seen sometimes when building kernel<br> images with some versions of clang, which was causing a SEGFAULT.</p> </li>
<li> <p>Fix BTF variable generation for kernel modules, not skipping variables at<br> offset zero.</p> </li> <li> <p>Fix address size to match what is in the ELF file being processed, to fix using<br> a 64-bit pahole binary to generate BTF for a 32-bit vmlinux image.</p> </li> <li> <p>Use kernel module ftrace addresses when finding which functions to encode,<br> which increases the number of functions encoded.</p> </li> </ul> <p>libbpf:</p> <ul> <li>Allow use of packaged version, for distros wanting to dynamically link with<br> the system's libbpf package instead of using the libbpf git submodule shipped<br> in pahole's source code.</li> </ul> <p>DWARF loader:</p> <ul> <li> <p>Support DW_AT_data_bit_offset</p> <p>This appeared in DWARF4 but is supported only in gcc's -gdwarf-5,<br> support it in a way that makes the output be the same for both cases.</p> <p>$ gcc -gdwarf-5 -c examples/dwarf5/bf.c<br> $ pahole bf.o<br> struct pea {<br> long int a:1; /* 0: 0 8 */<br> long int b:1; /* 0: 1 8 */<br> long int c:1; /* 0: 2 8 */</p> <div class="snippet-clipboard-content notranslate position-relative overflow-auto" data-snippet-clipboard-copy-content=" /* XXX 29 bits hole, try to pack */ /* Bitfield combined with next fields */ int after_bitfield; /* 4 4 */ /* size: 8, cachelines: 1, members: 4 */ /* sum members: 4 */ /* sum bitfield members: 3 bits, bit holes: 1, sum bit holes: 29 bits */ /* last cacheline: 8 bytes */"><pre class="notranslate"><code> /* XXX 29 bits hole, try to pack */ /* Bitfield combined with next fields */ int after_bitfield; /* 4 4 */ /* size: 8, cachelines: 1, members: 4 */ /* sum members: 4 */ /* sum bitfield members: 3 bits, bit holes: 1, sum bit holes: 29 bits */ /* last cacheline: 8 bytes */ </code></pre></div> <p>};</p> </li> <li> <p>Handle DW_FORM_implicit_const in attr_numeric() and attr_offset()</p> </li> <li> <p>Support DW_TAG_call_site, it's the standardized rename of the previously supported<br> DW_TAG_GNU_call_site.</p> </li>
</ul> <p>build:</p> <ul> <li>Fix compilation on 32-bit architectures.</li> </ul> <p>Signed-off-by: Arnaldo Carvalho de Melo <a href="mailto:acme@redhat.com">acme@redhat.com</a></p> acmel tag:github.com,2008:Repository/2717477/v1.19 2020-11-23T19:21:26Z v1.19: <ul> <li> <p>Support split BTF, where a main BTF file, vmlinux, can be used to find types<br> and then a kernel module, for instance, can have just what is unique to it.</p> <p>For instance, looking for a type in the main vmlinux BTF info:</p> <p>$ pahole wmi_notify_handler<br> pahole: type 'wmi_notify_handler' not found<br> $</p> <p>If we look at the 'wmi' module BTF info that is in:</p> <p>$ ls -la /sys/kernel/btf/wmi<br> -r--r--r--. 1 root root 2866 Nov 18 13:35 /sys/kernel/btf/wmi<br> $</p> <p>$ pahole /sys/kernel/btf/wmi -C wmi_notify_handler<br> typedef void (*wmi_notify_handler)(u32, void *);<br> $</p> <p>'--btf_base=/sys/kernel/btf/vmlinux' was automatically added in this last<br> example. This option, also introduced in this version, lets types used in<br> the wmi.ko module but present in vmlinux be found there, so that there is no<br> duplication of types.</p> </li> <li> <p>Update libbpf to get the split BTF support and use some of its functions to<br> load BTF and speed up DWARF loading and BTF encoding.</p> </li> <li> <p>Support cross-compiled ELF binaries with different endianness.</p> </li> <li> <p>Support showing typedefs for anonymous types, like structs, unions and enums,<br> see the "Align enumerators" entry below for an example. Another:</p> <p>$ pahole rwlock_t<br> typedef struct {<br> arch_rwlock_t raw_lock; /* 0 8 */</p> <div class="snippet-clipboard-content notranslate position-relative overflow-auto" data-snippet-clipboard-copy-content=" /* size: 8, cachelines: 1, members: 1 */ /* last
cacheline: 8 bytes */"><pre class="notranslate"><code> /* size: 8, cachelines: 1, members: 1 */ /* last cacheline: 8 bytes */ </code></pre></div> <p>} rwlock_t;<br> $</p> </li> <li> <p>Align enumerators:</p> <p>$ pahole ZSTD_strategy<br> typedef enum {<br> ZSTD_fast = 0,<br> ZSTD_dfast = 1,<br> ZSTD_greedy = 2,<br> ZSTD_lazy = 3,<br> ZSTD_lazy2 = 4,<br> ZSTD_btlazy2 = 5,<br> ZSTD_btopt = 6,<br> ZSTD_btopt2 = 7,<br> } ZSTD_strategy;<br> $</p> </li> <li> <p>Work around bugs in the generation of DWARF records for functions in some gcc<br> versions that were causing breakage in the encoding of BTF:</p> <p><a href="https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97060" rel="nofollow">https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97060</a> "Missing DW_AT_declaration=1 in dwarf data"</p> </li> <li> <p>Ignore zero-sized ELF symbols instead of erroring out.</p> </li> <li> <p>Handle union forward declaration properly in the BTF loader.</p> </li> <li> <p>Introduce --numeric_version for use in scripts and Makefiles:</p> <p>$ pahole --version<br> v1.19<br> $ pahole --numeric_version<br> 119<br> $</p> <p>To avoid things like this in the kernel's scripts/link-vmlinux.sh:</p> <p>pahole_ver=$(${PAHOLE} --version | sed -E 's/v([0-9]+)\.([0-9]+)/\1\2/')</p> </li> <li> <p>Try sole pfunct argument as a function name, just like pahole with type names:</p> <p>$ pfunct tcp_v4_rcv<br> int tcp_v4_rcv(struct sk_buff * skb);<br> $</p> </li> <li> <p>Speed up pfunct using some of the load techniques used in pahole.</p> </li> <li> <p>Discard CUs after BTF encoding as they're not used anymore, greatly reducing<br> memory usage and speeding up vmlinux BTF encoding.</p> </li> <li> <p>Revamp how per-CPU variables are encoded in BTF.</p> </li> <li> <p>Include BTF info for static functions.</p> </li> <li> <p>Use BTF's string APIs for strings management, greatly improving performance<br> over tsearch().</p> </li> <li> <p>Increase size of DWARF lookup hash table, shaving off about 1 second out of<br>
about 20 seconds total for Linux BTF dedup.</p> </li> <li> <p>Stop BTF encoding when errors are found in some DWARF CU.</p> </li> <li> <p>Implement --packed, to show just packed structures. For instance, here are<br> the top 5 packed data structures in the Linux kernel:</p> <p>$ pahole --sizes --packed | sort -k2 -nr | head -5<br> e820_table 64004 0<br> boot_params 4096 0<br> efi_variable 2084 0<br> snd_soc_tplg_pcm 912 0<br> ntb_info_regs 800 0<br> $</p> <p>And here is one of them:</p> <p>$ pahole efi_variable<br> struct efi_variable {<br> efi_char16_t VariableName[512]; /* 0 1024 */<br> /* --- cacheline 16 boundary (1024 bytes) --- */<br> efi_guid_t VendorGuid; /* 1024 16 */<br> long unsigned int DataSize; /* 1040 8 */<br> __u8 Data[1024]; /* 1048 1024 */<br> /* --- cacheline 32 boundary (2048 bytes) was 24 bytes ago --- */<br> efi_status_t Status; /* 2072 8 */<br> __u32 Attributes; /* 2080 4 */</p> <p>/* size: 2084, cachelines: 33, members: 6 */<br> /* last cacheline: 36 bytes */<br> } __attribute__((__packed__));<br> $</p> </li> <li> <p>Fix bug in distros such as OpenSUSE:15.2 where DW_AT_alignment isn't defined.</p> </li> </ul> <p>Signed-off-by: Arnaldo Carvalho de Melo <a href="mailto:acme@redhat.com">acme@redhat.com</a></p> acmel