xsk: increase NAPI budget for AF_XDP zero-copy path #14
Conversation
Master branch: f9bec5d patch https://patchwork.ozlabs.org/project/netdev/patch/20200907150217.30888-2-bjorn.topel@gmail.com/ applied successfully
The NAPI budget (NAPI_POLL_WEIGHT), meaning the number of received
packets that are allowed to be processed for each NAPI invocation,
takes into consideration that each received packet is aimed for the
kernel networking stack. That is not the case for the AF_XDP receive
path, where the cost of each packet is significantly less.

Therefore, this commit adds a new NAPI budget, which is
NAPI_POLL_WEIGHT scaled by 4. Typically that would be 256 in most
configurations. It is encouraged that AF_XDP zero-copy capable drivers
use XSK_NAPI_WEIGHT when zero-copy is enabled. Processing 256 packets
targeted for AF_XDP is still less work than 64 (NAPI_POLL_WEIGHT)
packets going to the kernel networking stack.

Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
---
 include/net/xdp_sock_drv.h | 3 +++
 1 file changed, 3 insertions(+)
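The addition implied by the diffstat would look roughly like the
following minimal sketch; the exact form and placement inside
include/net/xdp_sock_drv.h are assumptions based on the description
above:

#include <linux/netdevice.h>	/* for NAPI_POLL_WEIGHT (typically 64) */

/* NAPI budget for the AF_XDP zero-copy Rx path: 4 * 64 = 256 in most
 * configurations, reflecting the lower per-packet processing cost. */
#define XSK_NAPI_WEIGHT (4 * NAPI_POLL_WEIGHT)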
Start using XSK_NAPI_WEIGHT as NAPI poll budget for the AF_XDP Rx
zero-copy path.

Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
---
 drivers/net/ethernet/intel/i40e/i40e_xsk.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Start using XSK_NAPI_WEIGHT as NAPI poll budget for the AF_XDP Rx
zero-copy path.

Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
---
 drivers/net/ethernet/intel/ice/ice_xsk.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Start using XSK_NAPI_WEIGHT as NAPI poll budget for the AF_XDP Rx
zero-copy path.

Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
---
 drivers/net/ethernet/intel/ixgbe/ixgbe_xsk.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
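Each driver change is the same one-liner in spirit; a hedged sketch
with illustrative names only (not the exact i40e/ice/ixgbe code):

static int example_napi_poll_zc(struct napi_struct *napi, int budget)
{
        struct example_ring *ring =
                container_of(napi, struct example_ring, napi);
        /* before: clean up to the stack-oriented budget (usually 64);
         * after: use the scaled AF_XDP budget when zero-copy is on */
        int work_limit = ring->zc_enabled ? XSK_NAPI_WEIGHT : budget;

        return example_clean_rx_irq(ring, work_limit);
}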
Master branch: bc0b5a0 patch https://patchwork.ozlabs.org/project/netdev/patch/20200907150217.30888-2-bjorn.topel@gmail.com/ applied successfully
At least one diff in series https://patchwork.ozlabs.org/project/netdev/list/?series=199973 expired. Closing PR.
…s metrics" test

Linux 5.9 introduced perf test case "Parse and process metrics" and on
s390 this test case always dumps core:

[root@t35lp67 perf]# ./perf test -vvvv -F 67
67: Parse and process metrics :
--- start ---
metric expr inst_retired.any / cpu_clk_unhalted.thread for IPC
parsing metric: inst_retired.any / cpu_clk_unhalted.thread
Segmentation fault (core dumped)
[root@t35lp67 perf]#

I debugged this core dump and gdb shows this call chain:

(gdb) where
#0 0x000003ffabc3192a in __strnlen_c_1 () from /lib64/libc.so.6
#1 0x000003ffabc293de in strcasestr () from /lib64/libc.so.6
#2 0x0000000001102ba2 in match_metric (list=0x1e6ea20 "inst_retired.any", n=<optimized out>) at util/metricgroup.c:368
#3 find_metric (map=<optimized out>, map=<optimized out>, metric=0x1e6ea20 "inst_retired.any") at util/metricgroup.c:765
#4 __resolve_metric (ids=0x0, map=<optimized out>, metric_list=0x0, metric_no_group=<optimized out>, m=<optimized out>) at util/metricgroup.c:844
#5 resolve_metric (ids=0x0, map=0x0, metric_list=0x0, metric_no_group=<optimized out>) at util/metricgroup.c:881
#6 metricgroup__add_metric (metric=<optimized out>, metric_no_group=metric_no_group@entry=false, events=<optimized out>, events@entry=0x3ffd84fb878, metric_list=0x0, metric_list@entry=0x3ffd84fb868, map=0x0) at util/metricgroup.c:943
#7 0x00000000011034ae in metricgroup__add_metric_list (map=0x13f9828 <map>, metric_list=0x3ffd84fb868, events=0x3ffd84fb878, metric_no_group=<optimized out>, list=<optimized out>) at util/metricgroup.c:988
#8 parse_groups (perf_evlist=perf_evlist@entry=0x1e70260, str=str@entry=0x12f34b2 "IPC", metric_no_group=<optimized out>, metric_no_merge=<optimized out>, fake_pmu=fake_pmu@entry=0x1462f18 <perf_pmu.fake>, metric_events=0x3ffd84fba58, map=0x1) at util/metricgroup.c:1040
#9 0x0000000001103eb2 in metricgroup__parse_groups_test (evlist=evlist@entry=0x1e70260, map=map@entry=0x13f9828 <map>, str=str@entry=0x12f34b2 "IPC", metric_no_group=metric_no_group@entry=false, metric_no_merge=metric_no_merge@entry=false, metric_events=0x3ffd84fba58) at util/metricgroup.c:1082
#10 0x00000000010c84d8 in __compute_metric (ratio2=0x0, name2=0x0, ratio1=<synthetic pointer>, name1=0x12f34b2 "IPC", vals=0x3ffd84fbad8, name=0x12f34b2 "IPC") at tests/parse-metric.c:159
#11 compute_metric (ratio=<synthetic pointer>, vals=0x3ffd84fbad8, name=0x12f34b2 "IPC") at tests/parse-metric.c:189
#12 test_ipc () at tests/parse-metric.c:208
.....
..... omitted many more lines

This test case was added with commit 218ca91 ("perf tests: Add parse
metric test for frontend metric"). When I compile with make DEBUG=y it
works fine and I do not get a core dump.

It turned out that the above listed function call chain worked on a
struct pmu_event array which requires a trailing element with zeroes,
which was missing. The macro map_for_each_event() loops over that array
and tests for the members metric_expr/metric_name/metric_group being
non-NULL. Adding this element fixes the issue.

Output after:

[root@t35lp46 perf]# ./perf test 67
67: Parse and process metrics : Ok
[root@t35lp46 perf]#

Committer notes:

As Ian remarks, this is not s390 specific:

<quote Ian>
This also shows up with address sanitizer on all architectures (perhaps
change the patch title) and perhaps add a "Fixes: <commit>" tag.
=================================================================
==4718==ERROR: AddressSanitizer: global-buffer-overflow on address
0x55c93b4d59e8 at pc 0x55c93a1541e2 bp 0x7ffd24327c60 sp 0x7ffd24327c58
READ of size 8 at 0x55c93b4d59e8 thread T0
#0 0x55c93a1541e1 in find_metric tools/perf/util/metricgroup.c:764:2
#1 0x55c93a153e6c in __resolve_metric tools/perf/util/metricgroup.c:844:9
#2 0x55c93a152f18 in resolve_metric tools/perf/util/metricgroup.c:881:9
#3 0x55c93a1528db in metricgroup__add_metric tools/perf/util/metricgroup.c:943:9
#4 0x55c93a151996 in metricgroup__add_metric_list tools/perf/util/metricgroup.c:988:9
#5 0x55c93a1511b9 in parse_groups tools/perf/util/metricgroup.c:1040:8
#6 0x55c93a1513e1 in metricgroup__parse_groups_test tools/perf/util/metricgroup.c:1082:9
#7 0x55c93a0108ae in __compute_metric tools/perf/tests/parse-metric.c:159:8
#8 0x55c93a010744 in compute_metric tools/perf/tests/parse-metric.c:189:9
#9 0x55c93a00f5ee in test_ipc tools/perf/tests/parse-metric.c:208:2
#10 0x55c93a00f1e8 in test__parse_metric tools/perf/tests/parse-metric.c:345:2
#11 0x55c939fd7202 in run_test tools/perf/tests/builtin-test.c:410:9
#12 0x55c939fd6736 in test_and_print tools/perf/tests/builtin-test.c:440:9
#13 0x55c939fd58c3 in __cmd_test tools/perf/tests/builtin-test.c:661:4
#14 0x55c939fd4e02 in cmd_test tools/perf/tests/builtin-test.c:807:9
#15 0x55c939e4763d in run_builtin tools/perf/perf.c:313:11
#16 0x55c939e46475 in handle_internal_command tools/perf/perf.c:365:8
#17 0x55c939e4737e in run_argv tools/perf/perf.c:409:2
#18 0x55c939e45f7e in main tools/perf/perf.c:539:3

0x55c93b4d59e8 is located 0 bytes to the right of global variable
'pme_test' defined in 'tools/perf/tests/parse-metric.c:17:25'
(0x55c93b4d54a0) of size 1352

SUMMARY: AddressSanitizer: global-buffer-overflow
tools/perf/util/metricgroup.c:764:2 in find_metric

Shadow bytes around the buggy address:
  0x0ab9a7692ae0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0ab9a7692af0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0ab9a7692b00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0ab9a7692b10: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0ab9a7692b20: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
=>0x0ab9a7692b30: 00 00 00 00 00 00 00 00 00 00 00 00 00[f9]f9 f9
  0x0ab9a7692b40: f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9
  0x0ab9a7692b50: f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9
  0x0ab9a7692b60: f9 f9 f9 f9 f9 f9 f9 f9 00 00 00 00 00 00 00 00
  0x0ab9a7692b70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0ab9a7692b80: f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07
  Heap left redzone:     fa
  Freed heap region:     fd
  Stack left redzone:    f1
  Stack mid redzone:     f2
  Stack right redzone:   f3
  Stack after return:    f5
  Stack use after scope: f8
  Global redzone:        f9
  Global init order:     f6
  Poisoned by user:      f7
  Container overflow:    fc
  Array cookie:          ac
  Intra object redzone:  bb
  ASan internal:         fe
  Left alloca redzone:   ca
  Right alloca redzone:  cb
  Shadow gap:            cc
</quote>

I'm also adding the missing "Fixes" tag and setting just .name to NULL,
as doing it that way is more compact (the compiler will zero out
everything else) and the table iterators look for .name being NULL as
the sentinel marking the end of the table.
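The resulting table shape, as a minimal sketch (field names follow
struct pmu_event in the perf sources; the entries shown here are
illustrative):

static struct pmu_event pme_test[] = {
        {
                .metric_expr    = "inst_retired.any / cpu_clk_unhalted.thread",
                .metric_name    = "IPC",
        },
        /* ... more test metrics ... */
        {
                /* Sentinel: map_for_each_event() and friends stop at
                 * .name == NULL; the compiler zeroes the other fields. */
                .name = NULL,
        },
};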
Fixes: 0a507af ("perf tests: Add parse metric test for ipc metric")
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Reviewed-by: Sumanth Korikkar <sumanthk@linux.ibm.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Link: http://lore.kernel.org/lkml/20200825071211.16959-1-tmricht@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The evsel->unit borrows a pointer from a pmu event or alias instead of
owning a string. But the tool event (duration_time) passed in the
result of strdup(), which caused a leak.
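A hedged sketch of the ownership pattern being fixed in
util/parse-events.c (the "ns" literal and function shape are
illustrative, not the exact perf code):

static int add_event_tool_sketch(struct evsel *evsel)
{
        /* before: evsel->unit = strdup("ns");  an owned copy that was
         * never freed, since ->unit is treated as a borrowed pointer
         * everywhere else */
        evsel->unit = "ns";     /* after: borrow a static string */
        return 0;
}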
It was found by ASAN during metric test:
Direct leak of 210 byte(s) in 70 object(s) allocated from:
#0 0x7fe366fca0b5 in strdup (/lib/x86_64-linux-gnu/libasan.so.5+0x920b5)
#1 0x559fbbcc6ea3 in add_event_tool util/parse-events.c:414
#2 0x559fbbcc6ea3 in parse_events_add_tool util/parse-events.c:1414
#3 0x559fbbd8474d in parse_events_parse util/parse-events.y:439
#4 0x559fbbcc95da in parse_events__scanner util/parse-events.c:2096
#5 0x559fbbcc95da in __parse_events util/parse-events.c:2141
#6 0x559fbbc28555 in check_parse_id tests/pmu-events.c:406
#7 0x559fbbc28555 in check_parse_id tests/pmu-events.c:393
#8 0x559fbbc28555 in check_parse_cpu tests/pmu-events.c:415
#9 0x559fbbc28555 in test_parsing tests/pmu-events.c:498
#10 0x559fbbc0109b in run_test tests/builtin-test.c:410
#11 0x559fbbc0109b in test_and_print tests/builtin-test.c:440
#12 0x559fbbc03e69 in __cmd_test tests/builtin-test.c:695
#13 0x559fbbc03e69 in cmd_test tests/builtin-test.c:807
#14 0x559fbbc691f4 in run_builtin /home/namhyung/project/linux/tools/perf/perf.c:312
#15 0x559fbbb071a8 in handle_internal_command /home/namhyung/project/linux/tools/perf/perf.c:364
#16 0x559fbbb071a8 in run_argv /home/namhyung/project/linux/tools/perf/perf.c:408
#17 0x559fbbb071a8 in main /home/namhyung/project/linux/tools/perf/perf.c:538
#18 0x7fe366b68cc9 in __libc_start_main ../csu/libc-start.c:308
Fixes: f0fbb11 ("perf stat: Implement duration_time as a proper event")
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20200915031819.386559-6-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The test_generic_metric() function missed releasing entries in the
pctx. ASan reported the following leak (and more):
Direct leak of 128 byte(s) in 1 object(s) allocated from:
#0 0x7f4c9396980e in calloc (/lib/x86_64-linux-gnu/libasan.so.5+0x10780e)
#1 0x55f7e748cc14 in hashmap_grow (/home/namhyung/project/linux/tools/perf/perf+0x90cc14)
#2 0x55f7e748d497 in hashmap__insert (/home/namhyung/project/linux/tools/perf/perf+0x90d497)
#3 0x55f7e7341667 in hashmap__set /home/namhyung/project/linux/tools/perf/util/hashmap.h:111
#4 0x55f7e7341667 in expr__add_ref util/expr.c:120
#5 0x55f7e7292436 in prepare_metric util/stat-shadow.c:783
#6 0x55f7e729556d in test_generic_metric util/stat-shadow.c:858
#7 0x55f7e712390b in compute_single tests/parse-metric.c:128
#8 0x55f7e712390b in __compute_metric tests/parse-metric.c:180
#9 0x55f7e712446d in compute_metric tests/parse-metric.c:196
#10 0x55f7e712446d in test_dcache_l2 tests/parse-metric.c:295
#11 0x55f7e712446d in test__parse_metric tests/parse-metric.c:355
#12 0x55f7e70be09b in run_test tests/builtin-test.c:410
#13 0x55f7e70be09b in test_and_print tests/builtin-test.c:440
#14 0x55f7e70c101a in __cmd_test tests/builtin-test.c:661
#15 0x55f7e70c101a in cmd_test tests/builtin-test.c:807
#16 0x55f7e7126214 in run_builtin /home/namhyung/project/linux/tools/perf/perf.c:312
#17 0x55f7e6fc41a8 in handle_internal_command /home/namhyung/project/linux/tools/perf/perf.c:364
#18 0x55f7e6fc41a8 in run_argv /home/namhyung/project/linux/tools/perf/perf.c:408
#19 0x55f7e6fc41a8 in main /home/namhyung/project/linux/tools/perf/perf.c:538
#20 0x7f4c93492cc9 in __libc_start_main ../csu/libc-start.c:308
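The shape of the fix, as a hedged sketch (assuming the expr__ctx_clear()
helper from this series; the metric evaluation itself is elided):

void test_generic_metric(struct metric_expr *mexp, int cpu,
                         struct runtime_stat *st)
{
        struct expr_parse_ctx pctx;

        if (prepare_metric(mexp->metric_events, mexp->metric_refs,
                           &pctx, cpu, st) < 0)
                goto out;

        /* ... parse and evaluate the metric expression ... */
out:
        /* The fix: release the hashmap entries added by prepare_metric() */
        expr__ctx_clear(&pctx);
}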
Fixes: 6d432c4 ("perf tools: Add test_generic_metric function")
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20200915031819.386559-8-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The metricgroup__add_metric() function can find multiple matches for a
metric group, and it's possible for it to fail. It can also fail in the
middle, as in resolve_metric(), even for a single metric.
In those cases, the intermediate list and ids are leaked like:
Direct leak of 3 byte(s) in 1 object(s) allocated from:
#0 0x7f4c938f40b5 in strdup (/lib/x86_64-linux-gnu/libasan.so.5+0x920b5)
#1 0x55f7e71c1bef in __add_metric util/metricgroup.c:683
#2 0x55f7e71c31d0 in add_metric util/metricgroup.c:906
#3 0x55f7e71c3844 in metricgroup__add_metric util/metricgroup.c:940
#4 0x55f7e71c488d in metricgroup__add_metric_list util/metricgroup.c:993
#5 0x55f7e71c488d in parse_groups util/metricgroup.c:1045
#6 0x55f7e71c60a4 in metricgroup__parse_groups_test util/metricgroup.c:1087
#7 0x55f7e71235ae in __compute_metric tests/parse-metric.c:164
#8 0x55f7e7124650 in compute_metric tests/parse-metric.c:196
#9 0x55f7e7124650 in test_recursion_fail tests/parse-metric.c:318
#10 0x55f7e7124650 in test__parse_metric tests/parse-metric.c:356
#11 0x55f7e70be09b in run_test tests/builtin-test.c:410
#12 0x55f7e70be09b in test_and_print tests/builtin-test.c:440
#13 0x55f7e70c101a in __cmd_test tests/builtin-test.c:661
#14 0x55f7e70c101a in cmd_test tests/builtin-test.c:807
#15 0x55f7e7126214 in run_builtin /home/namhyung/project/linux/tools/perf/perf.c:312
#16 0x55f7e6fc41a8 in handle_internal_command /home/namhyung/project/linux/tools/perf/perf.c:364
#17 0x55f7e6fc41a8 in run_argv /home/namhyung/project/linux/tools/perf/perf.c:408
#18 0x55f7e6fc41a8 in main /home/namhyung/project/linux/tools/perf/perf.c:538
#19 0x7f4c93492cc9 in __libc_start_main ../csu/libc-start.c:308
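A hedged sketch of the error-path shape (helper names and list layout
are illustrative; the real code lives in util/metricgroup.c):

static int add_metric_list_sketch(struct list_head *metric_list)
{
        LIST_HEAD(list);
        struct metric *m, *tmp;
        int ret;

        ret = add_metric(&list);                /* can fail mid-way */
        if (!ret)
                ret = resolve_metric(&list);    /* can also fail */
        if (ret) {
                /* The fix: drop the partially built list, including the
                 * strdup()ed ids each entry owns, instead of leaking it. */
                list_for_each_entry_safe(m, tmp, &list, nd) {
                        list_del_init(&m->nd);
                        metric__free(m);
                }
                return ret;
        }
        list_splice(&list, metric_list);
        return 0;
}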
Fixes: 83de0b7 ("perf metric: Collect referenced metrics in struct metric_ref_node")
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20200915031819.386559-9-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Ido Schimmel says:
====================
mlxsw: Refactor headroom management

Petr says:

On Spectrum, port buffers, also called port headroom, is where packets
are stored while they are parsed and the forwarding decision is being
made. For lossless traffic flows, in case shared buffer admission is
not allowed, headroom is also where to put the extra traffic received
before the sent PAUSE takes effect. Another aspect of the port headroom
is the so called internal buffer, which is used for egress mirroring.

Linux supports two DCB interfaces related to the headroom:
dcbnl_setbuffer for configuration, and dcbnl_getbuffer for inspection.
In order to make it possible to implement these interfaces, it is first
necessary to clean up headroom handling, which is currently strewn in
several places in the driver.

The end goal is an architecture whereby it is possible to take a copy
of the current configuration, adjust parameters, and then hand the
proposed configuration over to the system to implement it. When
everything works, the proposed configuration is accepted and saved.

First, this centralizes the reconfiguration handling to one function,
which takes care of coordinating buffer size changes and priority map
changes to avoid introducing drops. Second, the fact that the
configuration is all in one place makes it easy to keep a backup and
handle error path rollbacks, which were previously hard to understand.

Patch #1 introduces struct mlxsw_sp_hdroom, which will keep port
headroom configuration.

Patch #2 unifies handling of delay provision between PFC and PAUSE.
From now on, delay is to be measured in bytes of extra space, and will
not include MTU. PFC handler sets the delay directly from the parameter
it gets through the DCB interface. For PAUSE, MLXSW_SP_PAUSE_DELAY is
converted to have the same meaning.

In patches #3-#5, MTU, lossiness and priorities are gradually moved
over to struct mlxsw_sp_hdroom.

In patches #6-#11, handling of buffer resizing and priority maps is
moved from spectrum.c and spectrum_dcb.c to spectrum_buffers.c. The API
is gradually adapted so that struct mlxsw_sp_hdroom becomes the main
interface through which the various clients express how the headroom
should be configured.

Patch #12 is a small cleanup that the previous transformation made
possible.

In patch #13, the port init code becomes a boring client of the
headroom code, instead of rolling its own thing.

Patches #14 and #15 move handling of internal mirroring buffer to the
new headroom code as well. Previously, this code was in the SPAN
module. This patchset converts the SPAN module to another boring client
of the headroom code.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
Ido Schimmel says:
====================
nexthop: Add support for nexthop objects offload

This patch set adds support for nexthop objects offload with a dummy
implementation over netdevsim. mlxsw support will be added later.

The general idea is very similar to route offload in that notifications
are sent whenever nexthop objects are changed. A listener can veto the
change and the error will be communicated to user space with extack.

To keep listeners as simple as possible, they not only receive
notifications for the nexthop object that is changed, but also for all
the other objects affected by this change. For example, when a single
nexthop is replaced, a replace notification is sent for the single
nexthop, but also for all the nexthop groups this nexthop is a member
of. This relieves listeners from the need to track such dependencies.

To simplify things further for listeners, the notification info does
not contain the raw nexthop data structures (e.g., 'struct nexthop'),
but less complex data structures into which the raw data structures are
parsed.

Tested with a new selftest over netdevsim and with fib_nexthops.sh:

Tests passed: 164
Tests failed: 0

Patch set overview:

Patches #1-#4 introduce the aforementioned data structures and convert
existing listeners (i.e., the VXLAN driver) to use them.

Patches #5-#6 add a new RTNH_F_TRAP flag and the ability to set it and
RTNH_F_OFFLOAD on nexthops. This flag is used by netdevsim for testing
purposes and will also be used by mlxsw. These flags are consistent
with the existing RTM_F_OFFLOAD and RTM_F_TRAP flags.

Patches #7-#14 gradually add the new nexthop notifications.

Patches #15-#18 add a dummy implementation for nexthop offload over
netdevsim and a selftest to exercise both good and bad flows.

Changes since RFC [1]:

Patch #1: s/is_encap/has_encap/
Patch #3: Add a blank line in __nh_notifier_single_info_init()
Patch #5: Reword commit message
Patch #6: s/nexthop_hw_flags_set/nexthop_set_hw_flags/
Patch #7: Reword commit message
Patch #11: Allocate extack on the stack

Follow-up patch sets:

selftests: forwarding: Add nexthop objects tests
mlxsw: Preparations for nexthop objects support - part 1/2
mlxsw: Preparations for nexthop objects support - part 2/2
mlxsw: Add support for nexthop objects
mlxsw: Add support for blackhole nexthops
mlxsw: Update adjacency index more efficiently

[1] https://lore.kernel.org/netdev/20200908091037.2709823-1-idosch@idosch.org/
====================

Link: https://lore.kernel.org/r/20201104133040.1125369-1-idosch@idosch.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
This fix is for a failure that occurred in the DWARF unwind perf test.
Stack unwinders may probe memory when looking for frames.
Memory sanitizer will poison and track uninitialized memory on the
stack, and on the heap if the value is copied to the heap.
This can lead to false memory sanitizer failures for the use of an
uninitialized value.
Avoid this problem by removing the poison on the copied stack.
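The fix amounts to unpoisoning the copied bytes; a minimal sketch (the
MEMORY_SANITIZER guard and the helper shape are assumptions, the real
change lives in the arch-specific sample_ustack() code):

#include <stddef.h>
#include <string.h>
#ifdef MEMORY_SANITIZER
#include <sanitizer/msan_interface.h>
#endif

/* Copy the live stack into a sample buffer and clear MSan's poison on
 * the copy, so unwinders probing it don't report false positives. */
static void copy_stack_sample(void *buf, const void *sp, size_t stack_size)
{
        memcpy(buf, sp, stack_size);
#ifdef MEMORY_SANITIZER
        __msan_unpoison(buf, stack_size);
#endif
}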
The full msan failure with track origins looks like:
==2168==WARNING: MemorySanitizer: use-of-uninitialized-value
#0 0x559ceb10755b in handle_cfi elfutils/libdwfl/frame_unwind.c:648:8
#1 0x559ceb105448 in __libdwfl_frame_unwind elfutils/libdwfl/frame_unwind.c:741:4
#2 0x559ceb0ece90 in dwfl_thread_getframes elfutils/libdwfl/dwfl_frame.c:435:7
#3 0x559ceb0ec6b7 in get_one_thread_frames_cb elfutils/libdwfl/dwfl_frame.c:379:10
#4 0x559ceb0ec6b7 in get_one_thread_cb elfutils/libdwfl/dwfl_frame.c:308:17
#5 0x559ceb0ec6b7 in dwfl_getthreads elfutils/libdwfl/dwfl_frame.c:283:17
#6 0x559ceb0ec6b7 in getthread elfutils/libdwfl/dwfl_frame.c:354:14
#7 0x559ceb0ec6b7 in dwfl_getthread_frames elfutils/libdwfl/dwfl_frame.c:388:10
#8 0x559ceaff6ae6 in unwind__get_entries tools/perf/util/unwind-libdw.c:236:8
#9 0x559ceabc9dbc in test_dwarf_unwind__thread tools/perf/tests/dwarf-unwind.c:111:8
#10 0x559ceabca5cf in test_dwarf_unwind__compare tools/perf/tests/dwarf-unwind.c:138:26
#11 0x7f812a6865b0 in bsearch (libc.so.6+0x4e5b0)
#12 0x559ceabca871 in test_dwarf_unwind__krava_3 tools/perf/tests/dwarf-unwind.c:162:2
#13 0x559ceabca926 in test_dwarf_unwind__krava_2 tools/perf/tests/dwarf-unwind.c:169:9
#14 0x559ceabca946 in test_dwarf_unwind__krava_1 tools/perf/tests/dwarf-unwind.c:174:9
#15 0x559ceabcae12 in test__dwarf_unwind tools/perf/tests/dwarf-unwind.c:211:8
#16 0x559ceabbc4ab in run_test tools/perf/tests/builtin-test.c:418:9
#17 0x559ceabbc4ab in test_and_print tools/perf/tests/builtin-test.c:448:9
#18 0x559ceabbac70 in __cmd_test tools/perf/tests/builtin-test.c:669:4
#19 0x559ceabbac70 in cmd_test tools/perf/tests/builtin-test.c:815:9
#20 0x559cea960e30 in run_builtin tools/perf/perf.c:313:11
#21 0x559cea95fbce in handle_internal_command tools/perf/perf.c:365:8
#22 0x559cea95fbce in run_argv tools/perf/perf.c:409:2
#23 0x559cea95fbce in main tools/perf/perf.c:539:3
Uninitialized value was stored to memory at
#0 0x559ceb106acf in __libdwfl_frame_reg_set elfutils/libdwfl/frame_unwind.c:77:22
#1 0x559ceb106acf in handle_cfi elfutils/libdwfl/frame_unwind.c:627:13
#2 0x559ceb105448 in __libdwfl_frame_unwind elfutils/libdwfl/frame_unwind.c:741:4
#3 0x559ceb0ece90 in dwfl_thread_getframes elfutils/libdwfl/dwfl_frame.c:435:7
#4 0x559ceb0ec6b7 in get_one_thread_frames_cb elfutils/libdwfl/dwfl_frame.c:379:10
#5 0x559ceb0ec6b7 in get_one_thread_cb elfutils/libdwfl/dwfl_frame.c:308:17
#6 0x559ceb0ec6b7 in dwfl_getthreads elfutils/libdwfl/dwfl_frame.c:283:17
#7 0x559ceb0ec6b7 in getthread elfutils/libdwfl/dwfl_frame.c:354:14
#8 0x559ceb0ec6b7 in dwfl_getthread_frames elfutils/libdwfl/dwfl_frame.c:388:10
#9 0x559ceaff6ae6 in unwind__get_entries tools/perf/util/unwind-libdw.c:236:8
#10 0x559ceabc9dbc in test_dwarf_unwind__thread tools/perf/tests/dwarf-unwind.c:111:8
#11 0x559ceabca5cf in test_dwarf_unwind__compare tools/perf/tests/dwarf-unwind.c:138:26
#12 0x7f812a6865b0 in bsearch (libc.so.6+0x4e5b0)
#13 0x559ceabca871 in test_dwarf_unwind__krava_3 tools/perf/tests/dwarf-unwind.c:162:2
#14 0x559ceabca926 in test_dwarf_unwind__krava_2 tools/perf/tests/dwarf-unwind.c:169:9
#15 0x559ceabca946 in test_dwarf_unwind__krava_1 tools/perf/tests/dwarf-unwind.c:174:9
#16 0x559ceabcae12 in test__dwarf_unwind tools/perf/tests/dwarf-unwind.c:211:8
#17 0x559ceabbc4ab in run_test tools/perf/tests/builtin-test.c:418:9
#18 0x559ceabbc4ab in test_and_print tools/perf/tests/builtin-test.c:448:9
#19 0x559ceabbac70 in __cmd_test tools/perf/tests/builtin-test.c:669:4
#20 0x559ceabbac70 in cmd_test tools/perf/tests/builtin-test.c:815:9
#21 0x559cea960e30 in run_builtin tools/perf/perf.c:313:11
#22 0x559cea95fbce in handle_internal_command tools/perf/perf.c:365:8
#23 0x559cea95fbce in run_argv tools/perf/perf.c:409:2
#24 0x559cea95fbce in main tools/perf/perf.c:539:3
Uninitialized value was stored to memory at
#0 0x559ceb106a54 in handle_cfi elfutils/libdwfl/frame_unwind.c:613:9
#1 0x559ceb105448 in __libdwfl_frame_unwind elfutils/libdwfl/frame_unwind.c:741:4
#2 0x559ceb0ece90 in dwfl_thread_getframes elfutils/libdwfl/dwfl_frame.c:435:7
#3 0x559ceb0ec6b7 in get_one_thread_frames_cb elfutils/libdwfl/dwfl_frame.c:379:10
#4 0x559ceb0ec6b7 in get_one_thread_cb elfutils/libdwfl/dwfl_frame.c:308:17
#5 0x559ceb0ec6b7 in dwfl_getthreads elfutils/libdwfl/dwfl_frame.c:283:17
#6 0x559ceb0ec6b7 in getthread elfutils/libdwfl/dwfl_frame.c:354:14
#7 0x559ceb0ec6b7 in dwfl_getthread_frames elfutils/libdwfl/dwfl_frame.c:388:10
#8 0x559ceaff6ae6 in unwind__get_entries tools/perf/util/unwind-libdw.c:236:8
#9 0x559ceabc9dbc in test_dwarf_unwind__thread tools/perf/tests/dwarf-unwind.c:111:8
#10 0x559ceabca5cf in test_dwarf_unwind__compare tools/perf/tests/dwarf-unwind.c:138:26
#11 0x7f812a6865b0 in bsearch (libc.so.6+0x4e5b0)
#12 0x559ceabca871 in test_dwarf_unwind__krava_3 tools/perf/tests/dwarf-unwind.c:162:2
#13 0x559ceabca926 in test_dwarf_unwind__krava_2 tools/perf/tests/dwarf-unwind.c:169:9
#14 0x559ceabca946 in test_dwarf_unwind__krava_1 tools/perf/tests/dwarf-unwind.c:174:9
#15 0x559ceabcae12 in test__dwarf_unwind tools/perf/tests/dwarf-unwind.c:211:8
#16 0x559ceabbc4ab in run_test tools/perf/tests/builtin-test.c:418:9
#17 0x559ceabbc4ab in test_and_print tools/perf/tests/builtin-test.c:448:9
#18 0x559ceabbac70 in __cmd_test tools/perf/tests/builtin-test.c:669:4
#19 0x559ceabbac70 in cmd_test tools/perf/tests/builtin-test.c:815:9
#20 0x559cea960e30 in run_builtin tools/perf/perf.c:313:11
#21 0x559cea95fbce in handle_internal_command tools/perf/perf.c:365:8
#22 0x559cea95fbce in run_argv tools/perf/perf.c:409:2
#23 0x559cea95fbce in main tools/perf/perf.c:539:3
Uninitialized value was stored to memory at
#0 0x559ceaff8800 in memory_read tools/perf/util/unwind-libdw.c:156:10
#1 0x559ceb10f053 in expr_eval elfutils/libdwfl/frame_unwind.c:501:13
#2 0x559ceb1060cc in handle_cfi elfutils/libdwfl/frame_unwind.c:603:18
#3 0x559ceb105448 in __libdwfl_frame_unwind elfutils/libdwfl/frame_unwind.c:741:4
#4 0x559ceb0ece90 in dwfl_thread_getframes elfutils/libdwfl/dwfl_frame.c:435:7
#5 0x559ceb0ec6b7 in get_one_thread_frames_cb elfutils/libdwfl/dwfl_frame.c:379:10
#6 0x559ceb0ec6b7 in get_one_thread_cb elfutils/libdwfl/dwfl_frame.c:308:17
#7 0x559ceb0ec6b7 in dwfl_getthreads elfutils/libdwfl/dwfl_frame.c:283:17
#8 0x559ceb0ec6b7 in getthread elfutils/libdwfl/dwfl_frame.c:354:14
#9 0x559ceb0ec6b7 in dwfl_getthread_frames elfutils/libdwfl/dwfl_frame.c:388:10
#10 0x559ceaff6ae6 in unwind__get_entries tools/perf/util/unwind-libdw.c:236:8
#11 0x559ceabc9dbc in test_dwarf_unwind__thread tools/perf/tests/dwarf-unwind.c:111:8
#12 0x559ceabca5cf in test_dwarf_unwind__compare tools/perf/tests/dwarf-unwind.c:138:26
#13 0x7f812a6865b0 in bsearch (libc.so.6+0x4e5b0)
#14 0x559ceabca871 in test_dwarf_unwind__krava_3 tools/perf/tests/dwarf-unwind.c:162:2
#15 0x559ceabca926 in test_dwarf_unwind__krava_2 tools/perf/tests/dwarf-unwind.c:169:9
#16 0x559ceabca946 in test_dwarf_unwind__krava_1 tools/perf/tests/dwarf-unwind.c:174:9
#17 0x559ceabcae12 in test__dwarf_unwind tools/perf/tests/dwarf-unwind.c:211:8
#18 0x559ceabbc4ab in run_test tools/perf/tests/builtin-test.c:418:9
#19 0x559ceabbc4ab in test_and_print tools/perf/tests/builtin-test.c:448:9
#20 0x559ceabbac70 in __cmd_test tools/perf/tests/builtin-test.c:669:4
#21 0x559ceabbac70 in cmd_test tools/perf/tests/builtin-test.c:815:9
#22 0x559cea960e30 in run_builtin tools/perf/perf.c:313:11
#23 0x559cea95fbce in handle_internal_command tools/perf/perf.c:365:8
#24 0x559cea95fbce in run_argv tools/perf/perf.c:409:2
#25 0x559cea95fbce in main tools/perf/perf.c:539:3
Uninitialized value was stored to memory at
#0 0x559cea9027d9 in __msan_memcpy llvm/llvm-project/compiler-rt/lib/msan/msan_interceptors.cpp:1558:3
#1 0x559cea9d2185 in sample_ustack tools/perf/arch/x86/tests/dwarf-unwind.c:41:2
#2 0x559cea9d202c in test__arch_unwind_sample tools/perf/arch/x86/tests/dwarf-unwind.c:72:9
#3 0x559ceabc9cbd in test_dwarf_unwind__thread tools/perf/tests/dwarf-unwind.c:106:6
#4 0x559ceabca5cf in test_dwarf_unwind__compare tools/perf/tests/dwarf-unwind.c:138:26
#5 0x7f812a6865b0 in bsearch (libc.so.6+0x4e5b0)
#6 0x559ceabca871 in test_dwarf_unwind__krava_3 tools/perf/tests/dwarf-unwind.c:162:2
#7 0x559ceabca926 in test_dwarf_unwind__krava_2 tools/perf/tests/dwarf-unwind.c:169:9
#8 0x559ceabca946 in test_dwarf_unwind__krava_1 tools/perf/tests/dwarf-unwind.c:174:9
#9 0x559ceabcae12 in test__dwarf_unwind tools/perf/tests/dwarf-unwind.c:211:8
#10 0x559ceabbc4ab in run_test tools/perf/tests/builtin-test.c:418:9
#11 0x559ceabbc4ab in test_and_print tools/perf/tests/builtin-test.c:448:9
#12 0x559ceabbac70 in __cmd_test tools/perf/tests/builtin-test.c:669:4
#13 0x559ceabbac70 in cmd_test tools/perf/tests/builtin-test.c:815:9
#14 0x559cea960e30 in run_builtin tools/perf/perf.c:313:11
#15 0x559cea95fbce in handle_internal_command tools/perf/perf.c:365:8
#16 0x559cea95fbce in run_argv tools/perf/perf.c:409:2
#17 0x559cea95fbce in main tools/perf/perf.c:539:3
Uninitialized value was created by an allocation of 'bf' in the stack frame of function 'perf_event__synthesize_mmap_events'
#0 0x559ceafc5f60 in perf_event__synthesize_mmap_events tools/perf/util/synthetic-events.c:445
SUMMARY: MemorySanitizer: use-of-uninitialized-value elfutils/libdwfl/frame_unwind.c:648:8 in handle_cfi
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: clang-built-linux@googlegroups.com
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sandeep Dasgupta <sdasgup@google.com>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20201113182053.754625-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Fix the following possible deadlock reported by lockdep, by disabling
BH while running mt76_free_pending_txwi():

================================
WARNING: inconsistent lock state
5.9.0-rc6 #14 Not tainted
--------------------------------
inconsistent {IN-SOFTIRQ-W} -> {SOFTIRQ-ON-W} usage.
rmmod/1227 [HC0[0]:SC0[0]:HE1:SE1] takes:
ffff888156a83530 (&dev->lock#2){+.?.}-{2:2}, at: mt76_dma_cleanup+0x125/0x150 [mt76]
{IN-SOFTIRQ-W} state was registered at:
  __lock_acquire+0x20c/0x6b0
  lock_acquire+0x9d/0x220
  _raw_spin_lock+0x2c/0x70
  mt76_dma_tx_cleanup+0xc7/0x200 [mt76]
  mt76x02_poll_tx+0x31/0xb0 [mt76x02_lib]
  napi_poll+0x3a/0x100
  net_rx_action+0xa8/0x200
  __do_softirq+0xc4/0x430
  asm_call_on_stack+0xf/0x20
  do_softirq_own_stack+0x49/0x60
  irq_exit_rcu+0x9a/0xd0
  common_interrupt+0xa4/0x190
  asm_common_interrupt+0x1e/0x40
irq event stamp: 9915
hardirqs last enabled at (9915): [<ffffffff8124e286>] __free_pages_ok+0x336/0x3b0
hardirqs last disabled at (9914): [<ffffffff8124e24e>] __free_pages_ok+0x2fe/0x3b0
softirqs last enabled at (9912): [<ffffffffa03aa672>] mt76_dma_rx_cleanup+0xa2/0x120 [mt76]
softirqs last disabled at (9846): [<ffffffffa03aa5ea>] mt76_dma_rx_cleanup+0x1a/0x120 [mt76]

other info that might help us debug this:
Possible unsafe locking scenario:

CPU0
----
lock(&dev->lock#2);
<Interrupt>
  lock(&dev->lock#2);

*** DEADLOCK ***

1 lock held by rmmod/1227:
#0: ffff88815b5eb240 (&dev->mutex){....}-{3:3}, at: driver_detach+0xb5/0x110

stack backtrace:
CPU: 1 PID: 1227 Comm: rmmod Kdump: loaded Not tainted 5.9.0-rc6-wdn-src+ #14
Hardware name: Dell Inc. Studio XPS 1340/0K183D, BIOS A11 09/08/2009
Call Trace:
 dump_stack+0x77/0xa0
 mark_lock_irq.cold+0x15/0x39
 mark_lock+0x1fc/0x500
 mark_usage+0xc7/0x140
 __lock_acquire+0x20c/0x6b0
 ? find_held_lock+0x2b/0x80
 ? sched_clock_cpu+0xc/0xb0
 lock_acquire+0x9d/0x220
 ? mt76_dma_cleanup+0x125/0x150 [mt76]
 _raw_spin_lock+0x2c/0x70
 ? mt76_dma_cleanup+0x125/0x150 [mt76]
 mt76_dma_cleanup+0x125/0x150 [mt76]
 mt76x2_cleanup+0x5a/0x70 [mt76x2e]
 mt76x2e_remove+0x18/0x30 [mt76x2e]
 pci_device_remove+0x36/0xa0
 __device_release_driver+0x16c/0x220
 driver_detach+0xcf/0x110
 bus_remove_driver+0x56/0xca
 pci_unregister_driver+0x36/0x80
 __do_sys_delete_module.constprop.0+0x127/0x200
 ? syscall_enter_from_user_mode+0x1d/0x50
 ? trace_hardirqs_on+0x1c/0xe0
 do_syscall_64+0x33/0x80
 entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7ff0da54e36b
Code: 73 01 c3 48 8b 0d 2d 0b 0c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa b8 b0 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d fd 0a 0c 00 f7 d8 64 89 01 48

Fixes: dd57a95 ("mt76: move txwi handling code to dma.c, since it is mmio specific")
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: Felix Fietkau <nbd@nbd.name>
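A hedged sketch of the fix shape: bracket the txwi cleanup with
local_bh_disable()/local_bh_enable() so the list lock is never taken
with bottom halves enabled while NAPI can also take it. The body
details are elided; only the BH bracketing is the point:

static void mt76_free_pending_txwi(struct mt76_dev *dev)
{
        struct mt76_txwi_cache *t;

        local_bh_disable();     /* lock is also taken from NAPI (softirq) */
        while ((t = __mt76_get_txwi(dev)) != NULL) {
                /* dma_unmap_single() and free the cache entry ... */
        }
        local_bh_enable();
}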
Ido Schimmel says:
====================
mlxsw: Introduce initial XM router support
This patch set implements initial eXtended Mezzanine (XM) router
support.
The XM is an external device connected to the Spectrum-{2,3} ASICs using
dedicated Ethernet ports. Its purpose is to increase the number of
routes that can be offloaded to hardware. This is achieved by having the
ASIC act as a cache that refers cache misses to the XM where the FIB is
stored and LPM lookup is performed.
Future patch sets will add more sophisticated cache flushing and
selftests that utilize cache counters on the ASIC, which we plan to
expose via devlink-metric [1].
Patch set overview:
Patches #1-#2 add registers to insert/remove routes to/from the XM and
to enable/disable it. Patch #3 utilizes these registers in order to
implement XM-specific router low-level operations.
Patches #4-#5 query from firmware the availability of the XM and the
local ports that are used to connect the ASIC to the XM, so that netdevs
will not be created for them.
Patches #6-#8 initialize the XM by configuring its cache parameters.
Patch #9-#10 implement cache management, so that LPM lookup will be
correctly cached in the ASIC.
Patches #11-#13 implement cache flushing, so that routes
insertions/removals to/from the XM will flush the affected entries in
the cache.
Patch #14 configures the ASIC to allocate half of its memory for the
cache, so that room will be left for other entries (e.g., FDBs,
neighbours).
Patch #15 starts using the XM for IPv4 route offload, when available.
[1] https://lore.kernel.org/netdev/20200817125059.193242-1-idosch@idosch.org/
====================
Link: https://lore.kernel.org/r/20201214113041.2789043-1-idosch@idosch.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Calling btrfs_qgroup_reserve_meta_prealloc from
btrfs_delayed_inode_reserve_metadata can result in flushing delalloc
while holding a transaction and delayed node locks. This is deadlock
prone. In the past multiple commits:

 ae5e070 ("btrfs: qgroup: don't try to wait flushing if we're already holding a transaction")
 6f23277 ("btrfs: qgroup: don't commit transaction when we already hold the handle")

tried to solve various aspects of this, but this was always a
whack-a-mole game. Unfortunately those 2 fixes don't solve a deadlock
scenario involving btrfs_delayed_node::mutex. Namely, one thread can
call btrfs_dirty_inode as a result of reading a file and modifying its
atime:

PID: 6963 TASK: ffff8c7f3f94c000 CPU: 2 COMMAND: "test"
#0 __schedule at ffffffffa529e07d
#1 schedule at ffffffffa529e4ff
#2 schedule_timeout at ffffffffa52a1bdd
#3 wait_for_completion at ffffffffa529eeea       <-- sleeps with delayed node mutex held
#4 start_delalloc_inodes at ffffffffc0380db5
#5 btrfs_start_delalloc_snapshot at ffffffffc0393836
#6 try_flush_qgroup at ffffffffc03f04b2
#7 __btrfs_qgroup_reserve_meta at ffffffffc03f5bb6 <-- tries to reserve space and starts delalloc inodes.
#8 btrfs_delayed_update_inode at ffffffffc03e31aa <-- acquires delayed node mutex
#9 btrfs_update_inode at ffffffffc0385ba8
#10 btrfs_dirty_inode at ffffffffc038627b         <-- TRANSACTION OPENED
#11 touch_atime at ffffffffa4cf0000
#12 generic_file_read_iter at ffffffffa4c1f123
#13 new_sync_read at ffffffffa4ccdc8a
#14 vfs_read at ffffffffa4cd0849
#15 ksys_read at ffffffffa4cd0bd1
#16 do_syscall_64 at ffffffffa4a052eb
#17 entry_SYSCALL_64_after_hwframe at ffffffffa540008c

This causes an asynchronous work to flush the delalloc inodes, which
can try to acquire the same delayed_node mutex:

PID: 455 TASK: ffff8c8085fa4000 CPU: 5 COMMAND: "kworker/u16:30"
#0 __schedule at ffffffffa529e07d
#1 schedule at ffffffffa529e4ff
#2 schedule_preempt_disabled at ffffffffa529e80a
#3 __mutex_lock at ffffffffa529fdcb                <-- goes to sleep, never wakes up.
#4 btrfs_delayed_update_inode at ffffffffc03e3143  <-- tries to acquire the mutex
#5 btrfs_update_inode at ffffffffc0385ba8          <-- this is the same inode that pid 6963 is holding
#6 cow_file_range_inline.constprop.78 at ffffffffc0386be7
#7 cow_file_range at ffffffffc03879c1
#8 btrfs_run_delalloc_range at ffffffffc038894c
#9 writepage_delalloc at ffffffffc03a3c8f
#10 __extent_writepage at ffffffffc03a4c01
#11 extent_write_cache_pages at ffffffffc03a500b
#12 extent_writepages at ffffffffc03a6de2
#13 do_writepages at ffffffffa4c277eb
#14 __filemap_fdatawrite_range at ffffffffa4c1e5bb
#15 btrfs_run_delalloc_work at ffffffffc0380987    <-- starts running delayed nodes
#16 normal_work_helper at ffffffffc03b706c
#17 process_one_work at ffffffffa4aba4e4
#18 worker_thread at ffffffffa4aba6fd
#19 kthread at ffffffffa4ac0a3d
#20 ret_from_fork at ffffffffa54001ff

To fully address those cases, the complete fix is to never issue any
flushing while holding the transaction or the delayed node lock. This
patch achieves it by calling qgroup_reserve_meta directly, which will
either succeed without flushing or will fail and return -EDQUOT. In the
latter case that return value is propagated to btrfs_dirty_inode, which
falls back to starting a new transaction. That's fine, as the majority
of the time we expect the inode to have the
BTRFS_DELAYED_NODE_INODE_DIRTY flag set, which results in directly
copying the in-memory state.
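A hedged sketch of the fix shape (qgroup_reserve_meta() stands for the
non-flushing variant named above; the real call site is
btrfs_delayed_inode_reserve_metadata(), and the exact signature is an
assumption):

static int reserve_meta_sketch(struct btrfs_root *root, u64 num_bytes)
{
        int ret;

        /* No try_flush_qgroup() here: flushing would deadlock against
         * the delayed node mutex held on this path. */
        ret = qgroup_reserve_meta(root, num_bytes,
                                  BTRFS_QGROUP_RSV_META_PREALLOC, true);
        if (ret < 0)
                return ret;     /* -EDQUOT propagates to btrfs_dirty_inode(),
                                 * which falls back to a new transaction */
        return 0;
}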
Fixes: c53e965 ("btrfs: qgroup: try to flush qgroup space when we get -EDQUOT")
CC: stable@vger.kernel.org # 5.10+
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
The evlist has the maps with its own refcounts so we don't need to set
the pointers to NULL. Otherwise following error was reported by Asan.
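A hedged sketch of the fix pattern in tests/mmap-basic.c (names follow
the perf sources; the exact control flow is an assumption): the evlist
grabs its own references in perf_evlist__set_maps(), so the test keeps
its map pointers and puts them on exit instead of NULLing them out:

static void mmap_basic_teardown_sketch(struct evlist *evlist,
                                       struct perf_cpu_map *cpus,
                                       struct perf_thread_map *threads)
{
        /* earlier: perf_evlist__set_maps(&evlist->core, cpus, threads);
         * the buggy version then did "cpus = NULL; threads = NULL;",
         * leaking the references the test itself still owned */
        evlist__delete(evlist);         /* drops the evlist's map refs */
        perf_cpu_map__put(cpus);        /* drops the test's own refs */
        perf_thread_map__put(threads);
}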
# perf test -v 4
4: Read samples using the mmap interface :
--- start ---
test child forked, pid 139782
mmap size 528384B
=================================================================
==139782==ERROR: LeakSanitizer: detected memory leaks
Direct leak of 40 byte(s) in 1 object(s) allocated from:
#0 0x7f1f76daee8f in __interceptor_malloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:145
#1 0x564ba21a0fea in cpu_map__trim_new /home/namhyung/project/linux/tools/lib/perf/cpumap.c:79
#2 0x564ba21a1a0f in perf_cpu_map__read /home/namhyung/project/linux/tools/lib/perf/cpumap.c:149
#3 0x564ba21a21cf in cpu_map__read_all_cpu_map /home/namhyung/project/linux/tools/lib/perf/cpumap.c:166
#4 0x564ba21a21cf in perf_cpu_map__new /home/namhyung/project/linux/tools/lib/perf/cpumap.c:181
#5 0x564ba1e48298 in test__basic_mmap tests/mmap-basic.c:55
#6 0x564ba1e278fb in run_test tests/builtin-test.c:428
#7 0x564ba1e278fb in test_and_print tests/builtin-test.c:458
#8 0x564ba1e29a53 in __cmd_test tests/builtin-test.c:679
#9 0x564ba1e29a53 in cmd_test tests/builtin-test.c:825
#10 0x564ba1e95cb4 in run_builtin /home/namhyung/project/linux/tools/perf/perf.c:313
#11 0x564ba1d1fa88 in handle_internal_command /home/namhyung/project/linux/tools/perf/perf.c:365
#12 0x564ba1d1fa88 in run_argv /home/namhyung/project/linux/tools/perf/perf.c:409
#13 0x564ba1d1fa88 in main /home/namhyung/project/linux/tools/perf/perf.c:539
#14 0x7f1f768e4d09 in __libc_start_main ../csu/libc-start.c:308
...
test child finished with 1
---- end ----
Read samples using the mmap interface: FAILED!
failed to open shell test directory: /home/namhyung/libexec/perf-core/tests/shell
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Link: https://lore.kernel.org/r/20210301140409.184570-2-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The evlist and the cpu/thread maps should be released together.
Otherwise the following error was reported by ASan.
Note that this test still has memory leaks in DSOs so it still fails
even after this change. I'll take a look at that too.
# perf test -v 26
26: Object code reading :
--- start ---
test child forked, pid 154184
Looking at the vmlinux_path (8 entries long)
symsrc__init: build id mismatch for vmlinux.
symsrc__init: cannot get elf header.
Using /proc/kcore for kernel data
Using /proc/kallsyms for symbols
Parsing event 'cycles'
mmap size 528384B
...
=================================================================
==154184==ERROR: LeakSanitizer: detected memory leaks
Direct leak of 439 byte(s) in 1 object(s) allocated from:
#0 0x7fcb66e77037 in __interceptor_calloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:154
#1 0x55ad9b7e821e in dso__new_id util/dso.c:1256
#2 0x55ad9b8cfd4a in __machine__addnew_vdso util/vdso.c:132
#3 0x55ad9b8cfd4a in machine__findnew_vdso util/vdso.c:347
#4 0x55ad9b845b7e in map__new util/map.c:176
#5 0x55ad9b8415a2 in machine__process_mmap2_event util/machine.c:1787
#6 0x55ad9b8fab16 in perf_tool__process_synth_event util/synthetic-events.c:64
#7 0x55ad9b8fab16 in perf_event__synthesize_mmap_events util/synthetic-events.c:499
#8 0x55ad9b8fbfdf in __event__synthesize_thread util/synthetic-events.c:741
#9 0x55ad9b8ff3e3 in perf_event__synthesize_thread_map util/synthetic-events.c:833
#10 0x55ad9b738585 in do_test_code_reading tests/code-reading.c:608
#11 0x55ad9b73b25d in test__code_reading tests/code-reading.c:722
#12 0x55ad9b6f28fb in run_test tests/builtin-test.c:428
#13 0x55ad9b6f28fb in test_and_print tests/builtin-test.c:458
#14 0x55ad9b6f4a53 in __cmd_test tests/builtin-test.c:679
#15 0x55ad9b6f4a53 in cmd_test tests/builtin-test.c:825
#16 0x55ad9b760cc4 in run_builtin /home/namhyung/project/linux/tools/perf/perf.c:313
#17 0x55ad9b5eaa88 in handle_internal_command /home/namhyung/project/linux/tools/perf/perf.c:365
#18 0x55ad9b5eaa88 in run_argv /home/namhyung/project/linux/tools/perf/perf.c:409
#19 0x55ad9b5eaa88 in main /home/namhyung/project/linux/tools/perf/perf.c:539
#20 0x7fcb669acd09 in __libc_start_main ../csu/libc-start.c:308
...
SUMMARY: AddressSanitizer: 471 byte(s) leaked in 2 allocation(s).
test child finished with 1
---- end ----
Object code reading: FAILED!
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20210301140409.184570-6-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The evlist and the cpu/thread maps should be released together.
Otherwise the following error was reported by ASan.
$ perf test -v 28
28: Use a dummy software event to keep tracking:
--- start ---
test child forked, pid 156810
mmap size 528384B
=================================================================
==156810==ERROR: LeakSanitizer: detected memory leaks
Direct leak of 40 byte(s) in 1 object(s) allocated from:
#0 0x7f637d2bce8f in __interceptor_malloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:145
#1 0x55cc6295cffa in cpu_map__trim_new /home/namhyung/project/linux/tools/lib/perf/cpumap.c:79
#2 0x55cc6295da1f in perf_cpu_map__read /home/namhyung/project/linux/tools/lib/perf/cpumap.c:149
#3 0x55cc6295e1df in cpu_map__read_all_cpu_map /home/namhyung/project/linux/tools/lib/perf/cpumap.c:166
#4 0x55cc6295e1df in perf_cpu_map__new /home/namhyung/project/linux/tools/lib/perf/cpumap.c:181
#5 0x55cc626287cf in test__keep_tracking tests/keep-tracking.c:84
#6 0x55cc625e38fb in run_test tests/builtin-test.c:428
#7 0x55cc625e38fb in test_and_print tests/builtin-test.c:458
#8 0x55cc625e5a53 in __cmd_test tests/builtin-test.c:679
#9 0x55cc625e5a53 in cmd_test tests/builtin-test.c:825
#10 0x55cc62651cc4 in run_builtin /home/namhyung/project/linux/tools/perf/perf.c:313
#11 0x55cc624dba88 in handle_internal_command /home/namhyung/project/linux/tools/perf/perf.c:365
#12 0x55cc624dba88 in run_argv /home/namhyung/project/linux/tools/perf/perf.c:409
#13 0x55cc624dba88 in main /home/namhyung/project/linux/tools/perf/perf.c:539
#14 0x7f637cdf2d09 in __libc_start_main ../csu/libc-start.c:308
SUMMARY: AddressSanitizer: 72 byte(s) leaked in 2 allocation(s).
test child finished with 1
---- end ----
Use a dummy software event to keep tracking: FAILED!
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20210301140409.184570-7-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The evlist and cpu/thread maps should be released together.
Otherwise the following error was reported by Asan.
$ perf test -v 35
35: Track with sched_switch :
--- start ---
test child forked, pid 159287
Using CPUID GenuineIntel-6-8E-C
mmap size 528384B
1295 events recorded
=================================================================
==159287==ERROR: LeakSanitizer: detected memory leaks
Direct leak of 40 byte(s) in 1 object(s) allocated from:
#0 0x7fa28d9a2e8f in __interceptor_malloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:145
#1 0x5652f5a5affa in cpu_map__trim_new /home/namhyung/project/linux/tools/lib/perf/cpumap.c:79
#2 0x5652f5a5ba1f in perf_cpu_map__read /home/namhyung/project/linux/tools/lib/perf/cpumap.c:149
#3 0x5652f5a5c1df in cpu_map__read_all_cpu_map /home/namhyung/project/linux/tools/lib/perf/cpumap.c:166
#4 0x5652f5a5c1df in perf_cpu_map__new /home/namhyung/project/linux/tools/lib/perf/cpumap.c:181
#5 0x5652f5723bbf in test__switch_tracking tests/switch-tracking.c:350
#6 0x5652f56e18fb in run_test tests/builtin-test.c:428
#7 0x5652f56e18fb in test_and_print tests/builtin-test.c:458
#8 0x5652f56e3a53 in __cmd_test tests/builtin-test.c:679
#9 0x5652f56e3a53 in cmd_test tests/builtin-test.c:825
#10 0x5652f574fcc4 in run_builtin /home/namhyung/project/linux/tools/perf/perf.c:313
#11 0x5652f55d9a88 in handle_internal_command /home/namhyung/project/linux/tools/perf/perf.c:365
#12 0x5652f55d9a88 in run_argv /home/namhyung/project/linux/tools/perf/perf.c:409
#13 0x5652f55d9a88 in main /home/namhyung/project/linux/tools/perf/perf.c:539
#14 0x7fa28d4d8d09 in __libc_start_main ../csu/libc-start.c:308
SUMMARY: AddressSanitizer: 72 byte(s) leaked in 2 allocation(s).
test child finished with 1
---- end ----
Track with sched_switch: FAILED!
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20210301140409.184570-8-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When the XDP program is loaded, the XDP callback adds new Tx queues.
This means that the callback must update the Tx scheduler with the new
queue number. In the event of a Tx scheduler failure, the XDP callback
should also fail and roll back any changes previously made for XDP
preparation.

The previous implementation had a bug: not all changes made by the XDP
callback were rolled back. This caused the crash with the following
call trace:

[ +9.549584] ice 0000:ca:00.0: Failed VSI LAN queue config for XDP, error: -5
[ +0.382335] Oops: general protection fault, probably for non-canonical address 0x50a2250a90495525: 0000 [#1] SMP NOPTI
[ +0.010710] CPU: 103 UID: 0 PID: 0 Comm: swapper/103 Not tainted 6.14.0-net-next-mar-31+ #14 PREEMPT(voluntary)
[ +0.010175] Hardware name: Intel Corporation M50CYP2SBSTD/M50CYP2SBSTD, BIOS SE5C620.86B.01.01.0005.2202160810 02/16/2022
[ +0.010946] RIP: 0010:__ice_update_sample+0x39/0xe0 [ice]
[...]
[ +0.002715] Call Trace:
[ +0.002452] <IRQ>
[ +0.002021] ? __die_body.cold+0x19/0x29
[ +0.003922] ? die_addr+0x3c/0x60
[ +0.003319] ? exc_general_protection+0x17c/0x400
[ +0.004707] ? asm_exc_general_protection+0x26/0x30
[ +0.004879] ? __ice_update_sample+0x39/0xe0 [ice]
[ +0.004835] ice_napi_poll+0x665/0x680 [ice]
[ +0.004320] __napi_poll+0x28/0x190
[ +0.003500] net_rx_action+0x198/0x360
[ +0.003752] ? update_rq_clock+0x39/0x220
[ +0.004013] handle_softirqs+0xf1/0x340
[ +0.003840] ? sched_clock_cpu+0xf/0x1f0
[ +0.003925] __irq_exit_rcu+0xc2/0xe0
[ +0.003665] common_interrupt+0x85/0xa0
[ +0.003839] </IRQ>
[ +0.002098] <TASK>
[ +0.002106] asm_common_interrupt+0x26/0x40
[ +0.004184] RIP: 0010:cpuidle_enter_state+0xd3/0x690

Fix this by performing the missing unmapping of XDP queues from
q_vectors and setting the XDP rings pointer back to NULL after all
those queues are released. Also, add an immediate exit from the XDP
callback in case of ring preparation failure.

Fixes: efc2214 ("ice: Add support for XDP")
Reviewed-by: Dawid Osuchowski <dawid.osuchowski@linux.intel.com>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Michal Kubiak <michal.kubiak@intel.com>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Tested-by: Jesse Brandeburg <jbrandeburg@cloudflare.com>
Tested-by: Saritha Sanigani <sarithax.sanigani@intel.com> (A Contingent Worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: NipaLocal <nipa@local>
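A hedged sketch of the rollback shape described above (helper names are
illustrative, following ice driver conventions, not the exact code): on
Tx scheduler failure, unmap the XDP rings from their q_vectors and
clear vsi->xdp_rings after freeing, then fail the callback instead of
leaving half-configured state behind:

static int ice_prepare_xdp_sketch(struct ice_vsi *vsi)
{
        int err;

        err = ice_assign_xdp_txqs(vsi);         /* illustrative helper */
        if (err)
                return err;                     /* exit immediately */

        err = ice_cfg_tx_scheduler(vsi);        /* illustrative helper */
        if (err)
                goto unmap_xdp_rings;
        return 0;

unmap_xdp_rings:
        ice_unmap_xdp_rings_from_qvectors(vsi); /* the missing unmap */
        ice_free_xdp_rings(vsi);
        vsi->xdp_rings = NULL;                  /* reset pointer after free */
        return err;
}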
When the XDP program is loaded, the XDP callback adds new Tx queues. This means that the callback must update the Tx scheduler with the new queue number. In the event of a Tx scheduler failure, the XDP callback should also fail and roll back any changes previously made for XDP preparation. The previous implementation had a bug that not all changes made by the XDP callback were rolled back. This caused the crash with the following call trace: [ +9.549584] ice 0000:ca:00.0: Failed VSI LAN queue config for XDP, error: -5 [ +0.382335] Oops: general protection fault, probably for non-canonical address 0x50a2250a90495525: 0000 [kernel-patches#1] SMP NOPTI [ +0.010710] CPU: 103 UID: 0 PID: 0 Comm: swapper/103 Not tainted 6.14.0-net-next-mar-31+ kernel-patches#14 PREEMPT(voluntary) [ +0.010175] Hardware name: Intel Corporation M50CYP2SBSTD/M50CYP2SBSTD, BIOS SE5C620.86B.01.01.0005.2202160810 02/16/2022 [ +0.010946] RIP: 0010:__ice_update_sample+0x39/0xe0 [ice] [...] [ +0.002715] Call Trace: [ +0.002452] <IRQ> [ +0.002021] ? __die_body.cold+0x19/0x29 [ +0.003922] ? die_addr+0x3c/0x60 [ +0.003319] ? exc_general_protection+0x17c/0x400 [ +0.004707] ? asm_exc_general_protection+0x26/0x30 [ +0.004879] ? __ice_update_sample+0x39/0xe0 [ice] [ +0.004835] ice_napi_poll+0x665/0x680 [ice] [ +0.004320] __napi_poll+0x28/0x190 [ +0.003500] net_rx_action+0x198/0x360 [ +0.003752] ? update_rq_clock+0x39/0x220 [ +0.004013] handle_softirqs+0xf1/0x340 [ +0.003840] ? sched_clock_cpu+0xf/0x1f0 [ +0.003925] __irq_exit_rcu+0xc2/0xe0 [ +0.003665] common_interrupt+0x85/0xa0 [ +0.003839] </IRQ> [ +0.002098] <TASK> [ +0.002106] asm_common_interrupt+0x26/0x40 [ +0.004184] RIP: 0010:cpuidle_enter_state+0xd3/0x690 Fix this by performing the missing unmapping of XDP queues from q_vectors and setting the XDP rings pointer back to NULL after all those queues are released. Also, add an immediate exit from the XDP callback in case of ring preparation failure. Fixes: efc2214 ("ice: Add support for XDP") Reviewed-by: Dawid Osuchowski <dawid.osuchowski@linux.intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Michal Kubiak <michal.kubiak@intel.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Jesse Brandeburg <jbrandeburg@cloudflare.com> Tested-by: Saritha Sanigani <sarithax.sanigani@intel.com> (A Contingent Worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: NipaLocal <nipa@local>
When the XDP program is loaded, the XDP callback adds new Tx queues. This means that the callback must update the Tx scheduler with the new queue number. In the event of a Tx scheduler failure, the XDP callback should also fail and roll back any changes previously made for XDP preparation. The previous implementation had a bug that not all changes made by the XDP callback were rolled back. This caused the crash with the following call trace: [ +9.549584] ice 0000:ca:00.0: Failed VSI LAN queue config for XDP, error: -5 [ +0.382335] Oops: general protection fault, probably for non-canonical address 0x50a2250a90495525: 0000 [kernel-patches#1] SMP NOPTI [ +0.010710] CPU: 103 UID: 0 PID: 0 Comm: swapper/103 Not tainted 6.14.0-net-next-mar-31+ kernel-patches#14 PREEMPT(voluntary) [ +0.010175] Hardware name: Intel Corporation M50CYP2SBSTD/M50CYP2SBSTD, BIOS SE5C620.86B.01.01.0005.2202160810 02/16/2022 [ +0.010946] RIP: 0010:__ice_update_sample+0x39/0xe0 [ice] [...] [ +0.002715] Call Trace: [ +0.002452] <IRQ> [ +0.002021] ? __die_body.cold+0x19/0x29 [ +0.003922] ? die_addr+0x3c/0x60 [ +0.003319] ? exc_general_protection+0x17c/0x400 [ +0.004707] ? asm_exc_general_protection+0x26/0x30 [ +0.004879] ? __ice_update_sample+0x39/0xe0 [ice] [ +0.004835] ice_napi_poll+0x665/0x680 [ice] [ +0.004320] __napi_poll+0x28/0x190 [ +0.003500] net_rx_action+0x198/0x360 [ +0.003752] ? update_rq_clock+0x39/0x220 [ +0.004013] handle_softirqs+0xf1/0x340 [ +0.003840] ? sched_clock_cpu+0xf/0x1f0 [ +0.003925] __irq_exit_rcu+0xc2/0xe0 [ +0.003665] common_interrupt+0x85/0xa0 [ +0.003839] </IRQ> [ +0.002098] <TASK> [ +0.002106] asm_common_interrupt+0x26/0x40 [ +0.004184] RIP: 0010:cpuidle_enter_state+0xd3/0x690 Fix this by performing the missing unmapping of XDP queues from q_vectors and setting the XDP rings pointer back to NULL after all those queues are released. Also, add an immediate exit from the XDP callback in case of ring preparation failure. Fixes: efc2214 ("ice: Add support for XDP") Reviewed-by: Dawid Osuchowski <dawid.osuchowski@linux.intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Michal Kubiak <michal.kubiak@intel.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Jesse Brandeburg <jbrandeburg@cloudflare.com> Tested-by: Saritha Sanigani <sarithax.sanigani@intel.com> (A Contingent Worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: NipaLocal <nipa@local>
When the XDP program is loaded, the XDP callback adds new Tx queues. This means that the callback must update the Tx scheduler with the new queue number. In the event of a Tx scheduler failure, the XDP callback should also fail and roll back any changes previously made for XDP preparation. The previous implementation had a bug that not all changes made by the XDP callback were rolled back. This caused the crash with the following call trace: [ +9.549584] ice 0000:ca:00.0: Failed VSI LAN queue config for XDP, error: -5 [ +0.382335] Oops: general protection fault, probably for non-canonical address 0x50a2250a90495525: 0000 [kernel-patches#1] SMP NOPTI [ +0.010710] CPU: 103 UID: 0 PID: 0 Comm: swapper/103 Not tainted 6.14.0-net-next-mar-31+ kernel-patches#14 PREEMPT(voluntary) [ +0.010175] Hardware name: Intel Corporation M50CYP2SBSTD/M50CYP2SBSTD, BIOS SE5C620.86B.01.01.0005.2202160810 02/16/2022 [ +0.010946] RIP: 0010:__ice_update_sample+0x39/0xe0 [ice] [...] [ +0.002715] Call Trace: [ +0.002452] <IRQ> [ +0.002021] ? __die_body.cold+0x19/0x29 [ +0.003922] ? die_addr+0x3c/0x60 [ +0.003319] ? exc_general_protection+0x17c/0x400 [ +0.004707] ? asm_exc_general_protection+0x26/0x30 [ +0.004879] ? __ice_update_sample+0x39/0xe0 [ice] [ +0.004835] ice_napi_poll+0x665/0x680 [ice] [ +0.004320] __napi_poll+0x28/0x190 [ +0.003500] net_rx_action+0x198/0x360 [ +0.003752] ? update_rq_clock+0x39/0x220 [ +0.004013] handle_softirqs+0xf1/0x340 [ +0.003840] ? sched_clock_cpu+0xf/0x1f0 [ +0.003925] __irq_exit_rcu+0xc2/0xe0 [ +0.003665] common_interrupt+0x85/0xa0 [ +0.003839] </IRQ> [ +0.002098] <TASK> [ +0.002106] asm_common_interrupt+0x26/0x40 [ +0.004184] RIP: 0010:cpuidle_enter_state+0xd3/0x690 Fix this by performing the missing unmapping of XDP queues from q_vectors and setting the XDP rings pointer back to NULL after all those queues are released. Also, add an immediate exit from the XDP callback in case of ring preparation failure. Fixes: efc2214 ("ice: Add support for XDP") Reviewed-by: Dawid Osuchowski <dawid.osuchowski@linux.intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Michal Kubiak <michal.kubiak@intel.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Jesse Brandeburg <jbrandeburg@cloudflare.com> Tested-by: Saritha Sanigani <sarithax.sanigani@intel.com> (A Contingent Worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: NipaLocal <nipa@local>
When the XDP program is loaded, the XDP callback adds new Tx queues. This means that the callback must update the Tx scheduler with the new queue number. In the event of a Tx scheduler failure, the XDP callback should also fail and roll back any changes previously made for XDP preparation. The previous implementation had a bug that not all changes made by the XDP callback were rolled back. This caused the crash with the following call trace: [ +9.549584] ice 0000:ca:00.0: Failed VSI LAN queue config for XDP, error: -5 [ +0.382335] Oops: general protection fault, probably for non-canonical address 0x50a2250a90495525: 0000 [kernel-patches#1] SMP NOPTI [ +0.010710] CPU: 103 UID: 0 PID: 0 Comm: swapper/103 Not tainted 6.14.0-net-next-mar-31+ kernel-patches#14 PREEMPT(voluntary) [ +0.010175] Hardware name: Intel Corporation M50CYP2SBSTD/M50CYP2SBSTD, BIOS SE5C620.86B.01.01.0005.2202160810 02/16/2022 [ +0.010946] RIP: 0010:__ice_update_sample+0x39/0xe0 [ice] [...] [ +0.002715] Call Trace: [ +0.002452] <IRQ> [ +0.002021] ? __die_body.cold+0x19/0x29 [ +0.003922] ? die_addr+0x3c/0x60 [ +0.003319] ? exc_general_protection+0x17c/0x400 [ +0.004707] ? asm_exc_general_protection+0x26/0x30 [ +0.004879] ? __ice_update_sample+0x39/0xe0 [ice] [ +0.004835] ice_napi_poll+0x665/0x680 [ice] [ +0.004320] __napi_poll+0x28/0x190 [ +0.003500] net_rx_action+0x198/0x360 [ +0.003752] ? update_rq_clock+0x39/0x220 [ +0.004013] handle_softirqs+0xf1/0x340 [ +0.003840] ? sched_clock_cpu+0xf/0x1f0 [ +0.003925] __irq_exit_rcu+0xc2/0xe0 [ +0.003665] common_interrupt+0x85/0xa0 [ +0.003839] </IRQ> [ +0.002098] <TASK> [ +0.002106] asm_common_interrupt+0x26/0x40 [ +0.004184] RIP: 0010:cpuidle_enter_state+0xd3/0x690 Fix this by performing the missing unmapping of XDP queues from q_vectors and setting the XDP rings pointer back to NULL after all those queues are released. Also, add an immediate exit from the XDP callback in case of ring preparation failure. Fixes: efc2214 ("ice: Add support for XDP") Reviewed-by: Dawid Osuchowski <dawid.osuchowski@linux.intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Michal Kubiak <michal.kubiak@intel.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Jesse Brandeburg <jbrandeburg@cloudflare.com> Tested-by: Saritha Sanigani <sarithax.sanigani@intel.com> (A Contingent Worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: NipaLocal <nipa@local>
When the XDP program is loaded, the XDP callback adds new Tx queues. This means that the callback must update the Tx scheduler with the new queue number. In the event of a Tx scheduler failure, the XDP callback should also fail and roll back any changes previously made for XDP preparation. The previous implementation had a bug that not all changes made by the XDP callback were rolled back. This caused the crash with the following call trace: [ +9.549584] ice 0000:ca:00.0: Failed VSI LAN queue config for XDP, error: -5 [ +0.382335] Oops: general protection fault, probably for non-canonical address 0x50a2250a90495525: 0000 [kernel-patches#1] SMP NOPTI [ +0.010710] CPU: 103 UID: 0 PID: 0 Comm: swapper/103 Not tainted 6.14.0-net-next-mar-31+ kernel-patches#14 PREEMPT(voluntary) [ +0.010175] Hardware name: Intel Corporation M50CYP2SBSTD/M50CYP2SBSTD, BIOS SE5C620.86B.01.01.0005.2202160810 02/16/2022 [ +0.010946] RIP: 0010:__ice_update_sample+0x39/0xe0 [ice] [...] [ +0.002715] Call Trace: [ +0.002452] <IRQ> [ +0.002021] ? __die_body.cold+0x19/0x29 [ +0.003922] ? die_addr+0x3c/0x60 [ +0.003319] ? exc_general_protection+0x17c/0x400 [ +0.004707] ? asm_exc_general_protection+0x26/0x30 [ +0.004879] ? __ice_update_sample+0x39/0xe0 [ice] [ +0.004835] ice_napi_poll+0x665/0x680 [ice] [ +0.004320] __napi_poll+0x28/0x190 [ +0.003500] net_rx_action+0x198/0x360 [ +0.003752] ? update_rq_clock+0x39/0x220 [ +0.004013] handle_softirqs+0xf1/0x340 [ +0.003840] ? sched_clock_cpu+0xf/0x1f0 [ +0.003925] __irq_exit_rcu+0xc2/0xe0 [ +0.003665] common_interrupt+0x85/0xa0 [ +0.003839] </IRQ> [ +0.002098] <TASK> [ +0.002106] asm_common_interrupt+0x26/0x40 [ +0.004184] RIP: 0010:cpuidle_enter_state+0xd3/0x690 Fix this by performing the missing unmapping of XDP queues from q_vectors and setting the XDP rings pointer back to NULL after all those queues are released. Also, add an immediate exit from the XDP callback in case of ring preparation failure. Fixes: efc2214 ("ice: Add support for XDP") Reviewed-by: Dawid Osuchowski <dawid.osuchowski@linux.intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Michal Kubiak <michal.kubiak@intel.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Jesse Brandeburg <jbrandeburg@cloudflare.com> Tested-by: Saritha Sanigani <sarithax.sanigani@intel.com> (A Contingent Worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: NipaLocal <nipa@local>
When the XDP program is loaded, the XDP callback adds new Tx queues. This means that the callback must update the Tx scheduler with the new queue number. In the event of a Tx scheduler failure, the XDP callback should also fail and roll back any changes previously made for XDP preparation. The previous implementation had a bug that not all changes made by the XDP callback were rolled back. This caused the crash with the following call trace: [ +9.549584] ice 0000:ca:00.0: Failed VSI LAN queue config for XDP, error: -5 [ +0.382335] Oops: general protection fault, probably for non-canonical address 0x50a2250a90495525: 0000 [kernel-patches#1] SMP NOPTI [ +0.010710] CPU: 103 UID: 0 PID: 0 Comm: swapper/103 Not tainted 6.14.0-net-next-mar-31+ kernel-patches#14 PREEMPT(voluntary) [ +0.010175] Hardware name: Intel Corporation M50CYP2SBSTD/M50CYP2SBSTD, BIOS SE5C620.86B.01.01.0005.2202160810 02/16/2022 [ +0.010946] RIP: 0010:__ice_update_sample+0x39/0xe0 [ice] [...] [ +0.002715] Call Trace: [ +0.002452] <IRQ> [ +0.002021] ? __die_body.cold+0x19/0x29 [ +0.003922] ? die_addr+0x3c/0x60 [ +0.003319] ? exc_general_protection+0x17c/0x400 [ +0.004707] ? asm_exc_general_protection+0x26/0x30 [ +0.004879] ? __ice_update_sample+0x39/0xe0 [ice] [ +0.004835] ice_napi_poll+0x665/0x680 [ice] [ +0.004320] __napi_poll+0x28/0x190 [ +0.003500] net_rx_action+0x198/0x360 [ +0.003752] ? update_rq_clock+0x39/0x220 [ +0.004013] handle_softirqs+0xf1/0x340 [ +0.003840] ? sched_clock_cpu+0xf/0x1f0 [ +0.003925] __irq_exit_rcu+0xc2/0xe0 [ +0.003665] common_interrupt+0x85/0xa0 [ +0.003839] </IRQ> [ +0.002098] <TASK> [ +0.002106] asm_common_interrupt+0x26/0x40 [ +0.004184] RIP: 0010:cpuidle_enter_state+0xd3/0x690 Fix this by performing the missing unmapping of XDP queues from q_vectors and setting the XDP rings pointer back to NULL after all those queues are released. Also, add an immediate exit from the XDP callback in case of ring preparation failure. Fixes: efc2214 ("ice: Add support for XDP") Reviewed-by: Dawid Osuchowski <dawid.osuchowski@linux.intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Michal Kubiak <michal.kubiak@intel.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Jesse Brandeburg <jbrandeburg@cloudflare.com> Tested-by: Saritha Sanigani <sarithax.sanigani@intel.com> (A Contingent Worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: NipaLocal <nipa@local>
When the XDP program is loaded, the XDP callback adds new Tx queues. This means that the callback must update the Tx scheduler with the new queue number. In the event of a Tx scheduler failure, the XDP callback should also fail and roll back any changes previously made for XDP preparation. The previous implementation had a bug that not all changes made by the XDP callback were rolled back. This caused the crash with the following call trace: [ +9.549584] ice 0000:ca:00.0: Failed VSI LAN queue config for XDP, error: -5 [ +0.382335] Oops: general protection fault, probably for non-canonical address 0x50a2250a90495525: 0000 [kernel-patches#1] SMP NOPTI [ +0.010710] CPU: 103 UID: 0 PID: 0 Comm: swapper/103 Not tainted 6.14.0-net-next-mar-31+ kernel-patches#14 PREEMPT(voluntary) [ +0.010175] Hardware name: Intel Corporation M50CYP2SBSTD/M50CYP2SBSTD, BIOS SE5C620.86B.01.01.0005.2202160810 02/16/2022 [ +0.010946] RIP: 0010:__ice_update_sample+0x39/0xe0 [ice] [...] [ +0.002715] Call Trace: [ +0.002452] <IRQ> [ +0.002021] ? __die_body.cold+0x19/0x29 [ +0.003922] ? die_addr+0x3c/0x60 [ +0.003319] ? exc_general_protection+0x17c/0x400 [ +0.004707] ? asm_exc_general_protection+0x26/0x30 [ +0.004879] ? __ice_update_sample+0x39/0xe0 [ice] [ +0.004835] ice_napi_poll+0x665/0x680 [ice] [ +0.004320] __napi_poll+0x28/0x190 [ +0.003500] net_rx_action+0x198/0x360 [ +0.003752] ? update_rq_clock+0x39/0x220 [ +0.004013] handle_softirqs+0xf1/0x340 [ +0.003840] ? sched_clock_cpu+0xf/0x1f0 [ +0.003925] __irq_exit_rcu+0xc2/0xe0 [ +0.003665] common_interrupt+0x85/0xa0 [ +0.003839] </IRQ> [ +0.002098] <TASK> [ +0.002106] asm_common_interrupt+0x26/0x40 [ +0.004184] RIP: 0010:cpuidle_enter_state+0xd3/0x690 Fix this by performing the missing unmapping of XDP queues from q_vectors and setting the XDP rings pointer back to NULL after all those queues are released. Also, add an immediate exit from the XDP callback in case of ring preparation failure. Fixes: efc2214 ("ice: Add support for XDP") Reviewed-by: Dawid Osuchowski <dawid.osuchowski@linux.intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Michal Kubiak <michal.kubiak@intel.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Jesse Brandeburg <jbrandeburg@cloudflare.com> Tested-by: Saritha Sanigani <sarithax.sanigani@intel.com> (A Contingent Worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: NipaLocal <nipa@local>
When the XDP program is loaded, the XDP callback adds new Tx queues. This means that the callback must update the Tx scheduler with the new queue number. In the event of a Tx scheduler failure, the XDP callback should also fail and roll back any changes previously made for XDP preparation. The previous implementation had a bug that not all changes made by the XDP callback were rolled back. This caused the crash with the following call trace: [ +9.549584] ice 0000:ca:00.0: Failed VSI LAN queue config for XDP, error: -5 [ +0.382335] Oops: general protection fault, probably for non-canonical address 0x50a2250a90495525: 0000 [kernel-patches#1] SMP NOPTI [ +0.010710] CPU: 103 UID: 0 PID: 0 Comm: swapper/103 Not tainted 6.14.0-net-next-mar-31+ kernel-patches#14 PREEMPT(voluntary) [ +0.010175] Hardware name: Intel Corporation M50CYP2SBSTD/M50CYP2SBSTD, BIOS SE5C620.86B.01.01.0005.2202160810 02/16/2022 [ +0.010946] RIP: 0010:__ice_update_sample+0x39/0xe0 [ice] [...] [ +0.002715] Call Trace: [ +0.002452] <IRQ> [ +0.002021] ? __die_body.cold+0x19/0x29 [ +0.003922] ? die_addr+0x3c/0x60 [ +0.003319] ? exc_general_protection+0x17c/0x400 [ +0.004707] ? asm_exc_general_protection+0x26/0x30 [ +0.004879] ? __ice_update_sample+0x39/0xe0 [ice] [ +0.004835] ice_napi_poll+0x665/0x680 [ice] [ +0.004320] __napi_poll+0x28/0x190 [ +0.003500] net_rx_action+0x198/0x360 [ +0.003752] ? update_rq_clock+0x39/0x220 [ +0.004013] handle_softirqs+0xf1/0x340 [ +0.003840] ? sched_clock_cpu+0xf/0x1f0 [ +0.003925] __irq_exit_rcu+0xc2/0xe0 [ +0.003665] common_interrupt+0x85/0xa0 [ +0.003839] </IRQ> [ +0.002098] <TASK> [ +0.002106] asm_common_interrupt+0x26/0x40 [ +0.004184] RIP: 0010:cpuidle_enter_state+0xd3/0x690 Fix this by performing the missing unmapping of XDP queues from q_vectors and setting the XDP rings pointer back to NULL after all those queues are released. Also, add an immediate exit from the XDP callback in case of ring preparation failure. Fixes: efc2214 ("ice: Add support for XDP") Reviewed-by: Dawid Osuchowski <dawid.osuchowski@linux.intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Michal Kubiak <michal.kubiak@intel.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Jesse Brandeburg <jbrandeburg@cloudflare.com> Tested-by: Saritha Sanigani <sarithax.sanigani@intel.com> (A Contingent Worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: NipaLocal <nipa@local>
When the XDP program is loaded, the XDP callback adds new Tx queues. This means that the callback must update the Tx scheduler with the new queue number. In the event of a Tx scheduler failure, the XDP callback should also fail and roll back any changes previously made for XDP preparation. The previous implementation had a bug that not all changes made by the XDP callback were rolled back. This caused the crash with the following call trace: [ +9.549584] ice 0000:ca:00.0: Failed VSI LAN queue config for XDP, error: -5 [ +0.382335] Oops: general protection fault, probably for non-canonical address 0x50a2250a90495525: 0000 [kernel-patches#1] SMP NOPTI [ +0.010710] CPU: 103 UID: 0 PID: 0 Comm: swapper/103 Not tainted 6.14.0-net-next-mar-31+ kernel-patches#14 PREEMPT(voluntary) [ +0.010175] Hardware name: Intel Corporation M50CYP2SBSTD/M50CYP2SBSTD, BIOS SE5C620.86B.01.01.0005.2202160810 02/16/2022 [ +0.010946] RIP: 0010:__ice_update_sample+0x39/0xe0 [ice] [...] [ +0.002715] Call Trace: [ +0.002452] <IRQ> [ +0.002021] ? __die_body.cold+0x19/0x29 [ +0.003922] ? die_addr+0x3c/0x60 [ +0.003319] ? exc_general_protection+0x17c/0x400 [ +0.004707] ? asm_exc_general_protection+0x26/0x30 [ +0.004879] ? __ice_update_sample+0x39/0xe0 [ice] [ +0.004835] ice_napi_poll+0x665/0x680 [ice] [ +0.004320] __napi_poll+0x28/0x190 [ +0.003500] net_rx_action+0x198/0x360 [ +0.003752] ? update_rq_clock+0x39/0x220 [ +0.004013] handle_softirqs+0xf1/0x340 [ +0.003840] ? sched_clock_cpu+0xf/0x1f0 [ +0.003925] __irq_exit_rcu+0xc2/0xe0 [ +0.003665] common_interrupt+0x85/0xa0 [ +0.003839] </IRQ> [ +0.002098] <TASK> [ +0.002106] asm_common_interrupt+0x26/0x40 [ +0.004184] RIP: 0010:cpuidle_enter_state+0xd3/0x690 Fix this by performing the missing unmapping of XDP queues from q_vectors and setting the XDP rings pointer back to NULL after all those queues are released. Also, add an immediate exit from the XDP callback in case of ring preparation failure. Fixes: efc2214 ("ice: Add support for XDP") Reviewed-by: Dawid Osuchowski <dawid.osuchowski@linux.intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Michal Kubiak <michal.kubiak@intel.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Jesse Brandeburg <jbrandeburg@cloudflare.com> Tested-by: Saritha Sanigani <sarithax.sanigani@intel.com> (A Contingent Worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: NipaLocal <nipa@local>
When the XDP program is loaded, the XDP callback adds new Tx queues. This means that the callback must update the Tx scheduler with the new queue number. In the event of a Tx scheduler failure, the XDP callback should also fail and roll back any changes previously made for XDP preparation. The previous implementation had a bug that not all changes made by the XDP callback were rolled back. This caused the crash with the following call trace: [ +9.549584] ice 0000:ca:00.0: Failed VSI LAN queue config for XDP, error: -5 [ +0.382335] Oops: general protection fault, probably for non-canonical address 0x50a2250a90495525: 0000 [kernel-patches#1] SMP NOPTI [ +0.010710] CPU: 103 UID: 0 PID: 0 Comm: swapper/103 Not tainted 6.14.0-net-next-mar-31+ kernel-patches#14 PREEMPT(voluntary) [ +0.010175] Hardware name: Intel Corporation M50CYP2SBSTD/M50CYP2SBSTD, BIOS SE5C620.86B.01.01.0005.2202160810 02/16/2022 [ +0.010946] RIP: 0010:__ice_update_sample+0x39/0xe0 [ice] [...] [ +0.002715] Call Trace: [ +0.002452] <IRQ> [ +0.002021] ? __die_body.cold+0x19/0x29 [ +0.003922] ? die_addr+0x3c/0x60 [ +0.003319] ? exc_general_protection+0x17c/0x400 [ +0.004707] ? asm_exc_general_protection+0x26/0x30 [ +0.004879] ? __ice_update_sample+0x39/0xe0 [ice] [ +0.004835] ice_napi_poll+0x665/0x680 [ice] [ +0.004320] __napi_poll+0x28/0x190 [ +0.003500] net_rx_action+0x198/0x360 [ +0.003752] ? update_rq_clock+0x39/0x220 [ +0.004013] handle_softirqs+0xf1/0x340 [ +0.003840] ? sched_clock_cpu+0xf/0x1f0 [ +0.003925] __irq_exit_rcu+0xc2/0xe0 [ +0.003665] common_interrupt+0x85/0xa0 [ +0.003839] </IRQ> [ +0.002098] <TASK> [ +0.002106] asm_common_interrupt+0x26/0x40 [ +0.004184] RIP: 0010:cpuidle_enter_state+0xd3/0x690 Fix this by performing the missing unmapping of XDP queues from q_vectors and setting the XDP rings pointer back to NULL after all those queues are released. Also, add an immediate exit from the XDP callback in case of ring preparation failure. Fixes: efc2214 ("ice: Add support for XDP") Reviewed-by: Dawid Osuchowski <dawid.osuchowski@linux.intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Michal Kubiak <michal.kubiak@intel.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Jesse Brandeburg <jbrandeburg@cloudflare.com> Tested-by: Saritha Sanigani <sarithax.sanigani@intel.com> (A Contingent Worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: NipaLocal <nipa@local>
When the XDP program is loaded, the XDP callback adds new Tx queues. This means that the callback must update the Tx scheduler with the new queue number. In the event of a Tx scheduler failure, the XDP callback should also fail and roll back any changes previously made for XDP preparation. The previous implementation had a bug that not all changes made by the XDP callback were rolled back. This caused the crash with the following call trace: [ +9.549584] ice 0000:ca:00.0: Failed VSI LAN queue config for XDP, error: -5 [ +0.382335] Oops: general protection fault, probably for non-canonical address 0x50a2250a90495525: 0000 [kernel-patches#1] SMP NOPTI [ +0.010710] CPU: 103 UID: 0 PID: 0 Comm: swapper/103 Not tainted 6.14.0-net-next-mar-31+ kernel-patches#14 PREEMPT(voluntary) [ +0.010175] Hardware name: Intel Corporation M50CYP2SBSTD/M50CYP2SBSTD, BIOS SE5C620.86B.01.01.0005.2202160810 02/16/2022 [ +0.010946] RIP: 0010:__ice_update_sample+0x39/0xe0 [ice] [...] [ +0.002715] Call Trace: [ +0.002452] <IRQ> [ +0.002021] ? __die_body.cold+0x19/0x29 [ +0.003922] ? die_addr+0x3c/0x60 [ +0.003319] ? exc_general_protection+0x17c/0x400 [ +0.004707] ? asm_exc_general_protection+0x26/0x30 [ +0.004879] ? __ice_update_sample+0x39/0xe0 [ice] [ +0.004835] ice_napi_poll+0x665/0x680 [ice] [ +0.004320] __napi_poll+0x28/0x190 [ +0.003500] net_rx_action+0x198/0x360 [ +0.003752] ? update_rq_clock+0x39/0x220 [ +0.004013] handle_softirqs+0xf1/0x340 [ +0.003840] ? sched_clock_cpu+0xf/0x1f0 [ +0.003925] __irq_exit_rcu+0xc2/0xe0 [ +0.003665] common_interrupt+0x85/0xa0 [ +0.003839] </IRQ> [ +0.002098] <TASK> [ +0.002106] asm_common_interrupt+0x26/0x40 [ +0.004184] RIP: 0010:cpuidle_enter_state+0xd3/0x690 Fix this by performing the missing unmapping of XDP queues from q_vectors and setting the XDP rings pointer back to NULL after all those queues are released. Also, add an immediate exit from the XDP callback in case of ring preparation failure. Fixes: efc2214 ("ice: Add support for XDP") Reviewed-by: Dawid Osuchowski <dawid.osuchowski@linux.intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Michal Kubiak <michal.kubiak@intel.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Jesse Brandeburg <jbrandeburg@cloudflare.com> Tested-by: Saritha Sanigani <sarithax.sanigani@intel.com> (A Contingent Worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: NipaLocal <nipa@local>
When the XDP program is loaded, the XDP callback adds new Tx queues. This means that the callback must update the Tx scheduler with the new queue number. In the event of a Tx scheduler failure, the XDP callback should also fail and roll back any changes previously made for XDP preparation. The previous implementation had a bug that not all changes made by the XDP callback were rolled back. This caused the crash with the following call trace: [ +9.549584] ice 0000:ca:00.0: Failed VSI LAN queue config for XDP, error: -5 [ +0.382335] Oops: general protection fault, probably for non-canonical address 0x50a2250a90495525: 0000 [kernel-patches#1] SMP NOPTI [ +0.010710] CPU: 103 UID: 0 PID: 0 Comm: swapper/103 Not tainted 6.14.0-net-next-mar-31+ kernel-patches#14 PREEMPT(voluntary) [ +0.010175] Hardware name: Intel Corporation M50CYP2SBSTD/M50CYP2SBSTD, BIOS SE5C620.86B.01.01.0005.2202160810 02/16/2022 [ +0.010946] RIP: 0010:__ice_update_sample+0x39/0xe0 [ice] [...] [ +0.002715] Call Trace: [ +0.002452] <IRQ> [ +0.002021] ? __die_body.cold+0x19/0x29 [ +0.003922] ? die_addr+0x3c/0x60 [ +0.003319] ? exc_general_protection+0x17c/0x400 [ +0.004707] ? asm_exc_general_protection+0x26/0x30 [ +0.004879] ? __ice_update_sample+0x39/0xe0 [ice] [ +0.004835] ice_napi_poll+0x665/0x680 [ice] [ +0.004320] __napi_poll+0x28/0x190 [ +0.003500] net_rx_action+0x198/0x360 [ +0.003752] ? update_rq_clock+0x39/0x220 [ +0.004013] handle_softirqs+0xf1/0x340 [ +0.003840] ? sched_clock_cpu+0xf/0x1f0 [ +0.003925] __irq_exit_rcu+0xc2/0xe0 [ +0.003665] common_interrupt+0x85/0xa0 [ +0.003839] </IRQ> [ +0.002098] <TASK> [ +0.002106] asm_common_interrupt+0x26/0x40 [ +0.004184] RIP: 0010:cpuidle_enter_state+0xd3/0x690 Fix this by performing the missing unmapping of XDP queues from q_vectors and setting the XDP rings pointer back to NULL after all those queues are released. Also, add an immediate exit from the XDP callback in case of ring preparation failure. Fixes: efc2214 ("ice: Add support for XDP") Reviewed-by: Dawid Osuchowski <dawid.osuchowski@linux.intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Michal Kubiak <michal.kubiak@intel.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Jesse Brandeburg <jbrandeburg@cloudflare.com> Tested-by: Saritha Sanigani <sarithax.sanigani@intel.com> (A Contingent Worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: NipaLocal <nipa@local>
When the XDP program is loaded, the XDP callback adds new Tx queues. This means that the callback must update the Tx scheduler with the new queue number. In the event of a Tx scheduler failure, the XDP callback should also fail and roll back any changes previously made for XDP preparation. The previous implementation had a bug that not all changes made by the XDP callback were rolled back. This caused the crash with the following call trace: [ +9.549584] ice 0000:ca:00.0: Failed VSI LAN queue config for XDP, error: -5 [ +0.382335] Oops: general protection fault, probably for non-canonical address 0x50a2250a90495525: 0000 [kernel-patches#1] SMP NOPTI [ +0.010710] CPU: 103 UID: 0 PID: 0 Comm: swapper/103 Not tainted 6.14.0-net-next-mar-31+ kernel-patches#14 PREEMPT(voluntary) [ +0.010175] Hardware name: Intel Corporation M50CYP2SBSTD/M50CYP2SBSTD, BIOS SE5C620.86B.01.01.0005.2202160810 02/16/2022 [ +0.010946] RIP: 0010:__ice_update_sample+0x39/0xe0 [ice] [...] [ +0.002715] Call Trace: [ +0.002452] <IRQ> [ +0.002021] ? __die_body.cold+0x19/0x29 [ +0.003922] ? die_addr+0x3c/0x60 [ +0.003319] ? exc_general_protection+0x17c/0x400 [ +0.004707] ? asm_exc_general_protection+0x26/0x30 [ +0.004879] ? __ice_update_sample+0x39/0xe0 [ice] [ +0.004835] ice_napi_poll+0x665/0x680 [ice] [ +0.004320] __napi_poll+0x28/0x190 [ +0.003500] net_rx_action+0x198/0x360 [ +0.003752] ? update_rq_clock+0x39/0x220 [ +0.004013] handle_softirqs+0xf1/0x340 [ +0.003840] ? sched_clock_cpu+0xf/0x1f0 [ +0.003925] __irq_exit_rcu+0xc2/0xe0 [ +0.003665] common_interrupt+0x85/0xa0 [ +0.003839] </IRQ> [ +0.002098] <TASK> [ +0.002106] asm_common_interrupt+0x26/0x40 [ +0.004184] RIP: 0010:cpuidle_enter_state+0xd3/0x690 Fix this by performing the missing unmapping of XDP queues from q_vectors and setting the XDP rings pointer back to NULL after all those queues are released. Also, add an immediate exit from the XDP callback in case of ring preparation failure. Fixes: efc2214 ("ice: Add support for XDP") Reviewed-by: Dawid Osuchowski <dawid.osuchowski@linux.intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Michal Kubiak <michal.kubiak@intel.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Jesse Brandeburg <jbrandeburg@cloudflare.com> Tested-by: Saritha Sanigani <sarithax.sanigani@intel.com> (A Contingent Worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: NipaLocal <nipa@local>
When the XDP program is loaded, the XDP callback adds new Tx queues. This means that the callback must update the Tx scheduler with the new queue number. In the event of a Tx scheduler failure, the XDP callback should also fail and roll back any changes previously made for XDP preparation. The previous implementation had a bug that not all changes made by the XDP callback were rolled back. This caused the crash with the following call trace: [ +9.549584] ice 0000:ca:00.0: Failed VSI LAN queue config for XDP, error: -5 [ +0.382335] Oops: general protection fault, probably for non-canonical address 0x50a2250a90495525: 0000 [kernel-patches#1] SMP NOPTI [ +0.010710] CPU: 103 UID: 0 PID: 0 Comm: swapper/103 Not tainted 6.14.0-net-next-mar-31+ kernel-patches#14 PREEMPT(voluntary) [ +0.010175] Hardware name: Intel Corporation M50CYP2SBSTD/M50CYP2SBSTD, BIOS SE5C620.86B.01.01.0005.2202160810 02/16/2022 [ +0.010946] RIP: 0010:__ice_update_sample+0x39/0xe0 [ice] [...] [ +0.002715] Call Trace: [ +0.002452] <IRQ> [ +0.002021] ? __die_body.cold+0x19/0x29 [ +0.003922] ? die_addr+0x3c/0x60 [ +0.003319] ? exc_general_protection+0x17c/0x400 [ +0.004707] ? asm_exc_general_protection+0x26/0x30 [ +0.004879] ? __ice_update_sample+0x39/0xe0 [ice] [ +0.004835] ice_napi_poll+0x665/0x680 [ice] [ +0.004320] __napi_poll+0x28/0x190 [ +0.003500] net_rx_action+0x198/0x360 [ +0.003752] ? update_rq_clock+0x39/0x220 [ +0.004013] handle_softirqs+0xf1/0x340 [ +0.003840] ? sched_clock_cpu+0xf/0x1f0 [ +0.003925] __irq_exit_rcu+0xc2/0xe0 [ +0.003665] common_interrupt+0x85/0xa0 [ +0.003839] </IRQ> [ +0.002098] <TASK> [ +0.002106] asm_common_interrupt+0x26/0x40 [ +0.004184] RIP: 0010:cpuidle_enter_state+0xd3/0x690 Fix this by performing the missing unmapping of XDP queues from q_vectors and setting the XDP rings pointer back to NULL after all those queues are released. Also, add an immediate exit from the XDP callback in case of ring preparation failure. Fixes: efc2214 ("ice: Add support for XDP") Reviewed-by: Dawid Osuchowski <dawid.osuchowski@linux.intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Michal Kubiak <michal.kubiak@intel.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Jesse Brandeburg <jbrandeburg@cloudflare.com> Tested-by: Saritha Sanigani <sarithax.sanigani@intel.com> (A Contingent Worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: NipaLocal <nipa@local>
When the XDP program is loaded, the XDP callback adds new Tx queues. This means that the callback must update the Tx scheduler with the new queue number. In the event of a Tx scheduler failure, the XDP callback should also fail and roll back any changes previously made for XDP preparation. The previous implementation had a bug that not all changes made by the XDP callback were rolled back. This caused the crash with the following call trace: [ +9.549584] ice 0000:ca:00.0: Failed VSI LAN queue config for XDP, error: -5 [ +0.382335] Oops: general protection fault, probably for non-canonical address 0x50a2250a90495525: 0000 [kernel-patches#1] SMP NOPTI [ +0.010710] CPU: 103 UID: 0 PID: 0 Comm: swapper/103 Not tainted 6.14.0-net-next-mar-31+ kernel-patches#14 PREEMPT(voluntary) [ +0.010175] Hardware name: Intel Corporation M50CYP2SBSTD/M50CYP2SBSTD, BIOS SE5C620.86B.01.01.0005.2202160810 02/16/2022 [ +0.010946] RIP: 0010:__ice_update_sample+0x39/0xe0 [ice] [...] [ +0.002715] Call Trace: [ +0.002452] <IRQ> [ +0.002021] ? __die_body.cold+0x19/0x29 [ +0.003922] ? die_addr+0x3c/0x60 [ +0.003319] ? exc_general_protection+0x17c/0x400 [ +0.004707] ? asm_exc_general_protection+0x26/0x30 [ +0.004879] ? __ice_update_sample+0x39/0xe0 [ice] [ +0.004835] ice_napi_poll+0x665/0x680 [ice] [ +0.004320] __napi_poll+0x28/0x190 [ +0.003500] net_rx_action+0x198/0x360 [ +0.003752] ? update_rq_clock+0x39/0x220 [ +0.004013] handle_softirqs+0xf1/0x340 [ +0.003840] ? sched_clock_cpu+0xf/0x1f0 [ +0.003925] __irq_exit_rcu+0xc2/0xe0 [ +0.003665] common_interrupt+0x85/0xa0 [ +0.003839] </IRQ> [ +0.002098] <TASK> [ +0.002106] asm_common_interrupt+0x26/0x40 [ +0.004184] RIP: 0010:cpuidle_enter_state+0xd3/0x690 Fix this by performing the missing unmapping of XDP queues from q_vectors and setting the XDP rings pointer back to NULL after all those queues are released. Also, add an immediate exit from the XDP callback in case of ring preparation failure. Fixes: efc2214 ("ice: Add support for XDP") Reviewed-by: Dawid Osuchowski <dawid.osuchowski@linux.intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Michal Kubiak <michal.kubiak@intel.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Jesse Brandeburg <jbrandeburg@cloudflare.com> Tested-by: Saritha Sanigani <sarithax.sanigani@intel.com> (A Contingent Worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: NipaLocal <nipa@local>
When the XDP program is loaded, the XDP callback adds new Tx queues. This means that the callback must update the Tx scheduler with the new queue number. In the event of a Tx scheduler failure, the XDP callback should also fail and roll back any changes previously made for XDP preparation. The previous implementation had a bug that not all changes made by the XDP callback were rolled back. This caused the crash with the following call trace: [ +9.549584] ice 0000:ca:00.0: Failed VSI LAN queue config for XDP, error: -5 [ +0.382335] Oops: general protection fault, probably for non-canonical address 0x50a2250a90495525: 0000 [kernel-patches#1] SMP NOPTI [ +0.010710] CPU: 103 UID: 0 PID: 0 Comm: swapper/103 Not tainted 6.14.0-net-next-mar-31+ kernel-patches#14 PREEMPT(voluntary) [ +0.010175] Hardware name: Intel Corporation M50CYP2SBSTD/M50CYP2SBSTD, BIOS SE5C620.86B.01.01.0005.2202160810 02/16/2022 [ +0.010946] RIP: 0010:__ice_update_sample+0x39/0xe0 [ice] [...] [ +0.002715] Call Trace: [ +0.002452] <IRQ> [ +0.002021] ? __die_body.cold+0x19/0x29 [ +0.003922] ? die_addr+0x3c/0x60 [ +0.003319] ? exc_general_protection+0x17c/0x400 [ +0.004707] ? asm_exc_general_protection+0x26/0x30 [ +0.004879] ? __ice_update_sample+0x39/0xe0 [ice] [ +0.004835] ice_napi_poll+0x665/0x680 [ice] [ +0.004320] __napi_poll+0x28/0x190 [ +0.003500] net_rx_action+0x198/0x360 [ +0.003752] ? update_rq_clock+0x39/0x220 [ +0.004013] handle_softirqs+0xf1/0x340 [ +0.003840] ? sched_clock_cpu+0xf/0x1f0 [ +0.003925] __irq_exit_rcu+0xc2/0xe0 [ +0.003665] common_interrupt+0x85/0xa0 [ +0.003839] </IRQ> [ +0.002098] <TASK> [ +0.002106] asm_common_interrupt+0x26/0x40 [ +0.004184] RIP: 0010:cpuidle_enter_state+0xd3/0x690 Fix this by performing the missing unmapping of XDP queues from q_vectors and setting the XDP rings pointer back to NULL after all those queues are released. Also, add an immediate exit from the XDP callback in case of ring preparation failure. Fixes: efc2214 ("ice: Add support for XDP") Reviewed-by: Dawid Osuchowski <dawid.osuchowski@linux.intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Michal Kubiak <michal.kubiak@intel.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Jesse Brandeburg <jbrandeburg@cloudflare.com> Tested-by: Saritha Sanigani <sarithax.sanigani@intel.com> (A Contingent Worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: NipaLocal <nipa@local>
When the XDP program is loaded, the XDP callback adds new Tx queues. This means that the callback must update the Tx scheduler with the new queue number. In the event of a Tx scheduler failure, the XDP callback should also fail and roll back any changes previously made for XDP preparation. The previous implementation had a bug that not all changes made by the XDP callback were rolled back. This caused the crash with the following call trace: [ +9.549584] ice 0000:ca:00.0: Failed VSI LAN queue config for XDP, error: -5 [ +0.382335] Oops: general protection fault, probably for non-canonical address 0x50a2250a90495525: 0000 [kernel-patches#1] SMP NOPTI [ +0.010710] CPU: 103 UID: 0 PID: 0 Comm: swapper/103 Not tainted 6.14.0-net-next-mar-31+ kernel-patches#14 PREEMPT(voluntary) [ +0.010175] Hardware name: Intel Corporation M50CYP2SBSTD/M50CYP2SBSTD, BIOS SE5C620.86B.01.01.0005.2202160810 02/16/2022 [ +0.010946] RIP: 0010:__ice_update_sample+0x39/0xe0 [ice] [...] [ +0.002715] Call Trace: [ +0.002452] <IRQ> [ +0.002021] ? __die_body.cold+0x19/0x29 [ +0.003922] ? die_addr+0x3c/0x60 [ +0.003319] ? exc_general_protection+0x17c/0x400 [ +0.004707] ? asm_exc_general_protection+0x26/0x30 [ +0.004879] ? __ice_update_sample+0x39/0xe0 [ice] [ +0.004835] ice_napi_poll+0x665/0x680 [ice] [ +0.004320] __napi_poll+0x28/0x190 [ +0.003500] net_rx_action+0x198/0x360 [ +0.003752] ? update_rq_clock+0x39/0x220 [ +0.004013] handle_softirqs+0xf1/0x340 [ +0.003840] ? sched_clock_cpu+0xf/0x1f0 [ +0.003925] __irq_exit_rcu+0xc2/0xe0 [ +0.003665] common_interrupt+0x85/0xa0 [ +0.003839] </IRQ> [ +0.002098] <TASK> [ +0.002106] asm_common_interrupt+0x26/0x40 [ +0.004184] RIP: 0010:cpuidle_enter_state+0xd3/0x690 Fix this by performing the missing unmapping of XDP queues from q_vectors and setting the XDP rings pointer back to NULL after all those queues are released. Also, add an immediate exit from the XDP callback in case of ring preparation failure. Fixes: efc2214 ("ice: Add support for XDP") Reviewed-by: Dawid Osuchowski <dawid.osuchowski@linux.intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Michal Kubiak <michal.kubiak@intel.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Jesse Brandeburg <jbrandeburg@cloudflare.com> Tested-by: Saritha Sanigani <sarithax.sanigani@intel.com> (A Contingent Worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: NipaLocal <nipa@local>
When the XDP program is loaded, the XDP callback adds new Tx queues. This means that the callback must update the Tx scheduler with the new queue number. In the event of a Tx scheduler failure, the XDP callback should also fail and roll back any changes previously made for XDP preparation.

The previous implementation had a bug where not all changes made by the XDP callback were rolled back. This caused a crash with the following call trace:

[ +9.549584] ice 0000:ca:00.0: Failed VSI LAN queue config for XDP, error: -5
[ +0.382335] Oops: general protection fault, probably for non-canonical address 0x50a2250a90495525: 0000 [#1] SMP NOPTI
[ +0.010710] CPU: 103 UID: 0 PID: 0 Comm: swapper/103 Not tainted 6.14.0-net-next-mar-31+ #14 PREEMPT(voluntary)
[ +0.010175] Hardware name: Intel Corporation M50CYP2SBSTD/M50CYP2SBSTD, BIOS SE5C620.86B.01.01.0005.2202160810 02/16/2022
[ +0.010946] RIP: 0010:__ice_update_sample+0x39/0xe0 [ice]
[...]
[ +0.002715] Call Trace:
[ +0.002452]  <IRQ>
[ +0.002021]  ? __die_body.cold+0x19/0x29
[ +0.003922]  ? die_addr+0x3c/0x60
[ +0.003319]  ? exc_general_protection+0x17c/0x400
[ +0.004707]  ? asm_exc_general_protection+0x26/0x30
[ +0.004879]  ? __ice_update_sample+0x39/0xe0 [ice]
[ +0.004835]  ice_napi_poll+0x665/0x680 [ice]
[ +0.004320]  __napi_poll+0x28/0x190
[ +0.003500]  net_rx_action+0x198/0x360
[ +0.003752]  ? update_rq_clock+0x39/0x220
[ +0.004013]  handle_softirqs+0xf1/0x340
[ +0.003840]  ? sched_clock_cpu+0xf/0x1f0
[ +0.003925]  __irq_exit_rcu+0xc2/0xe0
[ +0.003665]  common_interrupt+0x85/0xa0
[ +0.003839]  </IRQ>
[ +0.002098]  <TASK>
[ +0.002106]  asm_common_interrupt+0x26/0x40
[ +0.004184] RIP: 0010:cpuidle_enter_state+0xd3/0x690

Fix this by performing the missing unmapping of XDP queues from q_vectors and setting the XDP rings pointer back to NULL after all those queues are released. Also, add an immediate exit from the XDP callback in case of ring preparation failure.

Fixes: efc2214 ("ice: Add support for XDP")
Reviewed-by: Dawid Osuchowski <dawid.osuchowski@linux.intel.com>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Michal Kubiak <michal.kubiak@intel.com>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Tested-by: Jesse Brandeburg <jbrandeburg@cloudflare.com>
Tested-by: Saritha Sanigani <sarithax.sanigani@intel.com> (A Contingent Worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
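The fix described above follows the standard goto-unwind idiom for kernel setup paths: each step that can fail jumps to a label that tears down everything done so far, and the owning pointer is cleared so a later NAPI poll never dereferences stale ring memory. Below is a minimal, self-contained sketch of that idiom; all names (vsi, setup_ring, cfg_tx_scheduler, etc.) are illustrative stand-ins, not the actual ice driver symbols.

#include <errno.h>
#include <stdlib.h>

/* Illustrative stand-ins for driver state; not the real ice structures. */
struct ring { int id; };

struct vsi {
	struct ring **xdp_rings;	/* NULL when XDP is not set up */
	int num_xdp_txq;
};

/* Hypothetical per-ring setup/teardown and scheduler config. */
static int setup_ring(struct vsi *v, int i)
{
	v->xdp_rings[i] = calloc(1, sizeof(struct ring));
	return v->xdp_rings[i] ? 0 : -ENOMEM;
}

static void teardown_ring(struct vsi *v, int i)
{
	free(v->xdp_rings[i]);	/* in a driver: also unmap from the q_vector */
	v->xdp_rings[i] = NULL;
}

static int cfg_tx_scheduler(struct vsi *v)
{
	return -EIO;	/* pretend the scheduler update failed, as in the oops */
}

static int prepare_xdp(struct vsi *v)
{
	int err, i;

	v->xdp_rings = calloc(v->num_xdp_txq, sizeof(*v->xdp_rings));
	if (!v->xdp_rings)
		return -ENOMEM;

	for (i = 0; i < v->num_xdp_txq; i++) {
		err = setup_ring(v, i);
		if (err)
			goto unwind;	/* immediate exit on ring prep failure */
	}

	err = cfg_tx_scheduler(v);
	if (err)
		goto unwind;	/* scheduler failure must also roll back */

	return 0;

unwind:
	while (--i >= 0)
		teardown_ring(v, i);
	free(v->xdp_rings);
	v->xdp_rings = NULL;	/* clearing the pointer is the missing rollback step */
	return err;
}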
Petr Machata says:

====================
ipmr, ip6mr: Allow MC-routing locally-generated MC packets

Multicast routing is today handled in the input path. Locally generated MC packets don't hit the IPMR code. Thus if a VXLAN remote address is multicast, the driver needs to set an OIF during route lookup. In practice that means that MC routing configuration needs to be kept in sync with the VXLAN FDB and MDB. Ideally, the VXLAN packets would be routed by the MC routing code instead. To that end, this patchset adds support to route locally generated multicast packets.

However, an installation that uses a VXLAN underlay netdevice for which it also has matching MC routes would get a different routing with this patch. Previously, the MC packets would be delivered directly to the underlay port, whereas now they would be MC-routed. In order to avoid this change in behavior, introduce an IPCB/IP6CB flag. Unless the flag is set, the new MC-routing code is skipped.

All this is keyed to a new VXLAN attribute, IFLA_VXLAN_MC_ROUTE. Only when it is set does any of the above engage. In addition to that, and as is the case today with MC forwarding, IPV4_DEVCONF_MC_FORWARDING must be enabled for the netdevice that acts as a source of MC traffic (i.e. the VXLAN PHYS_DEV), so an MC daemon must be attached to the netdevice.

When a VXLAN netdevice with an MC remote is brought up, the physical netdevice joins the indicated MC group. This is important for local delivery of MC packets, so it is still necessary to configure a physical netdevice -- the parameter cannot go away. The netdevice would however typically not be a front panel port, but a dummy. An MC daemon would then sit on top of that netdevice as well as any front panel ports that it needs to service, and have routes set up between the two.

A way to configure the VXLAN netdevice to take advantage of the new MC routing would be:

 # ip link add name d up type dummy
 # ip link add name vx10 up type vxlan id 1000 dstport 4789 \
	local 192.0.2.1 group 225.0.0.1 ttl 16 dev d mcroute
 # ip link set dev vx10 master br	# plus vlans etc.

With the following MC routes:

 (192.0.2.1, 225.0.0.1) iif=d    oil=swp1,swp2	# TX route
 (*, 225.0.0.1)         iif=swp1 oil=d,swp2	# RX route
 (*, 225.0.0.1)         iif=swp2 oil=d,swp1	# RX route

The RX path has not changed, with the exception of an extra MC hop. Packets are delivered to the front panel port and MC-forwarded to the VXLAN physical port, here "d". Since the port has joined the multicast group, the packets are locally delivered, and end up being processed by the VXLAN netdevice.

This patchset is based on earlier patches from Nikolay Aleksandrov and Roopa Prabhu, though it underwent significant changes. Roopa broadly presented the topic on LPC 2019 [0].

Patchset progression:

- Patches #1 to #4 add ip_mr_output()
- Patches #5 to #10 add ip6_mr_output()
- Patch #11 adds the VXLAN bits to enable MR engagement
- Patches #12 to #14 prepare selftest libraries
- Patch #15 includes a new test suite

[0] https://www.youtube.com/watch?v=xlReECfi-uo
====================

Link: https://patch.msgid.link/cover.1750113335.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
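The behavioral gate described in the cover letter (only MC-route a locally generated packet when the new flag is set) amounts to one branch in the IPv4 output path. The following is a rough sketch with simplified signatures; the flag name IPSKB_MCROUTE and the exact hook point are assumptions for illustration, and ip_mr_output() is the entry point the series introduces:

/* Rough sketch of the output-path gating, simplified signatures.
 * IPSKB_MCROUTE is an assumed flag name; ip_mr_output() is the new
 * entry point added by patches #1 to #4.
 */
static int example_mc_output(struct sk_buff *skb)
{
	/* Default: legacy behavior, deliver straight out of the OIF
	 * chosen at route-lookup time.
	 */
	if (!(IPCB(skb)->flags & IPSKB_MCROUTE))
		return ip_mc_output(skb);

	/* VXLAN set the flag (keyed to IFLA_VXLAN_MC_ROUTE): hand the
	 * packet to the multicast routing code instead.
	 */
	return ip_mr_output(skb);
}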
In KVM guests with Hyper-V hypercalls enabled, the hypercalls HVCALL_FLUSH_VIRTUAL_ADDRESS_LIST and HVCALL_FLUSH_VIRTUAL_ADDRESS_LIST_EX allow a guest to request invalidation of portions of a virtual TLB. For this, the hypercall parameter includes a list of GVAs that are supposed to be invalidated.

However, when non-canonical GVAs are passed, there is currently no filtering in place and they are eventually passed to checked invocations of INVVPID on Intel / INVLPGA on AMD. While AMD's INVLPGA silently ignores non-canonical addresses (effectively a no-op), Intel's INVVPID explicitly signals VM-Fail and ultimately triggers the WARN_ONCE in invvpid_error():

  invvpid failed: ext=0x0 vpid=1 gva=0xaaaaaaaaaaaaa000
  WARNING: CPU: 6 PID: 326 at arch/x86/kvm/vmx/vmx.c:482 invvpid_error+0x91/0xa0 [kvm_intel]
  Modules linked in: kvm_intel kvm 9pnet_virtio irqbypass fuse
  CPU: 6 UID: 0 PID: 326 Comm: kvm-vm Not tainted 6.15.0 #14 PREEMPT(voluntary)
  RIP: 0010:invvpid_error+0x91/0xa0 [kvm_intel]
  Call Trace:
   vmx_flush_tlb_gva+0x320/0x490 [kvm_intel]
   kvm_hv_vcpu_flush_tlb+0x24f/0x4f0 [kvm]
   kvm_arch_vcpu_ioctl_run+0x3013/0x5810 [kvm]

Hyper-V documents that invalid GVAs (those that are beyond a partition's GVA space) are to be ignored. While it is not completely clear whether this ruling also applies to non-canonical GVAs, it is likely fine to make that assumption, and manual testing on Azure confirms "real" Hyper-V interprets the specification in the same way.

Skip non-canonical GVAs when processing the list of addresses to avoid tripping the INVVPID failure. Alternatively, KVM could filter out "bad" GVAs before inserting into the FIFO, but practically speaking the only downside of pushing validation to the final processing is that doing so is suboptimal for the guest, and no well-behaved guest will request TLB flushes for non-canonical addresses.

Fixes: 2609708 ("KVM: x86: hyper-v: Handle HVCALL_FLUSH_VIRTUAL_ADDRESS_LIST{,EX} calls gently")
Cc: stable@vger.kernel.org
Signed-off-by: Manuel Andreas <manuel.andreas@tum.de>
Suggested-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Link: https://lore.kernel.org/r/c090efb3-ef82-499f-a5e0-360fc8420fb7@tum.de
Signed-off-by: Sean Christopherson <seanjc@google.com>
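On x86-64 an address is canonical when its upper bits are a sign extension of the top implemented virtual-address bit, which is the property INVVPID enforces. Below is a small self-contained sketch of such a filter, assuming 48-bit virtual addresses; KVM's real check is tied to the vCPU's configured address width and uses its own helpers, so this is illustrative only.

#include <stdbool.h>
#include <stdint.h>

/* Canonical for va_bits-wide addresses: bits 63..(va_bits-1) must all
 * equal bit (va_bits-1). Sign-extending from that bit and comparing
 * against the original value is an equivalent test.
 */
static bool gva_is_canonical(uint64_t gva, unsigned int va_bits)
{
	int64_t sext = (int64_t)(gva << (64 - va_bits)) >> (64 - va_bits);

	return (uint64_t)sext == gva;
}

/* The GVA from the warning, 0xaaaaaaaaaaaaa000, fails this check for
 * va_bits == 48, so a flush request for it would simply be skipped.
 */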
Without the change `perf` hangs on character devices. On my system
it's enough to run a system-wide sampler for a few seconds to hit the
hang:
$ perf record -a -g --call-graph=dwarf
$ perf report
# hung
`strace` shows that the hang happens on a read from the character device
`/dev/dri/renderD128`:
$ strace -y -f -p 2780484
strace: Process 2780484 attached
pread64(101</dev/dri/renderD128>, strace: Process 2780484 detached
Its call trace descends into `elfutils`:
$ gdb -p 2780484
(gdb) bt
#0 0x00007f5e508f04b7 in __libc_pread64 (fd=101, buf=0x7fff9df7edb0, count=0, offset=0)
at ../sysdeps/unix/sysv/linux/pread64.c:25
#1 0x00007f5e52b79515 in read_file () from /<<NIX>>/elfutils-0.192/lib/libelf.so.1
#2 0x00007f5e52b25666 in libdw_open_elf () from /<<NIX>>/elfutils-0.192/lib/libdw.so.1
#3 0x00007f5e52b25907 in __libdw_open_file () from /<<NIX>>/elfutils-0.192/lib/libdw.so.1
#4 0x00007f5e52b120a9 in dwfl_report_elf@@ELFUTILS_0.156 ()
from /<<NIX>>/elfutils-0.192/lib/libdw.so.1
#5 0x000000000068bf20 in __report_module (al=al@entry=0x7fff9df80010, ip=ip@entry=139803237033216, ui=ui@entry=0x5369b5e0)
at util/dso.h:537
#6 0x000000000068c3d1 in report_module (ip=139803237033216, ui=0x5369b5e0) at util/unwind-libdw.c:114
#7 frame_callback (state=0x535aef10, arg=0x5369b5e0) at util/unwind-libdw.c:242
#8 0x00007f5e52b261d3 in dwfl_thread_getframes () from /<<NIX>>/elfutils-0.192/lib/libdw.so.1
#9 0x00007f5e52b25bdb in get_one_thread_cb () from /<<NIX>>/elfutils-0.192/lib/libdw.so.1
#10 0x00007f5e52b25faa in dwfl_getthreads () from /<<NIX>>/elfutils-0.192/lib/libdw.so.1
#11 0x00007f5e52b26514 in dwfl_getthread_frames () from /<<NIX>>/elfutils-0.192/lib/libdw.so.1
#12 0x000000000068c6ce in unwind__get_entries (cb=cb@entry=0x5d4620 <unwind_entry>, arg=arg@entry=0x10cd5fa0,
thread=thread@entry=0x1076a290, data=data@entry=0x7fff9df80540, max_stack=max_stack@entry=127,
best_effort=best_effort@entry=false) at util/thread.h:152
#13 0x00000000005dae95 in thread__resolve_callchain_unwind (evsel=0x106006d0, thread=0x1076a290, cursor=0x10cd5fa0,
sample=0x7fff9df80540, max_stack=127, symbols=true) at util/machine.c:2939
#14 thread__resolve_callchain_unwind (thread=0x1076a290, cursor=0x10cd5fa0, evsel=0x106006d0, sample=0x7fff9df80540,
max_stack=127, symbols=true) at util/machine.c:2920
#15 __thread__resolve_callchain (thread=0x1076a290, cursor=0x10cd5fa0, evsel=0x106006d0, evsel@entry=0x7fff9df80440,
sample=0x7fff9df80540, parent=parent@entry=0x7fff9df804a0, root_al=root_al@entry=0x7fff9df80440, max_stack=127, symbols=true)
at util/machine.c:2970
#16 0x00000000005d0cb2 in thread__resolve_callchain (thread=<optimized out>, cursor=<optimized out>, evsel=0x7fff9df80440,
sample=<optimized out>, parent=0x7fff9df804a0, root_al=0x7fff9df80440, max_stack=127) at util/machine.h:198
#17 sample__resolve_callchain (sample=<optimized out>, cursor=<optimized out>, parent=parent@entry=0x7fff9df804a0,
evsel=evsel@entry=0x106006d0, al=al@entry=0x7fff9df80440, max_stack=max_stack@entry=127) at util/callchain.c:1127
#18 0x0000000000617e08 in hist_entry_iter__add (iter=iter@entry=0x7fff9df80480, al=al@entry=0x7fff9df80440, max_stack_depth=127,
arg=arg@entry=0x7fff9df81ae0) at util/hist.c:1255
#19 0x000000000045d2d0 in process_sample_event (tool=0x7fff9df81ae0, event=<optimized out>, sample=0x7fff9df80540,
evsel=0x106006d0, machine=<optimized out>) at builtin-report.c:334
#20 0x00000000005e3bb1 in perf_session__deliver_event (session=0x105ff2c0, event=0x7f5c7d735ca0, tool=0x7fff9df81ae0,
file_offset=2914716832, file_path=0x105ffbf0 "perf.data") at util/session.c:1367
#21 0x00000000005e8d93 in do_flush (oe=0x105ffa50, show_progress=false) at util/ordered-events.c:245
#22 __ordered_events__flush (oe=0x105ffa50, how=OE_FLUSH__ROUND, timestamp=<optimized out>) at util/ordered-events.c:324
#23 0x00000000005e1f64 in perf_session__process_user_event (session=0x105ff2c0, event=0x7f5c7d752b18, file_offset=2914835224,
file_path=0x105ffbf0 "perf.data") at util/session.c:1419
#24 0x00000000005e47c7 in reader__read_event (rd=rd@entry=0x7fff9df81260, session=session@entry=0x105ff2c0,
prog=prog@entry=0x7fff9df81220) at util/session.c:2132
#25 0x00000000005e4b37 in reader__process_events (rd=0x7fff9df81260, session=0x105ff2c0, prog=0x7fff9df81220)
at util/session.c:2181
#26 __perf_session__process_events (session=0x105ff2c0) at util/session.c:2226
#27 perf_session__process_events (session=session@entry=0x105ff2c0) at util/session.c:2390
#28 0x0000000000460add in __cmd_report (rep=0x7fff9df81ae0) at builtin-report.c:1076
#29 cmd_report (argc=<optimized out>, argv=<optimized out>) at builtin-report.c:1827
#30 0x00000000004c5a40 in run_builtin (p=p@entry=0xd8f7f8 <commands+312>, argc=argc@entry=1, argv=argv@entry=0x7fff9df844b0)
at perf.c:351
#31 0x00000000004c5d63 in handle_internal_command (argc=argc@entry=1, argv=argv@entry=0x7fff9df844b0) at perf.c:404
#32 0x0000000000442de3 in run_argv (argcp=<synthetic pointer>, argv=<synthetic pointer>) at perf.c:448
#33 main (argc=<optimized out>, argv=0x7fff9df844b0) at perf.c:556
The hang happens because nothing in `perf` or `elfutils` checks whether a
mapped file can be read without blocking.
The change conservatively skips all non-regular files.
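A minimal sketch of such a conservative check, assuming a stat()-based
helper rather than reproducing the exact perf change:

```
#include <stdbool.h>
#include <sys/stat.h>

/*
 * Sketch: before handing a mapped path to elfutils, stat() it and only
 * proceed for regular files, so blocking reads on character devices
 * such as /dev/dri/renderD128 are avoided.
 */
static bool is_regular_file(const char *path)
{
	struct stat st;

	if (stat(path, &st) != 0)
		return false;

	return S_ISREG(st.st_mode);
}
```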
Signed-off-by: Sergei Trofimovich <slyich@gmail.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20250505174419.2814857-1-slyich@gmail.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Symbolize stack traces by creating a live machine. Add this
functionality to dump_stack and switch dump_stack users to use
it. Switch the TUI to use it. Add stack traces to the child test
function, which can be useful for diagnosing blocked code.
Example output:
```
$ perf test -vv PERF_RECORD_
...
7: PERF_RECORD_* events & perf_sample fields:
7: PERF_RECORD_* events & perf_sample fields : Running (1 active)
^C
Signal (2) while running tests.
Terminating tests with the same signal
Internal test harness failure. Completing any started tests:
: 7: PERF_RECORD_* events & perf_sample fields:
---- unexpected signal (2) ----
#0 0x55788c6210a3 in child_test_sig_handler builtin-test.c:0
#1 0x7fc12fe49df0 in __restore_rt libc_sigaction.c:0
#2 0x7fc12fe99687 in __internal_syscall_cancel cancellation.c:64
#3 0x7fc12fee5f7a in clock_nanosleep@GLIBC_2.2.5 clock_nanosleep.c:72
#4 0x7fc12fef1393 in __nanosleep nanosleep.c:26
#5 0x7fc12ff02d68 in __sleep sleep.c:55
#6 0x55788c63196b in test__PERF_RECORD perf-record.c:0
#7 0x55788c620fb0 in run_test_child builtin-test.c:0
#8 0x55788c5bd18d in start_command run-command.c:127
#9 0x55788c621ef3 in __cmd_test builtin-test.c:0
#10 0x55788c6225bf in cmd_test ??:0
#11 0x55788c5afbd0 in run_builtin perf.c:0
#12 0x55788c5afeeb in handle_internal_command perf.c:0
#13 0x55788c52b383 in main ??:0
#14 0x7fc12fe33ca8 in __libc_start_call_main libc_start_call_main.h:74
#15 0x7fc12fe33d65 in __libc_start_main@@GLIBC_2.34 libc-start.c:128
#16 0x55788c52b9d1 in _start ??:0
---- unexpected signal (2) ----
#0 0x55788c6210a3 in child_test_sig_handler builtin-test.c:0
#1 0x7fc12fe49df0 in __restore_rt libc_sigaction.c:0
#2 0x7fc12fea3a14 in pthread_sigmask@GLIBC_2.2.5 pthread_sigmask.c:45
#3 0x7fc12fe49fd9 in __GI___sigprocmask sigprocmask.c:26
#4 0x7fc12ff2601b in __longjmp_chk longjmp.c:36
#5 0x55788c6210c0 in print_test_result.isra.0 builtin-test.c:0
#6 0x7fc12fe49df0 in __restore_rt libc_sigaction.c:0
#7 0x7fc12fe99687 in __internal_syscall_cancel cancellation.c:64
#8 0x7fc12fee5f7a in clock_nanosleep@GLIBC_2.2.5 clock_nanosleep.c:72
#9 0x7fc12fef1393 in __nanosleep nanosleep.c:26
#10 0x7fc12ff02d68 in __sleep sleep.c:55
#11 0x55788c63196b in test__PERF_RECORD perf-record.c:0
#12 0x55788c620fb0 in run_test_child builtin-test.c:0
#13 0x55788c5bd18d in start_command run-command.c:127
#14 0x55788c621ef3 in __cmd_test builtin-test.c:0
#15 0x55788c6225bf in cmd_test ??:0
#16 0x55788c5afbd0 in run_builtin perf.c:0
#17 0x55788c5afeeb in handle_internal_command perf.c:0
#18 0x55788c52b383 in main ??:0
#19 0x7fc12fe33ca8 in __libc_start_call_main libc_start_call_main.h:74
#20 0x7fc12fe33d65 in __libc_start_main@@GLIBC_2.34 libc-start.c:128
#21 0x55788c52b9d1 in _start ??:0
7: PERF_RECORD_* events & perf_sample fields : Skip (permissions)
```
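For comparison, a generic glibc-based sketch of the same idea; perf uses
its own live-machine symbolization rather than execinfo, so this only
shows the shape of the feature:

```
#include <execinfo.h>
#include <signal.h>
#include <unistd.h>

/*
 * Generic sketch, not perf's implementation: dump a symbolized
 * backtrace from a signal handler. backtrace_symbols_fd() is used
 * because, unlike backtrace_symbols(), it does not call malloc(),
 * which is not async-signal-safe.
 */
static void sig_handler(int sig)
{
	void *frames[64];
	int n = backtrace(frames, 64);

	backtrace_symbols_fd(frames, n, STDERR_FILENO);
	_exit(128 + sig);
}

int main(void)
{
	signal(SIGSEGV, sig_handler);	/* install the handler early */
	/* ... run the actual work here ... */
	return 0;
}
```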
Signed-off-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250624210500.2121303-1-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Calling perf top with branch filters enabled on Intel CPUs that
support branch counter logging (a.k.a. LBR event logging [1])
results in a segfault.
$ perf top -e '{cpu_core/cpu-cycles/,cpu_core/event=0xc6,umask=0x3,frontend=0x11,name=frontend_retired_dsb_miss/}' -j any,counter
...
Thread 27 "perf" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fffafff76c0 (LWP 949003)]
perf_env__find_br_cntr_info (env=0xf66dc0 <perf_env>, nr=0x0, width=0x7fffafff62c0) at util/env.c:653
653 *width = env->cpu_pmu_caps ? env->br_cntr_width :
(gdb) bt
#0 perf_env__find_br_cntr_info (env=0xf66dc0 <perf_env>, nr=0x0, width=0x7fffafff62c0) at util/env.c:653
#1 0x00000000005b1599 in symbol__account_br_cntr (branch=0x7fffcc3db580, evsel=0xfea2d0, offset=12, br_cntr=8) at util/annotate.c:345
#2 0x00000000005b17fb in symbol__account_cycles (addr=5658172, start=5658160, sym=0x7fffcc0ee420, cycles=539, evsel=0xfea2d0, br_cntr=8) at util/annotate.c:389
#3 0x00000000005b1976 in addr_map_symbol__account_cycles (ams=0x7fffcd7b01d0, start=0x7fffcd7b02b0, cycles=539, evsel=0xfea2d0, br_cntr=8) at util/annotate.c:422
#4 0x000000000068d57f in hist__account_cycles (bs=0x110d288, al=0x7fffafff6540, sample=0x7fffafff6760, nonany_branch_mode=false, total_cycles=0x0, evsel=0xfea2d0) at util/hist.c:2850
#5 0x0000000000446216 in hist_iter__top_callback (iter=0x7fffafff6590, al=0x7fffafff6540, single=true, arg=0x7fffffff9e00) at builtin-top.c:737
#6 0x0000000000689787 in hist_entry_iter__add (iter=0x7fffafff6590, al=0x7fffafff6540, max_stack_depth=127, arg=0x7fffffff9e00) at util/hist.c:1359
#7 0x0000000000446710 in perf_event__process_sample (tool=0x7fffffff9e00, event=0x110d250, evsel=0xfea2d0, sample=0x7fffafff6760, machine=0x108c968) at builtin-top.c:845
#8 0x0000000000447735 in deliver_event (qe=0x7fffffffa120, qevent=0x10fc200) at builtin-top.c:1211
#9 0x000000000064ccae in do_flush (oe=0x7fffffffa120, show_progress=false) at util/ordered-events.c:245
#10 0x000000000064d005 in __ordered_events__flush (oe=0x7fffffffa120, how=OE_FLUSH__TOP, timestamp=0) at util/ordered-events.c:324
#11 0x000000000064d0ef in ordered_events__flush (oe=0x7fffffffa120, how=OE_FLUSH__TOP) at util/ordered-events.c:342
#12 0x00000000004472a9 in process_thread (arg=0x7fffffff9e00) at builtin-top.c:1120
#13 0x00007ffff6e7dba8 in start_thread (arg=<optimized out>) at pthread_create.c:448
#14 0x00007ffff6f01b8c in __GI___clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:78
The cause is that perf_env__find_br_cntr_info tries to dereference
pmu_caps, a NULL pointer in the perf_env struct. A similar issue exists
on homogeneous-core systems, which use the cpu_pmu_caps structure.
Fix this by populating the cpu_pmu_caps and pmu_caps structures with
values from sysfs when perf top is invoked with branch stack sampling
enabled.
[1] LBR event logging, introduced here:
https://lore.kernel.org/all/20231025201626.3000228-5-kan.liang@linux.intel.com/
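A minimal sketch of reading such capabilities from sysfs; the cap file
names (branch_counter_nr, branch_counter_width) are taken from the linked
LBR event logging series, and the helper is illustrative, not perf's
actual code:

```
#include <stdio.h>

/* Read one unsigned PMU capability value from sysfs. */
static int read_pmu_cap(const char *pmu, const char *cap, unsigned int *val)
{
	char path[256];
	FILE *f;
	int ok;

	snprintf(path, sizeof(path),
		 "/sys/bus/event_source/devices/%s/caps/%s", pmu, cap);
	f = fopen(path, "r");
	if (!f)
		return -1;
	ok = fscanf(f, "%u", val) == 1;
	fclose(f);
	return ok ? 0 : -1;
}
```

For the hybrid system above one would query, e.g.,
read_pmu_cap("cpu_core", "branch_counter_nr", &nr).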
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Thomas Falcon <thomas.falcon@intel.com>
Link: https://lore.kernel.org/r/20250612163659.1357950-2-thomas.falcon@intel.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
…CAN XL step 3/3"
Vincent Mailhol <mailhol@kernel.org> says:
In November last year, I sent an RFC to introduce CAN XL [1]. That
RFC, despite positive feedback, was put on hold due to some unanswered
questions concerning the PWM encoding [2].
While stuck, some small preparation work was done in parallel in [3]
by refactoring struct can_priv and doing some trivial clean-up and
renaming. Initially, [3] received zero feedback but was eventually
merged after being split into smaller parts and resent.
Finally, in July this year, we clarified the remaining mysteries about
the PWM calculation, thus unlocking the series. Summer was a bit busy
because of some personal matters, which brings us to now.
After doing all the refactoring and adding all the CAN XL features,
the final result is more than 30 patches, definitely too much for a
single series. So I am splitting the remaining changes into three:
- can: rework the CAN MTU logic [4]
- can: netlink: preparation before introduction of CAN XL (this series)
- CAN XL (will come right after the two preparation series get merged)
And thus, this series continues and finishes the preparation work done
in [3] and [4]. It contains all the refactoring needed to smoothly
introduce CAN XL. The goal is to:
- split the functions into smaller pieces: CAN XL will introduce a
fair amount of code. And some functions which are already fairly
long (86 lines for can_validate(), 215 lines for can_changelink())
would grow to disproportionate sizes if the CAN XL logic were to
be inlined in those functions.
- repurpose the existing code to handle both CAN FD and CAN XL: a
huge part of CAN XL simply reuses the CAN FD logic. All the
existing CAN FD logic is made more generic to handle both CAN FD
and XL.
In more detail:
- Patch #1 moves struct data_bittiming_params from dev.h to
  bittiming.h and patch #2 makes can_get_relative_tdco() FD agnostic
  before also moving it to bittiming.h.
- Patch #3 adds some comments to netlink.h tagging which IFLA
  symbols are FD specific.
- Patches #4 to #6 refactor can_validate() and
  can_validate_bittiming().
- Patches #7 to #11 refactor can_changelink() and
  can_tdc_changelink().
- Patches #12 and #13 refactor can_get_size() and
  can_tdc_get_size().
- Patches #14 to #17 refactor can_fill_info() and
  can_tdc_fill_info().
- Patch #18 makes can_calc_tdco() FD agnostic.
- Patch #19 adds can_get_ctrlmode_str(), which converts control mode
  flags into strings (a minimal sketch of the idea follows this
  list). This is done in preparation for patch #20.
- Patch #20 is the final patch and improves the user experience by
  providing detailed error messages whenever invalid parameters are
  provided. All those error messages came in handy when debugging
  the upcoming CAN XL patches.
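As referenced in the item for patch #19, here is a minimal userspace
sketch of the flag-to-string idea; the real can_get_ctrlmode_str() lives
in the kernel's CAN netlink code and its exact output format is not
reproduced here. The CAN_CTRLMODE_* flags are the real ones from
<linux/can/netlink.h>.

```
#include <stdio.h>
#include <linux/can/netlink.h>

/* Map a single control-mode flag to a human-readable name. */
static const char *can_ctrlmode_str(__u32 flag)
{
	switch (flag) {
	case CAN_CTRLMODE_LOOPBACK:	return "loopback";
	case CAN_CTRLMODE_LISTENONLY:	return "listen-only";
	case CAN_CTRLMODE_FD:		return "fd";
	case CAN_CTRLMODE_FD_NON_ISO:	return "fd-non-iso";
	default:			return "unknown";
	}
}

int main(void)
{
	printf("%s\n", can_ctrlmode_str(CAN_CTRLMODE_FD));
	return 0;
}
```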
Aside from the last patch, the other changes do not impact any existing
functionality.
The follow-up series which introduces CAN XL is nearly complete but
will be sent only once this one is approved: one thing at a time, I do
not want to overwhelm people (including myself).
[1] https://lore.kernel.org/linux-can/20241110155902.72807-16-mailhol.vincent@wanadoo.fr/
[2] https://lore.kernel.org/linux-can/c4771c16-c578-4a6d-baee-918fe276dbe9@wanadoo.fr/
[3] https://lore.kernel.org/linux-can/20241110155902.72807-16-mailhol.vincent@wanadoo.fr/
[4] https://lore.kernel.org/linux-can/20250923-can-fix-mtu-v2-0-984f9868db69@kernel.org/
Link: https://patch.msgid.link/20250923-canxl-netlink-prep-v4-0-e720d28f66fe@kernel.org
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Patch series "mm: remove nth_page()", v2.

As discussed recently with Linus, nth_page() is just nasty and we would
like to remove it.

To recap, the reason we currently need nth_page() within a folio is
because on some kernel configs (SPARSEMEM without SPARSEMEM_VMEMMAP), the
memmap is allocated per memory section. While buddy allocations cannot
cross memory section boundaries, hugetlb and dax folios can. So crossing a
memory section means that "page++" could do the wrong thing. Instead,
nth_page() on these problematic configs always goes from page->pfn, to
then go from (pfn + n)->page, which is rather nasty.

Likely, many people have no idea when nth_page() is required and when it
might be dropped. We refer to such problematic PFN ranges as
"non-contiguous pages". If we only deal with "contiguous pages", there is
no need for nth_page().

Besides that "obvious" folio case, we might end up using nth_page() within
CMA allocations (again, could span memory sections), and in one corner
case (kfence) when processing memblock allocations (again, could span
memory sections).

So let's handle all that, add sanity checks, and remove nth_page().

Patch #1  -> #5  : stop making SPARSEMEM_VMEMMAP user-selectable + cleanups
Patch #6  -> #13 : disallow folios to have non-contiguous pages
Patch #14 -> #20 : remove nth_page() usage within folios
Patch #22        : disallow CMA allocations of non-contiguous pages
Patch #23 -> #33 : sanity-check + remove nth_page() usage within SG entry
Patch #34        : sanity-check + remove nth_page() usage in
                   unpin_user_page_range_dirty_lock()
Patch #35        : remove nth_page() in kfence
Patch #36        : adjust stale comment regarding nth_page
Patch #37        : mm: remove nth_page()

A lot of this is inspired from the discussion at [1] between Linus, Jason
and me, so kudos to them.

This patch (of 37):

In an ideal world, we wouldn't have to deal with SPARSEMEM without
SPARSEMEM_VMEMMAP, but in particular for 32bit, SPARSEMEM_VMEMMAP is
considered too costly and consequently not supported.

However, if an architecture does support SPARSEMEM with SPARSEMEM_VMEMMAP,
let's forbid the user from disabling VMEMMAP: just like we already do for
arm64, s390 and x86.

So if SPARSEMEM_VMEMMAP is supported, don't allow using SPARSEMEM without
SPARSEMEM_VMEMMAP. This implies that the option to not use
SPARSEMEM_VMEMMAP will now be gone for loongarch, powerpc, riscv and
sparc. All architectures only enable SPARSEMEM_VMEMMAP with 64bit support,
so there should not really be a big downside to using the VMEMMAP (quite
the contrary).

This is a preparation for not supporting

(1) folio sizes that exceed a single memory section
(2) CMA allocations of non-contiguous page ranges

in SPARSEMEM without SPARSEMEM_VMEMMAP configs, whereby we want to limit
possible impact as much as possible (e.g., gigantic hugetlb page
allocations suddenly failing).
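For reference, the dual definition that makes plain pointer arithmetic
unsafe on the affected configs is essentially this (paraphrasing
include/linux/mm.h before the series):

```
/*
 * On SPARSEMEM without SPARSEMEM_VMEMMAP the memmap is allocated per
 * memory section, so advancing a page pointer must detour through the
 * PFN; on all other configs plain pointer arithmetic is fine.
 */
#if defined(CONFIG_SPARSEMEM) && !defined(CONFIG_SPARSEMEM_VMEMMAP)
#define nth_page(page, n) pfn_to_page(page_to_pfn((page)) + (n))
#else
#define nth_page(page, n) ((page) + (n))
#endif
```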
Link: https://lkml.kernel.org/r/20250901150359.867252-1-david@redhat.com
Link: https://lkml.kernel.org/r/20250901150359.867252-2-david@redhat.com
Link: https://lore.kernel.org/all/CAHk-=wiCYfNp4AJLBORU-c7ZyRBUp66W2-Et6cdQ4REx-GyQ_A@mail.gmail.com/T/#u [1]
Signed-off-by: David Hildenbrand <david@redhat.com>
Acked-by: Zi Yan <ziy@nvidia.com>
Acked-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Acked-by: SeongJae Park <sj@kernel.org>
Reviewed-by: Wei Yang <richard.weiyang@gmail.com>
Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: WANG Xuerui <kernel@xen0n.name>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Alexandre Ghiti <alex@ghiti.fr>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Andreas Larsson <andreas@gaisler.com>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Alexandru Elisei <alexandru.elisei@arm.com>
Cc: Alex Dubov <oakad@yahoo.com>
Cc: Alex Willamson <alex.williamson@redhat.com>
Cc: Bart van Assche <bvanassche@acm.org>
Cc: Borislav Betkov <bp@alien8.de>
Cc: Brendan Jackman <jackmanb@google.com>
Cc: Brett Creeley <brett.creeley@amd.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christian Borntraeger <borntraeger@linux.ibm.com>
Cc: Christoph Lameter (Ampere) <cl@gentwo.org>
Cc: Damien Le Maol <dlemoal@kernel.org>
Cc: Dave Airlie <airlied@gmail.com>
Cc: Dennis Zhou <dennis@kernel.org>
Cc: Dmitriy Vyukov <dvyukov@google.com>
Cc: Doug Gilbert <dgilbert@interlog.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Inki Dae <m.szyprowski@samsung.com>
Cc: James Bottomley <james.bottomley@HansenPartnership.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Jason A. Donenfeld <jason@zx2c4.com>
Cc: Jason Gunthorpe <jgg@nvidia.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Jesper Nilsson <jesper.nilsson@axis.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Jonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Kevin Tian <kevin.tian@intel.com>
Cc: Lars Persson <lars.persson@axis.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Marco Elver <elver@google.com>
Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
Cc: Maxim Levitky <maximlevitsky@gmail.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Muchun Song <muchun.song@linux.dev>
Cc: Niklas Cassel <cassel@kernel.org>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Pavel Begunkov <asml.silence@gmail.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Robin Murohy <robin.murphy@arm.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Thomas Gleinxer <tglx@linutronix.de>
Cc: Tvrtko Ursulin <tursulin@ursulin.net>
Cc: Ulf Hansson <ulf.hansson@linaro.org>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Will Deacon <will@kernel.org>
Cc: Yishai Hadas <yishaih@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Pull request for series with
subject: xsk: increase NAPI budget for AF_XDP zero-copy path
version: 1
url: https://patchwork.ozlabs.org/project/netdev/list/?series=199973