Skip to content

DLPX-75524 avoid unnecessary nfserr_jukebox returns from nfsd_file_acquire #8

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
May 11, 2021

Conversation

don-brady
Copy link
Contributor

Context:

This change picks up an upstream kernel fix to address the side effect of Oracle DB corruption.

commit d30881f573e565ebb5dbb50b31ed6106b5c81328
Author: Trond Myklebust <trond.myklebust@hammerspace.com>
Date:   Thu Feb 18 21:02:07 2021 -0500

    nfsd: Don't keep looking up unhashed files in the nfsd file cache

    If a file is unhashed, then we're going to reject it anyway and retry,
    so make sure we skip it when we're doing the RCU lockless lookup.
    This avoids a number of unnecessary nfserr_jukebox returns from
    nfsd_file_acquire()

    Fixes: 65294c1f2c5e ("nfsd: add a new struct file caching facility to nfsd")
    Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
    Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

Problem:

In the 5.4 Linux kernel we picked up a new file caching facility in nfsd. A side effect of that change is that we can now encounter spurious nfserr_jukebox errors during nfs write calls. For Oracle 19 VDBs using dNFS this error during a write request can cause corruption. This is being tracked internally as DLPX-75524.

Solution:

From the upstream commit:

    If a file is unhashed, then we're going to reject it anyway and retry,
    so make sure we skip it when we're doing the RCU lockless lookup.
    This avoids a number of unnecessary nfserr_jukebox returns from
    nfsd_file_acquire()

Also added logging for all cases in nfsd_file_acquire() that can return nfserr_jukebox to assist in any future diagnosis.

Testing

Change was tested by the escalation team. With this fix in place the reproducible corruption was no longer occurring.

appliance-build-orchestrator-pre-push:
http://selfservice.jenkins.delphix.com/job/devops-gate/job/master/job/appliance-build-orchestrator-pre-push/5247/

The test plan is located here: https://docs.google.com/document/d/1tLqW_-zX48inlsf2fwvf5rimrdBjKbMWuJHgmuVnF8Y/edit#

Implementation:

This change resides in the nfsd.ko kernel module.

Deployment Plan:

This is PR is for the master branch. A hotfix was also made available from this change.

…quire

Upstream fix from kernel 5.12
nfsd: Don't keep looking up unhashed files in the nfsd file cache

If a file is unhashed, then we're going to reject it anyway and retry,
so make sure we skip it when we're doing the RCU lockless lookup.
This avoids a number of unnecessary nfserr_jukebox returns from
nfsd_file_acquire()

Fixes: 65294c1 ("nfsd: add a new struct file caching facility to nfsd")
@don-brady don-brady merged commit e4b8431 into delphix:master May 11, 2021
delphix-devops-bot pushed a commit that referenced this pull request May 12, 2021
BugLink: https://bugs.launchpad.net/bugs/1918974

[ Upstream commit c5c97ca ]

The ubsan reported the following error.  It was because sample's raw
data missed u32 padding at the end.  So it broke the alignment of the
array after it.

The raw data contains an u32 size prefix so the data size should have
an u32 padding after 8-byte aligned data.

27: Sample parsing  :util/synthetic-events.c:1539:4:
  runtime error: store to misaligned address 0x62100006b9bc for type
  '__u64' (aka 'unsigned long long'), which requires 8 byte alignment
0x62100006b9bc: note: pointer points here
  00 00 00 00 ff ff ff ff  ff ff ff ff ff ff ff ff  ff ff ff ff ff ff ff ff  ff ff ff ff ff ff ff ff
              ^
    #0 0x561532a9fc96 in perf_event__synthesize_sample util/synthetic-events.c:1539:13
    #1 0x5615327f4a4f in do_test tests/sample-parsing.c:284:8
    #2 0x5615327f3f50 in test__sample_parsing tests/sample-parsing.c:381:9
    #3 0x56153279d3a1 in run_test tests/builtin-test.c:424:9
    #4 0x56153279c836 in test_and_print tests/builtin-test.c:454:9
    #5 0x56153279b7eb in __cmd_test tests/builtin-test.c:675:4
    #6 0x56153279abf0 in cmd_test tests/builtin-test.c:821:9
    #7 0x56153264e796 in run_builtin perf.c:312:11
    #8 0x56153264cf03 in handle_internal_command perf.c:364:8
    #9 0x56153264e47d in run_argv perf.c:408:2
    #10 0x56153264c9a9 in main perf.c:538:3
    #11 0x7f137ab6fbbc in __libc_start_main (/lib64/libc.so.6+0x38bbc)
    #12 0x561532596828 in _start ...

SUMMARY: UndefinedBehaviorSanitizer: misaligned-pointer-use
 util/synthetic-events.c:1539:4 in

Fixes: 045f8cd ("perf tests: Add a sample parsing test")
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20210214091638.519643-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>
delphix-devops-bot pushed a commit that referenced this pull request May 12, 2021
BugLink: https://bugs.launchpad.net/bugs/1920235

[ Upstream commit e8bd76e ]

kernel panic trace looks like:

 #5 [ffffb9e08698fc80] do_page_fault at ffffffffb666e0d7
 #6 [ffffb9e08698fcb0] page_fault at ffffffffb70010fe
    [exception RIP: amp_read_loc_assoc_final_data+63]
    RIP: ffffffffc06ab54f  RSP: ffffb9e08698fd68  RFLAGS: 00010246
    RAX: 0000000000000000  RBX: ffff8c8845a5a000  RCX: 0000000000000004
    RDX: 0000000000000000  RSI: ffff8c8b9153d000  RDI: ffff8c8845a5a000
    RBP: ffffb9e08698fe40   R8: 00000000000330e0   R9: ffffffffc0675c94
    R10: ffffb9e08698fe58  R11: 0000000000000001  R12: ffff8c8b9cbf6200
    R13: 0000000000000000  R14: 0000000000000000  R15: ffff8c8b2026da0b
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
 #7 [ffffb9e08698fda8] hci_event_packet at ffffffffc0676904 [bluetooth]
 #8 [ffffb9e08698fe50] hci_rx_work at ffffffffc06629ac [bluetooth]
 #9 [ffffb9e08698fe98] process_one_work at ffffffffb66f95e7

hcon->amp_mgr seems NULL triggered kernel panic in following line inside
function amp_read_loc_assoc_final_data

        set_bit(READ_LOC_AMP_ASSOC_FINAL, &mgr->state);

Fixed by checking NULL for mgr.

Signed-off-by: Gopal Tiwari <gtiwari@redhat.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Kelsey Skunberg <kelsey.skunberg@canonical.com>
delphix-devops-bot pushed a commit that referenced this pull request May 12, 2021
…quire (#8)

Upstream fix from kernel 5.12
nfsd: Don't keep looking up unhashed files in the nfsd file cache

If a file is unhashed, then we're going to reject it anyway and retry,
so make sure we skip it when we're doing the RCU lockless lookup.
This avoids a number of unnecessary nfserr_jukebox returns from
nfsd_file_acquire()

Fixes: 65294c1 ("nfsd: add a new struct file caching facility to nfsd")
delphix-devops-bot pushed a commit that referenced this pull request May 17, 2021
…quire (#8)

Upstream fix from kernel 5.12
nfsd: Don't keep looking up unhashed files in the nfsd file cache

If a file is unhashed, then we're going to reject it anyway and retry,
so make sure we skip it when we're doing the RCU lockless lookup.
This avoids a number of unnecessary nfserr_jukebox returns from
nfsd_file_acquire()

Fixes: 65294c1 ("nfsd: add a new struct file caching facility to nfsd")
delphix-devops-bot pushed a commit that referenced this pull request May 18, 2021
…quire (#8)

Upstream fix from kernel 5.12
nfsd: Don't keep looking up unhashed files in the nfsd file cache

If a file is unhashed, then we're going to reject it anyway and retry,
so make sure we skip it when we're doing the RCU lockless lookup.
This avoids a number of unnecessary nfserr_jukebox returns from
nfsd_file_acquire()

Fixes: 65294c1 ("nfsd: add a new struct file caching facility to nfsd")
delphix-devops-bot pushed a commit that referenced this pull request Jun 6, 2021
BugLink: https://bugs.launchpad.net/bugs/1926489

commit 90bd070 upstream.

The following deadlock is detected:

  truncate -> setattr path is waiting for pending direct IO to be done (inode->i_dio_count become zero) with inode->i_rwsem held (down_write).

  PID: 14827  TASK: ffff881686a9af80  CPU: 20  COMMAND: "ora_p005_hrltd9"
   #0  __schedule at ffffffff818667cc
   #1  schedule at ffffffff81866de6
   #2  inode_dio_wait at ffffffff812a2d04
   #3  ocfs2_setattr at ffffffffc05f322e [ocfs2]
   #4  notify_change at ffffffff812a5a09
   #5  do_truncate at ffffffff812808f5
   #6  do_sys_ftruncate.constprop.18 at ffffffff81280cf2
   #7  sys_ftruncate at ffffffff81280d8e
   #8  do_syscall_64 at ffffffff81003949
   #9  entry_SYSCALL_64_after_hwframe at ffffffff81a001ad

dio completion path is going to complete one direct IO (decrement
inode->i_dio_count), but before that it hung at locking inode->i_rwsem:

   #0  __schedule+700 at ffffffff818667cc
   #1  schedule+54 at ffffffff81866de6
   #2  rwsem_down_write_failed+536 at ffffffff8186aa28
   #3  call_rwsem_down_write_failed+23 at ffffffff8185a1b7
   #4  down_write+45 at ffffffff81869c9d
   #5  ocfs2_dio_end_io_write+180 at ffffffffc05d5444 [ocfs2]
   #6  ocfs2_dio_end_io+85 at ffffffffc05d5a85 [ocfs2]
   #7  dio_complete+140 at ffffffff812c873c
   #8  dio_aio_complete_work+25 at ffffffff812c89f9
   #9  process_one_work+361 at ffffffff810b1889
  #10  worker_thread+77 at ffffffff810b233d
  #11  kthread+261 at ffffffff810b7fd5
  #12  ret_from_fork+62 at ffffffff81a0035e

Thus above forms ABBA deadlock.  The same deadlock was mentioned in
upstream commit 28f5a8a ("ocfs2: should wait dio before inode lock
in ocfs2_setattr()").  It seems that that commit only removed the
cluster lock (the victim of above dead lock) from the ABBA deadlock
party.

End-user visible effects: Process hang in truncate -> ocfs2_setattr path
and other processes hang at ocfs2_dio_end_io_write path.

This is to fix the deadlock itself.  It removes inode_lock() call from
dio completion path to remove the deadlock and add ip_alloc_sem lock in
setattr path to synchronize the inode modifications.

[wen.gang.wang@oracle.com: remove the "had_alloc_lock" as suggested]
  Link: https://lkml.kernel.org/r/20210402171344.1605-1-wen.gang.wang@oracle.com

Link: https://lkml.kernel.org/r/20210331203654.3911-1-wen.gang.wang@oracle.com
Signed-off-by: Wengang Wang <wen.gang.wang@oracle.com>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Changwei Ge <gechangwei@live.cn>
Cc: Gang He <ghe@suse.com>
Cc: Jun Piao <piaojun@huawei.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
delphix-devops-bot pushed a commit that referenced this pull request Jun 6, 2021
…quire (#8)

Upstream fix from kernel 5.12
nfsd: Don't keep looking up unhashed files in the nfsd file cache

If a file is unhashed, then we're going to reject it anyway and retry,
so make sure we skip it when we're doing the RCU lockless lookup.
This avoids a number of unnecessary nfserr_jukebox returns from
nfsd_file_acquire()

Fixes: 65294c1 ("nfsd: add a new struct file caching facility to nfsd")
delphix-devops-bot pushed a commit that referenced this pull request Jun 18, 2021
…quire (#8)

Upstream fix from kernel 5.12
nfsd: Don't keep looking up unhashed files in the nfsd file cache

If a file is unhashed, then we're going to reject it anyway and retry,
so make sure we skip it when we're doing the RCU lockless lookup.
This avoids a number of unnecessary nfserr_jukebox returns from
nfsd_file_acquire()

Fixes: 65294c1 ("nfsd: add a new struct file caching facility to nfsd")
delphix-devops-bot pushed a commit that referenced this pull request Jun 24, 2021
BugLink: https://bugs.launchpad.net/bugs/1928825

[ Upstream commit 0a360e8 ]

Add the error path for registering tty devices and roll back in case of error
in bid to avoid the UAF like the below one reported.

Plus syzbot reported general protection fault in cdev_del() on Sep 24, 2020
and both cases are down to the kobject_put() in tty_cdev_add().

 ------------[ cut here ]------------
 refcount_t: underflow; use-after-free.
 WARNING: CPU: 1 PID: 8923 at lib/refcount.c:28
 refcount_warn_saturate+0x1cf/0x210 -origin/lib/refcount.c:28
 Modules linked in:
 CPU: 1 PID: 8923 Comm: executor Not tainted 5.12.0-rc5+ #8
 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
 1.13.0-1ubuntu1.1 04/01/2014
 RIP: 0010:refcount_warn_saturate+0x1cf/0x210 -origin/lib/refcount.c:28
 Code: 4f ff ff ff e8 32 fa b5 fe 48 c7 c7 3d f8 f6 86 e8 d6 ab c6 fe
 c6 05 7c 34 67 04 01 48 c7 c7 68 f8 6d 86 31 c0 e8 81 2e 9d fe <0f> 0b
 e9 22 ff ff ff e8 05 fa b5 fe 48 c7 c7 3e f8 f6 86 e8 a9 ab
 RSP: 0018:ffffc90001633c60 EFLAGS: 00010246
 RAX: 15d08b2e34b77800 RBX: 0000000000000003 RCX: ffff88804c056c80
 RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
 RBP: 0000000000000003 R08: ffffffff813767aa R09: 0001ffffffffffff
 R10: 0001ffffffffffff R11: ffff88804c056c80 R12: ffff888040b7d000
 R13: ffff88804c206938 R14: ffff88804c206900 R15: ffff888041b18488
 FS:  00000000022c9940(0000) GS:ffff88807ec00000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 00007f9f9b122008 CR3: 0000000044b4b000 CR4: 0000000000750ee0
 PKRU: 55555554
 Call Trace:
  __refcount_sub_and_test -origin/./include/linux/refcount.h:283 [inline]
  __refcount_dec_and_test -origin/./include/linux/refcount.h:315 [inline]
  refcount_dec_and_test -origin/./include/linux/refcount.h:333 [inline]
  kref_put -origin/./include/linux/kref.h:64 [inline]
  kobject_put+0x17b/0x180 -origin/lib/kobject.c:753
  cdev_del+0x4b/0x50 -origin/fs/char_dev.c:597
  tty_unregister_device+0x99/0xd0 -origin/drivers/tty/tty_io.c:3343
  gsmld_detach_gsm -origin/drivers/tty/n_gsm.c:2409 [inline]
  gsmld_close+0x6c/0x140 -origin/drivers/tty/n_gsm.c:2478
  tty_ldisc_close -origin/drivers/tty/tty_ldisc.c:488 [inline]
  tty_ldisc_kill -origin/drivers/tty/tty_ldisc.c:636 [inline]
  tty_ldisc_release+0x1b6/0x400 -origin/drivers/tty/tty_ldisc.c:809
  tty_release_struct+0x19/0xb0 -origin/drivers/tty/tty_io.c:1714
  tty_release+0x9ad/0xa00 -origin/drivers/tty/tty_io.c:1885
  __fput+0x260/0x4e0 -origin/fs/file_table.c:280
  ____fput+0x11/0x20 -origin/fs/file_table.c:313
  task_work_run+0x8e/0x110 -origin/kernel/task_work.c:140
  tracehook_notify_resume -origin/./include/linux/tracehook.h:189 [inline]
  exit_to_user_mode_loop -origin/kernel/entry/common.c:174 [inline]
  exit_to_user_mode_prepare+0x16b/0x1a0 -origin/kernel/entry/common.c:208
  __syscall_exit_to_user_mode_work -origin/kernel/entry/common.c:290 [inline]
  syscall_exit_to_user_mode+0x20/0x40 -origin/kernel/entry/common.c:301
  do_syscall_64+0x45/0x80 -origin/arch/x86/entry/common.c:56
  entry_SYSCALL_64_after_hwframe+0x44/0xae

Reported-by: syzbot+c49fe6089f295a05e6f8@syzkaller.appspotmail.com
Reported-and-tested-by: Hao Sun <sunhao.th@gmail.com>
Signed-off-by: Hillf Danton <hdanton@sina.com>
Link: https://lore.kernel.org/r/20210412035758.1974-1-hdanton@sina.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
delphix-devops-bot pushed a commit that referenced this pull request Jun 24, 2021
BugLink: https://bugs.launchpad.net/bugs/1929615

[ Upstream commit bbd6f0a ]

In bnxt_rx_pkt(), the RX buffers are expected to complete in order.
If the RX consumer index indicates an out of order buffer completion,
it means we are hitting a hardware bug and the driver will abort all
remaining RX packets and reset the RX ring.  The RX consumer index
that we pass to bnxt_discard_rx() is not correct.  We should be
passing the current index (tmp_raw_cons) instead of the old index
(raw_cons).  This bug can cause us to be at the wrong index when
trying to abort the next RX packet.  It can crash like this:

 #0 [ffff9bbcdf5c39a8] machine_kexec at ffffffff9b05e007
 #1 [ffff9bbcdf5c3a00] __crash_kexec at ffffffff9b111232
 #2 [ffff9bbcdf5c3ad0] panic at ffffffff9b07d61e
 #3 [ffff9bbcdf5c3b50] oops_end at ffffffff9b030978
 #4 [ffff9bbcdf5c3b78] no_context at ffffffff9b06aaf0
 #5 [ffff9bbcdf5c3bd8] __bad_area_nosemaphore at ffffffff9b06ae2e
 #6 [ffff9bbcdf5c3c28] bad_area_nosemaphore at ffffffff9b06af24
 #7 [ffff9bbcdf5c3c38] __do_page_fault at ffffffff9b06b67e
 #8 [ffff9bbcdf5c3cb0] do_page_fault at ffffffff9b06bb12
 #9 [ffff9bbcdf5c3ce0] page_fault at ffffffff9bc015c5
    [exception RIP: bnxt_rx_pkt+237]
    RIP: ffffffffc0259cdd  RSP: ffff9bbcdf5c3d98  RFLAGS: 00010213
    RAX: 000000005dd8097f  RBX: ffff9ba4cb11b7e0  RCX: ffffa923cf6e9000
    RDX: 0000000000000fff  RSI: 0000000000000627  RDI: 0000000000001000
    RBP: ffff9bbcdf5c3e60   R8: 0000000000420003   R9: 000000000000020d
    R10: ffffa923cf6ec138  R11: ffff9bbcdf5c3e83  R12: ffff9ba4d6f928c0
    R13: ffff9ba4cac28080  R14: ffff9ba4cb11b7f0  R15: ffff9ba4d5a30000
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018

Fixes: a1b0e4e ("bnxt_en: Improve RX consumer index validity check.")
Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Reviewed-by: Andy Gospodarek <gospo@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com>
delphix-devops-bot pushed a commit that referenced this pull request Jun 24, 2021
…quire (#8)

Upstream fix from kernel 5.12
nfsd: Don't keep looking up unhashed files in the nfsd file cache

If a file is unhashed, then we're going to reject it anyway and retry,
so make sure we skip it when we're doing the RCU lockless lookup.
This avoids a number of unnecessary nfserr_jukebox returns from
nfsd_file_acquire()

Fixes: 65294c1 ("nfsd: add a new struct file caching facility to nfsd")
delphix-devops-bot pushed a commit that referenced this pull request Jul 12, 2021
…quire (#8)

Upstream fix from kernel 5.12
nfsd: Don't keep looking up unhashed files in the nfsd file cache

If a file is unhashed, then we're going to reject it anyway and retry,
so make sure we skip it when we're doing the RCU lockless lookup.
This avoids a number of unnecessary nfserr_jukebox returns from
nfsd_file_acquire()

Fixes: 65294c1 ("nfsd: add a new struct file caching facility to nfsd")
delphix-devops-bot pushed a commit that referenced this pull request Jul 22, 2021
…quire (#8)

Upstream fix from kernel 5.12
nfsd: Don't keep looking up unhashed files in the nfsd file cache

If a file is unhashed, then we're going to reject it anyway and retry,
so make sure we skip it when we're doing the RCU lockless lookup.
This avoids a number of unnecessary nfserr_jukebox returns from
nfsd_file_acquire()

Fixes: 65294c1 ("nfsd: add a new struct file caching facility to nfsd")
pzakha pushed a commit that referenced this pull request Aug 23, 2021
…quire (#8)

Upstream fix from kernel 5.12
nfsd: Don't keep looking up unhashed files in the nfsd file cache

If a file is unhashed, then we're going to reject it anyway and retry,
so make sure we skip it when we're doing the RCU lockless lookup.
This avoids a number of unnecessary nfserr_jukebox returns from
nfsd_file_acquire()

Fixes: 65294c1 ("nfsd: add a new struct file caching facility to nfsd")
pzakha pushed a commit that referenced this pull request Aug 24, 2021
…quire (#8)

Upstream fix from kernel 5.12
nfsd: Don't keep looking up unhashed files in the nfsd file cache

If a file is unhashed, then we're going to reject it anyway and retry,
so make sure we skip it when we're doing the RCU lockless lookup.
This avoids a number of unnecessary nfserr_jukebox returns from
nfsd_file_acquire()

Fixes: 65294c1 ("nfsd: add a new struct file caching facility to nfsd")
delphix-devops-bot pushed a commit that referenced this pull request Sep 2, 2021
…quire (#8)

Upstream fix from kernel 5.12
nfsd: Don't keep looking up unhashed files in the nfsd file cache

If a file is unhashed, then we're going to reject it anyway and retry,
so make sure we skip it when we're doing the RCU lockless lookup.
This avoids a number of unnecessary nfserr_jukebox returns from
nfsd_file_acquire()

Fixes: 65294c1 ("nfsd: add a new struct file caching facility to nfsd")
delphix-devops-bot pushed a commit that referenced this pull request Sep 10, 2021
…nces

BugLink: https://bugs.launchpad.net/bugs/1938199

[ Upstream commit ecfbd7b ]

FunctionFS device structure 'struct ffs_dev' and driver data structure
'struct ffs_data' are bound to each other with cross-reference pointers
'ffs_data->private_data' and 'ffs_dev->ffs_data'. While the first one
is supposed to be valid through the whole life of 'struct ffs_data'
(and while 'struct ffs_dev' exists non-freed), the second one is cleared
in 'ffs_closed()' (called from 'ffs_data_reset()' or the last
'ffs_data_put()'). This can be called several times, alternating in
different order with 'ffs_free_inst()', that, if possible, clears
the other cross-reference.

As a result, different cases of these calls order may leave stale
cross-reference pointers, used when the pointed structure is already
freed. Even if it occasionally doesn't cause kernel crash, this error
is reported by KASAN-enabled kernel configuration.

For example, the case [last 'ffs_data_put()' - 'ffs_free_inst()'] was
fixed by commit cdafb6d ("usb: gadget: f_fs: Fix use-after-free in
ffs_free_inst").

The other case ['ffs_data_reset()' - 'ffs_free_inst()' - 'ffs_data_put()']
now causes KASAN reported error [1], when 'ffs_data_reset()' clears
'ffs_dev->ffs_data', then 'ffs_free_inst()' frees the 'struct ffs_dev',
but can't clear 'ffs_data->private_data', which is then accessed
in 'ffs_closed()' called from 'ffs_data_put()'. This happens since
'ffs_dev->ffs_data' reference is cleared too early.

Moreover, one more use case, when 'ffs_free_inst()' is called immediately
after mounting FunctionFS device (that is before the descriptors are
written and 'ffs_ready()' is called), and then 'ffs_data_reset()'
or 'ffs_data_put()' is called from accessing "ep0" file or unmounting
the device. This causes KASAN error report like [2], since
'ffs_dev->ffs_data' is not yet set when 'ffs_free_inst()' can't properly
clear 'ffs_data->private_data', that is later accessed to freed structure.

Fix these (and may be other) cases of stale pointers access by moving
setting and clearing of the mentioned cross-references to the single
places, setting both of them when 'struct ffs_data' is created and
bound to 'struct ffs_dev', and clearing both of them when one of the
structures is destroyed. It seems convenient to make this pointer
initialization and structures binding in 'ffs_acquire_dev()' and
make pointers clearing in 'ffs_release_dev()'. This required some
changes in these functions parameters and return types.

Also, 'ffs_release_dev()' calling requires some cleanup, fixing minor
issues, like (1) 'ffs_release_dev()' is not called if 'ffs_free_inst()'
is called without unmounting the device, and "release_dev" callback
is not called at all, or (2) "release_dev" callback is called before
"ffs_closed" callback on unmounting, which seems to be not correctly
nested with "acquire_dev" and "ffs_ready" callbacks.
Make this cleanup togther with other mentioned 'ffs_release_dev()' changes.

[1]
==================================================================
root@rcar-gen3:~# mkdir /dev/cfs
root@rcar-gen3:~# mkdir /dev/ffs
root@rcar-gen3:~# modprobe libcomposite
root@rcar-gen3:~# mount -t configfs none /dev/cfs
root@rcar-gen3:~# mkdir /dev/cfs/usb_gadget/g1
root@rcar-gen3:~# mkdir /dev/cfs/usb_gadget/g1/functions/ffs.ffs
[   64.340664] file system registered
root@rcar-gen3:~# mount -t functionfs ffs /dev/ffs
root@rcar-gen3:~# cd /dev/ffs
root@rcar-gen3:/dev/ffs# /home/root/ffs-test
ffs-test: info: ep0: writing descriptors (in v2 format)
[   83.181442] read descriptors
[   83.186085] read strings
ffs-test: info: ep0: writing strings
ffs-test: dbg:  ep1: starting
ffs-test: dbg:  ep2: starting
ffs-test: info: ep1: starts
ffs-test: info: ep2: starts
ffs-test: info: ep0: starts

^C
root@rcar-gen3:/dev/ffs# cd /home/root/
root@rcar-gen3:~# rmdir /dev/cfs/usb_gadget/g1/functions/ffs.ffs
[   98.935061] unloading
root@rcar-gen3:~# umount /dev/ffs
[  102.734301] ==================================================================
[  102.742059] BUG: KASAN: use-after-free in ffs_release_dev+0x64/0xa8 [usb_f_fs]
[  102.749683] Write of size 1 at addr ffff0004d46ff549 by task umount/2997
[  102.756709]
[  102.758311] CPU: 0 PID: 2997 Comm: umount Not tainted 5.13.0-rc4+ #8
[  102.764971] Hardware name: Renesas Salvator-X board based on r8a77951 (DT)
[  102.772179] Call trace:
[  102.774779]  dump_backtrace+0x0/0x330
[  102.778653]  show_stack+0x20/0x2c
[  102.782152]  dump_stack+0x11c/0x1ac
[  102.785833]  print_address_description.constprop.0+0x30/0x274
[  102.791862]  kasan_report+0x14c/0x1c8
[  102.795719]  __asan_report_store1_noabort+0x34/0x58
[  102.800840]  ffs_release_dev+0x64/0xa8 [usb_f_fs]
[  102.805801]  ffs_fs_kill_sb+0x50/0x84 [usb_f_fs]
[  102.810663]  deactivate_locked_super+0xa0/0xf0
[  102.815339]  deactivate_super+0x98/0xac
[  102.819378]  cleanup_mnt+0xd0/0x1b0
[  102.823057]  __cleanup_mnt+0x1c/0x28
[  102.826823]  task_work_run+0x104/0x180
[  102.830774]  do_notify_resume+0x458/0x14e0
[  102.835083]  work_pending+0xc/0x5f8
[  102.838762]
[  102.840357] Allocated by task 2988:
[  102.844032]  kasan_save_stack+0x28/0x58
[  102.848071]  kasan_set_track+0x28/0x3c
[  102.852016]  ____kasan_kmalloc+0x84/0x9c
[  102.856142]  __kasan_kmalloc+0x10/0x1c
[  102.860088]  __kmalloc+0x214/0x2f8
[  102.863678]  kzalloc.constprop.0+0x14/0x20 [usb_f_fs]
[  102.868990]  ffs_alloc_inst+0x8c/0x208 [usb_f_fs]
[  102.873942]  try_get_usb_function_instance+0xf0/0x164 [libcomposite]
[  102.880629]  usb_get_function_instance+0x64/0x68 [libcomposite]
[  102.886858]  function_make+0x128/0x1ec [libcomposite]
[  102.892185]  configfs_mkdir+0x330/0x590 [configfs]
[  102.897245]  vfs_mkdir+0x12c/0x1bc
[  102.900835]  do_mkdirat+0x180/0x1d0
[  102.904513]  __arm64_sys_mkdirat+0x80/0x94
[  102.908822]  invoke_syscall+0xf8/0x25c
[  102.912772]  el0_svc_common.constprop.0+0x150/0x1a0
[  102.917891]  do_el0_svc+0xa0/0xd4
[  102.921386]  el0_svc+0x24/0x34
[  102.924613]  el0_sync_handler+0xcc/0x154
[  102.928743]  el0_sync+0x198/0x1c0
[  102.932238]
[  102.933832] Freed by task 2996:
[  102.937144]  kasan_save_stack+0x28/0x58
[  102.941181]  kasan_set_track+0x28/0x3c
[  102.945128]  kasan_set_free_info+0x28/0x4c
[  102.949435]  ____kasan_slab_free+0x104/0x118
[  102.953921]  __kasan_slab_free+0x18/0x24
[  102.958047]  slab_free_freelist_hook+0x148/0x1f0
[  102.962897]  kfree+0x318/0x440
[  102.966123]  ffs_free_inst+0x164/0x2d8 [usb_f_fs]
[  102.971075]  usb_put_function_instance+0x84/0xa4 [libcomposite]
[  102.977302]  ffs_attr_release+0x18/0x24 [usb_f_fs]
[  102.982344]  config_item_put+0x140/0x1a4 [configfs]
[  102.987486]  configfs_rmdir+0x3fc/0x518 [configfs]
[  102.992535]  vfs_rmdir+0x114/0x234
[  102.996122]  do_rmdir+0x274/0x2b0
[  102.999617]  __arm64_sys_unlinkat+0x94/0xc8
[  103.004015]  invoke_syscall+0xf8/0x25c
[  103.007961]  el0_svc_common.constprop.0+0x150/0x1a0
[  103.013080]  do_el0_svc+0xa0/0xd4
[  103.016575]  el0_svc+0x24/0x34
[  103.019801]  el0_sync_handler+0xcc/0x154
[  103.023930]  el0_sync+0x198/0x1c0
[  103.027426]
[  103.029020] The buggy address belongs to the object at ffff0004d46ff500
[  103.029020]  which belongs to the cache kmalloc-128 of size 128
[  103.042079] The buggy address is located 73 bytes inside of
[  103.042079]  128-byte region [ffff0004d46ff500, ffff0004d46ff580)
[  103.054236] The buggy address belongs to the page:
[  103.059262] page:0000000021aa849b refcount:1 mapcount:0 mapping:0000000000000000 index:0xffff0004d46fee00 pfn:0x5146fe
[  103.070437] head:0000000021aa849b order:1 compound_mapcount:0
[  103.076456] flags: 0x8000000000010200(slab|head|zone=2)
[  103.081948] raw: 8000000000010200 fffffc0013521a80 0000000d0000000d ffff0004c0002300
[  103.090052] raw: ffff0004d46fee00 000000008020001e 00000001ffffffff 0000000000000000
[  103.098150] page dumped because: kasan: bad access detected
[  103.103985]
[  103.105578] Memory state around the buggy address:
[  103.110602]  ffff0004d46ff400: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[  103.118161]  ffff0004d46ff480: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[  103.125726] >ffff0004d46ff500: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[  103.133284]                                               ^
[  103.139120]  ffff0004d46ff580: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[  103.146679]  ffff0004d46ff600: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[  103.154238] ==================================================================
[  103.161792] Disabling lock debugging due to kernel taint
[  103.167319] Unable to handle kernel paging request at virtual address 0037801d6000018e
[  103.175406] Mem abort info:
[  103.178457]   ESR = 0x96000004
[  103.181609]   EC = 0x25: DABT (current EL), IL = 32 bits
[  103.187020]   SET = 0, FnV = 0
[  103.190185]   EA = 0, S1PTW = 0
[  103.193417] Data abort info:
[  103.196385]   ISV = 0, ISS = 0x00000004
[  103.200315]   CM = 0, WnR = 0
[  103.203366] [0037801d6000018e] address between user and kernel address ranges
[  103.210611] Internal error: Oops: 96000004 [#1] PREEMPT SMP
[  103.216231] Modules linked in: usb_f_fs libcomposite configfs ath9k_htc led_class mac80211 libarc4 ath9k_common ath9k_hw ath cfg80211 aes_ce_blk sata_rc4
[  103.259233] CPU: 0 PID: 2997 Comm: umount Tainted: G    B             5.13.0-rc4+ #8
[  103.267031] Hardware name: Renesas Salvator-X board based on r8a77951 (DT)
[  103.273951] pstate: 00000005 (nzcv daif -PAN -UAO -TCO BTYPE=--)
[  103.280001] pc : ffs_data_clear+0x138/0x370 [usb_f_fs]
[  103.285197] lr : ffs_data_clear+0x124/0x370 [usb_f_fs]
[  103.290385] sp : ffff800014777a80
[  103.293725] x29: ffff800014777a80 x28: ffff0004d7649c80 x27: 0000000000000000
[  103.300931] x26: ffff800014777fb0 x25: ffff60009aec9394 x24: ffff0004d7649ca4
[  103.308136] x23: 1fffe0009a3d063a x22: dfff800000000000 x21: ffff0004d1e831d0
[  103.315340] x20: e1c000eb00000bb4 x19: ffff0004d1e83000 x18: 0000000000000000
[  103.322545] x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000000
[  103.329748] x14: 0720072007200720 x13: 0720072007200720 x12: 1ffff000012ef658
[  103.336952] x11: ffff7000012ef658 x10: 0720072007200720 x9 : ffff800011322648
[  103.344157] x8 : ffff800014777818 x7 : ffff80000977b2c7 x6 : 0000000000000000
[  103.351359] x5 : 0000000000000001 x4 : ffff7000012ef659 x3 : 0000000000000001
[  103.358562] x2 : 0000000000000000 x1 : 1c38001d6000018e x0 : e1c000eb00000c70
[  103.365766] Call trace:
[  103.368235]  ffs_data_clear+0x138/0x370 [usb_f_fs]
[  103.373076]  ffs_data_reset+0x20/0x304 [usb_f_fs]
[  103.377829]  ffs_data_closed+0x1ec/0x244 [usb_f_fs]
[  103.382755]  ffs_fs_kill_sb+0x70/0x84 [usb_f_fs]
[  103.387420]  deactivate_locked_super+0xa0/0xf0
[  103.391905]  deactivate_super+0x98/0xac
[  103.395776]  cleanup_mnt+0xd0/0x1b0
[  103.399299]  __cleanup_mnt+0x1c/0x28
[  103.402906]  task_work_run+0x104/0x180
[  103.406691]  do_notify_resume+0x458/0x14e0
[  103.410823]  work_pending+0xc/0x5f8
[  103.414351] Code: b4000a54 9102f280 12000802 d343fc01 (38f66821)
[  103.420490] ---[ end trace 57b43a50e8244f57 ]---
Segmentation fault
root@rcar-gen3:~#
==================================================================

[2]
==================================================================
root@rcar-gen3:~# mkdir /dev/ffs
root@rcar-gen3:~# modprobe libcomposite
root@rcar-gen3:~#
root@rcar-gen3:~# mount -t configfs none /dev/cfs
root@rcar-gen3:~# mkdir /dev/cfs/usb_gadget/g1
root@rcar-gen3:~# mkdir /dev/cfs/usb_gadget/g1/functions/ffs.ffs
[   54.766480] file system registered
root@rcar-gen3:~# mount -t functionfs ffs /dev/ffs
root@rcar-gen3:~# rmdir /dev/cfs/usb_gadget/g1/functions/ffs.ffs
[   63.197597] unloading
root@rcar-gen3:~# cat /dev/ffs/ep0
cat: read error:[   67.213506] ==================================================================
[   67.222095] BUG: KASAN: use-after-free in ffs_data_clear+0x70/0x370 [usb_f_fs]
[   67.229699] Write of size 1 at addr ffff0004c26e974a by task cat/2994
[   67.236446]
[   67.238045] CPU: 0 PID: 2994 Comm: cat Not tainted 5.13.0-rc4+ #8
[   67.244431] Hardware name: Renesas Salvator-X board based on r8a77951 (DT)
[   67.251624] Call trace:
[   67.254212]  dump_backtrace+0x0/0x330
[   67.258081]  show_stack+0x20/0x2c
[   67.261579]  dump_stack+0x11c/0x1ac
[   67.265260]  print_address_description.constprop.0+0x30/0x274
[   67.271286]  kasan_report+0x14c/0x1c8
[   67.275143]  __asan_report_store1_noabort+0x34/0x58
[   67.280265]  ffs_data_clear+0x70/0x370 [usb_f_fs]
[   67.285220]  ffs_data_reset+0x20/0x304 [usb_f_fs]
[   67.290172]  ffs_data_closed+0x240/0x244 [usb_f_fs]
[   67.295305]  ffs_ep0_release+0x40/0x54 [usb_f_fs]
[   67.300256]  __fput+0x304/0x580
[   67.303576]  ____fput+0x18/0x24
[   67.306893]  task_work_run+0x104/0x180
[   67.310846]  do_notify_resume+0x458/0x14e0
[   67.315154]  work_pending+0xc/0x5f8
[   67.318834]
[   67.320429] Allocated by task 2988:
[   67.324105]  kasan_save_stack+0x28/0x58
[   67.328144]  kasan_set_track+0x28/0x3c
[   67.332090]  ____kasan_kmalloc+0x84/0x9c
[   67.336217]  __kasan_kmalloc+0x10/0x1c
[   67.340163]  __kmalloc+0x214/0x2f8
[   67.343754]  kzalloc.constprop.0+0x14/0x20 [usb_f_fs]
[   67.349066]  ffs_alloc_inst+0x8c/0x208 [usb_f_fs]
[   67.354017]  try_get_usb_function_instance+0xf0/0x164 [libcomposite]
[   67.360705]  usb_get_function_instance+0x64/0x68 [libcomposite]
[   67.366934]  function_make+0x128/0x1ec [libcomposite]
[   67.372260]  configfs_mkdir+0x330/0x590 [configfs]
[   67.377320]  vfs_mkdir+0x12c/0x1bc
[   67.380911]  do_mkdirat+0x180/0x1d0
[   67.384589]  __arm64_sys_mkdirat+0x80/0x94
[   67.388899]  invoke_syscall+0xf8/0x25c
[   67.392850]  el0_svc_common.constprop.0+0x150/0x1a0
[   67.397969]  do_el0_svc+0xa0/0xd4
[   67.401464]  el0_svc+0x24/0x34
[   67.404691]  el0_sync_handler+0xcc/0x154
[   67.408819]  el0_sync+0x198/0x1c0
[   67.412315]
[   67.413909] Freed by task 2993:
[   67.417220]  kasan_save_stack+0x28/0x58
[   67.421257]  kasan_set_track+0x28/0x3c
[   67.425204]  kasan_set_free_info+0x28/0x4c
[   67.429513]  ____kasan_slab_free+0x104/0x118
[   67.434001]  __kasan_slab_free+0x18/0x24
[   67.438128]  slab_free_freelist_hook+0x148/0x1f0
[   67.442978]  kfree+0x318/0x440
[   67.446205]  ffs_free_inst+0x164/0x2d8 [usb_f_fs]
[   67.451156]  usb_put_function_instance+0x84/0xa4 [libcomposite]
[   67.457385]  ffs_attr_release+0x18/0x24 [usb_f_fs]
[   67.462428]  config_item_put+0x140/0x1a4 [configfs]
[   67.467570]  configfs_rmdir+0x3fc/0x518 [configfs]
[   67.472626]  vfs_rmdir+0x114/0x234
[   67.476215]  do_rmdir+0x274/0x2b0
[   67.479710]  __arm64_sys_unlinkat+0x94/0xc8
[   67.484108]  invoke_syscall+0xf8/0x25c
[   67.488055]  el0_svc_common.constprop.0+0x150/0x1a0
[   67.493175]  do_el0_svc+0xa0/0xd4
[   67.496671]  el0_svc+0x24/0x34
[   67.499896]  el0_sync_handler+0xcc/0x154
[   67.504024]  el0_sync+0x198/0x1c0
[   67.507520]
[   67.509114] The buggy address belongs to the object at ffff0004c26e9700
[   67.509114]  which belongs to the cache kmalloc-128 of size 128
[   67.522171] The buggy address is located 74 bytes inside of
[   67.522171]  128-byte region [ffff0004c26e9700, ffff0004c26e9780)
[   67.534328] The buggy address belongs to the page:
[   67.539355] page:000000003177a217 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x5026e8
[   67.549175] head:000000003177a217 order:1 compound_mapcount:0
[   67.555195] flags: 0x8000000000010200(slab|head|zone=2)
[   67.560687] raw: 8000000000010200 fffffc0013037100 0000000c00000002 ffff0004c0002300
[   67.568791] raw: 0000000000000000 0000000080200020 00000001ffffffff 0000000000000000
[   67.576890] page dumped because: kasan: bad access detected
[   67.582725]
[   67.584318] Memory state around the buggy address:
[   67.589343]  ffff0004c26e9600: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[   67.596903]  ffff0004c26e9680: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[   67.604463] >ffff0004c26e9700: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[   67.612022]                                               ^
[   67.617860]  ffff0004c26e9780: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[   67.625421]  ffff0004c26e9800: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[   67.632981] ==================================================================
[   67.640535] Disabling lock debugging due to kernel taint
 File descriptor[   67.646100] Unable to handle kernel paging request at virtual address fabb801d4000018d
 in bad state
[   67.655456] Mem abort info:
[   67.659619]   ESR = 0x96000004
[   67.662801]   EC = 0x25: DABT (current EL), IL = 32 bits
[   67.668225]   SET = 0, FnV = 0
[   67.671375]   EA = 0, S1PTW = 0
[   67.674613] Data abort info:
[   67.677587]   ISV = 0, ISS = 0x00000004
[   67.681522]   CM = 0, WnR = 0
[   67.684588] [fabb801d4000018d] address between user and kernel address ranges
[   67.691849] Internal error: Oops: 96000004 [#1] PREEMPT SMP
[   67.697470] Modules linked in: usb_f_fs libcomposite configfs ath9k_htc led_class mac80211 libarc4 ath9k_common ath9k_hw ath cfg80211 aes_ce_blk crypto_simd cryptd aes_ce_cipher ghash_ce gf128mul sha2_ce sha1_ce evdev sata_rcar libata xhci_plat_hcd scsi_mod xhci_hcd rene4
[   67.740467] CPU: 0 PID: 2994 Comm: cat Tainted: G    B             5.13.0-rc4+ #8
[   67.748005] Hardware name: Renesas Salvator-X board based on r8a77951 (DT)
[   67.754924] pstate: 00000005 (nzcv daif -PAN -UAO -TCO BTYPE=--)
[   67.760974] pc : ffs_data_clear+0x138/0x370 [usb_f_fs]
[   67.766178] lr : ffs_data_clear+0x124/0x370 [usb_f_fs]
[   67.771365] sp : ffff800014767ad0
[   67.774706] x29: ffff800014767ad0 x28: ffff800009cf91c0 x27: ffff0004c54861a0
[   67.781913] x26: ffff0004dc90b288 x25: 1fffe00099ec10f5 x24: 00000000000a801d
[   67.789118] x23: 1fffe00099f6953a x22: dfff800000000000 x21: ffff0004cfb4a9d0
[   67.796322] x20: d5e000ea00000bb1 x19: ffff0004cfb4a800 x18: 0000000000000000
[   67.803526] x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000000
[   67.810730] x14: 0720072007200720 x13: 0720072007200720 x12: 1ffff000028ecefa
[   67.817934] x11: ffff7000028ecefa x10: 0720072007200720 x9 : ffff80001132c014
[   67.825137] x8 : ffff8000147677d8 x7 : ffff8000147677d7 x6 : 0000000000000000
[   67.832341] x5 : 0000000000000001 x4 : ffff7000028ecefb x3 : 0000000000000001
[   67.839544] x2 : 0000000000000005 x1 : 1abc001d4000018d x0 : d5e000ea00000c6d
[   67.846748] Call trace:
[   67.849218]  ffs_data_clear+0x138/0x370 [usb_f_fs]
[   67.854058]  ffs_data_reset+0x20/0x304 [usb_f_fs]
[   67.858810]  ffs_data_closed+0x240/0x244 [usb_f_fs]
[   67.863736]  ffs_ep0_release+0x40/0x54 [usb_f_fs]
[   67.868488]  __fput+0x304/0x580
[   67.871665]  ____fput+0x18/0x24
[   67.874837]  task_work_run+0x104/0x180
[   67.878622]  do_notify_resume+0x458/0x14e0
[   67.882754]  work_pending+0xc/0x5f8
[   67.886282] Code: b4000a54 9102f280 12000802 d343fc01 (38f66821)
[   67.892422] ---[ end trace 6d7cedf53d7abbea ]---
Segmentation fault
root@rcar-gen3:~#
==================================================================

Fixes: 4b187fc ("usb: gadget: FunctionFS: add devices management code")
Fixes: 3262ad8 ("usb: gadget: f_fs: Stop ffs_closed NULL pointer dereference")
Fixes: cdafb6d ("usb: gadget: f_fs: Fix use-after-free in ffs_free_inst")
Reported-by: Bhuvanesh Surachari <bhuvanesh_surachari@mentor.com>
Tested-by: Eugeniu Rosca <erosca@de.adit-jv.com>
Reviewed-by: Eugeniu Rosca <erosca@de.adit-jv.com>
Signed-off-by: Andrew Gabbasov <andrew_gabbasov@mentor.com>
Link: https://lore.kernel.org/r/20210603171507.22514-1-andrew_gabbasov@mentor.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
delphix-devops-bot pushed a commit that referenced this pull request Sep 10, 2021
…quire (#8)

Upstream fix from kernel 5.12
nfsd: Don't keep looking up unhashed files in the nfsd file cache

If a file is unhashed, then we're going to reject it anyway and retry,
so make sure we skip it when we're doing the RCU lockless lookup.
This avoids a number of unnecessary nfserr_jukebox returns from
nfsd_file_acquire()

Fixes: 65294c1 ("nfsd: add a new struct file caching facility to nfsd")
delphix-devops-bot pushed a commit that referenced this pull request Sep 18, 2021
…quire (#8)

Upstream fix from kernel 5.12
nfsd: Don't keep looking up unhashed files in the nfsd file cache

If a file is unhashed, then we're going to reject it anyway and retry,
so make sure we skip it when we're doing the RCU lockless lookup.
This avoids a number of unnecessary nfserr_jukebox returns from
nfsd_file_acquire()

Fixes: 65294c1 ("nfsd: add a new struct file caching facility to nfsd")
delphix-devops-bot pushed a commit that referenced this pull request Sep 30, 2021
…quire (#8)

Upstream fix from kernel 5.12
nfsd: Don't keep looking up unhashed files in the nfsd file cache

If a file is unhashed, then we're going to reject it anyway and retry,
so make sure we skip it when we're doing the RCU lockless lookup.
This avoids a number of unnecessary nfserr_jukebox returns from
nfsd_file_acquire()

Fixes: 65294c1 ("nfsd: add a new struct file caching facility to nfsd")
delphix-devops-bot pushed a commit that referenced this pull request Oct 13, 2021
…quire (#8)

Upstream fix from kernel 5.12
nfsd: Don't keep looking up unhashed files in the nfsd file cache

If a file is unhashed, then we're going to reject it anyway and retry,
so make sure we skip it when we're doing the RCU lockless lookup.
This avoids a number of unnecessary nfserr_jukebox returns from
nfsd_file_acquire()

Fixes: 65294c1 ("nfsd: add a new struct file caching facility to nfsd")
delphix-devops-bot pushed a commit that referenced this pull request Oct 16, 2021
…quire (#8)

Upstream fix from kernel 5.12
nfsd: Don't keep looking up unhashed files in the nfsd file cache

If a file is unhashed, then we're going to reject it anyway and retry,
so make sure we skip it when we're doing the RCU lockless lookup.
This avoids a number of unnecessary nfserr_jukebox returns from
nfsd_file_acquire()

Fixes: 65294c1 ("nfsd: add a new struct file caching facility to nfsd")
delphix-devops-bot pushed a commit that referenced this pull request Oct 17, 2021
…quire (#8)

Upstream fix from kernel 5.12
nfsd: Don't keep looking up unhashed files in the nfsd file cache

If a file is unhashed, then we're going to reject it anyway and retry,
so make sure we skip it when we're doing the RCU lockless lookup.
This avoids a number of unnecessary nfserr_jukebox returns from
nfsd_file_acquire()

Fixes: 65294c1 ("nfsd: add a new struct file caching facility to nfsd")
delphix-devops-bot pushed a commit that referenced this pull request Oct 18, 2021
…quire (#8)

Upstream fix from kernel 5.12
nfsd: Don't keep looking up unhashed files in the nfsd file cache

If a file is unhashed, then we're going to reject it anyway and retry,
so make sure we skip it when we're doing the RCU lockless lookup.
This avoids a number of unnecessary nfserr_jukebox returns from
nfsd_file_acquire()

Fixes: 65294c1 ("nfsd: add a new struct file caching facility to nfsd")
delphix-devops-bot pushed a commit that referenced this pull request Oct 19, 2021
…quire (#8)

Upstream fix from kernel 5.12
nfsd: Don't keep looking up unhashed files in the nfsd file cache

If a file is unhashed, then we're going to reject it anyway and retry,
so make sure we skip it when we're doing the RCU lockless lookup.
This avoids a number of unnecessary nfserr_jukebox returns from
nfsd_file_acquire()

Fixes: 65294c1 ("nfsd: add a new struct file caching facility to nfsd")
delphix-devops-bot pushed a commit that referenced this pull request Oct 24, 2021
BugLink: https://bugs.launchpad.net/bugs/1943484

commit 4d14c5c upstream

Calling btrfs_qgroup_reserve_meta_prealloc from
btrfs_delayed_inode_reserve_metadata can result in flushing delalloc
while holding a transaction and delayed node locks. This is deadlock
prone. In the past multiple commits:

 * ae5e070 ("btrfs: qgroup: don't try to wait flushing if we're
already holding a transaction")

 * 6f23277 ("btrfs: qgroup: don't commit transaction when we already
 hold the handle")

Tried to solve various aspects of this but this was always a
whack-a-mole game. Unfortunately those 2 fixes don't solve a deadlock
scenario involving btrfs_delayed_node::mutex. Namely, one thread
can call btrfs_dirty_inode as a result of reading a file and modifying
its atime:

  PID: 6963   TASK: ffff8c7f3f94c000  CPU: 2   COMMAND: "test"
  #0  __schedule at ffffffffa529e07d
  #1  schedule at ffffffffa529e4ff
  #2  schedule_timeout at ffffffffa52a1bdd
  #3  wait_for_completion at ffffffffa529eeea             <-- sleeps with delayed node mutex held
  #4  start_delalloc_inodes at ffffffffc0380db5
  #5  btrfs_start_delalloc_snapshot at ffffffffc0393836
  #6  try_flush_qgroup at ffffffffc03f04b2
  #7  __btrfs_qgroup_reserve_meta at ffffffffc03f5bb6     <-- tries to reserve space and starts delalloc inodes.
  #8  btrfs_delayed_update_inode at ffffffffc03e31aa      <-- acquires delayed node mutex
  #9  btrfs_update_inode at ffffffffc0385ba8
 #10  btrfs_dirty_inode at ffffffffc038627b               <-- TRANSACTIION OPENED
 #11  touch_atime at ffffffffa4cf0000
 #12  generic_file_read_iter at ffffffffa4c1f123
 #13  new_sync_read at ffffffffa4ccdc8a
 #14  vfs_read at ffffffffa4cd0849
 #15  ksys_read at ffffffffa4cd0bd1
 #16  do_syscall_64 at ffffffffa4a052eb
 #17  entry_SYSCALL_64_after_hwframe at ffffffffa540008c

This will cause an asynchronous work to flush the delalloc inodes to
happen which can try to acquire the same delayed_node mutex:

  PID: 455    TASK: ffff8c8085fa4000  CPU: 5   COMMAND: "kworker/u16:30"
  #0  __schedule at ffffffffa529e07d
  #1  schedule at ffffffffa529e4ff
  #2  schedule_preempt_disabled at ffffffffa529e80a
  #3  __mutex_lock at ffffffffa529fdcb                    <-- goes to sleep, never wakes up.
  #4  btrfs_delayed_update_inode at ffffffffc03e3143      <-- tries to acquire the mutex
  #5  btrfs_update_inode at ffffffffc0385ba8              <-- this is the same inode that pid 6963 is holding
  #6  cow_file_range_inline.constprop.78 at ffffffffc0386be7
  #7  cow_file_range at ffffffffc03879c1
  #8  btrfs_run_delalloc_range at ffffffffc038894c
  #9  writepage_delalloc at ffffffffc03a3c8f
 #10  __extent_writepage at ffffffffc03a4c01
 #11  extent_write_cache_pages at ffffffffc03a500b
 #12  extent_writepages at ffffffffc03a6de2
 #13  do_writepages at ffffffffa4c277eb
 #14  __filemap_fdatawrite_range at ffffffffa4c1e5bb
 #15  btrfs_run_delalloc_work at ffffffffc0380987         <-- starts running delayed nodes
 #16  normal_work_helper at ffffffffc03b706c
 #17  process_one_work at ffffffffa4aba4e4
 #18  worker_thread at ffffffffa4aba6fd
 #19  kthread at ffffffffa4ac0a3d
 #20  ret_from_fork at ffffffffa54001ff

To fully address those cases the complete fix is to never issue any
flushing while holding the transaction or the delayed node lock. This
patch achieves it by calling qgroup_reserve_meta directly which will
either succeed without flushing or will fail and return -EDQUOT. In the
latter case that return value is going to be propagated to
btrfs_dirty_inode which will fallback to start a new transaction. That's
fine as the majority of time we expect the inode will have
BTRFS_DELAYED_NODE_INODE_DIRTY flag set which will result in directly
copying the in-memory state.

Fixes: c53e965 ("btrfs: qgroup: try to flush qgroup space when we get -EDQUOT")
CC: stable@vger.kernel.org # 5.10+
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Luke Nowakowski-Krijger <luke.nowakowskikrijger@canonical.com>
delphix-devops-bot pushed a commit that referenced this pull request Dec 15, 2022
…quire (#8)

Upstream fix from kernel 5.12
nfsd: Don't keep looking up unhashed files in the nfsd file cache

If a file is unhashed, then we're going to reject it anyway and retry,
so make sure we skip it when we're doing the RCU lockless lookup.
This avoids a number of unnecessary nfserr_jukebox returns from
nfsd_file_acquire()

Fixes: 65294c1 ("nfsd: add a new struct file caching facility to nfsd")
delphix-devops-bot pushed a commit that referenced this pull request Jan 11, 2023
BugLink: https://bugs.launchpad.net/bugs/1992211

[ Upstream commit 84a5358 ]

The SRv6 layer allows defining HMAC data that can later be used to sign IPv6
Segment Routing Headers. This configuration is realised via netlink through
four attributes: SEG6_ATTR_HMACKEYID, SEG6_ATTR_SECRET, SEG6_ATTR_SECRETLEN and
SEG6_ATTR_ALGID. Because the SECRETLEN attribute is decoupled from the actual
length of the SECRET attribute, it is possible to provide invalid combinations
(e.g., secret = "", secretlen = 64). This case is not checked in the code and
with an appropriately crafted netlink message, an out-of-bounds read of up
to 64 bytes (max secret length) can occur past the skb end pointer and into
skb_shared_info:

Breakpoint 1, seg6_genl_sethmac (skb=<optimized out>, info=<optimized out>) at net/ipv6/seg6.c:208
208		memcpy(hinfo->secret, secret, slen);
(gdb) bt
 #0  seg6_genl_sethmac (skb=<optimized out>, info=<optimized out>) at net/ipv6/seg6.c:208
 #1  0xffffffff81e012e9 in genl_family_rcv_msg_doit (skb=skb@entry=0xffff88800b1f9f00, nlh=nlh@entry=0xffff88800b1b7600,
    extack=extack@entry=0xffffc90000ba7af0, ops=ops@entry=0xffffc90000ba7a80, hdrlen=4, net=0xffffffff84237580 <init_net>, family=<optimized out>,
    family=<optimized out>) at net/netlink/genetlink.c:731
 #2  0xffffffff81e01435 in genl_family_rcv_msg (extack=0xffffc90000ba7af0, nlh=0xffff88800b1b7600, skb=0xffff88800b1f9f00,
    family=0xffffffff82fef6c0 <seg6_genl_family>) at net/netlink/genetlink.c:775
 #3  genl_rcv_msg (skb=0xffff88800b1f9f00, nlh=0xffff88800b1b7600, extack=0xffffc90000ba7af0) at net/netlink/genetlink.c:792
 #4  0xffffffff81dfffc3 in netlink_rcv_skb (skb=skb@entry=0xffff88800b1f9f00, cb=cb@entry=0xffffffff81e01350 <genl_rcv_msg>)
    at net/netlink/af_netlink.c:2501
 #5  0xffffffff81e00919 in genl_rcv (skb=0xffff88800b1f9f00) at net/netlink/genetlink.c:803
 #6  0xffffffff81dff6ae in netlink_unicast_kernel (ssk=0xffff888010eec800, skb=0xffff88800b1f9f00, sk=0xffff888004aed000)
    at net/netlink/af_netlink.c:1319
 #7  netlink_unicast (ssk=ssk@entry=0xffff888010eec800, skb=skb@entry=0xffff88800b1f9f00, portid=portid@entry=0, nonblock=<optimized out>)
    at net/netlink/af_netlink.c:1345
 #8  0xffffffff81dff9a4 in netlink_sendmsg (sock=<optimized out>, msg=0xffffc90000ba7e48, len=<optimized out>) at net/netlink/af_netlink.c:1921
...
(gdb) p/x ((struct sk_buff *)0xffff88800b1f9f00)->head + ((struct sk_buff *)0xffff88800b1f9f00)->end
$1 = 0xffff88800b1b76c0
(gdb) p/x secret
$2 = 0xffff88800b1b76c0
(gdb) p slen
$3 = 64 '@'

The OOB data can then be read back from userspace by dumping HMAC state. This
commit fixes this by ensuring SECRETLEN cannot exceed the actual length of
SECRET.

Reported-by: Lucas Leong <wmliang.tw@gmail.com>
Tested: verified that EINVAL is correctly returned when secretlen > len(secret)
Fixes: 4f4853d ("ipv6: sr: implement API to control SR HMAC structure")
Signed-off-by: David Lebrun <dlebrun@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
delphix-devops-bot pushed a commit that referenced this pull request Jan 11, 2023
…quire (#8)

Upstream fix from kernel 5.12
nfsd: Don't keep looking up unhashed files in the nfsd file cache

If a file is unhashed, then we're going to reject it anyway and retry,
so make sure we skip it when we're doing the RCU lockless lookup.
This avoids a number of unnecessary nfserr_jukebox returns from
nfsd_file_acquire()

Fixes: 65294c1 ("nfsd: add a new struct file caching facility to nfsd")
delphix-devops-bot pushed a commit that referenced this pull request Feb 2, 2023
…quire (#8)

Upstream fix from kernel 5.12
nfsd: Don't keep looking up unhashed files in the nfsd file cache

If a file is unhashed, then we're going to reject it anyway and retry,
so make sure we skip it when we're doing the RCU lockless lookup.
This avoids a number of unnecessary nfserr_jukebox returns from
nfsd_file_acquire()

Fixes: 65294c1 ("nfsd: add a new struct file caching facility to nfsd")
delphix-devops-bot pushed a commit that referenced this pull request Feb 10, 2023
…quire (#8)

Upstream fix from kernel 5.12
nfsd: Don't keep looking up unhashed files in the nfsd file cache

If a file is unhashed, then we're going to reject it anyway and retry,
so make sure we skip it when we're doing the RCU lockless lookup.
This avoids a number of unnecessary nfserr_jukebox returns from
nfsd_file_acquire()

Fixes: 65294c1 ("nfsd: add a new struct file caching facility to nfsd")
delphix-devops-bot pushed a commit that referenced this pull request Mar 4, 2023
BugLink: https://bugs.launchpad.net/bugs/2002347

[ Upstream commit 8c76310 ]

In the error path in ata_tport_add(), when calling put_device(),
ata_tport_release() is called, it will put the refcount of 'ap->host'.

And then ata_host_put() is called again, the refcount is decreased
to 0, ata_host_release() is called, all ports are freed and set to
null.

When unbinding the device after failure, ata_host_stop() is called
to release the resources, it leads a null-ptr-deref(), because all
the ports all freed and null.

Unable to handle kernel NULL pointer dereference at virtual address 0000000000000008
CPU: 7 PID: 18671 Comm: modprobe Kdump: loaded Tainted: G            E      6.1.0-rc3+ #8
pstate: 80400009 (Nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc : ata_host_stop+0x3c/0x84 [libata]
lr : release_nodes+0x64/0xd0
Call trace:
 ata_host_stop+0x3c/0x84 [libata]
 release_nodes+0x64/0xd0
 devres_release_all+0xbc/0x1b0
 device_unbind_cleanup+0x20/0x70
 really_probe+0x158/0x320
 __driver_probe_device+0x84/0x120
 driver_probe_device+0x44/0x120
 __driver_attach+0xb4/0x220
 bus_for_each_dev+0x78/0xdc
 driver_attach+0x2c/0x40
 bus_add_driver+0x184/0x240
 driver_register+0x80/0x13c
 __pci_register_driver+0x4c/0x60
 ahci_pci_driver_init+0x30/0x1000 [ahci]

Fix this by removing redundant ata_host_put() in the error path.

Fixes: 2623c7a ("libata: add refcounting to ata_host")
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
delphix-devops-bot pushed a commit that referenced this pull request Mar 4, 2023
…g the sock

BugLink: https://bugs.launchpad.net/bugs/2003914

[ Upstream commit 3cf7203 ]

There is a race condition in vxlan that when deleting a vxlan device
during receiving packets, there is a possibility that the sock is
released after getting vxlan_sock vs from sk_user_data. Then in
later vxlan_ecn_decapsulate(), vxlan_get_sk_family() we will got
NULL pointer dereference. e.g.

   #0 [ffffa25ec6978a38] machine_kexec at ffffffff8c669757
   #1 [ffffa25ec6978a90] __crash_kexec at ffffffff8c7c0a4d
   #2 [ffffa25ec6978b58] crash_kexec at ffffffff8c7c1c48
   #3 [ffffa25ec6978b60] oops_end at ffffffff8c627f2b
   #4 [ffffa25ec6978b80] page_fault_oops at ffffffff8c678fcb
   #5 [ffffa25ec6978bd8] exc_page_fault at ffffffff8d109542
   #6 [ffffa25ec6978c00] asm_exc_page_fault at ffffffff8d200b62
      [exception RIP: vxlan_ecn_decapsulate+0x3b]
      RIP: ffffffffc1014e7b  RSP: ffffa25ec6978cb0  RFLAGS: 00010246
      RAX: 0000000000000008  RBX: ffff8aa000888000  RCX: 0000000000000000
      RDX: 000000000000000e  RSI: ffff8a9fc7ab803e  RDI: ffff8a9fd1168700
      RBP: ffff8a9fc7ab803e   R8: 0000000000700000   R9: 00000000000010ae
      R10: ffff8a9fcb748980  R11: 0000000000000000  R12: ffff8a9fd1168700
      R13: ffff8aa000888000  R14: 00000000002a0000  R15: 00000000000010ae
      ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
   #7 [ffffa25ec6978ce8] vxlan_rcv at ffffffffc10189cd [vxlan]
   #8 [ffffa25ec6978d90] udp_queue_rcv_one_skb at ffffffff8cfb6507
   #9 [ffffa25ec6978dc0] udp_unicast_rcv_skb at ffffffff8cfb6e45
  #10 [ffffa25ec6978dc8] __udp4_lib_rcv at ffffffff8cfb8807
  #11 [ffffa25ec6978e20] ip_protocol_deliver_rcu at ffffffff8cf76951
  #12 [ffffa25ec6978e48] ip_local_deliver at ffffffff8cf76bde
  #13 [ffffa25ec6978ea0] __netif_receive_skb_one_core at ffffffff8cecde9b
  #14 [ffffa25ec6978ec8] process_backlog at ffffffff8cece139
  #15 [ffffa25ec6978f00] __napi_poll at ffffffff8ceced1a
  #16 [ffffa25ec6978f28] net_rx_action at ffffffff8cecf1f3
  #17 [ffffa25ec6978fa0] __softirqentry_text_start at ffffffff8d4000ca
  #18 [ffffa25ec6978ff0] do_softirq at ffffffff8c6fbdc3

Reproducer: https://github.com/Mellanox/ovs-tests/blob/master/test-ovs-vxlan-remove-tunnel-during-traffic.sh

Fix this by waiting for all sk_user_data reader to finish before
releasing the sock.

Reported-by: Jianlin Shi <jishi@redhat.com>
Suggested-by: Jakub Sitnicki <jakub@cloudflare.com>
Fixes: 6a93cc9 ("udp-tunnel: Add a few more UDP tunnel APIs")
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
delphix-devops-bot pushed a commit that referenced this pull request Mar 4, 2023
BugLink: https://bugs.launchpad.net/bugs/2003914

[ Upstream commit b18cba0 ]

Commit 9130b8d ("SUNRPC: allow for upcalls for the same uid
but different gss service") introduced `auth` argument to
__gss_find_upcall(), but in gss_pipe_downcall() it was left as NULL
since it (and auth->service) was not (yet) determined.

When multiple upcalls with the same uid and different service are
ongoing, it could happen that __gss_find_upcall(), which returns the
first match found in the pipe->in_downcall list, could not find the
correct gss_msg corresponding to the downcall we are looking for.
Moreover, it might return a msg which is not sent to rpc.gssd yet.

We could see mount.nfs process hung in D state with multiple mount.nfs
are executed in parallel.  The call trace below is of CentOS 7.9
kernel-3.10.0-1160.24.1.el7.x86_64 but we observed the same hang w/
elrepo kernel-ml-6.0.7-1.el7.

PID: 71258  TASK: ffff91ebd4be0000  CPU: 36  COMMAND: "mount.nfs"
 #0 [ffff9203ca3234f8] __schedule at ffffffffa3b8899f
 #1 [ffff9203ca323580] schedule at ffffffffa3b88eb9
 #2 [ffff9203ca323590] gss_cred_init at ffffffffc0355818 [auth_rpcgss]
 #3 [ffff9203ca323658] rpcauth_lookup_credcache at ffffffffc0421ebc
[sunrpc]
 #4 [ffff9203ca3236d8] gss_lookup_cred at ffffffffc0353633 [auth_rpcgss]
 #5 [ffff9203ca3236e8] rpcauth_lookupcred at ffffffffc0421581 [sunrpc]
 #6 [ffff9203ca323740] rpcauth_refreshcred at ffffffffc04223d3 [sunrpc]
 #7 [ffff9203ca3237a0] call_refresh at ffffffffc04103dc [sunrpc]
 #8 [ffff9203ca3237b8] __rpc_execute at ffffffffc041e1c9 [sunrpc]
 #9 [ffff9203ca323820] rpc_execute at ffffffffc0420a48 [sunrpc]

The scenario is like this. Let's say there are two upcalls for
services A and B, A -> B in pipe->in_downcall, B -> A in pipe->pipe.

When rpc.gssd reads pipe to get the upcall msg corresponding to
service B from pipe->pipe and then writes the response, in
gss_pipe_downcall the msg corresponding to service A will be picked
because only uid is used to find the msg and it is before the one for
B in pipe->in_downcall.  And the process waiting for the msg
corresponding to service A will be woken up.

Actual scheduing of that process might be after rpc.gssd processes the
next msg.  In rpc_pipe_generic_upcall it clears msg->errno (for A).
The process is scheduled to see gss_msg->ctx == NULL and
gss_msg->msg.errno == 0, therefore it cannot break the loop in
gss_create_upcall and is never woken up after that.

This patch adds a simple check to ensure that a msg which is not
sent to rpc.gssd yet is not chosen as the matching upcall upon
receiving a downcall.

Signed-off-by: minoura makoto <minoura@valinux.co.jp>
Signed-off-by: Hiroshi Shimamoto <h-shimamoto@nec.com>
Tested-by: Hiroshi Shimamoto <h-shimamoto@nec.com>
Cc: Trond Myklebust <trondmy@hammerspace.com>
Fixes: 9130b8d ("SUNRPC: allow for upcalls for same uid but different gss service")
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
delphix-devops-bot pushed a commit that referenced this pull request Mar 4, 2023
BugLink: https://bugs.launchpad.net/bugs/2003914

[ Upstream commit 031af50 ]

The inline assembly for arm64's cmpxchg_double*() implementations use a
+Q constraint to hazard against other accesses to the memory location
being exchanged. However, the pointer passed to the constraint is a
pointer to unsigned long, and thus the hazard only applies to the first
8 bytes of the location.

GCC can take advantage of this, assuming that other portions of the
location are unchanged, leading to a number of potential problems.

This is similar to what we fixed back in commit:

  fee960b ("arm64: xchg: hazard against entire exchange variable")

... but we forgot to adjust cmpxchg_double*() similarly at the same
time.

The same problem applies, as demonstrated with the following test:

| struct big {
|         u64 lo, hi;
| } __aligned(128);
|
| unsigned long foo(struct big *b)
| {
|         u64 hi_old, hi_new;
|
|         hi_old = b->hi;
|         cmpxchg_double_local(&b->lo, &b->hi, 0x12, 0x34, 0x56, 0x78);
|         hi_new = b->hi;
|
|         return hi_old ^ hi_new;
| }

... which GCC 12.1.0 compiles as:

| 0000000000000000 <foo>:
|    0:   d503233f        paciasp
|    4:   aa0003e4        mov     x4, x0
|    8:   1400000e        b       40 <foo+0x40>
|    c:   d2800240        mov     x0, #0x12                       // #18
|   10:   d2800681        mov     x1, #0x34                       // #52
|   14:   aa0003e5        mov     x5, x0
|   18:   aa0103e6        mov     x6, x1
|   1c:   d2800ac2        mov     x2, #0x56                       // #86
|   20:   d2800f03        mov     x3, #0x78                       // #120
|   24:   48207c82        casp    x0, x1, x2, x3, [x4]
|   28:   ca050000        eor     x0, x0, x5
|   2c:   ca060021        eor     x1, x1, x6
|   30:   aa010000        orr     x0, x0, x1
|   34:   d2800000        mov     x0, #0x0                        // #0    <--- BANG
|   38:   d50323bf        autiasp
|   3c:   d65f03c0        ret
|   40:   d2800240        mov     x0, #0x12                       // #18
|   44:   d2800681        mov     x1, #0x34                       // #52
|   48:   d2800ac2        mov     x2, #0x56                       // #86
|   4c:   d2800f03        mov     x3, #0x78                       // #120
|   50:   f9800091        prfm    pstl1strm, [x4]
|   54:   c87f1885        ldxp    x5, x6, [x4]
|   58:   ca0000a5        eor     x5, x5, x0
|   5c:   ca0100c6        eor     x6, x6, x1
|   60:   aa0600a6        orr     x6, x5, x6
|   64:   b5000066        cbnz    x6, 70 <foo+0x70>
|   68:   c8250c82        stxp    w5, x2, x3, [x4]
|   6c:   35ffff45        cbnz    w5, 54 <foo+0x54>
|   70:   d2800000        mov     x0, #0x0                        // #0     <--- BANG
|   74:   d50323bf        autiasp
|   78:   d65f03c0        ret

Notice that at the lines with "BANG" comments, GCC has assumed that the
higher 8 bytes are unchanged by the cmpxchg_double() call, and that
`hi_old ^ hi_new` can be reduced to a constant zero, for both LSE and
LL/SC versions of cmpxchg_double().

This patch fixes the issue by passing a pointer to __uint128_t into the
+Q constraint, ensuring that the compiler hazards against the entire 16
bytes being modified.

With this change, GCC 12.1.0 compiles the above test as:

| 0000000000000000 <foo>:
|    0:   f9400407        ldr     x7, [x0, #8]
|    4:   d503233f        paciasp
|    8:   aa0003e4        mov     x4, x0
|    c:   1400000f        b       48 <foo+0x48>
|   10:   d2800240        mov     x0, #0x12                       // #18
|   14:   d2800681        mov     x1, #0x34                       // #52
|   18:   aa0003e5        mov     x5, x0
|   1c:   aa0103e6        mov     x6, x1
|   20:   d2800ac2        mov     x2, #0x56                       // #86
|   24:   d2800f03        mov     x3, #0x78                       // #120
|   28:   48207c82        casp    x0, x1, x2, x3, [x4]
|   2c:   ca050000        eor     x0, x0, x5
|   30:   ca060021        eor     x1, x1, x6
|   34:   aa010000        orr     x0, x0, x1
|   38:   f9400480        ldr     x0, [x4, #8]
|   3c:   d50323bf        autiasp
|   40:   ca0000e0        eor     x0, x7, x0
|   44:   d65f03c0        ret
|   48:   d2800240        mov     x0, #0x12                       // #18
|   4c:   d2800681        mov     x1, #0x34                       // #52
|   50:   d2800ac2        mov     x2, #0x56                       // #86
|   54:   d2800f03        mov     x3, #0x78                       // #120
|   58:   f9800091        prfm    pstl1strm, [x4]
|   5c:   c87f1885        ldxp    x5, x6, [x4]
|   60:   ca0000a5        eor     x5, x5, x0
|   64:   ca0100c6        eor     x6, x6, x1
|   68:   aa0600a6        orr     x6, x5, x6
|   6c:   b5000066        cbnz    x6, 78 <foo+0x78>
|   70:   c8250c82        stxp    w5, x2, x3, [x4]
|   74:   35ffff45        cbnz    w5, 5c <foo+0x5c>
|   78:   f9400480        ldr     x0, [x4, #8]
|   7c:   d50323bf        autiasp
|   80:   ca0000e0        eor     x0, x7, x0
|   84:   d65f03c0        ret

... sampling the high 8 bytes before and after the cmpxchg, and
performing an EOR, as we'd expect.

For backporting, I've tested this atop linux-4.9.y with GCC 5.5.0. Note
that linux-4.9.y is oldest currently supported stable release, and
mandates GCC 5.1+. Unfortunately I couldn't get a GCC 5.1 binary to run
on my machines due to library incompatibilities.

I've also used a standalone test to check that we can use a __uint128_t
pointer in a +Q constraint at least as far back as GCC 4.8.5 and LLVM
3.9.1.

Fixes: 5284e1b ("arm64: xchg: Implement cmpxchg_double")
Fixes: e9a4b79 ("arm64: cmpxchg_dbl: patch in lse instructions when supported by the CPU")
Reported-by: Boqun Feng <boqun.feng@gmail.com>
Link: https://lore.kernel.org/lkml/Y6DEfQXymYVgL3oJ@boqun-archlinux/
Reported-by: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/lkml/Y6GXoO4qmH9OIZ5Q@hirez.programming.kicks-ass.net/
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: stable@vger.kernel.org
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Steve Capper <steve.capper@arm.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20230104151626.3262137-1-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
delphix-devops-bot pushed a commit that referenced this pull request Mar 4, 2023
…quire (#8)

Upstream fix from kernel 5.12
nfsd: Don't keep looking up unhashed files in the nfsd file cache

If a file is unhashed, then we're going to reject it anyway and retry,
so make sure we skip it when we're doing the RCU lockless lookup.
This avoids a number of unnecessary nfserr_jukebox returns from
nfsd_file_acquire()

Fixes: 65294c1 ("nfsd: add a new struct file caching facility to nfsd")
prakashsurya pushed a commit that referenced this pull request Mar 14, 2023
…quire (#8)

Upstream fix from kernel 5.12
nfsd: Don't keep looking up unhashed files in the nfsd file cache

If a file is unhashed, then we're going to reject it anyway and retry,
so make sure we skip it when we're doing the RCU lockless lookup.
This avoids a number of unnecessary nfserr_jukebox returns from
nfsd_file_acquire()

Fixes: 65294c1 ("nfsd: add a new struct file caching facility to nfsd")
delphix-devops-bot pushed a commit that referenced this pull request Mar 30, 2023
…quire (#8)

Upstream fix from kernel 5.12
nfsd: Don't keep looking up unhashed files in the nfsd file cache

If a file is unhashed, then we're going to reject it anyway and retry,
so make sure we skip it when we're doing the RCU lockless lookup.
This avoids a number of unnecessary nfserr_jukebox returns from
nfsd_file_acquire()

Fixes: 65294c1 ("nfsd: add a new struct file caching facility to nfsd")
delphix-devops-bot pushed a commit that referenced this pull request Apr 20, 2023
BugLink: https://bugs.launchpad.net/bugs/2008946

commit c6ec929 upstream.

In Google internal bug 265639009 we've received an (as yet) unreproducible
crash report from an aarch64 GKI 5.10.149-android13 running device.

AFAICT the source code is at:
  https://android.googlesource.com/kernel/common/+/refs/tags/ASB-2022-12-05_13-5.10

The call stack is:
  ncm_close() -> ncm_notify() -> ncm_do_notify()
with the crash at:
  ncm_do_notify+0x98/0x270
Code: 79000d0b b9000a6c f940012a f9400269 (b9405d4)

Which I believe disassembles to (I don't know ARM assembly, but it looks sane enough to me...):

  // halfword (16-bit) store presumably to event->wLength (at offset 6 of struct usb_cdc_notification)
  0B 0D 00 79    strh w11, [x8, #6]

  // word (32-bit) store presumably to req->Length (at offset 8 of struct usb_request)
  6C 0A 00 B9    str  w12, [x19, #8]

  // x10 (NULL) was read here from offset 0 of valid pointer x9
  // IMHO we're reading 'cdev->gadget' and getting NULL
  // gadget is indeed at offset 0 of struct usb_composite_dev
  2A 01 40 F9    ldr  x10, [x9]

  // loading req->buf pointer, which is at offset 0 of struct usb_request
  69 02 40 F9    ldr  x9, [x19]

  // x10 is null, crash, appears to be attempt to read cdev->gadget->max_speed
  4B 5D 40 B9    ldr  w11, [x10, #0x5c]

which seems to line up with ncm_do_notify() case NCM_NOTIFY_SPEED code fragment:

  event->wLength = cpu_to_le16(8);
  req->length = NCM_STATUS_BYTECOUNT;

  /* SPEED_CHANGE data is up/down speeds in bits/sec */
  data = req->buf + sizeof *event;
  data[0] = cpu_to_le32(ncm_bitrate(cdev->gadget));

My analysis of registers and NULL ptr deref crash offset
  (Unable to handle kernel NULL pointer dereference at virtual address 000000000000005c)
heavily suggests that the crash is due to 'cdev->gadget' being NULL when executing:
  data[0] = cpu_to_le32(ncm_bitrate(cdev->gadget));
which calls:
  ncm_bitrate(NULL)
which then calls:
  gadget_is_superspeed(NULL)
which reads
  ((struct usb_gadget *)NULL)->max_speed
and hits a panic.

AFAICT, if I'm counting right, the offset of max_speed is indeed 0x5C.
(remember there's a GKI KABI reservation of 16 bytes in struct work_struct)

It's not at all clear to me how this is all supposed to work...
but returning 0 seems much better than panic-ing...

Cc: Felipe Balbi <balbi@kernel.org>
Cc: Lorenzo Colitti <lorenzo@google.com>
Cc: Carlos Llamas <cmllamas@google.com>
Cc: stable@vger.kernel.org
Signed-off-by: Maciej Żenczykowski <maze@google.com>
Cc: stable <stable@kernel.org>
Link: https://lore.kernel.org/r/20230117131839.1138208-1-maze@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
delphix-devops-bot pushed a commit that referenced this pull request Apr 20, 2023
…quire (#8)

Upstream fix from kernel 5.12
nfsd: Don't keep looking up unhashed files in the nfsd file cache

If a file is unhashed, then we're going to reject it anyway and retry,
so make sure we skip it when we're doing the RCU lockless lookup.
This avoids a number of unnecessary nfserr_jukebox returns from
nfsd_file_acquire()

Fixes: 65294c1 ("nfsd: add a new struct file caching facility to nfsd")
delphix-devops-bot pushed a commit that referenced this pull request Apr 28, 2023
…quire (#8)

Upstream fix from kernel 5.12
nfsd: Don't keep looking up unhashed files in the nfsd file cache

If a file is unhashed, then we're going to reject it anyway and retry,
so make sure we skip it when we're doing the RCU lockless lookup.
This avoids a number of unnecessary nfserr_jukebox returns from
nfsd_file_acquire()

Fixes: 65294c1 ("nfsd: add a new struct file caching facility to nfsd")
delphix-devops-bot pushed a commit that referenced this pull request May 26, 2023
…quire (#8)

Upstream fix from kernel 5.12
nfsd: Don't keep looking up unhashed files in the nfsd file cache

If a file is unhashed, then we're going to reject it anyway and retry,
so make sure we skip it when we're doing the RCU lockless lookup.
This avoids a number of unnecessary nfserr_jukebox returns from
nfsd_file_acquire()

Fixes: 65294c1 ("nfsd: add a new struct file caching facility to nfsd")
delphix-devops-bot pushed a commit that referenced this pull request Jun 3, 2023
…quire (#8)

Upstream fix from kernel 5.12
nfsd: Don't keep looking up unhashed files in the nfsd file cache

If a file is unhashed, then we're going to reject it anyway and retry,
so make sure we skip it when we're doing the RCU lockless lookup.
This avoids a number of unnecessary nfserr_jukebox returns from
nfsd_file_acquire()

Fixes: 65294c1 ("nfsd: add a new struct file caching facility to nfsd")
delphix-devops-bot pushed a commit that referenced this pull request Jun 4, 2023
…quire (#8)

Upstream fix from kernel 5.12
nfsd: Don't keep looking up unhashed files in the nfsd file cache

If a file is unhashed, then we're going to reject it anyway and retry,
so make sure we skip it when we're doing the RCU lockless lookup.
This avoids a number of unnecessary nfserr_jukebox returns from
nfsd_file_acquire()

Fixes: 65294c1 ("nfsd: add a new struct file caching facility to nfsd")
delphix-devops-bot pushed a commit that referenced this pull request Jun 5, 2023
…quire (#8)

Upstream fix from kernel 5.12
nfsd: Don't keep looking up unhashed files in the nfsd file cache

If a file is unhashed, then we're going to reject it anyway and retry,
so make sure we skip it when we're doing the RCU lockless lookup.
This avoids a number of unnecessary nfserr_jukebox returns from
nfsd_file_acquire()

Fixes: 65294c1 ("nfsd: add a new struct file caching facility to nfsd")
delphix-devops-bot pushed a commit that referenced this pull request Sep 1, 2023
BugLink: https://bugs.launchpad.net/bugs/2023230

[ Upstream commit 4e264be ]

When a system with E810 with existing VFs gets rebooted the following
hang may be observed.

 Pid 1 is hung in iavf_remove(), part of a network driver:
 PID: 1        TASK: ffff965400e5a340  CPU: 24   COMMAND: "systemd-shutdow"
  #0 [ffffaad04005fa50] __schedule at ffffffff8b3239cb
  #1 [ffffaad04005fae8] schedule at ffffffff8b323e2d
  #2 [ffffaad04005fb00] schedule_hrtimeout_range_clock at ffffffff8b32cebc
  #3 [ffffaad04005fb80] usleep_range_state at ffffffff8b32c930
  #4 [ffffaad04005fbb0] iavf_remove at ffffffffc12b9b4c [iavf]
  #5 [ffffaad04005fbf0] pci_device_remove at ffffffff8add7513
  #6 [ffffaad04005fc10] device_release_driver_internal at ffffffff8af08baa
  #7 [ffffaad04005fc40] pci_stop_bus_device at ffffffff8adcc5fc
  #8 [ffffaad04005fc60] pci_stop_and_remove_bus_device at ffffffff8adcc81e
  #9 [ffffaad04005fc70] pci_iov_remove_virtfn at ffffffff8adf9429
 #10 [ffffaad04005fca8] sriov_disable at ffffffff8adf98e4
 #11 [ffffaad04005fcc8] ice_free_vfs at ffffffffc04bb2c8 [ice]
 #12 [ffffaad04005fd10] ice_remove at ffffffffc04778fe [ice]
 #13 [ffffaad04005fd38] ice_shutdown at ffffffffc0477946 [ice]
 #14 [ffffaad04005fd50] pci_device_shutdown at ffffffff8add58f1
 #15 [ffffaad04005fd70] device_shutdown at ffffffff8af05386
 #16 [ffffaad04005fd98] kernel_restart at ffffffff8a92a870
 #17 [ffffaad04005fda8] __do_sys_reboot at ffffffff8a92abd6
 #18 [ffffaad04005fee0] do_syscall_64 at ffffffff8b317159
 #19 [ffffaad04005ff08] __context_tracking_enter at ffffffff8b31b6fc
 #20 [ffffaad04005ff18] syscall_exit_to_user_mode at ffffffff8b31b50d
 #21 [ffffaad04005ff28] do_syscall_64 at ffffffff8b317169
 #22 [ffffaad04005ff50] entry_SYSCALL_64_after_hwframe at ffffffff8b40009b
     RIP: 00007f1baa5c13d7  RSP: 00007fffbcc55a98  RFLAGS: 00000202
     RAX: ffffffffffffffda  RBX: 0000000000000000  RCX: 00007f1baa5c13d7
     RDX: 0000000001234567  RSI: 0000000028121969  RDI: 00000000fee1dead
     RBP: 00007fffbcc55ca0   R8: 0000000000000000   R9: 00007fffbcc54e90
     R10: 00007fffbcc55050  R11: 0000000000000202  R12: 0000000000000005
     R13: 0000000000000000  R14: 00007fffbcc55af0  R15: 0000000000000000
     ORIG_RAX: 00000000000000a9  CS: 0033  SS: 002b

During reboot all drivers PM shutdown callbacks are invoked.
In iavf_shutdown() the adapter state is changed to __IAVF_REMOVE.
In ice_shutdown() the call chain above is executed, which at some point
calls iavf_remove(). However iavf_remove() expects the VF to be in one
of the states __IAVF_RUNNING, __IAVF_DOWN or __IAVF_INIT_FAILED. If
that's not the case it sleeps forever.
So if iavf_shutdown() gets invoked before iavf_remove() the system will
hang indefinitely because the adapter is already in state __IAVF_REMOVE.

Fix this by returning from iavf_remove() if the state is __IAVF_REMOVE,
as we already went through iavf_shutdown().

Fixes: 9745780 ("iavf: Add waiting so the port is initialized in remove")
Fixes: a841733 ("iavf: Fix race condition between iavf_shutdown and iavf_remove")
Reported-by: Marius Cornea <mcornea@redhat.com>
Signed-off-by: Stefan Assmann <sassmann@kpanic.de>
Reviewed-by: Michal Kubiak <michal.kubiak@intel.com>
Tested-by: Rafal Romanowski <rafal.romanowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Luke Nowakowski-Krijger <luke.nowakowskikrijger@canonical.com>
delphix-devops-bot pushed a commit that referenced this pull request Sep 13, 2023
BugLink: https://bugs.launchpad.net/bugs/2028408

[ Upstream commit 05bb016 ]

ACPICA commit 770653e3ba67c30a629ca7d12e352d83c2541b1e

Before this change we see the following UBSAN stack trace in Fuchsia:

  #0    0x000021e4213b3302 in acpi_ds_init_aml_walk(struct acpi_walk_state*, union acpi_parse_object*, struct acpi_namespace_node*, u8*, u32, struct acpi_evaluate_info*, u8) ../../third_party/acpica/source/components/dispatcher/dswstate.c:682 <platform-bus-x86.so>+0x233302
  #1.2  0x000020d0f660777f in ubsan_get_stack_trace() compiler-rt/lib/ubsan/ubsan_diag.cpp:41 <libclang_rt.asan.so>+0x3d77f
  #1.1  0x000020d0f660777f in maybe_print_stack_trace() compiler-rt/lib/ubsan/ubsan_diag.cpp:51 <libclang_rt.asan.so>+0x3d77f
  #1    0x000020d0f660777f in ~scoped_report() compiler-rt/lib/ubsan/ubsan_diag.cpp:387 <libclang_rt.asan.so>+0x3d77f
  #2    0x000020d0f660b96d in handlepointer_overflow_impl() compiler-rt/lib/ubsan/ubsan_handlers.cpp:809 <libclang_rt.asan.so>+0x4196d
  #3    0x000020d0f660b50d in compiler-rt/lib/ubsan/ubsan_handlers.cpp:815 <libclang_rt.asan.so>+0x4150d
  #4    0x000021e4213b3302 in acpi_ds_init_aml_walk(struct acpi_walk_state*, union acpi_parse_object*, struct acpi_namespace_node*, u8*, u32, struct acpi_evaluate_info*, u8) ../../third_party/acpica/source/components/dispatcher/dswstate.c:682 <platform-bus-x86.so>+0x233302
  #5    0x000021e4213e2369 in acpi_ds_call_control_method(struct acpi_thread_state*, struct acpi_walk_state*, union acpi_parse_object*) ../../third_party/acpica/source/components/dispatcher/dsmethod.c:605 <platform-bus-x86.so>+0x262369
  #6    0x000021e421437fac in acpi_ps_parse_aml(struct acpi_walk_state*) ../../third_party/acpica/source/components/parser/psparse.c:550 <platform-bus-x86.so>+0x2b7fac
  #7    0x000021e4214464d2 in acpi_ps_execute_method(struct acpi_evaluate_info*) ../../third_party/acpica/source/components/parser/psxface.c:244 <platform-bus-x86.so>+0x2c64d2
  #8    0x000021e4213aa052 in acpi_ns_evaluate(struct acpi_evaluate_info*) ../../third_party/acpica/source/components/namespace/nseval.c:250 <platform-bus-x86.so>+0x22a052
  #9    0x000021e421413dd8 in acpi_ns_init_one_device(acpi_handle, u32, void*, void**) ../../third_party/acpica/source/components/namespace/nsinit.c:735 <platform-bus-x86.so>+0x293dd8
  #10   0x000021e421429e98 in acpi_ns_walk_namespace(acpi_object_type, acpi_handle, u32, u32, acpi_walk_callback, acpi_walk_callback, void*, void**) ../../third_party/acpica/source/components/namespace/nswalk.c:298 <platform-bus-x86.so>+0x2a9e98
  #11   0x000021e4214131ac in acpi_ns_initialize_devices(u32) ../../third_party/acpica/source/components/namespace/nsinit.c:268 <platform-bus-x86.so>+0x2931ac
  #12   0x000021e42147c40d in acpi_initialize_objects(u32) ../../third_party/acpica/source/components/utilities/utxfinit.c:304 <platform-bus-x86.so>+0x2fc40d
  #13   0x000021e42126d603 in acpi::acpi_impl::initialize_acpi(acpi::acpi_impl*) ../../src/devices/board/lib/acpi/acpi-impl.cc:224 <platform-bus-x86.so>+0xed603

Add a simple check that avoids incrementing a pointer by zero, but
otherwise behaves as before. Note that our findings are against ACPICA
20221020, but the same code exists on master.

Link: acpica/acpica@770653e3
Signed-off-by: Bob Moore <robert.moore@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
delphix-devops-bot pushed a commit that referenced this pull request Sep 13, 2023
BugLink: https://bugs.launchpad.net/bugs/2028701

commit 19b8766 upstream.

Each physical partition can provide multiple services each with UUID.
Each such service can be presented as logical partition with a unique
combination of VM ID and UUID. The number of distinct UUID in a system
will be less than or equal to the number of logical partitions.

However, currently it fails to register more than one logical partition
or service within a physical partition as the device name contains only
VM ID while both VM ID and UUID are maintained in the partition information.
The kernel complains with the below message:

  | sysfs: cannot create duplicate filename '/devices/arm-ffa-8001'
  | CPU: 1 PID: 1 Comm: swapper/0 Not tainted 6.3.0-rc7 #8
  | Hardware name: FVP Base RevC (DT)
  | Call trace:
  |  dump_backtrace+0xf8/0x118
  |  show_stack+0x18/0x24
  |  dump_stack_lvl+0x50/0x68
  |  dump_stack+0x18/0x24
  |  sysfs_create_dir_ns+0xe0/0x13c
  |  kobject_add_internal+0x220/0x3d4
  |  kobject_add+0x94/0x100
  |  device_add+0x144/0x5d8
  |  device_register+0x20/0x30
  |  ffa_device_register+0x88/0xd8
  |  ffa_setup_partitions+0x108/0x1b8
  |  ffa_init+0x2ec/0x3a4
  |  do_one_initcall+0xcc/0x240
  |  do_initcall_level+0x8c/0xac
  |  do_initcalls+0x54/0x94
  |  do_basic_setup+0x1c/0x28
  |  kernel_init_freeable+0x100/0x16c
  |  kernel_init+0x20/0x1a0
  |  ret_from_fork+0x10/0x20
  | kobject_add_internal failed for arm-ffa-8001 with -EEXIST, don't try to
  | register things with the same name in the same directory.
  | arm_ffa arm-ffa: unable to register device arm-ffa-8001 err=-17
  | ARM FF-A: ffa_setup_partitions: failed to register partition ID 0x8001

By virtue of being random enough to avoid collisions when generated in a
distributed system, there is no way to compress UUID keys to the number
of bits required to identify each. We can eliminate '-' in the name but
it is not worth eliminating 4 bytes and add unnecessary logic for doing
that. Also v1.0 doesn't provide the UUID of the partitions which makes
it hard to use the same for the device name.

So to keep it simple, let us alloc an ID using ida_alloc() and append the
same to "arm-ffa" to make up a unique device name. Also stash the id value
in ffa_dev to help freeing the ID later when the device is destroyed.

Fixes: e781858 ("firmware: arm_ffa: Add initial FFA bus support for device enumeration")
Reported-by: Lucian Paul-Trifu <lucian.paul-trifu@arm.com>
Link: https://lore.kernel.org/r/20230419-ffa_fixes_6-4-v2-3-d9108e43a176@arm.com
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
delphix-devops-bot pushed a commit that referenced this pull request Oct 6, 2023
BugLink: https://bugs.launchpad.net/bugs/2032689

[ Upstream commit b684c09 ]

ppc_save_regs() skips one stack frame while saving the CPU register states.
Instead of saving current R1, it pulls the previous stack frame pointer.

When vmcores caused by direct panic call (such as `echo c >
/proc/sysrq-trigger`), are debugged with gdb, gdb fails to show the
backtrace correctly. On further analysis, it was found that it was because
of mismatch between r1 and NIP.

GDB uses NIP to get current function symbol and uses corresponding debug
info of that function to unwind previous frames, but due to the
mismatching r1 and NIP, the unwinding does not work, and it fails to
unwind to the 2nd frame and hence does not show the backtrace.

GDB backtrace with vmcore of kernel without this patch:

---------
(gdb) bt
 #0  0xc0000000002a53e8 in crash_setup_regs (oldregs=<optimized out>,
    newregs=0xc000000004f8f8d8) at ./arch/powerpc/include/asm/kexec.h:69
 #1  __crash_kexec (regs=<optimized out>) at kernel/kexec_core.c:974
 #2  0x0000000000000063 in ?? ()
 #3  0xc000000003579320 in ?? ()
---------

Further analysis revealed that the mismatch occurred because
"ppc_save_regs" was saving the previous stack's SP instead of the current
r1. This patch fixes this by storing current r1 in the saved pt_regs.

GDB backtrace with vmcore of patched kernel:

--------
(gdb) bt
 #0  0xc0000000002a53e8 in crash_setup_regs (oldregs=0x0, newregs=0xc00000000670b8d8)
    at ./arch/powerpc/include/asm/kexec.h:69
 #1  __crash_kexec (regs=regs@entry=0x0) at kernel/kexec_core.c:974
 #2  0xc000000000168918 in panic (fmt=fmt@entry=0xc000000001654a60 "sysrq triggered crash\n")
    at kernel/panic.c:358
 #3  0xc000000000b735f8 in sysrq_handle_crash (key=<optimized out>) at drivers/tty/sysrq.c:155
 #4  0xc000000000b742cc in __handle_sysrq (key=key@entry=99, check_mask=check_mask@entry=false)
    at drivers/tty/sysrq.c:602
 #5  0xc000000000b7506c in write_sysrq_trigger (file=<optimized out>, buf=<optimized out>,
    count=2, ppos=<optimized out>) at drivers/tty/sysrq.c:1163
 #6  0xc00000000069a7bc in pde_write (ppos=<optimized out>, count=<optimized out>,
    buf=<optimized out>, file=<optimized out>, pde=0xc00000000362cb40) at fs/proc/inode.c:340
 #7  proc_reg_write (file=<optimized out>, buf=<optimized out>, count=<optimized out>,
    ppos=<optimized out>) at fs/proc/inode.c:352
 #8  0xc0000000005b3bbc in vfs_write (file=file@entry=0xc000000006aa6b00,
    buf=buf@entry=0x61f498b4f60 <error: Cannot access memory at address 0x61f498b4f60>,
    count=count@entry=2, pos=pos@entry=0xc00000000670bda0) at fs/read_write.c:582
 #9  0xc0000000005b4264 in ksys_write (fd=<optimized out>,
    buf=0x61f498b4f60 <error: Cannot access memory at address 0x61f498b4f60>, count=2)
    at fs/read_write.c:637
 #10 0xc00000000002ea2c in system_call_exception (regs=0xc00000000670be80, r0=<optimized out>)
    at arch/powerpc/kernel/syscall.c:171
 #11 0xc00000000000c270 in system_call_vectored_common ()
    at arch/powerpc/kernel/interrupt_64.S:192
--------

Nick adds:
  So this now saves regs as though it was an interrupt taken in the
  caller, at the instruction after the call to ppc_save_regs, whereas
  previously the NIP was there, but R1 came from the caller's caller and
  that mismatch is what causes gdb's dwarf unwinder to go haywire.

Signed-off-by: Aditya Gupta <adityag@linux.ibm.com>
Fixes: d16a58f ("powerpc: Improve ppc_save_regs()")
Reivewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://msgid.link/20230615091047.90433-1-adityag@linux.ibm.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Kamal Mostafa <kamal@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
delphix-devops-bot pushed a commit that referenced this pull request Nov 1, 2023
Normally, x->replay_esn and x->preplay_esn should be allocated at
xfrm_alloc_replay_state_esn(...) in xfrm_state_construct(...), hence the
xfrm_update_ae_params(...) is okay to update them. However, the current
implementation of xfrm_new_ae(...) allows a malicious user to directly
dereference a NULL pointer and crash the kernel like below.

BUG: kernel NULL pointer dereference, address: 0000000000000000
PGD 8253067 P4D 8253067 PUD 8e0e067 PMD 0
Oops: 0002 [#1] PREEMPT SMP KASAN NOPTI
CPU: 0 PID: 98 Comm: poc.npd Not tainted 6.4.0-rc7-00072-gdad9774deaf1 #8
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.o4
RIP: 0010:memcpy_orig+0xad/0x140
Code: e8 4c 89 5f e0 48 8d 7f e0 73 d2 83 c2 20 48 29 d6 48 29 d7 83 fa 10 72 34 4c 8b 06 4c 8b 4e 08 c
RSP: 0018:ffff888008f57658 EFLAGS: 00000202
RAX: 0000000000000000 RBX: ffff888008bd0000 RCX: ffffffff8238e571
RDX: 0000000000000018 RSI: ffff888007f64844 RDI: 0000000000000000
RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: ffff888008f57818
R13: ffff888007f64aa4 R14: 0000000000000000 R15: 0000000000000000
FS:  00000000014013c0(0000) GS:ffff88806d600000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000000 CR3: 00000000054d8000 CR4: 00000000000006f0
Call Trace:
 <TASK>
 ? __die+0x1f/0x70
 ? page_fault_oops+0x1e8/0x500
 ? __pfx_is_prefetch.constprop.0+0x10/0x10
 ? __pfx_page_fault_oops+0x10/0x10
 ? _raw_spin_unlock_irqrestore+0x11/0x40
 ? fixup_exception+0x36/0x460
 ? _raw_spin_unlock_irqrestore+0x11/0x40
 ? exc_page_fault+0x5e/0xc0
 ? asm_exc_page_fault+0x26/0x30
 ? xfrm_update_ae_params+0xd1/0x260
 ? memcpy_orig+0xad/0x140
 ? __pfx__raw_spin_lock_bh+0x10/0x10
 xfrm_update_ae_params+0xe7/0x260
 xfrm_new_ae+0x298/0x4e0
 ? __pfx_xfrm_new_ae+0x10/0x10
 ? __pfx_xfrm_new_ae+0x10/0x10
 xfrm_user_rcv_msg+0x25a/0x410
 ? __pfx_xfrm_user_rcv_msg+0x10/0x10
 ? __alloc_skb+0xcf/0x210
 ? stack_trace_save+0x90/0xd0
 ? filter_irq_stacks+0x1c/0x70
 ? __stack_depot_save+0x39/0x4e0
 ? __kasan_slab_free+0x10a/0x190
 ? kmem_cache_free+0x9c/0x340
 ? netlink_recvmsg+0x23c/0x660
 ? sock_recvmsg+0xeb/0xf0
 ? __sys_recvfrom+0x13c/0x1f0
 ? __x64_sys_recvfrom+0x71/0x90
 ? do_syscall_64+0x3f/0x90
 ? entry_SYSCALL_64_after_hwframe+0x72/0xdc
 ? copyout+0x3e/0x50
 netlink_rcv_skb+0xd6/0x210
 ? __pfx_xfrm_user_rcv_msg+0x10/0x10
 ? __pfx_netlink_rcv_skb+0x10/0x10
 ? __pfx_sock_has_perm+0x10/0x10
 ? mutex_lock+0x8d/0xe0
 ? __pfx_mutex_lock+0x10/0x10
 xfrm_netlink_rcv+0x44/0x50
 netlink_unicast+0x36f/0x4c0
 ? __pfx_netlink_unicast+0x10/0x10
 ? netlink_recvmsg+0x500/0x660
 netlink_sendmsg+0x3b7/0x700

This Null-ptr-deref bug is assigned CVE-2023-3772. And this commit
adds additional NULL check in xfrm_update_ae_params to fix the NPD.

Fixes: d8647b7 ("xfrm: Add user interface for esn and big anti-replay windows")
Signed-off-by: Lin Ma <linma@zju.edu.cn>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>

CVE-2023-3772
(cherry picked from commit 00374d9)
Signed-off-by: Yuxuan Luo <yuxuan.luo@canonical.com>
Acked-by: Stefan Bader <stefan.bader@canonical.com>
Acked-by: Tim Gardner <tim.gardner@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
delphix-devops-bot pushed a commit that referenced this pull request May 9, 2024
BugLink: https://bugs.launchpad.net/bugs/2059014

[ Upstream commit fc3a553 ]

An issue occurred while reading an ELF file in libbpf.c during fuzzing:

	Program received signal SIGSEGV, Segmentation fault.
	0x0000000000958e97 in bpf_object.collect_prog_relos () at libbpf.c:4206
	4206 in libbpf.c
	(gdb) bt
	#0 0x0000000000958e97 in bpf_object.collect_prog_relos () at libbpf.c:4206
	#1 0x000000000094f9d6 in bpf_object.collect_relos () at libbpf.c:6706
	#2 0x000000000092bef3 in bpf_object_open () at libbpf.c:7437
	#3 0x000000000092c046 in bpf_object.open_mem () at libbpf.c:7497
	#4 0x0000000000924afa in LLVMFuzzerTestOneInput () at fuzz/bpf-object-fuzzer.c:16
	#5 0x000000000060be11 in testblitz_engine::fuzzer::Fuzzer::run_one ()
	#6 0x000000000087ad92 in tracing::span::Span::in_scope ()
	#7 0x00000000006078aa in testblitz_engine::fuzzer::util::walkdir ()
	#8 0x00000000005f3217 in testblitz_engine::entrypoint::main::{{closure}} ()
	#9 0x00000000005f2601 in main ()
	(gdb)

scn_data was null at this code(tools/lib/bpf/src/libbpf.c):

	if (rel->r_offset % BPF_INSN_SZ || rel->r_offset >= scn_data->d_size) {

The scn_data is derived from the code above:

	scn = elf_sec_by_idx(obj, sec_idx);
	scn_data = elf_sec_data(obj, scn);

	relo_sec_name = elf_sec_str(obj, shdr->sh_name);
	sec_name = elf_sec_name(obj, scn);
	if (!relo_sec_name || !sec_name)// don't check whether scn_data is NULL
		return -EINVAL;

In certain special scenarios, such as reading a malformed ELF file,
it is possible that scn_data may be a null pointer

Signed-off-by: Mingyi Zhang <zhangmingyi5@huawei.com>
Signed-off-by: Xin Liu <liuxin350@huawei.com>
Signed-off-by: Changye Wu <wuchangye@huawei.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20231221033947.154564-1-liuxin350@huawei.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Manuel Diewald <manuel.diewald@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
delphix-devops-bot pushed a commit that referenced this pull request Feb 7, 2025
BugLink: https://bugs.launchpad.net/bugs/2076435

commit be346c1 upstream.

The code in ocfs2_dio_end_io_write() estimates number of necessary
transaction credits using ocfs2_calc_extend_credits().  This however does
not take into account that the IO could be arbitrarily large and can
contain arbitrary number of extents.

Extent tree manipulations do often extend the current transaction but not
in all of the cases.  For example if we have only single block extents in
the tree, ocfs2_mark_extent_written() will end up calling
ocfs2_replace_extent_rec() all the time and we will never extend the
current transaction and eventually exhaust all the transaction credits if
the IO contains many single block extents.  Once that happens a
WARN_ON(jbd2_handle_buffer_credits(handle) <= 0) is triggered in
jbd2_journal_dirty_metadata() and subsequently OCFS2 aborts in response to
this error.  This was actually triggered by one of our customers on a
heavily fragmented OCFS2 filesystem.

To fix the issue make sure the transaction always has enough credits for
one extent insert before each call of ocfs2_mark_extent_written().

Heming Zhao said:

------
PANIC: "Kernel panic - not syncing: OCFS2: (device dm-1): panic forced after error"

PID: xxx  TASK: xxxx  CPU: 5  COMMAND: "SubmitThread-CA"
  #0 machine_kexec at ffffffff8c069932
  #1 __crash_kexec at ffffffff8c1338fa
  #2 panic at ffffffff8c1d69b9
  #3 ocfs2_handle_error at ffffffffc0c86c0c [ocfs2]
  #4 __ocfs2_abort at ffffffffc0c88387 [ocfs2]
  #5 ocfs2_journal_dirty at ffffffffc0c51e98 [ocfs2]
  #6 ocfs2_split_extent at ffffffffc0c27ea3 [ocfs2]
  #7 ocfs2_change_extent_flag at ffffffffc0c28053 [ocfs2]
  #8 ocfs2_mark_extent_written at ffffffffc0c28347 [ocfs2]
  #9 ocfs2_dio_end_io_write at ffffffffc0c2bef9 [ocfs2]
#10 ocfs2_dio_end_io at ffffffffc0c2c0f5 [ocfs2]
#11 dio_complete at ffffffff8c2b9fa7
#12 do_blockdev_direct_IO at ffffffff8c2bc09f
#13 ocfs2_direct_IO at ffffffffc0c2b653 [ocfs2]
#14 generic_file_direct_write at ffffffff8c1dcf14
#15 __generic_file_write_iter at ffffffff8c1dd07b
#16 ocfs2_file_write_iter at ffffffffc0c49f1f [ocfs2]
#17 aio_write at ffffffff8c2cc72e
#18 kmem_cache_alloc at ffffffff8c248dde
#19 do_io_submit at ffffffff8c2ccada
#20 do_syscall_64 at ffffffff8c004984
#21 entry_SYSCALL_64_after_hwframe at ffffffff8c8000ba

Link: https://lkml.kernel.org/r/20240617095543.6971-1-jack@suse.cz
Link: https://lkml.kernel.org/r/20240614145243.8837-1-jack@suse.cz
Fixes: c15471f ("ocfs2: fix sparse file & data ordering issue in direct io")
Signed-off-by: Jan Kara <jack@suse.cz>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Reviewed-by: Heming Zhao <heming.zhao@suse.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Changwei Ge <gechangwei@live.cn>
Cc: Gang He <ghe@suse.com>
Cc: Jun Piao <piaojun@huawei.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Portia Stephens <portia.stephens@canonical.com>
Signed-off-by: Roxana Nicolescu <roxana.nicolescu@canonical.com>
delphix-devops-bot pushed a commit that referenced this pull request Feb 7, 2025
BugLink: https://bugs.launchpad.net/bugs/2078289

commit 1723f04 upstream.

Currently if we request a feature that is not set in the Kernel config we
fail silently and return all the available features.  However, the man
page indicates we should return an EINVAL.

We need to fix this issue since we can end up with a Kernel warning should
a program request the feature UFFD_FEATURE_WP_UNPOPULATED on a kernel with
the config not set with this feature.

 [  200.812896] WARNING: CPU: 91 PID: 13634 at mm/memory.c:1660 zap_pte_range+0x43d/0x660
 [  200.820738] Modules linked in:
 [  200.869387] CPU: 91 PID: 13634 Comm: userfaultfd Kdump: loaded Not tainted 6.9.0-rc5+ #8
 [  200.877477] Hardware name: Dell Inc. PowerEdge R6525/0N7YGH, BIOS 2.7.3 03/30/2022
 [  200.885052] RIP: 0010:zap_pte_range+0x43d/0x660

Link: https://lkml.kernel.org/r/20240626130513.120193-1-audra@redhat.com
Fixes: e06f1e1 ("userfaultfd: wp: enabled write protection in userfaultfd API")
Signed-off-by: Audra Mitchell <audra@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Jan Kara <jack@suse.cz>
Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Rafael Aquini <raquini@redhat.com>
Cc: Shaohua Li <shli@fb.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Portia Stephens <portia.stephens@canonical.com>
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
delphix-devops-bot pushed a commit that referenced this pull request May 21, 2025
BugLink: https://bugs.launchpad.net/bugs/2103829

[ Upstream commit c7b87ce0dd10b64b68a0b22cb83bbd556e28fe81 ]

libtraceevent parses and returns an array of argument fields, sometimes
larger than RAW_SYSCALL_ARGS_NUM (6) because it includes "__syscall_nr",
idx will traverse to index 6 (7th element) whereas sc->fmt->arg holds 6
elements max, creating an out-of-bounds access. This runtime error is
found by UBsan. The error message:

  $ sudo UBSAN_OPTIONS=print_stacktrace=1 ./perf trace -a --max-events=1
  builtin-trace.c:1966:35: runtime error: index 6 out of bounds for type 'syscall_arg_fmt [6]'
    #0 0x5c04956be5fe in syscall__alloc_arg_fmts /home/howard/hw/linux-perf/tools/perf/builtin-trace.c:1966
    #1 0x5c04956c0510 in trace__read_syscall_info /home/howard/hw/linux-perf/tools/perf/builtin-trace.c:2110
    #2 0x5c04956c372b in trace__syscall_info /home/howard/hw/linux-perf/tools/perf/builtin-trace.c:2436
    #3 0x5c04956d2f39 in trace__init_syscalls_bpf_prog_array_maps /home/howard/hw/linux-perf/tools/perf/builtin-trace.c:3897
    #4 0x5c04956d6d25 in trace__run /home/howard/hw/linux-perf/tools/perf/builtin-trace.c:4335
    #5 0x5c04956e112e in cmd_trace /home/howard/hw/linux-perf/tools/perf/builtin-trace.c:5502
    #6 0x5c04956eda7d in run_builtin /home/howard/hw/linux-perf/tools/perf/perf.c:351
    #7 0x5c04956ee0a8 in handle_internal_command /home/howard/hw/linux-perf/tools/perf/perf.c:404
    #8 0x5c04956ee37f in run_argv /home/howard/hw/linux-perf/tools/perf/perf.c:448
    #9 0x5c04956ee8e9 in main /home/howard/hw/linux-perf/tools/perf/perf.c:556
    #10 0x79eb3622a3b7 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
    #11 0x79eb3622a47a in __libc_start_main_impl ../csu/libc-start.c:360
    #12 0x5c04955422d4 in _start (/home/howard/hw/linux-perf/tools/perf/perf+0x4e02d4) (BuildId: 5b6cab2d59e96a4341741765ad6914a4d784dbc6)

     0.000 ( 0.014 ms): Chrome_ChildIO/117244 write(fd: 238, buf: !, count: 1)                                      = 1

Fixes: 5e58fcf ("perf trace: Allow allocating sc->arg_fmt even without the syscall tracepoint")
Signed-off-by: Howard Chu <howardchu95@gmail.com>
Link: https://lore.kernel.org/r/20250122025519.361873-1-howardchu95@gmail.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Noah Wager <noah.wager@canonical.com>
Signed-off-by: Mehmet Basaran <mehmet.basaran@canonical.com>
delphix-devops-bot pushed a commit that referenced this pull request May 21, 2025
BugLink: https://bugs.launchpad.net/bugs/2104873

commit 7faf14a7b0366f153284db0ad3347c457ea70136 upstream.

If getting acl_default fails, acl_access and acl_default will be released
simultaneously. However, acl_access will still retain a pointer pointing
to the released posix_acl, which will trigger a WARNING in
nfs3svc_release_getacl like this:

------------[ cut here ]------------
refcount_t: underflow; use-after-free.
WARNING: CPU: 26 PID: 3199 at lib/refcount.c:28
refcount_warn_saturate+0xb5/0x170
Modules linked in:
CPU: 26 UID: 0 PID: 3199 Comm: nfsd Not tainted
6.12.0-rc6-00079-g04ae226af01f-dirty #8
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
1.16.1-2.fc37 04/01/2014
RIP: 0010:refcount_warn_saturate+0xb5/0x170
Code: cc cc 0f b6 1d b3 20 a5 03 80 fb 01 0f 87 65 48 d8 00 83 e3 01 75
e4 48 c7 c7 c0 3b 9b 85 c6 05 97 20 a5 03 01 e8 fb 3e 30 ff <0f> 0b eb
cd 0f b6 1d 8a3
RSP: 0018:ffffc90008637cd8 EFLAGS: 00010282
RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffffff83904fde
RDX: dffffc0000000000 RSI: 0000000000000008 RDI: ffff88871ed36380
RBP: ffff888158beeb40 R08: 0000000000000001 R09: fffff520010c6f56
R10: ffffc90008637ab7 R11: 0000000000000001 R12: 0000000000000001
R13: ffff888140e77400 R14: ffff888140e77408 R15: ffffffff858b42c0
FS:  0000000000000000(0000) GS:ffff88871ed00000(0000)
knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000562384d32158 CR3: 000000055cc6a000 CR4: 00000000000006f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 <TASK>
 ? refcount_warn_saturate+0xb5/0x170
 ? __warn+0xa5/0x140
 ? refcount_warn_saturate+0xb5/0x170
 ? report_bug+0x1b1/0x1e0
 ? handle_bug+0x53/0xa0
 ? exc_invalid_op+0x17/0x40
 ? asm_exc_invalid_op+0x1a/0x20
 ? tick_nohz_tick_stopped+0x1e/0x40
 ? refcount_warn_saturate+0xb5/0x170
 ? refcount_warn_saturate+0xb5/0x170
 nfs3svc_release_getacl+0xc9/0xe0
 svc_process_common+0x5db/0xb60
 ? __pfx_svc_process_common+0x10/0x10
 ? __rcu_read_unlock+0x69/0xa0
 ? __pfx_nfsd_dispatch+0x10/0x10
 ? svc_xprt_received+0xa1/0x120
 ? xdr_init_decode+0x11d/0x190
 svc_process+0x2a7/0x330
 svc_handle_xprt+0x69d/0x940
 svc_recv+0x180/0x2d0
 nfsd+0x168/0x200
 ? __pfx_nfsd+0x10/0x10
 kthread+0x1a2/0x1e0
 ? kthread+0xf4/0x1e0
 ? __pfx_kthread+0x10/0x10
 ret_from_fork+0x34/0x60
 ? __pfx_kthread+0x10/0x10
 ret_from_fork_asm+0x1a/0x30
 </TASK>
Kernel panic - not syncing: kernel: panic_on_warn set ...

Clear acl_access/acl_default after posix_acl_release is called to prevent
UAF from being triggered.

Fixes: a257cdd ("[PATCH] NFSD: Add server support for NFSv3 ACLs.")
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/all/20241107014705.2509463-1-lilingfeng@huaweicloud.com/
Signed-off-by: Li Lingfeng <lilingfeng3@huawei.com>
Reviewed-by: Rick Macklem <rmacklem@uoguelph.ca>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Noah Wager <noah.wager@canonical.com>
Signed-off-by: Mehmet Basaran <mehmet.basaran@canonical.com>
delphix-devops-bot pushed a commit that referenced this pull request May 21, 2025
BugLink: https://bugs.launchpad.net/bugs/2106494

[ Upstream commit 6b3d638ca897e099fa99bd6d02189d3176f80a47 ]

KMSAN reported a use-after-free issue in eth_skb_pkt_type()[1]. The
cause of the issue was that eth_skb_pkt_type() accessed skb's data
that didn't contain an Ethernet header. This occurs when
bpf_prog_test_run_xdp() passes an invalid value as the user_data
argument to bpf_test_init().

Fix this by returning an error when user_data is less than ETH_HLEN in
bpf_test_init(). Additionally, remove the check for "if (user_size >
size)" as it is unnecessary.

[1]
BUG: KMSAN: use-after-free in eth_skb_pkt_type include/linux/etherdevice.h:627 [inline]
BUG: KMSAN: use-after-free in eth_type_trans+0x4ee/0x980 net/ethernet/eth.c:165
 eth_skb_pkt_type include/linux/etherdevice.h:627 [inline]
 eth_type_trans+0x4ee/0x980 net/ethernet/eth.c:165
 __xdp_build_skb_from_frame+0x5a8/0xa50 net/core/xdp.c:635
 xdp_recv_frames net/bpf/test_run.c:272 [inline]
 xdp_test_run_batch net/bpf/test_run.c:361 [inline]
 bpf_test_run_xdp_live+0x2954/0x3330 net/bpf/test_run.c:390
 bpf_prog_test_run_xdp+0x148e/0x1b10 net/bpf/test_run.c:1318
 bpf_prog_test_run+0x5b7/0xa30 kernel/bpf/syscall.c:4371
 __sys_bpf+0x6a6/0xe20 kernel/bpf/syscall.c:5777
 __do_sys_bpf kernel/bpf/syscall.c:5866 [inline]
 __se_sys_bpf kernel/bpf/syscall.c:5864 [inline]
 __x64_sys_bpf+0xa4/0xf0 kernel/bpf/syscall.c:5864
 x64_sys_call+0x2ea0/0x3d90 arch/x86/include/generated/asm/syscalls_64.h:322
 do_syscall_x64 arch/x86/entry/common.c:52 [inline]
 do_syscall_64+0xd9/0x1d0 arch/x86/entry/common.c:83
 entry_SYSCALL_64_after_hwframe+0x77/0x7f

Uninit was created at:
 free_pages_prepare mm/page_alloc.c:1056 [inline]
 free_unref_page+0x156/0x1320 mm/page_alloc.c:2657
 __free_pages+0xa3/0x1b0 mm/page_alloc.c:4838
 bpf_ringbuf_free kernel/bpf/ringbuf.c:226 [inline]
 ringbuf_map_free+0xff/0x1e0 kernel/bpf/ringbuf.c:235
 bpf_map_free kernel/bpf/syscall.c:838 [inline]
 bpf_map_free_deferred+0x17c/0x310 kernel/bpf/syscall.c:862
 process_one_work kernel/workqueue.c:3229 [inline]
 process_scheduled_works+0xa2b/0x1b60 kernel/workqueue.c:3310
 worker_thread+0xedf/0x1550 kernel/workqueue.c:3391
 kthread+0x535/0x6b0 kernel/kthread.c:389
 ret_from_fork+0x6e/0x90 arch/x86/kernel/process.c:147
 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244

CPU: 1 UID: 0 PID: 17276 Comm: syz.1.16450 Not tainted 6.12.0-05490-g9bb88c659673 #8
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-3.fc41 04/01/2014

Fixes: be3d72a ("bpf: move user_size out of bpf_test_init")
Reported-by: syzkaller <syzkaller@googlegroups.com>
Suggested-by: Martin KaFai Lau <martin.lau@linux.dev>
Signed-off-by: Shigeru Yoshida <syoshida@redhat.com>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://patch.msgid.link/20250121150643.671650-1-syoshida@redhat.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Noah Wager <noah.wager@canonical.com>
Signed-off-by: Mehmet Basaran <mehmet.basaran@canonical.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

3 participants