Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

V3.14 mx6 spdif #9

Conversation

warped-rudi
Copy link
Contributor

Hi Jon,
this is the SPDIF part of the audio patches. Please have a special eye on the the dts(i) changes. I don't know, if it's necessary to mention both clocks and clock-names, but I found it better for reading.

warped-rudi and others added 7 commits August 2, 2014 20:19
Signed-off-by: Rudi <r.ihle@s-t.de>
Signed-off-by: Rudi <r.ihle@s-t.de>
Add support for the sample rates between 48kHz and 192kHz that are also
supported by HDMI: 88.2kHz, 96kHz, 176.4kHz, 192kHz.

TODO: TESTING

Signed-off-by: Anssi Hannula <anssi.hannula@iki.fi>
Transmitters and receivers may support a 192kHz sample rate.

Tested with Cubox-i imx6 system and Onkyo TX-SR607 receiver.

Signed-off-by: Anssi Hannula <anssi.hannula@iki.fi>
The calculation code does
u64 = (u32 - u32) * 100000;

The 64 bits are of no help here as the type is casted only after the
multiplication, and therefore the result may overflow, possibly causing
inoptimal or wrong clock setup.

Fix the code to cast before multiplication.

Signed-off-by: Anssi Hannula <anssi.hannula@iki.fi>
The spdif driver will enumerate all possible clock sources (internal
and external) for a matching frequency. Since CuBox-i does not have
any external oscillator, we must tell the driver so. Otherwise it may
choose a non-present clock source resulting in no sound at 88.2 and
176.4kHz.

Signed-off-by: Rudi <r.ihle@s-t.de>
This patch allows all 6 bytes ot the SPDIF channel status information
to be transmitted.
@warped-rudi
Copy link
Contributor Author

Since the patches are already included, I close this PR.

@warped-rudi warped-rudi deleted the v3.14-mx6-spdif branch August 22, 2014 21:54
mk01 pushed a commit to mk01/linux-linaro-stable-mx6 that referenced this pull request Sep 3, 2014
In several places, this snippet is used when removing neigh entries:

	list_del(&neigh->list);
	ipoib_neigh_free(neigh);

The list_del() removes neigh from the associated struct ipoib_path, while
ipoib_neigh_free() removes neigh from the device's neigh entry lookup
table.  Both of these operations are protected by the priv->lock
spinlock.  The table however is also protected via RCU, and so naturally
the lock is not held when doing reads.

This leads to a race condition, in which a thread may successfully look
up a neigh entry that has already been deleted from neigh->list.  Since
the previous deletion will have marked the entry with poison, a second
list_del() on the object will cause a panic:

  rabeeh#5 [ffff8802338c3c70] general_protection at ffffffff815108c5
     [exception RIP: list_del+16]
     RIP: ffffffff81289020  RSP: ffff8802338c3d20  RFLAGS: 00010082
     RAX: dead000000200200  RBX: ffff880433e60c88  RCX: 0000000000009e6c
     RDX: 0000000000000246  RSI: ffff8806012ca298  RDI: ffff880433e60c88
     RBP: ffff8802338c3d30   R8: ffff8806012ca2e8   R9: 00000000ffffffff
     R10: 0000000000000001  R11: 0000000000000000  R12: ffff8804346b2020
     R13: ffff88032a3e7540  R14: ffff8804346b26e0  R15: 0000000000000246
     ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0000
  rabeeh#6 [ffff8802338c3d38] ipoib_cm_tx_handler at ffffffffa066fe0a [ib_ipoib]
  linux4kix#7 [ffff8802338c3d98] cm_process_work at ffffffffa05149a7 [ib_cm]
  linux4kix#8 [ffff8802338c3de8] cm_work_handler at ffffffffa05161aa [ib_cm]
  linux4kix#9 [ffff8802338c3e38] worker_thread at ffffffff81090e10
 linux4kix#10 [ffff8802338c3ee8] kthread at ffffffff81096c66
 linux4kix#11 [ffff8802338c3f48] kernel_thread at ffffffff8100c0ca

We move the list_del() into ipoib_neigh_free(), so that deletion happens
only once, after the entry has been successfully removed from the lookup
table.  This same behavior is already used in ipoib_del_neighs_by_gid()
and __ipoib_reap_neigh().

Signed-off-by: Jim Foraker <foraker1@llnl.gov>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Jack Wang <jinpu.wang@profitbricks.com>
Reviewed-by: Shlomo Pongratz <shlomop@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
mk01 pushed a commit to mk01/linux-linaro-stable-mx6 that referenced this pull request Sep 3, 2014
Dave has reported the following lockdep splat:

  =================================
  [ INFO: inconsistent lock state ]
  3.11.0-rc1+ linux4kix#9 Not tainted
  ---------------------------------
  inconsistent {RECLAIM_FS-ON-W} -> {IN-RECLAIM_FS-W} usage.
  kswapd0/49 [HC0[0]:SC0[0]:HE1:SE1] takes:
   (&mapping->i_mmap_mutex){+.+.?.}, at: [<c114971b>] page_referenced+0x87/0x5e3
  {RECLAIM_FS-ON-W} state was registered at:
     mark_held_locks+0x81/0xe7
     lockdep_trace_alloc+0x5e/0xbc
     __alloc_pages_nodemask+0x8b/0x9b6
     __get_free_pages+0x20/0x31
     get_zeroed_page+0x12/0x14
     __pmd_alloc+0x1c/0x6b
     huge_pmd_share+0x265/0x283
     huge_pte_alloc+0x5d/0x71
     hugetlb_fault+0x7c/0x64a
     handle_mm_fault+0x255/0x299
     __do_page_fault+0x142/0x55c
     do_page_fault+0xd/0x16
     error_code+0x6c/0x74
  irq event stamp: 3136917
  hardirqs last  enabled at (3136917):  _raw_spin_unlock_irq+0x27/0x50
  hardirqs last disabled at (3136916):  _raw_spin_lock_irq+0x15/0x78
  softirqs last  enabled at (3136180):  __do_softirq+0x137/0x30f
  softirqs last disabled at (3136175):  irq_exit+0xa8/0xaa
  other info that might help us debug this:
   Possible unsafe locking scenario:
         CPU0
         ----
    lock(&mapping->i_mmap_mutex);
    <Interrupt>
      lock(&mapping->i_mmap_mutex);

  *** DEADLOCK ***
  no locks held by kswapd0/49.

  stack backtrace:
  CPU: 1 PID: 49 Comm: kswapd0 Not tainted 3.11.0-rc1+ linux4kix#9
  Hardware name: Dell Inc.                 Precision WorkStation 490    /0DT031, BIOS A08 04/25/2008
  Call Trace:
    dump_stack+0x4b/0x79
    print_usage_bug+0x1d9/0x1e3
    mark_lock+0x1e0/0x261
    __lock_acquire+0x623/0x17f2
    lock_acquire+0x7d/0x195
    mutex_lock_nested+0x6c/0x3a7
    page_referenced+0x87/0x5e3
    shrink_page_list+0x3d9/0x947
    shrink_inactive_list+0x155/0x4cb
    shrink_lruvec+0x300/0x5ce
    shrink_zone+0x53/0x14e
    kswapd+0x517/0xa75
    kthread+0xa8/0xaa
    ret_from_kernel_thread+0x1b/0x28

which is a false positive caused by hugetlb pmd sharing code which
allocates a new pmd from withing mapping->i_mmap_mutex.  If this
allocation causes reclaim then the lockdep detector complains that we
might self-deadlock.

This is not correct though, because hugetlb pages are not reclaimable so
their mapping will be never touched from the reclaim path.

The patch tells lockup detector that hugetlb i_mmap_mutex is special by
assigning it a separate lockdep class so it won't report possible
deadlocks on unrelated mappings.

[peterz@infradead.org: comment for annotation]
Reported-by: Dave Jones <davej@redhat.com>
Signed-off-by: Michal Hocko <mhocko@suse.cz>
Cc: Peter Zijlstra <peterz@infradead.org>
Reviewed-by: Minchan Kim <minchan@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
mk01 pushed a commit to mk01/linux-linaro-stable-mx6 that referenced this pull request Sep 3, 2014
Commit 1c1d86a ("[media] v4l2: always require v4l2_dev,
rename parent to dev_parent") expects v4l2_dev to be always set.
It converted most of the drivers using the parent field of video_device
to v4l2_dev field. G2D driver did not set the parent field. Hence it got
left out. Without this patch we get the following boot warning and G2D
driver fails to register the video device.
WARNING: CPU: 0 PID: 1 at drivers/media/v4l2-core/v4l2-dev.c:775 __video_register_device+0xfc0/0x1028()
Modules linked in:
CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.11.0-rc1-00001-g1c3e372-dirty linux4kix#9
[<c0014b7c>] (unwind_backtrace+0x0/0xf4) from [<c0011524>] (show_stack+0x10/0x14)
[<c0011524>] (show_stack+0x10/0x14) from [<c041d7a8>] (dump_stack+0x7c/0xb0)
[<c041d7a8>] (dump_stack+0x7c/0xb0) from [<c001dc94>] (warn_slowpath_common+0x6c/0x88)
[<c001dc94>] (warn_slowpath_common+0x6c/0x88) from [<c001dd4c>] (warn_slowpath_null+0x1c/0x24)
[<c001dd4c>] (warn_slowpath_null+0x1c/0x24) from [<c02cf8d4>] (__video_register_device+0xfc0/0x1028)
[<c02cf8d4>] (__video_register_device+0xfc0/0x1028) from [<c0311a94>] (g2d_probe+0x1f8/0x398)
[<c0311a94>] (g2d_probe+0x1f8/0x398) from [<c0247d54>] (platform_drv_probe+0x14/0x18)
[<c0247d54>] (platform_drv_probe+0x14/0x18) from [<c0246b10>] (driver_probe_device+0x108/0x220)
[<c0246b10>] (driver_probe_device+0x108/0x220) from [<c0246cf8>] (__driver_attach+0x8c/0x90)
[<c0246cf8>] (__driver_attach+0x8c/0x90) from [<c0245050>] (bus_for_each_dev+0x60/0x94)
[<c0245050>] (bus_for_each_dev+0x60/0x94) from [<c02462c8>] (bus_add_driver+0x1c0/0x24c)
[<c02462c8>] (bus_add_driver+0x1c0/0x24c) from [<c02472d0>] (driver_register+0x78/0x140)
[<c02472d0>] (driver_register+0x78/0x140) from [<c00087c8>] (do_one_initcall+0xf8/0x144)
[<c00087c8>] (do_one_initcall+0xf8/0x144) from [<c05b29e8>] (kernel_init_freeable+0x13c/0x1d8)
[<c05b29e8>] (kernel_init_freeable+0x13c/0x1d8) from [<c041a108>] (kernel_init+0xc/0x160)
[<c041a108>] (kernel_init+0xc/0x160) from [<c000e2f8>] (ret_from_fork+0x14/0x3c)
---[ end trace 4e0ec028b0028e02 ]---
s5p-g2d 12800000.g2d: Failed to register video device
s5p-g2d: probe of 12800000.g2d failed with error -22

Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org>
Cc: Hans Verkuil <hans.verkuil@cisco.com>
Signed-off-by: Kamil Debski <k.debski@samsung.com>
Signed-off-by: Mauro Carvalho Chehab <m.chehab@samsung.com>
Cc: stable@vger.kernel.org
mk01 pushed a commit to mk01/linux-linaro-stable-mx6 that referenced this pull request Sep 3, 2014
When booting secondary CPUs, announce_cpu() is called to show which cpu has
been brought up. For example:

[    0.402751] smpboot: Booting Node   0, Processors  rabeeh#1 rabeeh#2 rabeeh#3 rabeeh#4 rabeeh#5 OK
[    0.525667] smpboot: Booting Node   1, Processors  rabeeh#6 linux4kix#7 linux4kix#8 linux4kix#9 linux4kix#10 linux4kix#11 OK
[    0.755592] smpboot: Booting Node   0, Processors  linux4kix#12 linux4kix#13 linux4kix#14 linux4kix#15 linux4kix#16 linux4kix#17 OK
[    0.890495] smpboot: Booting Node   1, Processors  linux4kix#18 linux4kix#19 linux4kix#20 linux4kix#21 linux4kix#22 linux4kix#23

But the last "OK" is lost, because 'nr_cpu_ids-1' represents the maximum
possible cpu id. It should use the maximum present cpu id in case not all
CPUs booted up.

Signed-off-by: Libin <huawei.libin@huawei.com>
Cc: <guohanjun@huawei.com>
Cc: <wangyijing@huawei.com>
Cc: <fenghua.yu@intel.com>
Cc: <paul.gortmaker@windriver.com>
Link: http://lkml.kernel.org/r/1378378676-18276-1-git-send-email-huawei.libin@huawei.com
[ tweaked the changelog, removed unnecessary line break, tweaked the format to align the fields vertically. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
mk01 pushed a commit to mk01/linux-linaro-stable-mx6 that referenced this pull request Sep 3, 2014
When parsing lines from objdump a line containing source code starting
with a numeric label is mistaken for a line of disassembly starting with
a memory address.

Current validation fails to recognise that the "memory address" is out
of range and calculates an invalid offset which later causes this
segfault:

Program received signal SIGSEGV, Segmentation fault.
0x0000000000457315 in disasm__calc_percent (notes=0xc98970, evidx=0, offset=143705, end=2127526177, path=0x7fffffffbf50)
    at util/annotate.c:631
631				hits += h->addr[offset++];
(gdb) bt
 #0  0x0000000000457315 in disasm__calc_percent (notes=0xc98970, evidx=0, offset=143705, end=2127526177, path=0x7fffffffbf50)
    at util/annotate.c:631
 rabeeh#1  0x00000000004d65e3 in annotate_browser__calc_percent (browser=0x7fffffffd130, evsel=0xa01da0) at ui/browsers/annotate.c:364
 rabeeh#2  0x00000000004d7433 in annotate_browser__run (browser=0x7fffffffd130, evsel=0xa01da0, hbt=0x0) at ui/browsers/annotate.c:672
 rabeeh#3  0x00000000004d80c9 in symbol__tui_annotate (sym=0xc989a0, map=0xa02660, evsel=0xa01da0, hbt=0x0) at ui/browsers/annotate.c:962
 rabeeh#4  0x00000000004d7aa0 in hist_entry__tui_annotate (he=0xdf73f0, evsel=0xa01da0, hbt=0x0) at ui/browsers/annotate.c:823
 rabeeh#5  0x00000000004dd648 in perf_evsel__hists_browse (evsel=0xa01da0, nr_events=1, helpline=
    0x58b768 "For a higher level overview, try: perf report --sort comm,dso", ev_name=0xa02cd0 "cycles", left_exits=false, hbt=
    0x0, min_pcnt=0, env=0xa011e0) at ui/browsers/hists.c:1659
 rabeeh#6  0x00000000004de372 in perf_evlist__tui_browse_hists (evlist=0xa01520, help=
    0x58b768 "For a higher level overview, try: perf report --sort comm,dso", hbt=0x0, min_pcnt=0, env=0xa011e0)
    at ui/browsers/hists.c:1950
 linux4kix#7  0x000000000042cf6b in __cmd_report (rep=0x7fffffffd6c0) at builtin-report.c:581
 linux4kix#8  0x000000000042e25d in cmd_report (argc=0, argv=0x7fffffffe4b0, prefix=0x0) at builtin-report.c:965
 linux4kix#9  0x000000000041a0e1 in run_builtin (p=0x801548, argc=1, argv=0x7fffffffe4b0) at perf.c:319
 linux4kix#10 0x000000000041a319 in handle_internal_command (argc=1, argv=0x7fffffffe4b0) at perf.c:376
 linux4kix#11 0x000000000041a465 in run_argv (argcp=0x7fffffffe38c, argv=0x7fffffffe380) at perf.c:420
 linux4kix#12 0x000000000041a707 in main (argc=1, argv=0x7fffffffe4b0) at perf.c:521

After the fix is applied the symbol can be annotated showing the
problematic line "1:      rep"

copy_user_generic_string  /usr/lib/debug/lib/modules/3.9.10-100.fc17.x86_64/vmlinux
             */
            ENTRY(copy_user_generic_string)
                    CFI_STARTPROC
                    ASM_STAC
                    andl %edx,%edx
              and    %edx,%edx
                    jz 4f
              je     37
                    cmpl $8,%edx
              cmp    $0x8,%edx
                    jb 2f           /* less than 8 bytes, go to byte copy loop */
              jb     33
                    ALIGN_DESTINATION
              mov    %edi,%ecx
              and    $0x7,%ecx
              je     28
              sub    $0x8,%ecx
              neg    %ecx
              sub    %ecx,%edx
        1a:   mov    (%rsi),%al
              mov    %al,(%rdi)
              inc    %rsi
              inc    %rdi
              dec    %ecx
              jne    1a
                    movl %edx,%ecx
        28:   mov    %edx,%ecx
                    shrl $3,%ecx
              shr    $0x3,%ecx
                    andl $7,%edx
              and    $0x7,%edx
            1:      rep
100.00        rep    movsq %ds:(%rsi),%es:(%rdi)
                    movsq
            2:      movl %edx,%ecx
        33:   mov    %edx,%ecx
            3:      rep
              rep    movsb %ds:(%rsi),%es:(%rdi)
                    movsb
            4:      xorl %eax,%eax
        37:   xor    %eax,%eax
              data32 xchg %ax,%ax
                    ASM_CLAC
                    ret
              retq

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1379009721-27667-1-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
mk01 pushed a commit to mk01/linux-linaro-stable-mx6 that referenced this pull request Sep 3, 2014
Some ARC SMP systems lack native atomic R-M-W (LLOCK/SCOND) insns and
can only use atomic EX insn (reg with mem) to build higher level R-M-W
primitives. This includes a SystemC based SMP simulation model.

So rwlocks need to use a protecting spinlock for atomic cmp-n-exchange
operation to update reader(s)/writer count.

The spinlock operation itself looks as follows:

	mov reg, 1		; 1=locked, 0=unlocked
retry:
	EX reg, [lock]		; load existing, store 1, atomically
	BREQ reg, 1, rety	; if already locked, retry

In single-threaded simulation, SystemC alternates between the 2 cores
with "N" insn each based scheduling. Additionally for insn with global
side effect, such as EX writing to shared mem, a core switch is
enforced too.

Given that, 2 cores doing a repeated EX on same location, Linux often
got into a livelock e.g. when both cores were fiddling with tasklist
lock (gdbserver / hackbench) for read/write respectively as the
sequence diagram below shows:

           core1                                   core2
         --------                                --------
1. spin lock [EX r=0, w=1] - LOCKED
2. rwlock(Read)            - LOCKED
3. spin unlock  [ST 0]     - UNLOCKED
                                         spin lock [EX r=0,w=1] - LOCKED
                      -- resched core 1----

5. spin lock [EX r=1] - ALREADY-LOCKED

                      -- resched core 2----
6.                                       rwlock(Write) - READER-LOCKED
7.                                       spin unlock [ST 0]
8.                                       rwlock failed, retry again

9.                                       spin lock  [EX r=0, w=1]
                      -- resched core 1----

10  spinlock locked in linux4kix#9, retry rabeeh#5
11. spin lock [EX gets 1]
                      -- resched core 2----
...
...

The fix was to unlock using the EX insn too (step 7), to trigger another
SystemC scheduling pass which would let core1 proceed, eliding the
livelock.

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
mk01 pushed a commit to mk01/linux-linaro-stable-mx6 that referenced this pull request Sep 3, 2014
As the new x86 CPU bootup printout format code maintainer, I am
taking immediate action to improve and clean (and thus indulge
my OCD) the reporting of the cores when coming up online.

Fix padding to a right-hand alignment, cleanup code and bind
reporting width to the max number of supported CPUs on the
system, like this:

 [    0.074509] smpboot: Booting Node   0, Processors:      rabeeh#1  rabeeh#2  rabeeh#3  rabeeh#4  rabeeh#5  rabeeh#6  linux4kix#7 OK
 [    0.644008] smpboot: Booting Node   1, Processors:  linux4kix#8  linux4kix#9 linux4kix#10 linux4kix#11 linux4kix#12 linux4kix#13 linux4kix#14 linux4kix#15 OK
 [    1.245006] smpboot: Booting Node   2, Processors: linux4kix#16 linux4kix#17 linux4kix#18 linux4kix#19 linux4kix#20 linux4kix#21 linux4kix#22 linux4kix#23 OK
 [    1.864005] smpboot: Booting Node   3, Processors: linux4kix#24 linux4kix#25 linux4kix#26 linux4kix#27 linux4kix#28 #29 #30 #31 OK
 [    2.489005] smpboot: Booting Node   4, Processors: #32 #33 #34 #35 #36 #37 #38 #39 OK
 [    3.093005] smpboot: Booting Node   5, Processors: #40 #41 #42 #43 #44 #45 #46 #47 OK
 [    3.698005] smpboot: Booting Node   6, Processors: #48 #49 #50 #51 #52 #53 #54 #55 OK
 [    4.304005] smpboot: Booting Node   7, Processors: #56 #57 #58 #59 #60 #61 #62 #63 OK
 [    4.961413] Brought up 64 CPUs

and this:

 [    0.072367] smpboot: Booting Node   0, Processors:    rabeeh#1 rabeeh#2 rabeeh#3 rabeeh#4 rabeeh#5 rabeeh#6 linux4kix#7 OK
 [    0.686329] Brought up 8 CPUs

Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Libin <huawei.libin@huawei.com>
Cc: wangyijing@huawei.com
Cc: fenghua.yu@intel.com
Cc: guohanjun@huawei.com
Cc: paul.gortmaker@windriver.com
Link: http://lkml.kernel.org/r/20130927143554.GF4422@pd.tnic
Signed-off-by: Ingo Molnar <mingo@kernel.org>
mk01 pushed a commit to mk01/linux-linaro-stable-mx6 that referenced this pull request Sep 3, 2014
Turn it into (for example):

[    0.073380] x86: Booting SMP configuration:
[    0.074005] .... node   #0, CPUs:          rabeeh#1   rabeeh#2   rabeeh#3   rabeeh#4   rabeeh#5   rabeeh#6   linux4kix#7
[    0.603005] .... node   rabeeh#1, CPUs:     linux4kix#8   linux4kix#9  linux4kix#10  linux4kix#11  linux4kix#12  linux4kix#13  linux4kix#14  linux4kix#15
[    1.200005] .... node   rabeeh#2, CPUs:    linux4kix#16  linux4kix#17  linux4kix#18  linux4kix#19  linux4kix#20  linux4kix#21  linux4kix#22  linux4kix#23
[    1.796005] .... node   rabeeh#3, CPUs:    linux4kix#24  linux4kix#25  linux4kix#26  linux4kix#27  linux4kix#28  #29  #30  #31
[    2.393005] .... node   rabeeh#4, CPUs:    #32  #33  #34  #35  #36  #37  #38  #39
[    2.996005] .... node   rabeeh#5, CPUs:    #40  #41  #42  #43  #44  #45  #46  #47
[    3.600005] .... node   rabeeh#6, CPUs:    #48  #49  #50  #51  #52  #53  #54  #55
[    4.202005] .... node   linux4kix#7, CPUs:    #56  #57  #58  #59  #60  #61  #62  #63
[    4.811005] .... node   linux4kix#8, CPUs:    #64  #65  #66  #67  #68  #69  #70  #71
[    5.421006] .... node   linux4kix#9, CPUs:    #72  #73  #74  #75  #76  #77  #78  #79
[    6.032005] .... node  linux4kix#10, CPUs:    #80  #81  #82  #83  #84  #85  #86  #87
[    6.648006] .... node  linux4kix#11, CPUs:    #88  #89  #90  #91  #92  #93  #94  #95
[    7.262005] .... node  linux4kix#12, CPUs:    #96  #97  #98  #99 #100 #101 #102 #103
[    7.865005] .... node  linux4kix#13, CPUs:   #104 #105 #106 #107 #108 #109 #110 #111
[    8.466005] .... node  linux4kix#14, CPUs:   #112 #113 #114 #115 #116 #117 #118 #119
[    9.073006] .... node  linux4kix#15, CPUs:   #120 #121 #122 #123 #124 #125 #126 #127
[    9.679901] x86: Booted up 16 nodes, 128 CPUs

and drop useless elements.

Change num_digits() to hpa's division-avoiding, cell-phone-typed
version which he went at great lengths and pains to submit on a
Saturday evening.

Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: huawei.libin@huawei.com
Cc: wangyijing@huawei.com
Cc: fenghua.yu@intel.com
Cc: guohanjun@huawei.com
Cc: paul.gortmaker@windriver.com
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20130930095624.GB16383@pd.tnic
Signed-off-by: Ingo Molnar <mingo@kernel.org>
mk01 pushed a commit to mk01/linux-linaro-stable-mx6 that referenced this pull request Sep 3, 2014
Andrey reported the following report:

ERROR: AddressSanitizer: heap-buffer-overflow on address ffff8800359c99f3
ffff8800359c99f3 is located 0 bytes to the right of 243-byte region [ffff8800359c9900, ffff8800359c99f3)
Accessed by thread T13003:
  #0 ffffffff810dd2da (asan_report_error+0x32a/0x440)
  rabeeh#1 ffffffff810dc6b0 (asan_check_region+0x30/0x40)
  rabeeh#2 ffffffff810dd4d3 (__tsan_write1+0x13/0x20)
  rabeeh#3 ffffffff811cd19e (ftrace_regex_release+0x1be/0x260)
  rabeeh#4 ffffffff812a1065 (__fput+0x155/0x360)
  rabeeh#5 ffffffff812a12de (____fput+0x1e/0x30)
  rabeeh#6 ffffffff8111708d (task_work_run+0x10d/0x140)
  linux4kix#7 ffffffff810ea043 (do_exit+0x433/0x11f0)
  linux4kix#8 ffffffff810eaee4 (do_group_exit+0x84/0x130)
  linux4kix#9 ffffffff810eafb1 (SyS_exit_group+0x21/0x30)
  linux4kix#10 ffffffff81928782 (system_call_fastpath+0x16/0x1b)

Allocated by thread T5167:
  #0 ffffffff810dc778 (asan_slab_alloc+0x48/0xc0)
  rabeeh#1 ffffffff8128337c (__kmalloc+0xbc/0x500)
  rabeeh#2 ffffffff811d9d54 (trace_parser_get_init+0x34/0x90)
  rabeeh#3 ffffffff811cd7b3 (ftrace_regex_open+0x83/0x2e0)
  rabeeh#4 ffffffff811cda7d (ftrace_filter_open+0x2d/0x40)
  rabeeh#5 ffffffff8129b4ff (do_dentry_open+0x32f/0x430)
  rabeeh#6 ffffffff8129b668 (finish_open+0x68/0xa0)
  linux4kix#7 ffffffff812b66ac (do_last+0xb8c/0x1710)
  linux4kix#8 ffffffff812b7350 (path_openat+0x120/0xb50)
  linux4kix#9 ffffffff812b8884 (do_filp_open+0x54/0xb0)
  linux4kix#10 ffffffff8129d36c (do_sys_open+0x1ac/0x2c0)
  linux4kix#11 ffffffff8129d4b7 (SyS_open+0x37/0x50)
  linux4kix#12 ffffffff81928782 (system_call_fastpath+0x16/0x1b)

Shadow bytes around the buggy address:
  ffff8800359c9700: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  ffff8800359c9780: fd fd fd fd fd fd fd fd fa fa fa fa fa fa fa fa
  ffff8800359c9800: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  ffff8800359c9880: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  ffff8800359c9900: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
=>ffff8800359c9980: 00 00 00 00 00 00 00 00 00 00 00 00 00 00[03]fb
  ffff8800359c9a00: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  ffff8800359c9a80: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  ffff8800359c9b00: fa fa fa fa fa fa fa fa 00 00 00 00 00 00 00 00
  ffff8800359c9b80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  ffff8800359c9c00: 00 00 00 00 00 00 00 00 fa fa fa fa fa fa fa fa
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07
  Heap redzone:          fa
  Heap kmalloc redzone:  fb
  Freed heap region:     fd
  Shadow gap:            fe

The out-of-bounds access happens on 'parser->buffer[parser->idx] = 0;'

Although the crash happened in ftrace_regex_open() the real bug
occurred in trace_get_user() where there's an incrementation to
parser->idx without a check against the size. The way it is triggered
is if userspace sends in 128 characters (EVENT_BUF_SIZE + 1), the loop
that reads the last character stores it and then breaks out because
there is no more characters. Then the last character is read to determine
what to do next, and the index is incremented without checking size.

Then the caller of trace_get_user() usually nulls out the last character
with a zero, but since the index is equal to the size, it writes a nul
character after the allocated space, which can corrupt memory.

Luckily, only root user has write access to this file.

Link: http://lkml.kernel.org/r/20131009222323.04fd1a0d@gandalf.local.home

Reported-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
mk01 pushed a commit to mk01/linux-linaro-stable-mx6 that referenced this pull request Sep 3, 2014
…ux/kernel/git/tip/tip

Pull x86 boot changes from Ingo Molnar:
 "Two changes that prettify and compactify the SMP bootup output from:

     smpboot: Booting Node   0, Processors  rabeeh#1 rabeeh#2 rabeeh#3 OK
     smpboot: Booting Node   1, Processors  rabeeh#4 rabeeh#5 rabeeh#6 linux4kix#7 OK
     smpboot: Booting Node   2, Processors  linux4kix#8 linux4kix#9 linux4kix#10 linux4kix#11 OK
     smpboot: Booting Node   3, Processors  linux4kix#12 linux4kix#13 linux4kix#14 linux4kix#15 OK
     Brought up 16 CPUs

  to something like:

     x86: Booting SMP configuration:
     .... node  #0, CPUs:        rabeeh#1  rabeeh#2  rabeeh#3
     .... node  rabeeh#1, CPUs:    rabeeh#4  rabeeh#5  rabeeh#6  linux4kix#7
     .... node  rabeeh#2, CPUs:    linux4kix#8  linux4kix#9 linux4kix#10 linux4kix#11
     .... node  rabeeh#3, CPUs:   linux4kix#12 linux4kix#13 linux4kix#14 linux4kix#15
     x86: Booted up 4 nodes, 16 CPUs"

* 'x86-boot-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/boot: Further compress CPUs bootup message
  x86: Improve the printout of the SMP bootup CPU table
mk01 pushed a commit to mk01/linux-linaro-stable-mx6 that referenced this pull request Sep 3, 2014
Running a kernel with CONFIG_PROVE_RCU=y yields the following diagnostic:

===============================
[ INFO: suspicious RCU usage. ]
3.12.0-rc5-kvm+ linux4kix#9 Not tainted
-------------------------------

include/linux/kvm_host.h:473 suspicious rcu_dereference_check() usage!

other info that might help us debug this:

rcu_scheduler_active = 1, debug_locks = 0
1 lock held by qemu-system-ppc/4831:

stack backtrace:
CPU: 28 PID: 4831 Comm: qemu-system-ppc Not tainted 3.12.0-rc5-kvm+ linux4kix#9
Call Trace:
[c000000be462b2a0] [c00000000001644c] .show_stack+0x7c/0x1f0 (unreliable)
[c000000be462b370] [c000000000ad57c0] .dump_stack+0x88/0xb4
[c000000be462b3f0] [c0000000001315e8] .lockdep_rcu_suspicious+0x138/0x180
[c000000be462b480] [c00000000007862c] .gfn_to_memslot+0x13c/0x170
[c000000be462b510] [c00000000007d384] .gfn_to_hva_prot+0x24/0x90
[c000000be462b5a0] [c00000000007d420] .kvm_read_guest_page+0x30/0xd0
[c000000be462b630] [c00000000007d528] .kvm_read_guest+0x68/0x110
[c000000be462b6e0] [c000000000084594] .kvmppc_rtas_hcall+0x34/0x180
[c000000be462b7d0] [c000000000097934] .kvmppc_pseries_do_hcall+0x74/0x830
[c000000be462b880] [c0000000000990e8] .kvmppc_vcpu_run_hv+0xff8/0x15a0
[c000000be462b9e0] [c0000000000839cc] .kvmppc_vcpu_run+0x2c/0x40
[c000000be462ba50] [c0000000000810b4] .kvm_arch_vcpu_ioctl_run+0x54/0x1b0
[c000000be462bae0] [c00000000007b508] .kvm_vcpu_ioctl+0x478/0x730
[c000000be462bca0] [c00000000025532c] .do_vfs_ioctl+0x4dc/0x7a0
[c000000be462bd80] [c0000000002556b4] .SyS_ioctl+0xc4/0xe0
[c000000be462be30] [c000000000009ee4] syscall_exit+0x0/0x98

To fix this, we take the SRCU read lock around the kvmppc_rtas_hcall()
call.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
mk01 pushed a commit to mk01/linux-linaro-stable-mx6 that referenced this pull request Sep 3, 2014
 bridge dev

When the following commands are executed:

brctl addbr br0
ifconfig br0 hw ether <addr>
rmmod bridge

The calltrace will occur:

[  563.312114] device eth1 left promiscuous mode
[  563.312188] br0: port 1(eth1) entered disabled state
[  563.468190] kmem_cache_destroy bridge_fdb_cache: Slab cache still has objects
[  563.468197] CPU: 6 PID: 6982 Comm: rmmod Tainted: G           O 3.12.0-0.7-default+ linux4kix#9
[  563.468199] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007
[  563.468200]  0000000000000880 ffff88010f111e98 ffffffff814d1c92 ffff88010f111eb8
[  563.468204]  ffffffff81148efd ffff88010f111eb8 0000000000000000 ffff88010f111ec8
[  563.468206]  ffffffffa062a270 ffff88010f111ed8 ffffffffa063ac76 ffff88010f111f78
[  563.468209] Call Trace:
[  563.468218]  [<ffffffff814d1c92>] dump_stack+0x6a/0x78
[  563.468234]  [<ffffffff81148efd>] kmem_cache_destroy+0xfd/0x100
[  563.468242]  [<ffffffffa062a270>] br_fdb_fini+0x10/0x20 [bridge]
[  563.468247]  [<ffffffffa063ac76>] br_deinit+0x4e/0x50 [bridge]
[  563.468254]  [<ffffffff810c7dc9>] SyS_delete_module+0x199/0x2b0
[  563.468259]  [<ffffffff814e0922>] system_call_fastpath+0x16/0x1b
[  570.377958] Bridge firewalling registered

--------------------------- cut here -------------------------------

The reason is that when the bridge dev's address is changed, the
br_fdb_change_mac_address() will add new address in fdb, but when
the bridge was removed, the address entry in the fdb did not free,
the bridge_fdb_cache still has objects when destroy the cache, Fix
this by flushing the bridge address entry when removing the bridge.

v2: according to the Toshiaki Makita and Vlad's suggestion, I only
    delete the vlan0 entry, it still have a leak here if the vlan id
    is other number, so I need to call fdb_delete_by_port(br, NULL, 1)
    to flush all entries whose dst is NULL for the bridge.

Suggested-by: Toshiaki Makita <toshiaki.makita1@gmail.com>
Suggested-by: Vlad Yasevich <vyasevich@gmail.com>
Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
mk01 pushed a commit to mk01/linux-linaro-stable-mx6 that referenced this pull request Sep 3, 2014
Dave Jones reported a use after free in UDP stack :

[ 5059.434216] =========================
[ 5059.434314] [ BUG: held lock freed! ]
[ 5059.434420] 3.13.0-rc3+ linux4kix#9 Not tainted
[ 5059.434520] -------------------------
[ 5059.434620] named/863 is freeing memory ffff88005e960000-ffff88005e96061f, with a lock still held there!
[ 5059.434815]  (slock-AF_INET){+.-...}, at: [<ffffffff8149bd21>] udp_queue_rcv_skb+0xd1/0x4b0
[ 5059.435012] 3 locks held by named/863:
[ 5059.435086]  #0:  (rcu_read_lock){.+.+..}, at: [<ffffffff8143054d>] __netif_receive_skb_core+0x11d/0x940
[ 5059.435295]  rabeeh#1:  (rcu_read_lock){.+.+..}, at: [<ffffffff81467a5e>] ip_local_deliver_finish+0x3e/0x410
[ 5059.435500]  rabeeh#2:  (slock-AF_INET){+.-...}, at: [<ffffffff8149bd21>] udp_queue_rcv_skb+0xd1/0x4b0
[ 5059.435734]
stack backtrace:
[ 5059.435858] CPU: 0 PID: 863 Comm: named Not tainted 3.13.0-rc3+ linux4kix#9 [loadavg: 0.21 0.06 0.06 1/115 1365]
[ 5059.436052] Hardware name:                  /D510MO, BIOS MOPNV10J.86A.0175.2010.0308.0620 03/08/2010
[ 5059.436223]  0000000000000002 ffff88007e203ad8 ffffffff8153a372 ffff8800677130e0
[ 5059.436390]  ffff88007e203b10 ffffffff8108cafa ffff88005e960000 ffff88007b00cfc0
[ 5059.436554]  ffffea00017a5800 ffffffff8141c490 0000000000000246 ffff88007e203b48
[ 5059.436718] Call Trace:
[ 5059.436769]  <IRQ>  [<ffffffff8153a372>] dump_stack+0x4d/0x66
[ 5059.436904]  [<ffffffff8108cafa>] debug_check_no_locks_freed+0x15a/0x160
[ 5059.437037]  [<ffffffff8141c490>] ? __sk_free+0x110/0x230
[ 5059.437147]  [<ffffffff8112da2a>] kmem_cache_free+0x6a/0x150
[ 5059.437260]  [<ffffffff8141c490>] __sk_free+0x110/0x230
[ 5059.437364]  [<ffffffff8141c5c9>] sk_free+0x19/0x20
[ 5059.437463]  [<ffffffff8141cb25>] sock_edemux+0x25/0x40
[ 5059.437567]  [<ffffffff8141c181>] sock_queue_rcv_skb+0x81/0x280
[ 5059.437685]  [<ffffffff8149bd21>] ? udp_queue_rcv_skb+0xd1/0x4b0
[ 5059.437805]  [<ffffffff81499c82>] __udp_queue_rcv_skb+0x42/0x240
[ 5059.437925]  [<ffffffff81541d25>] ? _raw_spin_lock+0x65/0x70
[ 5059.438038]  [<ffffffff8149bebb>] udp_queue_rcv_skb+0x26b/0x4b0
[ 5059.438155]  [<ffffffff8149c712>] __udp4_lib_rcv+0x152/0xb00
[ 5059.438269]  [<ffffffff8149d7f5>] udp_rcv+0x15/0x20
[ 5059.438367]  [<ffffffff81467b2f>] ip_local_deliver_finish+0x10f/0x410
[ 5059.438492]  [<ffffffff81467a5e>] ? ip_local_deliver_finish+0x3e/0x410
[ 5059.438621]  [<ffffffff81468653>] ip_local_deliver+0x43/0x80
[ 5059.438733]  [<ffffffff81467f70>] ip_rcv_finish+0x140/0x5a0
[ 5059.438843]  [<ffffffff81468926>] ip_rcv+0x296/0x3f0
[ 5059.438945]  [<ffffffff81430b72>] __netif_receive_skb_core+0x742/0x940
[ 5059.439074]  [<ffffffff8143054d>] ? __netif_receive_skb_core+0x11d/0x940
[ 5059.442231]  [<ffffffff8108c81d>] ? trace_hardirqs_on+0xd/0x10
[ 5059.442231]  [<ffffffff81430d83>] __netif_receive_skb+0x13/0x60
[ 5059.442231]  [<ffffffff81431c1e>] netif_receive_skb+0x1e/0x1f0
[ 5059.442231]  [<ffffffff814334e0>] napi_gro_receive+0x70/0xa0
[ 5059.442231]  [<ffffffffa01de426>] rtl8169_poll+0x166/0x700 [r8169]
[ 5059.442231]  [<ffffffff81432bc9>] net_rx_action+0x129/0x1e0
[ 5059.442231]  [<ffffffff810478cd>] __do_softirq+0xed/0x240
[ 5059.442231]  [<ffffffff81047e25>] irq_exit+0x125/0x140
[ 5059.442231]  [<ffffffff81004241>] do_IRQ+0x51/0xc0
[ 5059.442231]  [<ffffffff81542bef>] common_interrupt+0x6f/0x6f

We need to keep a reference on the socket, by using skb_steal_sock()
at the right place.

Note that another patch is needed to fix a race in
udp_sk_rx_dst_set(), as we hold no lock protecting the dst.

Fixes: 421b388 ("udp: ipv4: Add udp early demux")
Reported-by: Dave Jones <davej@redhat.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Shawn Bohrer <sbohrer@rgmadvisors.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
mk01 pushed a commit to mk01/linux-linaro-stable-mx6 that referenced this pull request Sep 3, 2014
In function free_dmar_iommu(), it sets IRQ handler data to NULL
before calling free_irq(), which will cause invalid memory access
because free_irq() will access IRQ handler data when calling
function dmar_msi_mask(). So only set IRQ handler data to NULL
after calling free_irq().

Sample stack dump:
[   13.094010] BUG: unable to handle kernel NULL pointer dereference at 0000000000000048
[   13.103215] IP: [<ffffffff810a97cd>] __lock_acquire+0x4d/0x12a0
[   13.110104] PGD 0
[   13.112614] Oops: 0000 [rabeeh#1] SMP
[   13.116585] Modules linked in:
[   13.120260] CPU: 60 PID: 1 Comm: swapper/0 Tainted: G        W    3.13.0-rc1-gerry+ linux4kix#9
[   13.129367] Hardware name: Intel Corporation LH Pass ........../SVRBD-ROW_T, BIOS SE5C600.86B.99.99.x059.091020121352 09/10/2012
[   13.142555] task: ffff88042dd38010 ti: ffff88042dd32000 task.ti: ffff88042dd32000
[   13.151179] RIP: 0010:[<ffffffff810a97cd>]  [<ffffffff810a97cd>] __lock_acquire+0x4d/0x12a0
[   13.160867] RSP: 0000:ffff88042dd33b78  EFLAGS: 00010046
[   13.166969] RAX: 0000000000000046 RBX: 0000000000000002 RCX: 0000000000000000
[   13.175122] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000048
[   13.183274] RBP: ffff88042dd33bd8 R08: 0000000000000002 R09: 0000000000000001
[   13.191417] R10: 0000000000000000 R11: 0000000000000001 R12: ffff88042dd38010
[   13.199571] R13: 0000000000000000 R14: 0000000000000048 R15: 0000000000000000
[   13.207725] FS:  0000000000000000(0000) GS:ffff88103f200000(0000) knlGS:0000000000000000
[   13.217014] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   13.223596] CR2: 0000000000000048 CR3: 0000000001a0b000 CR4: 00000000000407e0
[   13.231747] Stack:
[   13.234160]  0000000000000004 0000000000000046 ffff88042dd33b98 ffffffff810a567d
[   13.243059]  ffff88042dd33c08 ffffffff810bb14c ffffffff828995a0 0000000000000046
[   13.251969]  0000000000000000 0000000000000000 0000000000000002 0000000000000000
[   13.260862] Call Trace:
[   13.263775]  [<ffffffff810a567d>] ? trace_hardirqs_off+0xd/0x10
[   13.270571]  [<ffffffff810bb14c>] ? vprintk_emit+0x23c/0x570
[   13.277058]  [<ffffffff810ab1e3>] lock_acquire+0x93/0x120
[   13.283269]  [<ffffffff814623f7>] ? dmar_msi_mask+0x47/0x70
[   13.289677]  [<ffffffff8156b449>] _raw_spin_lock_irqsave+0x49/0x90
[   13.296748]  [<ffffffff814623f7>] ? dmar_msi_mask+0x47/0x70
[   13.303153]  [<ffffffff814623f7>] dmar_msi_mask+0x47/0x70
[   13.309354]  [<ffffffff810c0d93>] irq_shutdown+0x53/0x60
[   13.315467]  [<ffffffff810bdd9d>] __free_irq+0x26d/0x280
[   13.321580]  [<ffffffff810be920>] free_irq+0xf0/0x180
[   13.327395]  [<ffffffff81466591>] free_dmar_iommu+0x271/0x2b0
[   13.333996]  [<ffffffff810a947d>] ? trace_hardirqs_on+0xd/0x10
[   13.340696]  [<ffffffff81461a17>] free_iommu+0x17/0x50
[   13.346597]  [<ffffffff81dc75a5>] init_dmars+0x691/0x77a
[   13.352711]  [<ffffffff81dc7afd>] intel_iommu_init+0x351/0x438
[   13.359400]  [<ffffffff81d8a711>] ? iommu_setup+0x27d/0x27d
[   13.365806]  [<ffffffff81d8a739>] pci_iommu_init+0x28/0x52
[   13.372114]  [<ffffffff81000342>] do_one_initcall+0x122/0x180
[   13.378707]  [<ffffffff81077738>] ? parse_args+0x1e8/0x320
[   13.385016]  [<ffffffff81d850e8>] kernel_init_freeable+0x1e1/0x26c
[   13.392100]  [<ffffffff81d84833>] ? do_early_param+0x88/0x88
[   13.398596]  [<ffffffff8154f8b0>] ? rest_init+0xd0/0xd0
[   13.404614]  [<ffffffff8154f8be>] kernel_init+0xe/0x130
[   13.410626]  [<ffffffff81574d6c>] ret_from_fork+0x7c/0xb0
[   13.416829]  [<ffffffff8154f8b0>] ? rest_init+0xd0/0xd0
[   13.422842] Code: ec 99 00 85 c0 8b 05 53 05 a5 00 41 0f 45 d8 85 c0 0f 84 ff 00 00 00 8b 05 99 f9 7e 01 49 89 fe 41 89 f7 85 c0 0f 84 03 01 00 00 <49> 8b 06 be 01 00 00 00 48 3d c0 0e 01 82 0f 44 de 41 83 ff 01
[   13.450191] RIP  [<ffffffff810a97cd>] __lock_acquire+0x4d/0x12a0
[   13.458598]  RSP <ffff88042dd33b78>
[   13.462671] CR2: 0000000000000048
[   13.466551] ---[ end trace c5bd26a37c81d760 ]---

Reviewed-by: Yijing Wang <wangyijing@huawei.com>
Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Signed-off-by: Joerg Roedel <joro@8bytes.org>
mk01 pushed a commit to mk01/linux-linaro-stable-mx6 that referenced this pull request Sep 3, 2014
vmxnet3's netpoll driver is incorrectly coded.  It directly calls
vmxnet3_do_poll, which is the driver internal napi poll routine.  As the netpoll
controller method doesn't block real napi polls in any way, there is a potential
for race conditions in which the netpoll controller method and the napi poll
method run concurrently.  The result is data corruption causing panics such as this
one recently observed:
PID: 1371   TASK: ffff88023762caa0  CPU: 1   COMMAND: "rs:main Q:Reg"
 #0 [ffff88023abd5780] machine_kexec at ffffffff81038f3b
 rabeeh#1 [ffff88023abd57e0] crash_kexec at ffffffff810c5d92
 rabeeh#2 [ffff88023abd58b0] oops_end at ffffffff8152b570
 rabeeh#3 [ffff88023abd58e0] die at ffffffff81010e0b
 rabeeh#4 [ffff88023abd5910] do_trap at ffffffff8152add4
 rabeeh#5 [ffff88023abd5970] do_invalid_op at ffffffff8100cf95
 rabeeh#6 [ffff88023abd5a10] invalid_op at ffffffff8100bf9b
    [exception RIP: vmxnet3_rq_rx_complete+1968]
    RIP: ffffffffa00f1e80  RSP: ffff88023abd5ac8  RFLAGS: 00010086
    RAX: 0000000000000000  RBX: ffff88023b5dcee0  RCX: 00000000000000c0
    RDX: 0000000000000000  RSI: 00000000000005f2  RDI: ffff88023b5dcee0
    RBP: ffff88023abd5b48   R8: 0000000000000000   R9: ffff88023a3b6048
    R10: 0000000000000000  R11: 0000000000000002  R12: ffff8802398d4cd8
    R13: ffff88023af35140  R14: ffff88023b60c890  R15: 0000000000000000
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
 linux4kix#7 [ffff88023abd5b50] vmxnet3_do_poll at ffffffffa00f204a [vmxnet3]
 linux4kix#8 [ffff88023abd5b80] vmxnet3_netpoll at ffffffffa00f209c [vmxnet3]
 linux4kix#9 [ffff88023abd5ba0] netpoll_poll_dev at ffffffff81472bb7

The fix is to do as other drivers do, and have the poll controller call the top
half interrupt handler, which schedules a napi poll properly to recieve frames

Tested by myself, successfully.

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
CC: Shreyas Bhatewara <sbhatewara@vmware.com>
CC: "VMware, Inc." <pv-drivers@vmware.com>
CC: "David S. Miller" <davem@davemloft.net>
CC: stable@vger.kernel.org
Reviewed-by: Shreyas N Bhatewara <sbhatewara@vmware.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
mk01 pushed a commit to mk01/linux-linaro-stable-mx6 that referenced this pull request Sep 3, 2014
commit b34aa86f12e8848ba453215602c8c50fa63c4cb3 upstream.

Mmapping a comedi data buffer with lockdep checking enabled produced the
following kernel debug messages:

======================================================
[ INFO: possible circular locking dependency detected ]
3.5.0-rc3-ija1+ linux4kix#9 Tainted: G         C
-------------------------------------------------------
comedi_test/4160 is trying to acquire lock:
 (&dev->mutex#2){+.+.+.}, at: [<ffffffffa00313f4>] comedi_mmap+0x57/0x1d9 [comedi]

but task is already holding lock:
 (&mm->mmap_sem){++++++}, at: [<ffffffff810c96fe>] vm_mmap_pgoff+0x41/0x76

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> rabeeh#1 (&mm->mmap_sem){++++++}:
       [<ffffffff8106d0e8>] lock_acquire+0x97/0x105
       [<ffffffff810ce3bc>] might_fault+0x6d/0x90
       [<ffffffffa0031ffb>] do_devinfo_ioctl.isra.7+0x11e/0x14c [comedi]
       [<ffffffffa003227f>] comedi_unlocked_ioctl+0x256/0xe48 [comedi]
       [<ffffffff810f7fcd>] vfs_ioctl+0x18/0x34
       [<ffffffff810f87fd>] do_vfs_ioctl+0x382/0x43c
       [<ffffffff810f88f9>] sys_ioctl+0x42/0x65
       [<ffffffff81415c62>] system_call_fastpath+0x16/0x1b

-> #0 (&dev->mutex#2){+.+.+.}:
       [<ffffffff8106c528>] __lock_acquire+0x101d/0x1591
       [<ffffffff8106d0e8>] lock_acquire+0x97/0x105
       [<ffffffff8140c894>] mutex_lock_nested+0x46/0x2a4
       [<ffffffffa00313f4>] comedi_mmap+0x57/0x1d9 [comedi]
       [<ffffffff810d5816>] mmap_region+0x281/0x492
       [<ffffffff810d5c92>] do_mmap_pgoff+0x26b/0x2a7
       [<ffffffff810c971a>] vm_mmap_pgoff+0x5d/0x76
       [<ffffffff810d493f>] sys_mmap_pgoff+0xc7/0x10d
       [<ffffffff81004d36>] sys_mmap+0x16/0x20
       [<ffffffff81415c62>] system_call_fastpath+0x16/0x1b

other info that might help us debug this:

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(&mm->mmap_sem);
                               lock(&dev->mutex#2);
                               lock(&mm->mmap_sem);
  lock(&dev->mutex#2);

 *** DEADLOCK ***

To avoid the circular dependency, just try to get the lock in
`comedi_mmap()` instead of blocking.  Since the comedi device's main mutex
is heavily used, do a down-read of its `attach_lock` rwsemaphore
instead.  Trying to down-read `attach_lock` should only fail if
some task has down-write locked it, and that is only done while the
comedi device is being attached to or detached from a low-level hardware
device.

Unfortunately, acquiring the `attach_lock` doesn't prevent another
task replacing the comedi data buffer we are trying to mmap.  The
details of the buffer are held in a `struct comedi_buf_map` and pointed
to by `s->async->buf_map` where `s` is the comedi subdevice whose buffer
we are trying to map.  The `struct comedi_buf_map` is already reference
counted with a `struct kref`, so we can stop it being freed prematurely.

Modify `comedi_mmap()` to call new function
`comedi_buf_map_from_subdev_get()` to read the subdevice's current
buffer map pointer and increment its reference instead of accessing
`async->buf_map` directly.  Call `comedi_buf_map_put()` to decrement the
reference once the buffer map structure has been dealt with.  (Note that
`comedi_buf_map_put()` does nothing if passed a NULL pointer.)

`comedi_buf_map_from_subdev_get()` checks the subdevice's buffer map
pointer has been set and the buffer map has been initialized enough for
`comedi_mmap()` to deal with it (specifically, check the `n_pages`
member has been set to a non-zero value).  If all is well, the buffer
map's reference is incremented and a pointer to it is returned.  The
comedi subdevice's spin-lock is used to protect the checks.  Also use
the spin-lock in `__comedi_buf_alloc()` and `__comedi_buf_free()` to
protect changes to the subdevice's buffer map structure pointer and the
buffer map structure's `n_pages` member.  (This checking of `n_pages` is
a bit clunky and I [Ian Abbott] plan to deal with it in the future.)

Signed-off-by: Ian Abbott <abbotti@mev.co.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
n-aizu pushed a commit to n-aizu/linux-linaro-stable-mx6 that referenced this pull request Feb 4, 2015
commit ce7514526742c0898b837d4395f515b79dfb5a12 upstream.

It is possible for ata_sff_flush_pio_task() to set ap->hsm_task_state to
HSM_ST_IDLE in between the time __ata_sff_port_intr() checks for HSM_ST_IDLE
and before it calls ata_sff_hsm_move() causing ata_sff_hsm_move() to BUG().

This problem is hard to reproduce making this patch hard to verify, but this
fix will prevent the race.

I have not been able to reproduce the problem, but here is a crash dump from
a 2.6.32 kernel.

On examining the ata port's state, its hsm_task_state field has a value of HSM_ST_IDLE:

crash> struct ata_port.hsm_task_state ffff881c1121c000
  hsm_task_state = 0

Normally, this should not be possible as ata_sff_hsm_move() was called from ata_sff_host_intr(),
which checks hsm_task_state and won't call ata_sff_hsm_move() if it has a HSM_ST_IDLE value.

PID: 11053  TASK: ffff8816e846cae0  CPU: 0   COMMAND: "sshd"
 #0 [ffff88008ba03960] machine_kexec at ffffffff81038f3b
 #1 [ffff88008ba039c0] crash_kexec at ffffffff810c5d92
 #2 [ffff88008ba03a90] oops_end at ffffffff8152b510
 linux4kix#3 [ffff88008ba03ac0] die at ffffffff81010e0b
 linux4kix#4 [ffff88008ba03af0] do_trap at ffffffff8152ad74
 linux4kix#5 [ffff88008ba03b50] do_invalid_op at ffffffff8100cf95
 linux4kix#6 [ffff88008ba03bf0] invalid_op at ffffffff8100bf9b
    [exception RIP: ata_sff_hsm_move+317]
    RIP: ffffffff813a77ad  RSP: ffff88008ba03ca0  RFLAGS: 00010097
    RAX: 0000000000000000  RBX: ffff881c1121dc60  RCX: 0000000000000000
    RDX: ffff881c1121dd10  RSI: ffff881c1121dc60  RDI: ffff881c1121c000
    RBP: ffff88008ba03d00   R8: 0000000000000000   R9: 000000000000002e
    R10: 000000000001003f  R11: 000000000000009b  R12: ffff881c1121c000
    R13: 0000000000000000  R14: 0000000000000050  R15: ffff881c1121dd78
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
 linux4kix#7 [ffff88008ba03d08] ata_sff_host_intr at ffffffff813a7fbd
 linux4kix#8 [ffff88008ba03d38] ata_sff_interrupt at ffffffff813a821e
 linux4kix#9 [ffff88008ba03d78] handle_IRQ_event at ffffffff810e6ec0
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants