Skip to content

Initialize jump variable #5

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: linux-linaro-lsk-mx6
Choose a base branch
from
Open

Initialize jump variable #5

wants to merge 1 commit into from

Conversation

CurlyMoo
Copy link

@CurlyMoo CurlyMoo commented Jul 3, 2014

This removes the error given when running make config:

In file included from scripts/kconfig/zconf.tab.c:2503:0:
scripts/kconfig/menu.c: In function 'get_symbol_str':
scripts/kconfig/menu.c:567:18: warning: 'jump' may be used uninitialized in this function [-Wmaybe-uninitialized]
     jump->offset = r->len - 1;
                  ^
scripts/kconfig/menu.c:528:19: note: 'jump' was declared here
  struct jump_key *jump;

This remove the error given when running 'make config':

In file included from scripts/kconfig/zconf.tab.c:2503:0:
scripts/kconfig/menu.c: In function 'get_symbol_str':
scripts/kconfig/menu.c:567:18: warning: 'jump' may be used uninitialized in this function [-Wmaybe-uninitialized]
     jump->offset = r->len - 1;
                  ^
scripts/kconfig/menu.c:528:19: note: 'jump' was declared here
  struct jump_key *jump;
mk01 pushed a commit to mk01/linux-linaro-stable-mx6 that referenced this pull request Sep 3, 2014
…/kernel/git/vgupta/arc

Pull first batch of ARC changes from Vineet Gupta:
 "There's a second bunch to follow next week - which depends on commits
  on other trees (irq/net).  I'd have preferred the accompanying ARC
  change via respective trees, but it didn't workout somehow.

  Highlights of changes:

   - Continuation of ARC MM changes from 3.10 including

       zero page optimization
       Setting pagecache pages dirty by default
       Non executable stack by default
       Reducing dcache flushes for aliasing VIPT config

   - Long overdue rework of pt_regs machinery - removing the unused word
     gutters and adding ECR register to baseline (helps cleanup lot of
     low level code)

   - Support for ARC gcc 4.8

   - Few other preventive fixes, cosmetics, usage of Kconfig helper..

  The diffstat is larger than normal primarily because of arcregs.h
  header split as well as beautification of macros in entry.h"

* tag 'arc-v3.11-rc1-part1' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc: (32 commits)
  ARC: warn on improper stack unwind FDE entries
  arc: delete __cpuinit usage from all arc files
  ARC: [tlb-miss] Fix bug with CONFIG_ARC_DBG_TLB_MISS_COUNT
  ARC: [tlb-miss] Extraneous PTE bit testing/setting
  ARC: Adjustments for gcc 4.8
  ARC: Setup Vector Table Base in early boot
  ARC: Remove explicit passing around of ECR
  ARC: pt_regs update rabeeh#5: Use real ECR for pt_regs->event vs. synth values
  ARC: stop using pt_regs->orig_r8
  ARC: pt_regs update rabeeh#4: r25 saved/restored unconditionally
  ARC: K/U SP saved from one location in stack switching macro
  ARC: Entry Handler tweaks: Simplify branch for in-kernel preemption
  ARC: Entry Handler tweaks: Avoid hardcoded LIMMS for ECR values
  ARC: Increase readability of entry handlers
  ARC: pt_regs update rabeeh#3: Remove unused gutter at start of callee_regs
  ARC: pt_regs update rabeeh#2: Remove unused gutter at start of pt_regs
  ARC: pt_regs update rabeeh#1: Align pt_regs end with end of kernel stack page
  ARC: pt_regs update #0: remove kernel stack canary
  ARC: [mm] Remove @Write argument to do_page_fault()
  ARC: [mm] Make stack/heap Non-executable by default
  ...
mk01 pushed a commit to mk01/linux-linaro-stable-mx6 that referenced this pull request Sep 3, 2014
commit 2f7021a "cpufreq: protect 'policy->cpus' from offlining
during __gov_queue_work()" caused a regression in CPU hotplug,
because it lead to a deadlock between cpufreq governor worker thread
and the CPU hotplug writer task.

Lockdep splat corresponding to this deadlock is shown below:

[   60.277396] ======================================================
[   60.277400] [ INFO: possible circular locking dependency detected ]
[   60.277407] 3.10.0-rc7-dbg-01385-g241fd04-dirty #1744 Not tainted
[   60.277411] -------------------------------------------------------
[   60.277417] bash/2225 is trying to acquire lock:
[   60.277422]  ((&(&j_cdbs->work)->work)){+.+...}, at: [<ffffffff810621b5>] flush_work+0x5/0x280
[   60.277444] but task is already holding lock:
[   60.277449]  (cpu_hotplug.lock){+.+.+.}, at: [<ffffffff81042d8b>] cpu_hotplug_begin+0x2b/0x60
[   60.277465] which lock already depends on the new lock.

[   60.277472] the existing dependency chain (in reverse order) is:
[   60.277477] -> rabeeh#2 (cpu_hotplug.lock){+.+.+.}:
[   60.277490]        [<ffffffff810ac6d4>] lock_acquire+0xa4/0x200
[   60.277503]        [<ffffffff815b6157>] mutex_lock_nested+0x67/0x410
[   60.277514]        [<ffffffff81042cbc>] get_online_cpus+0x3c/0x60
[   60.277522]        [<ffffffff814b842a>] gov_queue_work+0x2a/0xb0
[   60.277532]        [<ffffffff814b7891>] cs_dbs_timer+0xc1/0xe0
[   60.277543]        [<ffffffff8106302d>] process_one_work+0x1cd/0x6a0
[   60.277552]        [<ffffffff81063d31>] worker_thread+0x121/0x3a0
[   60.277560]        [<ffffffff8106ae2b>] kthread+0xdb/0xe0
[   60.277569]        [<ffffffff815bb96c>] ret_from_fork+0x7c/0xb0
[   60.277580] -> rabeeh#1 (&j_cdbs->timer_mutex){+.+...}:
[   60.277592]        [<ffffffff810ac6d4>] lock_acquire+0xa4/0x200
[   60.277600]        [<ffffffff815b6157>] mutex_lock_nested+0x67/0x410
[   60.277608]        [<ffffffff814b785d>] cs_dbs_timer+0x8d/0xe0
[   60.277616]        [<ffffffff8106302d>] process_one_work+0x1cd/0x6a0
[   60.277624]        [<ffffffff81063d31>] worker_thread+0x121/0x3a0
[   60.277633]        [<ffffffff8106ae2b>] kthread+0xdb/0xe0
[   60.277640]        [<ffffffff815bb96c>] ret_from_fork+0x7c/0xb0
[   60.277649] -> #0 ((&(&j_cdbs->work)->work)){+.+...}:
[   60.277661]        [<ffffffff810ab826>] __lock_acquire+0x1766/0x1d30
[   60.277669]        [<ffffffff810ac6d4>] lock_acquire+0xa4/0x200
[   60.277677]        [<ffffffff810621ed>] flush_work+0x3d/0x280
[   60.277685]        [<ffffffff81062d8a>] __cancel_work_timer+0x8a/0x120
[   60.277693]        [<ffffffff81062e53>] cancel_delayed_work_sync+0x13/0x20
[   60.277701]        [<ffffffff814b89d9>] cpufreq_governor_dbs+0x529/0x6f0
[   60.277709]        [<ffffffff814b76a7>] cs_cpufreq_governor_dbs+0x17/0x20
[   60.277719]        [<ffffffff814b5df8>] __cpufreq_governor+0x48/0x100
[   60.277728]        [<ffffffff814b6b80>] __cpufreq_remove_dev.isra.14+0x80/0x3c0
[   60.277737]        [<ffffffff815adc0d>] cpufreq_cpu_callback+0x38/0x4c
[   60.277747]        [<ffffffff81071a4d>] notifier_call_chain+0x5d/0x110
[   60.277759]        [<ffffffff81071b0e>] __raw_notifier_call_chain+0xe/0x10
[   60.277768]        [<ffffffff815a0a68>] _cpu_down+0x88/0x330
[   60.277779]        [<ffffffff815a0d46>] cpu_down+0x36/0x50
[   60.277788]        [<ffffffff815a2748>] store_online+0x98/0xd0
[   60.277796]        [<ffffffff81452a28>] dev_attr_store+0x18/0x30
[   60.277806]        [<ffffffff811d9edb>] sysfs_write_file+0xdb/0x150
[   60.277818]        [<ffffffff8116806d>] vfs_write+0xbd/0x1f0
[   60.277826]        [<ffffffff811686fc>] SyS_write+0x4c/0xa0
[   60.277834]        [<ffffffff815bbbbe>] tracesys+0xd0/0xd5
[   60.277842] other info that might help us debug this:

[   60.277848] Chain exists of:
  (&(&j_cdbs->work)->work) --> &j_cdbs->timer_mutex --> cpu_hotplug.lock

[   60.277864]  Possible unsafe locking scenario:

[   60.277869]        CPU0                    CPU1
[   60.277873]        ----                    ----
[   60.277877]   lock(cpu_hotplug.lock);
[   60.277885]                                lock(&j_cdbs->timer_mutex);
[   60.277892]                                lock(cpu_hotplug.lock);
[   60.277900]   lock((&(&j_cdbs->work)->work));
[   60.277907]  *** DEADLOCK ***

[   60.277915] 6 locks held by bash/2225:
[   60.277919]  #0:  (sb_writers#6){.+.+.+}, at: [<ffffffff81168173>] vfs_write+0x1c3/0x1f0
[   60.277937]  rabeeh#1:  (&buffer->mutex){+.+.+.}, at: [<ffffffff811d9e3c>] sysfs_write_file+0x3c/0x150
[   60.277954]  rabeeh#2:  (s_active#61){.+.+.+}, at: [<ffffffff811d9ec3>] sysfs_write_file+0xc3/0x150
[   60.277972]  rabeeh#3:  (x86_cpu_hotplug_driver_mutex){+.+...}, at: [<ffffffff81024cf7>] cpu_hotplug_driver_lock+0x17/0x20
[   60.277990]  rabeeh#4:  (cpu_add_remove_lock){+.+.+.}, at: [<ffffffff815a0d32>] cpu_down+0x22/0x50
[   60.278007]  rabeeh#5:  (cpu_hotplug.lock){+.+.+.}, at: [<ffffffff81042d8b>] cpu_hotplug_begin+0x2b/0x60
[   60.278023] stack backtrace:
[   60.278031] CPU: 3 PID: 2225 Comm: bash Not tainted 3.10.0-rc7-dbg-01385-g241fd04-dirty #1744
[   60.278037] Hardware name: Acer             Aspire 5741G    /Aspire 5741G    , BIOS V1.20 02/08/2011
[   60.278042]  ffffffff8204e110 ffff88014df6b9f8 ffffffff815b3d90 ffff88014df6ba38
[   60.278055]  ffffffff815b0a8d ffff880150ed3f60 ffff880150ed4770 3871c4002c8980b2
[   60.278068]  ffff880150ed4748 ffff880150ed4770 ffff880150ed3f60 ffff88014df6bb00
[   60.278081] Call Trace:
[   60.278091]  [<ffffffff815b3d90>] dump_stack+0x19/0x1b
[   60.278101]  [<ffffffff815b0a8d>] print_circular_bug+0x2b6/0x2c5
[   60.278111]  [<ffffffff810ab826>] __lock_acquire+0x1766/0x1d30
[   60.278123]  [<ffffffff81067e08>] ? __kernel_text_address+0x58/0x80
[   60.278134]  [<ffffffff810ac6d4>] lock_acquire+0xa4/0x200
[   60.278142]  [<ffffffff810621b5>] ? flush_work+0x5/0x280
[   60.278151]  [<ffffffff810621ed>] flush_work+0x3d/0x280
[   60.278159]  [<ffffffff810621b5>] ? flush_work+0x5/0x280
[   60.278169]  [<ffffffff810a9b14>] ? mark_held_locks+0x94/0x140
[   60.278178]  [<ffffffff81062d77>] ? __cancel_work_timer+0x77/0x120
[   60.278188]  [<ffffffff810a9cbd>] ? trace_hardirqs_on_caller+0xfd/0x1c0
[   60.278196]  [<ffffffff81062d8a>] __cancel_work_timer+0x8a/0x120
[   60.278206]  [<ffffffff81062e53>] cancel_delayed_work_sync+0x13/0x20
[   60.278214]  [<ffffffff814b89d9>] cpufreq_governor_dbs+0x529/0x6f0
[   60.278225]  [<ffffffff814b76a7>] cs_cpufreq_governor_dbs+0x17/0x20
[   60.278234]  [<ffffffff814b5df8>] __cpufreq_governor+0x48/0x100
[   60.278244]  [<ffffffff814b6b80>] __cpufreq_remove_dev.isra.14+0x80/0x3c0
[   60.278255]  [<ffffffff815adc0d>] cpufreq_cpu_callback+0x38/0x4c
[   60.278265]  [<ffffffff81071a4d>] notifier_call_chain+0x5d/0x110
[   60.278275]  [<ffffffff81071b0e>] __raw_notifier_call_chain+0xe/0x10
[   60.278284]  [<ffffffff815a0a68>] _cpu_down+0x88/0x330
[   60.278292]  [<ffffffff81024cf7>] ? cpu_hotplug_driver_lock+0x17/0x20
[   60.278302]  [<ffffffff815a0d46>] cpu_down+0x36/0x50
[   60.278311]  [<ffffffff815a2748>] store_online+0x98/0xd0
[   60.278320]  [<ffffffff81452a28>] dev_attr_store+0x18/0x30
[   60.278329]  [<ffffffff811d9edb>] sysfs_write_file+0xdb/0x150
[   60.278337]  [<ffffffff8116806d>] vfs_write+0xbd/0x1f0
[   60.278347]  [<ffffffff81185950>] ? fget_light+0x320/0x4b0
[   60.278355]  [<ffffffff811686fc>] SyS_write+0x4c/0xa0
[   60.278364]  [<ffffffff815bbbbe>] tracesys+0xd0/0xd5
[   60.280582] smpboot: CPU 1 is now offline

The intention of that commit was to avoid warnings during CPU
hotplug, which indicated that offline CPUs were getting IPIs from the
cpufreq governor's work items.  But the real root-cause of that
problem was commit a66b2e5 (cpufreq: Preserve sysfs files across
suspend/resume) because it totally skipped all the cpufreq callbacks
during CPU hotplug in the suspend/resume path, and hence it never
actually shut down the cpufreq governor's worker threads during CPU
offline in the suspend/resume path.

Reflecting back, the reason why we never suspected that commit as the
root-cause earlier, was that the original issue was reported with
just the halt command and nobody had brought in suspend/resume to the
equation.

The reason for _that_ in turn, as it turns out, is that earlier
halt/shutdown was being done by disabling non-boot CPUs while tasks
were frozen, just like suspend/resume....  but commit cf7df37
(reboot: migrate shutdown/reboot to boot cpu) which came somewhere
along that very same time changed that logic: shutdown/halt no longer
takes CPUs offline.  Thus, the test-cases for reproducing the bug
were vastly different and thus we went totally off the trail.

Overall, it was one hell of a confusion with so many commits
affecting each other and also affecting the symptoms of the problems
in subtle ways.  Finally, now since the original problematic commit
(a66b2e5) has been completely reverted, revert this intermediate fix
too (2f7021a), to fix the CPU hotplug deadlock.  Phew!

Reported-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Reported-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Tested-by: Peter Wu <lekensteyn@gmail.com>
Cc: 3.10+ <stable@vger.kernel.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
mk01 pushed a commit to mk01/linux-linaro-stable-mx6 that referenced this pull request Sep 3, 2014
Commits 6a1c068 and
9356b53, respectively
  'tty: Convert termios_mutex to termios_rwsem' and
  'n_tty: Access termios values safely'
introduced a circular lock dependency with console_lock and
termios_rwsem.

The lockdep report [1] shows that n_tty_write() will attempt
to claim console_lock while holding the termios_rwsem, whereas
tty_do_resize() may already hold the console_lock while
claiming the termios_rwsem.

Since n_tty_write() and tty_do_resize() do not contend
over the same data -- the tty->winsize structure -- correct
the lock dependency by introducing a new lock which
specifically serializes access to tty->winsize only.

[1] Lockdep report

======================================================
[ INFO: possible circular locking dependency detected ]
3.10.0-0+tip-xeon+lockdep #0+tip Not tainted
-------------------------------------------------------
modprobe/277 is trying to acquire lock:
 (&tty->termios_rwsem){++++..}, at: [<ffffffff81452656>] tty_do_resize+0x36/0xe0

but task is already holding lock:
 ((fb_notifier_list).rwsem){.+.+.+}, at: [<ffffffff8107aac6>] __blocking_notifier_call_chain+0x56/0xc0

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> rabeeh#2 ((fb_notifier_list).rwsem){.+.+.+}:
       [<ffffffff810b6d62>] lock_acquire+0x92/0x1f0
       [<ffffffff8175b797>] down_read+0x47/0x5c
       [<ffffffff8107aac6>] __blocking_notifier_call_chain+0x56/0xc0
       [<ffffffff8107ab46>] blocking_notifier_call_chain+0x16/0x20
       [<ffffffff813d7c0b>] fb_notifier_call_chain+0x1b/0x20
       [<ffffffff813d95b2>] register_framebuffer+0x1e2/0x320
       [<ffffffffa01043e1>] drm_fb_helper_initial_config+0x371/0x540 [drm_kms_helper]
       [<ffffffffa01bcb05>] nouveau_fbcon_init+0x105/0x140 [nouveau]
       [<ffffffffa01ad0af>] nouveau_drm_load+0x43f/0x610 [nouveau]
       [<ffffffffa008a79e>] drm_get_pci_dev+0x17e/0x2a0 [drm]
       [<ffffffffa01ad4da>] nouveau_drm_probe+0x25a/0x2a0 [nouveau]
       [<ffffffff813b13db>] local_pci_probe+0x4b/0x80
       [<ffffffff813b1701>] pci_device_probe+0x111/0x120
       [<ffffffff814977eb>] driver_probe_device+0x8b/0x3a0
       [<ffffffff81497bab>] __driver_attach+0xab/0xb0
       [<ffffffff814956ad>] bus_for_each_dev+0x5d/0xa0
       [<ffffffff814971fe>] driver_attach+0x1e/0x20
       [<ffffffff81496cc1>] bus_add_driver+0x111/0x290
       [<ffffffff814982b7>] driver_register+0x77/0x170
       [<ffffffff813b0454>] __pci_register_driver+0x64/0x70
       [<ffffffffa008a9da>] drm_pci_init+0x11a/0x130 [drm]
       [<ffffffffa022a04d>] nouveau_drm_init+0x4d/0x1000 [nouveau]
       [<ffffffff810002ea>] do_one_initcall+0xea/0x1a0
       [<ffffffff810c54cb>] load_module+0x123b/0x1bf0
       [<ffffffff810c5f57>] SyS_init_module+0xd7/0x120
       [<ffffffff817677c2>] system_call_fastpath+0x16/0x1b

-> rabeeh#1 (console_lock){+.+.+.}:
       [<ffffffff810b6d62>] lock_acquire+0x92/0x1f0
       [<ffffffff810430a7>] console_lock+0x77/0x80
       [<ffffffff8146b2a1>] con_flush_chars+0x31/0x50
       [<ffffffff8145780c>] n_tty_write+0x1ec/0x4d0
       [<ffffffff814541b9>] tty_write+0x159/0x2e0
       [<ffffffff814543f5>] redirected_tty_write+0xb5/0xc0
       [<ffffffff811ab9d5>] vfs_write+0xc5/0x1f0
       [<ffffffff811abec5>] SyS_write+0x55/0xa0
       [<ffffffff817677c2>] system_call_fastpath+0x16/0x1b

-> #0 (&tty->termios_rwsem){++++..}:
       [<ffffffff810b65c3>] __lock_acquire+0x1c43/0x1d30
       [<ffffffff810b6d62>] lock_acquire+0x92/0x1f0
       [<ffffffff8175b724>] down_write+0x44/0x70
       [<ffffffff81452656>] tty_do_resize+0x36/0xe0
       [<ffffffff8146c841>] vc_do_resize+0x3e1/0x4c0
       [<ffffffff8146c99f>] vc_resize+0x1f/0x30
       [<ffffffff813e4535>] fbcon_init+0x385/0x5a0
       [<ffffffff8146a4bc>] visual_init+0xbc/0x120
       [<ffffffff8146cd13>] do_bind_con_driver+0x163/0x320
       [<ffffffff8146cfa1>] do_take_over_console+0x61/0x70
       [<ffffffff813e2b93>] do_fbcon_takeover+0x63/0xc0
       [<ffffffff813e67a5>] fbcon_event_notify+0x715/0x820
       [<ffffffff81762f9d>] notifier_call_chain+0x5d/0x110
       [<ffffffff8107aadc>] __blocking_notifier_call_chain+0x6c/0xc0
       [<ffffffff8107ab46>] blocking_notifier_call_chain+0x16/0x20
       [<ffffffff813d7c0b>] fb_notifier_call_chain+0x1b/0x20
       [<ffffffff813d95b2>] register_framebuffer+0x1e2/0x320
       [<ffffffffa01043e1>] drm_fb_helper_initial_config+0x371/0x540 [drm_kms_helper]
       [<ffffffffa01bcb05>] nouveau_fbcon_init+0x105/0x140 [nouveau]
       [<ffffffffa01ad0af>] nouveau_drm_load+0x43f/0x610 [nouveau]
       [<ffffffffa008a79e>] drm_get_pci_dev+0x17e/0x2a0 [drm]
       [<ffffffffa01ad4da>] nouveau_drm_probe+0x25a/0x2a0 [nouveau]
       [<ffffffff813b13db>] local_pci_probe+0x4b/0x80
       [<ffffffff813b1701>] pci_device_probe+0x111/0x120
       [<ffffffff814977eb>] driver_probe_device+0x8b/0x3a0
       [<ffffffff81497bab>] __driver_attach+0xab/0xb0
       [<ffffffff814956ad>] bus_for_each_dev+0x5d/0xa0
       [<ffffffff814971fe>] driver_attach+0x1e/0x20
       [<ffffffff81496cc1>] bus_add_driver+0x111/0x290
       [<ffffffff814982b7>] driver_register+0x77/0x170
       [<ffffffff813b0454>] __pci_register_driver+0x64/0x70
       [<ffffffffa008a9da>] drm_pci_init+0x11a/0x130 [drm]
       [<ffffffffa022a04d>] nouveau_drm_init+0x4d/0x1000 [nouveau]
       [<ffffffff810002ea>] do_one_initcall+0xea/0x1a0
       [<ffffffff810c54cb>] load_module+0x123b/0x1bf0
       [<ffffffff810c5f57>] SyS_init_module+0xd7/0x120
       [<ffffffff817677c2>] system_call_fastpath+0x16/0x1b

other info that might help us debug this:

Chain exists of:
  &tty->termios_rwsem --> console_lock --> (fb_notifier_list).rwsem

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock((fb_notifier_list).rwsem);
                               lock(console_lock);
                               lock((fb_notifier_list).rwsem);
  lock(&tty->termios_rwsem);

 *** DEADLOCK ***

7 locks held by modprobe/277:
 #0:  (&__lockdep_no_validate__){......}, at: [<ffffffff81497b5b>] __driver_attach+0x5b/0xb0
 rabeeh#1:  (&__lockdep_no_validate__){......}, at: [<ffffffff81497b69>] __driver_attach+0x69/0xb0
 rabeeh#2:  (drm_global_mutex){+.+.+.}, at: [<ffffffffa008a6dd>] drm_get_pci_dev+0xbd/0x2a0 [drm]
 rabeeh#3:  (registration_lock){+.+.+.}, at: [<ffffffff813d93f5>] register_framebuffer+0x25/0x320
 rabeeh#4:  (&fb_info->lock){+.+.+.}, at: [<ffffffff813d8116>] lock_fb_info+0x26/0x60
 rabeeh#5:  (console_lock){+.+.+.}, at: [<ffffffff813d95a4>] register_framebuffer+0x1d4/0x320
 rabeeh#6:  ((fb_notifier_list).rwsem){.+.+.+}, at: [<ffffffff8107aac6>] __blocking_notifier_call_chain+0x56/0xc0

stack backtrace:
CPU: 0 PID: 277 Comm: modprobe Not tainted 3.10.0-0+tip-xeon+lockdep #0+tip
Hardware name: Dell Inc. Precision WorkStation T5400  /0RW203, BIOS A11 04/30/2012
 ffffffff8213e5e0 ffff8802aa2fb298 ffffffff81755f19 ffff8802aa2fb2e8
 ffffffff8174f506 ffff8802aa2fa000 ffff8802aa2fb378 ffff8802aa2ea8e8
 ffff8802aa2ea910 ffff8802aa2ea8e8 0000000000000006 0000000000000007
Call Trace:
 [<ffffffff81755f19>] dump_stack+0x19/0x1b
 [<ffffffff8174f506>] print_circular_bug+0x1fb/0x20c
 [<ffffffff810b65c3>] __lock_acquire+0x1c43/0x1d30
 [<ffffffff810b775e>] ? mark_held_locks+0xae/0x120
 [<ffffffff810b78d5>] ? trace_hardirqs_on_caller+0x105/0x1d0
 [<ffffffff810b6d62>] lock_acquire+0x92/0x1f0
 [<ffffffff81452656>] ? tty_do_resize+0x36/0xe0
 [<ffffffff8175b724>] down_write+0x44/0x70
 [<ffffffff81452656>] ? tty_do_resize+0x36/0xe0
 [<ffffffff81452656>] tty_do_resize+0x36/0xe0
 [<ffffffff8146c841>] vc_do_resize+0x3e1/0x4c0
 [<ffffffff8146c99f>] vc_resize+0x1f/0x30
 [<ffffffff813e4535>] fbcon_init+0x385/0x5a0
 [<ffffffff8146a4bc>] visual_init+0xbc/0x120
 [<ffffffff8146cd13>] do_bind_con_driver+0x163/0x320
 [<ffffffff8146cfa1>] do_take_over_console+0x61/0x70
 [<ffffffff813e2b93>] do_fbcon_takeover+0x63/0xc0
 [<ffffffff813e67a5>] fbcon_event_notify+0x715/0x820
 [<ffffffff81762f9d>] notifier_call_chain+0x5d/0x110
 [<ffffffff8107aadc>] __blocking_notifier_call_chain+0x6c/0xc0
 [<ffffffff8107ab46>] blocking_notifier_call_chain+0x16/0x20
 [<ffffffff813d7c0b>] fb_notifier_call_chain+0x1b/0x20
 [<ffffffff813d95b2>] register_framebuffer+0x1e2/0x320
 [<ffffffffa01043e1>] drm_fb_helper_initial_config+0x371/0x540 [drm_kms_helper]
 [<ffffffff8173cbcb>] ? kmemleak_alloc+0x5b/0xc0
 [<ffffffff81198874>] ? kmem_cache_alloc_trace+0x104/0x290
 [<ffffffffa01035e1>] ? drm_fb_helper_single_add_all_connectors+0x81/0xf0 [drm_kms_helper]
 [<ffffffffa01bcb05>] nouveau_fbcon_init+0x105/0x140 [nouveau]
 [<ffffffffa01ad0af>] nouveau_drm_load+0x43f/0x610 [nouveau]
 [<ffffffffa008a79e>] drm_get_pci_dev+0x17e/0x2a0 [drm]
 [<ffffffffa01ad4da>] nouveau_drm_probe+0x25a/0x2a0 [nouveau]
 [<ffffffff8175f162>] ? _raw_spin_unlock_irqrestore+0x42/0x80
 [<ffffffff813b13db>] local_pci_probe+0x4b/0x80
 [<ffffffff813b1701>] pci_device_probe+0x111/0x120
 [<ffffffff814977eb>] driver_probe_device+0x8b/0x3a0
 [<ffffffff81497bab>] __driver_attach+0xab/0xb0
 [<ffffffff81497b00>] ? driver_probe_device+0x3a0/0x3a0
 [<ffffffff814956ad>] bus_for_each_dev+0x5d/0xa0
 [<ffffffff814971fe>] driver_attach+0x1e/0x20
 [<ffffffff81496cc1>] bus_add_driver+0x111/0x290
 [<ffffffffa022a000>] ? 0xffffffffa0229fff
 [<ffffffff814982b7>] driver_register+0x77/0x170
 [<ffffffffa022a000>] ? 0xffffffffa0229fff
 [<ffffffff813b0454>] __pci_register_driver+0x64/0x70
 [<ffffffffa008a9da>] drm_pci_init+0x11a/0x130 [drm]
 [<ffffffffa022a000>] ? 0xffffffffa0229fff
 [<ffffffffa022a000>] ? 0xffffffffa0229fff
 [<ffffffffa022a04d>] nouveau_drm_init+0x4d/0x1000 [nouveau]
 [<ffffffff810002ea>] do_one_initcall+0xea/0x1a0
 [<ffffffff810c54cb>] load_module+0x123b/0x1bf0
 [<ffffffff81399a50>] ? ddebug_proc_open+0xb0/0xb0
 [<ffffffff813855ae>] ? trace_hardirqs_on_thunk+0x3a/0x3f
 [<ffffffff810c5f57>] SyS_init_module+0xd7/0x120
 [<ffffffff817677c2>] system_call_fastpath+0x16/0x1b

Signed-off-by: Peter Hurley <peter@hurleysoftware.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
mk01 pushed a commit to mk01/linux-linaro-stable-mx6 that referenced this pull request Sep 3, 2014
We used to keep the port's char device structs and the /sys entries
around till the last reference to the port was dropped.  This is
actually unnecessary, and resulted in buggy behaviour:

1. Open port in guest
2. Hot-unplug port
3. Hot-plug a port with the same 'name' property as the unplugged one

This resulted in hot-plug being unsuccessful, as a port with the same
name already exists (even though it was unplugged).

This behaviour resulted in a warning message like this one:

-------------------8<---------------------------------------
WARNING: at fs/sysfs/dir.c:512 sysfs_add_one+0xc9/0x130() (Not tainted)
Hardware name: KVM
sysfs: cannot create duplicate filename
'/devices/pci0000:00/0000:00:04.0/virtio0/virtio-ports/vport0p1'

Call Trace:
 [<ffffffff8106b607>] ? warn_slowpath_common+0x87/0xc0
 [<ffffffff8106b6f6>] ? warn_slowpath_fmt+0x46/0x50
 [<ffffffff811f2319>] ? sysfs_add_one+0xc9/0x130
 [<ffffffff811f23e8>] ? create_dir+0x68/0xb0
 [<ffffffff811f2469>] ? sysfs_create_dir+0x39/0x50
 [<ffffffff81273129>] ? kobject_add_internal+0xb9/0x260
 [<ffffffff812733d8>] ? kobject_add_varg+0x38/0x60
 [<ffffffff812734b4>] ? kobject_add+0x44/0x70
 [<ffffffff81349de4>] ? get_device_parent+0xf4/0x1d0
 [<ffffffff8134b389>] ? device_add+0xc9/0x650

-------------------8<---------------------------------------

Instead of relying on guest applications to release all references to
the ports, we should go ahead and unregister the port from all the core
layers.  Any open/read calls on the port will then just return errors,
and an unplug/plug operation on the host will succeed as expected.

This also caused buggy behaviour in case of the device removal (not just
a port): when the device was removed (which means all ports on that
device are removed automatically as well), the ports with active
users would clean up only when the last references were dropped -- and
it would be too late then to be referencing char device pointers,
resulting in oopses:

-------------------8<---------------------------------------
PID: 6162   TASK: ffff8801147ad500  CPU: 0   COMMAND: "cat"
 #0 [ffff88011b9d5a90] machine_kexec at ffffffff8103232b
 rabeeh#1 [ffff88011b9d5af0] crash_kexec at ffffffff810b9322
 rabeeh#2 [ffff88011b9d5bc0] oops_end at ffffffff814f4a50
 rabeeh#3 [ffff88011b9d5bf0] die at ffffffff8100f26b
 rabeeh#4 [ffff88011b9d5c20] do_general_protection at ffffffff814f45e2
 rabeeh#5 [ffff88011b9d5c50] general_protection at ffffffff814f3db5
    [exception RIP: strlen+2]
    RIP: ffffffff81272ae2  RSP: ffff88011b9d5d00  RFLAGS: 00010246
    RAX: 0000000000000000  RBX: ffff880118901c18  RCX: 0000000000000000
    RDX: ffff88011799982c  RSI: 00000000000000d0  RDI: 3a303030302f3030
    RBP: ffff88011b9d5d38   R8: 0000000000000006   R9: ffffffffa0134500
    R10: 0000000000001000  R11: 0000000000001000  R12: ffff880117a1cc10
    R13: 00000000000000d0  R14: 0000000000000017  R15: ffffffff81aff700
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
 rabeeh#6 [ffff88011b9d5d00] kobject_get_path at ffffffff8126dc5d
 linux4kix#7 [ffff88011b9d5d40] kobject_uevent_env at ffffffff8126e551
 linux4kix#8 [ffff88011b9d5dd0] kobject_uevent at ffffffff8126e9eb
 linux4kix#9 [ffff88011b9d5de0] device_del at ffffffff813440c7

-------------------8<---------------------------------------

So clean up when we have all the context, and all that's left to do when
the references to the port have dropped is to free up the port struct
itself.

CC: <stable@vger.kernel.org>
Reported-by: chayang <chayang@redhat.com>
Reported-by: YOGANANTH SUBRAMANIAN <anantyog@in.ibm.com>
Reported-by: FuXiangChun <xfu@redhat.com>
Reported-by: Qunfang Zhang <qzhang@redhat.com>
Reported-by: Sibiao Luo <sluo@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
mk01 pushed a commit to mk01/linux-linaro-stable-mx6 that referenced this pull request Sep 3, 2014
In several places, this snippet is used when removing neigh entries:

	list_del(&neigh->list);
	ipoib_neigh_free(neigh);

The list_del() removes neigh from the associated struct ipoib_path, while
ipoib_neigh_free() removes neigh from the device's neigh entry lookup
table.  Both of these operations are protected by the priv->lock
spinlock.  The table however is also protected via RCU, and so naturally
the lock is not held when doing reads.

This leads to a race condition, in which a thread may successfully look
up a neigh entry that has already been deleted from neigh->list.  Since
the previous deletion will have marked the entry with poison, a second
list_del() on the object will cause a panic:

  rabeeh#5 [ffff8802338c3c70] general_protection at ffffffff815108c5
     [exception RIP: list_del+16]
     RIP: ffffffff81289020  RSP: ffff8802338c3d20  RFLAGS: 00010082
     RAX: dead000000200200  RBX: ffff880433e60c88  RCX: 0000000000009e6c
     RDX: 0000000000000246  RSI: ffff8806012ca298  RDI: ffff880433e60c88
     RBP: ffff8802338c3d30   R8: ffff8806012ca2e8   R9: 00000000ffffffff
     R10: 0000000000000001  R11: 0000000000000000  R12: ffff8804346b2020
     R13: ffff88032a3e7540  R14: ffff8804346b26e0  R15: 0000000000000246
     ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0000
  rabeeh#6 [ffff8802338c3d38] ipoib_cm_tx_handler at ffffffffa066fe0a [ib_ipoib]
  linux4kix#7 [ffff8802338c3d98] cm_process_work at ffffffffa05149a7 [ib_cm]
  linux4kix#8 [ffff8802338c3de8] cm_work_handler at ffffffffa05161aa [ib_cm]
  linux4kix#9 [ffff8802338c3e38] worker_thread at ffffffff81090e10
 linux4kix#10 [ffff8802338c3ee8] kthread at ffffffff81096c66
 linux4kix#11 [ffff8802338c3f48] kernel_thread at ffffffff8100c0ca

We move the list_del() into ipoib_neigh_free(), so that deletion happens
only once, after the entry has been successfully removed from the lookup
table.  This same behavior is already used in ipoib_del_neighs_by_gid()
and __ipoib_reap_neigh().

Signed-off-by: Jim Foraker <foraker1@llnl.gov>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Reviewed-by: Jack Wang <jinpu.wang@profitbricks.com>
Reviewed-by: Shlomo Pongratz <shlomop@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
mk01 pushed a commit to mk01/linux-linaro-stable-mx6 that referenced this pull request Sep 3, 2014
Pull SCSI target fixes from Nicholas Bellinger:
 "The first patch is to address a long standing issue where INQUIRY
  vendor + model response data was not correctly padded with ASCII
  spaces, causing MSFT and Falconstor multipath stacks to not function
  with our LUNs.

  The second -> forth patches are additional iscsi-target regression
  fixes for the post >= v3.10 iser-target changes.  The second and third
  are failure cases that have appeared during further testing, and the
  forth is only reproducible with malformed NOP packets.

  The fifth patch is a v3.11 specific regression caused by a recent
  optimization that showed up during WRITE I/O failure testing.

  I'll be sending Patch rabeeh#1 and Patch rabeeh#5 to Greg-KH separately for
  stable"

* git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending:
  target: Fix se_cmd->state_list leak regression during WRITE failure
  iscsi-target: Fix potential NULL pointer in solicited NOPOUT reject
  iscsi-target: Fix iscsit_transport reference leak during NP thread reset
  iscsi-target: Fix ImmediateData=Yes failure regression in >= v3.10
  target: Fix trailing ASCII space usage in INQUIRY vendor+model
mk01 pushed a commit to mk01/linux-linaro-stable-mx6 that referenced this pull request Sep 3, 2014
When booting secondary CPUs, announce_cpu() is called to show which cpu has
been brought up. For example:

[    0.402751] smpboot: Booting Node   0, Processors  rabeeh#1 rabeeh#2 rabeeh#3 rabeeh#4 rabeeh#5 OK
[    0.525667] smpboot: Booting Node   1, Processors  rabeeh#6 linux4kix#7 linux4kix#8 linux4kix#9 linux4kix#10 linux4kix#11 OK
[    0.755592] smpboot: Booting Node   0, Processors  linux4kix#12 linux4kix#13 linux4kix#14 linux4kix#15 linux4kix#16 linux4kix#17 OK
[    0.890495] smpboot: Booting Node   1, Processors  linux4kix#18 linux4kix#19 linux4kix#20 linux4kix#21 linux4kix#22 linux4kix#23

But the last "OK" is lost, because 'nr_cpu_ids-1' represents the maximum
possible cpu id. It should use the maximum present cpu id in case not all
CPUs booted up.

Signed-off-by: Libin <huawei.libin@huawei.com>
Cc: <guohanjun@huawei.com>
Cc: <wangyijing@huawei.com>
Cc: <fenghua.yu@intel.com>
Cc: <paul.gortmaker@windriver.com>
Link: http://lkml.kernel.org/r/1378378676-18276-1-git-send-email-huawei.libin@huawei.com
[ tweaked the changelog, removed unnecessary line break, tweaked the format to align the fields vertically. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
mk01 pushed a commit to mk01/linux-linaro-stable-mx6 that referenced this pull request Sep 3, 2014
When parsing lines from objdump a line containing source code starting
with a numeric label is mistaken for a line of disassembly starting with
a memory address.

Current validation fails to recognise that the "memory address" is out
of range and calculates an invalid offset which later causes this
segfault:

Program received signal SIGSEGV, Segmentation fault.
0x0000000000457315 in disasm__calc_percent (notes=0xc98970, evidx=0, offset=143705, end=2127526177, path=0x7fffffffbf50)
    at util/annotate.c:631
631				hits += h->addr[offset++];
(gdb) bt
 #0  0x0000000000457315 in disasm__calc_percent (notes=0xc98970, evidx=0, offset=143705, end=2127526177, path=0x7fffffffbf50)
    at util/annotate.c:631
 rabeeh#1  0x00000000004d65e3 in annotate_browser__calc_percent (browser=0x7fffffffd130, evsel=0xa01da0) at ui/browsers/annotate.c:364
 rabeeh#2  0x00000000004d7433 in annotate_browser__run (browser=0x7fffffffd130, evsel=0xa01da0, hbt=0x0) at ui/browsers/annotate.c:672
 rabeeh#3  0x00000000004d80c9 in symbol__tui_annotate (sym=0xc989a0, map=0xa02660, evsel=0xa01da0, hbt=0x0) at ui/browsers/annotate.c:962
 rabeeh#4  0x00000000004d7aa0 in hist_entry__tui_annotate (he=0xdf73f0, evsel=0xa01da0, hbt=0x0) at ui/browsers/annotate.c:823
 rabeeh#5  0x00000000004dd648 in perf_evsel__hists_browse (evsel=0xa01da0, nr_events=1, helpline=
    0x58b768 "For a higher level overview, try: perf report --sort comm,dso", ev_name=0xa02cd0 "cycles", left_exits=false, hbt=
    0x0, min_pcnt=0, env=0xa011e0) at ui/browsers/hists.c:1659
 rabeeh#6  0x00000000004de372 in perf_evlist__tui_browse_hists (evlist=0xa01520, help=
    0x58b768 "For a higher level overview, try: perf report --sort comm,dso", hbt=0x0, min_pcnt=0, env=0xa011e0)
    at ui/browsers/hists.c:1950
 linux4kix#7  0x000000000042cf6b in __cmd_report (rep=0x7fffffffd6c0) at builtin-report.c:581
 linux4kix#8  0x000000000042e25d in cmd_report (argc=0, argv=0x7fffffffe4b0, prefix=0x0) at builtin-report.c:965
 linux4kix#9  0x000000000041a0e1 in run_builtin (p=0x801548, argc=1, argv=0x7fffffffe4b0) at perf.c:319
 linux4kix#10 0x000000000041a319 in handle_internal_command (argc=1, argv=0x7fffffffe4b0) at perf.c:376
 linux4kix#11 0x000000000041a465 in run_argv (argcp=0x7fffffffe38c, argv=0x7fffffffe380) at perf.c:420
 linux4kix#12 0x000000000041a707 in main (argc=1, argv=0x7fffffffe4b0) at perf.c:521

After the fix is applied the symbol can be annotated showing the
problematic line "1:      rep"

copy_user_generic_string  /usr/lib/debug/lib/modules/3.9.10-100.fc17.x86_64/vmlinux
             */
            ENTRY(copy_user_generic_string)
                    CFI_STARTPROC
                    ASM_STAC
                    andl %edx,%edx
              and    %edx,%edx
                    jz 4f
              je     37
                    cmpl $8,%edx
              cmp    $0x8,%edx
                    jb 2f           /* less than 8 bytes, go to byte copy loop */
              jb     33
                    ALIGN_DESTINATION
              mov    %edi,%ecx
              and    $0x7,%ecx
              je     28
              sub    $0x8,%ecx
              neg    %ecx
              sub    %ecx,%edx
        1a:   mov    (%rsi),%al
              mov    %al,(%rdi)
              inc    %rsi
              inc    %rdi
              dec    %ecx
              jne    1a
                    movl %edx,%ecx
        28:   mov    %edx,%ecx
                    shrl $3,%ecx
              shr    $0x3,%ecx
                    andl $7,%edx
              and    $0x7,%edx
            1:      rep
100.00        rep    movsq %ds:(%rsi),%es:(%rdi)
                    movsq
            2:      movl %edx,%ecx
        33:   mov    %edx,%ecx
            3:      rep
              rep    movsb %ds:(%rsi),%es:(%rdi)
                    movsb
            4:      xorl %eax,%eax
        37:   xor    %eax,%eax
              data32 xchg %ax,%ax
                    ASM_CLAC
                    ret
              retq

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1379009721-27667-1-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
mk01 pushed a commit to mk01/linux-linaro-stable-mx6 that referenced this pull request Sep 3, 2014
Some ARC SMP systems lack native atomic R-M-W (LLOCK/SCOND) insns and
can only use atomic EX insn (reg with mem) to build higher level R-M-W
primitives. This includes a SystemC based SMP simulation model.

So rwlocks need to use a protecting spinlock for atomic cmp-n-exchange
operation to update reader(s)/writer count.

The spinlock operation itself looks as follows:

	mov reg, 1		; 1=locked, 0=unlocked
retry:
	EX reg, [lock]		; load existing, store 1, atomically
	BREQ reg, 1, rety	; if already locked, retry

In single-threaded simulation, SystemC alternates between the 2 cores
with "N" insn each based scheduling. Additionally for insn with global
side effect, such as EX writing to shared mem, a core switch is
enforced too.

Given that, 2 cores doing a repeated EX on same location, Linux often
got into a livelock e.g. when both cores were fiddling with tasklist
lock (gdbserver / hackbench) for read/write respectively as the
sequence diagram below shows:

           core1                                   core2
         --------                                --------
1. spin lock [EX r=0, w=1] - LOCKED
2. rwlock(Read)            - LOCKED
3. spin unlock  [ST 0]     - UNLOCKED
                                         spin lock [EX r=0,w=1] - LOCKED
                      -- resched core 1----

5. spin lock [EX r=1] - ALREADY-LOCKED

                      -- resched core 2----
6.                                       rwlock(Write) - READER-LOCKED
7.                                       spin unlock [ST 0]
8.                                       rwlock failed, retry again

9.                                       spin lock  [EX r=0, w=1]
                      -- resched core 1----

10  spinlock locked in linux4kix#9, retry rabeeh#5
11. spin lock [EX gets 1]
                      -- resched core 2----
...
...

The fix was to unlock using the EX insn too (step 7), to trigger another
SystemC scheduling pass which would let core1 proceed, eliding the
livelock.

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
mk01 pushed a commit to mk01/linux-linaro-stable-mx6 that referenced this pull request Sep 3, 2014
As the new x86 CPU bootup printout format code maintainer, I am
taking immediate action to improve and clean (and thus indulge
my OCD) the reporting of the cores when coming up online.

Fix padding to a right-hand alignment, cleanup code and bind
reporting width to the max number of supported CPUs on the
system, like this:

 [    0.074509] smpboot: Booting Node   0, Processors:      rabeeh#1  rabeeh#2  rabeeh#3  rabeeh#4  rabeeh#5  rabeeh#6  linux4kix#7 OK
 [    0.644008] smpboot: Booting Node   1, Processors:  linux4kix#8  linux4kix#9 linux4kix#10 linux4kix#11 linux4kix#12 linux4kix#13 linux4kix#14 linux4kix#15 OK
 [    1.245006] smpboot: Booting Node   2, Processors: linux4kix#16 linux4kix#17 linux4kix#18 linux4kix#19 linux4kix#20 linux4kix#21 linux4kix#22 linux4kix#23 OK
 [    1.864005] smpboot: Booting Node   3, Processors: linux4kix#24 linux4kix#25 linux4kix#26 linux4kix#27 linux4kix#28 #29 #30 #31 OK
 [    2.489005] smpboot: Booting Node   4, Processors: #32 #33 #34 #35 #36 #37 #38 #39 OK
 [    3.093005] smpboot: Booting Node   5, Processors: #40 #41 #42 #43 #44 #45 #46 #47 OK
 [    3.698005] smpboot: Booting Node   6, Processors: #48 #49 #50 #51 #52 #53 #54 #55 OK
 [    4.304005] smpboot: Booting Node   7, Processors: #56 #57 #58 #59 #60 #61 #62 #63 OK
 [    4.961413] Brought up 64 CPUs

and this:

 [    0.072367] smpboot: Booting Node   0, Processors:    rabeeh#1 rabeeh#2 rabeeh#3 rabeeh#4 rabeeh#5 rabeeh#6 linux4kix#7 OK
 [    0.686329] Brought up 8 CPUs

Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Libin <huawei.libin@huawei.com>
Cc: wangyijing@huawei.com
Cc: fenghua.yu@intel.com
Cc: guohanjun@huawei.com
Cc: paul.gortmaker@windriver.com
Link: http://lkml.kernel.org/r/20130927143554.GF4422@pd.tnic
Signed-off-by: Ingo Molnar <mingo@kernel.org>
mk01 pushed a commit to mk01/linux-linaro-stable-mx6 that referenced this pull request Sep 3, 2014
The recent 3.12 pull request for apparmor was missing a couple rcu _protected
access modifiers. Resulting in the follow suspicious RCU usage

 [   29.804534] [ INFO: suspicious RCU usage. ]
 [   29.804539] 3.11.0+ rabeeh#5 Not tainted
 [   29.804541] -------------------------------
 [   29.804545] security/apparmor/include/policy.h:363 suspicious rcu_dereference_check() usage!
 [   29.804548]
 [   29.804548] other info that might help us debug this:
 [   29.804548]
 [   29.804553]
 [   29.804553] rcu_scheduler_active = 1, debug_locks = 1
 [   29.804558] 2 locks held by apparmor_parser/1268:
 [   29.804560]  #0:  (sb_writers#9){.+.+.+}, at: [<ffffffff81120a4c>] file_start_write+0x27/0x29
 [   29.804576]  rabeeh#1:  (&ns->lock){+.+.+.}, at: [<ffffffff811f5d88>] aa_replace_profiles+0x166/0x57c
 [   29.804589]
 [   29.804589] stack backtrace:
 [   29.804595] CPU: 0 PID: 1268 Comm: apparmor_parser Not tainted 3.11.0+ rabeeh#5
 [   29.804599] Hardware name: ASUSTeK Computer Inc.         UL50VT          /UL50VT    , BIOS 217     03/01/2010
 [   29.804602]  0000000000000000 ffff8800b95a1d90 ffffffff8144eb9b ffff8800b94db540
 [   29.804611]  ffff8800b95a1dc0 ffffffff81087439 ffff880138cc3a18 ffff880138cc3a18
 [   29.804619]  ffff8800b9464a90 ffff880138cc3a38 ffff8800b95a1df0 ffffffff811f5084
 [   29.804628] Call Trace:
 [   29.804636]  [<ffffffff8144eb9b>] dump_stack+0x4e/0x82
 [   29.804642]  [<ffffffff81087439>] lockdep_rcu_suspicious+0xfc/0x105
 [   29.804649]  [<ffffffff811f5084>] __aa_update_replacedby+0x53/0x7f
 [   29.804655]  [<ffffffff811f5408>] __replace_profile+0x11f/0x1ed
 [   29.804661]  [<ffffffff811f6032>] aa_replace_profiles+0x410/0x57c
 [   29.804668]  [<ffffffff811f16d4>] profile_replace+0x35/0x4c
 [   29.804674]  [<ffffffff81120fa3>] vfs_write+0xad/0x113
 [   29.804680]  [<ffffffff81121609>] SyS_write+0x44/0x7a
 [   29.804687]  [<ffffffff8145bfd2>] system_call_fastpath+0x16/0x1b
 [   29.804691]
 [   29.804694] ===============================
 [   29.804697] [ INFO: suspicious RCU usage. ]
 [   29.804700] 3.11.0+ rabeeh#5 Not tainted
 [   29.804703] -------------------------------
 [   29.804706] security/apparmor/policy.c:566 suspicious rcu_dereference_check() usage!
 [   29.804709]
 [   29.804709] other info that might help us debug this:
 [   29.804709]
 [   29.804714]
 [   29.804714] rcu_scheduler_active = 1, debug_locks = 1
 [   29.804718] 2 locks held by apparmor_parser/1268:
 [   29.804721]  #0:  (sb_writers#9){.+.+.+}, at: [<ffffffff81120a4c>] file_start_write+0x27/0x29
 [   29.804733]  rabeeh#1:  (&ns->lock){+.+.+.}, at: [<ffffffff811f5d88>] aa_replace_profiles+0x166/0x57c
 [   29.804744]
 [   29.804744] stack backtrace:
 [   29.804750] CPU: 0 PID: 1268 Comm: apparmor_parser Not tainted 3.11.0+ rabeeh#5
 [   29.804753] Hardware name: ASUSTeK Computer Inc.         UL50VT          /UL50VT    , BIOS 217     03/01/2010
 [   29.804756]  0000000000000000 ffff8800b95a1d80 ffffffff8144eb9b ffff8800b94db540
 [   29.804764]  ffff8800b95a1db0 ffffffff81087439 ffff8800b95b02b0 0000000000000000
 [   29.804772]  ffff8800b9efba08 ffff880138cc3a38 ffff8800b95a1dd0 ffffffff811f4f94
 [   29.804779] Call Trace:
 [   29.804786]  [<ffffffff8144eb9b>] dump_stack+0x4e/0x82
 [   29.804791]  [<ffffffff81087439>] lockdep_rcu_suspicious+0xfc/0x105
 [   29.804798]  [<ffffffff811f4f94>] aa_free_replacedby_kref+0x4d/0x62
 [   29.804804]  [<ffffffff811f4f47>] ? aa_put_namespace+0x17/0x17
 [   29.804810]  [<ffffffff811f4f0b>] kref_put+0x36/0x40
 [   29.804816]  [<ffffffff811f5423>] __replace_profile+0x13a/0x1ed
 [   29.804822]  [<ffffffff811f6032>] aa_replace_profiles+0x410/0x57c
 [   29.804829]  [<ffffffff811f16d4>] profile_replace+0x35/0x4c
 [   29.804835]  [<ffffffff81120fa3>] vfs_write+0xad/0x113
 [   29.804840]  [<ffffffff81121609>] SyS_write+0x44/0x7a
 [   29.804847]  [<ffffffff8145bfd2>] system_call_fastpath+0x16/0x1b

Reported-by: miles.lane@gmail.com
CC: paulmck@linux.vnet.ibm.com
Signed-off-by: John Johansen <john.johansen@canonical.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
mk01 pushed a commit to mk01/linux-linaro-stable-mx6 that referenced this pull request Sep 3, 2014
Michael Semon reported that xfs/299 generated this lockdep warning:

=============================================
[ INFO: possible recursive locking detected ]
3.12.0-rc2+ rabeeh#2 Not tainted
---------------------------------------------
touch/21072 is trying to acquire lock:
 (&xfs_dquot_other_class){+.+...}, at: [<c12902fb>] xfs_trans_dqlockedjoin+0x57/0x64

but task is already holding lock:
 (&xfs_dquot_other_class){+.+...}, at: [<c12902fb>] xfs_trans_dqlockedjoin+0x57/0x64

other info that might help us debug this:
 Possible unsafe locking scenario:

       CPU0
       ----
  lock(&xfs_dquot_other_class);
  lock(&xfs_dquot_other_class);

 *** DEADLOCK ***

 May be due to missing lock nesting notation

7 locks held by touch/21072:
 #0:  (sb_writers#10){++++.+}, at: [<c11185b6>] mnt_want_write+0x1e/0x3e
 rabeeh#1:  (&type->i_mutex_dir_key#4){+.+.+.}, at: [<c11078ee>] do_last+0x245/0xe40
 rabeeh#2:  (sb_internal#2){++++.+}, at: [<c122c9e0>] xfs_trans_alloc+0x1f/0x35
 rabeeh#3:  (&(&ip->i_lock)->mr_lock/1){+.+...}, at: [<c126cd1b>] xfs_ilock+0x100/0x1f1
 rabeeh#4:  (&(&ip->i_lock)->mr_lock){++++-.}, at: [<c126cf52>] xfs_ilock_nowait+0x105/0x22f
 rabeeh#5:  (&dqp->q_qlock){+.+...}, at: [<c12902fb>] xfs_trans_dqlockedjoin+0x57/0x64
 rabeeh#6:  (&xfs_dquot_other_class){+.+...}, at: [<c12902fb>] xfs_trans_dqlockedjoin+0x57/0x64

The lockdep annotation for dquot lock nesting only understands
locking for user and "other" dquots, not user, group and quota
dquots. Fix the annotations to match the locking heirarchy we now
have.

Reported-by: Michael L. Semon <mlsemon35@gmail.com>
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Ben Myers <bpm@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
mk01 pushed a commit to mk01/linux-linaro-stable-mx6 that referenced this pull request Sep 3, 2014
Turn it into (for example):

[    0.073380] x86: Booting SMP configuration:
[    0.074005] .... node   #0, CPUs:          rabeeh#1   rabeeh#2   rabeeh#3   rabeeh#4   rabeeh#5   rabeeh#6   linux4kix#7
[    0.603005] .... node   rabeeh#1, CPUs:     linux4kix#8   linux4kix#9  linux4kix#10  linux4kix#11  linux4kix#12  linux4kix#13  linux4kix#14  linux4kix#15
[    1.200005] .... node   rabeeh#2, CPUs:    linux4kix#16  linux4kix#17  linux4kix#18  linux4kix#19  linux4kix#20  linux4kix#21  linux4kix#22  linux4kix#23
[    1.796005] .... node   rabeeh#3, CPUs:    linux4kix#24  linux4kix#25  linux4kix#26  linux4kix#27  linux4kix#28  #29  #30  #31
[    2.393005] .... node   rabeeh#4, CPUs:    #32  #33  #34  #35  #36  #37  #38  #39
[    2.996005] .... node   rabeeh#5, CPUs:    #40  #41  #42  #43  #44  #45  #46  #47
[    3.600005] .... node   rabeeh#6, CPUs:    #48  #49  #50  #51  #52  #53  #54  #55
[    4.202005] .... node   linux4kix#7, CPUs:    #56  #57  #58  #59  #60  #61  #62  #63
[    4.811005] .... node   linux4kix#8, CPUs:    #64  #65  #66  #67  #68  #69  #70  #71
[    5.421006] .... node   linux4kix#9, CPUs:    #72  #73  #74  #75  #76  #77  #78  #79
[    6.032005] .... node  linux4kix#10, CPUs:    #80  #81  #82  #83  #84  #85  #86  #87
[    6.648006] .... node  linux4kix#11, CPUs:    #88  #89  #90  #91  #92  #93  #94  #95
[    7.262005] .... node  linux4kix#12, CPUs:    #96  #97  #98  #99 #100 #101 #102 #103
[    7.865005] .... node  linux4kix#13, CPUs:   #104 #105 #106 #107 #108 #109 #110 #111
[    8.466005] .... node  linux4kix#14, CPUs:   #112 #113 #114 #115 #116 #117 #118 #119
[    9.073006] .... node  linux4kix#15, CPUs:   #120 #121 #122 #123 #124 #125 #126 #127
[    9.679901] x86: Booted up 16 nodes, 128 CPUs

and drop useless elements.

Change num_digits() to hpa's division-avoiding, cell-phone-typed
version which he went at great lengths and pains to submit on a
Saturday evening.

Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: huawei.libin@huawei.com
Cc: wangyijing@huawei.com
Cc: fenghua.yu@intel.com
Cc: guohanjun@huawei.com
Cc: paul.gortmaker@windriver.com
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20130930095624.GB16383@pd.tnic
Signed-off-by: Ingo Molnar <mingo@kernel.org>
mk01 pushed a commit to mk01/linux-linaro-stable-mx6 that referenced this pull request Sep 3, 2014
Michael Semon reported that xfs/299 generated this lockdep warning:

=============================================
[ INFO: possible recursive locking detected ]
3.12.0-rc2+ rabeeh#2 Not tainted
---------------------------------------------
touch/21072 is trying to acquire lock:
 (&xfs_dquot_other_class){+.+...}, at: [<c12902fb>] xfs_trans_dqlockedjoin+0x57/0x64

but task is already holding lock:
 (&xfs_dquot_other_class){+.+...}, at: [<c12902fb>] xfs_trans_dqlockedjoin+0x57/0x64

other info that might help us debug this:
 Possible unsafe locking scenario:

       CPU0
       ----
  lock(&xfs_dquot_other_class);
  lock(&xfs_dquot_other_class);

 *** DEADLOCK ***

 May be due to missing lock nesting notation

7 locks held by touch/21072:
 #0:  (sb_writers#10){++++.+}, at: [<c11185b6>] mnt_want_write+0x1e/0x3e
 rabeeh#1:  (&type->i_mutex_dir_key#4){+.+.+.}, at: [<c11078ee>] do_last+0x245/0xe40
 rabeeh#2:  (sb_internal#2){++++.+}, at: [<c122c9e0>] xfs_trans_alloc+0x1f/0x35
 rabeeh#3:  (&(&ip->i_lock)->mr_lock/1){+.+...}, at: [<c126cd1b>] xfs_ilock+0x100/0x1f1
 rabeeh#4:  (&(&ip->i_lock)->mr_lock){++++-.}, at: [<c126cf52>] xfs_ilock_nowait+0x105/0x22f
 rabeeh#5:  (&dqp->q_qlock){+.+...}, at: [<c12902fb>] xfs_trans_dqlockedjoin+0x57/0x64
 rabeeh#6:  (&xfs_dquot_other_class){+.+...}, at: [<c12902fb>] xfs_trans_dqlockedjoin+0x57/0x64

The lockdep annotation for dquot lock nesting only understands
locking for user and "other" dquots, not user, group and quota
dquots. Fix the annotations to match the locking heirarchy we now
have.

Reported-by: Michael L. Semon <mlsemon35@gmail.com>
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Ben Myers <bpm@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>

(cherry picked from commit f112a04)
mk01 pushed a commit to mk01/linux-linaro-stable-mx6 that referenced this pull request Sep 3, 2014
Booting a mx6 with CONFIG_PROVE_LOCKING we get:

======================================================
[ INFO: possible circular locking dependency detected ]
3.12.0-rc4-next-20131009+ #34 Not tainted
-------------------------------------------------------
swapper/0/1 is trying to acquire lock:
 (&imx_drm_device->mutex){+.+.+.}, at: [<804575a8>] imx_drm_encoder_get_mux_id+0x28/0x98

but task is already holding lock:
 (&crtc->mutex){+.+...}, at: [<802fe778>] drm_modeset_lock_all+0x40/0x54

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> rabeeh#2 (&crtc->mutex){+.+...}:
       [<800777d0>] __lock_acquire+0x18d4/0x1c24
       [<80077fec>] lock_acquire+0x68/0x7c
       [<805ead5c>] _mutex_lock_nest_lock+0x58/0x3a8
       [<802fec50>] drm_crtc_init+0x48/0xa8
       [<80457c88>] imx_drm_add_crtc+0xd4/0x144
       [<8045e2e8>] ipu_drm_probe+0x114/0x1fc
       [<80312278>] platform_drv_probe+0x20/0x50
       [<80310c68>] driver_probe_device+0x110/0x22c
       [<80310e20>] __driver_attach+0x9c/0xa0
       [<8030f218>] bus_for_each_dev+0x5c/0x90
       [<80310750>] driver_attach+0x20/0x28
       [<8031034c>] bus_add_driver+0xdc/0x1dc
       [<803114d8>] driver_register+0x80/0xfc
       [<80312198>] __platform_driver_register+0x50/0x64
       [<808172fc>] ipu_drm_driver_init+0x18/0x20
       [<800088c0>] do_one_initcall+0xfc/0x160
       [<807e7c5c>] kernel_init_freeable+0x104/0x1d4
       [<805e2930>] kernel_init+0x10/0xec
       [<8000ea68>] ret_from_fork+0x14/0x2c

-> rabeeh#1 (&dev->mode_config.mutex){+.+.+.}:
       [<800777d0>] __lock_acquire+0x18d4/0x1c24
       [<80077fec>] lock_acquire+0x68/0x7c
       [<805eb100>] mutex_lock_nested+0x54/0x3a4
       [<802fe758>] drm_modeset_lock_all+0x20/0x54
       [<802fead4>] drm_encoder_init+0x20/0x7c
       [<80457ae4>] imx_drm_add_encoder+0x88/0xec
       [<80459838>] imx_ldb_probe+0x344/0x4fc
       [<80312278>] platform_drv_probe+0x20/0x50
       [<80310c68>] driver_probe_device+0x110/0x22c
       [<80310e20>] __driver_attach+0x9c/0xa0
       [<8030f218>] bus_for_each_dev+0x5c/0x90
       [<80310750>] driver_attach+0x20/0x28
       [<8031034c>] bus_add_driver+0xdc/0x1dc
       [<803114d8>] driver_register+0x80/0xfc
       [<80312198>] __platform_driver_register+0x50/0x64
       [<8081722c>] imx_ldb_driver_init+0x18/0x20
       [<800088c0>] do_one_initcall+0xfc/0x160
       [<807e7c5c>] kernel_init_freeable+0x104/0x1d4
       [<805e2930>] kernel_init+0x10/0xec
       [<8000ea68>] ret_from_fork+0x14/0x2c

-> #0 (&imx_drm_device->mutex){+.+.+.}:
       [<805e510c>] print_circular_bug+0x74/0x2e0
       [<80077ad0>] __lock_acquire+0x1bd4/0x1c24
       [<80077fec>] lock_acquire+0x68/0x7c
       [<805eb100>] mutex_lock_nested+0x54/0x3a4
       [<804575a8>] imx_drm_encoder_get_mux_id+0x28/0x98
       [<80459a98>] imx_ldb_encoder_prepare+0x34/0x114
       [<802ef724>] drm_crtc_helper_set_mode+0x1f0/0x4c0
       [<802f0344>] drm_crtc_helper_set_config+0x828/0x99c
       [<802ff270>] drm_mode_set_config_internal+0x5c/0xdc
       [<802eebe0>] drm_fb_helper_set_par+0x50/0xb4
       [<802af580>] fbcon_init+0x490/0x500
       [<802dd104>] visual_init+0xa8/0xf8
       [<802df414>] do_bind_con_driver+0x140/0x37c
       [<802df764>] do_take_over_console+0x114/0x1c4
       [<802af65c>] do_fbcon_takeover+0x6c/0xd4
       [<802b2b30>] fbcon_event_notify+0x7c8/0x818
       [<80049954>] notifier_call_chain+0x4c/0x8c
       [<80049cd8>] __blocking_notifier_call_chain+0x50/0x68
       [<80049d10>] blocking_notifier_call_chain+0x20/0x28
       [<802a75f0>] fb_notifier_call_chain+0x1c/0x24
       [<802a9224>] register_framebuffer+0x188/0x268
       [<802ee994>] drm_fb_helper_initial_config+0x2bc/0x4b8
       [<802f118c>] drm_fbdev_cma_init+0x7c/0xec
       [<80817288>] imx_fb_helper_init+0x54/0x90
       [<800088c0>] do_one_initcall+0xfc/0x160
       [<807e7c5c>] kernel_init_freeable+0x104/0x1d4
       [<805e2930>] kernel_init+0x10/0xec
       [<8000ea68>] ret_from_fork+0x14/0x2c

other info that might help us debug this:

Chain exists of:
  &imx_drm_device->mutex --> &dev->mode_config.mutex --> &crtc->mutex

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(&crtc->mutex);
                               lock(&dev->mode_config.mutex);
                               lock(&crtc->mutex);
  lock(&imx_drm_device->mutex);

 *** DEADLOCK ***

6 locks held by swapper/0/1:
 #0:  (registration_lock){+.+.+.}, at: [<802a90bc>] register_framebuffer+0x20/0x268
 rabeeh#1:  (&fb_info->lock){+.+.+.}, at: [<802a7a90>] lock_fb_info+0x20/0x44
 rabeeh#2:  (console_lock){+.+.+.}, at: [<802a9218>] register_framebuffer+0x17c/0x268
 rabeeh#3:  ((fb_notifier_list).rwsem){.+.+.+}, at: [<80049cbc>] __blocking_notifier_call_chain+0x34/0x68
 rabeeh#4:  (&dev->mode_config.mutex){+.+.+.}, at: [<802fe758>] drm_modeset_lock_all+0x20/0x54
 rabeeh#5:  (&crtc->mutex){+.+...}, at: [<802fe778>] drm_modeset_lock_all+0x40/0x54

In order to avoid this lockdep warning, remove the locking from
imx_drm_encoder_get_mux_id() and imx_drm_crtc_panel_format_pins().

Tested on a mx6sabrelite and mx53qsb.

Reported-by: Russell King <rmk+kernel@arm.linux.org.uk>
Tested-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Fabio Estevam <fabio.estevam@freescale.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
mk01 pushed a commit to mk01/linux-linaro-stable-mx6 that referenced this pull request Sep 3, 2014
If EM Transmit bit is busy during init ata_msleep() is called.  It is
wrong - msleep() should be used instead of ata_msleep(), because if EM
Transmit bit is busy for one port, it will be busy for all other ports
too, so using ata_msleep() causes wasting tries for another ports.

The most common scenario looks like that now
(six ports try to transmit a LED meaasege):
- port #0 tries for the 1st time and succeeds
- ports rabeeh#1-5 try for the 1st time and sleeps
- port rabeeh#1 tries for the 2nd time and succeeds
- ports rabeeh#2-5 try for the 2nd time and sleeps
- port rabeeh#2 tries for the 3rd time and succeeds
- ports rabeeh#3-5 try for the 3rd time and sleeps
- port rabeeh#3 tries for the 4th time and succeeds
- ports rabeeh#4-5 try for the 4th time and sleeps
- port rabeeh#4 tries for the 5th time and succeeds
- port rabeeh#5 tries for the 5th time and sleeps

At this moment port rabeeh#5 wasted all its five tries and failed to
initialize.  Because there are only 5 (EM_MAX_RETRY) tries available
usually only five ports succeed to initialize. The sixth port and next
ones usually will fail.

If msleep() is used instead of ata_msleep() the first port succeeds to
initialize in the first try and next ones usually succeed to
initialize in the second try.

tj: updated comment

Signed-off-by: Lukasz Dorau <lukasz.dorau@intel.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
mk01 pushed a commit to mk01/linux-linaro-stable-mx6 that referenced this pull request Sep 3, 2014
Andrey reported the following report:

ERROR: AddressSanitizer: heap-buffer-overflow on address ffff8800359c99f3
ffff8800359c99f3 is located 0 bytes to the right of 243-byte region [ffff8800359c9900, ffff8800359c99f3)
Accessed by thread T13003:
  #0 ffffffff810dd2da (asan_report_error+0x32a/0x440)
  rabeeh#1 ffffffff810dc6b0 (asan_check_region+0x30/0x40)
  rabeeh#2 ffffffff810dd4d3 (__tsan_write1+0x13/0x20)
  rabeeh#3 ffffffff811cd19e (ftrace_regex_release+0x1be/0x260)
  rabeeh#4 ffffffff812a1065 (__fput+0x155/0x360)
  rabeeh#5 ffffffff812a12de (____fput+0x1e/0x30)
  rabeeh#6 ffffffff8111708d (task_work_run+0x10d/0x140)
  linux4kix#7 ffffffff810ea043 (do_exit+0x433/0x11f0)
  linux4kix#8 ffffffff810eaee4 (do_group_exit+0x84/0x130)
  linux4kix#9 ffffffff810eafb1 (SyS_exit_group+0x21/0x30)
  linux4kix#10 ffffffff81928782 (system_call_fastpath+0x16/0x1b)

Allocated by thread T5167:
  #0 ffffffff810dc778 (asan_slab_alloc+0x48/0xc0)
  rabeeh#1 ffffffff8128337c (__kmalloc+0xbc/0x500)
  rabeeh#2 ffffffff811d9d54 (trace_parser_get_init+0x34/0x90)
  rabeeh#3 ffffffff811cd7b3 (ftrace_regex_open+0x83/0x2e0)
  rabeeh#4 ffffffff811cda7d (ftrace_filter_open+0x2d/0x40)
  rabeeh#5 ffffffff8129b4ff (do_dentry_open+0x32f/0x430)
  rabeeh#6 ffffffff8129b668 (finish_open+0x68/0xa0)
  linux4kix#7 ffffffff812b66ac (do_last+0xb8c/0x1710)
  linux4kix#8 ffffffff812b7350 (path_openat+0x120/0xb50)
  linux4kix#9 ffffffff812b8884 (do_filp_open+0x54/0xb0)
  linux4kix#10 ffffffff8129d36c (do_sys_open+0x1ac/0x2c0)
  linux4kix#11 ffffffff8129d4b7 (SyS_open+0x37/0x50)
  linux4kix#12 ffffffff81928782 (system_call_fastpath+0x16/0x1b)

Shadow bytes around the buggy address:
  ffff8800359c9700: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  ffff8800359c9780: fd fd fd fd fd fd fd fd fa fa fa fa fa fa fa fa
  ffff8800359c9800: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  ffff8800359c9880: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  ffff8800359c9900: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
=>ffff8800359c9980: 00 00 00 00 00 00 00 00 00 00 00 00 00 00[03]fb
  ffff8800359c9a00: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  ffff8800359c9a80: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  ffff8800359c9b00: fa fa fa fa fa fa fa fa 00 00 00 00 00 00 00 00
  ffff8800359c9b80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  ffff8800359c9c00: 00 00 00 00 00 00 00 00 fa fa fa fa fa fa fa fa
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07
  Heap redzone:          fa
  Heap kmalloc redzone:  fb
  Freed heap region:     fd
  Shadow gap:            fe

The out-of-bounds access happens on 'parser->buffer[parser->idx] = 0;'

Although the crash happened in ftrace_regex_open() the real bug
occurred in trace_get_user() where there's an incrementation to
parser->idx without a check against the size. The way it is triggered
is if userspace sends in 128 characters (EVENT_BUF_SIZE + 1), the loop
that reads the last character stores it and then breaks out because
there is no more characters. Then the last character is read to determine
what to do next, and the index is incremented without checking size.

Then the caller of trace_get_user() usually nulls out the last character
with a zero, but since the index is equal to the size, it writes a nul
character after the allocated space, which can corrupt memory.

Luckily, only root user has write access to this file.

Link: http://lkml.kernel.org/r/20131009222323.04fd1a0d@gandalf.local.home

Reported-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
mk01 pushed a commit to mk01/linux-linaro-stable-mx6 that referenced this pull request Sep 3, 2014
…ux/kernel/git/tip/tip

Pull x86 boot changes from Ingo Molnar:
 "Two changes that prettify and compactify the SMP bootup output from:

     smpboot: Booting Node   0, Processors  rabeeh#1 rabeeh#2 rabeeh#3 OK
     smpboot: Booting Node   1, Processors  rabeeh#4 rabeeh#5 rabeeh#6 linux4kix#7 OK
     smpboot: Booting Node   2, Processors  linux4kix#8 linux4kix#9 linux4kix#10 linux4kix#11 OK
     smpboot: Booting Node   3, Processors  linux4kix#12 linux4kix#13 linux4kix#14 linux4kix#15 OK
     Brought up 16 CPUs

  to something like:

     x86: Booting SMP configuration:
     .... node  #0, CPUs:        rabeeh#1  rabeeh#2  rabeeh#3
     .... node  rabeeh#1, CPUs:    rabeeh#4  rabeeh#5  rabeeh#6  linux4kix#7
     .... node  rabeeh#2, CPUs:    linux4kix#8  linux4kix#9 linux4kix#10 linux4kix#11
     .... node  rabeeh#3, CPUs:   linux4kix#12 linux4kix#13 linux4kix#14 linux4kix#15
     x86: Booted up 4 nodes, 16 CPUs"

* 'x86-boot-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/boot: Further compress CPUs bootup message
  x86: Improve the printout of the SMP bootup CPU table
mk01 pushed a commit to mk01/linux-linaro-stable-mx6 that referenced this pull request Sep 3, 2014
As part of normal operaions, the hrtimer subsystem frequently calls
into the timekeeping code, creating a locking order of
  hrtimer locks -> timekeeping locks

clock_was_set_delayed() was suppoed to allow us to avoid deadlocks
between the timekeeping the hrtimer subsystem, so that we could
notify the hrtimer subsytem the time had changed while holding
the timekeeping locks. This was done by scheduling delayed work
that would run later once we were out of the timekeeing code.

But unfortunately the lock chains are complex enoguh that in
scheduling delayed work, we end up eventually trying to grab
an hrtimer lock.

Sasha Levin noticed this in testing when the new seqlock lockdep
enablement triggered the following (somewhat abrieviated) message:

[  251.100221] ======================================================
[  251.100221] [ INFO: possible circular locking dependency detected ]
[  251.100221] 3.13.0-rc2-next-20131206-sasha-00005-g8be2375-dirty #4053 Not tainted
[  251.101967] -------------------------------------------------------
[  251.101967] kworker/10:1/4506 is trying to acquire lock:
[  251.101967]  (timekeeper_seq){----..}, at: [<ffffffff81160e96>] retrigger_next_event+0x56/0x70
[  251.101967]
[  251.101967] but task is already holding lock:
[  251.101967]  (hrtimer_bases.lock#11){-.-...}, at: [<ffffffff81160e7c>] retrigger_next_event+0x3c/0x70
[  251.101967]
[  251.101967] which lock already depends on the new lock.
[  251.101967]
[  251.101967]
[  251.101967] the existing dependency chain (in reverse order) is:
[  251.101967]
-> rabeeh#5 (hrtimer_bases.lock#11){-.-...}:
[snipped]
-> rabeeh#4 (&rt_b->rt_runtime_lock){-.-...}:
[snipped]
-> rabeeh#3 (&rq->lock){-.-.-.}:
[snipped]
-> rabeeh#2 (&p->pi_lock){-.-.-.}:
[snipped]
-> rabeeh#1 (&(&pool->lock)->rlock){-.-...}:
[  251.101967]        [<ffffffff81194803>] validate_chain+0x6c3/0x7b0
[  251.101967]        [<ffffffff81194d9d>] __lock_acquire+0x4ad/0x580
[  251.101967]        [<ffffffff81194ff2>] lock_acquire+0x182/0x1d0
[  251.101967]        [<ffffffff84398500>] _raw_spin_lock+0x40/0x80
[  251.101967]        [<ffffffff81153e69>] __queue_work+0x1a9/0x3f0
[  251.101967]        [<ffffffff81154168>] queue_work_on+0x98/0x120
[  251.101967]        [<ffffffff81161351>] clock_was_set_delayed+0x21/0x30
[  251.101967]        [<ffffffff811c4bd1>] do_adjtimex+0x111/0x160
[  251.101967]        [<ffffffff811e2711>] compat_sys_adjtimex+0x41/0x70
[  251.101967]        [<ffffffff843a4b49>] ia32_sysret+0x0/0x5
[  251.101967]
-> #0 (timekeeper_seq){----..}:
[snipped]
[  251.101967] other info that might help us debug this:
[  251.101967]
[  251.101967] Chain exists of:
  timekeeper_seq --> &rt_b->rt_runtime_lock --> hrtimer_bases.lock#11

[  251.101967]  Possible unsafe locking scenario:
[  251.101967]
[  251.101967]        CPU0                    CPU1
[  251.101967]        ----                    ----
[  251.101967]   lock(hrtimer_bases.lock#11);
[  251.101967]                                lock(&rt_b->rt_runtime_lock);
[  251.101967]                                lock(hrtimer_bases.lock#11);
[  251.101967]   lock(timekeeper_seq);
[  251.101967]
[  251.101967]  *** DEADLOCK ***
[  251.101967]
[  251.101967] 3 locks held by kworker/10:1/4506:
[  251.101967]  #0:  (events){.+.+.+}, at: [<ffffffff81154960>] process_one_work+0x200/0x530
[  251.101967]  rabeeh#1:  (hrtimer_work){+.+...}, at: [<ffffffff81154960>] process_one_work+0x200/0x530
[  251.101967]  rabeeh#2:  (hrtimer_bases.lock#11){-.-...}, at: [<ffffffff81160e7c>] retrigger_next_event+0x3c/0x70
[  251.101967]
[  251.101967] stack backtrace:
[  251.101967] CPU: 10 PID: 4506 Comm: kworker/10:1 Not tainted 3.13.0-rc2-next-20131206-sasha-00005-g8be2375-dirty #4053
[  251.101967] Workqueue: events clock_was_set_work

So the best solution is to avoid calling clock_was_set_delayed() while
holding the timekeeping lock, and instead using a flag variable to
decide if we should call clock_was_set() once we've released the locks.

This works for the case here, where the do_adjtimex() was the deadlock
trigger point. Unfortuantely, in update_wall_time() we still hold
the jiffies lock, which would deadlock with the ipi triggered by
clock_was_set(), preventing us from calling it even after we drop the
timekeeping lock. So instead call clock_was_set_delayed() at that point.

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Richard Cochran <richardcochran@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Sasha Levin <sasha.levin@oracle.com>
Cc: stable <stable@vger.kernel.org> rabeeh#3.10+
Reported-by: Sasha Levin <sasha.levin@oracle.com>
Tested-by: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: John Stultz <john.stultz@linaro.org>
mk01 pushed a commit to mk01/linux-linaro-stable-mx6 that referenced this pull request Sep 3, 2014
…to next/dt

From Jason Cooper:
mvebu DT changes for v3.14 (incremental rabeeh#5)

 - mvebu
    - add rtc chip isl12057 node to ReadyNAS boards
    - fix register length in Armada XP pmsu

 - kirkwood
    - sort ocp nodes by address in 6282 dtsi file
    - add 6192 dtsi file
    - add LaPlug board
    - add sata phy node

 - dove
    - add sata phy node

* tag 'mvebu-dt-3.14-5' of git://git.infradead.org/linux-mvebu:
  ARM: Kirkwood: DT board setup for LaPlug
  ARM: Kirkwood: Add 6192 DTSI file
  ARM: mvebu: fix register length for Armada XP PMSU
  ARM: kirkwood: 6282: sort DT nodes by address
  Phy: Add DT nodes on kirkwood and Dove for the SATA PHY
  ARM: mvebu: Enable ISL12057 RTC chip in ReadyNAS 2120 .dts file
  ARM: mvebu: Enable ISL12057 RTC chip in ReadyNAS 104 .dts file
  ARM: mvebu: Enable ISL12057 RTC chip in ReadyNAS 102 .dts file

Signed-off-by: Olof Johansson <olof@lixom.net>
mk01 pushed a commit to mk01/linux-linaro-stable-mx6 that referenced this pull request Sep 3, 2014
Commit 8456a64 ("slab: use struct page for slab management") causes
a crash in the LVM2 testsuite on PA-RISC (the crashing test is
fsadm.sh).  The testsuite doesn't crash on 3.12, crashes on 3.13-rc1 and
later.

 Bad Address (null pointer deref?): Code=15 regs=000000413edd89a0 (Addr=000006202224647d)
 CPU: 3 PID: 24008 Comm: loop0 Not tainted 3.13.0-rc6 rabeeh#5
 task: 00000001bf3c0048 ti: 000000413edd8000 task.ti: 000000413edd8000

      YZrvWESTHLNXBCVMcbcbcbcbOGFRQPDI
 PSW: 00001000000001101111100100001110 Not tainted
 r00-03  000000ff0806f90e 00000000405c8de0 000000004013e6c0 000000413edd83f0
 r04-07  00000000405a95e0 0000000000000200 00000001414735f0 00000001bf349e40
 r08-11  0000000010fe3d10 0000000000000001 00000040829c7778 000000413efd9000
 r12-15  0000000000000000 000000004060d800 0000000010fe3000 0000000010fe3000
 r16-19  000000413edd82a0 00000041078ddbc0 0000000000000010 0000000000000001
 r20-23  0008f3d0d83a8000 0000000000000000 00000040829c7778 0000000000000080
 r24-27  00000001bf349e40 00000001bf349e40 202d66202224640d 00000000405a95e0
 r28-31  202d662022246465 000000413edd88f0 000000413edd89a0 0000000000000001
 sr00-03  000000000532c000 0000000000000000 0000000000000000 000000000532c000
 sr04-07  0000000000000000 0000000000000000 0000000000000000 0000000000000000

 IASQ: 0000000000000000 0000000000000000 IAOQ: 00000000401fe42c 00000000401fe430
  IIR: 539c0030    ISR: 00000000202d6000  IOR: 000006202224647d
  CPU:        3   CR30: 000000413edd8000 CR31: 0000000000000000
  ORIG_R28: 00000000405a95e0
  IAOQ[0]: vma_interval_tree_iter_first+0x14/0x48
  IAOQ[1]: vma_interval_tree_iter_first+0x18/0x48
  RP(r2): flush_dcache_page+0x128/0x388
 Backtrace:
   flush_dcache_page+0x128/0x388
   lo_splice_actor+0x90/0x148 [loop]
   splice_from_pipe_feed+0xc0/0x1d0
   __splice_from_pipe+0xac/0xc0
   lo_direct_splice_actor+0x1c/0x70 [loop]
   splice_direct_to_actor+0xec/0x228
   lo_receive+0xe4/0x298 [loop]
   loop_thread+0x478/0x640 [loop]
   kthread+0x134/0x168
   end_fault_vector+0x20/0x28
   xfs_setsize_buftarg+0x0/0x90 [xfs]

 Kernel panic - not syncing: Bad Address (null pointer deref?)

Commit 8456a64 changes the page structure so that the slab
subsystem reuses the page->mapping field.

The crash happens in the following way:
 * XFS allocates some memory from slab and issues a bio to read data
   into it.
 * the bio is sent to the loopback device.
 * lo_receive creates an actor and calls splice_direct_to_actor.
 * lo_splice_actor copies data to the target page.
 * lo_splice_actor calls flush_dcache_page because the page may be
   mapped by userspace.  In that case we need to flush the kernel cache.
 * flush_dcache_page asks for the list of userspace mappings, however
   that page->mapping field is reused by the slab subsystem for a
   different purpose.  This causes the crash.

Note that other architectures without coherent caches (sparc, arm, mips)
also call page_mapping from flush_dcache_page, so they may crash in the
same way.

This patch fixes this bug by testing if the page is a slab page in
page_mapping and returning NULL if it is.

The patch also fixes VM_BUG_ON(PageSlab(page)) that could happen in
earlier kernels in the same scenario on architectures without cache
coherence when CONFIG_DEBUG_VM is enabled - so it should be backported
to stable kernels.

In the old kernels, the function page_mapping is placed in
include/linux/mm.h, so you should modify the patch accordingly when
backporting it.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Cc: John David Anglin <dave.anglin@bell.net>]
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Christoph Lameter <cl@linux.com>
Acked-by: Pekka Enberg <penberg@kernel.org>
Reviewed-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Helge Deller <deller@gmx.de>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
mk01 pushed a commit to mk01/linux-linaro-stable-mx6 that referenced this pull request Sep 3, 2014
…NULL

In the gen_pool_dma_alloc() the dma pointer can be NULL and while
assigning gen_pool_virt_to_phys(pool, vaddr) to dma caused the following
crash on da850 evm:

   Unable to handle kernel NULL pointer dereference at virtual address 00000000
   Internal error: Oops: 805 [rabeeh#1] PREEMPT ARM
   Modules linked in:
   CPU: 0 PID: 1 Comm: swapper Tainted: G        W    3.13.0-rc1-00001-g0609e45-dirty rabeeh#5
   task: c4830000 ti: c4832000 task.ti: c4832000
   PC is at gen_pool_dma_alloc+0x30/0x3c
   LR is at gen_pool_virt_to_phys+0x74/0x80
   Process swapper, call trace:
     gen_pool_dma_alloc+0x30/0x3c
     davinci_pm_probe+0x40/0xa8
     platform_drv_probe+0x1c/0x4c
     driver_probe_device+0x98/0x22c
     __driver_attach+0x8c/0x90
     bus_for_each_dev+0x6c/0x8c
     bus_add_driver+0x124/0x1d4
     driver_register+0x78/0xf8
     platform_driver_probe+0x20/0xa4
     davinci_init_late+0xc/0x14
     init_machine_late+0x1c/0x28
     do_one_initcall+0x34/0x15c
     kernel_init_freeable+0xe4/0x1ac
     kernel_init+0x8/0xec

This patch fixes the above.

[akpm@linux-foundation.org: update kerneldoc]
Signed-off-by: Lad, Prabhakar <prabhakar.csengg@gmail.com>
Cc: Philipp Zabel <p.zabel@pengutronix.de>
Cc: Nicolin Chen <b42378@freescale.com>
Cc: Joe Perches <joe@perches.com>
Cc: Sachin Kamat <sachin.kamat@linaro.org>
Cc: <stable@vger.kernel.org>	[3.13.x]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
mk01 pushed a commit to mk01/linux-linaro-stable-mx6 that referenced this pull request Sep 3, 2014
Interface rabeeh#5 of 19d2:1270 is a net interface which has been submitted to the
qmi_wwan driver so consequently remove it from the option driver.

Signed-off-by: Raymond Wanyoike <raymond.wanyoike@gmail.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
mk01 pushed a commit to mk01/linux-linaro-stable-mx6 that referenced this pull request Sep 3, 2014
vmxnet3's netpoll driver is incorrectly coded.  It directly calls
vmxnet3_do_poll, which is the driver internal napi poll routine.  As the netpoll
controller method doesn't block real napi polls in any way, there is a potential
for race conditions in which the netpoll controller method and the napi poll
method run concurrently.  The result is data corruption causing panics such as this
one recently observed:
PID: 1371   TASK: ffff88023762caa0  CPU: 1   COMMAND: "rs:main Q:Reg"
 #0 [ffff88023abd5780] machine_kexec at ffffffff81038f3b
 rabeeh#1 [ffff88023abd57e0] crash_kexec at ffffffff810c5d92
 rabeeh#2 [ffff88023abd58b0] oops_end at ffffffff8152b570
 rabeeh#3 [ffff88023abd58e0] die at ffffffff81010e0b
 rabeeh#4 [ffff88023abd5910] do_trap at ffffffff8152add4
 rabeeh#5 [ffff88023abd5970] do_invalid_op at ffffffff8100cf95
 rabeeh#6 [ffff88023abd5a10] invalid_op at ffffffff8100bf9b
    [exception RIP: vmxnet3_rq_rx_complete+1968]
    RIP: ffffffffa00f1e80  RSP: ffff88023abd5ac8  RFLAGS: 00010086
    RAX: 0000000000000000  RBX: ffff88023b5dcee0  RCX: 00000000000000c0
    RDX: 0000000000000000  RSI: 00000000000005f2  RDI: ffff88023b5dcee0
    RBP: ffff88023abd5b48   R8: 0000000000000000   R9: ffff88023a3b6048
    R10: 0000000000000000  R11: 0000000000000002  R12: ffff8802398d4cd8
    R13: ffff88023af35140  R14: ffff88023b60c890  R15: 0000000000000000
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
 linux4kix#7 [ffff88023abd5b50] vmxnet3_do_poll at ffffffffa00f204a [vmxnet3]
 linux4kix#8 [ffff88023abd5b80] vmxnet3_netpoll at ffffffffa00f209c [vmxnet3]
 linux4kix#9 [ffff88023abd5ba0] netpoll_poll_dev at ffffffff81472bb7

The fix is to do as other drivers do, and have the poll controller call the top
half interrupt handler, which schedules a napi poll properly to recieve frames

Tested by myself, successfully.

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
CC: Shreyas Bhatewara <sbhatewara@vmware.com>
CC: "VMware, Inc." <pv-drivers@vmware.com>
CC: "David S. Miller" <davem@davemloft.net>
CC: stable@vger.kernel.org
Reviewed-by: Shreyas N Bhatewara <sbhatewara@vmware.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
mk01 pushed a commit to mk01/linux-linaro-stable-mx6 that referenced this pull request Sep 3, 2014
If a topology event subscription fails for any reason, such as out
of memory, max number reached or because we received an invalid
request the correct behavior is to terminate the subscribers
connection to the topology server. This is currently broken and
produces the following oops:

[27.953662] tipc: Subscription rejected, illegal request
[27.955329] BUG: spinlock recursion on CPU#1, kworker/u4:0/6
[27.957066]  lock: 0xffff88003c67f408, .magic: dead4ead, .owner: kworker/u4:0/6, .owner_cpu: 1
[27.958054] CPU: 1 PID: 6 Comm: kworker/u4:0 Not tainted 3.14.0-rc6+ rabeeh#5
[27.960230] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
[27.960874] Workqueue: tipc_rcv tipc_recv_work [tipc]
[27.961430]  ffff88003c67f408 ffff88003de27c18 ffffffff815c0207 ffff88003de1c050
[27.962292]  ffff88003de27c38 ffffffff815beec5 ffff88003c67f408 ffffffff817f0a8a
[27.963152]  ffff88003de27c58 ffffffff815beeeb ffff88003c67f408 ffffffffa0013520
[27.964023] Call Trace:
[27.964292]  [<ffffffff815c0207>] dump_stack+0x45/0x56
[27.964874]  [<ffffffff815beec5>] spin_dump+0x8c/0x91
[27.965420]  [<ffffffff815beeeb>] spin_bug+0x21/0x26
[27.965995]  [<ffffffff81083df6>] do_raw_spin_lock+0x116/0x140
[27.966631]  [<ffffffff815c6215>] _raw_spin_lock_bh+0x15/0x20
[27.967256]  [<ffffffffa0008540>] subscr_conn_shutdown_event+0x20/0xa0 [tipc]
[27.968051]  [<ffffffffa000fde4>] tipc_close_conn+0xa4/0xb0 [tipc]
[27.968722]  [<ffffffffa00101ba>] tipc_conn_terminate+0x1a/0x30 [tipc]
[27.969436]  [<ffffffffa00089a2>] subscr_conn_msg_event+0x1f2/0x2f0 [tipc]
[27.970209]  [<ffffffffa0010000>] tipc_receive_from_sock+0x90/0xf0 [tipc]
[27.970972]  [<ffffffffa000fa79>] tipc_recv_work+0x29/0x50 [tipc]
[27.971633]  [<ffffffff8105dbf5>] process_one_work+0x165/0x3e0
[27.972267]  [<ffffffff8105e869>] worker_thread+0x119/0x3a0
[27.972896]  [<ffffffff8105e750>] ? manage_workers.isra.25+0x2a0/0x2a0
[27.973622]  [<ffffffff810648af>] kthread+0xdf/0x100
[27.974168]  [<ffffffff810647d0>] ? kthread_create_on_node+0x1a0/0x1a0
[27.974893]  [<ffffffff815ce13c>] ret_from_fork+0x7c/0xb0
[27.975466]  [<ffffffff810647d0>] ? kthread_create_on_node+0x1a0/0x1a0

The recursion occurs when subscr_terminate tries to grab the
subscriber lock, which is already taken by subscr_conn_msg_event.
We fix this by checking if the request to establish a new
subscription was successful, and if not we initiate termination of
the subscriber after we have released the subscriber lock.

Signed-off-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
mk01 pushed a commit to mk01/linux-linaro-stable-mx6 that referenced this pull request Sep 3, 2014
commit 0c8482ac92db5ac15792caf23b7f7df9e4f48ae1 upstream.

virtscsi_init calls virtscsi_remove_vqs on err, even before initializing
the vqs. The latter calls virtscsi_set_affinity, so let's check the
pointer there before setting affinity on it.

This fixes a panic when setting device's num_queues=2 on RHEL 6.5:

qemu-system-x86_64 ... \
-device virtio-scsi-pci,id=scsi0,addr=0x13,...,num_queues=2 \
-drive file=/stor/vm/dummy.raw,id=drive-scsi-disk,... \
-device scsi-hd,drive=drive-scsi-disk,...

[    0.354734] scsi0 : Virtio SCSI HBA
[    0.379504] BUG: unable to handle kernel NULL pointer dereference at 0000000000000020
[    0.380141] IP: [<ffffffff814741ef>] __virtscsi_set_affinity+0x4f/0x120
[    0.380141] PGD 0
[    0.380141] Oops: 0000 [rabeeh#1] SMP
[    0.380141] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.14.0+ rabeeh#5
[    0.380141] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2007
[    0.380141] task: ffff88003c9f0000 ti: ffff88003c9f8000 task.ti: ffff88003c9f8000
[    0.380141] RIP: 0010:[<ffffffff814741ef>]  [<ffffffff814741ef>] __virtscsi_set_affinity+0x4f/0x120
[    0.380141] RSP: 0000:ffff88003c9f9c08  EFLAGS: 00010256
[    0.380141] RAX: 0000000000000000 RBX: ffff88003c3a9d40 RCX: 0000000000001070
[    0.380141] RDX: 0000000000000002 RSI: 0000000000000000 RDI: 0000000000000000
[    0.380141] RBP: ffff88003c9f9c28 R08: 00000000000136c0 R09: ffff88003c801c00
[    0.380141] R10: ffffffff81475229 R11: 0000000000000008 R12: 0000000000000000
[    0.380141] R13: ffffffff81cc7ca8 R14: ffff88003cac3d40 R15: ffff88003cac37a0
[    0.380141] FS:  0000000000000000(0000) GS:ffff88003e400000(0000) knlGS:0000000000000000
[    0.380141] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[    0.380141] CR2: 0000000000000020 CR3: 0000000001c0e000 CR4: 00000000000006f0
[    0.380141] Stack:
[    0.380141]  ffff88003c3a9d40 0000000000000000 ffff88003cac3d80 ffff88003cac3d40
[    0.380141]  ffff88003c9f9c48 ffffffff814742e8 ffff88003c26d000 ffff88003c26d000
[    0.380141]  ffff88003c9f9c68 ffffffff81474321 ffff88003c26d000 ffff88003c3a9d40
[    0.380141] Call Trace:
[    0.380141]  [<ffffffff814742e8>] virtscsi_set_affinity+0x28/0x40
[    0.380141]  [<ffffffff81474321>] virtscsi_remove_vqs+0x21/0x50
[    0.380141]  [<ffffffff81475231>] virtscsi_init+0x91/0x240
[    0.380141]  [<ffffffff81365290>] ? vp_get+0x50/0x70
[    0.380141]  [<ffffffff81475544>] virtscsi_probe+0xf4/0x280
[    0.380141]  [<ffffffff81363ea5>] virtio_dev_probe+0xe5/0x140
[    0.380141]  [<ffffffff8144c669>] driver_probe_device+0x89/0x230
[    0.380141]  [<ffffffff8144c8ab>] __driver_attach+0x9b/0xa0
[    0.380141]  [<ffffffff8144c810>] ? driver_probe_device+0x230/0x230
[    0.380141]  [<ffffffff8144c810>] ? driver_probe_device+0x230/0x230
[    0.380141]  [<ffffffff8144ac1c>] bus_for_each_dev+0x8c/0xb0
[    0.380141]  [<ffffffff8144c499>] driver_attach+0x19/0x20
[    0.380141]  [<ffffffff8144bf28>] bus_add_driver+0x198/0x220
[    0.380141]  [<ffffffff8144ce9f>] driver_register+0x5f/0xf0
[    0.380141]  [<ffffffff81d27c91>] ? spi_transport_init+0x79/0x79
[    0.380141]  [<ffffffff8136403b>] register_virtio_driver+0x1b/0x30
[    0.380141]  [<ffffffff81d27d19>] init+0x88/0xd6
[    0.380141]  [<ffffffff81d27c18>] ? scsi_init_procfs+0x5b/0x5b
[    0.380141]  [<ffffffff81ce88a7>] do_one_initcall+0x7f/0x10a
[    0.380141]  [<ffffffff81ce8aa7>] kernel_init_freeable+0x14a/0x1de
[    0.380141]  [<ffffffff81ce8b3b>] ? kernel_init_freeable+0x1de/0x1de
[    0.380141]  [<ffffffff817dec20>] ? rest_init+0x80/0x80
[    0.380141]  [<ffffffff817dec29>] kernel_init+0x9/0xf0
[    0.380141]  [<ffffffff817e68fc>] ret_from_fork+0x7c/0xb0
[    0.380141]  [<ffffffff817dec20>] ? rest_init+0x80/0x80
[    0.380141] RIP  [<ffffffff814741ef>] __virtscsi_set_affinity+0x4f/0x120
[    0.380141]  RSP <ffff88003c9f9c08>
[    0.380141] CR2: 0000000000000020
[    0.380141] ---[ end trace 8074b70c3d5e1d73 ]---
[    0.475018] Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000009
[    0.475018]
[    0.475068] Kernel Offset: 0x0 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffff9fffffff)
[    0.475068] ---[ end Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000009

[jejb: checkpatch fixes]
Signed-off-by: Fam Zheng <famz@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
mk01 pushed a commit to mk01/linux-linaro-stable-mx6 that referenced this pull request Sep 3, 2014
commit 3f1f9b851311a76226140b55b1ea22111234a7c2 upstream.

This fixes the following lockdep complaint:

[ INFO: possible circular locking dependency detected ]
3.16.0-rc2-mm1+ linux4kix#7 Tainted: G           O
-------------------------------------------------------
kworker/u24:0/4356 is trying to acquire lock:
 (&(&sbi->s_es_lru_lock)->rlock){+.+.-.}, at: [<ffffffff81285fff>] __ext4_es_shrink+0x4f/0x2e0

but task is already holding lock:
 (&ei->i_es_lock){++++-.}, at: [<ffffffff81286961>] ext4_es_insert_extent+0x71/0x180

which lock already depends on the new lock.

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(&ei->i_es_lock);
                               lock(&(&sbi->s_es_lru_lock)->rlock);
                               lock(&ei->i_es_lock);
  lock(&(&sbi->s_es_lru_lock)->rlock);

 *** DEADLOCK ***

6 locks held by kworker/u24:0/4356:
 #0:  ("writeback"){.+.+.+}, at: [<ffffffff81071d00>] process_one_work+0x180/0x560
 rabeeh#1:  ((&(&wb->dwork)->work)){+.+.+.}, at: [<ffffffff81071d00>] process_one_work+0x180/0x560
 rabeeh#2:  (&type->s_umount_key#22){++++++}, at: [<ffffffff811a9c74>] grab_super_passive+0x44/0x90
 rabeeh#3:  (jbd2_handle){+.+...}, at: [<ffffffff812979f9>] start_this_handle+0x189/0x5f0
 rabeeh#4:  (&ei->i_data_sem){++++..}, at: [<ffffffff81247062>] ext4_map_blocks+0x132/0x550
 rabeeh#5:  (&ei->i_es_lock){++++-.}, at: [<ffffffff81286961>] ext4_es_insert_extent+0x71/0x180

stack backtrace:
CPU: 0 PID: 4356 Comm: kworker/u24:0 Tainted: G           O   3.16.0-rc2-mm1+ linux4kix#7
Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
Workqueue: writeback bdi_writeback_workfn (flush-253:0)
 ffffffff8213dce0 ffff880014b07538 ffffffff815df0bb 0000000000000007
 ffffffff8213e040 ffff880014b07588 ffffffff815db3dd ffff880014b07568
 ffff880014b07610 ffff88003b868930 ffff88003b868908 ffff88003b868930
Call Trace:
 [<ffffffff815df0bb>] dump_stack+0x4e/0x68
 [<ffffffff815db3dd>] print_circular_bug+0x1fb/0x20c
 [<ffffffff810a7a3e>] __lock_acquire+0x163e/0x1d00
 [<ffffffff815e89dc>] ? retint_restore_args+0xe/0xe
 [<ffffffff815ddc7b>] ? __slab_alloc+0x4a8/0x4ce
 [<ffffffff81285fff>] ? __ext4_es_shrink+0x4f/0x2e0
 [<ffffffff810a8707>] lock_acquire+0x87/0x120
 [<ffffffff81285fff>] ? __ext4_es_shrink+0x4f/0x2e0
 [<ffffffff8128592d>] ? ext4_es_free_extent+0x5d/0x70
 [<ffffffff815e6f09>] _raw_spin_lock+0x39/0x50
 [<ffffffff81285fff>] ? __ext4_es_shrink+0x4f/0x2e0
 [<ffffffff8119760b>] ? kmem_cache_alloc+0x18b/0x1a0
 [<ffffffff81285fff>] __ext4_es_shrink+0x4f/0x2e0
 [<ffffffff812869b8>] ext4_es_insert_extent+0xc8/0x180
 [<ffffffff812470f4>] ext4_map_blocks+0x1c4/0x550
 [<ffffffff8124c4c4>] ext4_writepages+0x6d4/0xd00
	...

Reported-by: Minchan Kim <minchan@kernel.org>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reported-by: Minchan Kim <minchan@kernel.org>
Cc: Zheng Liu <gnehzuil.liu@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant