-
Notifications
You must be signed in to change notification settings - Fork 32
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Pinning apps not working #2
Comments
fcuzzocrea
pushed a commit
to fcuzzocrea/android_kernel_samsung_universal8890
that referenced
this issue
Mar 10, 2020
Two paths to update quota and f2fs_lock_op: 1. - lock_op | - quota_update `- unlock_op 2. - quota_update - lock_op `- unlock_op But, we need to make a transaction on quota_update + lock_op in ivanmeler#2 case. So, this patch introduces: 1. lock_op 2. down_write 3. check __need_flush 4. up_write 5. if there is dirty quota entries, flush them 6. otherwise, good to go Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
fcuzzocrea
pushed a commit
to fcuzzocrea/android_kernel_samsung_universal8890
that referenced
this issue
Mar 10, 2020
commit cf3591ef832915892f2499b7e54b51d4c578b28c upstream. Revert the commit bd293d071ffe65e645b4d8104f9d8fe15ea13862. The proper fix has been made available with commit d0a255e795ab ("loop: set PF_MEMALLOC_NOIO for the worker thread"). Note that the fix offered by commit bd293d071ffe doesn't really prevent the deadlock from occuring - if we look at the stacktrace reported by Junxiao Bi, we see that it hangs in bit_wait_io and not on the mutex - i.e. it has already successfully taken the mutex. Changing the mutex from mutex_lock to mutex_trylock won't help with deadlocks that happen afterwards. PID: 474 TASK: ffff8813e11f4600 CPU: 10 COMMAND: "kswapd0" #0 [ffff8813dedfb938] __schedule at ffffffff8173f405 ivanmeler#1 [ffff8813dedfb990] schedule at ffffffff8173fa27 ivanmeler#2 [ffff8813dedfb9b0] schedule_timeout at ffffffff81742fec ivanmeler#3 [ffff8813dedfba60] io_schedule_timeout at ffffffff8173f186 ivanmeler#4 [ffff8813dedfbaa0] bit_wait_io at ffffffff8174034f ivanmeler#5 [ffff8813dedfbac0] __wait_on_bit at ffffffff8173fec8 ivanmeler#6 [ffff8813dedfbb10] out_of_line_wait_on_bit at ffffffff8173ff81 ivanmeler#7 [ffff8813dedfbb90] __make_buffer_clean at ffffffffa038736f [dm_bufio] ivanmeler#8 [ffff8813dedfbbb0] __try_evict_buffer at ffffffffa0387bb8 [dm_bufio] #9 [ffff8813dedfbbd0] dm_bufio_shrink_scan at ffffffffa0387cc3 [dm_bufio] ivanmeler#10 [ffff8813dedfbc40] shrink_slab at ffffffff811a87ce ivanmeler#11 [ffff8813dedfbd30] shrink_zone at ffffffff811ad778 ivanmeler#12 [ffff8813dedfbdc0] kswapd at ffffffff811ae92f ivanmeler#13 [ffff8813dedfbec0] kthread at ffffffff810a8428 ivanmeler#14 [ffff8813dedfbf50] ret_from_fork at ffffffff81745242 Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Cc: stable@vger.kernel.org Fixes: bd293d071ffe ("dm bufio: fix deadlock with loop device") Depends-on: d0a255e795ab ("loop: set PF_MEMALLOC_NOIO for the worker thread") Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> Change-Id: I6cd6b254cca8cb8e72c197d44b91790c1a799554
fcuzzocrea
pushed a commit
to fcuzzocrea/android_kernel_samsung_universal8890
that referenced
this issue
Mar 10, 2020
[ Upstream commit fe163e534e5eecdfd7b5920b0dfd24c458ee85d6 ] syzbot reported: BUG: KMSAN: uninit-value in capi_write+0x791/0xa90 drivers/isdn/capi/capi.c:700 CPU: 0 PID: 10025 Comm: syz-executor379 Not tainted 4.20.0-rc7+ ivanmeler#2 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x173/0x1d0 lib/dump_stack.c:113 kmsan_report+0x12e/0x2a0 mm/kmsan/kmsan.c:613 __msan_warning+0x82/0xf0 mm/kmsan/kmsan_instr.c:313 capi_write+0x791/0xa90 drivers/isdn/capi/capi.c:700 do_loop_readv_writev fs/read_write.c:703 [inline] do_iter_write+0x83e/0xd80 fs/read_write.c:961 vfs_writev fs/read_write.c:1004 [inline] do_writev+0x397/0x840 fs/read_write.c:1039 __do_sys_writev fs/read_write.c:1112 [inline] __se_sys_writev+0x9b/0xb0 fs/read_write.c:1109 __x64_sys_writev+0x4a/0x70 fs/read_write.c:1109 do_syscall_64+0xbc/0xf0 arch/x86/entry/common.c:291 entry_SYSCALL_64_after_hwframe+0x63/0xe7 [...] The problem is that capi_write() is reading past the end of the message. Fix it by checking the message's length in the needed places. Reported-and-tested-by: syzbot+0849c524d9c634f5ae66@syzkaller.appspotmail.com Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> Change-Id: I44273001127d82f553a5df3ca55da16e85604be0
fcuzzocrea
pushed a commit
to fcuzzocrea/android_kernel_samsung_universal8890
that referenced
this issue
Mar 10, 2020
commit e4f8e513c3d353c134ad4eef9fd0bba12406c7c8 upstream. A long time ago we fixed a similar deadlock in show_slab_objects() [1]. However, it is apparently due to the commits like 01fb58bcba63 ("slab: remove synchronous synchronize_sched() from memcg cache deactivation path") and 03afc0e ("slab: get_online_mems for kmem_cache_{create,destroy,shrink}"), this kind of deadlock is back by just reading files in /sys/kernel/slab which will generate a lockdep splat below. Since the "mem_hotplug_lock" here is only to obtain a stable online node mask while racing with NUMA node hotplug, in the worst case, the results may me miscalculated while doing NUMA node hotplug, but they shall be corrected by later reads of the same files. WARNING: possible circular locking dependency detected ------------------------------------------------------ cat/5224 is trying to acquire lock: ffff900012ac3120 (mem_hotplug_lock.rw_sem){++++}, at: show_slab_objects+0x94/0x3a8 but task is already holding lock: b8ff009693eee398 (kn->count#45){++++}, at: kernfs_seq_start+0x44/0xf0 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> ivanmeler#2 (kn->count#45){++++}: lock_acquire+0x31c/0x360 __kernfs_remove+0x290/0x490 kernfs_remove+0x30/0x44 sysfs_remove_dir+0x70/0x88 kobject_del+0x50/0xb0 sysfs_slab_unlink+0x2c/0x38 shutdown_cache+0xa0/0xf0 kmemcg_cache_shutdown_fn+0x1c/0x34 kmemcg_workfn+0x44/0x64 process_one_work+0x4f4/0x950 worker_thread+0x390/0x4bc kthread+0x1cc/0x1e8 ret_from_fork+0x10/0x18 -> ivanmeler#1 (slab_mutex){+.+.}: lock_acquire+0x31c/0x360 __mutex_lock_common+0x16c/0xf78 mutex_lock_nested+0x40/0x50 memcg_create_kmem_cache+0x38/0x16c memcg_kmem_cache_create_func+0x3c/0x70 process_one_work+0x4f4/0x950 worker_thread+0x390/0x4bc kthread+0x1cc/0x1e8 ret_from_fork+0x10/0x18 -> #0 (mem_hotplug_lock.rw_sem){++++}: validate_chain+0xd10/0x2bcc __lock_acquire+0x7f4/0xb8c lock_acquire+0x31c/0x360 get_online_mems+0x54/0x150 show_slab_objects+0x94/0x3a8 total_objects_show+0x28/0x34 slab_attr_show+0x38/0x54 sysfs_kf_seq_show+0x198/0x2d4 kernfs_seq_show+0xa4/0xcc seq_read+0x30c/0x8a8 kernfs_fop_read+0xa8/0x314 __vfs_read+0x88/0x20c vfs_read+0xd8/0x10c ksys_read+0xb0/0x120 __arm64_sys_read+0x54/0x88 el0_svc_handler+0x170/0x240 el0_svc+0x8/0xc other info that might help us debug this: Chain exists of: mem_hotplug_lock.rw_sem --> slab_mutex --> kn->count#45 Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(kn->count#45); lock(slab_mutex); lock(kn->count#45); lock(mem_hotplug_lock.rw_sem); *** DEADLOCK *** 3 locks held by cat/5224: #0: 9eff00095b14b2a0 (&p->lock){+.+.}, at: seq_read+0x4c/0x8a8 ivanmeler#1: 0eff008997041480 (&of->mutex){+.+.}, at: kernfs_seq_start+0x34/0xf0 ivanmeler#2: b8ff009693eee398 (kn->count#45){++++}, at: kernfs_seq_start+0x44/0xf0 stack backtrace: Call trace: dump_backtrace+0x0/0x248 show_stack+0x20/0x2c dump_stack+0xd0/0x140 print_circular_bug+0x368/0x380 check_noncircular+0x248/0x250 validate_chain+0xd10/0x2bcc __lock_acquire+0x7f4/0xb8c lock_acquire+0x31c/0x360 get_online_mems+0x54/0x150 show_slab_objects+0x94/0x3a8 total_objects_show+0x28/0x34 slab_attr_show+0x38/0x54 sysfs_kf_seq_show+0x198/0x2d4 kernfs_seq_show+0xa4/0xcc seq_read+0x30c/0x8a8 kernfs_fop_read+0xa8/0x314 __vfs_read+0x88/0x20c vfs_read+0xd8/0x10c ksys_read+0xb0/0x120 __arm64_sys_read+0x54/0x88 el0_svc_handler+0x170/0x240 el0_svc+0x8/0xc I think it is important to mention that this doesn't expose the show_slab_objects to use-after-free. There is only a single path that might really race here and that is the slab hotplug notifier callback __kmem_cache_shrink (via slab_mem_going_offline_callback) but that path doesn't really destroy kmem_cache_node data structures. [1] http://lkml.iu.edu/hypermail/linux/kernel/1101.0/02850.html [akpm@linux-foundation.org: add comment explaining why we don't need mem_hotplug_lock] Link: http://lkml.kernel.org/r/1570192309-10132-1-git-send-email-cai@lca.pw Fixes: 01fb58bcba63 ("slab: remove synchronous synchronize_sched() from memcg cache deactivation path") Fixes: 03afc0e ("slab: get_online_mems for kmem_cache_{create,destroy,shrink}") Signed-off-by: Qian Cai <cai@lca.pw> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Christoph Lameter <cl@linux.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: David Rientjes <rientjes@google.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Tejun Heo <tj@kernel.org> Cc: Vladimir Davydov <vdavydov.dev@gmail.com> Cc: Roman Gushchin <guro@fb.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> Change-Id: I939b74a8a12839476f43f2230484560ce3892b8c
fcuzzocrea
pushed a commit
to fcuzzocrea/android_kernel_samsung_universal8890
that referenced
this issue
Mar 10, 2020
[ Upstream commit 347ab9480313737c0f1aaa08e8f2e1a791235535 ] This patch fixes deadlock warning if removing PWM device when CONFIG_PROVE_LOCKING is enabled. This issue can be reproceduced by the following steps on the R-Car H3 Salvator-X board if the backlight is disabled: # cd /sys/class/pwm/pwmchip0 # echo 0 > export # ls device export npwm power pwm0 subsystem uevent unexport # cd device/driver # ls bind e6e31000.pwm uevent unbind # echo e6e31000.pwm > unbind [ 87.659974] ====================================================== [ 87.666149] WARNING: possible circular locking dependency detected [ 87.672327] 5.0.0 ivanmeler#7 Not tainted [ 87.675549] ------------------------------------------------------ [ 87.681723] bash/2986 is trying to acquire lock: [ 87.686337] 000000005ea0e178 (kn->count#58){++++}, at: kernfs_remove_by_name_ns+0x50/0xa0 [ 87.694528] [ 87.694528] but task is already holding lock: [ 87.700353] 000000006313b17c (pwm_lock){+.+.}, at: pwmchip_remove+0x28/0x13c [ 87.707405] [ 87.707405] which lock already depends on the new lock. [ 87.707405] [ 87.715574] [ 87.715574] the existing dependency chain (in reverse order) is: [ 87.723048] [ 87.723048] -> ivanmeler#1 (pwm_lock){+.+.}: [ 87.728017] __mutex_lock+0x70/0x7e4 [ 87.732108] mutex_lock_nested+0x1c/0x24 [ 87.736547] pwm_request_from_chip.part.6+0x34/0x74 [ 87.741940] pwm_request_from_chip+0x20/0x40 [ 87.746725] export_store+0x6c/0x1f4 [ 87.750820] dev_attr_store+0x18/0x28 [ 87.754998] sysfs_kf_write+0x54/0x64 [ 87.759175] kernfs_fop_write+0xe4/0x1e8 [ 87.763615] __vfs_write+0x40/0x184 [ 87.767619] vfs_write+0xa8/0x19c [ 87.771448] ksys_write+0x58/0xbc [ 87.775278] __arm64_sys_write+0x18/0x20 [ 87.779721] el0_svc_common+0xd0/0x124 [ 87.783986] el0_svc_compat_handler+0x1c/0x24 [ 87.788858] el0_svc_compat+0x8/0x18 [ 87.792947] [ 87.792947] -> #0 (kn->count#58){++++}: [ 87.798260] lock_acquire+0xc4/0x22c [ 87.802353] __kernfs_remove+0x258/0x2c4 [ 87.806790] kernfs_remove_by_name_ns+0x50/0xa0 [ 87.811836] remove_files.isra.1+0x38/0x78 [ 87.816447] sysfs_remove_group+0x48/0x98 [ 87.820971] sysfs_remove_groups+0x34/0x4c [ 87.825583] device_remove_attrs+0x6c/0x7c [ 87.830197] device_del+0x11c/0x33c [ 87.834201] device_unregister+0x14/0x2c [ 87.838638] pwmchip_sysfs_unexport+0x40/0x4c [ 87.843509] pwmchip_remove+0xf4/0x13c [ 87.847773] rcar_pwm_remove+0x28/0x34 [ 87.852039] platform_drv_remove+0x24/0x64 [ 87.856651] device_release_driver_internal+0x18c/0x21c [ 87.862391] device_release_driver+0x14/0x1c [ 87.867175] unbind_store+0xe0/0x124 [ 87.871265] drv_attr_store+0x20/0x30 [ 87.875442] sysfs_kf_write+0x54/0x64 [ 87.879618] kernfs_fop_write+0xe4/0x1e8 [ 87.884055] __vfs_write+0x40/0x184 [ 87.888057] vfs_write+0xa8/0x19c [ 87.891887] ksys_write+0x58/0xbc [ 87.895716] __arm64_sys_write+0x18/0x20 [ 87.900154] el0_svc_common+0xd0/0x124 [ 87.904417] el0_svc_compat_handler+0x1c/0x24 [ 87.909289] el0_svc_compat+0x8/0x18 [ 87.913378] [ 87.913378] other info that might help us debug this: [ 87.913378] [ 87.921374] Possible unsafe locking scenario: [ 87.921374] [ 87.927286] CPU0 CPU1 [ 87.931808] ---- ---- [ 87.936331] lock(pwm_lock); [ 87.939293] lock(kn->count#58); [ 87.945120] lock(pwm_lock); [ 87.950599] lock(kn->count#58); [ 87.953908] [ 87.953908] *** DEADLOCK *** [ 87.953908] [ 87.959821] 4 locks held by bash/2986: [ 87.963563] #0: 00000000ace7bc30 (sb_writers#6){.+.+}, at: vfs_write+0x188/0x19c [ 87.971044] ivanmeler#1: 00000000287991b2 (&of->mutex){+.+.}, at: kernfs_fop_write+0xb4/0x1e8 [ 87.978872] ivanmeler#2: 00000000f739d016 (&dev->mutex){....}, at: device_release_driver_internal+0x40/0x21c [ 87.988001] ivanmeler#3: 000000006313b17c (pwm_lock){+.+.}, at: pwmchip_remove+0x28/0x13c [ 87.995481] [ 87.995481] stack backtrace: [ 87.999836] CPU: 0 PID: 2986 Comm: bash Not tainted 5.0.0 ivanmeler#7 [ 88.005489] Hardware name: Renesas Salvator-X board based on r8a7795 ES1.x (DT) [ 88.012791] Call trace: [ 88.015235] dump_backtrace+0x0/0x190 [ 88.018891] show_stack+0x14/0x1c [ 88.022204] dump_stack+0xb0/0xec [ 88.025514] print_circular_bug.isra.32+0x1d0/0x2e0 [ 88.030385] __lock_acquire+0x1318/0x1864 [ 88.034388] lock_acquire+0xc4/0x22c [ 88.037958] __kernfs_remove+0x258/0x2c4 [ 88.041874] kernfs_remove_by_name_ns+0x50/0xa0 [ 88.046398] remove_files.isra.1+0x38/0x78 [ 88.050487] sysfs_remove_group+0x48/0x98 [ 88.054490] sysfs_remove_groups+0x34/0x4c [ 88.058580] device_remove_attrs+0x6c/0x7c [ 88.062671] device_del+0x11c/0x33c [ 88.066154] device_unregister+0x14/0x2c [ 88.070070] pwmchip_sysfs_unexport+0x40/0x4c [ 88.074421] pwmchip_remove+0xf4/0x13c [ 88.078163] rcar_pwm_remove+0x28/0x34 [ 88.081906] platform_drv_remove+0x24/0x64 [ 88.085996] device_release_driver_internal+0x18c/0x21c [ 88.091215] device_release_driver+0x14/0x1c [ 88.095478] unbind_store+0xe0/0x124 [ 88.099048] drv_attr_store+0x20/0x30 [ 88.102704] sysfs_kf_write+0x54/0x64 [ 88.106359] kernfs_fop_write+0xe4/0x1e8 [ 88.110275] __vfs_write+0x40/0x184 [ 88.113757] vfs_write+0xa8/0x19c [ 88.117065] ksys_write+0x58/0xbc [ 88.120374] __arm64_sys_write+0x18/0x20 [ 88.124291] el0_svc_common+0xd0/0x124 [ 88.128034] el0_svc_compat_handler+0x1c/0x24 [ 88.132384] el0_svc_compat+0x8/0x18 The sysfs unexport in pwmchip_remove() is completely asymmetric to what we do in pwmchip_add_with_polarity() and commit 0733424c9ba9 ("pwm: Unexport children before chip removal") is a strong indication that this was wrong to begin with. We should just move pwmchip_sysfs_unexport() where it belongs, which is right after pwmchip_sysfs_unexport_children(). In that case, we do not need separate functions anymore either. We also really want to remove sysfs irrespective of whether or not the chip will be removed as a result of pwmchip_remove(). We can only assume that the driver will be gone after that, so we shouldn't leave any dangling sysfs files around. This warning disappears if we move pwmchip_sysfs_unexport() to the top of pwmchip_remove(), pwmchip_sysfs_unexport_children(). That way it is also outside of the pwm_lock section, which indeed doesn't seem to be needed. Moving the pwmchip_sysfs_export() call outside of that section also seems fine and it'd be perfectly symmetric with pwmchip_remove() again. So, this patch fixes them. Signed-off-by: Phong Hoang <phong.hoang.wz@renesas.com> [shimoda: revise the commit log and code] Fixes: 76abbdd ("pwm: Add sysfs interface") Fixes: 0733424c9ba9 ("pwm: Unexport children before chip removal") Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com> Tested-by: Hoan Nguyen An <na-hoan@jinso.co.jp> Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be> Reviewed-by: Simon Horman <horms+renesas@verge.net.au> Reviewed-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Signed-off-by: Thierry Reding <thierry.reding@gmail.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Lee Jones <lee.jones@linaro.org> Change-Id: If7fc36ed28d5bd95a4f32666dc2010939cbf66e5
fcuzzocrea
pushed a commit
to fcuzzocrea/android_kernel_samsung_universal8890
that referenced
this issue
Mar 10, 2020
[ Upstream commit 551842446ed695641a00782cd118cbb064a416a1 ] ifmsh->csa is an RCU-protected pointer. The writer context in ieee80211_mesh_finish_csa() is already mutually exclusive with wdev->sdata.mtx, but the RCU checker did not know this. Use rcu_dereference_protected() to avoid a warning. fixes the following warning: [ 12.519089] ============================= [ 12.520042] WARNING: suspicious RCU usage [ 12.520652] 5.1.0-rc7-wt+ ivanmeler#16 Tainted: G W [ 12.521409] ----------------------------- [ 12.521972] net/mac80211/mesh.c:1223 suspicious rcu_dereference_check() usage! [ 12.522928] other info that might help us debug this: [ 12.523984] rcu_scheduler_active = 2, debug_locks = 1 [ 12.524855] 5 locks held by kworker/u8:2/152: [ 12.525438] #0: 00000000057be08c ((wq_completion)phy0){+.+.}, at: process_one_work+0x1a2/0x620 [ 12.526607] ivanmeler#1: 0000000059c6b07a ((work_completion)(&sdata->csa_finalize_work)){+.+.}, at: process_one_work+0x1a2/0x620 [ 12.528001] ivanmeler#2: 00000000f184ba7d (&wdev->mtx){+.+.}, at: ieee80211_csa_finalize_work+0x2f/0x90 [ 12.529116] ivanmeler#3: 00000000831a1f54 (&local->mtx){+.+.}, at: ieee80211_csa_finalize_work+0x47/0x90 [ 12.530233] ivanmeler#4: 00000000fd06f988 (&local->chanctx_mtx){+.+.}, at: ieee80211_csa_finalize_work+0x51/0x90 Signed-off-by: Thomas Pedersen <thomas@eero.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Lee Jones <lee.jones@linaro.org> Change-Id: I19313f756382b0078683036d50c6645dd8ab2bee
fcuzzocrea
pushed a commit
to fcuzzocrea/android_kernel_samsung_universal8890
that referenced
this issue
Mar 10, 2020
[ Upstream commit 6da9f775175e516fc7229ceaa9b54f8f56aa7924 ] When debugging options are turned on, the rcu_read_lock() function might not be inlined. This results in lockdep's print_lock() function printing "rcu_read_lock+0x0/0x70" instead of rcu_read_lock()'s caller. For example: [ 10.579995] ============================= [ 10.584033] WARNING: suspicious RCU usage [ 10.588074] 4.18.0.memcg_v2+ ivanmeler#1 Not tainted [ 10.593162] ----------------------------- [ 10.597203] include/linux/rcupdate.h:281 Illegal context switch in RCU read-side critical section! [ 10.606220] [ 10.606220] other info that might help us debug this: [ 10.606220] [ 10.614280] [ 10.614280] rcu_scheduler_active = 2, debug_locks = 1 [ 10.620853] 3 locks held by systemd/1: [ 10.624632] #0: (____ptrval____) (&type->i_mutex_dir_key#5){.+.+}, at: lookup_slow+0x42/0x70 [ 10.633232] ivanmeler#1: (____ptrval____) (rcu_read_lock){....}, at: rcu_read_lock+0x0/0x70 [ 10.640954] ivanmeler#2: (____ptrval____) (rcu_read_lock){....}, at: rcu_read_lock+0x0/0x70 These "rcu_read_lock+0x0/0x70" strings are not providing any useful information. This commit therefore forces inlining of the rcu_read_lock() function so that rcu_read_lock()'s caller is instead shown. Signed-off-by: Waiman Long <longman@redhat.com> Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Lee Jones <lee.jones@linaro.org> Change-Id: I557be1ed5041de3d559c49ebe6607c97d0680392
fcuzzocrea
pushed a commit
to fcuzzocrea/android_kernel_samsung_universal8890
that referenced
this issue
Mar 10, 2020
[ Upstream commit 86968ef21596515958d5f0a40233d02be78ecec0 ] Calling ceph_buffer_put() in __ceph_setxattr() may end up freeing the i_xattrs.prealloc_blob buffer while holding the i_ceph_lock. This can be fixed by postponing the call until later, when the lock is released. The following backtrace was triggered by fstests generic/117. BUG: sleeping function called from invalid context at mm/vmalloc.c:2283 in_atomic(): 1, irqs_disabled(): 0, pid: 650, name: fsstress 3 locks held by fsstress/650: #0: 00000000870a0fe8 (sb_writers#8){.+.+}, at: mnt_want_write+0x20/0x50 ivanmeler#1: 00000000ba0c4c74 (&type->i_mutex_dir_key#6){++++}, at: vfs_setxattr+0x55/0xa0 ivanmeler#2: 000000008dfbb3f2 (&(&ci->i_ceph_lock)->rlock){+.+.}, at: __ceph_setxattr+0x297/0x810 CPU: 1 PID: 650 Comm: fsstress Not tainted 5.2.0+ #437 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.1-0-ga5cab58-prebuilt.qemu.org 04/01/2014 Call Trace: dump_stack+0x67/0x90 ___might_sleep.cold+0x9f/0xb1 vfree+0x4b/0x60 ceph_buffer_release+0x1b/0x60 __ceph_setxattr+0x2b4/0x810 __vfs_setxattr+0x66/0x80 __vfs_setxattr_noperm+0x59/0xf0 vfs_setxattr+0x81/0xa0 setxattr+0x115/0x230 ? filename_lookup+0xc9/0x140 ? rcu_read_lock_sched_held+0x74/0x80 ? rcu_sync_lockdep_assert+0x2e/0x60 ? __sb_start_write+0x142/0x1a0 ? mnt_want_write+0x20/0x50 path_setxattr+0xba/0xd0 __x64_sys_lsetxattr+0x24/0x30 do_syscall_64+0x50/0x1c0 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x7ff23514359a Signed-off-by: Luis Henriques <lhenriques@suse.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Lee Jones <lee.jones@linaro.org> Change-Id: I70028d2cd09626c0d42def3a6ac62d59bebe1b6e
fcuzzocrea
pushed a commit
to fcuzzocrea/android_kernel_samsung_universal8890
that referenced
this issue
Mar 10, 2020
commit e091eab028f9253eac5c04f9141bbc9d170acab3 upstream. In some cases, ocfs2_iget() reads the data of inode, which has been deleted for some reason. That will make the system panic. So We should judge whether this inode has been deleted, and tell the caller that the inode is a bad inode. For example, the ocfs2 is used as the backed of nfs, and the client is nfsv3. This issue can be reproduced by the following steps. on the nfs server side, ..../patha/pathb Step 1: The process A was scheduled before calling the function fh_verify. Step 2: The process B is removing the 'pathb', and just completed the call to function dput. Then the dentry of 'pathb' has been deleted from the dcache, and all ancestors have been deleted also. The relationship of dentry and inode was deleted through the function hlist_del_init. The following is the call stack. dentry_iput->hlist_del_init(&dentry->d_u.d_alias) At this time, the inode is still in the dcache. Step 3: The process A call the function ocfs2_get_dentry, which get the inode from dcache. Then the refcount of inode is 1. The following is the call stack. nfsd3_proc_getacl->fh_verify->exportfs_decode_fh->fh_to_dentry(ocfs2_get_dentry) Step 4: Dirty pages are flushed by bdi threads. So the inode of 'patha' is evicted, and this directory was deleted. But the inode of 'pathb' can't be evicted, because the refcount of the inode was 1. Step 5: The process A keep running, and call the function reconnect_path(in exportfs_decode_fh), which call function ocfs2_get_parent of ocfs2. Get the block number of parent directory(patha) by the name of ... Then read the data from disk by the block number. But this inode has been deleted, so the system panic. Process A Process B 1. in nfsd3_proc_getacl | 2. | dput 3. fh_to_dentry(ocfs2_get_dentry) | 4. bdi flush dirty cache | 5. ocfs2_iget | [283465.542049] OCFS2: ERROR (device sdp): ocfs2_validate_inode_block: Invalid dinode #580640: OCFS2_VALID_FL not set [283465.545490] Kernel panic - not syncing: OCFS2: (device sdp): panic forced after error [283465.546889] CPU: 5 PID: 12416 Comm: nfsd Tainted: G W 4.1.12-124.18.6.el6uek.bug28762940v3.x86_64 ivanmeler#2 [283465.548382] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 09/21/2015 [283465.549657] 0000000000000000 ffff8800a56fb7b8 ffffffff816e839c ffffffffa0514758 [283465.550392] 000000000008dc20 ffff8800a56fb838 ffffffff816e62d3 0000000000000008 [283465.551056] ffff880000000010 ffff8800a56fb848 ffff8800a56fb7e8 ffff88005df9f000 [283465.551710] Call Trace: [283465.552516] [<ffffffff816e839c>] dump_stack+0x63/0x81 [283465.553291] [<ffffffff816e62d3>] panic+0xcb/0x21b [283465.554037] [<ffffffffa04e66b0>] ocfs2_handle_error+0xf0/0xf0 [ocfs2] [283465.554882] [<ffffffffa04e7737>] __ocfs2_error+0x67/0x70 [ocfs2] [283465.555768] [<ffffffffa049c0f9>] ocfs2_validate_inode_block+0x229/0x230 [ocfs2] [283465.556683] [<ffffffffa047bcbc>] ocfs2_read_blocks+0x46c/0x7b0 [ocfs2] [283465.557408] [<ffffffffa049bed0>] ? ocfs2_inode_cache_io_unlock+0x20/0x20 [ocfs2] [283465.557973] [<ffffffffa049f0eb>] ocfs2_read_inode_block_full+0x3b/0x60 [ocfs2] [283465.558525] [<ffffffffa049f5ba>] ocfs2_iget+0x4aa/0x880 [ocfs2] [283465.559082] [<ffffffffa049146e>] ocfs2_get_parent+0x9e/0x220 [ocfs2] [283465.559622] [<ffffffff81297c05>] reconnect_path+0xb5/0x300 [283465.560156] [<ffffffff81297f46>] exportfs_decode_fh+0xf6/0x2b0 [283465.560708] [<ffffffffa062faf0>] ? nfsd_proc_getattr+0xa0/0xa0 [nfsd] [283465.561262] [<ffffffff810a8196>] ? prepare_creds+0x26/0x110 [283465.561932] [<ffffffffa0630860>] fh_verify+0x350/0x660 [nfsd] [283465.562862] [<ffffffffa0637804>] ? nfsd_cache_lookup+0x44/0x630 [nfsd] [283465.563697] [<ffffffffa063a8b9>] nfsd3_proc_getattr+0x69/0xf0 [nfsd] [283465.564510] [<ffffffffa062cf60>] nfsd_dispatch+0xe0/0x290 [nfsd] [283465.565358] [<ffffffffa05eb892>] ? svc_tcp_adjust_wspace+0x12/0x30 [sunrpc] [283465.566272] [<ffffffffa05ea652>] svc_process_common+0x412/0x6a0 [sunrpc] [283465.567155] [<ffffffffa05eaa03>] svc_process+0x123/0x210 [sunrpc] [283465.568020] [<ffffffffa062c90f>] nfsd+0xff/0x170 [nfsd] [283465.568962] [<ffffffffa062c810>] ? nfsd_destroy+0x80/0x80 [nfsd] [283465.570112] [<ffffffff810a622b>] kthread+0xcb/0xf0 [283465.571099] [<ffffffff810a6160>] ? kthread_create_on_node+0x180/0x180 [283465.572114] [<ffffffff816f11b8>] ret_from_fork+0x58/0x90 [283465.573156] [<ffffffff810a6160>] ? kthread_create_on_node+0x180/0x180 Link: http://lkml.kernel.org/r/1554185919-3010-1-git-send-email-sunny.s.zhang@oracle.com Signed-off-by: Shuning Zhang <sunny.s.zhang@oracle.com> Reviewed-by: Joseph Qi <jiangqi903@gmail.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: piaojun <piaojun@huawei.com> Cc: "Gang He" <ghe@suse.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Lee Jones <lee.jones@linaro.org> Change-Id: Iead149536359912d6afd34e28b656222ced32a12
fcuzzocrea
pushed a commit
to fcuzzocrea/android_kernel_samsung_universal8890
that referenced
this issue
Mar 10, 2020
[ Upstream commit 443f2d5ba13d65ccfd879460f77941875159d154 ] Observe a segmentation fault when 'perf stat' is asked to repeat forever with the interval option. Without fix: # perf stat -r 0 -I 5000 -e cycles -a sleep 10 # time counts unit events 5.000211692 3,13,89,82,34,157 cycles 10.000380119 1,53,98,52,22,294 cycles 10.040467280 17,16,79,265 cycles Segmentation fault This problem was only observed when we use forever option aka -r 0 and works with limited repeats. Calling print_counter with ts being set to NULL, is not a correct option when interval is set. Hence avoid print_counter(NULL,..) if interval is set. With fix: # perf stat -r 0 -I 5000 -e cycles -a sleep 10 # time counts unit events 5.019866622 3,15,14,43,08,697 cycles 10.039865756 3,15,16,31,95,261 cycles 10.059950628 1,26,05,47,158 cycles 5.009902655 3,14,52,62,33,932 cycles 10.019880228 3,14,52,22,89,154 cycles 10.030543876 66,90,18,333 cycles 5.009848281 3,14,51,98,25,437 cycles 10.029854402 3,15,14,93,04,918 cycles 5.009834177 3,14,51,95,92,316 cycles Committer notes: Did the 'git bisect' to find the cset introducing the problem to add the Fixes tag below, and at that time the problem reproduced as: (gdb) run stat -r0 -I500 sleep 1 <SNIP> Program received signal SIGSEGV, Segmentation fault. print_interval (prefix=prefix@entry=0x7fffffffc8d0 "", ts=ts@entry=0x0) at builtin-stat.c:866 866 sprintf(prefix, "%6lu.%09lu%s", ts->tv_sec, ts->tv_nsec, csv_sep); (gdb) bt #0 print_interval (prefix=prefix@entry=0x7fffffffc8d0 "", ts=ts@entry=0x0) at builtin-stat.c:866 ivanmeler#1 0x000000000041860a in print_counters (ts=ts@entry=0x0, argc=argc@entry=2, argv=argv@entry=0x7fffffffd640) at builtin-stat.c:938 ivanmeler#2 0x0000000000419a7f in cmd_stat (argc=2, argv=0x7fffffffd640, prefix=<optimized out>) at builtin-stat.c:1411 ivanmeler#3 0x000000000045c65a in run_builtin (p=p@entry=0x6291b8 <commands+216>, argc=argc@entry=5, argv=argv@entry=0x7fffffffd640) at perf.c:370 ivanmeler#4 0x000000000045c893 in handle_internal_command (argc=5, argv=0x7fffffffd640) at perf.c:429 ivanmeler#5 0x000000000045c8f1 in run_argv (argcp=argcp@entry=0x7fffffffd4ac, argv=argv@entry=0x7fffffffd4a0) at perf.c:473 ivanmeler#6 0x000000000045cac9 in main (argc=<optimized out>, argv=<optimized out>) at perf.c:588 (gdb) Mostly the same as just before this patch: Program received signal SIGSEGV, Segmentation fault. 0x00000000005874a7 in print_interval (config=0xa1f2a0 <stat_config>, evlist=0xbc9b90, prefix=0x7fffffffd1c0 "`", ts=0x0) at util/stat-display.c:964 964 sprintf(prefix, "%6lu.%09lu%s", ts->tv_sec, ts->tv_nsec, config->csv_sep); (gdb) bt #0 0x00000000005874a7 in print_interval (config=0xa1f2a0 <stat_config>, evlist=0xbc9b90, prefix=0x7fffffffd1c0 "`", ts=0x0) at util/stat-display.c:964 ivanmeler#1 0x0000000000588047 in perf_evlist__print_counters (evlist=0xbc9b90, config=0xa1f2a0 <stat_config>, _target=0xa1f0c0 <target>, ts=0x0, argc=2, argv=0x7fffffffd670) at util/stat-display.c:1172 ivanmeler#2 0x000000000045390f in print_counters (ts=0x0, argc=2, argv=0x7fffffffd670) at builtin-stat.c:656 ivanmeler#3 0x0000000000456bb5 in cmd_stat (argc=2, argv=0x7fffffffd670) at builtin-stat.c:1960 ivanmeler#4 0x00000000004dd2e0 in run_builtin (p=0xa30e00 <commands+288>, argc=5, argv=0x7fffffffd670) at perf.c:310 ivanmeler#5 0x00000000004dd54d in handle_internal_command (argc=5, argv=0x7fffffffd670) at perf.c:362 ivanmeler#6 0x00000000004dd694 in run_argv (argcp=0x7fffffffd4cc, argv=0x7fffffffd4c0) at perf.c:406 ivanmeler#7 0x00000000004dda11 in main (argc=5, argv=0x7fffffffd670) at perf.c:531 (gdb) Fixes: d4f63a4741a8 ("perf stat: Introduce print_counters function") Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Tested-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Cc: stable@vger.kernel.org # v4.2+ Link: http://lore.kernel.org/lkml/20190904094738.9558-3-srikar@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Lee Jones <lee.jones@linaro.org> Change-Id: I03e14a64d8f774e0cd166298744216ebe9fb26bc
fcuzzocrea
pushed a commit
to fcuzzocrea/android_kernel_samsung_universal8890
that referenced
this issue
Mar 10, 2020
[ Upstream commit b66f31efbdad95ec274345721d99d1d835e6de01 ] This patch fixes the lock inversion complaint: ============================================ WARNING: possible recursive locking detected 5.3.0-rc7-dbg+ ivanmeler#1 Not tainted -------------------------------------------- kworker/u16:6/171 is trying to acquire lock: 00000000035c6e6c (&id_priv->handler_mutex){+.+.}, at: rdma_destroy_id+0x78/0x4a0 [rdma_cm] but task is already holding lock: 00000000bc7c307d (&id_priv->handler_mutex){+.+.}, at: iw_conn_req_handler+0x151/0x680 [rdma_cm] other info that might help us debug this: Possible unsafe locking scenario: CPU0 ---- lock(&id_priv->handler_mutex); lock(&id_priv->handler_mutex); *** DEADLOCK *** May be due to missing lock nesting notation 3 locks held by kworker/u16:6/171: #0: 00000000e2eaa773 ((wq_completion)iw_cm_wq){+.+.}, at: process_one_work+0x472/0xac0 ivanmeler#1: 000000001efd357b ((work_completion)(&work->work)ivanmeler#3){+.+.}, at: process_one_work+0x476/0xac0 ivanmeler#2: 00000000bc7c307d (&id_priv->handler_mutex){+.+.}, at: iw_conn_req_handler+0x151/0x680 [rdma_cm] stack backtrace: CPU: 3 PID: 171 Comm: kworker/u16:6 Not tainted 5.3.0-rc7-dbg+ ivanmeler#1 Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 Workqueue: iw_cm_wq cm_work_handler [iw_cm] Call Trace: dump_stack+0x8a/0xd6 __lock_acquire.cold+0xe1/0x24d lock_acquire+0x106/0x240 __mutex_lock+0x12e/0xcb0 mutex_lock_nested+0x1f/0x30 rdma_destroy_id+0x78/0x4a0 [rdma_cm] iw_conn_req_handler+0x5c9/0x680 [rdma_cm] cm_work_handler+0xe62/0x1100 [iw_cm] process_one_work+0x56d/0xac0 worker_thread+0x7a/0x5d0 kthread+0x1bc/0x210 ret_from_fork+0x24/0x30 This is not a bug as there are actually two lock classes here. Link: https://lore.kernel.org/r/20190930231707.48259-3-bvanassche@acm.org Fixes: de910bd ("RDMA/cma: Simplify locking needed for serialization of callbacks") Signed-off-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Lee Jones <lee.jones@linaro.org> Change-Id: Iec345e445c537b8b7003488b79a620d4cc8c7ea0
fcuzzocrea
pushed a commit
to fcuzzocrea/android_kernel_samsung_universal8890
that referenced
this issue
Mar 10, 2020
commit 159d2c7d8106177bd9a986fd005a311fe0d11285 upstream. qdisc_root() use from netem_enqueue() triggers a lockdep warning. __dev_queue_xmit() uses rcu_read_lock_bh() which is not equivalent to rcu_read_lock() + local_bh_disable_bh as far as lockdep is concerned. WARNING: suspicious RCU usage 5.3.0-rc7+ #0 Not tainted ----------------------------- include/net/sch_generic.h:492 suspicious rcu_dereference_check() usage! other info that might help us debug this: rcu_scheduler_active = 2, debug_locks = 1 3 locks held by syz-executor427/8855: #0: 00000000b5525c01 (rcu_read_lock_bh){....}, at: lwtunnel_xmit_redirect include/net/lwtunnel.h:92 [inline] #0: 00000000b5525c01 (rcu_read_lock_bh){....}, at: ip_finish_output2+0x2dc/0x2570 net/ipv4/ip_output.c:214 ivanmeler#1: 00000000b5525c01 (rcu_read_lock_bh){....}, at: __dev_queue_xmit+0x20a/0x3650 net/core/dev.c:3804 ivanmeler#2: 00000000364bae92 (&(&sch->q.lock)->rlock){+.-.}, at: spin_lock include/linux/spinlock.h:338 [inline] ivanmeler#2: 00000000364bae92 (&(&sch->q.lock)->rlock){+.-.}, at: __dev_xmit_skb net/core/dev.c:3502 [inline] ivanmeler#2: 00000000364bae92 (&(&sch->q.lock)->rlock){+.-.}, at: __dev_queue_xmit+0x14b8/0x3650 net/core/dev.c:3838 stack backtrace: CPU: 0 PID: 8855 Comm: syz-executor427 Not tainted 5.3.0-rc7+ #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x172/0x1f0 lib/dump_stack.c:113 lockdep_rcu_suspicious+0x153/0x15d kernel/locking/lockdep.c:5357 qdisc_root include/net/sch_generic.h:492 [inline] netem_enqueue+0x1cfb/0x2d80 net/sched/sch_netem.c:479 __dev_xmit_skb net/core/dev.c:3527 [inline] __dev_queue_xmit+0x15d2/0x3650 net/core/dev.c:3838 dev_queue_xmit+0x18/0x20 net/core/dev.c:3902 neigh_hh_output include/net/neighbour.h:500 [inline] neigh_output include/net/neighbour.h:509 [inline] ip_finish_output2+0x1726/0x2570 net/ipv4/ip_output.c:228 __ip_finish_output net/ipv4/ip_output.c:308 [inline] __ip_finish_output+0x5fc/0xb90 net/ipv4/ip_output.c:290 ip_finish_output+0x38/0x1f0 net/ipv4/ip_output.c:318 NF_HOOK_COND include/linux/netfilter.h:294 [inline] ip_mc_output+0x292/0xf40 net/ipv4/ip_output.c:417 dst_output include/net/dst.h:436 [inline] ip_local_out+0xbb/0x190 net/ipv4/ip_output.c:125 ip_send_skb+0x42/0xf0 net/ipv4/ip_output.c:1555 udp_send_skb.isra.0+0x6b2/0x1160 net/ipv4/udp.c:887 udp_sendmsg+0x1e96/0x2820 net/ipv4/udp.c:1174 inet_sendmsg+0x9e/0xe0 net/ipv4/af_inet.c:807 sock_sendmsg_nosec net/socket.c:637 [inline] sock_sendmsg+0xd7/0x130 net/socket.c:657 ___sys_sendmsg+0x3e2/0x920 net/socket.c:2311 __sys_sendmmsg+0x1bf/0x4d0 net/socket.c:2413 __do_sys_sendmmsg net/socket.c:2442 [inline] __se_sys_sendmmsg net/socket.c:2439 [inline] __x64_sys_sendmmsg+0x9d/0x100 net/socket.c:2439 do_syscall_64+0xfd/0x6a0 arch/x86/entry/common.c:296 entry_SYSCALL_64_after_hwframe+0x49/0xbe Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Lee Jones <lee.jones@linaro.org> Change-Id: Ic47d9dd7d7ff9ac46cca29518b83ecaa9076180c
fcuzzocrea
pushed a commit
to fcuzzocrea/android_kernel_samsung_universal8890
that referenced
this issue
Mar 10, 2020
commit 5c9934b6767b16ba60be22ec3cbd4379ad64170d upstream. We got another syzbot report [1] that tells us we must use write_lock_irq()/write_unlock_irq() to avoid possible deadlock. [1] WARNING: inconsistent lock state 5.5.0-rc1-syzkaller #0 Not tainted -------------------------------- inconsistent {HARDIRQ-ON-W} -> {IN-HARDIRQ-R} usage. syz-executor826/9605 [HC1[1]:SC0[0]:HE0:SE1] takes: ffffffff8a128718 (disc_data_lock){+-..}, at: sp_get.isra.0+0x1d/0xf0 drivers/net/ppp/ppp_synctty.c:138 {HARDIRQ-ON-W} state was registered at: lock_acquire+0x190/0x410 kernel/locking/lockdep.c:4485 __raw_write_lock_bh include/linux/rwlock_api_smp.h:203 [inline] _raw_write_lock_bh+0x33/0x50 kernel/locking/spinlock.c:319 sixpack_close+0x1d/0x250 drivers/net/hamradio/6pack.c:657 tty_ldisc_close.isra.0+0x119/0x1a0 drivers/tty/tty_ldisc.c:489 tty_set_ldisc+0x230/0x6b0 drivers/tty/tty_ldisc.c:585 tiocsetd drivers/tty/tty_io.c:2337 [inline] tty_ioctl+0xe8d/0x14f0 drivers/tty/tty_io.c:2597 vfs_ioctl fs/ioctl.c:47 [inline] file_ioctl fs/ioctl.c:545 [inline] do_vfs_ioctl+0x977/0x14e0 fs/ioctl.c:732 ksys_ioctl+0xab/0xd0 fs/ioctl.c:749 __do_sys_ioctl fs/ioctl.c:756 [inline] __se_sys_ioctl fs/ioctl.c:754 [inline] __x64_sys_ioctl+0x73/0xb0 fs/ioctl.c:754 do_syscall_64+0xfa/0x790 arch/x86/entry/common.c:294 entry_SYSCALL_64_after_hwframe+0x49/0xbe irq event stamp: 3946 hardirqs last enabled at (3945): [<ffffffff87c86e43>] __raw_spin_unlock_irq include/linux/spinlock_api_smp.h:168 [inline] hardirqs last enabled at (3945): [<ffffffff87c86e43>] _raw_spin_unlock_irq+0x23/0x80 kernel/locking/spinlock.c:199 hardirqs last disabled at (3946): [<ffffffff8100675f>] trace_hardirqs_off_thunk+0x1a/0x1c arch/x86/entry/thunk_64.S:42 softirqs last enabled at (2658): [<ffffffff86a8b4df>] spin_unlock_bh include/linux/spinlock.h:383 [inline] softirqs last enabled at (2658): [<ffffffff86a8b4df>] clusterip_netdev_event+0x46f/0x670 net/ipv4/netfilter/ipt_CLUSTERIP.c:222 softirqs last disabled at (2656): [<ffffffff86a8b22b>] spin_lock_bh include/linux/spinlock.h:343 [inline] softirqs last disabled at (2656): [<ffffffff86a8b22b>] clusterip_netdev_event+0x1bb/0x670 net/ipv4/netfilter/ipt_CLUSTERIP.c:196 other info that might help us debug this: Possible unsafe locking scenario: CPU0 ---- lock(disc_data_lock); <Interrupt> lock(disc_data_lock); *** DEADLOCK *** 5 locks held by syz-executor826/9605: #0: ffff8880a905e198 (&tty->legacy_mutex){+.+.}, at: tty_lock+0xc7/0x130 drivers/tty/tty_mutex.c:19 ivanmeler#1: ffffffff899a56c0 (rcu_read_lock){....}, at: mutex_spin_on_owner+0x0/0x330 kernel/locking/mutex.c:413 ivanmeler#2: ffff8880a496a2b0 (&(&i->lock)->rlock){-.-.}, at: spin_lock include/linux/spinlock.h:338 [inline] ivanmeler#2: ffff8880a496a2b0 (&(&i->lock)->rlock){-.-.}, at: serial8250_interrupt+0x2d/0x1a0 drivers/tty/serial/8250/8250_core.c:116 ivanmeler#3: ffffffff8c104048 (&port_lock_key){-.-.}, at: serial8250_handle_irq.part.0+0x24/0x330 drivers/tty/serial/8250/8250_port.c:1823 ivanmeler#4: ffff8880a905e090 (&tty->ldisc_sem){++++}, at: tty_ldisc_ref+0x22/0x90 drivers/tty/tty_ldisc.c:288 stack backtrace: CPU: 1 PID: 9605 Comm: syz-executor826 Not tainted 5.5.0-rc1-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: <IRQ> __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x197/0x210 lib/dump_stack.c:118 print_usage_bug.cold+0x327/0x378 kernel/locking/lockdep.c:3101 valid_state kernel/locking/lockdep.c:3112 [inline] mark_lock_irq kernel/locking/lockdep.c:3309 [inline] mark_lock+0xbb4/0x1220 kernel/locking/lockdep.c:3666 mark_usage kernel/locking/lockdep.c:3554 [inline] __lock_acquire+0x1e55/0x4a00 kernel/locking/lockdep.c:3909 lock_acquire+0x190/0x410 kernel/locking/lockdep.c:4485 __raw_read_lock include/linux/rwlock_api_smp.h:149 [inline] _raw_read_lock+0x32/0x50 kernel/locking/spinlock.c:223 sp_get.isra.0+0x1d/0xf0 drivers/net/ppp/ppp_synctty.c:138 sixpack_write_wakeup+0x25/0x340 drivers/net/hamradio/6pack.c:402 tty_wakeup+0xe9/0x120 drivers/tty/tty_io.c:536 tty_port_default_wakeup+0x2b/0x40 drivers/tty/tty_port.c:50 tty_port_tty_wakeup+0x57/0x70 drivers/tty/tty_port.c:387 uart_write_wakeup+0x46/0x70 drivers/tty/serial/serial_core.c:104 serial8250_tx_chars+0x495/0xaf0 drivers/tty/serial/8250/8250_port.c:1761 serial8250_handle_irq.part.0+0x2a2/0x330 drivers/tty/serial/8250/8250_port.c:1834 serial8250_handle_irq drivers/tty/serial/8250/8250_port.c:1820 [inline] serial8250_default_handle_irq+0xc0/0x150 drivers/tty/serial/8250/8250_port.c:1850 serial8250_interrupt+0xf1/0x1a0 drivers/tty/serial/8250/8250_core.c:126 __handle_irq_event_percpu+0x15d/0x970 kernel/irq/handle.c:149 handle_irq_event_percpu+0x74/0x160 kernel/irq/handle.c:189 handle_irq_event+0xa7/0x134 kernel/irq/handle.c:206 handle_edge_irq+0x25e/0x8d0 kernel/irq/chip.c:830 generic_handle_irq_desc include/linux/irqdesc.h:156 [inline] do_IRQ+0xde/0x280 arch/x86/kernel/irq.c:250 common_interrupt+0xf/0xf arch/x86/entry/entry_64.S:607 </IRQ> RIP: 0010:cpu_relax arch/x86/include/asm/processor.h:685 [inline] RIP: 0010:mutex_spin_on_owner+0x247/0x330 kernel/locking/mutex.c:579 Code: c3 be 08 00 00 00 4c 89 e7 e8 e5 06 59 00 4c 89 e0 48 c1 e8 03 42 80 3c 38 00 0f 85 e1 00 00 00 49 8b 04 24 a8 01 75 96 f3 90 <e9> 2f fe ff ff 0f 0b e8 0d 19 09 00 84 c0 0f 85 ff fd ff ff 48 c7 RSP: 0018:ffffc90001eafa20 EFLAGS: 00000246 ORIG_RAX: ffffffffffffffd7 RAX: 0000000000000000 RBX: ffff88809fd9e0c0 RCX: 1ffffffff13266dd RDX: 0000000000000000 RSI: 0000000000000008 RDI: 0000000000000000 RBP: ffffc90001eafa60 R08: 1ffff11013d22898 R09: ffffed1013d22899 R10: ffffed1013d22898 R11: ffff88809e9144c7 R12: ffff8880a905e138 R13: ffff88809e9144c0 R14: 0000000000000000 R15: dffffc0000000000 mutex_optimistic_spin kernel/locking/mutex.c:673 [inline] __mutex_lock_common kernel/locking/mutex.c:962 [inline] __mutex_lock+0x32b/0x13c0 kernel/locking/mutex.c:1106 mutex_lock_nested+0x16/0x20 kernel/locking/mutex.c:1121 tty_lock+0xc7/0x130 drivers/tty/tty_mutex.c:19 tty_release+0xb5/0xe90 drivers/tty/tty_io.c:1665 __fput+0x2ff/0x890 fs/file_table.c:280 ____fput+0x16/0x20 fs/file_table.c:313 task_work_run+0x145/0x1c0 kernel/task_work.c:113 exit_task_work include/linux/task_work.h:22 [inline] do_exit+0x8e7/0x2ef0 kernel/exit.c:797 do_group_exit+0x135/0x360 kernel/exit.c:895 __do_sys_exit_group kernel/exit.c:906 [inline] __se_sys_exit_group kernel/exit.c:904 [inline] __x64_sys_exit_group+0x44/0x50 kernel/exit.c:904 do_syscall_64+0xfa/0x790 arch/x86/entry/common.c:294 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x43fef8 Code: Bad RIP value. RSP: 002b:00007ffdb07d2338 EFLAGS: 00000246 ORIG_RAX: 00000000000000e7 RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 000000000043fef8 RDX: 0000000000000000 RSI: 000000000000003c RDI: 0000000000000000 RBP: 00000000004bf730 R08: 00000000000000e7 R09: ffffffffffffffd0 R10: 00000000004002c8 R11: 0000000000000246 R12: 0000000000000001 R13: 00000000006d1180 R14: 0000000000000000 R15: 0000000000000000 Fixes: 6e4e2f8 ("6pack,mkiss: fix lock inconsistency") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Cc: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Lee Jones <lee.jones@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> Change-Id: Id26e23f7b2ce74845dffd39c1fc7dbe8c836dbdd
fcuzzocrea
pushed a commit
to fcuzzocrea/android_kernel_samsung_universal8890
that referenced
this issue
Mar 10, 2020
[ Upstream commit 64081362e8ff4587b4554087f3cfc73d3e0a4cd7 ] We've recently seen a workload on XFS filesystems with a repeatable deadlock between background writeback and a multi-process application doing concurrent writes and fsyncs to a small range of a file. range_cyclic writeback Process 1 Process 2 xfs_vm_writepages write_cache_pages writeback_index = 2 cycled = 0 .... find page 2 dirty lock Page 2 ->writepage page 2 writeback page 2 clean page 2 added to bio no more pages write() locks page 1 dirties page 1 locks page 2 dirties page 1 fsync() .... xfs_vm_writepages write_cache_pages start index 0 find page 1 towrite lock Page 1 ->writepage page 1 writeback page 1 clean page 1 added to bio find page 2 towrite lock Page 2 page 2 is writeback <blocks> write() locks page 1 dirties page 1 fsync() .... xfs_vm_writepages write_cache_pages start index 0 !done && !cycled sets index to 0, restarts lookup find page 1 dirty find page 1 towrite lock Page 1 page 1 is writeback <blocks> lock Page 1 <blocks> DEADLOCK because: - process 1 needs page 2 writeback to complete to make enough progress to issue IO pending for page 1 - writeback needs page 1 writeback to complete so process 2 can progress and unlock the page it is blocked on, then it can issue the IO pending for page 2 - process 2 can't make progress until process 1 issues IO for page 1 The underlying cause of the problem here is that range_cyclic writeback is processing pages in descending index order as we hold higher index pages in a structure controlled from above write_cache_pages(). The write_cache_pages() caller needs to be able to submit these pages for IO before write_cache_pages restarts writeback at mapping index 0 to avoid wcp inverting the page lock/writeback wait order. generic_writepages() is not susceptible to this bug as it has no private context held across write_cache_pages() - filesystems using this infrastructure always submit pages in ->writepage immediately and so there is no problem with range_cyclic going back to mapping index 0. However: mpage_writepages() has a private bio context, exofs_writepages() has page_collect fuse_writepages() has fuse_fill_wb_data nfs_writepages() has nfs_pageio_descriptor xfs_vm_writepages() has xfs_writepage_ctx All of these ->writepages implementations can hold pages under writeback in their private structures until write_cache_pages() returns, and hence they are all susceptible to this deadlock. Also worth noting is that ext4 has it's own bastardised version of write_cache_pages() and so it /may/ have an equivalent deadlock. I looked at the code long enough to understand that it has a similar retry loop for range_cyclic writeback reaching the end of the file and then promptly ran away before my eyes bled too much. I'll leave it for the ext4 developers to determine if their code is actually has this deadlock and how to fix it if it has. There's a few ways I can see avoid this deadlock. There's probably more, but these are the first I've though of: 1. get rid of range_cyclic altogether 2. range_cyclic always stops at EOF, and we start again from writeback index 0 on the next call into write_cache_pages() 2a. wcp also returns EAGAIN to ->writepages implementations to indicate range cyclic has hit EOF. writepages implementations can then flush the current context and call wpc again to continue. i.e. lift the retry into the ->writepages implementation 3. range_cyclic uses trylock_page() rather than lock_page(), and it skips pages it can't lock without blocking. It will already do this for pages under writeback, so this seems like a no-brainer 3a. all non-WB_SYNC_ALL writeback uses trylock_page() to avoid blocking as per pages under writeback. I don't think ivanmeler#1 is an option - range_cyclic prevents frequently dirtied lower file offset from starving background writeback of rarely touched higher file offsets. performance as going back to the start of the file implies an immediate seek. We'll have exactly the same number of seeks if we switch writeback to another inode, and then come back to this one later and restart from index 0. retry loop up into the wcp caller means we can issue IO on the pending pages before calling wcp again, and so avoid locking or waiting on pages in the wrong order. I'm not convinced we need to do this given that we get the same thing from ivanmeler#2 on the next writeback call from the writeback infrastructure. inversion problem, just prevents it from becoming a deadlock situation. I'd prefer we fix the inversion, not sweep it under the carpet like this. band-aid fix of ivanmeler#3. So it seems that the simplest way to fix this issue is to implement solution ivanmeler#2 Link: http://lkml.kernel.org/r/20181005054526.21507-1-david@fromorbit.com Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Jan Kara <jack@suse.de> Cc: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Lee Jones <lee.jones@linaro.org> Change-Id: I63adda506962733b2e35427bacbae729e094f104
fcuzzocrea
pushed a commit
to fcuzzocrea/android_kernel_samsung_universal8890
that referenced
this issue
Mar 10, 2020
commit 8bb2ee192e482c5d500df9f2b1b26a560bd3026f upstream. do_task_stat() accesses IP and SP of a task without bumping reference count of a stack (which became an entity with independent lifetime at some point). Steps to reproduce: #include <stdio.h> #include <sys/types.h> #include <sys/stat.h> #include <fcntl.h> #include <sys/time.h> #include <sys/resource.h> #include <unistd.h> #include <sys/wait.h> int main(void) { setrlimit(RLIMIT_CORE, &(struct rlimit){}); while (1) { char buf[64]; char buf2[4096]; pid_t pid; int fd; pid = fork(); if (pid == 0) { *(volatile int *)0 = 0; } snprintf(buf, sizeof(buf), "/proc/%u/stat", pid); fd = open(buf, O_RDONLY); read(fd, buf2, sizeof(buf2)); close(fd); waitpid(pid, NULL, 0); } return 0; } BUG: unable to handle kernel paging request at 0000000000003fd8 IP: do_task_stat+0x8b4/0xaf0 PGD 800000003d73e067 P4D 800000003d73e067 PUD 3d558067 PMD 0 Oops: 0000 [ivanmeler#1] PREEMPT SMP PTI CPU: 0 PID: 1417 Comm: a.out Not tainted 4.15.0-rc8-dirty ivanmeler#2 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1.fc27 04/01/2014 RIP: 0010:do_task_stat+0x8b4/0xaf0 Call Trace: proc_single_show+0x43/0x70 seq_read+0xe6/0x3b0 __vfs_read+0x1e/0x120 vfs_read+0x84/0x110 SyS_read+0x3d/0xa0 entry_SYSCALL_64_fastpath+0x13/0x6c RIP: 0033:0x7f4d7928cba0 RSP: 002b:00007ffddb245158 EFLAGS: 00000246 Code: 03 b7 a0 01 00 00 4c 8b 4c 24 70 4c 8b 44 24 78 4c 89 74 24 18 e9 91 f9 ff ff f6 45 4d 02 0f 84 fd f7 ff ff 48 8b 45 40 48 89 ef <48> 8b 80 d8 3f 00 00 48 89 44 24 20 e8 9b 97 eb ff 48 89 44 24 RIP: do_task_stat+0x8b4/0xaf0 RSP: ffffc90000607cc8 CR2: 0000000000003fd8 John Ogness said: for my tests I added an else case to verify that the race is hit and correctly mitigated. Link: http://lkml.kernel.org/r/20180116175054.GA11513@avx2 Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Reported-by: "Kohli, Gaurav" <gkohli@codeaurora.org> Tested-by: John Ogness <john.ogness@linutronix.de> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Ingo Molnar <mingo@elte.hu> Cc: Oleg Nesterov <oleg@redhat.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: zhangyi (F) <yi.zhang@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Lee Jones <lee.jones@linaro.org> Change-Id: I776b02e7fa3a5db870bf2a08c5c1ea423d628fcd
fcuzzocrea
pushed a commit
to fcuzzocrea/android_kernel_samsung_universal8890
that referenced
this issue
Mar 10, 2020
[ Upstream commit cfcbae3895b86c390ede57b2a8f601dd5972b47b ] In function __ufshcd_query_descriptor(), in the event of an error happening, we directly goto out_unlock and forget to invaliate hba->dev_cmd.query.descriptor pointer. This results in this pointer still valid in ufshcd_copy_query_response() for other query requests which go through ufshcd_exec_raw_upiu_cmd(). This will cause __memcpy() crash and system hangs. Log as shown below: Unable to handle kernel paging request at virtual address ffff000012233c40 Mem abort info: ESR = 0x96000047 Exception class = DABT (current EL), IL = 32 bits SET = 0, FnV = 0 EA = 0, S1PTW = 0 Data abort info: ISV = 0, ISS = 0x00000047 CM = 0, WnR = 1 swapper pgtable: 4k pages, 48-bit VAs, pgdp = 0000000028cc735c [ffff000012233c40] pgd=00000000bffff003, pud=00000000bfffe003, pmd=00000000ba8b8003, pte=0000000000000000 Internal error: Oops: 96000047 [ivanmeler#2] PREEMPT SMP ... Call trace: __memcpy+0x74/0x180 ufshcd_issue_devman_upiu_cmd+0x250/0x3c0 ufshcd_exec_raw_upiu_cmd+0xfc/0x1a8 ufs_bsg_request+0x178/0x3b0 bsg_queue_rq+0xc0/0x118 blk_mq_dispatch_rq_list+0xb0/0x538 blk_mq_sched_dispatch_requests+0x18c/0x1d8 __blk_mq_run_hw_queue+0xb4/0x118 blk_mq_run_work_fn+0x28/0x38 process_one_work+0x1ec/0x470 worker_thread+0x48/0x458 kthread+0x130/0x138 ret_from_fork+0x10/0x1c Code: 540000ab a8c12027 a88120c7 a8c12027 (a88120c7) ---[ end trace 793e1eb5dff69f2d ]--- note: kworker/0:2H[2054] exited with preempt_count 1 This patch is to move "descriptor = NULL" down to below the label "out_unlock". Fixes: d44a5f9(ufs: query descriptor API) Link: https://lore.kernel.org/r/20191112223436.27449-3-huobean@gmail.com Reviewed-by: Alim Akhtar <alim.akhtar@samsung.com> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Bean Huo <beanhuo@micron.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Lee Jones <lee.jones@linaro.org> Change-Id: I75343d11edaf78f0c9cba379cbb2edd5ad0cb228
fcuzzocrea
pushed a commit
to fcuzzocrea/android_kernel_samsung_universal8890
that referenced
this issue
Mar 10, 2020
[ Upstream commit 42ffb0bf584ae5b6b38f72259af1e0ee417ac77f ] There exists a deadlock with range_cyclic that has existed forever. If we loop around with a bio already built we could deadlock with a writer who has the page locked that we're attempting to write but is waiting on a page in our bio to be written out. The task traces are as follows PID: 1329874 TASK: ffff889ebcdf3800 CPU: 33 COMMAND: "kworker/u113:5" #0 [ffffc900297bb658] __schedule at ffffffff81a4c33f ivanmeler#1 [ffffc900297bb6e0] schedule at ffffffff81a4c6e3 ivanmeler#2 [ffffc900297bb6f8] io_schedule at ffffffff81a4ca42 ivanmeler#3 [ffffc900297bb708] __lock_page at ffffffff811f145b ivanmeler#4 [ffffc900297bb798] __process_pages_contig at ffffffff814bc502 ivanmeler#5 [ffffc900297bb8c8] lock_delalloc_pages at ffffffff814bc684 ivanmeler#6 [ffffc900297bb900] find_lock_delalloc_range at ffffffff814be9ff ivanmeler#7 [ffffc900297bb9a0] writepage_delalloc at ffffffff814bebd0 ivanmeler#8 [ffffc900297bba18] __extent_writepage at ffffffff814bfbf2 #9 [ffffc900297bba98] extent_write_cache_pages at ffffffff814bffbd PID: 2167901 TASK: ffff889dc6a59c00 CPU: 14 COMMAND: "aio-dio-invalid" #0 [ffffc9003b50bb18] __schedule at ffffffff81a4c33f ivanmeler#1 [ffffc9003b50bba0] schedule at ffffffff81a4c6e3 ivanmeler#2 [ffffc9003b50bbb8] io_schedule at ffffffff81a4ca42 ivanmeler#3 [ffffc9003b50bbc8] wait_on_page_bit at ffffffff811f24d6 ivanmeler#4 [ffffc9003b50bc60] prepare_pages at ffffffff814b05a7 ivanmeler#5 [ffffc9003b50bcd8] btrfs_buffered_write at ffffffff814b1359 ivanmeler#6 [ffffc9003b50bdb0] btrfs_file_write_iter at ffffffff814b5933 ivanmeler#7 [ffffc9003b50be38] new_sync_write at ffffffff8128f6a8 ivanmeler#8 [ffffc9003b50bec8] vfs_write at ffffffff81292b9d #9 [ffffc9003b50bf00] ksys_pwrite64 at ffffffff81293032 I used drgn to find the respective pages we were stuck on page_entry.page 0xffffea00fbfc7500 index 8148 bit 15 pid 2167901 page_entry.page 0xffffea00f9bb7400 index 7680 bit 0 pid 1329874 As you can see the kworker is waiting for bit 0 (PG_locked) on index 7680, and aio-dio-invalid is waiting for bit 15 (PG_writeback) on index 8148. aio-dio-invalid has 7680, and the kworker epd looks like the following crash> struct extent_page_data ffffc900297bbbb0 struct extent_page_data { bio = 0xffff889f747ed830, tree = 0xffff889eed6ba448, extent_locked = 0, sync_io = 0 } Probably worth mentioning as well that it waits for writeback of the page to complete while holding a lock on it (at prepare_pages()). Using drgn I walked the bio pages looking for page 0xffffea00fbfc7500 which is the one we're waiting for writeback on bio = Object(prog, 'struct bio', address=0xffff889f747ed830) for i in range(0, bio.bi_vcnt.value_()): bv = bio.bi_io_vec[i] if bv.bv_page.value_() == 0xffffea00fbfc7500: print("FOUND IT") which validated what I suspected. The fix for this is simple, flush the epd before we loop back around to the beginning of the file during writeout. Fixes: b293f02 ("Btrfs: Add writepages support") CC: stable@vger.kernel.org # 4.4+ Reviewed-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Lee Jones <lee.jones@linaro.org> Change-Id: I5daee5ea01df5eb76df156cb489140e58826cab4
fcuzzocrea
pushed a commit
to fcuzzocrea/android_kernel_samsung_universal8890
that referenced
this issue
May 26, 2020
commit e091eab028f9253eac5c04f9141bbc9d170acab3 upstream. In some cases, ocfs2_iget() reads the data of inode, which has been deleted for some reason. That will make the system panic. So We should judge whether this inode has been deleted, and tell the caller that the inode is a bad inode. For example, the ocfs2 is used as the backed of nfs, and the client is nfsv3. This issue can be reproduced by the following steps. on the nfs server side, ..../patha/pathb Step 1: The process A was scheduled before calling the function fh_verify. Step 2: The process B is removing the 'pathb', and just completed the call to function dput. Then the dentry of 'pathb' has been deleted from the dcache, and all ancestors have been deleted also. The relationship of dentry and inode was deleted through the function hlist_del_init. The following is the call stack. dentry_iput->hlist_del_init(&dentry->d_u.d_alias) At this time, the inode is still in the dcache. Step 3: The process A call the function ocfs2_get_dentry, which get the inode from dcache. Then the refcount of inode is 1. The following is the call stack. nfsd3_proc_getacl->fh_verify->exportfs_decode_fh->fh_to_dentry(ocfs2_get_dentry) Step 4: Dirty pages are flushed by bdi threads. So the inode of 'patha' is evicted, and this directory was deleted. But the inode of 'pathb' can't be evicted, because the refcount of the inode was 1. Step 5: The process A keep running, and call the function reconnect_path(in exportfs_decode_fh), which call function ocfs2_get_parent of ocfs2. Get the block number of parent directory(patha) by the name of ... Then read the data from disk by the block number. But this inode has been deleted, so the system panic. Process A Process B 1. in nfsd3_proc_getacl | 2. | dput 3. fh_to_dentry(ocfs2_get_dentry) | 4. bdi flush dirty cache | 5. ocfs2_iget | [283465.542049] OCFS2: ERROR (device sdp): ocfs2_validate_inode_block: Invalid dinode #580640: OCFS2_VALID_FL not set [283465.545490] Kernel panic - not syncing: OCFS2: (device sdp): panic forced after error [283465.546889] CPU: 5 PID: 12416 Comm: nfsd Tainted: G W 4.1.12-124.18.6.el6uek.bug28762940v3.x86_64 ivanmeler#2 [283465.548382] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 09/21/2015 [283465.549657] 0000000000000000 ffff8800a56fb7b8 ffffffff816e839c ffffffffa0514758 [283465.550392] 000000000008dc20 ffff8800a56fb838 ffffffff816e62d3 0000000000000008 [283465.551056] ffff880000000010 ffff8800a56fb848 ffff8800a56fb7e8 ffff88005df9f000 [283465.551710] Call Trace: [283465.552516] [<ffffffff816e839c>] dump_stack+0x63/0x81 [283465.553291] [<ffffffff816e62d3>] panic+0xcb/0x21b [283465.554037] [<ffffffffa04e66b0>] ocfs2_handle_error+0xf0/0xf0 [ocfs2] [283465.554882] [<ffffffffa04e7737>] __ocfs2_error+0x67/0x70 [ocfs2] [283465.555768] [<ffffffffa049c0f9>] ocfs2_validate_inode_block+0x229/0x230 [ocfs2] [283465.556683] [<ffffffffa047bcbc>] ocfs2_read_blocks+0x46c/0x7b0 [ocfs2] [283465.557408] [<ffffffffa049bed0>] ? ocfs2_inode_cache_io_unlock+0x20/0x20 [ocfs2] [283465.557973] [<ffffffffa049f0eb>] ocfs2_read_inode_block_full+0x3b/0x60 [ocfs2] [283465.558525] [<ffffffffa049f5ba>] ocfs2_iget+0x4aa/0x880 [ocfs2] [283465.559082] [<ffffffffa049146e>] ocfs2_get_parent+0x9e/0x220 [ocfs2] [283465.559622] [<ffffffff81297c05>] reconnect_path+0xb5/0x300 [283465.560156] [<ffffffff81297f46>] exportfs_decode_fh+0xf6/0x2b0 [283465.560708] [<ffffffffa062faf0>] ? nfsd_proc_getattr+0xa0/0xa0 [nfsd] [283465.561262] [<ffffffff810a8196>] ? prepare_creds+0x26/0x110 [283465.561932] [<ffffffffa0630860>] fh_verify+0x350/0x660 [nfsd] [283465.562862] [<ffffffffa0637804>] ? nfsd_cache_lookup+0x44/0x630 [nfsd] [283465.563697] [<ffffffffa063a8b9>] nfsd3_proc_getattr+0x69/0xf0 [nfsd] [283465.564510] [<ffffffffa062cf60>] nfsd_dispatch+0xe0/0x290 [nfsd] [283465.565358] [<ffffffffa05eb892>] ? svc_tcp_adjust_wspace+0x12/0x30 [sunrpc] [283465.566272] [<ffffffffa05ea652>] svc_process_common+0x412/0x6a0 [sunrpc] [283465.567155] [<ffffffffa05eaa03>] svc_process+0x123/0x210 [sunrpc] [283465.568020] [<ffffffffa062c90f>] nfsd+0xff/0x170 [nfsd] [283465.568962] [<ffffffffa062c810>] ? nfsd_destroy+0x80/0x80 [nfsd] [283465.570112] [<ffffffff810a622b>] kthread+0xcb/0xf0 [283465.571099] [<ffffffff810a6160>] ? kthread_create_on_node+0x180/0x180 [283465.572114] [<ffffffff816f11b8>] ret_from_fork+0x58/0x90 [283465.573156] [<ffffffff810a6160>] ? kthread_create_on_node+0x180/0x180 Link: http://lkml.kernel.org/r/1554185919-3010-1-git-send-email-sunny.s.zhang@oracle.com Signed-off-by: Shuning Zhang <sunny.s.zhang@oracle.com> Reviewed-by: Joseph Qi <jiangqi903@gmail.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: piaojun <piaojun@huawei.com> Cc: "Gang He" <ghe@suse.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Lee Jones <lee.jones@linaro.org> Change-Id: Iead149536359912d6afd34e28b656222ced32a12
fcuzzocrea
pushed a commit
to fcuzzocrea/android_kernel_samsung_universal8890
that referenced
this issue
May 26, 2020
[ Upstream commit 443f2d5ba13d65ccfd879460f77941875159d154 ] Observe a segmentation fault when 'perf stat' is asked to repeat forever with the interval option. Without fix: # perf stat -r 0 -I 5000 -e cycles -a sleep 10 # time counts unit events 5.000211692 3,13,89,82,34,157 cycles 10.000380119 1,53,98,52,22,294 cycles 10.040467280 17,16,79,265 cycles Segmentation fault This problem was only observed when we use forever option aka -r 0 and works with limited repeats. Calling print_counter with ts being set to NULL, is not a correct option when interval is set. Hence avoid print_counter(NULL,..) if interval is set. With fix: # perf stat -r 0 -I 5000 -e cycles -a sleep 10 # time counts unit events 5.019866622 3,15,14,43,08,697 cycles 10.039865756 3,15,16,31,95,261 cycles 10.059950628 1,26,05,47,158 cycles 5.009902655 3,14,52,62,33,932 cycles 10.019880228 3,14,52,22,89,154 cycles 10.030543876 66,90,18,333 cycles 5.009848281 3,14,51,98,25,437 cycles 10.029854402 3,15,14,93,04,918 cycles 5.009834177 3,14,51,95,92,316 cycles Committer notes: Did the 'git bisect' to find the cset introducing the problem to add the Fixes tag below, and at that time the problem reproduced as: (gdb) run stat -r0 -I500 sleep 1 <SNIP> Program received signal SIGSEGV, Segmentation fault. print_interval (prefix=prefix@entry=0x7fffffffc8d0 "", ts=ts@entry=0x0) at builtin-stat.c:866 866 sprintf(prefix, "%6lu.%09lu%s", ts->tv_sec, ts->tv_nsec, csv_sep); (gdb) bt #0 print_interval (prefix=prefix@entry=0x7fffffffc8d0 "", ts=ts@entry=0x0) at builtin-stat.c:866 ivanmeler#1 0x000000000041860a in print_counters (ts=ts@entry=0x0, argc=argc@entry=2, argv=argv@entry=0x7fffffffd640) at builtin-stat.c:938 ivanmeler#2 0x0000000000419a7f in cmd_stat (argc=2, argv=0x7fffffffd640, prefix=<optimized out>) at builtin-stat.c:1411 ivanmeler#3 0x000000000045c65a in run_builtin (p=p@entry=0x6291b8 <commands+216>, argc=argc@entry=5, argv=argv@entry=0x7fffffffd640) at perf.c:370 ivanmeler#4 0x000000000045c893 in handle_internal_command (argc=5, argv=0x7fffffffd640) at perf.c:429 ivanmeler#5 0x000000000045c8f1 in run_argv (argcp=argcp@entry=0x7fffffffd4ac, argv=argv@entry=0x7fffffffd4a0) at perf.c:473 ivanmeler#6 0x000000000045cac9 in main (argc=<optimized out>, argv=<optimized out>) at perf.c:588 (gdb) Mostly the same as just before this patch: Program received signal SIGSEGV, Segmentation fault. 0x00000000005874a7 in print_interval (config=0xa1f2a0 <stat_config>, evlist=0xbc9b90, prefix=0x7fffffffd1c0 "`", ts=0x0) at util/stat-display.c:964 964 sprintf(prefix, "%6lu.%09lu%s", ts->tv_sec, ts->tv_nsec, config->csv_sep); (gdb) bt #0 0x00000000005874a7 in print_interval (config=0xa1f2a0 <stat_config>, evlist=0xbc9b90, prefix=0x7fffffffd1c0 "`", ts=0x0) at util/stat-display.c:964 ivanmeler#1 0x0000000000588047 in perf_evlist__print_counters (evlist=0xbc9b90, config=0xa1f2a0 <stat_config>, _target=0xa1f0c0 <target>, ts=0x0, argc=2, argv=0x7fffffffd670) at util/stat-display.c:1172 ivanmeler#2 0x000000000045390f in print_counters (ts=0x0, argc=2, argv=0x7fffffffd670) at builtin-stat.c:656 ivanmeler#3 0x0000000000456bb5 in cmd_stat (argc=2, argv=0x7fffffffd670) at builtin-stat.c:1960 ivanmeler#4 0x00000000004dd2e0 in run_builtin (p=0xa30e00 <commands+288>, argc=5, argv=0x7fffffffd670) at perf.c:310 ivanmeler#5 0x00000000004dd54d in handle_internal_command (argc=5, argv=0x7fffffffd670) at perf.c:362 ivanmeler#6 0x00000000004dd694 in run_argv (argcp=0x7fffffffd4cc, argv=0x7fffffffd4c0) at perf.c:406 ivanmeler#7 0x00000000004dda11 in main (argc=5, argv=0x7fffffffd670) at perf.c:531 (gdb) Fixes: d4f63a4741a8 ("perf stat: Introduce print_counters function") Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Tested-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Cc: stable@vger.kernel.org # v4.2+ Link: http://lore.kernel.org/lkml/20190904094738.9558-3-srikar@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Lee Jones <lee.jones@linaro.org> Change-Id: I03e14a64d8f774e0cd166298744216ebe9fb26bc
fcuzzocrea
pushed a commit
to fcuzzocrea/android_kernel_samsung_universal8890
that referenced
this issue
May 26, 2020
[ Upstream commit b66f31efbdad95ec274345721d99d1d835e6de01 ] This patch fixes the lock inversion complaint: ============================================ WARNING: possible recursive locking detected 5.3.0-rc7-dbg+ ivanmeler#1 Not tainted -------------------------------------------- kworker/u16:6/171 is trying to acquire lock: 00000000035c6e6c (&id_priv->handler_mutex){+.+.}, at: rdma_destroy_id+0x78/0x4a0 [rdma_cm] but task is already holding lock: 00000000bc7c307d (&id_priv->handler_mutex){+.+.}, at: iw_conn_req_handler+0x151/0x680 [rdma_cm] other info that might help us debug this: Possible unsafe locking scenario: CPU0 ---- lock(&id_priv->handler_mutex); lock(&id_priv->handler_mutex); *** DEADLOCK *** May be due to missing lock nesting notation 3 locks held by kworker/u16:6/171: #0: 00000000e2eaa773 ((wq_completion)iw_cm_wq){+.+.}, at: process_one_work+0x472/0xac0 ivanmeler#1: 000000001efd357b ((work_completion)(&work->work)ivanmeler#3){+.+.}, at: process_one_work+0x476/0xac0 ivanmeler#2: 00000000bc7c307d (&id_priv->handler_mutex){+.+.}, at: iw_conn_req_handler+0x151/0x680 [rdma_cm] stack backtrace: CPU: 3 PID: 171 Comm: kworker/u16:6 Not tainted 5.3.0-rc7-dbg+ ivanmeler#1 Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 Workqueue: iw_cm_wq cm_work_handler [iw_cm] Call Trace: dump_stack+0x8a/0xd6 __lock_acquire.cold+0xe1/0x24d lock_acquire+0x106/0x240 __mutex_lock+0x12e/0xcb0 mutex_lock_nested+0x1f/0x30 rdma_destroy_id+0x78/0x4a0 [rdma_cm] iw_conn_req_handler+0x5c9/0x680 [rdma_cm] cm_work_handler+0xe62/0x1100 [iw_cm] process_one_work+0x56d/0xac0 worker_thread+0x7a/0x5d0 kthread+0x1bc/0x210 ret_from_fork+0x24/0x30 This is not a bug as there are actually two lock classes here. Link: https://lore.kernel.org/r/20190930231707.48259-3-bvanassche@acm.org Fixes: de910bd ("RDMA/cma: Simplify locking needed for serialization of callbacks") Signed-off-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Lee Jones <lee.jones@linaro.org> Change-Id: Iec345e445c537b8b7003488b79a620d4cc8c7ea0
fcuzzocrea
pushed a commit
to fcuzzocrea/android_kernel_samsung_universal8890
that referenced
this issue
May 26, 2020
commit 159d2c7d8106177bd9a986fd005a311fe0d11285 upstream. qdisc_root() use from netem_enqueue() triggers a lockdep warning. __dev_queue_xmit() uses rcu_read_lock_bh() which is not equivalent to rcu_read_lock() + local_bh_disable_bh as far as lockdep is concerned. WARNING: suspicious RCU usage 5.3.0-rc7+ #0 Not tainted ----------------------------- include/net/sch_generic.h:492 suspicious rcu_dereference_check() usage! other info that might help us debug this: rcu_scheduler_active = 2, debug_locks = 1 3 locks held by syz-executor427/8855: #0: 00000000b5525c01 (rcu_read_lock_bh){....}, at: lwtunnel_xmit_redirect include/net/lwtunnel.h:92 [inline] #0: 00000000b5525c01 (rcu_read_lock_bh){....}, at: ip_finish_output2+0x2dc/0x2570 net/ipv4/ip_output.c:214 ivanmeler#1: 00000000b5525c01 (rcu_read_lock_bh){....}, at: __dev_queue_xmit+0x20a/0x3650 net/core/dev.c:3804 ivanmeler#2: 00000000364bae92 (&(&sch->q.lock)->rlock){+.-.}, at: spin_lock include/linux/spinlock.h:338 [inline] ivanmeler#2: 00000000364bae92 (&(&sch->q.lock)->rlock){+.-.}, at: __dev_xmit_skb net/core/dev.c:3502 [inline] ivanmeler#2: 00000000364bae92 (&(&sch->q.lock)->rlock){+.-.}, at: __dev_queue_xmit+0x14b8/0x3650 net/core/dev.c:3838 stack backtrace: CPU: 0 PID: 8855 Comm: syz-executor427 Not tainted 5.3.0-rc7+ #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x172/0x1f0 lib/dump_stack.c:113 lockdep_rcu_suspicious+0x153/0x15d kernel/locking/lockdep.c:5357 qdisc_root include/net/sch_generic.h:492 [inline] netem_enqueue+0x1cfb/0x2d80 net/sched/sch_netem.c:479 __dev_xmit_skb net/core/dev.c:3527 [inline] __dev_queue_xmit+0x15d2/0x3650 net/core/dev.c:3838 dev_queue_xmit+0x18/0x20 net/core/dev.c:3902 neigh_hh_output include/net/neighbour.h:500 [inline] neigh_output include/net/neighbour.h:509 [inline] ip_finish_output2+0x1726/0x2570 net/ipv4/ip_output.c:228 __ip_finish_output net/ipv4/ip_output.c:308 [inline] __ip_finish_output+0x5fc/0xb90 net/ipv4/ip_output.c:290 ip_finish_output+0x38/0x1f0 net/ipv4/ip_output.c:318 NF_HOOK_COND include/linux/netfilter.h:294 [inline] ip_mc_output+0x292/0xf40 net/ipv4/ip_output.c:417 dst_output include/net/dst.h:436 [inline] ip_local_out+0xbb/0x190 net/ipv4/ip_output.c:125 ip_send_skb+0x42/0xf0 net/ipv4/ip_output.c:1555 udp_send_skb.isra.0+0x6b2/0x1160 net/ipv4/udp.c:887 udp_sendmsg+0x1e96/0x2820 net/ipv4/udp.c:1174 inet_sendmsg+0x9e/0xe0 net/ipv4/af_inet.c:807 sock_sendmsg_nosec net/socket.c:637 [inline] sock_sendmsg+0xd7/0x130 net/socket.c:657 ___sys_sendmsg+0x3e2/0x920 net/socket.c:2311 __sys_sendmmsg+0x1bf/0x4d0 net/socket.c:2413 __do_sys_sendmmsg net/socket.c:2442 [inline] __se_sys_sendmmsg net/socket.c:2439 [inline] __x64_sys_sendmmsg+0x9d/0x100 net/socket.c:2439 do_syscall_64+0xfd/0x6a0 arch/x86/entry/common.c:296 entry_SYSCALL_64_after_hwframe+0x49/0xbe Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Lee Jones <lee.jones@linaro.org> Change-Id: Ic47d9dd7d7ff9ac46cca29518b83ecaa9076180c
fcuzzocrea
pushed a commit
to fcuzzocrea/android_kernel_samsung_universal8890
that referenced
this issue
May 27, 2020
[ Upstream commit 443f2d5ba13d65ccfd879460f77941875159d154 ] Observe a segmentation fault when 'perf stat' is asked to repeat forever with the interval option. Without fix: # perf stat -r 0 -I 5000 -e cycles -a sleep 10 # time counts unit events 5.000211692 3,13,89,82,34,157 cycles 10.000380119 1,53,98,52,22,294 cycles 10.040467280 17,16,79,265 cycles Segmentation fault This problem was only observed when we use forever option aka -r 0 and works with limited repeats. Calling print_counter with ts being set to NULL, is not a correct option when interval is set. Hence avoid print_counter(NULL,..) if interval is set. With fix: # perf stat -r 0 -I 5000 -e cycles -a sleep 10 # time counts unit events 5.019866622 3,15,14,43,08,697 cycles 10.039865756 3,15,16,31,95,261 cycles 10.059950628 1,26,05,47,158 cycles 5.009902655 3,14,52,62,33,932 cycles 10.019880228 3,14,52,22,89,154 cycles 10.030543876 66,90,18,333 cycles 5.009848281 3,14,51,98,25,437 cycles 10.029854402 3,15,14,93,04,918 cycles 5.009834177 3,14,51,95,92,316 cycles Committer notes: Did the 'git bisect' to find the cset introducing the problem to add the Fixes tag below, and at that time the problem reproduced as: (gdb) run stat -r0 -I500 sleep 1 <SNIP> Program received signal SIGSEGV, Segmentation fault. print_interval (prefix=prefix@entry=0x7fffffffc8d0 "", ts=ts@entry=0x0) at builtin-stat.c:866 866 sprintf(prefix, "%6lu.%09lu%s", ts->tv_sec, ts->tv_nsec, csv_sep); (gdb) bt #0 print_interval (prefix=prefix@entry=0x7fffffffc8d0 "", ts=ts@entry=0x0) at builtin-stat.c:866 ivanmeler#1 0x000000000041860a in print_counters (ts=ts@entry=0x0, argc=argc@entry=2, argv=argv@entry=0x7fffffffd640) at builtin-stat.c:938 ivanmeler#2 0x0000000000419a7f in cmd_stat (argc=2, argv=0x7fffffffd640, prefix=<optimized out>) at builtin-stat.c:1411 ivanmeler#3 0x000000000045c65a in run_builtin (p=p@entry=0x6291b8 <commands+216>, argc=argc@entry=5, argv=argv@entry=0x7fffffffd640) at perf.c:370 ivanmeler#4 0x000000000045c893 in handle_internal_command (argc=5, argv=0x7fffffffd640) at perf.c:429 ivanmeler#5 0x000000000045c8f1 in run_argv (argcp=argcp@entry=0x7fffffffd4ac, argv=argv@entry=0x7fffffffd4a0) at perf.c:473 ivanmeler#6 0x000000000045cac9 in main (argc=<optimized out>, argv=<optimized out>) at perf.c:588 (gdb) Mostly the same as just before this patch: Program received signal SIGSEGV, Segmentation fault. 0x00000000005874a7 in print_interval (config=0xa1f2a0 <stat_config>, evlist=0xbc9b90, prefix=0x7fffffffd1c0 "`", ts=0x0) at util/stat-display.c:964 964 sprintf(prefix, "%6lu.%09lu%s", ts->tv_sec, ts->tv_nsec, config->csv_sep); (gdb) bt #0 0x00000000005874a7 in print_interval (config=0xa1f2a0 <stat_config>, evlist=0xbc9b90, prefix=0x7fffffffd1c0 "`", ts=0x0) at util/stat-display.c:964 ivanmeler#1 0x0000000000588047 in perf_evlist__print_counters (evlist=0xbc9b90, config=0xa1f2a0 <stat_config>, _target=0xa1f0c0 <target>, ts=0x0, argc=2, argv=0x7fffffffd670) at util/stat-display.c:1172 ivanmeler#2 0x000000000045390f in print_counters (ts=0x0, argc=2, argv=0x7fffffffd670) at builtin-stat.c:656 ivanmeler#3 0x0000000000456bb5 in cmd_stat (argc=2, argv=0x7fffffffd670) at builtin-stat.c:1960 ivanmeler#4 0x00000000004dd2e0 in run_builtin (p=0xa30e00 <commands+288>, argc=5, argv=0x7fffffffd670) at perf.c:310 ivanmeler#5 0x00000000004dd54d in handle_internal_command (argc=5, argv=0x7fffffffd670) at perf.c:362 ivanmeler#6 0x00000000004dd694 in run_argv (argcp=0x7fffffffd4cc, argv=0x7fffffffd4c0) at perf.c:406 ivanmeler#7 0x00000000004dda11 in main (argc=5, argv=0x7fffffffd670) at perf.c:531 (gdb) Fixes: d4f63a4741a8 ("perf stat: Introduce print_counters function") Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Tested-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Cc: stable@vger.kernel.org # v4.2+ Link: http://lore.kernel.org/lkml/20190904094738.9558-3-srikar@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Lee Jones <lee.jones@linaro.org> Change-Id: I03e14a64d8f774e0cd166298744216ebe9fb26bc
fcuzzocrea
pushed a commit
to fcuzzocrea/android_kernel_samsung_universal8890
that referenced
this issue
May 27, 2020
[ Upstream commit 443f2d5ba13d65ccfd879460f77941875159d154 ] Observe a segmentation fault when 'perf stat' is asked to repeat forever with the interval option. Without fix: # perf stat -r 0 -I 5000 -e cycles -a sleep 10 # time counts unit events 5.000211692 3,13,89,82,34,157 cycles 10.000380119 1,53,98,52,22,294 cycles 10.040467280 17,16,79,265 cycles Segmentation fault This problem was only observed when we use forever option aka -r 0 and works with limited repeats. Calling print_counter with ts being set to NULL, is not a correct option when interval is set. Hence avoid print_counter(NULL,..) if interval is set. With fix: # perf stat -r 0 -I 5000 -e cycles -a sleep 10 # time counts unit events 5.019866622 3,15,14,43,08,697 cycles 10.039865756 3,15,16,31,95,261 cycles 10.059950628 1,26,05,47,158 cycles 5.009902655 3,14,52,62,33,932 cycles 10.019880228 3,14,52,22,89,154 cycles 10.030543876 66,90,18,333 cycles 5.009848281 3,14,51,98,25,437 cycles 10.029854402 3,15,14,93,04,918 cycles 5.009834177 3,14,51,95,92,316 cycles Committer notes: Did the 'git bisect' to find the cset introducing the problem to add the Fixes tag below, and at that time the problem reproduced as: (gdb) run stat -r0 -I500 sleep 1 <SNIP> Program received signal SIGSEGV, Segmentation fault. print_interval (prefix=prefix@entry=0x7fffffffc8d0 "", ts=ts@entry=0x0) at builtin-stat.c:866 866 sprintf(prefix, "%6lu.%09lu%s", ts->tv_sec, ts->tv_nsec, csv_sep); (gdb) bt #0 print_interval (prefix=prefix@entry=0x7fffffffc8d0 "", ts=ts@entry=0x0) at builtin-stat.c:866 ivanmeler#1 0x000000000041860a in print_counters (ts=ts@entry=0x0, argc=argc@entry=2, argv=argv@entry=0x7fffffffd640) at builtin-stat.c:938 ivanmeler#2 0x0000000000419a7f in cmd_stat (argc=2, argv=0x7fffffffd640, prefix=<optimized out>) at builtin-stat.c:1411 ivanmeler#3 0x000000000045c65a in run_builtin (p=p@entry=0x6291b8 <commands+216>, argc=argc@entry=5, argv=argv@entry=0x7fffffffd640) at perf.c:370 ivanmeler#4 0x000000000045c893 in handle_internal_command (argc=5, argv=0x7fffffffd640) at perf.c:429 ivanmeler#5 0x000000000045c8f1 in run_argv (argcp=argcp@entry=0x7fffffffd4ac, argv=argv@entry=0x7fffffffd4a0) at perf.c:473 ivanmeler#6 0x000000000045cac9 in main (argc=<optimized out>, argv=<optimized out>) at perf.c:588 (gdb) Mostly the same as just before this patch: Program received signal SIGSEGV, Segmentation fault. 0x00000000005874a7 in print_interval (config=0xa1f2a0 <stat_config>, evlist=0xbc9b90, prefix=0x7fffffffd1c0 "`", ts=0x0) at util/stat-display.c:964 964 sprintf(prefix, "%6lu.%09lu%s", ts->tv_sec, ts->tv_nsec, config->csv_sep); (gdb) bt #0 0x00000000005874a7 in print_interval (config=0xa1f2a0 <stat_config>, evlist=0xbc9b90, prefix=0x7fffffffd1c0 "`", ts=0x0) at util/stat-display.c:964 ivanmeler#1 0x0000000000588047 in perf_evlist__print_counters (evlist=0xbc9b90, config=0xa1f2a0 <stat_config>, _target=0xa1f0c0 <target>, ts=0x0, argc=2, argv=0x7fffffffd670) at util/stat-display.c:1172 ivanmeler#2 0x000000000045390f in print_counters (ts=0x0, argc=2, argv=0x7fffffffd670) at builtin-stat.c:656 ivanmeler#3 0x0000000000456bb5 in cmd_stat (argc=2, argv=0x7fffffffd670) at builtin-stat.c:1960 ivanmeler#4 0x00000000004dd2e0 in run_builtin (p=0xa30e00 <commands+288>, argc=5, argv=0x7fffffffd670) at perf.c:310 ivanmeler#5 0x00000000004dd54d in handle_internal_command (argc=5, argv=0x7fffffffd670) at perf.c:362 ivanmeler#6 0x00000000004dd694 in run_argv (argcp=0x7fffffffd4cc, argv=0x7fffffffd4c0) at perf.c:406 ivanmeler#7 0x00000000004dda11 in main (argc=5, argv=0x7fffffffd670) at perf.c:531 (gdb) Fixes: d4f63a4741a8 ("perf stat: Introduce print_counters function") Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Tested-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Cc: stable@vger.kernel.org # v4.2+ Link: http://lore.kernel.org/lkml/20190904094738.9558-3-srikar@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Lee Jones <lee.jones@linaro.org> Change-Id: I03e14a64d8f774e0cd166298744216ebe9fb26bc
fcuzzocrea
pushed a commit
to fcuzzocrea/android_kernel_samsung_universal8890
that referenced
this issue
May 27, 2020
[ Upstream commit b66f31efbdad95ec274345721d99d1d835e6de01 ] This patch fixes the lock inversion complaint: ============================================ WARNING: possible recursive locking detected 5.3.0-rc7-dbg+ ivanmeler#1 Not tainted -------------------------------------------- kworker/u16:6/171 is trying to acquire lock: 00000000035c6e6c (&id_priv->handler_mutex){+.+.}, at: rdma_destroy_id+0x78/0x4a0 [rdma_cm] but task is already holding lock: 00000000bc7c307d (&id_priv->handler_mutex){+.+.}, at: iw_conn_req_handler+0x151/0x680 [rdma_cm] other info that might help us debug this: Possible unsafe locking scenario: CPU0 ---- lock(&id_priv->handler_mutex); lock(&id_priv->handler_mutex); *** DEADLOCK *** May be due to missing lock nesting notation 3 locks held by kworker/u16:6/171: #0: 00000000e2eaa773 ((wq_completion)iw_cm_wq){+.+.}, at: process_one_work+0x472/0xac0 ivanmeler#1: 000000001efd357b ((work_completion)(&work->work)ivanmeler#3){+.+.}, at: process_one_work+0x476/0xac0 ivanmeler#2: 00000000bc7c307d (&id_priv->handler_mutex){+.+.}, at: iw_conn_req_handler+0x151/0x680 [rdma_cm] stack backtrace: CPU: 3 PID: 171 Comm: kworker/u16:6 Not tainted 5.3.0-rc7-dbg+ ivanmeler#1 Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 Workqueue: iw_cm_wq cm_work_handler [iw_cm] Call Trace: dump_stack+0x8a/0xd6 __lock_acquire.cold+0xe1/0x24d lock_acquire+0x106/0x240 __mutex_lock+0x12e/0xcb0 mutex_lock_nested+0x1f/0x30 rdma_destroy_id+0x78/0x4a0 [rdma_cm] iw_conn_req_handler+0x5c9/0x680 [rdma_cm] cm_work_handler+0xe62/0x1100 [iw_cm] process_one_work+0x56d/0xac0 worker_thread+0x7a/0x5d0 kthread+0x1bc/0x210 ret_from_fork+0x24/0x30 This is not a bug as there are actually two lock classes here. Link: https://lore.kernel.org/r/20190930231707.48259-3-bvanassche@acm.org Fixes: de910bd ("RDMA/cma: Simplify locking needed for serialization of callbacks") Signed-off-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Lee Jones <lee.jones@linaro.org> Change-Id: Iec345e445c537b8b7003488b79a620d4cc8c7ea0
fcuzzocrea
pushed a commit
to fcuzzocrea/android_kernel_samsung_universal8890
that referenced
this issue
May 27, 2020
commit 159d2c7d8106177bd9a986fd005a311fe0d11285 upstream. qdisc_root() use from netem_enqueue() triggers a lockdep warning. __dev_queue_xmit() uses rcu_read_lock_bh() which is not equivalent to rcu_read_lock() + local_bh_disable_bh as far as lockdep is concerned. WARNING: suspicious RCU usage 5.3.0-rc7+ #0 Not tainted ----------------------------- include/net/sch_generic.h:492 suspicious rcu_dereference_check() usage! other info that might help us debug this: rcu_scheduler_active = 2, debug_locks = 1 3 locks held by syz-executor427/8855: #0: 00000000b5525c01 (rcu_read_lock_bh){....}, at: lwtunnel_xmit_redirect include/net/lwtunnel.h:92 [inline] #0: 00000000b5525c01 (rcu_read_lock_bh){....}, at: ip_finish_output2+0x2dc/0x2570 net/ipv4/ip_output.c:214 ivanmeler#1: 00000000b5525c01 (rcu_read_lock_bh){....}, at: __dev_queue_xmit+0x20a/0x3650 net/core/dev.c:3804 ivanmeler#2: 00000000364bae92 (&(&sch->q.lock)->rlock){+.-.}, at: spin_lock include/linux/spinlock.h:338 [inline] ivanmeler#2: 00000000364bae92 (&(&sch->q.lock)->rlock){+.-.}, at: __dev_xmit_skb net/core/dev.c:3502 [inline] ivanmeler#2: 00000000364bae92 (&(&sch->q.lock)->rlock){+.-.}, at: __dev_queue_xmit+0x14b8/0x3650 net/core/dev.c:3838 stack backtrace: CPU: 0 PID: 8855 Comm: syz-executor427 Not tainted 5.3.0-rc7+ #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x172/0x1f0 lib/dump_stack.c:113 lockdep_rcu_suspicious+0x153/0x15d kernel/locking/lockdep.c:5357 qdisc_root include/net/sch_generic.h:492 [inline] netem_enqueue+0x1cfb/0x2d80 net/sched/sch_netem.c:479 __dev_xmit_skb net/core/dev.c:3527 [inline] __dev_queue_xmit+0x15d2/0x3650 net/core/dev.c:3838 dev_queue_xmit+0x18/0x20 net/core/dev.c:3902 neigh_hh_output include/net/neighbour.h:500 [inline] neigh_output include/net/neighbour.h:509 [inline] ip_finish_output2+0x1726/0x2570 net/ipv4/ip_output.c:228 __ip_finish_output net/ipv4/ip_output.c:308 [inline] __ip_finish_output+0x5fc/0xb90 net/ipv4/ip_output.c:290 ip_finish_output+0x38/0x1f0 net/ipv4/ip_output.c:318 NF_HOOK_COND include/linux/netfilter.h:294 [inline] ip_mc_output+0x292/0xf40 net/ipv4/ip_output.c:417 dst_output include/net/dst.h:436 [inline] ip_local_out+0xbb/0x190 net/ipv4/ip_output.c:125 ip_send_skb+0x42/0xf0 net/ipv4/ip_output.c:1555 udp_send_skb.isra.0+0x6b2/0x1160 net/ipv4/udp.c:887 udp_sendmsg+0x1e96/0x2820 net/ipv4/udp.c:1174 inet_sendmsg+0x9e/0xe0 net/ipv4/af_inet.c:807 sock_sendmsg_nosec net/socket.c:637 [inline] sock_sendmsg+0xd7/0x130 net/socket.c:657 ___sys_sendmsg+0x3e2/0x920 net/socket.c:2311 __sys_sendmmsg+0x1bf/0x4d0 net/socket.c:2413 __do_sys_sendmmsg net/socket.c:2442 [inline] __se_sys_sendmmsg net/socket.c:2439 [inline] __x64_sys_sendmmsg+0x9d/0x100 net/socket.c:2439 do_syscall_64+0xfd/0x6a0 arch/x86/entry/common.c:296 entry_SYSCALL_64_after_hwframe+0x49/0xbe Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Lee Jones <lee.jones@linaro.org> Change-Id: Ic47d9dd7d7ff9ac46cca29518b83ecaa9076180c
fcuzzocrea
pushed a commit
to fcuzzocrea/android_kernel_samsung_universal8890
that referenced
this issue
May 27, 2020
commit 5c9934b6767b16ba60be22ec3cbd4379ad64170d upstream. We got another syzbot report [1] that tells us we must use write_lock_irq()/write_unlock_irq() to avoid possible deadlock. [1] WARNING: inconsistent lock state 5.5.0-rc1-syzkaller #0 Not tainted -------------------------------- inconsistent {HARDIRQ-ON-W} -> {IN-HARDIRQ-R} usage. syz-executor826/9605 [HC1[1]:SC0[0]:HE0:SE1] takes: ffffffff8a128718 (disc_data_lock){+-..}, at: sp_get.isra.0+0x1d/0xf0 drivers/net/ppp/ppp_synctty.c:138 {HARDIRQ-ON-W} state was registered at: lock_acquire+0x190/0x410 kernel/locking/lockdep.c:4485 __raw_write_lock_bh include/linux/rwlock_api_smp.h:203 [inline] _raw_write_lock_bh+0x33/0x50 kernel/locking/spinlock.c:319 sixpack_close+0x1d/0x250 drivers/net/hamradio/6pack.c:657 tty_ldisc_close.isra.0+0x119/0x1a0 drivers/tty/tty_ldisc.c:489 tty_set_ldisc+0x230/0x6b0 drivers/tty/tty_ldisc.c:585 tiocsetd drivers/tty/tty_io.c:2337 [inline] tty_ioctl+0xe8d/0x14f0 drivers/tty/tty_io.c:2597 vfs_ioctl fs/ioctl.c:47 [inline] file_ioctl fs/ioctl.c:545 [inline] do_vfs_ioctl+0x977/0x14e0 fs/ioctl.c:732 ksys_ioctl+0xab/0xd0 fs/ioctl.c:749 __do_sys_ioctl fs/ioctl.c:756 [inline] __se_sys_ioctl fs/ioctl.c:754 [inline] __x64_sys_ioctl+0x73/0xb0 fs/ioctl.c:754 do_syscall_64+0xfa/0x790 arch/x86/entry/common.c:294 entry_SYSCALL_64_after_hwframe+0x49/0xbe irq event stamp: 3946 hardirqs last enabled at (3945): [<ffffffff87c86e43>] __raw_spin_unlock_irq include/linux/spinlock_api_smp.h:168 [inline] hardirqs last enabled at (3945): [<ffffffff87c86e43>] _raw_spin_unlock_irq+0x23/0x80 kernel/locking/spinlock.c:199 hardirqs last disabled at (3946): [<ffffffff8100675f>] trace_hardirqs_off_thunk+0x1a/0x1c arch/x86/entry/thunk_64.S:42 softirqs last enabled at (2658): [<ffffffff86a8b4df>] spin_unlock_bh include/linux/spinlock.h:383 [inline] softirqs last enabled at (2658): [<ffffffff86a8b4df>] clusterip_netdev_event+0x46f/0x670 net/ipv4/netfilter/ipt_CLUSTERIP.c:222 softirqs last disabled at (2656): [<ffffffff86a8b22b>] spin_lock_bh include/linux/spinlock.h:343 [inline] softirqs last disabled at (2656): [<ffffffff86a8b22b>] clusterip_netdev_event+0x1bb/0x670 net/ipv4/netfilter/ipt_CLUSTERIP.c:196 other info that might help us debug this: Possible unsafe locking scenario: CPU0 ---- lock(disc_data_lock); <Interrupt> lock(disc_data_lock); *** DEADLOCK *** 5 locks held by syz-executor826/9605: #0: ffff8880a905e198 (&tty->legacy_mutex){+.+.}, at: tty_lock+0xc7/0x130 drivers/tty/tty_mutex.c:19 ivanmeler#1: ffffffff899a56c0 (rcu_read_lock){....}, at: mutex_spin_on_owner+0x0/0x330 kernel/locking/mutex.c:413 ivanmeler#2: ffff8880a496a2b0 (&(&i->lock)->rlock){-.-.}, at: spin_lock include/linux/spinlock.h:338 [inline] ivanmeler#2: ffff8880a496a2b0 (&(&i->lock)->rlock){-.-.}, at: serial8250_interrupt+0x2d/0x1a0 drivers/tty/serial/8250/8250_core.c:116 ivanmeler#3: ffffffff8c104048 (&port_lock_key){-.-.}, at: serial8250_handle_irq.part.0+0x24/0x330 drivers/tty/serial/8250/8250_port.c:1823 ivanmeler#4: ffff8880a905e090 (&tty->ldisc_sem){++++}, at: tty_ldisc_ref+0x22/0x90 drivers/tty/tty_ldisc.c:288 stack backtrace: CPU: 1 PID: 9605 Comm: syz-executor826 Not tainted 5.5.0-rc1-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: <IRQ> __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x197/0x210 lib/dump_stack.c:118 print_usage_bug.cold+0x327/0x378 kernel/locking/lockdep.c:3101 valid_state kernel/locking/lockdep.c:3112 [inline] mark_lock_irq kernel/locking/lockdep.c:3309 [inline] mark_lock+0xbb4/0x1220 kernel/locking/lockdep.c:3666 mark_usage kernel/locking/lockdep.c:3554 [inline] __lock_acquire+0x1e55/0x4a00 kernel/locking/lockdep.c:3909 lock_acquire+0x190/0x410 kernel/locking/lockdep.c:4485 __raw_read_lock include/linux/rwlock_api_smp.h:149 [inline] _raw_read_lock+0x32/0x50 kernel/locking/spinlock.c:223 sp_get.isra.0+0x1d/0xf0 drivers/net/ppp/ppp_synctty.c:138 sixpack_write_wakeup+0x25/0x340 drivers/net/hamradio/6pack.c:402 tty_wakeup+0xe9/0x120 drivers/tty/tty_io.c:536 tty_port_default_wakeup+0x2b/0x40 drivers/tty/tty_port.c:50 tty_port_tty_wakeup+0x57/0x70 drivers/tty/tty_port.c:387 uart_write_wakeup+0x46/0x70 drivers/tty/serial/serial_core.c:104 serial8250_tx_chars+0x495/0xaf0 drivers/tty/serial/8250/8250_port.c:1761 serial8250_handle_irq.part.0+0x2a2/0x330 drivers/tty/serial/8250/8250_port.c:1834 serial8250_handle_irq drivers/tty/serial/8250/8250_port.c:1820 [inline] serial8250_default_handle_irq+0xc0/0x150 drivers/tty/serial/8250/8250_port.c:1850 serial8250_interrupt+0xf1/0x1a0 drivers/tty/serial/8250/8250_core.c:126 __handle_irq_event_percpu+0x15d/0x970 kernel/irq/handle.c:149 handle_irq_event_percpu+0x74/0x160 kernel/irq/handle.c:189 handle_irq_event+0xa7/0x134 kernel/irq/handle.c:206 handle_edge_irq+0x25e/0x8d0 kernel/irq/chip.c:830 generic_handle_irq_desc include/linux/irqdesc.h:156 [inline] do_IRQ+0xde/0x280 arch/x86/kernel/irq.c:250 common_interrupt+0xf/0xf arch/x86/entry/entry_64.S:607 </IRQ> RIP: 0010:cpu_relax arch/x86/include/asm/processor.h:685 [inline] RIP: 0010:mutex_spin_on_owner+0x247/0x330 kernel/locking/mutex.c:579 Code: c3 be 08 00 00 00 4c 89 e7 e8 e5 06 59 00 4c 89 e0 48 c1 e8 03 42 80 3c 38 00 0f 85 e1 00 00 00 49 8b 04 24 a8 01 75 96 f3 90 <e9> 2f fe ff ff 0f 0b e8 0d 19 09 00 84 c0 0f 85 ff fd ff ff 48 c7 RSP: 0018:ffffc90001eafa20 EFLAGS: 00000246 ORIG_RAX: ffffffffffffffd7 RAX: 0000000000000000 RBX: ffff88809fd9e0c0 RCX: 1ffffffff13266dd RDX: 0000000000000000 RSI: 0000000000000008 RDI: 0000000000000000 RBP: ffffc90001eafa60 R08: 1ffff11013d22898 R09: ffffed1013d22899 R10: ffffed1013d22898 R11: ffff88809e9144c7 R12: ffff8880a905e138 R13: ffff88809e9144c0 R14: 0000000000000000 R15: dffffc0000000000 mutex_optimistic_spin kernel/locking/mutex.c:673 [inline] __mutex_lock_common kernel/locking/mutex.c:962 [inline] __mutex_lock+0x32b/0x13c0 kernel/locking/mutex.c:1106 mutex_lock_nested+0x16/0x20 kernel/locking/mutex.c:1121 tty_lock+0xc7/0x130 drivers/tty/tty_mutex.c:19 tty_release+0xb5/0xe90 drivers/tty/tty_io.c:1665 __fput+0x2ff/0x890 fs/file_table.c:280 ____fput+0x16/0x20 fs/file_table.c:313 task_work_run+0x145/0x1c0 kernel/task_work.c:113 exit_task_work include/linux/task_work.h:22 [inline] do_exit+0x8e7/0x2ef0 kernel/exit.c:797 do_group_exit+0x135/0x360 kernel/exit.c:895 __do_sys_exit_group kernel/exit.c:906 [inline] __se_sys_exit_group kernel/exit.c:904 [inline] __x64_sys_exit_group+0x44/0x50 kernel/exit.c:904 do_syscall_64+0xfa/0x790 arch/x86/entry/common.c:294 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x43fef8 Code: Bad RIP value. RSP: 002b:00007ffdb07d2338 EFLAGS: 00000246 ORIG_RAX: 00000000000000e7 RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 000000000043fef8 RDX: 0000000000000000 RSI: 000000000000003c RDI: 0000000000000000 RBP: 00000000004bf730 R08: 00000000000000e7 R09: ffffffffffffffd0 R10: 00000000004002c8 R11: 0000000000000246 R12: 0000000000000001 R13: 00000000006d1180 R14: 0000000000000000 R15: 0000000000000000 Fixes: 6e4e2f8 ("6pack,mkiss: fix lock inconsistency") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Cc: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Lee Jones <lee.jones@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> Change-Id: Id26e23f7b2ce74845dffd39c1fc7dbe8c836dbdd
fcuzzocrea
pushed a commit
to fcuzzocrea/android_kernel_samsung_universal8890
that referenced
this issue
May 28, 2020
[ Upstream commit 42ffb0bf584ae5b6b38f72259af1e0ee417ac77f ] There exists a deadlock with range_cyclic that has existed forever. If we loop around with a bio already built we could deadlock with a writer who has the page locked that we're attempting to write but is waiting on a page in our bio to be written out. The task traces are as follows PID: 1329874 TASK: ffff889ebcdf3800 CPU: 33 COMMAND: "kworker/u113:5" #0 [ffffc900297bb658] __schedule at ffffffff81a4c33f ivanmeler#1 [ffffc900297bb6e0] schedule at ffffffff81a4c6e3 ivanmeler#2 [ffffc900297bb6f8] io_schedule at ffffffff81a4ca42 ivanmeler#3 [ffffc900297bb708] __lock_page at ffffffff811f145b ivanmeler#4 [ffffc900297bb798] __process_pages_contig at ffffffff814bc502 ivanmeler#5 [ffffc900297bb8c8] lock_delalloc_pages at ffffffff814bc684 ivanmeler#6 [ffffc900297bb900] find_lock_delalloc_range at ffffffff814be9ff ivanmeler#7 [ffffc900297bb9a0] writepage_delalloc at ffffffff814bebd0 ivanmeler#8 [ffffc900297bba18] __extent_writepage at ffffffff814bfbf2 #9 [ffffc900297bba98] extent_write_cache_pages at ffffffff814bffbd PID: 2167901 TASK: ffff889dc6a59c00 CPU: 14 COMMAND: "aio-dio-invalid" #0 [ffffc9003b50bb18] __schedule at ffffffff81a4c33f ivanmeler#1 [ffffc9003b50bba0] schedule at ffffffff81a4c6e3 ivanmeler#2 [ffffc9003b50bbb8] io_schedule at ffffffff81a4ca42 ivanmeler#3 [ffffc9003b50bbc8] wait_on_page_bit at ffffffff811f24d6 ivanmeler#4 [ffffc9003b50bc60] prepare_pages at ffffffff814b05a7 ivanmeler#5 [ffffc9003b50bcd8] btrfs_buffered_write at ffffffff814b1359 ivanmeler#6 [ffffc9003b50bdb0] btrfs_file_write_iter at ffffffff814b5933 ivanmeler#7 [ffffc9003b50be38] new_sync_write at ffffffff8128f6a8 ivanmeler#8 [ffffc9003b50bec8] vfs_write at ffffffff81292b9d #9 [ffffc9003b50bf00] ksys_pwrite64 at ffffffff81293032 I used drgn to find the respective pages we were stuck on page_entry.page 0xffffea00fbfc7500 index 8148 bit 15 pid 2167901 page_entry.page 0xffffea00f9bb7400 index 7680 bit 0 pid 1329874 As you can see the kworker is waiting for bit 0 (PG_locked) on index 7680, and aio-dio-invalid is waiting for bit 15 (PG_writeback) on index 8148. aio-dio-invalid has 7680, and the kworker epd looks like the following crash> struct extent_page_data ffffc900297bbbb0 struct extent_page_data { bio = 0xffff889f747ed830, tree = 0xffff889eed6ba448, extent_locked = 0, sync_io = 0 } Probably worth mentioning as well that it waits for writeback of the page to complete while holding a lock on it (at prepare_pages()). Using drgn I walked the bio pages looking for page 0xffffea00fbfc7500 which is the one we're waiting for writeback on bio = Object(prog, 'struct bio', address=0xffff889f747ed830) for i in range(0, bio.bi_vcnt.value_()): bv = bio.bi_io_vec[i] if bv.bv_page.value_() == 0xffffea00fbfc7500: print("FOUND IT") which validated what I suspected. The fix for this is simple, flush the epd before we loop back around to the beginning of the file during writeout. Fixes: b293f02 ("Btrfs: Add writepages support") CC: stable@vger.kernel.org # 4.4+ Reviewed-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Lee Jones <lee.jones@linaro.org> Change-Id: I5daee5ea01df5eb76df156cb489140e58826cab4
fcuzzocrea
pushed a commit
to fcuzzocrea/android_kernel_samsung_universal8890
that referenced
this issue
May 28, 2020
[ Upstream commit 42ffb0bf584ae5b6b38f72259af1e0ee417ac77f ] There exists a deadlock with range_cyclic that has existed forever. If we loop around with a bio already built we could deadlock with a writer who has the page locked that we're attempting to write but is waiting on a page in our bio to be written out. The task traces are as follows PID: 1329874 TASK: ffff889ebcdf3800 CPU: 33 COMMAND: "kworker/u113:5" #0 [ffffc900297bb658] __schedule at ffffffff81a4c33f ivanmeler#1 [ffffc900297bb6e0] schedule at ffffffff81a4c6e3 ivanmeler#2 [ffffc900297bb6f8] io_schedule at ffffffff81a4ca42 ivanmeler#3 [ffffc900297bb708] __lock_page at ffffffff811f145b ivanmeler#4 [ffffc900297bb798] __process_pages_contig at ffffffff814bc502 ivanmeler#5 [ffffc900297bb8c8] lock_delalloc_pages at ffffffff814bc684 ivanmeler#6 [ffffc900297bb900] find_lock_delalloc_range at ffffffff814be9ff ivanmeler#7 [ffffc900297bb9a0] writepage_delalloc at ffffffff814bebd0 ivanmeler#8 [ffffc900297bba18] __extent_writepage at ffffffff814bfbf2 #9 [ffffc900297bba98] extent_write_cache_pages at ffffffff814bffbd PID: 2167901 TASK: ffff889dc6a59c00 CPU: 14 COMMAND: "aio-dio-invalid" #0 [ffffc9003b50bb18] __schedule at ffffffff81a4c33f ivanmeler#1 [ffffc9003b50bba0] schedule at ffffffff81a4c6e3 ivanmeler#2 [ffffc9003b50bbb8] io_schedule at ffffffff81a4ca42 ivanmeler#3 [ffffc9003b50bbc8] wait_on_page_bit at ffffffff811f24d6 ivanmeler#4 [ffffc9003b50bc60] prepare_pages at ffffffff814b05a7 ivanmeler#5 [ffffc9003b50bcd8] btrfs_buffered_write at ffffffff814b1359 ivanmeler#6 [ffffc9003b50bdb0] btrfs_file_write_iter at ffffffff814b5933 ivanmeler#7 [ffffc9003b50be38] new_sync_write at ffffffff8128f6a8 ivanmeler#8 [ffffc9003b50bec8] vfs_write at ffffffff81292b9d #9 [ffffc9003b50bf00] ksys_pwrite64 at ffffffff81293032 I used drgn to find the respective pages we were stuck on page_entry.page 0xffffea00fbfc7500 index 8148 bit 15 pid 2167901 page_entry.page 0xffffea00f9bb7400 index 7680 bit 0 pid 1329874 As you can see the kworker is waiting for bit 0 (PG_locked) on index 7680, and aio-dio-invalid is waiting for bit 15 (PG_writeback) on index 8148. aio-dio-invalid has 7680, and the kworker epd looks like the following crash> struct extent_page_data ffffc900297bbbb0 struct extent_page_data { bio = 0xffff889f747ed830, tree = 0xffff889eed6ba448, extent_locked = 0, sync_io = 0 } Probably worth mentioning as well that it waits for writeback of the page to complete while holding a lock on it (at prepare_pages()). Using drgn I walked the bio pages looking for page 0xffffea00fbfc7500 which is the one we're waiting for writeback on bio = Object(prog, 'struct bio', address=0xffff889f747ed830) for i in range(0, bio.bi_vcnt.value_()): bv = bio.bi_io_vec[i] if bv.bv_page.value_() == 0xffffea00fbfc7500: print("FOUND IT") which validated what I suspected. The fix for this is simple, flush the epd before we loop back around to the beginning of the file during writeout. Fixes: b293f02 ("Btrfs: Add writepages support") CC: stable@vger.kernel.org # 4.4+ Reviewed-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Lee Jones <lee.jones@linaro.org> Change-Id: I5daee5ea01df5eb76df156cb489140e58826cab4
fcuzzocrea
pushed a commit
to fcuzzocrea/android_kernel_samsung_universal8890
that referenced
this issue
May 28, 2020
commit cf0a25436f05753aca5151891aea4fd130556e2a upstream. BUG: sleeping function called from invalid context at kernel/locking/rtmutex.c:917 in_atomic(): 1, irqs_disabled(): 128, pid: 383, name: sh Preemption disabled at:[<ffff800000124c18>] kgdb_cpu_enter+0x158/0x6b8 CPU: 3 PID: 383 Comm: sh Tainted: G W 4.1.13-rt13 ivanmeler#2 Hardware name: Freescale Layerscape 2085a RDB Board (DT) Call trace: [<ffff8000000885e8>] dump_backtrace+0x0/0x128 [<ffff800000088734>] show_stack+0x24/0x30 [<ffff80000079a7c4>] dump_stack+0x80/0xa0 [<ffff8000000bd324>] ___might_sleep+0x18c/0x1a0 [<ffff8000007a20ac>] __rt_spin_lock+0x2c/0x40 [<ffff8000007a2268>] rt_read_lock+0x40/0x58 [<ffff800000085328>] single_step_handler+0x38/0xd8 [<ffff800000082368>] do_debug_exception+0x58/0xb8 Exception stack(0xffff80834a1e7c80 to 0xffff80834a1e7da0) 7c80: ffffff9c ffffffff 92c23ba0 0000ffff 4a1e7e40 ffff8083 001bfcc4 ffff8000 7ca0: f2000400 00000000 00000000 00000000 4a1e7d80 ffff8083 0049501c ffff8000 7cc0: 00005402 00000000 00aaa210 ffff8000 4a1e7ea0 ffff8083 000833f4 ffff8000 7ce0: ffffff9c ffffffff 92c23ba0 0000ffff 4a1e7ea0 ffff8083 001bfcc0 ffff8000 7d00: 4a0fc400 ffff8083 00005402 00000000 4a1e7d40 ffff8083 00490324 ffff8000 7d20: ffffff9c 00000000 92c23ba0 0000ffff 000a0000 00000000 00000000 00000000 7d40: 00000008 00000000 00080000 00000000 92c23b8b 0000ffff 92c23b8e 0000ffff 7d60: 00000038 00000000 00001cb2 00000000 00000005 00000000 92d7b498 0000ffff 7d80: 01010101 01010101 92be9000 0000ffff 00000000 00000000 00000030 00000000 [<ffff8000000833f4>] el1_dbg+0x18/0x6c This issue is similar with 62c6c61("arm64: replace read_lock to rcu lock in call_break_hook"), but comes to single_step_handler. This also solves kgdbts boot test silent hang issue on 4.4 -rt kernel. Signed-off-by: Yang Shi <yang.shi@linaro.org> Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Lee Jones <lee.jones@linaro.org> Change-Id: Ie8256b40dac91a0174938394f530f5db08ebef45
fcuzzocrea
pushed a commit
to fcuzzocrea/android_kernel_samsung_universal8890
that referenced
this issue
May 28, 2020
commit cf0a25436f05753aca5151891aea4fd130556e2a upstream. BUG: sleeping function called from invalid context at kernel/locking/rtmutex.c:917 in_atomic(): 1, irqs_disabled(): 128, pid: 383, name: sh Preemption disabled at:[<ffff800000124c18>] kgdb_cpu_enter+0x158/0x6b8 CPU: 3 PID: 383 Comm: sh Tainted: G W 4.1.13-rt13 ivanmeler#2 Hardware name: Freescale Layerscape 2085a RDB Board (DT) Call trace: [<ffff8000000885e8>] dump_backtrace+0x0/0x128 [<ffff800000088734>] show_stack+0x24/0x30 [<ffff80000079a7c4>] dump_stack+0x80/0xa0 [<ffff8000000bd324>] ___might_sleep+0x18c/0x1a0 [<ffff8000007a20ac>] __rt_spin_lock+0x2c/0x40 [<ffff8000007a2268>] rt_read_lock+0x40/0x58 [<ffff800000085328>] single_step_handler+0x38/0xd8 [<ffff800000082368>] do_debug_exception+0x58/0xb8 Exception stack(0xffff80834a1e7c80 to 0xffff80834a1e7da0) 7c80: ffffff9c ffffffff 92c23ba0 0000ffff 4a1e7e40 ffff8083 001bfcc4 ffff8000 7ca0: f2000400 00000000 00000000 00000000 4a1e7d80 ffff8083 0049501c ffff8000 7cc0: 00005402 00000000 00aaa210 ffff8000 4a1e7ea0 ffff8083 000833f4 ffff8000 7ce0: ffffff9c ffffffff 92c23ba0 0000ffff 4a1e7ea0 ffff8083 001bfcc0 ffff8000 7d00: 4a0fc400 ffff8083 00005402 00000000 4a1e7d40 ffff8083 00490324 ffff8000 7d20: ffffff9c 00000000 92c23ba0 0000ffff 000a0000 00000000 00000000 00000000 7d40: 00000008 00000000 00080000 00000000 92c23b8b 0000ffff 92c23b8e 0000ffff 7d60: 00000038 00000000 00001cb2 00000000 00000005 00000000 92d7b498 0000ffff 7d80: 01010101 01010101 92be9000 0000ffff 00000000 00000000 00000030 00000000 [<ffff8000000833f4>] el1_dbg+0x18/0x6c This issue is similar with 62c6c61("arm64: replace read_lock to rcu lock in call_break_hook"), but comes to single_step_handler. This also solves kgdbts boot test silent hang issue on 4.4 -rt kernel. Signed-off-by: Yang Shi <yang.shi@linaro.org> Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Lee Jones <lee.jones@linaro.org> Change-Id: Ie8256b40dac91a0174938394f530f5db08ebef45
fcuzzocrea
pushed a commit
to fcuzzocrea/android_kernel_samsung_universal8890
that referenced
this issue
May 28, 2020
[ Upstream commit f0cc2cd70164efe8f75c5d99560f0f69969c72e4 ] During unmount we can have a job from the delayed inode items work queue still running, that can lead to at least two bad things: 1) A crash, because the worker can try to create a transaction just after the fs roots were freed; 2) A transaction leak, because the worker can create a transaction before the fs roots are freed and just after we committed the last transaction and after we stopped the transaction kthread. A stack trace example of the crash: [79011.691214] kernel BUG at lib/radix-tree.c:982! [79011.692056] invalid opcode: 0000 [ivanmeler#1] PREEMPT SMP DEBUG_PAGEALLOC PTI [79011.693180] CPU: 3 PID: 1394 Comm: kworker/u8:2 Tainted: G W 5.6.0-rc2-btrfs-next-54 ivanmeler#2 (...) [79011.696789] Workqueue: btrfs-delayed-meta btrfs_work_helper [btrfs] [79011.697904] RIP: 0010:radix_tree_tag_set+0xe7/0x170 (...) [79011.702014] RSP: 0018:ffffb3c84a317ca0 EFLAGS: 00010293 [79011.702949] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000 [79011.704202] RDX: ffffb3c84a317cb0 RSI: ffffb3c84a317ca8 RDI: ffff8db3931340a0 [79011.705463] RBP: 0000000000000005 R08: 0000000000000005 R09: ffffffff974629d0 [79011.706756] R10: ffffb3c84a317bc0 R11: 0000000000000001 R12: ffff8db393134000 [79011.708010] R13: ffff8db3931340a0 R14: ffff8db393134068 R15: 0000000000000001 [79011.709270] FS: 0000000000000000(0000) GS:ffff8db3b6a00000(0000) knlGS:0000000000000000 [79011.710699] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [79011.711710] CR2: 00007f22c2a0a000 CR3: 0000000232ad4005 CR4: 00000000003606e0 [79011.712958] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [79011.714205] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [79011.715448] Call Trace: [79011.715925] record_root_in_trans+0x72/0xf0 [btrfs] [79011.716819] btrfs_record_root_in_trans+0x4b/0x70 [btrfs] [79011.717925] start_transaction+0xdd/0x5c0 [btrfs] [79011.718829] btrfs_async_run_delayed_root+0x17e/0x2b0 [btrfs] [79011.719915] btrfs_work_helper+0xaa/0x720 [btrfs] [79011.720773] process_one_work+0x26d/0x6a0 [79011.721497] worker_thread+0x4f/0x3e0 [79011.722153] ? process_one_work+0x6a0/0x6a0 [79011.722901] kthread+0x103/0x140 [79011.723481] ? kthread_create_worker_on_cpu+0x70/0x70 [79011.724379] ret_from_fork+0x3a/0x50 (...) The following diagram shows a sequence of steps that lead to the crash during ummount of the filesystem: CPU 1 CPU 2 CPU 3 btrfs_punch_hole() btrfs_btree_balance_dirty() btrfs_balance_delayed_items() --> sees fs_info->delayed_root->items with value 200, which is greater than BTRFS_DELAYED_BACKGROUND (128) and smaller than BTRFS_DELAYED_WRITEBACK (512) btrfs_wq_run_delayed_node() --> queues a job for fs_info->delayed_workers to run btrfs_async_run_delayed_root() btrfs_async_run_delayed_root() --> job queued by CPU 1 --> starts picking and running delayed nodes from the prepare_list list close_ctree() btrfs_delete_unused_bgs() btrfs_commit_super() btrfs_join_transaction() --> gets transaction N btrfs_commit_transaction(N) --> set transaction state to TRANTS_STATE_COMMIT_START btrfs_first_prepared_delayed_node() --> picks delayed node X through the prepared_list list btrfs_run_delayed_items() btrfs_first_delayed_node() --> also picks delayed node X but through the node_list list __btrfs_commit_inode_delayed_items() --> runs all delayed items from this node and drops the node's item count to 0 through call to btrfs_release_delayed_inode() --> finishes running any remaining delayed nodes --> finishes transaction commit --> stops cleaner and transaction threads btrfs_free_fs_roots() --> frees all roots and removes them from the radix tree fs_info->fs_roots_radix btrfs_join_transaction() start_transaction() btrfs_record_root_in_trans() record_root_in_trans() radix_tree_tag_set() --> crashes because the root is not in the radix tree anymore If the worker is able to call btrfs_join_transaction() before the unmount task frees the fs roots, we end up leaking a transaction and all its resources, since after the call to btrfs_commit_super() and stopping the transaction kthread, we don't expect to have any transaction open anymore. When this situation happens the worker has a delayed node that has no more items to run, since the task calling btrfs_run_delayed_items(), which is doing a transaction commit, picks the same node and runs all its items first. We can not wait for the worker to complete when running delayed items through btrfs_run_delayed_items(), because we call that function in several phases of a transaction commit, and that could cause a deadlock because the worker calls btrfs_join_transaction() and the task doing the transaction commit may have already set the transaction state to TRANS_STATE_COMMIT_DOING. Also it's not possible to get into a situation where only some of the items of a delayed node are added to the fs/subvolume tree in the current transaction and the remaining ones in the next transaction, because when running the items of a delayed inode we lock its mutex, effectively waiting for the worker if the worker is running the items of the delayed node already. Since this can only cause issues when unmounting a filesystem, fix it in a simple way by waiting for any jobs on the delayed workers queue before calling btrfs_commit_supper() at close_ctree(). This works because at this point no one can call btrfs_btree_balance_dirty() or btrfs_balance_delayed_items(), and if we end up waiting for any worker to complete, btrfs_commit_super() will commit the transaction created by the worker. CC: stable@vger.kernel.org # 4.4+ Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Lee Jones <lee.jones@linaro.org> Change-Id: Ia179a92918b1d0ddf237562cce9a732919f6e619
fcuzzocrea
pushed a commit
to fcuzzocrea/android_kernel_samsung_universal8890
that referenced
this issue
May 28, 2020
commit 056ad39ee9253873522f6469c3364964a322912b upstream. FuzzUSB (a variant of syzkaller) found a free-while-still-in-use bug in the USB scatter-gather library: BUG: KASAN: use-after-free in atomic_read include/asm-generic/atomic-instrumented.h:26 [inline] BUG: KASAN: use-after-free in usb_hcd_unlink_urb+0x5f/0x170 drivers/usb/core/hcd.c:1607 Read of size 4 at addr ffff888065379610 by task kworker/u4:1/27 CPU: 1 PID: 27 Comm: kworker/u4:1 Not tainted 5.5.11 ivanmeler#2 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014 Workqueue: scsi_tmf_2 scmd_eh_abort_handler Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0xce/0x128 lib/dump_stack.c:118 print_address_description.constprop.4+0x21/0x3c0 mm/kasan/report.c:374 __kasan_report+0x153/0x1cb mm/kasan/report.c:506 kasan_report+0x12/0x20 mm/kasan/common.c:639 check_memory_region_inline mm/kasan/generic.c:185 [inline] check_memory_region+0x152/0x1b0 mm/kasan/generic.c:192 __kasan_check_read+0x11/0x20 mm/kasan/common.c:95 atomic_read include/asm-generic/atomic-instrumented.h:26 [inline] usb_hcd_unlink_urb+0x5f/0x170 drivers/usb/core/hcd.c:1607 usb_unlink_urb+0x72/0xb0 drivers/usb/core/urb.c:657 usb_sg_cancel+0x14e/0x290 drivers/usb/core/message.c:602 usb_stor_stop_transport+0x5e/0xa0 drivers/usb/storage/transport.c:937 This bug occurs when cancellation of the S-G transfer races with transfer completion. When that happens, usb_sg_cancel() may continue to access the transfer's URBs after usb_sg_wait() has freed them. The bug is caused by the fact that usb_sg_cancel() does not take any sort of reference to the transfer, and so there is nothing to prevent the URBs from being deallocated while the routine is trying to use them. The fix is to take such a reference by incrementing the transfer's io->count field while the cancellation is in progres and decrementing it afterward. The transfer's URBs are not deallocated until io->complete is triggered, which happens when io->count reaches zero. Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Reported-and-tested-by: Kyungtae Kim <kt0755@gmail.com> CC: <stable@vger.kernel.org> Link: https://lore.kernel.org/r/Pine.LNX.4.44L0.2003281615140.14837-100000@netrider.rowland.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Lee Jones <lee.jones@linaro.org> Change-Id: I21aecc9aa1c604b05eb74bd3e46bcdb224af81fd
fcuzzocrea
pushed a commit
to fcuzzocrea/android_kernel_samsung_universal8890
that referenced
this issue
Jun 4, 2020
commit 55e610cdd28c0ad3dce0652030c0296d549673f3 upstream. This lock is already taken in ata_scsi_queuecmd() a few levels up the call stack so attempting to take it here is an error. Moreover, it is pointless in the first place since it only protects a single, atomic assignment. Enabling lock debugging gives the following output: ============================================= [ INFO: possible recursive locking detected ] 4.4.0-rc5+ #189 Not tainted --------------------------------------------- kworker/u2:3/37 is trying to acquire lock: (&(&host->lock)->rlock){-.-...}, at: [<90283294>] sata_dwc_exec_command_by_tag.constprop.14+0x44/0x8c but task is already holding lock: (&(&host->lock)->rlock){-.-...}, at: [<902761ac>] ata_scsi_queuecmd+0x2c/0x330 other info that might help us debug this: Possible unsafe locking scenario: CPU0 ---- lock(&(&host->lock)->rlock); lock(&(&host->lock)->rlock); *** DEADLOCK *** May be due to missing lock nesting notation 4 locks held by kworker/u2:3/37: #0: ("events_unbound"){.+.+.+}, at: [<9003a0a4>] process_one_work+0x12c/0x430 ivanmeler#1: ((&entry->work)){+.+.+.}, at: [<9003a0a4>] process_one_work+0x12c/0x430 ivanmeler#2: (&bdev->bd_mutex){+.+.+.}, at: [<9011fd54>] __blkdev_get+0x50/0x380 ivanmeler#3: (&(&host->lock)->rlock){-.-...}, at: [<902761ac>] ata_scsi_queuecmd+0x2c/0x330 stack backtrace: CPU: 0 PID: 37 Comm: kworker/u2:3 Not tainted 4.4.0-rc5+ #189 Workqueue: events_unbound async_run_entry_fn Stack : 90b38e30 00000021 00000003 9b2a6040 00000000 9005f3f0 904fc8dc 00000025 906b96e4 00000000 90528648 9b3336c4 904fc8dc 9009bf18 00000002 00000004 00000000 00000000 9b3336c4 9b3336e4 904fc8dc 9003d074 00000000 90500000 9005e738 00000000 00000000 00000000 00000000 00000000 00000000 00000000 6e657665 755f7374 756f626e 0000646e 00000000 00000000 9b00ca00 9b025000 ... Call Trace: [<90009d6c>] show_stack+0x88/0xa4 [<90057744>] __lock_acquire+0x1ce8/0x2154 [<900583e4>] lock_acquire+0x64/0x8c [<9045ff10>] _raw_spin_lock_irqsave+0x54/0x78 [<90283294>] sata_dwc_exec_command_by_tag.constprop.14+0x44/0x8c [<90283484>] sata_dwc_qc_issue+0x1a8/0x24c [<9026b39c>] ata_qc_issue+0x1f0/0x410 [<90273c6c>] ata_scsi_translate+0xb4/0x200 [<90276234>] ata_scsi_queuecmd+0xb4/0x330 [<9025800c>] scsi_dispatch_cmd+0xd0/0x128 [<90259934>] scsi_request_fn+0x58c/0x638 [<901a3e50>] __blk_run_queue+0x40/0x5c [<901a83d4>] blk_queue_bio+0x27c/0x28c [<901a5914>] generic_make_request+0xf0/0x188 [<901a5a54>] submit_bio+0xa8/0x194 [<9011adcc>] submit_bh_wbc.isra.23+0x15c/0x17c [<9011c908>] block_read_full_page+0x3e4/0x428 [<9009e2e0>] do_read_cache_page+0xac/0x210 [<9009fd90>] read_cache_page+0x18/0x24 [<901bbd18>] read_dev_sector+0x38/0xb0 [<901bd174>] msdos_partition+0xb4/0x5c0 [<901bcb8c>] check_partition+0x140/0x274 [<901bba60>] rescan_partitions+0xa0/0x2b0 [<9011ff68>] __blkdev_get+0x264/0x380 [<901201ac>] blkdev_get+0x128/0x36c [<901b9378>] add_disk+0x3c0/0x4bc [<90268268>] sd_probe_async+0x100/0x224 [<90043a44>] async_run_entry_fn+0x50/0x124 [<9003a11c>] process_one_work+0x1a4/0x430 [<9003a4f4>] worker_thread+0x14c/0x4fc [<900408f4>] kthread+0xd0/0xe8 [<90004338>] ret_from_kernel_thread+0x14/0x1c Fixes: 6293600 ("[libata] Add 460EX on-chip SATA driver, sata_dwc_460ex") Tested-by: Christian Lamparter <chunkeey@googlemail.com> Signed-off-by: Mans Rullgard <mans@mansr.com> Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Lee Jones <lee.jones@linaro.org> Change-Id: I81359dded4519e5366e5fe4339756a22c1827b75
fcuzzocrea
pushed a commit
to fcuzzocrea/android_kernel_samsung_universal8890
that referenced
this issue
Jun 4, 2020
[ Upstream commit 76e752701a8af4404bbd9c45723f7cbd6e4a251e ] Some servers seem to accept connections while booting but never send the SMBNegotiate response neither close the connection, causing all processes accessing the share hang on uninterruptible sleep state. This happens when the cifs_demultiplex_thread detects the server is unresponsive so releases the socket and start trying to reconnect. At some point, the faulty server will accept the socket and the TCP status will be set to NeedNegotiate. The first issued command accessing the share will start the negotiation (pid 5828 below), but the response will never arrive so other commands will be blocked waiting on the mutex (pid 55352). This patch checks for unresponsive servers also on the negotiate stage releasing the socket and reconnecting if the response is not received and checking again the tcp state when the mutex is acquired. PID: 55352 TASK: ffff880fd6cc02c0 CPU: 0 COMMAND: "ls" #0 [ffff880fd9add9f0] schedule at ffffffff81467eb9 ivanmeler#1 [ffff880fd9addb38] __mutex_lock_slowpath at ffffffff81468fe0 ivanmeler#2 [ffff880fd9addba8] mutex_lock at ffffffff81468b1a ivanmeler#3 [ffff880fd9addbc0] cifs_reconnect_tcon at ffffffffa042f905 [cifs] ivanmeler#4 [ffff880fd9addc60] smb_init at ffffffffa042faeb [cifs] ivanmeler#5 [ffff880fd9addca0] CIFSSMBQPathInfo at ffffffffa04360b5 [cifs] .... Which is waiting a mutex owned by: PID: 5828 TASK: ffff880fcc55e400 CPU: 0 COMMAND: "xxxx" #0 [ffff880fbfdc19b8] schedule at ffffffff81467eb9 ivanmeler#1 [ffff880fbfdc1b00] wait_for_response at ffffffffa044f96d [cifs] ivanmeler#2 [ffff880fbfdc1b60] SendReceive at ffffffffa04505ce [cifs] ivanmeler#3 [ffff880fbfdc1bb0] CIFSSMBNegotiate at ffffffffa0438d79 [cifs] ivanmeler#4 [ffff880fbfdc1c50] cifs_negotiate_protocol at ffffffffa043b383 [cifs] ivanmeler#5 [ffff880fbfdc1c80] cifs_reconnect_tcon at ffffffffa042f911 [cifs] ivanmeler#6 [ffff880fbfdc1d20] smb_init at ffffffffa042faeb [cifs] ivanmeler#7 [ffff880fbfdc1d60] CIFSSMBQFSInfo at ffffffffa0434eb0 [cifs] .... Signed-off-by: Samuel Cabrero <scabrero@suse.de> Reviewed-by: Aurélien Aptel <aaptel@suse.de> Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com> Signed-off-by: Steve French <smfrench@gmail.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Lee Jones <lee.jones@linaro.org> Change-Id: I588516ef6df5f4e7cb738f9cb602e59d94254a42
fcuzzocrea
pushed a commit
to fcuzzocrea/android_kernel_samsung_universal8890
that referenced
this issue
May 22, 2021
Our static-static calculation returns a failure if the public key is of low order. We check for this when peers are added, and don't allow them to be added if they're low order, except in the case where we haven't yet been given a private key. In that case, we would defer the removal of the peer until we're given a private key, since at that point we're doing new static-static calculations which incur failures we can act on. This meant, however, that we wound up removing peers rather late in the configuration flow. Syzkaller points out that peer_remove calls flush_workqueue, which in turn might then wait for sending a handshake initiation to complete. Since handshake initiation needs the static identity lock, holding the static identity lock while calling peer_remove can result in a rare deadlock. We have precisely this case in this situation of late-stage peer removal based on an invalid public key. We can't drop the lock when removing, because then incoming handshakes might interact with a bogus static-static calculation. While the band-aid patch for this would involve breaking up the peer removal into two steps like wg_peer_remove_all does, in order to solve the locking issue, there's actually a much more elegant way of fixing this: If the static-static calculation succeeds with one private key, it *must* succeed with all others, because all 32-byte strings map to valid private keys, thanks to clamping. That means we can get rid of this silly dance and locking headaches of removing peers late in the configuration flow, and instead just reject them early on, regardless of whether the device has yet been assigned a private key. For the case where the device doesn't yet have a private key, we safely use zeros just for the purposes of checking for low order points by way of checking the output of the calculation. The following PoC will trigger the deadlock: ip link add wg0 type wireguard ip addr add 10.0.0.1/24 dev wg0 ip link set wg0 up ping -f 10.0.0.2 & while true; do wg set wg0 private-key /dev/null peer AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA= allowed-ips 10.0.0.0/24 endpoint 10.0.0.3:1234 wg set wg0 private-key <(echo AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA=) done [ 0.949105] ====================================================== [ 0.949550] WARNING: possible circular locking dependency detected [ 0.950143] 5.5.0-debug+ #18 Not tainted [ 0.950431] ------------------------------------------------------ [ 0.950959] wg/89 is trying to acquire lock: [ 0.951252] ffff8880333e2128 ((wq_completion)wg-kex-wg0){+.+.}, at: flush_workqueue+0xe3/0x12f0 [ 0.951865] [ 0.951865] but task is already holding lock: [ 0.952280] ffff888032819bc0 (&wg->static_identity.lock){++++}, at: wg_set_device+0x95d/0xcc0 [ 0.953011] [ 0.953011] which lock already depends on the new lock. [ 0.953011] [ 0.953651] [ 0.953651] the existing dependency chain (in reverse order) is: [ 0.954292] [ 0.954292] -> ivanmeler#2 (&wg->static_identity.lock){++++}: [ 0.954804] lock_acquire+0x127/0x350 [ 0.955133] down_read+0x83/0x410 [ 0.955428] wg_noise_handshake_create_initiation+0x97/0x700 [ 0.955885] wg_packet_send_handshake_initiation+0x13a/0x280 [ 0.956401] wg_packet_handshake_send_worker+0x10/0x20 [ 0.956841] process_one_work+0x806/0x1500 [ 0.957167] worker_thread+0x8c/0xcb0 [ 0.957549] kthread+0x2ee/0x3b0 [ 0.957792] ret_from_fork+0x24/0x30 [ 0.958234] [ 0.958234] -> ivanmeler#1 ((work_completion)(&peer->transmit_handshake_work)){+.+.}: [ 0.958808] lock_acquire+0x127/0x350 [ 0.959075] process_one_work+0x7ab/0x1500 [ 0.959369] worker_thread+0x8c/0xcb0 [ 0.959639] kthread+0x2ee/0x3b0 [ 0.959896] ret_from_fork+0x24/0x30 [ 0.960346] [ 0.960346] -> #0 ((wq_completion)wg-kex-wg0){+.+.}: [ 0.960945] check_prev_add+0x167/0x1e20 [ 0.961351] __lock_acquire+0x2012/0x3170 [ 0.961725] lock_acquire+0x127/0x350 [ 0.961990] flush_workqueue+0x106/0x12f0 [ 0.962280] peer_remove_after_dead+0x160/0x220 [ 0.962600] wg_set_device+0xa24/0xcc0 [ 0.962994] genl_rcv_msg+0x52f/0xe90 [ 0.963298] netlink_rcv_skb+0x111/0x320 [ 0.963618] genl_rcv+0x1f/0x30 [ 0.963853] netlink_unicast+0x3f6/0x610 [ 0.964245] netlink_sendmsg+0x700/0xb80 [ 0.964586] __sys_sendto+0x1dd/0x2c0 [ 0.964854] __x64_sys_sendto+0xd8/0x1b0 [ 0.965141] do_syscall_64+0x90/0xd9a [ 0.965408] entry_SYSCALL_64_after_hwframe+0x49/0xbe [ 0.965769] [ 0.965769] other info that might help us debug this: [ 0.965769] [ 0.966337] Chain exists of: [ 0.966337] (wq_completion)wg-kex-wg0 --> (work_completion)(&peer->transmit_handshake_work) --> &wg->static_identity.lock [ 0.966337] [ 0.967417] Possible unsafe locking scenario: [ 0.967417] [ 0.967836] CPU0 CPU1 [ 0.968155] ---- ---- [ 0.968497] lock(&wg->static_identity.lock); [ 0.968779] lock((work_completion)(&peer->transmit_handshake_work)); [ 0.969345] lock(&wg->static_identity.lock); [ 0.969809] lock((wq_completion)wg-kex-wg0); [ 0.970146] [ 0.970146] *** DEADLOCK *** [ 0.970146] [ 0.970531] 5 locks held by wg/89: [ 0.970908] #0: ffffffff827433c8 (cb_lock){++++}, at: genl_rcv+0x10/0x30 [ 0.971400] ivanmeler#1: ffffffff82743480 (genl_mutex){+.+.}, at: genl_rcv_msg+0x642/0xe90 [ 0.971924] ivanmeler#2: ffffffff827160c0 (rtnl_mutex){+.+.}, at: wg_set_device+0x9f/0xcc0 [ 0.972488] ivanmeler#3: ffff888032819de0 (&wg->device_update_lock){+.+.}, at: wg_set_device+0xb0/0xcc0 [ 0.973095] ivanmeler#4: ffff888032819bc0 (&wg->static_identity.lock){++++}, at: wg_set_device+0x95d/0xcc0 [ 0.973653] [ 0.973653] stack backtrace: [ 0.973932] CPU: 1 PID: 89 Comm: wg Not tainted 5.5.0-debug+ #18 [ 0.974476] Call Trace: [ 0.974638] dump_stack+0x97/0xe0 [ 0.974869] check_noncircular+0x312/0x3e0 [ 0.975132] ? print_circular_bug+0x1f0/0x1f0 [ 0.975410] ? __kernel_text_address+0x9/0x30 [ 0.975727] ? unwind_get_return_address+0x51/0x90 [ 0.976024] check_prev_add+0x167/0x1e20 [ 0.976367] ? graph_lock+0x70/0x160 [ 0.976682] __lock_acquire+0x2012/0x3170 [ 0.976998] ? register_lock_class+0x1140/0x1140 [ 0.977323] lock_acquire+0x127/0x350 [ 0.977627] ? flush_workqueue+0xe3/0x12f0 [ 0.977890] flush_workqueue+0x106/0x12f0 [ 0.978147] ? flush_workqueue+0xe3/0x12f0 [ 0.978410] ? find_held_lock+0x2c/0x110 [ 0.978662] ? lock_downgrade+0x6e0/0x6e0 [ 0.978919] ? queue_rcu_work+0x60/0x60 [ 0.979166] ? netif_napi_del+0x151/0x3b0 [ 0.979501] ? peer_remove_after_dead+0x160/0x220 [ 0.979871] peer_remove_after_dead+0x160/0x220 [ 0.980232] wg_set_device+0xa24/0xcc0 [ 0.980516] ? deref_stack_reg+0x8e/0xc0 [ 0.980801] ? set_peer+0xe10/0xe10 [ 0.981040] ? __ww_mutex_check_waiters+0x150/0x150 [ 0.981430] ? __nla_validate_parse+0x163/0x270 [ 0.981719] ? genl_family_rcv_msg_attrs_parse+0x13f/0x310 [ 0.982078] genl_rcv_msg+0x52f/0xe90 [ 0.982348] ? genl_family_rcv_msg_attrs_parse+0x310/0x310 [ 0.982690] ? register_lock_class+0x1140/0x1140 [ 0.983049] netlink_rcv_skb+0x111/0x320 [ 0.983298] ? genl_family_rcv_msg_attrs_parse+0x310/0x310 [ 0.983645] ? netlink_ack+0x880/0x880 [ 0.983888] genl_rcv+0x1f/0x30 [ 0.984168] netlink_unicast+0x3f6/0x610 [ 0.984443] ? netlink_detachskb+0x60/0x60 [ 0.984729] ? find_held_lock+0x2c/0x110 [ 0.984976] netlink_sendmsg+0x700/0xb80 [ 0.985220] ? netlink_broadcast_filtered+0xa60/0xa60 [ 0.985533] __sys_sendto+0x1dd/0x2c0 [ 0.985763] ? __x64_sys_getpeername+0xb0/0xb0 [ 0.986039] ? sockfd_lookup_light+0x17/0x160 [ 0.986397] ? __sys_recvmsg+0x8c/0xf0 [ 0.986711] ? __sys_recvmsg_sock+0xd0/0xd0 [ 0.987018] __x64_sys_sendto+0xd8/0x1b0 [ 0.987283] ? lockdep_hardirqs_on+0x39b/0x5a0 [ 0.987666] do_syscall_64+0x90/0xd9a [ 0.987903] entry_SYSCALL_64_after_hwframe+0x49/0xbe [ 0.988223] RIP: 0033:0x7fe77c12003e [ 0.988508] Code: c3 8b 07 85 c0 75 24 49 89 fb 48 89 f0 48 89 d7 48 89 ce 4c 89 c2 4d 89 ca 4c 8b 44 24 08 4c 8b 4c 24 10 4c 4 [ 0.989666] RSP: 002b:00007fffada2ed58 EFLAGS: 00000246 ORIG_RAX: 000000000000002c [ 0.990137] RAX: ffffffffffffffda RBX: 00007fe77c159d48 RCX: 00007fe77c12003e [ 0.990583] RDX: 0000000000000040 RSI: 000055fd1d38e020 RDI: 0000000000000004 [ 0.991091] RBP: 000055fd1d38e020 R08: 000055fd1cb63358 R09: 000000000000000c [ 0.991568] R10: 0000000000000000 R11: 0000000000000246 R12: 000000000000002c [ 0.992014] R13: 0000000000000004 R14: 000055fd1d38e020 R15: 0000000000000001 Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: David S. Miller <davem@davemloft.net> (cherry picked from commit ec31c2676a10e064878927b243fada8c2fb0c03c) Bug: 152722841 Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> Change-Id: I860bfac72c98c8c9b26f4490b4f346dc67892f87
fcuzzocrea
pushed a commit
to fcuzzocrea/android_kernel_samsung_universal8890
that referenced
this issue
May 23, 2021
[ Upstream commit 47b16820c490149c2923e8474048f2c6e7557cab ] If xace hardware reports a bad version number, the error handling code in ace_setup() calls put_disk(), followed by queue cleanup. However, since the disk data structure has the queue pointer set, put_disk() also cleans and releases the queue. This results in blk_cleanup_queue() accessing an already released data structure, which in turn may result in a crash such as the following. [ 10.681671] BUG: Kernel NULL pointer dereference at 0x00000040 [ 10.681826] Faulting instruction address: 0xc0431480 [ 10.682072] Oops: Kernel access of bad area, sig: 11 [ivanmeler#1] [ 10.682251] BE PAGE_SIZE=4K PREEMPT Xilinx Virtex440 [ 10.682387] Modules linked in: [ 10.682528] CPU: 0 PID: 1 Comm: swapper Tainted: G W 5.0.0-rc6-next-20190218+ ivanmeler#2 [ 10.682733] NIP: c0431480 LR: c043147c CTR: c0422ad8 [ 10.682863] REGS: cf82fbe0 TRAP: 0300 Tainted: G W (5.0.0-rc6-next-20190218+) [ 10.683065] MSR: 00029000 <CE,EE,ME> CR: 22000222 XER: 00000000 [ 10.683236] DEAR: 00000040 ESR: 00000000 [ 10.683236] GPR00: c043147c cf82fc90 cf82ccc0 00000000 00000000 00000000 00000002 00000000 [ 10.683236] GPR08: 00000000 00000000 c04310bc 00000000 22000222 00000000 c0002c54 00000000 [ 10.683236] GPR16: 00000000 00000001 c09aa39c c09021b0 c09021dc 00000007 c0a68c08 00000000 [ 10.683236] GPR24: 00000001 ced6d400 ced6dcf0 c0815d9c 00000000 00000000 00000000 cedf0800 [ 10.684331] NIP [c0431480] blk_mq_run_hw_queue+0x28/0x114 [ 10.684473] LR [c043147c] blk_mq_run_hw_queue+0x24/0x114 [ 10.684602] Call Trace: [ 10.684671] [cf82fc90] [c043147c] blk_mq_run_hw_queue+0x24/0x114 (unreliable) [ 10.684854] [cf82fcc0] [c04315bc] blk_mq_run_hw_queues+0x50/0x7c [ 10.685002] [cf82fce0] [c0422b24] blk_set_queue_dying+0x30/0x68 [ 10.685154] [cf82fcf0] [c0423ec0] blk_cleanup_queue+0x34/0x14c [ 10.685306] [cf82fd10] [c054d73c] ace_probe+0x3dc/0x508 [ 10.685445] [cf82fd50] [c052d740] platform_drv_probe+0x4c/0xb8 [ 10.685592] [cf82fd70] [c052abb0] really_probe+0x20c/0x32c [ 10.685728] [cf82fda0] [c052ae58] driver_probe_device+0x68/0x464 [ 10.685877] [cf82fdc0] [c052b500] device_driver_attach+0xb4/0xe4 [ 10.686024] [cf82fde0] [c052b5dc] __driver_attach+0xac/0xfc [ 10.686161] [cf82fe00] [c0528428] bus_for_each_dev+0x80/0xc0 [ 10.686314] [cf82fe30] [c0529b3c] bus_add_driver+0x144/0x234 [ 10.686457] [cf82fe50] [c052c46c] driver_register+0x88/0x15c [ 10.686610] [cf82fe60] [c09de288] ace_init+0x4c/0xac [ 10.686742] [cf82fe80] [c0002730] do_one_initcall+0xac/0x330 [ 10.686888] [cf82fee0] [c09aafd0] kernel_init_freeable+0x34c/0x478 [ 10.687043] [cf82ff30] [c0002c6c] kernel_init+0x18/0x114 [ 10.687188] [cf82ff40] [c000f2f0] ret_from_kernel_thread+0x14/0x1c [ 10.687349] Instruction dump: [ 10.687435] 3863ffd4 4bfffd70 9421ffd0 7c0802a6 93c10028 7c9e2378 93e1002c 38810008 [ 10.687637] 7c7f1b78 90010034 4bfffc25 813f008c <81290040> 75290100 4182002c 80810008 [ 10.688056] ---[ end trace 13c9ff51d41b9d40 ]--- Fix the problem by setting the disk queue pointer to NULL before calling put_disk(). A more comprehensive fix might be to rearrange the code to check the hardware version before initializing data structures, but I don't know if this would have undesirable side effects, and it would increase the complexity of backporting the fix to older kernels. Fixes: 74489a9 ("Add support for Xilinx SystemACE CompactFlash interface") Acked-by: Michal Simek <michal.simek@xilinx.com> Signed-off-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Sasha Levin <sashal@kernel.org>
fcuzzocrea
pushed a commit
to fcuzzocrea/android_kernel_samsung_universal8890
that referenced
this issue
May 23, 2021
… sdevs) commit ef4021fe5fd77ced0323cede27979d80a56211ca upstream. When the user tries to remove a zfcp port via sysfs, we only rejected it if there are zfcp unit children under the port. With purely automatically scanned LUNs there are no zfcp units but only SCSI devices. In such cases, the port_remove erroneously continued. We close the port and this implicitly closes all LUNs under the port. The SCSI devices survive with their private zfcp_scsi_dev still holding a reference to the "removed" zfcp_port (still allocated but invisible in sysfs) [zfcp_get_port_by_wwpn in zfcp_scsi_slave_alloc]. This is not a problem as long as the fc_rport stays blocked. Once (auto) port scan brings back the removed port, we unblock its fc_rport again by design. However, there is no mechanism that would recover (open) the LUNs under the port (no "ersfs_3" without zfcp_unit [zfcp_erp_strategy_followup_success]). Any pending or new I/O to such LUN leads to repeated: Done: NEEDS_RETRY Result: hostbyte=DID_IMM_RETRY driverbyte=DRIVER_OK See also v4.10 commit 6f2ce1c6af37 ("scsi: zfcp: fix rport unblock race with LUN recovery"). Even a manual LUN recovery (echo 0 > /sys/bus/scsi/devices/H:C:T:L/zfcp_failed) does not help, as the LUN links to the old "removed" port which remains to lack ZFCP_STATUS_COMMON_RUNNING [zfcp_erp_required_act]. The only workaround is to first ensure that the fc_rport is blocked (e.g. port_remove again in case it was re-discovered by (auto) port scan), then delete the SCSI devices, and finally re-discover by (auto) port scan. The port scan includes an fc_rport unblock, which in turn triggers a new scan on the scsi target to freshly get new pure auto scan LUNs. Fix this by rejecting port_remove also if there are SCSI devices (even without any zfcp_unit) under this port. Re-use mechanics from v3.7 commit d99b601 ("[SCSI] zfcp: restore refcount check on port_remove"). However, we have to give up zfcp_sysfs_port_units_mutex earlier in unit_add to prevent a deadlock with scsi_host scan taking shost->scan_mutex first and then zfcp_sysfs_port_units_mutex now in our zfcp_scsi_slave_alloc(). Signed-off-by: Steffen Maier <maier@linux.ibm.com> Fixes: b62a8d9 ("[SCSI] zfcp: Use SCSI device data zfcp scsi dev instead of zfcp unit") Fixes: f8210e3 ("[SCSI] zfcp: Allow midlayer to scan for LUNs when running in NPIV mode") Cc: <stable@vger.kernel.org> ivanmeler#2.6.37+ Reviewed-by: Benjamin Block <bblock@linux.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@google.com> Change-Id: I63e54349997c8fc09e61e7be797c167e9c60f6f3
fcuzzocrea
pushed a commit
to fcuzzocrea/android_kernel_samsung_universal8890
that referenced
this issue
May 29, 2021
commit 61dc0a446e5d08f2de8a24b45f69a1e302bb1b1b upstream. pm_runtime_get_sync does return a error value that must be checked for error conditions, else, due to various reasons, the device maynot be enabled and the system will crash due to lack of clock to the hardware module. Before: 12.562784] [00000000] *pgd=fe193835 12.562792] Internal error: : 1406 [ivanmeler#1] SMP ARM [...] 12.562864] CPU: 1 PID: 241 Comm: modprobe Not tainted 4.7.0-rc4-next-20160624 ivanmeler#2 12.562867] Hardware name: Generic DRA74X (Flattened Device Tree) 12.562872] task: ed51f140 ti: ed44c000 task.ti: ed44c000 12.562886] PC is at omap4_rng_init+0x20/0x84 [omap_rng] 12.562899] LR is at set_current_rng+0xc0/0x154 [rng_core] [...] After the proper checks: [ 94.366705] omap_rng 48090000.rng: _od_fail_runtime_resume: FIXME: missing hwmod/omap_dev info [ 94.375767] omap_rng 48090000.rng: Failed to runtime_get device -19 [ 94.382351] omap_rng 48090000.rng: initialization failed. Fixes: 665d92f ("hwrng: OMAP: convert to use runtime PM") Cc: Paul Walmsley <paul@pwsan.com> Signed-off-by: Nishanth Menon <nm@ti.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Lee Jones <lee.jones@linaro.org> Change-Id: Ie7ee2a890630bb6b0f26c2f1ec2dbcd80bbbf286
fcuzzocrea
pushed a commit
to fcuzzocrea/android_kernel_samsung_universal8890
that referenced
this issue
May 29, 2021
[ Upstream commit f6766ff6afff70e2aaf39e1511e16d471de7c3ae ] We need to check mddev->del_work before flush workqueu since the purpose of flush is to ensure the previous md is disappeared. Otherwise the similar deadlock appeared if LOCKDEP is enabled, it is due to md_open holds the bdev->bd_mutex before flush workqueue. kernel: [ 154.522645] ====================================================== kernel: [ 154.522647] WARNING: possible circular locking dependency detected kernel: [ 154.522650] 5.6.0-rc7-lp151.27-default #25 Tainted: G O kernel: [ 154.522651] ------------------------------------------------------ kernel: [ 154.522653] mdadm/2482 is trying to acquire lock: kernel: [ 154.522655] ffff888078529128 ((wq_completion)md_misc){+.+.}, at: flush_workqueue+0x84/0x4b0 kernel: [ 154.522673] kernel: [ 154.522673] but task is already holding lock: kernel: [ 154.522675] ffff88804efa9338 (&bdev->bd_mutex){+.+.}, at: __blkdev_get+0x79/0x590 kernel: [ 154.522691] kernel: [ 154.522691] which lock already depends on the new lock. kernel: [ 154.522691] kernel: [ 154.522694] kernel: [ 154.522694] the existing dependency chain (in reverse order) is: kernel: [ 154.522696] kernel: [ 154.522696] -> ivanmeler#4 (&bdev->bd_mutex){+.+.}: kernel: [ 154.522704] __mutex_lock+0x87/0x950 kernel: [ 154.522706] __blkdev_get+0x79/0x590 kernel: [ 154.522708] blkdev_get+0x65/0x140 kernel: [ 154.522709] blkdev_get_by_dev+0x2f/0x40 kernel: [ 154.522716] lock_rdev+0x3d/0x90 [md_mod] kernel: [ 154.522719] md_import_device+0xd6/0x1b0 [md_mod] kernel: [ 154.522723] new_dev_store+0x15e/0x210 [md_mod] kernel: [ 154.522728] md_attr_store+0x7a/0xc0 [md_mod] kernel: [ 154.522732] kernfs_fop_write+0x117/0x1b0 kernel: [ 154.522735] vfs_write+0xad/0x1a0 kernel: [ 154.522737] ksys_write+0xa4/0xe0 kernel: [ 154.522745] do_syscall_64+0x64/0x2b0 kernel: [ 154.522748] entry_SYSCALL_64_after_hwframe+0x49/0xbe kernel: [ 154.522749] kernel: [ 154.522749] -> ivanmeler#3 (&mddev->reconfig_mutex){+.+.}: kernel: [ 154.522752] __mutex_lock+0x87/0x950 kernel: [ 154.522756] new_dev_store+0xc9/0x210 [md_mod] kernel: [ 154.522759] md_attr_store+0x7a/0xc0 [md_mod] kernel: [ 154.522761] kernfs_fop_write+0x117/0x1b0 kernel: [ 154.522763] vfs_write+0xad/0x1a0 kernel: [ 154.522765] ksys_write+0xa4/0xe0 kernel: [ 154.522767] do_syscall_64+0x64/0x2b0 kernel: [ 154.522769] entry_SYSCALL_64_after_hwframe+0x49/0xbe kernel: [ 154.522770] kernel: [ 154.522770] -> ivanmeler#2 (kn->count#253){++++}: kernel: [ 154.522775] __kernfs_remove+0x253/0x2c0 kernel: [ 154.522778] kernfs_remove+0x1f/0x30 kernel: [ 154.522780] kobject_del+0x28/0x60 kernel: [ 154.522783] mddev_delayed_delete+0x24/0x30 [md_mod] kernel: [ 154.522786] process_one_work+0x2a7/0x5f0 kernel: [ 154.522788] worker_thread+0x2d/0x3d0 kernel: [ 154.522793] kthread+0x117/0x130 kernel: [ 154.522795] ret_from_fork+0x3a/0x50 kernel: [ 154.522796] kernel: [ 154.522796] -> ivanmeler#1 ((work_completion)(&mddev->del_work)){+.+.}: kernel: [ 154.522800] process_one_work+0x27e/0x5f0 kernel: [ 154.522802] worker_thread+0x2d/0x3d0 kernel: [ 154.522804] kthread+0x117/0x130 kernel: [ 154.522806] ret_from_fork+0x3a/0x50 kernel: [ 154.522807] kernel: [ 154.522807] -> #0 ((wq_completion)md_misc){+.+.}: kernel: [ 154.522813] __lock_acquire+0x1392/0x1690 kernel: [ 154.522816] lock_acquire+0xb4/0x1a0 kernel: [ 154.522818] flush_workqueue+0xab/0x4b0 kernel: [ 154.522821] md_open+0xb6/0xc0 [md_mod] kernel: [ 154.522823] __blkdev_get+0xea/0x590 kernel: [ 154.522825] blkdev_get+0x65/0x140 kernel: [ 154.522828] do_dentry_open+0x1d1/0x380 kernel: [ 154.522831] path_openat+0x567/0xcc0 kernel: [ 154.522834] do_filp_open+0x9b/0x110 kernel: [ 154.522836] do_sys_openat2+0x201/0x2a0 kernel: [ 154.522838] do_sys_open+0x57/0x80 kernel: [ 154.522840] do_syscall_64+0x64/0x2b0 kernel: [ 154.522842] entry_SYSCALL_64_after_hwframe+0x49/0xbe kernel: [ 154.522844] kernel: [ 154.522844] other info that might help us debug this: kernel: [ 154.522844] kernel: [ 154.522846] Chain exists of: kernel: [ 154.522846] (wq_completion)md_misc --> &mddev->reconfig_mutex --> &bdev->bd_mutex kernel: [ 154.522846] kernel: [ 154.522850] Possible unsafe locking scenario: kernel: [ 154.522850] kernel: [ 154.522852] CPU0 CPU1 kernel: [ 154.522853] ---- ---- kernel: [ 154.522854] lock(&bdev->bd_mutex); kernel: [ 154.522856] lock(&mddev->reconfig_mutex); kernel: [ 154.522858] lock(&bdev->bd_mutex); kernel: [ 154.522860] lock((wq_completion)md_misc); kernel: [ 154.522861] kernel: [ 154.522861] *** DEADLOCK *** kernel: [ 154.522861] kernel: [ 154.522864] 1 lock held by mdadm/2482: kernel: [ 154.522865] #0: ffff88804efa9338 (&bdev->bd_mutex){+.+.}, at: __blkdev_get+0x79/0x590 kernel: [ 154.522868] kernel: [ 154.522868] stack backtrace: kernel: [ 154.522873] CPU: 1 PID: 2482 Comm: mdadm Tainted: G O 5.6.0-rc7-lp151.27-default #25 kernel: [ 154.522875] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014 kernel: [ 154.522878] Call Trace: kernel: [ 154.522881] dump_stack+0x8f/0xcb kernel: [ 154.522884] check_noncircular+0x194/0x1b0 kernel: [ 154.522888] ? __lock_acquire+0x1392/0x1690 kernel: [ 154.522890] __lock_acquire+0x1392/0x1690 kernel: [ 154.522893] lock_acquire+0xb4/0x1a0 kernel: [ 154.522895] ? flush_workqueue+0x84/0x4b0 kernel: [ 154.522898] flush_workqueue+0xab/0x4b0 kernel: [ 154.522900] ? flush_workqueue+0x84/0x4b0 kernel: [ 154.522905] ? md_open+0xb6/0xc0 [md_mod] kernel: [ 154.522908] md_open+0xb6/0xc0 [md_mod] kernel: [ 154.522910] __blkdev_get+0xea/0x590 kernel: [ 154.522912] ? bd_acquire+0xc0/0xc0 kernel: [ 154.522914] blkdev_get+0x65/0x140 kernel: [ 154.522916] ? bd_acquire+0xc0/0xc0 kernel: [ 154.522918] do_dentry_open+0x1d1/0x380 kernel: [ 154.522921] path_openat+0x567/0xcc0 kernel: [ 154.522923] ? __lock_acquire+0x380/0x1690 kernel: [ 154.522926] do_filp_open+0x9b/0x110 kernel: [ 154.522929] ? __alloc_fd+0xe5/0x1f0 kernel: [ 154.522935] ? kmem_cache_alloc+0x28c/0x630 kernel: [ 154.522939] ? do_sys_openat2+0x201/0x2a0 kernel: [ 154.522941] do_sys_openat2+0x201/0x2a0 kernel: [ 154.522944] do_sys_open+0x57/0x80 kernel: [ 154.522946] do_syscall_64+0x64/0x2b0 kernel: [ 154.522948] entry_SYSCALL_64_after_hwframe+0x49/0xbe kernel: [ 154.522951] RIP: 0033:0x7f98d279d9ae And md_alloc also flushed the same workqueue, but the thing is different here. Because all the paths call md_alloc don't hold bdev->bd_mutex, and the flush is necessary to avoid race condition, so leave it as it is. Signed-off-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com> Signed-off-by: Song Liu <songliubraving@fb.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Lee Jones <lee.jones@linaro.org> Change-Id: Ifb73108ae7c83ced33e92e19ff7893aa4ff6658a
fcuzzocrea
pushed a commit
to fcuzzocrea/android_kernel_samsung_universal8890
that referenced
this issue
May 29, 2021
commit 420902c9d086848a7548c83e0a49021514bd71b7 upstream. If we hold the superblock lock while calling reiserfs_quota_on_mount(), we can deadlock our own worker - mount blocks kworker/3:2, sleeps forever more. crash> ps|grep UN 715 2 3 ffff880220734d30 UN 0.0 0 0 [kworker/3:2] 9369 9341 2 ffff88021ffb7560 UN 1.3 493404 123184 Xorg 9665 9664 3 ffff880225b92ab0 UN 0.0 47368 812 udisks-daemon 10635 10403 3 ffff880222f22c70 UN 0.0 14904 936 mount crash> bt ffff880220734d30 PID: 715 TASK: ffff880220734d30 CPU: 3 COMMAND: "kworker/3:2" #0 [ffff8802244c3c20] schedule at ffffffff8144584b ivanmeler#1 [ffff8802244c3cc8] __rt_mutex_slowlock at ffffffff814472b3 ivanmeler#2 [ffff8802244c3d28] rt_mutex_slowlock at ffffffff814473f5 ivanmeler#3 [ffff8802244c3dc8] reiserfs_write_lock at ffffffffa05f28fd [reiserfs] ivanmeler#4 [ffff8802244c3de8] flush_async_commits at ffffffffa05ec91d [reiserfs] ivanmeler#5 [ffff8802244c3e08] process_one_work at ffffffff81073726 ivanmeler#6 [ffff8802244c3e68] worker_thread at ffffffff81073eba ivanmeler#7 [ffff8802244c3ec8] kthread at ffffffff810782e0 ivanmeler#8 [ffff8802244c3f48] kernel_thread_helper at ffffffff81450064 crash> rd ffff8802244c3cc8 10 ffff8802244c3cc8: ffffffff814472b3 ffff880222f23250 .rD.....P2.".... ffff8802244c3cd8: 0000000000000000 0000000000000286 ................ ffff8802244c3ce8: ffff8802244c3d30 ffff880220734d80 0=L$.....Ms .... ffff8802244c3cf8: ffff880222e8f628 0000000000000000 (.."............ ffff8802244c3d08: 0000000000000000 0000000000000002 ................ crash> struct rt_mutex ffff880222e8f628 struct rt_mutex { wait_lock = { raw_lock = { slock = 65537 } }, wait_list = { node_list = { next = 0xffff8802244c3d48, prev = 0xffff8802244c3d48 } }, owner = 0xffff880222f22c71, save_state = 0 } crash> bt 0xffff880222f22c70 PID: 10635 TASK: ffff880222f22c70 CPU: 3 COMMAND: "mount" #0 [ffff8802216a9868] schedule at ffffffff8144584b ivanmeler#1 [ffff8802216a9910] schedule_timeout at ffffffff81446865 ivanmeler#2 [ffff8802216a99a0] wait_for_common at ffffffff81445f74 ivanmeler#3 [ffff8802216a9a30] flush_work at ffffffff810712d3 ivanmeler#4 [ffff8802216a9ab0] schedule_on_each_cpu at ffffffff81074463 ivanmeler#5 [ffff8802216a9ae0] invalidate_bdev at ffffffff81178aba ivanmeler#6 [ffff8802216a9af0] vfs_load_quota_inode at ffffffff811a3632 ivanmeler#7 [ffff8802216a9b50] dquot_quota_on_mount at ffffffff811a375c ivanmeler#8 [ffff8802216a9b80] finish_unfinished at ffffffffa05dd8b0 [reiserfs] #9 [ffff8802216a9cc0] reiserfs_fill_super at ffffffffa05de825 [reiserfs] RIP: 00007f7b9303997a RSP: 00007ffff443c7a8 RFLAGS: 00010202 RAX: 00000000000000a5 RBX: ffffffff8144ef12 RCX: 00007f7b932e9ee0 RDX: 00007f7b93d9a400 RSI: 00007f7b93d9a3e0 RDI: 00007f7b93d9a3c0 RBP: 00007f7b93d9a2c0 R8: 00007f7b93d9a550 R9: 0000000000000001 R10: ffffffffc0ed040e R11: 0000000000000202 R12: 000000000000040e R13: 0000000000000000 R14: 00000000c0ed040e R15: 00007ffff443ca20 ORIG_RAX: 00000000000000a5 CS: 0033 SS: 002b Signed-off-by: Mike Galbraith <efault@gmx.de> Acked-by: Frederic Weisbecker <fweisbec@gmail.com> Acked-by: Mike Galbraith <mgalbraith@suse.de> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Lee Jones <lee.jones@linaro.org> Change-Id: Ib963e18e5994a15e985b567c8b504ef91484e66d
fcuzzocrea
pushed a commit
to fcuzzocrea/android_kernel_samsung_universal8890
that referenced
this issue
May 29, 2021
[ Upstream commit 2d3a8e2deddea6c89961c422ec0c5b851e648c14 ] In blkdev_get() we call __blkdev_get() to do some internal jobs and if there is some errors in __blkdev_get(), the bdput() is called which means we have released the refcount of the bdev (actually the refcount of the bdev inode). This means we cannot access bdev after that point. But acctually bdev is still accessed in blkdev_get() after calling __blkdev_get(). This results in use-after-free if the refcount is the last one we released in __blkdev_get(). Let's take a look at the following scenerio: CPU0 CPU1 CPU2 blkdev_open blkdev_open Remove disk bd_acquire blkdev_get __blkdev_get del_gendisk bdev_unhash_inode bd_acquire bdev_get_gendisk bd_forget failed because of unhashed bdput bdput (the last one) bdev_evict_inode access bdev => use after free [ 459.350216] BUG: KASAN: use-after-free in __lock_acquire+0x24c1/0x31b0 [ 459.351190] Read of size 8 at addr ffff88806c815a80 by task syz-executor.0/20132 [ 459.352347] [ 459.352594] CPU: 0 PID: 20132 Comm: syz-executor.0 Not tainted 4.19.90 ivanmeler#2 [ 459.353628] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014 [ 459.354947] Call Trace: [ 459.355337] dump_stack+0x111/0x19e [ 459.355879] ? __lock_acquire+0x24c1/0x31b0 [ 459.356523] print_address_description+0x60/0x223 [ 459.357248] ? __lock_acquire+0x24c1/0x31b0 [ 459.357887] kasan_report.cold+0xae/0x2d8 [ 459.358503] __lock_acquire+0x24c1/0x31b0 [ 459.359120] ? _raw_spin_unlock_irq+0x24/0x40 [ 459.359784] ? lockdep_hardirqs_on+0x37b/0x580 [ 459.360465] ? _raw_spin_unlock_irq+0x24/0x40 [ 459.361123] ? finish_task_switch+0x125/0x600 [ 459.361812] ? finish_task_switch+0xee/0x600 [ 459.362471] ? mark_held_locks+0xf0/0xf0 [ 459.363108] ? __schedule+0x96f/0x21d0 [ 459.363716] lock_acquire+0x111/0x320 [ 459.364285] ? blkdev_get+0xce/0xbe0 [ 459.364846] ? blkdev_get+0xce/0xbe0 [ 459.365390] __mutex_lock+0xf9/0x12a0 [ 459.365948] ? blkdev_get+0xce/0xbe0 [ 459.366493] ? bdev_evict_inode+0x1f0/0x1f0 [ 459.367130] ? blkdev_get+0xce/0xbe0 [ 459.367678] ? destroy_inode+0xbc/0x110 [ 459.368261] ? mutex_trylock+0x1a0/0x1a0 [ 459.368867] ? __blkdev_get+0x3e6/0x1280 [ 459.369463] ? bdev_disk_changed+0x1d0/0x1d0 [ 459.370114] ? blkdev_get+0xce/0xbe0 [ 459.370656] blkdev_get+0xce/0xbe0 [ 459.371178] ? find_held_lock+0x2c/0x110 [ 459.371774] ? __blkdev_get+0x1280/0x1280 [ 459.372383] ? lock_downgrade+0x680/0x680 [ 459.373002] ? lock_acquire+0x111/0x320 [ 459.373587] ? bd_acquire+0x21/0x2c0 [ 459.374134] ? do_raw_spin_unlock+0x4f/0x250 [ 459.374780] blkdev_open+0x202/0x290 [ 459.375325] do_dentry_open+0x49e/0x1050 [ 459.375924] ? blkdev_get_by_dev+0x70/0x70 [ 459.376543] ? __x64_sys_fchdir+0x1f0/0x1f0 [ 459.377192] ? inode_permission+0xbe/0x3a0 [ 459.377818] path_openat+0x148c/0x3f50 [ 459.378392] ? kmem_cache_alloc+0xd5/0x280 [ 459.379016] ? entry_SYSCALL_64_after_hwframe+0x49/0xbe [ 459.379802] ? path_lookupat.isra.0+0x900/0x900 [ 459.380489] ? __lock_is_held+0xad/0x140 [ 459.381093] do_filp_open+0x1a1/0x280 [ 459.381654] ? may_open_dev+0xf0/0xf0 [ 459.382214] ? find_held_lock+0x2c/0x110 [ 459.382816] ? lock_downgrade+0x680/0x680 [ 459.383425] ? __lock_is_held+0xad/0x140 [ 459.384024] ? do_raw_spin_unlock+0x4f/0x250 [ 459.384668] ? _raw_spin_unlock+0x1f/0x30 [ 459.385280] ? __alloc_fd+0x448/0x560 [ 459.385841] do_sys_open+0x3c3/0x500 [ 459.386386] ? filp_open+0x70/0x70 [ 459.386911] ? trace_hardirqs_on_thunk+0x1a/0x1c [ 459.387610] ? trace_hardirqs_off_caller+0x55/0x1c0 [ 459.388342] ? do_syscall_64+0x1a/0x520 [ 459.388930] do_syscall_64+0xc3/0x520 [ 459.389490] entry_SYSCALL_64_after_hwframe+0x49/0xbe [ 459.390248] RIP: 0033:0x416211 [ 459.390720] Code: 75 14 b8 02 00 00 00 0f 05 48 3d 01 f0 ff ff 0f 83 04 19 00 00 c3 48 83 ec 08 e8 0a fa ff ff 48 89 04 24 b8 02 00 00 00 0f 05 <48> 8b 3c 24 48 89 c2 e8 53 fa ff ff 48 89 d0 48 83 c4 08 48 3d 01 [ 459.393483] RSP: 002b:00007fe45dfe9a60 EFLAGS: 00000293 ORIG_RAX: 0000000000000002 [ 459.394610] RAX: ffffffffffffffda RBX: 00007fe45dfea6d4 RCX: 0000000000416211 [ 459.395678] RDX: 00007fe45dfe9b0a RSI: 0000000000000002 RDI: 00007fe45dfe9b00 [ 459.396758] RBP: 000000000076bf20 R08: 0000000000000000 R09: 000000000000000a [ 459.397930] R10: 0000000000000075 R11: 0000000000000293 R12: 00000000ffffffff [ 459.399022] R13: 0000000000000bd9 R14: 00000000004cdb80 R15: 000000000076bf2c [ 459.400168] [ 459.400430] Allocated by task 20132: [ 459.401038] kasan_kmalloc+0xbf/0xe0 [ 459.401652] kmem_cache_alloc+0xd5/0x280 [ 459.402330] bdev_alloc_inode+0x18/0x40 [ 459.402970] alloc_inode+0x5f/0x180 [ 459.403510] iget5_locked+0x57/0xd0 [ 459.404095] bdget+0x94/0x4e0 [ 459.404607] bd_acquire+0xfa/0x2c0 [ 459.405113] blkdev_open+0x110/0x290 [ 459.405702] do_dentry_open+0x49e/0x1050 [ 459.406340] path_openat+0x148c/0x3f50 [ 459.406926] do_filp_open+0x1a1/0x280 [ 459.407471] do_sys_open+0x3c3/0x500 [ 459.408010] do_syscall_64+0xc3/0x520 [ 459.408572] entry_SYSCALL_64_after_hwframe+0x49/0xbe [ 459.409415] [ 459.409679] Freed by task 1262: [ 459.410212] __kasan_slab_free+0x129/0x170 [ 459.410919] kmem_cache_free+0xb2/0x2a0 [ 459.411564] rcu_process_callbacks+0xbb2/0x2320 [ 459.412318] __do_softirq+0x225/0x8ac Fix this by delaying bdput() to the end of blkdev_get() which means we have finished accessing bdev. Fixes: 77ea887 ("implement in-kernel gendisk events handling") Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Jason Yan <yanaijie@huawei.com> Tested-by: Sedat Dilek <sedat.dilek@gmail.com> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dan Carpenter <dan.carpenter@oracle.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Jens Axboe <axboe@kernel.dk> Cc: Ming Lei <ming.lei@redhat.com> Cc: Jan Kara <jack@suse.cz> Cc: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Lee Jones <lee.jones@linaro.org> Change-Id: I642bd3c3224cc174ba4bdd57133b1a17fb35dccd
fcuzzocrea
pushed a commit
to fcuzzocrea/android_kernel_samsung_universal8890
that referenced
this issue
May 29, 2021
commit e5a15e17a78d58f933d17cafedfcf7486a29f5b4 upstream. The following kernel panic was captured when running nfs server over ocfs2, at that time ocfs2_test_inode_bit() was checking whether one inode locating at "blkno" 5 was valid, that is ocfs2 root inode, its "suballoc_slot" was OCFS2_INVALID_SLOT(65535) and it was allocted from //global_inode_alloc, but here it wrongly assumed that it was got from per slot inode alloctor which would cause array overflow and trigger kernel panic. BUG: unable to handle kernel paging request at 0000000000001088 IP: [<ffffffff816f6898>] _raw_spin_lock+0x18/0xf0 PGD 1e06ba067 PUD 1e9e7d067 PMD 0 Oops: 0002 [ivanmeler#1] SMP CPU: 6 PID: 24873 Comm: nfsd Not tainted 4.1.12-124.36.1.el6uek.x86_64 ivanmeler#2 Hardware name: Huawei CH121 V3/IT11SGCA1, BIOS 3.87 02/02/2018 RIP: _raw_spin_lock+0x18/0xf0 RSP: e02b:ffff88005ae97908 EFLAGS: 00010206 RAX: ffff88005ae98000 RBX: 0000000000001088 RCX: 0000000000000000 RDX: 0000000000020000 RSI: 0000000000000009 RDI: 0000000000001088 RBP: ffff88005ae97928 R08: 0000000000000000 R09: ffff880212878e00 R10: 0000000000007ff0 R11: 0000000000000000 R12: 0000000000001088 R13: ffff8800063c0aa8 R14: ffff8800650c27d0 R15: 000000000000ffff FS: 0000000000000000(0000) GS:ffff880218180000(0000) knlGS:ffff880218180000 CS: e033 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000001088 CR3: 00000002033d0000 CR4: 0000000000042660 Call Trace: igrab+0x1e/0x60 ocfs2_get_system_file_inode+0x63/0x3a0 [ocfs2] ocfs2_test_inode_bit+0x328/0xa00 [ocfs2] ocfs2_get_parent+0xba/0x3e0 [ocfs2] reconnect_path+0xb5/0x300 exportfs_decode_fh+0xf6/0x2b0 fh_verify+0x350/0x660 [nfsd] nfsd4_putfh+0x4d/0x60 [nfsd] nfsd4_proc_compound+0x3d3/0x6f0 [nfsd] nfsd_dispatch+0xe0/0x290 [nfsd] svc_process_common+0x412/0x6a0 [sunrpc] svc_process+0x123/0x210 [sunrpc] nfsd+0xff/0x170 [nfsd] kthread+0xcb/0xf0 ret_from_fork+0x61/0x90 Code: 83 c2 02 0f b7 f2 e8 18 dc 91 ff 66 90 eb bf 0f 1f 40 00 55 48 89 e5 41 56 41 55 41 54 53 0f 1f 44 00 00 48 89 fb ba 00 00 02 00 <f0> 0f c1 17 89 d0 45 31 e4 45 31 ed c1 e8 10 66 39 d0 41 89 c6 RIP _raw_spin_lock+0x18/0xf0 CR2: 0000000000001088 ---[ end trace 7264463cd1aac8f9 ]--- Kernel panic - not syncing: Fatal exception Link: http://lkml.kernel.org/r/20200616183829.87211-4-junxiao.bi@oracle.com Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com> Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Gang He <ghe@suse.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Jun Piao <piaojun@huawei.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Lee Jones <lee.jones@linaro.org> Change-Id: I80c73eaa064510ddb5a043e899f103be5c73deaa
fcuzzocrea
pushed a commit
to fcuzzocrea/android_kernel_samsung_universal8890
that referenced
this issue
May 29, 2021
[ Upstream commit 440ab9e10e2e6e5fd677473ee6f9e3af0f6904d6 ] At times when I'm using kgdb I see a splat on my console about suspicious RCU usage. I managed to come up with a case that could reproduce this that looked like this: WARNING: suspicious RCU usage 5.7.0-rc4+ #609 Not tainted ----------------------------- kernel/pid.c:395 find_task_by_pid_ns() needs rcu_read_lock() protection! other info that might help us debug this: rcu_scheduler_active = 2, debug_locks = 1 3 locks held by swapper/0/1: #0: ffffff81b6b8e988 (&dev->mutex){....}-{3:3}, at: __device_attach+0x40/0x13c ivanmeler#1: ffffffd01109e9e8 (dbg_master_lock){....}-{2:2}, at: kgdb_cpu_enter+0x20c/0x7ac ivanmeler#2: ffffffd01109ea90 (dbg_slave_lock){....}-{2:2}, at: kgdb_cpu_enter+0x3ec/0x7ac stack backtrace: CPU: 7 PID: 1 Comm: swapper/0 Not tainted 5.7.0-rc4+ #609 Hardware name: Google Cheza (rev3+) (DT) Call trace: dump_backtrace+0x0/0x1b8 show_stack+0x1c/0x24 dump_stack+0xd4/0x134 lockdep_rcu_suspicious+0xf0/0x100 find_task_by_pid_ns+0x5c/0x80 getthread+0x8c/0xb0 gdb_serial_stub+0x9d4/0xd04 kgdb_cpu_enter+0x284/0x7ac kgdb_handle_exception+0x174/0x20c kgdb_brk_fn+0x24/0x30 call_break_hook+0x6c/0x7c brk_handler+0x20/0x5c do_debug_exception+0x1c8/0x22c el1_sync_handler+0x3c/0xe4 el1_sync+0x7c/0x100 rpmh_rsc_probe+0x38/0x420 platform_drv_probe+0x94/0xb4 really_probe+0x134/0x300 driver_probe_device+0x68/0x100 __device_attach_driver+0x90/0xa8 bus_for_each_drv+0x84/0xcc __device_attach+0xb4/0x13c device_initial_probe+0x18/0x20 bus_probe_device+0x38/0x98 device_add+0x38c/0x420 If I understand properly we should just be able to blanket kgdb under one big RCU read lock and the problem should go away. We'll add it to the beast-of-a-function known as kgdb_cpu_enter(). With this I no longer get any splats and things seem to work fine. Signed-off-by: Douglas Anderson <dianders@chromium.org> Link: https://lore.kernel.org/r/20200602154729.v2.1.I70e0d4fd46d5ed2aaf0c98a355e8e1b7a5bb7e4e@changeid Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Lee Jones <lee.jones@linaro.org> Change-Id: I1e2027ebb31d38df230299f27f850c70bfa59f56
fcuzzocrea
pushed a commit
to fcuzzocrea/android_kernel_samsung_universal8890
that referenced
this issue
May 30, 2021
[ Upstream commit 2d3a8e2deddea6c89961c422ec0c5b851e648c14 ] In blkdev_get() we call __blkdev_get() to do some internal jobs and if there is some errors in __blkdev_get(), the bdput() is called which means we have released the refcount of the bdev (actually the refcount of the bdev inode). This means we cannot access bdev after that point. But acctually bdev is still accessed in blkdev_get() after calling __blkdev_get(). This results in use-after-free if the refcount is the last one we released in __blkdev_get(). Let's take a look at the following scenerio: CPU0 CPU1 CPU2 blkdev_open blkdev_open Remove disk bd_acquire blkdev_get __blkdev_get del_gendisk bdev_unhash_inode bd_acquire bdev_get_gendisk bd_forget failed because of unhashed bdput bdput (the last one) bdev_evict_inode access bdev => use after free [ 459.350216] BUG: KASAN: use-after-free in __lock_acquire+0x24c1/0x31b0 [ 459.351190] Read of size 8 at addr ffff88806c815a80 by task syz-executor.0/20132 [ 459.352347] [ 459.352594] CPU: 0 PID: 20132 Comm: syz-executor.0 Not tainted 4.19.90 ivanmeler#2 [ 459.353628] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014 [ 459.354947] Call Trace: [ 459.355337] dump_stack+0x111/0x19e [ 459.355879] ? __lock_acquire+0x24c1/0x31b0 [ 459.356523] print_address_description+0x60/0x223 [ 459.357248] ? __lock_acquire+0x24c1/0x31b0 [ 459.357887] kasan_report.cold+0xae/0x2d8 [ 459.358503] __lock_acquire+0x24c1/0x31b0 [ 459.359120] ? _raw_spin_unlock_irq+0x24/0x40 [ 459.359784] ? lockdep_hardirqs_on+0x37b/0x580 [ 459.360465] ? _raw_spin_unlock_irq+0x24/0x40 [ 459.361123] ? finish_task_switch+0x125/0x600 [ 459.361812] ? finish_task_switch+0xee/0x600 [ 459.362471] ? mark_held_locks+0xf0/0xf0 [ 459.363108] ? __schedule+0x96f/0x21d0 [ 459.363716] lock_acquire+0x111/0x320 [ 459.364285] ? blkdev_get+0xce/0xbe0 [ 459.364846] ? blkdev_get+0xce/0xbe0 [ 459.365390] __mutex_lock+0xf9/0x12a0 [ 459.365948] ? blkdev_get+0xce/0xbe0 [ 459.366493] ? bdev_evict_inode+0x1f0/0x1f0 [ 459.367130] ? blkdev_get+0xce/0xbe0 [ 459.367678] ? destroy_inode+0xbc/0x110 [ 459.368261] ? mutex_trylock+0x1a0/0x1a0 [ 459.368867] ? __blkdev_get+0x3e6/0x1280 [ 459.369463] ? bdev_disk_changed+0x1d0/0x1d0 [ 459.370114] ? blkdev_get+0xce/0xbe0 [ 459.370656] blkdev_get+0xce/0xbe0 [ 459.371178] ? find_held_lock+0x2c/0x110 [ 459.371774] ? __blkdev_get+0x1280/0x1280 [ 459.372383] ? lock_downgrade+0x680/0x680 [ 459.373002] ? lock_acquire+0x111/0x320 [ 459.373587] ? bd_acquire+0x21/0x2c0 [ 459.374134] ? do_raw_spin_unlock+0x4f/0x250 [ 459.374780] blkdev_open+0x202/0x290 [ 459.375325] do_dentry_open+0x49e/0x1050 [ 459.375924] ? blkdev_get_by_dev+0x70/0x70 [ 459.376543] ? __x64_sys_fchdir+0x1f0/0x1f0 [ 459.377192] ? inode_permission+0xbe/0x3a0 [ 459.377818] path_openat+0x148c/0x3f50 [ 459.378392] ? kmem_cache_alloc+0xd5/0x280 [ 459.379016] ? entry_SYSCALL_64_after_hwframe+0x49/0xbe [ 459.379802] ? path_lookupat.isra.0+0x900/0x900 [ 459.380489] ? __lock_is_held+0xad/0x140 [ 459.381093] do_filp_open+0x1a1/0x280 [ 459.381654] ? may_open_dev+0xf0/0xf0 [ 459.382214] ? find_held_lock+0x2c/0x110 [ 459.382816] ? lock_downgrade+0x680/0x680 [ 459.383425] ? __lock_is_held+0xad/0x140 [ 459.384024] ? do_raw_spin_unlock+0x4f/0x250 [ 459.384668] ? _raw_spin_unlock+0x1f/0x30 [ 459.385280] ? __alloc_fd+0x448/0x560 [ 459.385841] do_sys_open+0x3c3/0x500 [ 459.386386] ? filp_open+0x70/0x70 [ 459.386911] ? trace_hardirqs_on_thunk+0x1a/0x1c [ 459.387610] ? trace_hardirqs_off_caller+0x55/0x1c0 [ 459.388342] ? do_syscall_64+0x1a/0x520 [ 459.388930] do_syscall_64+0xc3/0x520 [ 459.389490] entry_SYSCALL_64_after_hwframe+0x49/0xbe [ 459.390248] RIP: 0033:0x416211 [ 459.390720] Code: 75 14 b8 02 00 00 00 0f 05 48 3d 01 f0 ff ff 0f 83 04 19 00 00 c3 48 83 ec 08 e8 0a fa ff ff 48 89 04 24 b8 02 00 00 00 0f 05 <48> 8b 3c 24 48 89 c2 e8 53 fa ff ff 48 89 d0 48 83 c4 08 48 3d 01 [ 459.393483] RSP: 002b:00007fe45dfe9a60 EFLAGS: 00000293 ORIG_RAX: 0000000000000002 [ 459.394610] RAX: ffffffffffffffda RBX: 00007fe45dfea6d4 RCX: 0000000000416211 [ 459.395678] RDX: 00007fe45dfe9b0a RSI: 0000000000000002 RDI: 00007fe45dfe9b00 [ 459.396758] RBP: 000000000076bf20 R08: 0000000000000000 R09: 000000000000000a [ 459.397930] R10: 0000000000000075 R11: 0000000000000293 R12: 00000000ffffffff [ 459.399022] R13: 0000000000000bd9 R14: 00000000004cdb80 R15: 000000000076bf2c [ 459.400168] [ 459.400430] Allocated by task 20132: [ 459.401038] kasan_kmalloc+0xbf/0xe0 [ 459.401652] kmem_cache_alloc+0xd5/0x280 [ 459.402330] bdev_alloc_inode+0x18/0x40 [ 459.402970] alloc_inode+0x5f/0x180 [ 459.403510] iget5_locked+0x57/0xd0 [ 459.404095] bdget+0x94/0x4e0 [ 459.404607] bd_acquire+0xfa/0x2c0 [ 459.405113] blkdev_open+0x110/0x290 [ 459.405702] do_dentry_open+0x49e/0x1050 [ 459.406340] path_openat+0x148c/0x3f50 [ 459.406926] do_filp_open+0x1a1/0x280 [ 459.407471] do_sys_open+0x3c3/0x500 [ 459.408010] do_syscall_64+0xc3/0x520 [ 459.408572] entry_SYSCALL_64_after_hwframe+0x49/0xbe [ 459.409415] [ 459.409679] Freed by task 1262: [ 459.410212] __kasan_slab_free+0x129/0x170 [ 459.410919] kmem_cache_free+0xb2/0x2a0 [ 459.411564] rcu_process_callbacks+0xbb2/0x2320 [ 459.412318] __do_softirq+0x225/0x8ac Fix this by delaying bdput() to the end of blkdev_get() which means we have finished accessing bdev. Fixes: 77ea887 ("implement in-kernel gendisk events handling") Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Jason Yan <yanaijie@huawei.com> Tested-by: Sedat Dilek <sedat.dilek@gmail.com> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Dan Carpenter <dan.carpenter@oracle.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Jens Axboe <axboe@kernel.dk> Cc: Ming Lei <ming.lei@redhat.com> Cc: Jan Kara <jack@suse.cz> Cc: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Lee Jones <lee.jones@linaro.org> Change-Id: I642bd3c3224cc174ba4bdd57133b1a17fb35dccd
fcuzzocrea
pushed a commit
to fcuzzocrea/android_kernel_samsung_universal8890
that referenced
this issue
May 30, 2021
commit e5a15e17a78d58f933d17cafedfcf7486a29f5b4 upstream. The following kernel panic was captured when running nfs server over ocfs2, at that time ocfs2_test_inode_bit() was checking whether one inode locating at "blkno" 5 was valid, that is ocfs2 root inode, its "suballoc_slot" was OCFS2_INVALID_SLOT(65535) and it was allocted from //global_inode_alloc, but here it wrongly assumed that it was got from per slot inode alloctor which would cause array overflow and trigger kernel panic. BUG: unable to handle kernel paging request at 0000000000001088 IP: [<ffffffff816f6898>] _raw_spin_lock+0x18/0xf0 PGD 1e06ba067 PUD 1e9e7d067 PMD 0 Oops: 0002 [ivanmeler#1] SMP CPU: 6 PID: 24873 Comm: nfsd Not tainted 4.1.12-124.36.1.el6uek.x86_64 ivanmeler#2 Hardware name: Huawei CH121 V3/IT11SGCA1, BIOS 3.87 02/02/2018 RIP: _raw_spin_lock+0x18/0xf0 RSP: e02b:ffff88005ae97908 EFLAGS: 00010206 RAX: ffff88005ae98000 RBX: 0000000000001088 RCX: 0000000000000000 RDX: 0000000000020000 RSI: 0000000000000009 RDI: 0000000000001088 RBP: ffff88005ae97928 R08: 0000000000000000 R09: ffff880212878e00 R10: 0000000000007ff0 R11: 0000000000000000 R12: 0000000000001088 R13: ffff8800063c0aa8 R14: ffff8800650c27d0 R15: 000000000000ffff FS: 0000000000000000(0000) GS:ffff880218180000(0000) knlGS:ffff880218180000 CS: e033 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000001088 CR3: 00000002033d0000 CR4: 0000000000042660 Call Trace: igrab+0x1e/0x60 ocfs2_get_system_file_inode+0x63/0x3a0 [ocfs2] ocfs2_test_inode_bit+0x328/0xa00 [ocfs2] ocfs2_get_parent+0xba/0x3e0 [ocfs2] reconnect_path+0xb5/0x300 exportfs_decode_fh+0xf6/0x2b0 fh_verify+0x350/0x660 [nfsd] nfsd4_putfh+0x4d/0x60 [nfsd] nfsd4_proc_compound+0x3d3/0x6f0 [nfsd] nfsd_dispatch+0xe0/0x290 [nfsd] svc_process_common+0x412/0x6a0 [sunrpc] svc_process+0x123/0x210 [sunrpc] nfsd+0xff/0x170 [nfsd] kthread+0xcb/0xf0 ret_from_fork+0x61/0x90 Code: 83 c2 02 0f b7 f2 e8 18 dc 91 ff 66 90 eb bf 0f 1f 40 00 55 48 89 e5 41 56 41 55 41 54 53 0f 1f 44 00 00 48 89 fb ba 00 00 02 00 <f0> 0f c1 17 89 d0 45 31 e4 45 31 ed c1 e8 10 66 39 d0 41 89 c6 RIP _raw_spin_lock+0x18/0xf0 CR2: 0000000000001088 ---[ end trace 7264463cd1aac8f9 ]--- Kernel panic - not syncing: Fatal exception Link: http://lkml.kernel.org/r/20200616183829.87211-4-junxiao.bi@oracle.com Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com> Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Gang He <ghe@suse.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Jun Piao <piaojun@huawei.com> Cc: Mark Fasheh <mark@fasheh.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Lee Jones <lee.jones@linaro.org> Change-Id: I80c73eaa064510ddb5a043e899f103be5c73deaa
fcuzzocrea
pushed a commit
to fcuzzocrea/android_kernel_samsung_universal8890
that referenced
this issue
May 30, 2021
[ Upstream commit 440ab9e10e2e6e5fd677473ee6f9e3af0f6904d6 ] At times when I'm using kgdb I see a splat on my console about suspicious RCU usage. I managed to come up with a case that could reproduce this that looked like this: WARNING: suspicious RCU usage 5.7.0-rc4+ #609 Not tainted ----------------------------- kernel/pid.c:395 find_task_by_pid_ns() needs rcu_read_lock() protection! other info that might help us debug this: rcu_scheduler_active = 2, debug_locks = 1 3 locks held by swapper/0/1: #0: ffffff81b6b8e988 (&dev->mutex){....}-{3:3}, at: __device_attach+0x40/0x13c ivanmeler#1: ffffffd01109e9e8 (dbg_master_lock){....}-{2:2}, at: kgdb_cpu_enter+0x20c/0x7ac ivanmeler#2: ffffffd01109ea90 (dbg_slave_lock){....}-{2:2}, at: kgdb_cpu_enter+0x3ec/0x7ac stack backtrace: CPU: 7 PID: 1 Comm: swapper/0 Not tainted 5.7.0-rc4+ #609 Hardware name: Google Cheza (rev3+) (DT) Call trace: dump_backtrace+0x0/0x1b8 show_stack+0x1c/0x24 dump_stack+0xd4/0x134 lockdep_rcu_suspicious+0xf0/0x100 find_task_by_pid_ns+0x5c/0x80 getthread+0x8c/0xb0 gdb_serial_stub+0x9d4/0xd04 kgdb_cpu_enter+0x284/0x7ac kgdb_handle_exception+0x174/0x20c kgdb_brk_fn+0x24/0x30 call_break_hook+0x6c/0x7c brk_handler+0x20/0x5c do_debug_exception+0x1c8/0x22c el1_sync_handler+0x3c/0xe4 el1_sync+0x7c/0x100 rpmh_rsc_probe+0x38/0x420 platform_drv_probe+0x94/0xb4 really_probe+0x134/0x300 driver_probe_device+0x68/0x100 __device_attach_driver+0x90/0xa8 bus_for_each_drv+0x84/0xcc __device_attach+0xb4/0x13c device_initial_probe+0x18/0x20 bus_probe_device+0x38/0x98 device_add+0x38c/0x420 If I understand properly we should just be able to blanket kgdb under one big RCU read lock and the problem should go away. We'll add it to the beast-of-a-function known as kgdb_cpu_enter(). With this I no longer get any splats and things seem to work fine. Signed-off-by: Douglas Anderson <dianders@chromium.org> Link: https://lore.kernel.org/r/20200602154729.v2.1.I70e0d4fd46d5ed2aaf0c98a355e8e1b7a5bb7e4e@changeid Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Lee Jones <lee.jones@linaro.org> Change-Id: I1e2027ebb31d38df230299f27f850c70bfa59f56
fcuzzocrea
pushed a commit
to fcuzzocrea/android_kernel_samsung_universal8890
that referenced
this issue
May 30, 2021
commit a545715d2dae8d071c5b06af947b07ffa846b288 upstream. When removing and adding cpu 0 on a system with GHES NMI the following stack trace is seen when re-adding the cpu: WARNING: CPU: 0 PID: 0 at arch/x86/kernel/apic/apic.c:1349 setup_local_APIC+ Modules linked in: nfsv3 rpcsec_gss_krb5 nfsv4 nfs fscache coretemp intel_ra CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.9.0-rc6+ ivanmeler#2 Call Trace: dump_stack+0x63/0x8e __warn+0xd1/0xf0 warn_slowpath_null+0x1d/0x20 setup_local_APIC+0x275/0x370 apic_ap_setup+0xe/0x20 start_secondary+0x48/0x180 set_init_arg+0x55/0x55 early_idt_handler_array+0x120/0x120 x86_64_start_reservations+0x2a/0x2c x86_64_start_kernel+0x13d/0x14c During the cpu bringup, wakeup_cpu_via_init_nmi() is called and issues an NMI on CPU 0. The GHES NMI handler, ghes_notify_nmi() runs the ghes_proc_irq_work work queue which ends up setting IRQ_WORK_VECTOR (0xf6). The "faulty" IR line set at arch/x86/kernel/apic/apic.c:1349 is also 0xf6 (specifically APIC IRR for irqs 255 to 224 is 0x400000) which confirms that something has set the IRQ_WORK_VECTOR line prior to the APIC being initialized. Commit 2383844d4850 ("GHES: Elliminate double-loop in the NMI handler") incorrectly modified the behavior such that the handler returns NMI_HANDLED only if an error was processed, and incorrectly runs the ghes work queue for every NMI. This patch modifies the ghes_proc_irq_work() to run as it did prior to 2383844d4850 ("GHES: Elliminate double-loop in the NMI handler") by properly returning NMI_HANDLED and only calling the work queue if NMI_HANDLED has been set. Fixes: 2383844d4850 (GHES: Elliminate double-loop in the NMI handler) Signed-off-by: Prarit Bhargava <prarit@redhat.com> Reviewed-by: Borislav Petkov <bp@suse.de> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Lee Jones <lee.jones@linaro.org> Change-Id: I87dea2f093ff9383a6848471f052629d62da8c93
fcuzzocrea
pushed a commit
to fcuzzocrea/android_kernel_samsung_universal8890
that referenced
this issue
May 30, 2021
[ Upstream commit 5a25de6df789cc805a9b8ba7ab5deef5067af47e ] Freeing chip on error may lead to an Oops at the next time the system goes to resume. Fix this by removing all snd_echo_free() calls on error. Fixes: 47b5d02 ("ALSA: Echoaudio - Add suspend support ivanmeler#2") Signed-off-by: Dinghao Liu <dinghao.liu@zju.edu.cn> Link: https://lore.kernel.org/r/20200813074632.17022-1-dinghao.liu@zju.edu.cn Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Lee Jones <lee.jones@linaro.org> Change-Id: Idcc188a7f0b4f06cd015f6b2d21e096263a19334
fcuzzocrea
pushed a commit
to fcuzzocrea/android_kernel_samsung_universal8890
that referenced
this issue
May 30, 2021
[ Upstream commit 71a174b39f10b4b93223d374722aa894b5d8a82e ] b6da31b2c07c "tty: Fix data race in tty_insert_flip_string_fixed_flag" puts tty_flip_buffer_push under port->lock introducing the following possible circular locking dependency: [30129.876566] ====================================================== [30129.876566] WARNING: possible circular locking dependency detected [30129.876567] 5.9.0-rc2+ ivanmeler#3 Tainted: G S W [30129.876568] ------------------------------------------------------ [30129.876568] sysrq.sh/1222 is trying to acquire lock: [30129.876569] ffffffff92c39480 (console_owner){....}-{0:0}, at: console_unlock+0x3fe/0xa90 [30129.876572] but task is already holding lock: [30129.876572] ffff888107cb9018 (&pool->lock/1){-.-.}-{2:2}, at: show_workqueue_state.cold.55+0x15b/0x6ca [30129.876576] which lock already depends on the new lock. [30129.876577] the existing dependency chain (in reverse order) is: [30129.876578] -> ivanmeler#3 (&pool->lock/1){-.-.}-{2:2}: [30129.876581] _raw_spin_lock+0x30/0x70 [30129.876581] __queue_work+0x1a3/0x10f0 [30129.876582] queue_work_on+0x78/0x80 [30129.876582] pty_write+0x165/0x1e0 [30129.876583] n_tty_write+0x47f/0xf00 [30129.876583] tty_write+0x3d6/0x8d0 [30129.876584] vfs_write+0x1a8/0x650 [30129.876588] -> ivanmeler#2 (&port->lock#2){-.-.}-{2:2}: [30129.876590] _raw_spin_lock_irqsave+0x3b/0x80 [30129.876591] tty_port_tty_get+0x1d/0xb0 [30129.876592] tty_port_default_wakeup+0xb/0x30 [30129.876592] serial8250_tx_chars+0x3d6/0x970 [30129.876593] serial8250_handle_irq.part.12+0x216/0x380 [30129.876593] serial8250_default_handle_irq+0x82/0xe0 [30129.876594] serial8250_interrupt+0xdd/0x1b0 [30129.876595] __handle_irq_event_percpu+0xfc/0x850 [30129.876602] -> ivanmeler#1 (&port->lock){-.-.}-{2:2}: [30129.876605] _raw_spin_lock_irqsave+0x3b/0x80 [30129.876605] serial8250_console_write+0x12d/0x900 [30129.876606] console_unlock+0x679/0xa90 [30129.876606] register_console+0x371/0x6e0 [30129.876607] univ8250_console_init+0x24/0x27 [30129.876607] console_init+0x2f9/0x45e [30129.876609] -> #0 (console_owner){....}-{0:0}: [30129.876611] __lock_acquire+0x2f70/0x4e90 [30129.876612] lock_acquire+0x1ac/0xad0 [30129.876612] console_unlock+0x460/0xa90 [30129.876613] vprintk_emit+0x130/0x420 [30129.876613] printk+0x9f/0xc5 [30129.876614] show_pwq+0x154/0x618 [30129.876615] show_workqueue_state.cold.55+0x193/0x6ca [30129.876615] __handle_sysrq+0x244/0x460 [30129.876616] write_sysrq_trigger+0x48/0x4a [30129.876616] proc_reg_write+0x1a6/0x240 [30129.876617] vfs_write+0x1a8/0x650 [30129.876619] other info that might help us debug this: [30129.876620] Chain exists of: [30129.876621] console_owner --> &port->lock#2 --> &pool->lock/1 [30129.876625] Possible unsafe locking scenario: [30129.876626] CPU0 CPU1 [30129.876626] ---- ---- [30129.876627] lock(&pool->lock/1); [30129.876628] lock(&port->lock#2); [30129.876630] lock(&pool->lock/1); [30129.876631] lock(console_owner); [30129.876633] *** DEADLOCK *** [30129.876634] 5 locks held by sysrq.sh/1222: [30129.876634] #0: ffff8881d3ce0470 (sb_writers#3){.+.+}-{0:0}, at: vfs_write+0x359/0x650 [30129.876637] ivanmeler#1: ffffffff92c612c0 (rcu_read_lock){....}-{1:2}, at: __handle_sysrq+0x4d/0x460 [30129.876640] ivanmeler#2: ffffffff92c612c0 (rcu_read_lock){....}-{1:2}, at: show_workqueue_state+0x5/0xf0 [30129.876642] ivanmeler#3: ffff888107cb9018 (&pool->lock/1){-.-.}-{2:2}, at: show_workqueue_state.cold.55+0x15b/0x6ca [30129.876645] ivanmeler#4: ffffffff92c39980 (console_lock){+.+.}-{0:0}, at: vprintk_emit+0x123/0x420 [30129.876648] stack backtrace: [30129.876649] CPU: 3 PID: 1222 Comm: sysrq.sh Tainted: G S W 5.9.0-rc2+ ivanmeler#3 [30129.876649] Hardware name: Intel Corporation 2012 Client Platform/Emerald Lake 2, BIOS ACRVMBY1.86C.0078.P00.1201161002 01/16/2012 [30129.876650] Call Trace: [30129.876650] dump_stack+0x9d/0xe0 [30129.876651] check_noncircular+0x34f/0x410 [30129.876653] __lock_acquire+0x2f70/0x4e90 [30129.876656] lock_acquire+0x1ac/0xad0 [30129.876658] console_unlock+0x460/0xa90 [30129.876660] vprintk_emit+0x130/0x420 [30129.876660] printk+0x9f/0xc5 [30129.876661] show_pwq+0x154/0x618 [30129.876662] show_workqueue_state.cold.55+0x193/0x6ca [30129.876664] __handle_sysrq+0x244/0x460 [30129.876665] write_sysrq_trigger+0x48/0x4a [30129.876665] proc_reg_write+0x1a6/0x240 [30129.876666] vfs_write+0x1a8/0x650 It looks like the commit was aimed to protect tty_insert_flip_string and there is no need for tty_flip_buffer_push to be under this lock. Fixes: b6da31b2c07c ("tty: Fix data race in tty_insert_flip_string_fixed_flag") Signed-off-by: Artem Savkov <asavkov@redhat.com> Acked-by: Jiri Slaby <jirislaby@kernel.org> Link: https://lore.kernel.org/r/20200902120045.3693075-1-asavkov@redhat.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Lee Jones <lee.jones@linaro.org> Change-Id: I5756d33e597fc4b2be5bf1f6cd184084e60628d6
fcuzzocrea
pushed a commit
to fcuzzocrea/android_kernel_samsung_universal8890
that referenced
this issue
May 30, 2021
[ Upstream commit 71a174b39f10b4b93223d374722aa894b5d8a82e ] b6da31b2c07c "tty: Fix data race in tty_insert_flip_string_fixed_flag" puts tty_flip_buffer_push under port->lock introducing the following possible circular locking dependency: [30129.876566] ====================================================== [30129.876566] WARNING: possible circular locking dependency detected [30129.876567] 5.9.0-rc2+ ivanmeler#3 Tainted: G S W [30129.876568] ------------------------------------------------------ [30129.876568] sysrq.sh/1222 is trying to acquire lock: [30129.876569] ffffffff92c39480 (console_owner){....}-{0:0}, at: console_unlock+0x3fe/0xa90 [30129.876572] but task is already holding lock: [30129.876572] ffff888107cb9018 (&pool->lock/1){-.-.}-{2:2}, at: show_workqueue_state.cold.55+0x15b/0x6ca [30129.876576] which lock already depends on the new lock. [30129.876577] the existing dependency chain (in reverse order) is: [30129.876578] -> ivanmeler#3 (&pool->lock/1){-.-.}-{2:2}: [30129.876581] _raw_spin_lock+0x30/0x70 [30129.876581] __queue_work+0x1a3/0x10f0 [30129.876582] queue_work_on+0x78/0x80 [30129.876582] pty_write+0x165/0x1e0 [30129.876583] n_tty_write+0x47f/0xf00 [30129.876583] tty_write+0x3d6/0x8d0 [30129.876584] vfs_write+0x1a8/0x650 [30129.876588] -> ivanmeler#2 (&port->lock#2){-.-.}-{2:2}: [30129.876590] _raw_spin_lock_irqsave+0x3b/0x80 [30129.876591] tty_port_tty_get+0x1d/0xb0 [30129.876592] tty_port_default_wakeup+0xb/0x30 [30129.876592] serial8250_tx_chars+0x3d6/0x970 [30129.876593] serial8250_handle_irq.part.12+0x216/0x380 [30129.876593] serial8250_default_handle_irq+0x82/0xe0 [30129.876594] serial8250_interrupt+0xdd/0x1b0 [30129.876595] __handle_irq_event_percpu+0xfc/0x850 [30129.876602] -> ivanmeler#1 (&port->lock){-.-.}-{2:2}: [30129.876605] _raw_spin_lock_irqsave+0x3b/0x80 [30129.876605] serial8250_console_write+0x12d/0x900 [30129.876606] console_unlock+0x679/0xa90 [30129.876606] register_console+0x371/0x6e0 [30129.876607] univ8250_console_init+0x24/0x27 [30129.876607] console_init+0x2f9/0x45e [30129.876609] -> #0 (console_owner){....}-{0:0}: [30129.876611] __lock_acquire+0x2f70/0x4e90 [30129.876612] lock_acquire+0x1ac/0xad0 [30129.876612] console_unlock+0x460/0xa90 [30129.876613] vprintk_emit+0x130/0x420 [30129.876613] printk+0x9f/0xc5 [30129.876614] show_pwq+0x154/0x618 [30129.876615] show_workqueue_state.cold.55+0x193/0x6ca [30129.876615] __handle_sysrq+0x244/0x460 [30129.876616] write_sysrq_trigger+0x48/0x4a [30129.876616] proc_reg_write+0x1a6/0x240 [30129.876617] vfs_write+0x1a8/0x650 [30129.876619] other info that might help us debug this: [30129.876620] Chain exists of: [30129.876621] console_owner --> &port->lock#2 --> &pool->lock/1 [30129.876625] Possible unsafe locking scenario: [30129.876626] CPU0 CPU1 [30129.876626] ---- ---- [30129.876627] lock(&pool->lock/1); [30129.876628] lock(&port->lock#2); [30129.876630] lock(&pool->lock/1); [30129.876631] lock(console_owner); [30129.876633] *** DEADLOCK *** [30129.876634] 5 locks held by sysrq.sh/1222: [30129.876634] #0: ffff8881d3ce0470 (sb_writers#3){.+.+}-{0:0}, at: vfs_write+0x359/0x650 [30129.876637] ivanmeler#1: ffffffff92c612c0 (rcu_read_lock){....}-{1:2}, at: __handle_sysrq+0x4d/0x460 [30129.876640] ivanmeler#2: ffffffff92c612c0 (rcu_read_lock){....}-{1:2}, at: show_workqueue_state+0x5/0xf0 [30129.876642] ivanmeler#3: ffff888107cb9018 (&pool->lock/1){-.-.}-{2:2}, at: show_workqueue_state.cold.55+0x15b/0x6ca [30129.876645] ivanmeler#4: ffffffff92c39980 (console_lock){+.+.}-{0:0}, at: vprintk_emit+0x123/0x420 [30129.876648] stack backtrace: [30129.876649] CPU: 3 PID: 1222 Comm: sysrq.sh Tainted: G S W 5.9.0-rc2+ ivanmeler#3 [30129.876649] Hardware name: Intel Corporation 2012 Client Platform/Emerald Lake 2, BIOS ACRVMBY1.86C.0078.P00.1201161002 01/16/2012 [30129.876650] Call Trace: [30129.876650] dump_stack+0x9d/0xe0 [30129.876651] check_noncircular+0x34f/0x410 [30129.876653] __lock_acquire+0x2f70/0x4e90 [30129.876656] lock_acquire+0x1ac/0xad0 [30129.876658] console_unlock+0x460/0xa90 [30129.876660] vprintk_emit+0x130/0x420 [30129.876660] printk+0x9f/0xc5 [30129.876661] show_pwq+0x154/0x618 [30129.876662] show_workqueue_state.cold.55+0x193/0x6ca [30129.876664] __handle_sysrq+0x244/0x460 [30129.876665] write_sysrq_trigger+0x48/0x4a [30129.876665] proc_reg_write+0x1a6/0x240 [30129.876666] vfs_write+0x1a8/0x650 It looks like the commit was aimed to protect tty_insert_flip_string and there is no need for tty_flip_buffer_push to be under this lock. Fixes: b6da31b2c07c ("tty: Fix data race in tty_insert_flip_string_fixed_flag") Signed-off-by: Artem Savkov <asavkov@redhat.com> Acked-by: Jiri Slaby <jirislaby@kernel.org> Link: https://lore.kernel.org/r/20200902120045.3693075-1-asavkov@redhat.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Lee Jones <lee.jones@linaro.org> Change-Id: I5756d33e597fc4b2be5bf1f6cd184084e60628d6
fcuzzocrea
pushed a commit
to fcuzzocrea/android_kernel_samsung_universal8890
that referenced
this issue
May 30, 2021
[ Upstream commit e773ca7da8beeca7f17fe4c9d1284a2b66839cc1 ] Actually, burst size is equal to '1 << desc->rqcfg.brst_size'. we should use burst size, not desc->rqcfg.brst_size. dma memcpy performance on Rockchip RV1126 @ 1512MHz A7, 1056MHz LPDDR3, 200MHz DMA: dmatest: /# echo dma0chan0 > /sys/module/dmatest/parameters/channel /# echo 4194304 > /sys/module/dmatest/parameters/test_buf_size /# echo 8 > /sys/module/dmatest/parameters/iterations /# echo y > /sys/module/dmatest/parameters/norandom /# echo y > /sys/module/dmatest/parameters/verbose /# echo 1 > /sys/module/dmatest/parameters/run dmatest: dma0chan0-copy0: result ivanmeler#1: 'test passed' with src_off=0x0 dst_off=0x0 len=0x400000 dmatest: dma0chan0-copy0: result ivanmeler#2: 'test passed' with src_off=0x0 dst_off=0x0 len=0x400000 dmatest: dma0chan0-copy0: result ivanmeler#3: 'test passed' with src_off=0x0 dst_off=0x0 len=0x400000 dmatest: dma0chan0-copy0: result ivanmeler#4: 'test passed' with src_off=0x0 dst_off=0x0 len=0x400000 dmatest: dma0chan0-copy0: result ivanmeler#5: 'test passed' with src_off=0x0 dst_off=0x0 len=0x400000 dmatest: dma0chan0-copy0: result ivanmeler#6: 'test passed' with src_off=0x0 dst_off=0x0 len=0x400000 dmatest: dma0chan0-copy0: result ivanmeler#7: 'test passed' with src_off=0x0 dst_off=0x0 len=0x400000 dmatest: dma0chan0-copy0: result ivanmeler#8: 'test passed' with src_off=0x0 dst_off=0x0 len=0x400000 Before: dmatest: dma0chan0-copy0: summary 8 tests, 0 failures 48 iops 200338 KB/s (0) After this patch: dmatest: dma0chan0-copy0: summary 8 tests, 0 failures 179 iops 734873 KB/s (0) After this patch and increase dma clk to 400MHz: dmatest: dma0chan0-copy0: summary 8 tests, 0 failures 259 iops 1062929 KB/s (0) Signed-off-by: Sugar Zhang <sugar.zhang@rock-chips.com> Link: https://lore.kernel.org/r/1605326106-55681-1-git-send-email-sugar.zhang@rock-chips.com Signed-off-by: Vinod Koul <vkoul@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Lee Jones <lee.jones@linaro.org> Change-Id: Ifd84b76d02837e91073e5207961b50fae7f32429
fcuzzocrea
pushed a commit
to fcuzzocrea/android_kernel_samsung_universal8890
that referenced
this issue
May 30, 2021
commit 50a0cb565246f20d59cdb161778531e4b19d35ac upstream. Starting with this commit 35eb8b81edd4 ("x86/efi: Build our own page table structures") efi regions have a separate page directory called "efi_pgd". In order to access any efi region we have to first shift %cr3 to this page table. In the bgrt code we are trying to copy bgrt_header and image, but these regions fall under "EFI_BOOT_SERVICES_DATA" and to access these regions we have to shift %cr3 to efi_pgd and not doing so will cause page fault as shown below. [ 0.251599] Last level dTLB entries: 4KB 64, 2MB 0, 4MB 0, 1GB 4 [ 0.259126] Freeing SMP alternatives memory: 32K (ffffffff8230e000 - ffffffff82316000) [ 0.271803] BUG: unable to handle kernel paging request at fffffffefce35002 [ 0.279740] IP: [<ffffffff821bca49>] efi_bgrt_init+0x144/0x1fd [ 0.286383] PGD 300f067 PUD 0 [ 0.289879] Oops: 0000 [ivanmeler#1] SMP [ 0.293566] Modules linked in: [ 0.297039] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.4.0-rc1-eywa-eywa-built-in-47041+ ivanmeler#2 [ 0.306619] Hardware name: Intel Corporation Skylake Client platform/Skylake Y LPDDR3 RVP3, BIOS SKLSE2R1.R00.B104.B01.1511110114 11/11/2015 [ 0.320925] task: ffffffff820134c0 ti: ffffffff82000000 task.ti: ffffffff82000000 [ 0.329420] RIP: 0010:[<ffffffff821bca49>] [<ffffffff821bca49>] efi_bgrt_init+0x144/0x1fd [ 0.338821] RSP: 0000:ffffffff82003f18 EFLAGS: 00010246 [ 0.344852] RAX: fffffffefce35000 RBX: fffffffefce35000 RCX: fffffffefce2b000 [ 0.352952] RDX: 000000008a82b000 RSI: ffffffff8235bb80 RDI: 000000008a835000 [ 0.361050] RBP: ffffffff82003f30 R08: 000000008a865000 R09: ffffffffff202850 [ 0.369149] R10: ffffffff811ad62f R11: 0000000000000000 R12: 0000000000000000 [ 0.377248] R13: ffff88016dbaea40 R14: ffffffff822622c0 R15: ffffffff82003fb0 [ 0.385348] FS: 0000000000000000(0000) GS:ffff88016d800000(0000) knlGS:0000000000000000 [ 0.394533] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 0.401054] CR2: fffffffefce35002 CR3: 000000000300c000 CR4: 00000000003406f0 [ 0.409153] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 0.417252] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 0.425350] Stack: [ 0.427638] ffffffffffffffff ffffffff82256900 ffff88016dbaea40 ffffffff82003f40 [ 0.436086] ffffffff821bbce0 ffffffff82003f88 ffffffff8219c0c2 0000000000000000 [ 0.444533] ffffffff8219ba4a ffffffff822622c0 0000000000083000 00000000ffffffff [ 0.452978] Call Trace: [ 0.455763] [<ffffffff821bbce0>] efi_late_init+0x9/0xb [ 0.461697] [<ffffffff8219c0c2>] start_kernel+0x463/0x47f [ 0.467928] [<ffffffff8219ba4a>] ? set_init_arg+0x55/0x55 [ 0.474159] [<ffffffff8219b120>] ? early_idt_handler_array+0x120/0x120 [ 0.481669] [<ffffffff8219b5ee>] x86_64_start_reservations+0x2a/0x2c [ 0.488982] [<ffffffff8219b72d>] x86_64_start_kernel+0x13d/0x14c [ 0.495897] Code: 00 41 b4 01 48 8b 78 28 e8 09 36 01 00 48 85 c0 48 89 c3 75 13 48 c7 c7 f8 ac d3 81 31 c0 e8 d7 3b fb fe e9 b5 00 00 00 45 84 e4 <44> 8b 6b 02 74 0d be 06 00 00 00 48 89 df e8 ae 34 0$ [ 0.518151] RIP [<ffffffff821bca49>] efi_bgrt_init+0x144/0x1fd [ 0.524888] RSP <ffffffff82003f18> [ 0.528851] CR2: fffffffefce35002 [ 0.532615] ---[ end trace 7b06521e6ebf2aea ]--- [ 0.537852] Kernel panic - not syncing: Attempted to kill the idle task! As said above one way to fix this bug is to shift %cr3 to efi_pgd but we are not doing that way because it leaks inner details of how we switch to EFI page tables into a new call site and it also adds duplicate code. Instead, we remove the call to efi_lookup_mapped_addr() and always perform early_mem*() instead of early_io*() because we want to remap RAM regions and not I/O regions. We also delete efi_lookup_mapped_addr() because we are no longer using it. Signed-off-by: Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com> Reported-by: Wendy Wang <wendy.wang@intel.com> Cc: Borislav Petkov <bp@suse.de> Cc: Josh Triplett <josh@joshtriplett.org> Cc: Ricardo Neri <ricardo.neri@intel.com> Cc: Ravi Shankar <ravi.v.shankar@intel.com> Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk> Cc: "Ghannam, Yazen" <Yazen.Ghannam@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Lee Jones <lee.jones@linaro.org> Change-Id: I5654370b01519fd4ec0b353fe825e9ea7df24ae8
fcuzzocrea
pushed a commit
to fcuzzocrea/android_kernel_samsung_universal8890
that referenced
this issue
May 31, 2021
[ Upstream commit ec4fbd64751de18729eaa816ec69e4b504b5a7a2 ] Dmitry reported a lockdep splat [1] (false positive) that we can fix by releasing the spinlock before calling icmp_send() from ip_expire() This is a false positive because sending an ICMP message can not possibly re-enter the IP frag engine. [1] [ INFO: possible circular locking dependency detected ] 4.10.0+ #29 Not tainted ------------------------------------------------------- modprobe/12392 is trying to acquire lock: (_xmit_ETHER#2){+.-...}, at: [<ffffffff837a8182>] spin_lock include/linux/spinlock.h:299 [inline] (_xmit_ETHER#2){+.-...}, at: [<ffffffff837a8182>] __netif_tx_lock include/linux/netdevice.h:3486 [inline] (_xmit_ETHER#2){+.-...}, at: [<ffffffff837a8182>] sch_direct_xmit+0x282/0x6d0 net/sched/sch_generic.c:180 but task is already holding lock: (&(&q->lock)->rlock){+.-...}, at: [<ffffffff8389a4d1>] spin_lock include/linux/spinlock.h:299 [inline] (&(&q->lock)->rlock){+.-...}, at: [<ffffffff8389a4d1>] ip_expire+0x51/0x6c0 net/ipv4/ip_fragment.c:201 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> ivanmeler#1 (&(&q->lock)->rlock){+.-...}: validate_chain kernel/locking/lockdep.c:2267 [inline] __lock_acquire+0x2149/0x3430 kernel/locking/lockdep.c:3340 lock_acquire+0x2a1/0x630 kernel/locking/lockdep.c:3755 __raw_spin_lock include/linux/spinlock_api_smp.h:142 [inline] _raw_spin_lock+0x33/0x50 kernel/locking/spinlock.c:151 spin_lock include/linux/spinlock.h:299 [inline] ip_defrag+0x3a2/0x4130 net/ipv4/ip_fragment.c:669 ip_check_defrag+0x4e3/0x8b0 net/ipv4/ip_fragment.c:713 packet_rcv_fanout+0x282/0x800 net/packet/af_packet.c:1459 deliver_skb net/core/dev.c:1834 [inline] dev_queue_xmit_nit+0x294/0xa90 net/core/dev.c:1890 xmit_one net/core/dev.c:2903 [inline] dev_hard_start_xmit+0x16b/0xab0 net/core/dev.c:2923 sch_direct_xmit+0x31f/0x6d0 net/sched/sch_generic.c:182 __dev_xmit_skb net/core/dev.c:3092 [inline] __dev_queue_xmit+0x13e5/0x1e60 net/core/dev.c:3358 dev_queue_xmit+0x17/0x20 net/core/dev.c:3423 neigh_resolve_output+0x6b9/0xb10 net/core/neighbour.c:1308 neigh_output include/net/neighbour.h:478 [inline] ip_finish_output2+0x8b8/0x15a0 net/ipv4/ip_output.c:228 ip_do_fragment+0x1d93/0x2720 net/ipv4/ip_output.c:672 ip_fragment.constprop.54+0x145/0x200 net/ipv4/ip_output.c:545 ip_finish_output+0x82d/0xe10 net/ipv4/ip_output.c:314 NF_HOOK_COND include/linux/netfilter.h:246 [inline] ip_output+0x1f0/0x7a0 net/ipv4/ip_output.c:404 dst_output include/net/dst.h:486 [inline] ip_local_out+0x95/0x170 net/ipv4/ip_output.c:124 ip_send_skb+0x3c/0xc0 net/ipv4/ip_output.c:1492 ip_push_pending_frames+0x64/0x80 net/ipv4/ip_output.c:1512 raw_sendmsg+0x26de/0x3a00 net/ipv4/raw.c:655 inet_sendmsg+0x164/0x5b0 net/ipv4/af_inet.c:761 sock_sendmsg_nosec net/socket.c:633 [inline] sock_sendmsg+0xca/0x110 net/socket.c:643 ___sys_sendmsg+0x4a3/0x9f0 net/socket.c:1985 __sys_sendmmsg+0x25c/0x750 net/socket.c:2075 SYSC_sendmmsg net/socket.c:2106 [inline] SyS_sendmmsg+0x35/0x60 net/socket.c:2101 do_syscall_64+0x2e8/0x930 arch/x86/entry/common.c:281 return_from_SYSCALL_64+0x0/0x7a -> #0 (_xmit_ETHER#2){+.-...}: check_prev_add kernel/locking/lockdep.c:1830 [inline] check_prevs_add+0xa8f/0x19f0 kernel/locking/lockdep.c:1940 validate_chain kernel/locking/lockdep.c:2267 [inline] __lock_acquire+0x2149/0x3430 kernel/locking/lockdep.c:3340 lock_acquire+0x2a1/0x630 kernel/locking/lockdep.c:3755 __raw_spin_lock include/linux/spinlock_api_smp.h:142 [inline] _raw_spin_lock+0x33/0x50 kernel/locking/spinlock.c:151 spin_lock include/linux/spinlock.h:299 [inline] __netif_tx_lock include/linux/netdevice.h:3486 [inline] sch_direct_xmit+0x282/0x6d0 net/sched/sch_generic.c:180 __dev_xmit_skb net/core/dev.c:3092 [inline] __dev_queue_xmit+0x13e5/0x1e60 net/core/dev.c:3358 dev_queue_xmit+0x17/0x20 net/core/dev.c:3423 neigh_hh_output include/net/neighbour.h:468 [inline] neigh_output include/net/neighbour.h:476 [inline] ip_finish_output2+0xf6c/0x15a0 net/ipv4/ip_output.c:228 ip_finish_output+0xa29/0xe10 net/ipv4/ip_output.c:316 NF_HOOK_COND include/linux/netfilter.h:246 [inline] ip_output+0x1f0/0x7a0 net/ipv4/ip_output.c:404 dst_output include/net/dst.h:486 [inline] ip_local_out+0x95/0x170 net/ipv4/ip_output.c:124 ip_send_skb+0x3c/0xc0 net/ipv4/ip_output.c:1492 ip_push_pending_frames+0x64/0x80 net/ipv4/ip_output.c:1512 icmp_push_reply+0x372/0x4d0 net/ipv4/icmp.c:394 icmp_send+0x156c/0x1c80 net/ipv4/icmp.c:754 ip_expire+0x40e/0x6c0 net/ipv4/ip_fragment.c:239 call_timer_fn+0x241/0x820 kernel/time/timer.c:1268 expire_timers kernel/time/timer.c:1307 [inline] __run_timers+0x960/0xcf0 kernel/time/timer.c:1601 run_timer_softirq+0x21/0x80 kernel/time/timer.c:1614 __do_softirq+0x31f/0xbe7 kernel/softirq.c:284 invoke_softirq kernel/softirq.c:364 [inline] irq_exit+0x1cc/0x200 kernel/softirq.c:405 exiting_irq arch/x86/include/asm/apic.h:657 [inline] smp_apic_timer_interrupt+0x76/0xa0 arch/x86/kernel/apic/apic.c:962 apic_timer_interrupt+0x93/0xa0 arch/x86/entry/entry_64.S:707 __read_once_size include/linux/compiler.h:254 [inline] atomic_read arch/x86/include/asm/atomic.h:26 [inline] rcu_dynticks_curr_cpu_in_eqs kernel/rcu/tree.c:350 [inline] __rcu_is_watching kernel/rcu/tree.c:1133 [inline] rcu_is_watching+0x83/0x110 kernel/rcu/tree.c:1147 rcu_read_lock_held+0x87/0xc0 kernel/rcu/update.c:293 radix_tree_deref_slot include/linux/radix-tree.h:238 [inline] filemap_map_pages+0x6d4/0x1570 mm/filemap.c:2335 do_fault_around mm/memory.c:3231 [inline] do_read_fault mm/memory.c:3265 [inline] do_fault+0xbd5/0x2080 mm/memory.c:3370 handle_pte_fault mm/memory.c:3600 [inline] __handle_mm_fault+0x1062/0x2cb0 mm/memory.c:3714 handle_mm_fault+0x1e2/0x480 mm/memory.c:3751 __do_page_fault+0x4f6/0xb60 arch/x86/mm/fault.c:1397 do_page_fault+0x54/0x70 arch/x86/mm/fault.c:1460 page_fault+0x28/0x30 arch/x86/entry/entry_64.S:1011 other info that might help us debug this: Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(&(&q->lock)->rlock); lock(_xmit_ETHER#2); lock(&(&q->lock)->rlock); lock(_xmit_ETHER#2); *** DEADLOCK *** 10 locks held by modprobe/12392: #0: (&mm->mmap_sem){++++++}, at: [<ffffffff81329758>] __do_page_fault+0x2b8/0xb60 arch/x86/mm/fault.c:1336 ivanmeler#1: (rcu_read_lock){......}, at: [<ffffffff8188cab6>] filemap_map_pages+0x1e6/0x1570 mm/filemap.c:2324 ivanmeler#2: (&(ptlock_ptr(page))->rlock#2){+.+...}, at: [<ffffffff81984a78>] spin_lock include/linux/spinlock.h:299 [inline] ivanmeler#2: (&(ptlock_ptr(page))->rlock#2){+.+...}, at: [<ffffffff81984a78>] pte_alloc_one_map mm/memory.c:2944 [inline] ivanmeler#2: (&(ptlock_ptr(page))->rlock#2){+.+...}, at: [<ffffffff81984a78>] alloc_set_pte+0x13b8/0x1b90 mm/memory.c:3072 ivanmeler#3: (((&q->timer))){+.-...}, at: [<ffffffff81627e72>] lockdep_copy_map include/linux/lockdep.h:175 [inline] ivanmeler#3: (((&q->timer))){+.-...}, at: [<ffffffff81627e72>] call_timer_fn+0x1c2/0x820 kernel/time/timer.c:1258 ivanmeler#4: (&(&q->lock)->rlock){+.-...}, at: [<ffffffff8389a4d1>] spin_lock include/linux/spinlock.h:299 [inline] ivanmeler#4: (&(&q->lock)->rlock){+.-...}, at: [<ffffffff8389a4d1>] ip_expire+0x51/0x6c0 net/ipv4/ip_fragment.c:201 ivanmeler#5: (rcu_read_lock){......}, at: [<ffffffff8389a633>] ip_expire+0x1b3/0x6c0 net/ipv4/ip_fragment.c:216 ivanmeler#6: (slock-AF_INET){+.-...}, at: [<ffffffff839b3313>] spin_trylock include/linux/spinlock.h:309 [inline] ivanmeler#6: (slock-AF_INET){+.-...}, at: [<ffffffff839b3313>] icmp_xmit_lock net/ipv4/icmp.c:219 [inline] ivanmeler#6: (slock-AF_INET){+.-...}, at: [<ffffffff839b3313>] icmp_send+0x803/0x1c80 net/ipv4/icmp.c:681 ivanmeler#7: (rcu_read_lock_bh){......}, at: [<ffffffff838ab9a1>] ip_finish_output2+0x2c1/0x15a0 net/ipv4/ip_output.c:198 ivanmeler#8: (rcu_read_lock_bh){......}, at: [<ffffffff836d1dee>] __dev_queue_xmit+0x23e/0x1e60 net/core/dev.c:3324 #9: (dev->qdisc_running_key ?: &qdisc_running_key){+.....}, at: [<ffffffff836d3a27>] dev_queue_xmit+0x17/0x20 net/core/dev.c:3423 stack backtrace: CPU: 0 PID: 12392 Comm: modprobe Not tainted 4.10.0+ #29 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: <IRQ> __dump_stack lib/dump_stack.c:16 [inline] dump_stack+0x2ee/0x3ef lib/dump_stack.c:52 print_circular_bug+0x307/0x3b0 kernel/locking/lockdep.c:1204 check_prev_add kernel/locking/lockdep.c:1830 [inline] check_prevs_add+0xa8f/0x19f0 kernel/locking/lockdep.c:1940 validate_chain kernel/locking/lockdep.c:2267 [inline] __lock_acquire+0x2149/0x3430 kernel/locking/lockdep.c:3340 lock_acquire+0x2a1/0x630 kernel/locking/lockdep.c:3755 __raw_spin_lock include/linux/spinlock_api_smp.h:142 [inline] _raw_spin_lock+0x33/0x50 kernel/locking/spinlock.c:151 spin_lock include/linux/spinlock.h:299 [inline] __netif_tx_lock include/linux/netdevice.h:3486 [inline] sch_direct_xmit+0x282/0x6d0 net/sched/sch_generic.c:180 __dev_xmit_skb net/core/dev.c:3092 [inline] __dev_queue_xmit+0x13e5/0x1e60 net/core/dev.c:3358 dev_queue_xmit+0x17/0x20 net/core/dev.c:3423 neigh_hh_output include/net/neighbour.h:468 [inline] neigh_output include/net/neighbour.h:476 [inline] ip_finish_output2+0xf6c/0x15a0 net/ipv4/ip_output.c:228 ip_finish_output+0xa29/0xe10 net/ipv4/ip_output.c:316 NF_HOOK_COND include/linux/netfilter.h:246 [inline] ip_output+0x1f0/0x7a0 net/ipv4/ip_output.c:404 dst_output include/net/dst.h:486 [inline] ip_local_out+0x95/0x170 net/ipv4/ip_output.c:124 ip_send_skb+0x3c/0xc0 net/ipv4/ip_output.c:1492 ip_push_pending_frames+0x64/0x80 net/ipv4/ip_output.c:1512 icmp_push_reply+0x372/0x4d0 net/ipv4/icmp.c:394 icmp_send+0x156c/0x1c80 net/ipv4/icmp.c:754 ip_expire+0x40e/0x6c0 net/ipv4/ip_fragment.c:239 call_timer_fn+0x241/0x820 kernel/time/timer.c:1268 expire_timers kernel/time/timer.c:1307 [inline] __run_timers+0x960/0xcf0 kernel/time/timer.c:1601 run_timer_softirq+0x21/0x80 kernel/time/timer.c:1614 __do_softirq+0x31f/0xbe7 kernel/softirq.c:284 invoke_softirq kernel/softirq.c:364 [inline] irq_exit+0x1cc/0x200 kernel/softirq.c:405 exiting_irq arch/x86/include/asm/apic.h:657 [inline] smp_apic_timer_interrupt+0x76/0xa0 arch/x86/kernel/apic/apic.c:962 apic_timer_interrupt+0x93/0xa0 arch/x86/entry/entry_64.S:707 RIP: 0010:__read_once_size include/linux/compiler.h:254 [inline] RIP: 0010:atomic_read arch/x86/include/asm/atomic.h:26 [inline] RIP: 0010:rcu_dynticks_curr_cpu_in_eqs kernel/rcu/tree.c:350 [inline] RIP: 0010:__rcu_is_watching kernel/rcu/tree.c:1133 [inline] RIP: 0010:rcu_is_watching+0x83/0x110 kernel/rcu/tree.c:1147 RSP: 0000:ffff8801c391f120 EFLAGS: 00000a03 ORIG_RAX: ffffffffffffff10 RAX: dffffc0000000000 RBX: ffff8801c391f148 RCX: 0000000000000000 RDX: 0000000000000000 RSI: 000055edd4374000 RDI: ffff8801dbe1ae0c RBP: ffff8801c391f1a0 R08: 0000000000000002 R09: 0000000000000000 R10: dffffc0000000000 R11: 0000000000000002 R12: 1ffff10038723e25 R13: ffff8801dbe1ae00 R14: ffff8801c391f680 R15: dffffc0000000000 </IRQ> rcu_read_lock_held+0x87/0xc0 kernel/rcu/update.c:293 radix_tree_deref_slot include/linux/radix-tree.h:238 [inline] filemap_map_pages+0x6d4/0x1570 mm/filemap.c:2335 do_fault_around mm/memory.c:3231 [inline] do_read_fault mm/memory.c:3265 [inline] do_fault+0xbd5/0x2080 mm/memory.c:3370 handle_pte_fault mm/memory.c:3600 [inline] __handle_mm_fault+0x1062/0x2cb0 mm/memory.c:3714 handle_mm_fault+0x1e2/0x480 mm/memory.c:3751 __do_page_fault+0x4f6/0xb60 arch/x86/mm/fault.c:1397 do_page_fault+0x54/0x70 arch/x86/mm/fault.c:1460 page_fault+0x28/0x30 arch/x86/entry/entry_64.S:1011 RIP: 0033:0x7f83172f2786 RSP: 002b:00007fffe859ae80 EFLAGS: 00010293 RAX: 000055edd4373040 RBX: 00007f83175111c8 RCX: 000055edd4373238 RDX: 0000000000000000 RSI: 0000000000000000 RDI: 00007f8317510970 RBP: 00007fffe859afd0 R08: 0000000000000009 R09: 0000000000000000 R10: 0000000000000064 R11: 0000000000000000 R12: 000055edd4373040 R13: 0000000000000000 R14: 00007fffe859afe8 R15: 0000000000000000 Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <alexander.levin@verizon.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Lee Jones <lee.jones@linaro.org> Change-Id: I373ca4c676c87f8b606c93b9f0b1820ea80ce1c5
fcuzzocrea
pushed a commit
to fcuzzocrea/android_kernel_samsung_universal8890
that referenced
this issue
May 31, 2021
commit ca10845a56856fff4de3804c85e6424d0f6d0cde upstream While running btrfs/061, btrfs/073, btrfs/078, or btrfs/178 we hit the following lockdep splat: ====================================================== WARNING: possible circular locking dependency detected 5.9.0-rc3+ ivanmeler#4 Not tainted ------------------------------------------------------ kswapd0/100 is trying to acquire lock: ffff96ecc22ef4a0 (&delayed_node->mutex){+.+.}-{3:3}, at: __btrfs_release_delayed_node.part.0+0x3f/0x330 but task is already holding lock: ffffffff8dd74700 (fs_reclaim){+.+.}-{0:0}, at: __fs_reclaim_acquire+0x5/0x30 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> ivanmeler#3 (fs_reclaim){+.+.}-{0:0}: fs_reclaim_acquire+0x65/0x80 slab_pre_alloc_hook.constprop.0+0x20/0x200 kmem_cache_alloc+0x37/0x270 alloc_inode+0x82/0xb0 iget_locked+0x10d/0x2c0 kernfs_get_inode+0x1b/0x130 kernfs_get_tree+0x136/0x240 sysfs_get_tree+0x16/0x40 vfs_get_tree+0x28/0xc0 path_mount+0x434/0xc00 __x64_sys_mount+0xe3/0x120 do_syscall_64+0x33/0x40 entry_SYSCALL_64_after_hwframe+0x44/0xa9 -> ivanmeler#2 (kernfs_mutex){+.+.}-{3:3}: __mutex_lock+0x7e/0x7e0 kernfs_add_one+0x23/0x150 kernfs_create_link+0x63/0xa0 sysfs_do_create_link_sd+0x5e/0xd0 btrfs_sysfs_add_devices_dir+0x81/0x130 btrfs_init_new_device+0x67f/0x1250 btrfs_ioctl+0x1ef/0x2e20 __x64_sys_ioctl+0x83/0xb0 do_syscall_64+0x33/0x40 entry_SYSCALL_64_after_hwframe+0x44/0xa9 -> ivanmeler#1 (&fs_info->chunk_mutex){+.+.}-{3:3}: __mutex_lock+0x7e/0x7e0 btrfs_chunk_alloc+0x125/0x3a0 find_free_extent+0xdf6/0x1210 btrfs_reserve_extent+0xb3/0x1b0 btrfs_alloc_tree_block+0xb0/0x310 alloc_tree_block_no_bg_flush+0x4a/0x60 __btrfs_cow_block+0x11a/0x530 btrfs_cow_block+0x104/0x220 btrfs_search_slot+0x52e/0x9d0 btrfs_insert_empty_items+0x64/0xb0 btrfs_insert_delayed_items+0x90/0x4f0 btrfs_commit_inode_delayed_items+0x93/0x140 btrfs_log_inode+0x5de/0x2020 btrfs_log_inode_parent+0x429/0xc90 btrfs_log_new_name+0x95/0x9b btrfs_rename2+0xbb9/0x1800 vfs_rename+0x64f/0x9f0 do_renameat2+0x320/0x4e0 __x64_sys_rename+0x1f/0x30 do_syscall_64+0x33/0x40 entry_SYSCALL_64_after_hwframe+0x44/0xa9 -> #0 (&delayed_node->mutex){+.+.}-{3:3}: __lock_acquire+0x119c/0x1fc0 lock_acquire+0xa7/0x3d0 __mutex_lock+0x7e/0x7e0 __btrfs_release_delayed_node.part.0+0x3f/0x330 btrfs_evict_inode+0x24c/0x500 evict+0xcf/0x1f0 dispose_list+0x48/0x70 prune_icache_sb+0x44/0x50 super_cache_scan+0x161/0x1e0 do_shrink_slab+0x178/0x3c0 shrink_slab+0x17c/0x290 shrink_node+0x2b2/0x6d0 balance_pgdat+0x30a/0x670 kswapd+0x213/0x4c0 kthread+0x138/0x160 ret_from_fork+0x1f/0x30 other info that might help us debug this: Chain exists of: &delayed_node->mutex --> kernfs_mutex --> fs_reclaim Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(fs_reclaim); lock(kernfs_mutex); lock(fs_reclaim); lock(&delayed_node->mutex); *** DEADLOCK *** 3 locks held by kswapd0/100: #0: ffffffff8dd74700 (fs_reclaim){+.+.}-{0:0}, at: __fs_reclaim_acquire+0x5/0x30 ivanmeler#1: ffffffff8dd65c50 (shrinker_rwsem){++++}-{3:3}, at: shrink_slab+0x115/0x290 ivanmeler#2: ffff96ed2ade30e0 (&type->s_umount_key#36){++++}-{3:3}, at: super_cache_scan+0x38/0x1e0 stack backtrace: CPU: 0 PID: 100 Comm: kswapd0 Not tainted 5.9.0-rc3+ ivanmeler#4 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.13.0-2.fc32 04/01/2014 Call Trace: dump_stack+0x8b/0xb8 check_noncircular+0x12d/0x150 __lock_acquire+0x119c/0x1fc0 lock_acquire+0xa7/0x3d0 ? __btrfs_release_delayed_node.part.0+0x3f/0x330 __mutex_lock+0x7e/0x7e0 ? __btrfs_release_delayed_node.part.0+0x3f/0x330 ? __btrfs_release_delayed_node.part.0+0x3f/0x330 ? lock_acquire+0xa7/0x3d0 ? find_held_lock+0x2b/0x80 __btrfs_release_delayed_node.part.0+0x3f/0x330 btrfs_evict_inode+0x24c/0x500 evict+0xcf/0x1f0 dispose_list+0x48/0x70 prune_icache_sb+0x44/0x50 super_cache_scan+0x161/0x1e0 do_shrink_slab+0x178/0x3c0 shrink_slab+0x17c/0x290 shrink_node+0x2b2/0x6d0 balance_pgdat+0x30a/0x670 kswapd+0x213/0x4c0 ? _raw_spin_unlock_irqrestore+0x41/0x50 ? add_wait_queue_exclusive+0x70/0x70 ? balance_pgdat+0x670/0x670 kthread+0x138/0x160 ? kthread_create_worker_on_cpu+0x40/0x40 ret_from_fork+0x1f/0x30 This happens because we are holding the chunk_mutex at the time of adding in a new device. However we only need to hold the device_list_mutex, as we're going to iterate over the fs_devices devices. Move the sysfs init stuff outside of the chunk_mutex to get rid of this lockdep splat. CC: stable@vger.kernel.org # 4.4.x: f3cd2c58110dad14e: btrfs: sysfs, rename device_link add/remove functions CC: stable@vger.kernel.org # 4.4.x Reported-by: David Sterba <dsterba@suse.com> Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> [sudip: adjust context] Signed-off-by: Sudip Mukherjee <sudipm.mukherjee@gmail.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Lee Jones <lee.jones@linaro.org> Change-Id: Ia7a147f836b62017b7c216ae6ade56bc4b03d107
fcuzzocrea
pushed a commit
to fcuzzocrea/android_kernel_samsung_universal8890
that referenced
this issue
May 31, 2021
… and subvolume roots commit f32e48e925964c4f8ab917850788a87e1cef3bad upstream. The following call trace is seen when btrfs/031 test is executed in a loop, [ 158.661848] ------------[ cut here ]------------ [ 158.662634] WARNING: CPU: 2 PID: 890 at /home/chandan/repos/linux/fs/btrfs/ioctl.c:558 create_subvol+0x3d1/0x6ea() [ 158.664102] BTRFS: Transaction aborted (error -2) [ 158.664774] Modules linked in: [ 158.665266] CPU: 2 PID: 890 Comm: btrfs Not tainted 4.4.0-rc6-g511711a ivanmeler#2 [ 158.666251] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011 [ 158.667392] ffffffff81c0a6b0 ffff8806c7c4f8e8 ffffffff81431fc8 ffff8806c7c4f930 [ 158.668515] ffff8806c7c4f920 ffffffff81051aa1 ffff880c85aff000 ffff8800bb44d000 [ 158.669647] ffff8808863b5c98 0000000000000000 00000000fffffffe ffff8806c7c4f980 [ 158.670769] Call Trace: [ 158.671153] [<ffffffff81431fc8>] dump_stack+0x44/0x5c [ 158.671884] [<ffffffff81051aa1>] warn_slowpath_common+0x81/0xc0 [ 158.672769] [<ffffffff81051b27>] warn_slowpath_fmt+0x47/0x50 [ 158.673620] [<ffffffff813bc98d>] create_subvol+0x3d1/0x6ea [ 158.674440] [<ffffffff813777c9>] btrfs_mksubvol.isra.30+0x369/0x520 [ 158.675376] [<ffffffff8108a4aa>] ? percpu_down_read+0x1a/0x50 [ 158.676235] [<ffffffff81377a81>] btrfs_ioctl_snap_create_transid+0x101/0x180 [ 158.677268] [<ffffffff81377b52>] btrfs_ioctl_snap_create+0x52/0x70 [ 158.678183] [<ffffffff8137afb4>] btrfs_ioctl+0x474/0x2f90 [ 158.678975] [<ffffffff81144b8e>] ? vma_merge+0xee/0x300 [ 158.679751] [<ffffffff8115be31>] ? alloc_pages_vma+0x91/0x170 [ 158.680599] [<ffffffff81123f62>] ? lru_cache_add_active_or_unevictable+0x22/0x70 [ 158.681686] [<ffffffff813d99cf>] ? selinux_file_ioctl+0xff/0x1d0 [ 158.682581] [<ffffffff8117b791>] do_vfs_ioctl+0x2c1/0x490 [ 158.683399] [<ffffffff813d3cde>] ? security_file_ioctl+0x3e/0x60 [ 158.684297] [<ffffffff8117b9d4>] SyS_ioctl+0x74/0x80 [ 158.685051] [<ffffffff819b2bd7>] entry_SYSCALL_64_fastpath+0x12/0x6a [ 158.685958] ---[ end trace 4b63312de5a2cb76 ]--- [ 158.686647] BTRFS: error (device loop0) in create_subvol:558: errno=-2 No such entry [ 158.709508] BTRFS info (device loop0): forced readonly [ 158.737113] BTRFS info (device loop0): disk space caching is enabled [ 158.738096] BTRFS error (device loop0): Remounting read-write after error is not allowed [ 158.851303] BTRFS error (device loop0): cleaner transaction attach returned -30 This occurs because, Mount filesystem Create subvol with ID 257 Unmount filesystem Mount filesystem Delete subvol with ID 257 btrfs_drop_snapshot() Add root corresponding to subvol 257 into btrfs_transaction->dropped_roots list Create new subvol (i.e. create_subvol()) 257 is returned as the next free objectid btrfs_read_fs_root_no_name() Finds the btrfs_root instance corresponding to the old subvol with ID 257 in btrfs_fs_info->fs_roots_radix. Returns error since btrfs_root_item->refs has the value of 0. To fix the issue the commit initializes tree root's and subvolume root's highest_objectid when loading the roots from disk. Signed-off-by: Chandan Rajendra <chandan@linux.vnet.ibm.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Lee Jones <lee.jones@linaro.org> Change-Id: Iddc87cfab092f452f1d2b8537d970c2821d75403
fcuzzocrea
pushed a commit
to fcuzzocrea/android_kernel_samsung_universal8890
that referenced
this issue
May 31, 2021
commit d460131dd50599e0e9405d5f4ae02c27d529a44a upstream. Every kernel build on x86 will result in some output: Setup is 13084 bytes (padded to 13312 bytes). System is 4833 kB CRC 6d35fa35 Kernel: arch/x86/boot/bzImage is ready (ivanmeler#2) This shuts it up, so that 'make -s' is truely silent as long as everything works. Building without '-s' should produce unchanged output. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Matt Fleming <matt@codeblueprint.co.uk> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20170719125310.2487451-6-arnd@arndb.de Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Lee Jones <lee.jones@linaro.org> Change-Id: Id99b2658ef77f8763c00c6ce6ccd8ab57aa5adaa
fcuzzocrea
pushed a commit
to fcuzzocrea/android_kernel_samsung_universal8890
that referenced
this issue
May 31, 2021
[ Upstream commit 2c0aa08631b86a4678dbc93b9caa5248014b4458 ] Scenario: 1. Port down and do fail over 2. Ap do rds_bind syscall PID: 47039 TASK: ffff89887e2fe640 CPU: 47 COMMAND: "kworker/u:6" #0 [ffff898e35f159f0] machine_kexec at ffffffff8103abf9 ivanmeler#1 [ffff898e35f15a60] crash_kexec at ffffffff810b96e3 ivanmeler#2 [ffff898e35f15b30] oops_end at ffffffff8150f518 ivanmeler#3 [ffff898e35f15b60] no_context at ffffffff8104854c ivanmeler#4 [ffff898e35f15ba0] __bad_area_nosemaphore at ffffffff81048675 ivanmeler#5 [ffff898e35f15bf0] bad_area_nosemaphore at ffffffff810487d3 ivanmeler#6 [ffff898e35f15c00] do_page_fault at ffffffff815120b8 ivanmeler#7 [ffff898e35f15d10] page_fault at ffffffff8150ea95 [exception RIP: unknown or invalid address] RIP: 0000000000000000 RSP: ffff898e35f15dc8 RFLAGS: 00010282 RAX: 00000000fffffffe RBX: ffff889b77f6fc00 RCX:ffffffff81c99d88 RDX: 0000000000000000 RSI: ffff896019ee08e8 RDI:ffff889b77f6fc00 RBP: ffff898e35f15df0 R8: ffff896019ee08c8 R9:0000000000000000 R10: 0000000000000400 R11: 0000000000000000 R12:ffff896019ee08c0 R13: ffff889b77f6fe68 R14: ffffffff81c99d80 R15: ffffffffa022a1e0 ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018 ivanmeler#8 [ffff898e35f15dc8] cma_ndev_work_handler at ffffffffa022a228 [rdma_cm] #9 [ffff898e35f15df8] process_one_work at ffffffff8108a7c6 ivanmeler#10 [ffff898e35f15e58] worker_thread at ffffffff8108bda0 ivanmeler#11 [ffff898e35f15ee8] kthread at ffffffff81090fe6 PID: 45659 TASK: ffff880d313d2500 CPU: 31 COMMAND: "oracle_45659_ap" #0 [ffff881024ccfc98] __schedule at ffffffff8150bac4 ivanmeler#1 [ffff881024ccfd40] schedule at ffffffff8150c2cf ivanmeler#2 [ffff881024ccfd50] __mutex_lock_slowpath at ffffffff8150cee7 ivanmeler#3 [ffff881024ccfdc0] mutex_lock at ffffffff8150cdeb ivanmeler#4 [ffff881024ccfde0] rdma_destroy_id at ffffffffa022a027 [rdma_cm] ivanmeler#5 [ffff881024ccfe10] rds_ib_laddr_check at ffffffffa0357857 [rds_rdma] ivanmeler#6 [ffff881024ccfe50] rds_trans_get_preferred at ffffffffa0324c2a [rds] ivanmeler#7 [ffff881024ccfe80] rds_bind at ffffffffa031d690 [rds] ivanmeler#8 [ffff881024ccfeb0] sys_bind at ffffffff8142a670 PID: 45659 PID: 47039 rds_ib_laddr_check /* create id_priv with a null event_handler */ rdma_create_id rdma_bind_addr cma_acquire_dev /* add id_priv to cma_dev->id_list */ cma_attach_to_dev cma_ndev_work_handler /* event_hanlder is null */ id_priv->id.event_handler Signed-off-by: Guanglei Li <guanglei.li@oracle.com> Signed-off-by: Honglei Wang <honglei.wang@oracle.com> Reviewed-by: Junxiao Bi <junxiao.bi@oracle.com> Reviewed-by: Yanjun Zhu <yanjun.zhu@oracle.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> Acked-by: Doug Ledford <dledford@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <alexander.levin@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Lee Jones <lee.jones@linaro.org> Change-Id: I58ede1e99f457b5c910d8316148052999798ce0a
fcuzzocrea
pushed a commit
to fcuzzocrea/android_kernel_samsung_universal8890
that referenced
this issue
Jun 2, 2021
commit c7de87ff9dac5f396f62d584f3908f80ddc0e07b upstream. [ This problem is in mainline, but only rt has the chops to be able to detect it. ] Lockdep reports a circular lock dependency between serv->sv_lock and softirq_ctl.lock on system shutdown, when using a kernel built with CONFIG_PREEMPT_RT=y, and a nfs mount exists. This is due to the definition of spin_lock_bh on rt: local_bh_disable(); rt_spin_lock(lock); which forces a softirq_ctl.lock -> serv->sv_lock dependency. This is not a problem as long as _every_ lock of serv->sv_lock is a: spin_lock_bh(&serv->sv_lock); but there is one of the form: spin_lock(&serv->sv_lock); This is what is causing the circular dependency splat. The spin_lock() grabs the lock without first grabbing softirq_ctl.lock via local_bh_disable. If later on in the critical region, someone does a local_bh_disable, we get a serv->sv_lock -> softirq_ctrl.lock dependency established. Deadlock. Fix is to make serv->sv_lock be locked with spin_lock_bh everywhere, no exceptions. [ OK ] Stopped target NFS client services. Stopping Logout off all iSCSI sessions on shutdown... Stopping NFS server and services... [ 109.442380] [ 109.442385] ====================================================== [ 109.442386] WARNING: possible circular locking dependency detected [ 109.442387] 5.10.16-rt30 ivanmeler#1 Not tainted [ 109.442389] ------------------------------------------------------ [ 109.442390] nfsd/1032 is trying to acquire lock: [ 109.442392] ffff994237617f60 ((softirq_ctrl.lock).lock){+.+.}-{2:2}, at: __local_bh_disable_ip+0xd9/0x270 [ 109.442405] [ 109.442405] but task is already holding lock: [ 109.442406] ffff994245cb00b0 (&serv->sv_lock){+.+.}-{0:0}, at: svc_close_list+0x1f/0x90 [ 109.442415] [ 109.442415] which lock already depends on the new lock. [ 109.442415] [ 109.442416] [ 109.442416] the existing dependency chain (in reverse order) is: [ 109.442417] [ 109.442417] -> ivanmeler#1 (&serv->sv_lock){+.+.}-{0:0}: [ 109.442421] rt_spin_lock+0x2b/0xc0 [ 109.442428] svc_add_new_perm_xprt+0x42/0xa0 [ 109.442430] svc_addsock+0x135/0x220 [ 109.442434] write_ports+0x4b3/0x620 [ 109.442438] nfsctl_transaction_write+0x45/0x80 [ 109.442440] vfs_write+0xff/0x420 [ 109.442444] ksys_write+0x4f/0xc0 [ 109.442446] do_syscall_64+0x33/0x40 [ 109.442450] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [ 109.442454] [ 109.442454] -> #0 ((softirq_ctrl.lock).lock){+.+.}-{2:2}: [ 109.442457] __lock_acquire+0x1264/0x20b0 [ 109.442463] lock_acquire+0xc2/0x400 [ 109.442466] rt_spin_lock+0x2b/0xc0 [ 109.442469] __local_bh_disable_ip+0xd9/0x270 [ 109.442471] svc_xprt_do_enqueue+0xc0/0x4d0 [ 109.442474] svc_close_list+0x60/0x90 [ 109.442476] svc_close_net+0x49/0x1a0 [ 109.442478] svc_shutdown_net+0x12/0x40 [ 109.442480] nfsd_destroy+0xc5/0x180 [ 109.442482] nfsd+0x1bc/0x270 [ 109.442483] kthread+0x194/0x1b0 [ 109.442487] ret_from_fork+0x22/0x30 [ 109.442492] [ 109.442492] other info that might help us debug this: [ 109.442492] [ 109.442493] Possible unsafe locking scenario: [ 109.442493] [ 109.442493] CPU0 CPU1 [ 109.442494] ---- ---- [ 109.442495] lock(&serv->sv_lock); [ 109.442496] lock((softirq_ctrl.lock).lock); [ 109.442498] lock(&serv->sv_lock); [ 109.442499] lock((softirq_ctrl.lock).lock); [ 109.442501] [ 109.442501] *** DEADLOCK *** [ 109.442501] [ 109.442501] 3 locks held by nfsd/1032: [ 109.442503] #0: ffffffff93b49258 (nfsd_mutex){+.+.}-{3:3}, at: nfsd+0x19a/0x270 [ 109.442508] ivanmeler#1: ffff994245cb00b0 (&serv->sv_lock){+.+.}-{0:0}, at: svc_close_list+0x1f/0x90 [ 109.442512] ivanmeler#2: ffffffff93a81b20 (rcu_read_lock){....}-{1:2}, at: rt_spin_lock+0x5/0xc0 [ 109.442518] [ 109.442518] stack backtrace: [ 109.442519] CPU: 0 PID: 1032 Comm: nfsd Not tainted 5.10.16-rt30 ivanmeler#1 [ 109.442522] Hardware name: Supermicro X9DRL-3F/iF/X9DRL-3F/iF, BIOS 3.2 09/22/2015 [ 109.442524] Call Trace: [ 109.442527] dump_stack+0x77/0x97 [ 109.442533] check_noncircular+0xdc/0xf0 [ 109.442546] __lock_acquire+0x1264/0x20b0 [ 109.442553] lock_acquire+0xc2/0x400 [ 109.442564] rt_spin_lock+0x2b/0xc0 [ 109.442570] __local_bh_disable_ip+0xd9/0x270 [ 109.442573] svc_xprt_do_enqueue+0xc0/0x4d0 [ 109.442577] svc_close_list+0x60/0x90 [ 109.442581] svc_close_net+0x49/0x1a0 [ 109.442585] svc_shutdown_net+0x12/0x40 [ 109.442588] nfsd_destroy+0xc5/0x180 [ 109.442590] nfsd+0x1bc/0x270 [ 109.442595] kthread+0x194/0x1b0 [ 109.442600] ret_from_fork+0x22/0x30 [ 109.518225] nfsd: last server has exited, flushing export cache [ OK ] Stopped NFSv4 ID-name mapping service. [ OK ] Stopped GSSAPI Proxy Daemon. [ OK ] Stopped NFS Mount Daemon. [ OK ] Stopped NFS status monitor for NFSv2/3 locking.. Fixes: 719f8bc ("svcrpc: fix xpt_list traversal locking on shutdown") Signed-off-by: Joe Korty <joe.korty@concurrent-rt.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Lee Jones <lee.jones@linaro.org> Change-Id: I09ea284927a37dd9dc83848f94d9d9ef783c574c
fcuzzocrea
pushed a commit
to fcuzzocrea/android_kernel_samsung_universal8890
that referenced
this issue
Jun 5, 2021
[ Upstream commit 5a25de6df789cc805a9b8ba7ab5deef5067af47e ] Freeing chip on error may lead to an Oops at the next time the system goes to resume. Fix this by removing all snd_echo_free() calls on error. Fixes: 47b5d02 ("ALSA: Echoaudio - Add suspend support ivanmeler#2") Signed-off-by: Dinghao Liu <dinghao.liu@zju.edu.cn> Link: https://lore.kernel.org/r/20200813074632.17022-1-dinghao.liu@zju.edu.cn Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Lee Jones <lee.jones@linaro.org> Change-Id: Idcc188a7f0b4f06cd015f6b2d21e096263a19334
fcuzzocrea
pushed a commit
to fcuzzocrea/android_kernel_samsung_universal8890
that referenced
this issue
Jul 11, 2021
[ Upstream commit 5bbf219328849e83878bddb7c226d8d42e84affc ] An out of bounds write happens when setting the default power state. KASAN sees this as: [drm] radeon: 512M of GTT memory ready. [drm] GART: num cpu pages 131072, num gpu pages 131072 ================================================================== BUG: KASAN: slab-out-of-bounds in radeon_atombios_parse_power_table_1_3+0x1837/0x1998 [radeon] Write of size 4 at addr ffff88810178d858 by task systemd-udevd/157 CPU: 0 PID: 157 Comm: systemd-udevd Not tainted 5.12.0-E620 #50 Hardware name: eMachines eMachines E620 /Nile , BIOS V1.03 09/30/2008 Call Trace: dump_stack+0xa5/0xe6 print_address_description.constprop.0+0x18/0x239 kasan_report+0x170/0x1a8 radeon_atombios_parse_power_table_1_3+0x1837/0x1998 [radeon] radeon_atombios_get_power_modes+0x144/0x1888 [radeon] radeon_pm_init+0x1019/0x1904 [radeon] rs690_init+0x76e/0x84a [radeon] radeon_device_init+0x1c1a/0x21e5 [radeon] radeon_driver_load_kms+0xf5/0x30b [radeon] drm_dev_register+0x255/0x4a0 [drm] radeon_pci_probe+0x246/0x2f6 [radeon] pci_device_probe+0x1aa/0x294 really_probe+0x30e/0x850 driver_probe_device+0xe6/0x135 device_driver_attach+0xc1/0xf8 __driver_attach+0x13f/0x146 bus_for_each_dev+0xfa/0x146 bus_add_driver+0x2b3/0x447 driver_register+0x242/0x2c1 do_one_initcall+0x149/0x2fd do_init_module+0x1ae/0x573 load_module+0x4dee/0x5cca __do_sys_finit_module+0xf1/0x140 do_syscall_64+0x33/0x40 entry_SYSCALL_64_after_hwframe+0x44/0xae Without KASAN, this will manifest later when the kernel attempts to allocate memory that was stomped, since it collides with the inline slab freelist pointer: invalid opcode: 0000 [ivanmeler#1] SMP NOPTI CPU: 0 PID: 781 Comm: openrc-run.sh Tainted: G W 5.10.12-gentoo-E620 ivanmeler#2 Hardware name: eMachines eMachines E620 /Nile , BIOS V1.03 09/30/2008 RIP: 0010:kfree+0x115/0x230 Code: 89 c5 e8 75 ea ff ff 48 8b 00 0f ba e0 09 72 63 e8 1f f4 ff ff 41 89 c4 48 8b 45 00 0f ba e0 10 72 0a 48 8b 45 08 a8 01 75 02 <0f> 0b 44 89 e1 48 c7 c2 00 f0 ff ff be 06 00 00 00 48 d3 e2 48 c7 RSP: 0018:ffffb42f40267e10 EFLAGS: 00010246 RAX: ffffd61280ee8d88 RBX: 0000000000000004 RCX: 000000008010000d RDX: 4000000000000000 RSI: ffffffffba1360b0 RDI: ffffd61280ee8d80 RBP: ffffd61280ee8d80 R08: ffffffffb91bebdf R09: 0000000000000000 R10: ffff8fe2c1047ac8 R11: 0000000000000000 R12: 0000000000000000 R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000100 FS: 00007fe80eff6b68(0000) GS:ffff8fe339c00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007fe80eec7bc0 CR3: 0000000038012000 CR4: 00000000000006f0 Call Trace: __free_fdtable+0x16/0x1f put_files_struct+0x81/0x9b do_exit+0x433/0x94d do_group_exit+0xa6/0xa6 __x64_sys_exit_group+0xf/0xf do_syscall_64+0x33/0x40 entry_SYSCALL_64_after_hwframe+0x44/0xa9 RIP: 0033:0x7fe80ef64bea Code: Unable to access opcode bytes at RIP 0x7fe80ef64bc0. RSP: 002b:00007ffdb1c47528 EFLAGS: 00000246 ORIG_RAX: 00000000000000e7 RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00007fe80ef64bea RDX: 00007fe80ef64f60 RSI: 0000000000000000 RDI: 0000000000000000 RBP: 0000000000000000 R08: 0000000000000001 R09: 0000000000000000 R10: 00007fe80ee2c620 R11: 0000000000000246 R12: 00007fe80eff41e0 R13: 00000000ffffffff R14: 0000000000000024 R15: 00007fe80edf9cd0 Modules linked in: radeon(+) ath5k(+) snd_hda_codec_realtek ... Use a valid power_state index when initializing the "flags" and "misc" and "misc2" fields. Bug: https://bugzilla.kernel.org/show_bug.cgi?id=211537 Reported-by: Erhard F. <erhard_f@mailbox.org> Fixes: a48b9b4 ("drm/radeon/kms/pm: add asic specific callbacks for getting power state (v2)") Fixes: 79daedc ("drm/radeon/kms: minor pm cleanups") Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Lee Jones <joneslee@google.com> Change-Id: I12a2d79240c441f566beb4a6bfa52def17832acb
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Hi,
I tried to use the pinning feature in android but although pinning works, unpinning does not, effectively locking my phone in the open app. The only way of getting out is restarting the phone. I tried different combinations of buttons, but nothing seems to work...
I did not find any solutions online, but I did see there is an implementation for two or tree button phones, so maybe it's there something goes wrong for the Galaxy S7 edge.
Kind regards,
PJ
The text was updated successfully, but these errors were encountered: