[PATCH] [B/snapdragon] [SRU] Kernel hangs during msm init

Paolo Pisati paolo.pisati at canonical.com
Thu Aug 29 12:28:25 UTC 2019


BugLink: https://bugs.launchpad.net/bugs/1841911

Impact:

Ubuntu-snapdragon-4.15.0-1061.68 hangs during boot around msm init.
Sometimes we get the following stack trace; other times the boot completes and the board hangs during reboot:

...
[ 8.113018] msm_dsi_manager_register: failed to register mipi dsi host for DSI 0
[ 8.131081] msm 1a00000.mdss: failed to bind 1a98000.dsi (ops dsi_ops [msm]): -517
[ 8.138234] msm 1a00000.mdss: master bind failed: -517
[ 8.145551] platform 1a01000.mdp: Dropping the link to 1ef0000.iommu
[ 8.150545] iommu: Removing device 1a01000.mdp from group 1
[ 8.157051] ------------[ cut here ]------------
[ 8.162369] WARNING: CPU: 1 PID: 1316 at /build/linux-snapdragon-t5G9R3/linux-snapdragon-4.15.0/drivers/iommu/qcom_iommu.c:336 qcom_iommu_domain_free+0x74/0x88
[ 8.167166] Modules linked in: adv7511_drm cec rc_core msm(+) mdt_loader
[ 8.181137] CPU: 1 PID: 1316 Comm: systemd-udevd Not tainted 4.15.0-1061-snapdragon #68-Ubuntu
[ 8.188079] Hardware name: Qualcomm Technologies, Inc. APQ 8016 SBC (DT)
[ 8.196501] pstate: 60400005 (nZCv daif +PAN -UAO)
[ 8.203356] pc : qcom_iommu_domain_free+0x74/0x88
[ 8.207955] lr : qcom_iommu_domain_free+0x74/0x88
[ 8.212727] sp : ffff00000cbeb680
[ 8.217412] x29: ffff00000cbeb680 x28: ffff8000396d84b8
[ 8.220713] x27: ffff8000396d84b0 x26: ffff8000396d84c0
[ 8.226096] x25: ffff80003d057c10 x24: ffff8000396d8420
[ 8.231391] x23: 0000000000000003 x22: ffff80003ce40258
[ 8.236686] x21: ffff80000203ad00 x20: ffff80000203af30
[ 8.241981] x19: ffff80000203af00 x18: ffffffffffffffff
[ 8.247275] x17: 0000000000000000 x16: 0000000000000004
[ 8.252570] x15: ffff000009549c08 x14: 0720072007200720
[ 8.257866] x13: 0720072007200720 x12: 0720072007200720
[ 8.263161] x11: ffff000009549e80 x10: ffff00000871d340
[ 8.268456] x9 : 0720072007200720 x8 : 0000000000000005
[ 8.273751] x7 : 0720072d072d072d x6 : 000000000000014c
[ 8.279046] x5 : ffff000008610250 x4 : 0000000000000000
[ 8.284345] x3 : 0000000000000000 x2 : a59fa8ece8469a00
[ 8.289637] x1 : 0000000000000000 x0 : 0000000000000024
[ 8.294932] Call trace:
[ 8.300227] qcom_iommu_domain_free+0x74/0x88
[ 8.302400] iommu_group_release+0x54/0x90
[ 8.306914] kobject_put+0x8c/0x1f0
[ 8.310905] kobject_del.part.0+0x3c/0x50
[ 8.314292] kobject_put+0x74/0x1f0
[ 8.318455] iommu_group_remove_device+0x10c/0x198
[ 8.321756] qcom_iommu_remove_device+0x58/0x70
[ 8.326617] iommu_bus_notifier+0xa8/0x120
[ 8.331045] notifier_call_chain+0x5c/0xa0
[ 8.335210] blocking_notifier_call_chain+0x64/0x88
[ 8.339294] device_del+0x234/0x368
[ 8.344066] platform_device_del.part.3+0x2c/0x98
[ 8.347539] platform_device_unregister+0x24/0x38
[ 8.352410] of_platform_device_destroy+0xb8/0xc0
[ 8.357087] device_for_each_child+0x58/0xb0
[ 8.361775] of_platform_depopulate+0x4c/0x68
[ 8.366350] msm_pdev_probe+0x2c4/0x388 [msm]
[ 8.370369] platform_drv_probe+0x60/0xc0
[ 8.374707] driver_probe_device+0x2ec/0x458
[ 8.378701] __driver_attach+0xdc/0x128
[ 8.383042] bus_for_each_dev+0x78/0xd8
[ 8.386598] driver_attach+0x30/0x40
[ 8.390418] bus_add_driver+0x20c/0x2a8
[ 8.394237] driver_register+0x6c/0x110
[ 8.397797] __platform_driver_register+0x54/0x60
[ 8.401841] msm_drm_register+0x54/0x80 [msm]
[ 8.406481] do_one_initcall+0x58/0x160
[ 8.410818] do_init_module+0x64/0x1d8
[ 8.414463] load_module+0x1378/0x15c8
[ 8.418282] SyS_finit_module+0x100/0x118
[ 8.422016] el0_svc_naked+0x30/0x34
[ 8.426095] ---[ end trace 800d0885aa276bfd ]---

Fix:

During the Ubuntu-snapdragon-4.15.0-1061.68 cycle, we picked up one upstream patch that makes msm call of_platform_depopulate() in case of probe deferral (or during removal). That patch triggers a WARN_ON() during the wind down of the IOMMU, followed by the kernel hang. Unless we want to backport the new msm dri driver (and all the relevant dependencies), the fix is to revert the stable patch that calls of_platform_depopulate().

How to test:

Boot a patched kernel and check that the stack trace above no longer shows up.
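
For repeated boot tests, the check can be scripted. This is only a rough sketch, not part of the patch: the function name find_iommu_warning is made up for illustration, and the pattern simply matches the WARNING line from the trace in this report (a WARNING that names qcom_iommu_domain_free). It takes the kernel log as text, for example the captured output of dmesg.

```python
import re

# Pattern based on the WARNING line quoted in this report; the function
# name below is hypothetical and chosen only for this sketch.
WARNING_RE = re.compile(r"WARNING: .*qcom_iommu_domain_free")


def find_iommu_warning(log_text):
    """Return the kernel log lines that carry the IOMMU teardown WARNING."""
    return [line for line in log_text.splitlines() if WARNING_RE.search(line)]
```

An empty list means the WARNING was not logged during that boot. It does not cover the case where the board hangs at reboot without printing a trace, so that still has to be checked by hand.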

Regression:

None: I'm reverting a patch that wasn't there before and clearly wasn't tested with our downstream BSP.

Paolo Pisati (1):
  Revert "drm/msm: Depopulate platform on probe failure"

 drivers/gpu/drm/msm/msm_drv.c | 14 +++-----------
 1 file changed, 3 insertions(+), 11 deletions(-)

-- 
2.7.4



