Fixes: driver/gpu/drm/bridge adv7511_drv.c determines EDID read success when the DDC communication error by ymh-ryutaro1-okada · Pull Request #22 · nxp-imx/linux-imx

ymh-ryutaro1-okada · 2024-09-02T07:24:22Z

Issue:

We found the following two issues when an HPD event is asserted.
・ADV7511/ADV7535 fails to read the EDID (the data is all 0x00). (Incidentally, this is harmless until the EDID is used.)
・DRM tries to update the list of supported resolutions using the invalid EDID (all 0x00).
→ This sometimes causes unnecessary dynamic resolution changes on the HDMI output from the ADV7511/ADV7535.

How to fix:

HDCP_ERROR_INT indicates that an HDCP/DDC controller error occurred; it does not mean that the EDID read succeeded. However, the current implementation treats both ADV7511_INT0_EDID_READY and ADV7511_INT1_DDC_ERROR as a successful EDID read. ADV7511_INT1_DDC_ERROR should instead be handled as an EDID read failure or an EDID read timeout. The EDID read timeout sequence is already implemented in adv7511_wait_for_edid(), so it is safe to ignore DDC_ERROR.

※ The screen capture of the register map is used with permission from ADI.
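
The change described above can be sketched in plain C as two predicates over the interrupt status bytes. This is a minimal userspace illustration, not the driver code: the function names are hypothetical, and the bit positions are assumptions that should be checked against the driver's adv7511.h.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit positions are assumptions for this sketch; verify against adv7511.h. */
#define ADV7511_INT0_EDID_READY (1u << 2)
#define ADV7511_INT1_DDC_ERROR  (1u << 7)

/*
 * Behaviour described as the current implementation: a DDC error is
 * reported as a completed EDID read, just like EDID_READY.
 * (Hypothetical helper name.)
 */
bool edid_read_done_current(uint8_t irq0, uint8_t irq1)
{
	return (irq0 & ADV7511_INT0_EDID_READY) ||
	       (irq1 & ADV7511_INT1_DDC_ERROR);
}

/*
 * Proposed behaviour: only EDID_READY marks the EDID as read.
 * DDC_ERROR is ignored here, so the caller's existing timeout
 * path reports the failure. (Hypothetical helper name.)
 */
bool edid_read_done_fixed(uint8_t irq0, uint8_t irq1)
{
	(void)irq1; /* deliberately not consulted */
	return (irq0 & ADV7511_INT0_EDID_READY) != 0;
}
```

With irq0 = 0x00 and irq1 = 0x80 (a DDC error with no EDID_READY, under the assumed bit layout), the first predicate reports success and the second does not, which is the difference this pull request is after.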

Root cause:

adv7511_drv.c treats the EDID read as successful when the ADV7511/ADV7535 INT1 bit (HDCP_ERROR_INT) is asserted.
(The DRM debug log follows; the @-prefixed IRQ names are annotations added to it.)

DRM debug log

DRM tries to determine the new resolution from an invalid EDID (all 0x00).

[ 6937.533628] drm card1-HDMI-A-1: [drm:adv7511_irq_process] IRQ0:(80) @ADV7511_INT0_EDID_READY
[ 6937.533659] drm card1-HDMI-A-1: [drm:adv7511_irq_process] IRQ1:(00)
[ 6937.533982] drm card1-HDMI-A-1: [drm:adv7511_hpd_work] HDMI HPD event: connected
[ 6937.533992] [drm:drm_sysfs_hotplug_event] generating hotplug event
[ 6937.534058] imx-drm 32c00000.bus:display-subsystem: [drm:drm_client_dev_hotplug] fbdev: ret=0
[ 6937.535023] [drm:drm_ioctl] comm="weston" pid=658, dev=0xe201, auth=1, DRM_IOCTL_MODE_GETRESOURCES
[ 6937.535058] [drm:drm_ioctl] comm="weston" pid=658, dev=0xe201, auth=1, DRM_IOCTL_MODE_GETRESOURCES
[ 6937.535073] [drm:drm_ioctl] comm="weston" pid=658, dev=0xe201, auth=1, DRM_IOCTL_MODE_GETCONNECTOR
[ 6937.535087] [drm:drm_helper_probe_single_connector_modes] [CONNECTOR:35:HDMI-A-1]
[ 6937.538715] drm card1-HDMI-A-1: [drm:adv7511_irq_process] IRQ0:(00)
[ 6937.538742] drm card1-HDMI-A-1: [drm:adv7511_irq_process] IRQ1:(80) @ADV7511_INT1_DDC_ERROR
[ 6937.548077] imx-drm 32c00000.bus:display-subsystem: [drm:connector_bad_edid] HDMI-A-1: EDID is invalid:
[ 6937.548107] [00] ZERO 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 6937.548111] [00] ZERO 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 6937.548114] [00] ZERO 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 6937.548117] [00] ZERO 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 6937.548120] [00] ZERO 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 6937.548123] [00] ZERO 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 6937.548126] [00] ZERO 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 6937.548129] [00] ZERO 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 6937.551071] [drm:drm_mode_debug_printmodeline] Modeline "640x480": 60 25175 640 656 752 800 480 490 492 525 0x40 0xa
[ 6937.551097] [drm:drm_mode_prune_invalid] Not using 640x480 mode: NOCLOCK
[ 6937.551104] [drm:drm_mode_debug_printmodeline] Modeline "800x600": 56 36000 800 824 896 1024 600 601 603 625 0x40 0x5
[ 6937.551116] [drm:drm_mode_prune_invalid] Not using 800x600 mode: NOCLOCK
[ 6937.551122] [drm:drm_mode_debug_printmodeline] Modeline "800x600": 60 40000 800 840 968 1056 600 601 605 628 0x40 0x5
[ 6937.551136] [drm:drm_mode_prune_invalid] Not using 800x600 mode: BAD
[ 6937.551143] [drm:drm_mode_debug_printmodeline] Modeline "848x480": 60 33750 848 864 976 1088 480 486 494 517 0x40 0x5
[ 6937.551156] [drm:drm_mode_prune_invalid] Not using 848x480 mode: NOCLOCK
[ 6937.551164] [drm:drm_mode_debug_printmodeline] Modeline "1024x768": 60 65000 1024 1048 1184 1344 768 771 777 806 0x40 0xa
[ 6937.551178] [drm:drm_mode_prune_invalid] Not using 1024x768 mode: BAD
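
The log above shows DRM rejecting a block that is entirely 0x00. As a rough illustration of why such a block cannot be a valid EDID, the following sketch checks the fixed 8-byte EDID header and the block checksum. It is a simplified stand-in written for this note, not DRM's own validation code, and the function names are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define EDID_BLOCK_SIZE 128

/* Fixed header that starts every EDID base block. */
const uint8_t edid_header[8] = {
	0x00, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0x00
};

/* True if every byte of the 128-byte block is 0x00. */
bool edid_block_is_all_zero(const uint8_t *block)
{
	size_t i;

	for (i = 0; i < EDID_BLOCK_SIZE; i++)
		if (block[i] != 0x00)
			return false;
	return true;
}

/*
 * Simplified base-block check: the header must match and the sum of
 * all 128 bytes must be 0 modulo 256.
 */
bool edid_base_block_looks_valid(const uint8_t *block)
{
	uint8_t sum = 0;
	size_t i;

	if (memcmp(block, edid_header, sizeof(edid_header)) != 0)
		return false;
	for (i = 0; i < EDID_BLOCK_SIZE; i++)
		sum += block[i]; /* uint8_t arithmetic wraps modulo 256 */
	return sum == 0;
}
```

Note that an all-zero block passes the checksum test on its own (the sum is 0), so it is the header comparison that rejects it in this sketch.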

…the DDC communication error has occurred. HDCP_ERROR_INT indicates that an HDCP/DDC controller error occurred; it does not mean that the EDID read succeeded. However, the current implementation treats both ADV7511_INT0_EDID_READY and ADV7511_INT1_DDC_ERROR as a successful EDID read. ADV7511_INT1_DDC_ERROR should instead be handled as an EDID read failure or an EDID read timeout. The EDID read timeout sequence is already implemented in adv7511_wait_for_edid(), so it is safe to ignore DDC_ERROR.
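
To illustrate why ignoring DDC_ERROR is safe when a timeout path exists, here is a small userspace simulation of a poll-with-timeout loop. It is only a sketch of the general pattern: the struct, the helper names, and the return values are hypothetical and do not reproduce adv7511_wait_for_edid().

```c
#include <stdbool.h>

/* Hypothetical stand-in for the bridge state used by this sketch. */
struct fake_bridge {
	bool edid_read;        /* set once EDID_READY has been seen */
	int polls_until_ready; /* < 0 means EDID_READY never arrives */
};

/* Hypothetical stand-in for one pass of interrupt status processing. */
static void fake_irq_process(struct fake_bridge *b)
{
	if (b->polls_until_ready == 0)
		b->edid_read = true;
	else if (b->polls_until_ready > 0)
		b->polls_until_ready--;
	/* A DDC error would land here and leave edid_read untouched. */
}

/* Returns 0 when the EDID became ready, -1 on timeout or bad arguments. */
int wait_for_edid_sketch(struct fake_bridge *b, int timeout_ms, int poll_ms)
{
	int elapsed;

	if (poll_ms <= 0)
		return -1;

	for (elapsed = 0; elapsed < timeout_ms; elapsed += poll_ms) {
		fake_irq_process(b);
		if (b->edid_read)
			return 0;
		/* A real driver would sleep for poll_ms here. */
	}
	return -1;
}
```

If only a DDC error ever occurs, edid_read stays false and the loop ends with the timeout result, so the caller sees a failed read instead of consuming an all-zero buffer.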

[ Upstream commit 769e6a1 ]

ui_browser__show() is capturing the input title that is stack allocated memory in hist_browser__run().

Avoid a use after return by strdup-ing the string.

Committer notes:

Further explanation from Ian Rogers:

My command line using tui is:

$ sudo bash -c 'rm /tmp/asan.log*; export ASAN_OPTIONS="log_path=/tmp/asan.log"; /tmp/perf/perf mem record -a sleep 1; /tmp/perf/perf mem report'

I then go to the perf annotate view and quit. This triggers the asan error (from the log file):

```
==1254591==ERROR: AddressSanitizer: stack-use-after-return on address 0x7f2813331920 at pc 0x7f2818065991 bp 0x7fff0a21c750 sp 0x7fff0a21bf10
READ of size 80 at 0x7f2813331920 thread T0
    #0 0x7f2818065990 in __interceptor_strlen ../../../../src/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:461
    #1 0x7f2817698251 in SLsmg_write_wrapped_string (/lib/x86_64-linux-gnu/libslang.so.2+0x98251)
    #2 0x7f28176984b9 in SLsmg_write_nstring (/lib/x86_64-linux-gnu/libslang.so.2+0x984b9)
    #3 0x55c94045b365 in ui_browser__write_nstring ui/browser.c:60
    #4 0x55c94045c558 in __ui_browser__show_title ui/browser.c:266
    #5 0x55c94045c776 in ui_browser__show ui/browser.c:288
    #6 0x55c94045c06d in ui_browser__handle_resize ui/browser.c:206
    #7 0x55c94047979b in do_annotate ui/browsers/hists.c:2458
    #8 0x55c94047fb17 in evsel__hists_browse ui/browsers/hists.c:3412
    #9 0x55c940480a0c in perf_evsel_menu__run ui/browsers/hists.c:3527
    #10 0x55c940481108 in __evlist__tui_browse_hists ui/browsers/hists.c:3613
    #11 0x55c9404813f7 in evlist__tui_browse_hists ui/browsers/hists.c:3661
    #12 0x55c93ffa253f in report__browse_hists tools/perf/builtin-report.c:671
    #13 0x55c93ffa58ca in __cmd_report tools/perf/builtin-report.c:1141
    #14 0x55c93ffaf159 in cmd_report tools/perf/builtin-report.c:1805
    #15 0x55c94000c05c in report_events tools/perf/builtin-mem.c:374
    #16 0x55c94000d96d in cmd_mem tools/perf/builtin-mem.c:516
    #17 0x55c9400e44ee in run_builtin tools/perf/perf.c:350
    #18 0x55c9400e4a5a in handle_internal_command tools/perf/perf.c:403
    #19 0x55c9400e4e22 in run_argv tools/perf/perf.c:447
    #20 0x55c9400e53ad in main tools/perf/perf.c:561
    #21 0x7f28170456c9 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
    #22 0x7f2817045784 in __libc_start_main_impl ../csu/libc-start.c:360
    #23 0x55c93ff544c0 in _start (/tmp/perf/perf+0x19a4c0) (BuildId: 84899b0e8c7d3a3eaa67b2eb35e3d8b2f8cd4c93)

Address 0x7f2813331920 is located in stack of thread T0 at offset 32 in frame
    #0 0x55c94046e85e in hist_browser__run ui/browsers/hists.c:746

  This frame has 1 object(s):
    [32, 192) 'title' (line 747) <== Memory access at offset 32 is inside this variable
HINT: this may be a false positive if your program uses some custom stack unwind mechanism, swapcontext or vfork
```

hist_browser__run isn't on the stack so the asan error looks legit.

There's no clean init/exit on struct ui_browser so I may be trading a use-after-return for a memory leak, but that seems like a good trade anyway.

Fixes: 05e8b08 ("perf ui browser: Stop using 'self'")
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Ben Gainey <ben.gainey@arm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: K Prateek Nayak <kprateek.nayak@amd.com>
Cc: Li Dong <lidong@vivo.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Oliver Upton <oliver.upton@linux.dev>
Cc: Paran Lee <p4ranlee@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Sun Haiyong <sunhaiyong@loongson.cn>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Yanteng Si <siyanteng@loongson.cn>
Cc: Yicong Yang <yangyicong@hisilicon.com>
Link: https://lore.kernel.org/r/20240507183545.1236093-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>

…uddy pages

commit 8cf360b upstream.

When I did memory failure tests recently, the panic below occurred:

 page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x8cee00
 flags: 0x6fffe0000000000(node=1|zone=2|lastcpupid=0x7fff)
 raw: 06fffe0000000000 dead000000000100 dead000000000122 0000000000000000
 raw: 0000000000000000 0000000000000009 00000000ffffffff 0000000000000000
 page dumped because: VM_BUG_ON_PAGE(!PageBuddy(page))
 ------------[ cut here ]------------
 kernel BUG at include/linux/page-flags.h:1009!
 invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
 RIP: 0010:__del_page_from_free_list+0x151/0x180
 RSP: 0018:ffffa49c90437998 EFLAGS: 00000046
 RAX: 0000000000000035 RBX: 0000000000000009 RCX: ffff8dd8dfd1c9c8
 RDX: 0000000000000000 RSI: 0000000000000027 RDI: ffff8dd8dfd1c9c0
 RBP: ffffd901233b8000 R08: ffffffffab5511f8 R09: 0000000000008c69
 R10: 0000000000003c15 R11: ffffffffab5511f8 R12: ffff8dd8fffc0c80
 R13: 0000000000000001 R14: ffff8dd8fffc0c80 R15: 0000000000000009
 FS: 00007ff916304740(0000) GS:ffff8dd8dfd00000(0000) knlGS:0000000000000000
 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 000055eae50124c8 CR3: 00000008479e0000 CR4: 00000000000006f0
 Call Trace:
  <TASK>
  __rmqueue_pcplist+0x23b/0x520
  get_page_from_freelist+0x26b/0xe40
  __alloc_pages_noprof+0x113/0x1120
  __folio_alloc_noprof+0x11/0xb0
  alloc_buddy_hugetlb_folio.isra.0+0x5a/0x130
  __alloc_fresh_hugetlb_folio+0xe7/0x140
  alloc_pool_huge_folio+0x68/0x100
  set_max_huge_pages+0x13d/0x340
  hugetlb_sysctl_handler_common+0xe8/0x110
  proc_sys_call_handler+0x194/0x280
  vfs_write+0x387/0x550
  ksys_write+0x64/0xe0
  do_syscall_64+0xc2/0x1d0
  entry_SYSCALL_64_after_hwframe+0x77/0x7f
 RIP: 0033:0x7ff916114887
 RSP: 002b:00007ffec8a2fd78 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
 RAX: ffffffffffffffda RBX: 000055eae500e350 RCX: 00007ff916114887
 RDX: 0000000000000004 RSI: 000055eae500e390 RDI: 0000000000000003
 RBP: 000055eae50104c0 R08: 0000000000000000 R09: 000055eae50104c0
 R10: 0000000000000077 R11: 0000000000000246 R12: 0000000000000004
 R13: 0000000000000004 R14: 00007ff916216b80 R15: 00007ff916216a00
  </TASK>
 Modules linked in: mce_inject hwpoison_inject
 ---[ end trace 0000000000000000 ]---

And before the panic, there was a warning about bad page state:

 BUG: Bad page state in process page-types pfn:8cee00
 page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x8cee00
 flags: 0x6fffe0000000000(node=1|zone=2|lastcpupid=0x7fff)
 page_type: 0xffffff7f(buddy)
 raw: 06fffe0000000000 ffffd901241c0008 ffffd901240f8008 0000000000000000
 raw: 0000000000000000 0000000000000009 00000000ffffff7f 0000000000000000
 page dumped because: nonzero mapcount
 Modules linked in: mce_inject hwpoison_inject
 CPU: 8 PID: 154211 Comm: page-types Not tainted 6.9.0-rc4-00499-g5544ec3178e2-dirty #22
 Call Trace:
  <TASK>
  dump_stack_lvl+0x83/0xa0
  bad_page+0x63/0xf0
  free_unref_page+0x36e/0x5c0
  unpoison_memory+0x50b/0x630
  simple_attr_write_xsigned.constprop.0.isra.0+0xb3/0x110
  debugfs_attr_write+0x42/0x60
  full_proxy_write+0x5b/0x80
  vfs_write+0xcd/0x550
  ksys_write+0x64/0xe0
  do_syscall_64+0xc2/0x1d0
  entry_SYSCALL_64_after_hwframe+0x77/0x7f
 RIP: 0033:0x7f189a514887
 RSP: 002b:00007ffdcd899718 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
 RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f189a514887
 RDX: 0000000000000009 RSI: 00007ffdcd899730 RDI: 0000000000000003
 RBP: 00007ffdcd8997a0 R08: 0000000000000000 R09: 00007ffdcd8994b2
 R10: 0000000000000000 R11: 0000000000000246 R12: 00007ffdcda199a8
 R13: 0000000000404af1 R14: 000000000040ad78 R15: 00007f189a7a5040
  </TASK>

The root cause should be the below race:

 memory_failure
  try_memory_failure_hugetlb
   me_huge_page
    __page_handle_poison
     dissolve_free_hugetlb_folio
     drain_all_pages -- Buddy page can be isolated e.g. for compaction.
     take_page_off_buddy -- Failed as page is not in the buddy list.
       -- Page can be putback into buddy after compaction.
    page_ref_inc -- Leads to buddy page with refcnt = 1.

Then unpoison_memory() can unpoison the page and send the buddy page back into buddy list again leading to the above bad page state warning. And bad_page() will call page_mapcount_reset() to remove PageBuddy from buddy page leading to later VM_BUG_ON_PAGE(!PageBuddy(page)) when trying to allocate this page.

Fix this issue by only treating __page_handle_poison() as successful when it returns 1.

Link: https://lkml.kernel.org/r/20240523071217.1696196-1-linmiaohe@huawei.com
Fixes: ceaf8fb ("mm, hwpoison: skip raw hwpoison page in freeing 1GB hugepage")
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Cc: Naoya Horiguchi <nao.horiguchi@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Overdr0ne pushed a commit to Overdr0ne/linux-imx that referenced this pull request Jul 28, 2025

Added gpio hogs for CoPro and DAC (nxp-imx#22)

a1bf8e1

ymh-ryutaro1-okada wants to merge 1 commit into nxp-imx:lf-6.6.y from ymh-ryutaro1-okada:fixes_adv7511drv_HPD_event_asserts_at_DDC_error
