Closed Bug 1799742 Opened 2 years ago Closed 2 years ago

Crash in [@ mozilla::widget::WinUtils::WaitForMessage | nsAppShell::ProcessNextNativeEvent]

Categories

(Core :: Widget: Win32, defect, P3)

Unspecified
Windows
defect

Tracking

()

RESOLVED INVALID

People

(Reporter: jstutte, Unassigned)

Details

(Keywords: crash, Whiteboard: [win:stability])

Crash Data

Crash report: https://crash-stats.mozilla.org/report/index/0ce879c0-1a72-467e-9206-0ef570221024

Reason: EXCEPTION_ACCESS_VIOLATION_READ

Top 10 frames of crashing thread:

0  ?  @0x1401c107  
1  ?  @0x1401c61d  
2  ?  @0x1400b8e6  
3  ?  @0x1400947b  
4  ?  @0x1401ba4e  
5  ?  @0x13f3b40c  
6  ?  @0x13f4da4e  
7  kernel32.dll  WaitForMultipleObjectsExImplementation  
8  user32.dll  RealMsgWaitForMultipleObjectsEx  
9  xul.dll  mozilla::widget::WinUtils::WaitForMessage  widget/windows/WinUtils.cpp:816

I think there are also instances on AMD64 architecture and newer Windows.

OS: Windows 7 → Windows
Hardware: x86 → Unspecified

This is a crash in the main-thread event loop, not an IPC event loop, so it isn't IPC-related. As it appears to be Windows-specific, I'm redirecting to Widget: Win32 for now to see if we can get someone more familiar with Windows to take a look, though perhaps XPCOM would be a more appropriate component.

Component: IPC → Widget: Win32
Severity: -- → S3
Priority: -- → P3
Whiteboard: [win:stability]

I've looked at a few of these; they don't seem to be coming from quite the same place. More interestingly, several of them are illegal instruction exceptions despite being at known addresses with perfectly good instructions, and others are exceptions that occurred in the middle of an instruction.

In particular, looking at (e.g.) https://crash-stats.mozilla.org/report/index/5c2ff4ae-23fc-4840-89a5-ac3910221021, it's crashed at the following point:

00007FFD4E8C8540 FF 15 82 A4 DE 05    call        qword ptr [__imp_GetTickCount (07FFD546B29C8h)]

... with "Access violation writing location 0x00007FFD546B29C8", which shouldn't be possible, since call only reads its target address and never writes to it. However, if the first byte of this instruction is misread as 00, we instead get:

00007FFD4E8C8540 00 15 82 A4 DE 05    add        byte ptr [__imp_GetTickCount (07FFD546B29C8h)], dl

... an instruction-pointer-relative add that tries to write to the same address. (Many other values will have similar results: 88, for example, will generate a mov, and 31 an xor. Either would generate the same exception.)
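
As a sanity check on the disassembly above, here is a minimal Python sketch that recomputes the operand address from the raw bytes. The helper name rip_relative_target is made up for illustration and is not part of any Mozilla or Windows tooling. It assumes the 6-byte "opcode, ModRM, disp32" layout shown in the two listings, and relies on the fact that a ModRM byte of 15 selects RIP-relative addressing in 64-bit mode whatever the opcode byte is.

```python
def rip_relative_target(insn_addr: int, insn_bytes: bytes) -> int:
    """Effective address of a 6-byte `opcode ModRM disp32` instruction.

    Hypothetical helper for illustration only. It handles just the
    one-byte-opcode, RIP-relative form seen in this crash report.
    """
    if len(insn_bytes) != 6:
        raise ValueError("expected opcode + ModRM + 4-byte displacement")
    modrm = insn_bytes[1]
    # mod == 00 and r/m == 101 means [RIP + disp32] in 64-bit mode.
    if (modrm >> 6) != 0b00 or (modrm & 0b111) != 0b101:
        raise ValueError("ModRM byte does not select RIP-relative addressing")
    # The displacement is a little-endian signed 32-bit value.
    disp = int.from_bytes(insn_bytes[2:6], "little", signed=True)
    # RIP-relative operands are relative to the address of the NEXT instruction.
    return insn_addr + len(insn_bytes) + disp


CRASH_ADDR = 0x00007FFD4E8C8540

# The instruction as it exists in the binary: call qword ptr [rip+disp32]
as_built = rip_relative_target(CRASH_ADDR, bytes.fromhex("FF 15 82 A4 DE 05"))
# The same bytes with the opcode misread as 00: add byte ptr [rip+disp32], dl
misread = rip_relative_target(CRASH_ADDR, bytes.fromhex("00 15 82 A4 DE 05"))

print(hex(as_built), hex(misread))
```

Both calls yield the same address, because only the opcode byte differs and the ModRM byte and displacement are untouched. That is why a corrupted opcode would fault on exactly the import-table slot that the intact call merely reads.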

Similarly, https://crash-stats.mozilla.org/report/index/cfd61d99-b18e-439d-af3e-6fb690220803 seems plausibly due to an earlier jump instruction jumping one byte too far -- a one-bit change. https://crash-stats.mozilla.org/report/index/d00141a9-ab2c-4f4e-b14f-ca0250221108 claims the instruction 0F 1F 44 00 00 (a perfectly legal spelling of nop) is illegal, which could be due to being misread as, e.g., 0F 3F 44 00 00.
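
To make the "one-bit change" reasoning concrete, the following Python sketch lists which bit positions differ between two byte values. The function name differing_bits is hypothetical, and the byte pairs are simply the ones quoted in the comments above. It is only an illustration of how far apart the quoted encodings are, not an analysis of the actual crash dumps.

```python
def differing_bits(a: int, b: int) -> list[int]:
    """Return the bit positions (0 = least significant) where two bytes differ."""
    diff = (a ^ b) & 0xFF
    return [bit for bit in range(8) if diff & (1 << bit)]


# Second opcode byte of the nop (0F 1F ...) versus the suggested misread (0F 3F ...).
nop_vs_misread = differing_bits(0x1F, 0x3F)

# First byte of the call (FF) versus the 00 used in the add example.
call_vs_add = differing_bits(0xFF, 0x00)

print(nop_vs_misread)    # positions that differ between 1F and 3F
print(len(call_vs_add))  # number of differing bits between FF and 00
```

The 1F/3F pair differs in a single bit (bit 5), which fits a classic single-bit memory error. FF and 00 differ in all eight bits, so that example would correspond to a whole-byte misread rather than a single flipped bit.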

The only reason I have reservations about saying "these are unrelated memory- (or disk-) failure-induced bugs" is that I'm not sure whether this is more crashes under a single signature than one would expect for that sort of thing. If it looks about right for that, I don't think this is actionable.

Per discussion on Matrix, this does seem to be a reasonable volume for memory- or disk-failure-induced bugs. Closing accordingly.

Status: NEW → RESOLVED
Closed: 2 years ago
Resolution: --- → INVALID