Closed Bug 1715633 Opened 3 years ago Closed 3 years ago

Crash in [@ zzz_AsmCodeRange_End | DoMozStackWalkThread]

Categories

(Core :: mozglue, defect)

Unspecified
Windows 7
defect

Tracking

()

RESOLVED FIXED
91 Branch
Tracking Status
firefox-esr78 --- unaffected
firefox89 --- unaffected
firefox90 --- unaffected
firefox91 --- fixed

People

(Reporter: aryx, Unassigned)

References

(Regression)

Details

(Keywords: crash, regression)

Crash Data

9 crashes from 6 installations, all on Windows 7 with the latest Nightly (91.0a1 20210609093513). Caused by bug 1712674?

Crash report: https://crash-stats.mozilla.org/report/index/4c37805d-0d74-49c9-937d-45c680210609

Reason: EXCEPTION_ACCESS_VIOLATION_READ

Top 10 frames of crashing thread:

0 ntdll.dll zzz_AsmCodeRange_End 
1 mozglue.dll DoMozStackWalkThread mozglue/misc/StackWalk.cpp:411
2 mozglue.dll replace_malloc memory/replace/phc/PHC.cpp:1125
3 xul.dll Gecko_StartBulkWriteString xpcom/string/nsSubstring.cpp:420
4 xul.dll nsstring::conversions::nsstring_fallible_append_utf8_impl xpcom/rust/nsstring/src/conversions.rs:685
5 xul.dll mozilla::extensions::URLInfo::Spec const toolkit/components/extensions/MatchPattern.cpp:157
6 xul.dll mozilla::extensions::ChannelWrapper::GetFinalURL const toolkit/components/extensions/webrequest/ChannelWrapper.cpp:902
7 xul.dll mozilla::dom::ChannelWrapper_Binding::get_finalURL dom/bindings/ChannelWrapperBinding.cpp:1827
8  @0x4690ec2fea 
9 xul.dll _tailMerge_d3dcompiler_47.dll 
Severity: -- → S2
Flags: needinfo?(gsquelart)

Yes, I'd say it's most probably because of bug 1712674.
:glandium, could you please have a look? (I'll also investigate shortly.) Maybe a backout would be best?

Flags: needinfo?(mh+mozilla)
Regressed by: 1712674
Has Regression Range: --- → yes

Looking at the minidump associated with the report linked from comment 0:
The exception happens inside RtlVirtualUnwind.
The last couple of instructions are:

0000000076F6B0E6  mov         rcx,qword ptr [rsi+98h]  
0000000076F6B0ED  mov         rax,qword ptr [rcx]

That last instruction crashes. At that point:

RAX = 0000000000000058 RBX = 000000000088BCA0 RCX = FFF9800000000008 RDX = 0000000000000000
RSI = 000000000088BCE0 RDI = 0000004690EB000C R8  = 0000000000000058 
R9  = 0000004690EB0000 R10 = 0000000000000000 R11 = 000000000088C688 R12 = 0000000000000001
R13 = 0000004690F544D7 R14 = 0000004690EB0000 R15 = 0000004690EB0000 
RIP = 0000000076F6B0ED RSP = 000000000088BBE0 RBP = 0000000090F544D2 EFL = 00010246 

Note that RSI=88BCE0, which looks to be on the stack, just above RSP=88BBE0.
So it reads a pointer from RSI+98h=88BD78, which holds FFF9800000000008, and dereferencing that crashes.
I don't know what it's trying to do there: what is supposed to be at offset 0x98 in a stack object/frame?
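
The address arithmetic in this comment can be checked with a short sketch. The register values are copied from the dump above; the helper name `is_canonical` and the assumption of 48-bit virtual addressing (the common x86-64 configuration) are illustrative and not taken from the report.

```python
# Register values copied from the minidump in this comment.
RSP = 0x000000000088BBE0
RSI = 0x000000000088BCE0
RCX = 0xFFF9800000000008  # value loaded from [RSI + 0x98]


def is_canonical(addr: int, va_bits: int = 48) -> bool:
    """Return True if addr is a canonical x86-64 address.

    Assumes va_bits-bit virtual addressing: bits 63..(va_bits - 1)
    must be all zeros or all ones.
    """
    top = addr >> (va_bits - 1)
    return top == 0 or top == (1 << (64 - va_bits + 1)) - 1


slot = RSI + 0x98          # address the first mov reads from
print(hex(slot))           # 0x88bd78
print(hex(RSI - RSP))      # 0x100: RSI is 256 bytes above RSP
print(is_canonical(slot))  # True
print(is_canonical(RCX))   # False
```

Under that assumption the pointer in RCX is non-canonical, which would be consistent with the read faulting, though it does not explain where the value came from.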

Looking at the call to MozStackWalk, it comes from StackTrace::Fill(), and the closure pointer to a StackTrace looks correct. At crash time, that StackTrace contains:

mLength=8
mPcs= {
  [0] 0x000007fed2cfc134 {xul.dll!Gecko_StartBulkWriteString(nsTSubstring<char16_t> * aThis, unsigned int aCapacity, unsigned int aUnitsToPreserve, bool aAllowShrinking), Line 420}
  [1] 0x000007fed2ec25b8 {xul.dll!nsstring::conversions::nsstring_fallible_append_utf8_impl(nsstring::nsAString * this, unsigned char * other, unsigned __int64 other_len, unsigned __int64 old_len), Line 685}
  [2] 0x000007fed28572b3 {xul.dll!mozilla::extensions::URLInfo::Spec(void), Line 157}
  [3] 0x000007fed285c488 {xul.dll!mozilla::extensions::ChannelWrapper::GetFinalURL(nsTString<char16_t> & aRetVal), Line 902}
  [4] 0x000007fed234079e {xul.dll!mozilla::dom::ChannelWrapper_Binding::get_finalURL(JSContext * cx, JS::Handle<JSObject *> obj, void * void_self, JSJitGetterCallArgs args), Line 1830}
  [5] 0x0000004690ec2feb
  [6] 0x0000004690f281f3
  [7] 0x0000004690f544d2
  ...
}

(The function names were found by MSVC, so the unknown addresses starting with 4690... may be generated code and/or unknown DLLs, assuming they're correct of course.)
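
The unknown frames can be related to the register dump from the crash with a little arithmetic. The sketch below assumes, without confirmation from the report, that the value 0x4690EB0000 held in R9, R14 and R15 is the base of the region containing those addresses; the variable names are illustrative.

```python
# Values copied from the register dump and the mPcs listing in this comment.
R14 = 0x0000004690EB0000   # the same value is in R9 and R15
R13 = 0x0000004690F544D7
RBP = 0x0000000090F544D2
unknown_pcs = [0x0000004690EC2FEB, 0x0000004690F281F3, 0x0000004690F544D2]

# Offsets of the unresolved frames from the assumed region base.
offsets = [pc - R14 for pc in unknown_pcs]
print([hex(o) for o in offsets])  # ['0x12feb', '0x781f3', '0xa44d2']

# RBP equals the low 32 bits of mPcs[7], and R13 is 5 bytes past mPcs[7].
print(RBP == unknown_pcs[2] & 0xFFFFFFFF)  # True
print(R13 - unknown_pcs[2])                # 5
```

These relationships hint that the unwinder was processing mPcs[7] when it crashed, but that is an inference from the numbers, not something stated in the report.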

That's as far as I could go.
Mike, what more can you see? Should we backout bug 1712674?

Flags: needinfo?(gsquelart)

That looks a lot like bug 1667663.

Flags: needinfo?(mh+mozilla)

> Mike, what more can you see? Should we backout bug 1712674?

How about disabling it when we're called with an empty aThread? That would disable it for PHC and the other things that use MozStackWalk directly, but not for the profiler, which would limit the scope of the problem if it really comes from that.
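
The proposed gating can be sketched as follows. This is only an illustration of the idea, not the actual mozglue code: the function names, the string labels, and the use of `None` or `0` to stand for an empty thread handle are all hypothetical.

```python
from typing import Optional


def use_new_unwinder(a_thread: Optional[int]) -> bool:
    """Hypothetical gate: take the code path from bug 1712674 only when the
    caller passes an explicit (non-empty) thread handle."""
    return bool(a_thread)


def walk_stack(a_thread: Optional[int]) -> str:
    # Both labels are placeholders for the real stack-walking strategies.
    if use_new_unwinder(a_thread):
        return "new-path"
    return "previous-path"


print(walk_stack(None))    # previous-path (a direct MozStackWalk caller such as PHC)
print(walk_stack(0x1234))  # new-path (a caller that names a thread, such as the profiler)
```

The point of such a gate would be to narrow the change to callers that supply a thread, so that a crash in the new path could not affect direct callers.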

[@ ntdll.dll | DoMozStackWalkThread] is the Windows 8.1 variant of the DoMozStackWalkThread crashes in the latest Nightly, with 3 crashes reported for Windows 7.

Crash Signature: [@ zzz_AsmCodeRange_End | DoMozStackWalkThread] → [@ ntdll.dll | DoMozStackWalkThread] [@ zzz_AsmCodeRange_End | DoMozStackWalkThread]
Crash Signature: [@ ntdll.dll | DoMozStackWalkThread] [@ zzz_AsmCodeRange_End | DoMozStackWalkThread] → [@ ntdll.dll | DoMozStackWalkThread] [@ TpAlpcRegisterCompletionList] [@ zzz_AsmCodeRange_End | DoMozStackWalkThread]
Crash Signature: [@ ntdll.dll | DoMozStackWalkThread] [@ TpAlpcRegisterCompletionList] [@ zzz_AsmCodeRange_End | DoMozStackWalkThread] → [@ get_fpsr ] [@ ntdll.dll | DoMozStackWalkThread] [@ TpAlpcRegisterCompletionList] [@ zzz_AsmCodeRange_End | DoMozStackWalkThread]

(In reply to Mike Hommey [:glandium] from comment #4)

> > Mike, what more can you see? Should we backout bug 1712674?
>
> How about disabling it when we're called with an empty aThread? That would disable it for PHC and the other things that use MozStackWalk directly, but not for the profiler, which would limit the scope of the problem if it really comes from that.

Too late: I see it was just backed out (thank you, Alexandru). I guess the crashes were just too numerous.

Also, some of the crashes may have been across threads, e.g.: https://crash-stats.mozilla.org/report/index/d88e8c9a-4755-4f2e-bf83-3bbec0210610 .

Maybe [@ DoMozStackWalkThread ] is another signature variant? bp-0621eb57-126f-4ce2-ab15-0ad6f0210610

Right, adding all signatures with "MozStackWalkThread" that happened on build 20210609093513.

Crash Signature: [@ get_fpsr ] [@ ntdll.dll | DoMozStackWalkThread] [@ TpAlpcRegisterCompletionList] [@ zzz_AsmCodeRange_End | DoMozStackWalkThread] → [@ get_fpsr ] [@ ntdll.dll | DoMozStackWalkThread] [@ TpAlpcRegisterCompletionList] [@ zzz_AsmCodeRange_End | DoMozStackWalkThread] [@ DoMozStackWalkThread] [@ RtlFreeHeap | DoMozStackWalkThread] [@ RtlVirtualUnwind | RtlpLookupDynamicFunctionEntry …

And crashes are down to zero after the backout, so I'll call this one fixed.

Bug 1712674 was reopened; we'll rework the patch there if it can be salvaged.

Status: NEW → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → 91 Branch