[trunk] Crash on startup (mozilla::ReadAheadLib() Line 388) when building with Windows SDK 10.0.17763
Categories
(Firefox :: General, defect, P1)
Tracking
Release | Tracking | Status |
---|---|---|
firefox-esr60 | --- | unaffected |
firefox67 | --- | unaffected |
firefox68 | + | fixed |
People
(Reporter: mark, Assigned: alexical)
References
(Regression)
Details
(Keywords: regression)
Attachments
(2 files)
After building Firefox from mozilla-central using the bootstrap instructions, I consistently hit the following crash on startup, even though the build itself succeeds:
Exception thrown at 0x000000013F7B1B10 in firefox.exe: 0xC0000005:
Access violation reading location 0x000007FEED96B000. occurred
firefox.exe!mozilla::ReadAheadLib(const wchar_t * aFilePath) Line 388 C++
firefox.exe!mozilla::GetBootstrap(const char * aXPCOMFile, mozilla::LibLoadingStrategy aLibLoadingStrategy) Line 374 C++
firefox.exe!InitXPCOMGlue(mozilla::LibLoadingStrategy aLibLoadingStrategy) Line 222 C++
firefox.exe!NS_internal_main(int argc, char * * argv, char * * envp) Line 281 C++
firefox.exe!wmain(int argc, wchar_t * * argv) Line 131 C++
This happens both with a default nightly build (no mozconfig) and my intended mozconfig (primarily some optimizations for my system).
There is no crash signature, since this is a hard crash that happens before the crash reporter is initialized.
Comment 1•6 years ago
Are you able to bisect when this failure starts?
A hard crash before the crash reporter starts is a drop-everything-and-fix issue.
Comment 2•6 years ago (Reporter)
I'm not sure; I've never run into something like this before.
What would be the best way to go about bisecting something like this?
Comment 3•6 years ago
I am experiencing this as well. Running Windows 7.
I tried running mozregression both on shippable builds and debug builds, and all seem to run right up to the current date. I have tried doing clobber builds and the like locally, but they all just won't run for me.
I recently updated my Windows 10 SDK to version 10.0.17763 due to the recent ANGLE update, so I partly suspect this might have something to do with it. A hypothesis here is that maybe we are broken on Windows 7 when building with some SDK versions?
What version of the Windows 10 SDK(s) do you have installed? You can check the "Add or Remove Programs" dialog to see...
Comment 4•6 years ago (Reporter)
Perhaps an overview of the state (using the JIT debugger of VS) when it crashes is helpful. It seems the problem occurs when it tries to pre-read xul.dll (judging by the indicated file size). Maybe xul.dll is simply getting too big?
Comment 5•6 years ago (Reporter)
> What version of the Windows 10 SDK(s) do you have installed?
I have 3 versions installed:
- 10.0.26624
- 10.0.17763.132
- 10.0.15063.674
Comment 6•6 years ago (Reporter)
A related question: Why are we trying to pre-read these files to begin with? Any modern file system and storage medium will already use read-ahead. And it's not like we're still on XP or Vista with suboptimal I/O. Can't we trust the O.S. to handle file reading properly instead of trying to force the matter?
Comment 7•6 years ago
I downgraded my Windows 10 SDK version from 10.0.17763 to 10.0.17134 and it finally worked for me! It does seem that the 10.0.17763 SDK somehow interacts badly with Windows 7 and our code.
Until we find out why this is happening (if we ever do), I guess this is an acceptable workaround for now, unless subsequent SDKs turn out to be broken for us as well...
Comment 8•6 years ago (Reporter)
I can confirm that removing the 10.0.17763.132 SDK and installing the 10.0.17134.12 SDK instead solved the startup crash. Note that without 17134 installed (using just 15063, as stated on the Windows build page), the tree doesn't build either, because ANGLE is then missing a required header.
This works as a workaround but it should most definitely be mentioned on the Windows build page https://developer.mozilla.org/en-US/docs/Mozilla/Developer_guide/Build_Instructions/Windows_Prerequisites
I can also see this being a problem in the future if it continues to be broken in later SDKs that might become required by third party libs.
Comment 9•6 years ago (Assignee)
Temporarily just sidestep the issue in bug 1546498 (crash with latest SDK
on startup in Windows 7) by just continuing to use the old method in
Windows 7. We saw no wins in telemetry for Windows 7 anyway, so we should
investigate why that is, and why we see a mysterious crash in the fallback
code, in a followup bug.
Comment 10•6 years ago
Comment 11•6 years ago
Backed out changeset 1c79adcd8483 (bug 1546498) for build bustage. CLOSED TREE
Log:
https://treeherder.mozilla.org/logviewer.html#/jobs?job_id=243681796&repo=autoland&lineNumber=85508
Push with failures:
https://treeherder.mozilla.org/#/jobs?repo=autoland&revision=1c79adcd8483d038b94b6690c65a6f868132733d
Backout:
https://hg.mozilla.org/integration/autoland/rev/eb170900104e63918906d2275959ffdd6f7775b3
Comment 12•6 years ago
Comment 13•6 years ago
bugherder
Comment 14•6 years ago
|
||
If the binary is built with Windows SDK 10.0.17763, it will contain the .retplne section, which has PAGE_NOACCESS protection, so reading the image will cause an access violation at that section.
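The failure mode described here can be illustrated with a toy model. This is only a sketch, not Firefox's actual ReadAheadLib code: the `Region` class, the `read_ahead` function, and the section sizes are hypothetical, and the access violation is modelled as a Python `PermissionError`. Only the `PAGE_*` constant values and the `.retplne` section name come from Windows and from this bug. The sketch shows that a read-ahead loop that touches every page of a mapped image faults at a no-access section, while one that checks protection first can skip it.

```python
from dataclasses import dataclass
from typing import List

# Windows memory-protection constants (values as defined in winnt.h).
PAGE_NOACCESS = 0x01
PAGE_READONLY = 0x02
PAGE_READWRITE = 0x04
PAGE_EXECUTE_READ = 0x20

# Protections that this toy model treats as safe to read.
READABLE = {PAGE_READONLY, PAGE_READWRITE, PAGE_EXECUTE_READ}
PAGE_SIZE = 4096


@dataclass
class Region:
    """Hypothetical stand-in for one mapped section of an image."""
    name: str
    size: int      # size in bytes
    protect: int   # one of the PAGE_* constants above


def read_ahead(regions: List[Region], skip_unreadable: bool) -> int:
    """Touch one byte per page and return the number of pages touched.

    Touching an unreadable region raises PermissionError, which models
    the access violation reported in this bug.
    """
    touched = 0
    for region in regions:
        if region.protect not in READABLE:
            if skip_unreadable:
                continue
            raise PermissionError(f"access violation reading {region.name}")
        # Round the region size up to whole pages.
        touched += (region.size + PAGE_SIZE - 1) // PAGE_SIZE
    return touched


# Hypothetical image layout; the sizes are made up for illustration.
image = [
    Region(".text", 8192, PAGE_EXECUTE_READ),
    Region(".rdata", 4096, PAGE_READONLY),
    Region(".retplne", 4096, PAGE_NOACCESS),
    Region(".data", 100, PAGE_READWRITE),
]
```

With this layout, `read_ahead(image, skip_unreadable=True)` touches the readable sections only, whereas `read_ahead(image, skip_unreadable=False)` raises at `.retplne`. In real Windows code the protection check would presumably be done by querying the mapped pages (for example with `VirtualQuery`) rather than by a hand-built list.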