Closed
Bug 1404741
Opened 8 years ago
Closed 6 years ago
Startup crash in mozJSComponentLoader::GetSharedGlobal
Categories
(Core :: XPConnect, defect, P1)
Tracking
()
RESOLVED
WORKSFORME
| Release | Tracking | Status |
|---|---|---|
| firefox-esr52 | --- | unaffected |
| firefox-esr60 | --- | wontfix |
| firefox56 | --- | unaffected |
| firefox57 | --- | wontfix |
| firefox58 | --- | wontfix |
| firefox59 | --- | wontfix |
| firefox60 | --- | wontfix |
| firefox61 | --- | wontfix |
| firefox62 | --- | wontfix |
People
(Reporter: philipp, Unassigned)
References
(Blocks 1 open bug)
Details
(Keywords: crash, regression, Whiteboard: [tbird crash])
Crash Data
Attachments
(1 file)
| Bug 1404741: Don't call mozJSComponentLoader::CompilationScope during URLPreloader critical section. | 59 bytes, text/x-review-board-request | mccr8: review+, ritu: approval-mozilla-beta+ | Details |
This bug was filed from the Socorro interface and is
report bp-c5858d66-e5bf-48a8-accc-8a0de0171001.
=============================================================
Crashing Thread (0)
| Frame | Module | Signature | Source |
|---|---|---|---|
| 0 | xul.dll | mozJSComponentLoader::GetSharedGlobal(JSContext*) | js/xpconnect/loader/mozJSComponentLoader.cpp:584 |
| 1 | xul.dll | mozilla::ScriptPreloader::DecodeNextBatch(unsigned int) | js/xpconnect/loader/ScriptPreloader.cpp:1006 |
| 2 | xul.dll | mozilla::ScriptPreloader::InitCacheInternal() | js/xpconnect/loader/ScriptPreloader.cpp:523 |
| 3 | xul.dll | mozilla::ScriptPreloader::InitCache(nsTSubstring<char16_t> const&) | js/xpconnect/loader/ScriptPreloader.cpp:425 |
| 4 | xul.dll | mozilla::ScriptPreloader::GetChildSingleton() | js/xpconnect/loader/ScriptPreloader.cpp:137 |
| 5 | xul.dll | mozilla::ScriptPreloader::GetSingleton() | js/xpconnect/loader/ScriptPreloader.cpp:94 |
| 6 | xul.dll | NS_InitXPCOM2 | xpcom/build/XPCOMInit.cpp:711 |
| 7 | xul.dll | ScopedXPCOMStartup::Initialize() | toolkit/xre/nsAppRunner.cpp:1587 |
| 8 | xul.dll | XREMain::XRE_main(int, char** const, mozilla::BootstrapConfig const&) | toolkit/xre/nsAppRunner.cpp:4861 |
| 9 | xul.dll | XRE_main(int, char** const, mozilla::BootstrapConfig const&) | toolkit/xre/nsAppRunner.cpp:4960 |
| 10 | xul.dll | mozilla::BootstrapImpl::XRE_main(int, char** const, mozilla::BootstrapConfig const&) | toolkit/xre/Bootstrap.cpp:45 |
| 11 | firefox.exe | wmain | toolkit/xre/nsWindowsWMain.cpp:115 |
| 12 | firefox.exe | __scrt_common_main_seh | f:/dd/vctools/crt/vcstartup/src/startup/exe_common.inl:253 |
| 13 | kernel32.dll | BaseThreadInitThunk | |
| 14 | ntdll.dll | __RtlUserThreadStart | |
| 15 | ntdll.dll | _RtlUserThreadStart | |
This cross-platform crash signature is newly showing up in Firefox 57, with a "MOZ_RELEASE_ASSERT(globalObj)" that was added in bug 1381976.
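For context, a minimal sketch of why a release assertion on a null global turns a creation failure into a crash. This is not the actual Gecko code: `SKETCH_RELEASE_ASSERT`, `Global`, `CreateGlobalSketch`, and `GetSharedGlobalSketch` are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <cstdio>
#include <cstdlib>

// Hypothetical stand-in for a release-mode assertion. Unlike assert() from
// <cassert>, it is not compiled out when NDEBUG is defined, so a failed
// check aborts the process in optimized builds too.
#define SKETCH_RELEASE_ASSERT(cond)                           \
  do {                                                        \
    if (!(cond)) {                                            \
      std::fprintf(stderr, "Assertion failure: %s\n", #cond); \
      std::abort();                                           \
    }                                                         \
  } while (0)

// Placeholder for a JS global object.
struct Global {};

// Hypothetical creation step: returns nullptr when creation fails.
Global* CreateGlobalSketch(bool simulateFailure) {
  static Global global;
  return simulateFailure ? nullptr : &global;
}

// Fatal variant: a null global ends the process at the assertion, which is
// the frame a crash report would then point at.
Global* GetSharedGlobalSketch(bool simulateFailure) {
  Global* globalObj = CreateGlobalSketch(simulateFailure);
  SKETCH_RELEASE_ASSERT(globalObj);
  return globalObj;
}
```

Calling `GetSharedGlobalSketch(false)` returns the global; calling it with `true` aborts, so the crash location says only that creation failed, not why.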
Comment 1•8 years ago
Hm. This is worrying. The only reason we should expect to fail to create a global at this point is OOM, but all of these users appear to have plenty of available memory.
There's really no way to make this a non-fatal error, though. If we can't create that module global, we can't load JS components, which means we can't start the browser. The only real hope is that it might succeed when we call it a bit later to actually execute the script, rather than just to compile it.
I'll see if I can add some additional assertions to pinpoint exactly where this is failing.
Assignee: nobody → kmaglione+bmo
Comment 2•8 years ago
It looks like this is in the main process, which seems even weirder. Maybe this would have shown up as another crash before shared JSM modules?
Comment 3•8 years ago
(In reply to Andrew McCreight (PTO-ish Oct 1 - 12) [:mccr8] from comment #2)
> It looks like this is in the main process, which seems even weirder. Maybe
> this would have shown up as another crash before shared JSM modules?
Yeah, that's what I'm thinking. It's possible that this is happening now because we're creating the global earlier now, and wouldn't have happened before. But if so, I'd expect it to fail every time, not just for certain users.
Also, before the shared global changes, we would have treated this as a non-fatal error, and passed it on to whoever tried to load the component/module. But failure to load the components we load at startup causes us to abort startup. And failure to load other modules during startup generally makes the browser unusable. So we wouldn't have been in a better position.
Comment 4•8 years ago
Hey Andy, looks like Kris is on PTO. Suggestions on what to do with this? Low volume crash but new in 57.
Flags: needinfo?(amckay)
Priority: -- → P1
Comment 5•8 years ago
I'm not on PTO, just still looking into options for debugging this.
So far, it looks like the odds are that this isn't a new issue in 57, just a new failure mode.
Flags: needinfo?(amckay)
| Comment hidden (mozreview-request) |
Comment 7•8 years ago
| mozreview-review | ||
Comment on attachment 8916165 [details]
Bug 1404741: Don't call mozJSComponentLoader::CompilationScope during URLPreloader critical section.
https://reviewboard.mozilla.org/r/187412/#review192476
::: commit-message-ccb09:3
(Diff revision 1)
> +Bug 1404741: Don't call mozJSComponentLoader::CompilationScope during URLPreloader critical section. r?mccr8
> +
> +The URLPreloader's initialization code access the Omnijar cache off-main
micronit: accesses
::: js/xpconnect/loader/ScriptPreloader.cpp:419
(Diff revision 1)
>
> if (!XRE_IsParentProcess()) {
> return Ok();
> }
>
> + // Grab the compilation scope before initializing the URLPreloader, it's not
nit: this should be "because it's not" or whatever
Attachment #8916165 - Flags: review?(continuation) → review+
Comment 8•8 years ago
https://hg.mozilla.org/integration/mozilla-inbound/rev/a5ab6b153cccc38a2fae62a529923f8370734c39
Bug 1404741: Don't call mozJSComponentLoader::CompilationScope during URLPreloader critical section. r=mccr8
Comment 9•8 years ago
| bugherder | ||
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla58
Comment 10•8 years ago
Comment on attachment 8916165 [details]
Bug 1404741: Don't call mozJSComponentLoader::CompilationScope during URLPreloader critical section.
Approval Request Comment
[Feature/Bug causing the regression]: Bug 1381976
[User impact if declined]: This causes unpredictable startup crashes for some users, due to a race condition.
[Is this code covered by automated tests?]: It is exercised by automated tests, but there are no tests for this specific problem, since it's a race condition.
[Has the fix been verified in Nightly?]: N/A
[Needs manual test from QE? If yes, steps to reproduce]: No. This is a race condition, which shows up rarely, mostly in crashstats.
[List of other uplifts needed for the feature/fix]: None.
[Is the change risky?]: No.
[Why is the change risky/not risky?]: It simply moves an operation to a slightly earlier point in startup, when it doesn't risk causing a data race with a background thread.
[String changes made/needed]: None.
Attachment #8916165 - Flags: approval-mozilla-beta?
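A rough sketch of the ordering change described in the approval request, assuming a simplified model: an unsynchronized cache touched by both the main thread and a background thread. `ArchiveCacheSketch` and `StartupSketch` and their members are hypothetical and do not mirror the real ScriptPreloader or URLPreloader code.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <thread>

// Hypothetical unsynchronized cache: safe only when one thread at a time
// touches it. Lookup() inserts a default entry on a miss, so even a "read"
// mutates the map.
struct ArchiveCacheSketch {
  std::map<std::string, int> entries;
  int Lookup(const std::string& name) { return entries[name]; }
};

struct StartupSketch {
  ArchiveCacheSketch cache;
  int compilationScope = 0;
  int preloaded = 0;

  // Main-thread work that needs the cache.
  void GrabCompilationScope() { compilationScope = cache.Lookup("scope") + 1; }

  // Background-thread work that also needs the cache.
  void PreloadInBackground() { preloaded = cache.Lookup("urls") + 2; }

  void Init() {
    // Do the main-thread cache access first...
    GrabCompilationScope();
    // ...and only then start the background thread, so the two accesses
    // can never overlap.
    std::thread worker([this] { PreloadInBackground(); });
    // Critical section: the main thread must not touch `cache` here.
    worker.join();
  }
};
```

If `GrabCompilationScope()` were called between starting the worker and joining it, both threads could mutate the map at once, which is the kind of rare, timing-dependent failure that shows up mostly in crash statistics.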
Comment 11•8 years ago
Comment on attachment 8916165 [details]
Bug 1404741: Don't call mozJSComponentLoader::CompilationScope during URLPreloader critical section.
Fix for a new crash, Beta57+
Attachment #8916165 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
Comment 12•8 years ago
| bugherder uplift | ||
| Reporter | ||
Comment 13•8 years ago
the crash signature is still present in beta 7 with that patch...
Flags: needinfo?(kmaglione+bmo)
Comment 14•8 years ago
(In reply to [:philipp] from comment #13)
> the crash signature is still present in beta 7 with that patch...
Thanks. The crash stacks are different now, though, and the background threads no longer have nsZipArchive::GetItem in their stacks. So hopefully this at least fixed bug 1404743, but we still need to sort out what's causing the global creation to fail.
Status: RESOLVED → REOPENED
Flags: needinfo?(kmaglione+bmo)
Resolution: FIXED → ---
Comment 15•8 years ago
Every crash in 57b7 that I've looked at so far has ZVFORT32.DLL loaded, which seems to be part of a software suite called "Net Protector".
If I had to bet at this point, that's where I'd put my money.
Comment 16•8 years ago
Adam, it sounds like this crash might be caused by a 3rd party, would you be able to help?
Flags: needinfo?(astevenson)
Comment 17•8 years ago
Yes, will reach out and update when I hear back.
Flags: needinfo?(astevenson)
Comment 18•8 years ago
Got a response that their engineering team is taking a look at this.
| Reporter | ||
Comment 19•8 years ago
url correlations on beta do not indicate that this is correlated to particular (rare) dll modules as far as i can see:
https://crash-stats.mozilla.com/signature/?signature=mozJSComponentLoader%3A%3AGetSharedGlobal#correlations
Comment 20•8 years ago
Looking at some of these from 57 release, I don't see the ZVFORT32.DLL module in any of them. Some crashes have AVAST, and others don't appear to have any AV.
Comment 21•8 years ago
I experienced it this morning on my Linux system.
My hard drive was full. Cleaning up didn't fix the issue.
I don't have any AV or stuff like that.
As I can reproduce it every time, I am happy to help debug it.
Examples:
bp-ac4731c8-cc44-43a8-8796-91d4b0171220
bp-f2e17192-60d3-40ce-9d54-1cbd80171220
Flags: needinfo?(kmaglione+bmo)
Flags: needinfo?(continuation)
| Reporter | ||
Updated•8 years ago
Comment 22•8 years ago
Sorry for the delay. I was making an effort not to work during the holidays. Can you still reproduce this?
If so, can you try to reproduce it under rr? If you can, we can reverse-step to find the location of the actual failure.
Also, a copy of your profile's startupCache directory would be helpful.
Flags: needinfo?(kmaglione+bmo) → needinfo?(sledru)
Updated•8 years ago
Flags: needinfo?(continuation)
Comment 23•8 years ago
I can still reproduce it, and rr works with it.
What do I need to do with that (i.e., how do I make it stop at the failure)?
I sent you the profile by email.
Flags: needinfo?(sledru)
Updated•8 years ago
Comment 24•8 years ago
I also reproduced it on Nightly in a Linux Ubuntu 16.04 environment (it might have been related to a drive that was close to full).
Comment 25•8 years ago
Sounds like this was reproducible? Did anything come of the rr trace in comment 23?
status-firefox61: --- → affected
Flags: needinfo?(kmaglione+bmo)
| Reporter | ||
Comment 26•8 years ago
could this issue be the same as bug 1276488? the crash graphs look fairly similar (peaks and lows fall on the same days)...
Comment 27•8 years ago
Sorry, I lost track of this last time.
I'm pretty sure this is just another startup cache/disk corruption issue.
I ran into it once when I changed a file in my local build, and a lazy source hook got called to stringify a closure with offsets that didn't make sense in the new version.
We also run into that in bug 1403348, where we get error reports when trying to execute JS files whose contents appear to be corrupt. It's still not clear whether those are a result of XDR corruption or omni jar corruption. I'm still looking into them.
We've also seen other similar reports of network failure causing errors when running Firefox from a network drive. We know that a lot of the XDR decoding crashes we see are disk access errors when accessing mmapped files. Some of those are probably from network failures. The failures to open the app jar that we've seen in a few cases (and could also be responsible for some of these crashes) may be the same issue.
(In reply to [:philipp] from comment #26)
> could this issue be the same as bug 1276488? the crashing graphs looks
> fairly similar (peaks and lows fall on the same days)...
Yeah, assuming this is a disk corruption or failure issue, that seems pretty likely.
Flags: needinfo?(kmaglione+bmo)
Comment 28•8 years ago
Is this the same bug that has been the top crasher for firefox 59.0.3 for the last couple of days?
https://crash-stats.mozilla.com/topcrashers/?product=Firefox&version=59.0.3&days=3
https://crash-stats.mozilla.com/signature/?date=%3C2018-05-04T14%3A31%3A56%2B00%3A00&date=%3E%3D2018-05-01T14%3A31%3A56%2B00%3A00&product=Firefox&version=59.0.3&signature=mozJSComponentLoader%3A%3AGetSharedGlobal
All of those reports that I clicked open had the same stack trace. It's different from the one pasted here but ends in mozJSComponentLoader::GetSharedGlobal(JSContext*).
Comment 29•8 years ago
Over 1000 startup crashes in the last week on release. That's a lot of bad disks...
Comment 30•7 years ago
Still moderately high volume on release 61 (~600 crashes/week). Only 3 or so on beta 62.
Comment 31•7 years ago
I'm fairly certain at this point that this is some sort of corruption. I don't have time to work on omnijar checksums, though, and I'm pretty sure that's what we need.
Assignee: kmaglione+bmo → nobody
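To illustrate the general idea behind checksumming archive contents, here is a minimal sketch that computes a CRC-32 (the checksum variant the zip format stores per entry) and compares it with an expected value before the data is trusted. It is an assumption-laden illustration, not the approach taken in any Firefox bug: `Crc32Sketch` and `EntryIsIntactSketch` are hypothetical helpers.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Bitwise CRC-32 using the reflected polynomial 0xEDB88320.
// Slow but short; table-driven versions are the usual choice in practice.
std::uint32_t Crc32Sketch(const std::string& data) {
  std::uint32_t crc = 0xFFFFFFFFu;
  for (unsigned char byte : data) {
    crc ^= byte;
    for (int bit = 0; bit < 8; ++bit) {
      // When the low bit is set, shift and XOR in the polynomial;
      // otherwise just shift. The mask is all ones or all zeros.
      crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
    }
  }
  return crc ^ 0xFFFFFFFFu;
}

// Hypothetical integrity check: refuse to use contents whose checksum does
// not match the value recorded when the archive was written.
bool EntryIsIntactSketch(const std::string& contents,
                         std::uint32_t expectedCrc) {
  return Crc32Sketch(contents) == expectedCrc;
}
```

With a check like this, corrupted contents could be rejected with a clear error at load time rather than surfacing later as an unexplained failure.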
Comment 32•7 years ago
I wonder if any of this would be helped by better install integrity. I've completely busted my install when my computer crashed during an auto update of firefox. I needed a full reinstall to fix it. It is possible correlations with updates are due to this.
Comment 33•6 years ago
(In reply to Kris Maglione [:kmag] (unavailable until 10/28) from comment #31)
> I'm fairly certain at this point that this is some sort of corruption. I
> don't have time to work on omnijar checksums, though, and I'm pretty sure
> that's what we need.
Does bug 1515712 address this?
(In reply to Ted Campbell [:tcampbell] from comment #32)
> I wonder if any of this would be helped by better install integrity.
Is another bug required?
Flags: needinfo?(kmaglione+bmo)
Whiteboard: [tbird crash]
Comment 34•6 years ago
I would say this was fixed by bug 1515712 in 68, because there are no version 68 crashes.
Status: REOPENED → RESOLVED
Closed: 8 years ago → 6 years ago
Flags: needinfo?(kmaglione+bmo)
Resolution: --- → WORKSFORME
Updated•4 years ago