Closed
Bug 1295033
Opened 7 years ago
Closed 7 years ago
Crash in nsXPCWrappedJS::nsXPCWrappedJS
Categories
(Core :: XPCOM, defect, P1)
Tracking
()
RESOLVED
FIXED
mozilla53
People
(Reporter: marcia, Assigned: jonco)
References
Details
(Keywords: crash)
Crash Data
Attachments
(1 file, 1 obsolete file)
2.94 KB,
patch
|
mccr8
:
review+
jcristau
:
approval-mozilla-aurora+
gchang
:
approval-mozilla-beta+
|
Details | Diff | Splinter Review |
This bug was filed from the Socorro interface and is report bp-74fc5f2a-51f9-4552-b25d-adc782160814. ============================================================= Crash seen in early betas that has risen to #10 in Beta 3. Since it has no bug associated with it, filing to get it on the radar for tracking: http://bit.ly/2aLJ9PR. Perhaps terrence has some ideas as to what might be the root cause, adding ni.
Flags: needinfo?(terrence)
Comment 1•7 years ago
|
||
These are (almost) all hitting the MOZ_RELEASE_ASSERT added in bug 1214961. Either the hashtable is failing to grow, in which case this looks like it is really some kind of OOM. Maybe some addon ends up creating a ton of these multicompartment wrappers.
Blocks: 1214961
Comment 2•7 years ago
|
||
Although, before that patch we should have been adding every wrapper to the global XPCJSRuntime table, so it can't be making it worse. It looks like we'd just silently fail! I can't imagine would have led to anything good down the line.
Comment 3•7 years ago
|
||
> It looks like we'd just silently fail! I meant, this was the behavior before bug 1214961.
Comment 4•7 years ago
|
||
I guess this is a longstanding OOM. I'd like to make this table smaller regardless in order to improve GC latencies, although I'm not sure how much success I'm likely to have. I suppose it's possible that a small number of things are taking lots of the space and we could fix the issue by deCOMtaminating a handful of things. If it's a long tail issue however, I don't think there's much we can do beyond a full-company effort to move ourself off of XPIDL and other legacy technologies that no longer get the platform support they need to stay effective.
Now the #7 topcrash in beta 5. I guess if that is something showing separately, when for 47 or 48 it was still happening, but showing as an OOM crash, it probably isn't making the overall crash rate worse. It would be nice if we could use this new information and try some of the strategies Terrence suggests. It sounds like that is unlikely to happen for 49.
Comment 6•7 years ago
|
||
Crash volume for signature 'nsXPCWrappedJS::nsXPCWrappedJS': - nightly (version 51): 9 crashes from 2016-08-01. - aurora (version 50): 27 crashes from 2016-08-01. - beta (version 49): 2227 crashes from 2016-08-02. - release (version 48): 1369 crashes from 2016-07-25. - esr (version 45): 2 crashes from 2016-05-02. Crash volume on the last weeks (Week N is from 08-22 to 08-28): W. N-1 W. N-2 W. N-3 - nightly 1 5 0 - aurora 15 3 2 - beta 1124 257 53 - release 324 169 91 - esr 0 0 0 Affected platform: Windows Crash rank on the last 7 days: Browser Content Plugin - nightly #189 - aurora #107 #127 - beta #7 #3353 - release #21 - esr
status-firefox48:
--- → affected
status-firefox50:
--- → affected
status-firefox51:
--- → affected
status-firefox-esr45:
--- → affected
Comment 7•7 years ago
|
||
I still see this crash in a build that should have bug 1295684 (bp-7d718248-c70a-4cdc-84b6-a3fde2160828) so I guess my theory about this being an OOM was wrong.
Comment 8•7 years ago
|
||
the issue looks partly correlated to kaspersky being present as well: nsXPCWrappedJS::nsXPCWrappedJS|EXCEPTION_BREAKPOINT (529 crashes) 51% (271/529) vs. 3% (1250/46019) light_plugin_ACF0E80077C511E59DED005056C00008@kaspersky.com
Comment 9•7 years ago
|
||
This could absolutely be a binary addon failing to handle GC properly and passing us a dead or moved pointer. Given the spike, I think that's probably the most likely explanation.
Flags: needinfo?(terrence)
Comment 10•7 years ago
|
||
Some interesting data about the signature in Firefox 48: (100.0% in signature vs 00.73% overall) moz_crash_reason = MOZ_RELEASE_ASSERT(nsXPConnect::GetRuntimeInstance()-> GetMultiCompartmentWrappedJSMap()->Add(cx, mRoot)) (99.95% in signature vs 39.31% overall) reason = EXCEPTION_BREAKPOINT (67.52% in signature vs 08.02% overall) contains_memory_report = 1 (56.04% in signature vs 02.18% overall) Addon "Kaspersky Protection" = true (51.40% in signature vs 17.77% overall) Addon "Adblock Plus" = true (00.39% in signature vs 25.43% overall) dom_ipc_enabled = 1 (34.24% in signature vs 00.69% overall) Addon "Adblock Plus" = true ∧ Addon "Kaspersky Protection" = true (26.15% in signature vs 07.28% overall) Addon "Adblock Plus" = true ∧ platform_pretty_version = Windows 7
Comment 11•7 years ago
|
||
(In reply to Marco Castelluccio [:marco] from comment #10) > Some interesting data about the signature in Firefox 48: > ... > (56.04% in signature vs 02.18% overall) Addon "Kaspersky Protection" = true Does this mean that Kaspersky is only potentially responsible for ~56% of these crashes (and thus comment 9 is only 44% correct)?
Flags: needinfo?(mcastelluccio)
Comment 12•7 years ago
|
||
(In reply to Andrew Overholt [:overholt] from comment #11) > (In reply to Marco Castelluccio [:marco] from comment #10) > > Some interesting data about the signature in Firefox 48: > > ... > > (56.04% in signature vs 02.18% overall) Addon "Kaspersky Protection" = true > > Does this mean that Kaspersky is only potentially responsible for ~56% of > these crashes (and thus comment 9 is only 44% correct)? Yes, 56% of crash reports with the signature 'nsXPCWrappedJS::nsXPCWrappedJS' have that addon. This is true before/after the spike (~60% before, ~56% after), so I believe the addon is not responsible for it.
Flags: needinfo?(mcastelluccio)
Comment 13•7 years ago
|
||
Thanks. Could there be another binary add-on like Terrence speculates in comment 9?
Comment 14•7 years ago
|
||
Can anyone reproduce this? Without an STR and the lack of a definitive correlation with a specific binary add-on I'm not sure what we can do next ...
Updated•7 years ago
|
Priority: -- → P1
Comment 15•7 years ago
|
||
Crash volume for signature 'nsXPCWrappedJS::nsXPCWrappedJS': - nightly (version 52): 2 crashes from 2016-09-19. - aurora (version 51): 7 crashes from 2016-09-19. - beta (version 50): 1169 crashes from 2016-09-20. - release (version 49): 4368 crashes from 2016-09-05. - esr (version 45): 8 crashes from 2016-06-01. Crash volume on the last weeks (Week N is from 10-03 to 10-09): W. N-1 W. N-2 - nightly 2 0 - aurora 5 2 - beta 973 196 - release 3470 898 - esr 1 0 Affected platform: Windows Crash rank on the last 7 days: Browser Content Plugin - nightly #480 - aurora #443 #526 - beta #7 #4151 - release #7 #349 - esr #6221
status-firefox52:
--- → affected
Comment 16•7 years ago
|
||
(In reply to Andrew Overholt [:overholt] from comment #14) > Can anyone reproduce this? Without an STR and the lack of a definitive > correlation with a specific binary add-on I'm not sure what we can do next > ... Andrew, there is a user of the French community board with this issue: https://forums.mozfr.org/viewtopic.php?f=5&t=130668 CR: https://crash-stats.mozilla.com/report/index/ff01c0e2-faf5-4657-85d0-306f32160904 He has Kaspersky too. Do you want some more info?
Flags: needinfo?(overholt)
Comment 17•7 years ago
|
||
(In reply to Loic from comment #16) > (In reply to Andrew Overholt [:overholt] from comment #14) > > Can anyone reproduce this? Without an STR and the lack of a definitive > > correlation with a specific binary add-on I'm not sure what we can do next > > ... > > Andrew, there is a user of the French community board with this issue: > https://forums.mozfr.org/viewtopic.php?f=5&t=130668 > CR: > https://crash-stats.mozilla.com/report/index/ff01c0e2-faf5-4657-85d0- > 306f32160904 > He has Kaspersky too. > > Do you want some more info? That'd be great! My French is too rusty to be useful on that forum :) Thank you!
Flags: needinfo?(overholt)
Comment 18•7 years ago
|
||
Sorry, forgot to needinfo you when I answered in comment 17.
Flags: needinfo?(epinal99-bugzilla2)
Comment 19•7 years ago
|
||
Crash report from comment #0 and the French user indicate that Kaspersky 2016 is implied (see product_info.dll in tab "Modules"), 16.0.0.694 and 16.0.1.445. Upgrading to Kaspersky 2017 (17.0.0.611) should fix the issue. A possibility is to blacklist the version 16.0.1.445 wich is almost implied in all crashes: https://crash-stats.mozilla.com/signature/?signature=nsXPCWrappedJS%3A%3AnsXPCWrappedJS&date=%3E%3D2016-10-11T14%3A36%3A00.000Z&date=%3C2016-10-18T14%3A36%3A00.000Z&_columns=date&_columns=product&_columns=version&_columns=build_id&_columns=platform&_columns=reason&_columns=address&_sort=-date&page=1
Flags: needinfo?(epinal99-bugzilla2)
Comment 20•7 years ago
|
||
How do we blacklist specific versions of binary add-ons?
Flags: needinfo?(benjamin)
Comment 21•7 years ago
|
||
(In reply to Loic from comment #19) > Crash report from comment #0 and the French user indicate that Kaspersky > 2016 is implied (see product_info.dll in tab "Modules"), 16.0.0.694 and > 16.0.1.445. > Upgrading to Kaspersky 2017 (17.0.0.611) should fix the issue. Did upgrading to Kaspersky 17.0.0.611 fix the issue for the reporter? > A possibility is to blacklist the version 16.0.1.445 wich is almost implied > in all crashes: > https://crash-stats.mozilla.com/signature/ > ?signature=nsXPCWrappedJS%3A%3AnsXPCWrappedJS&date=%3E%3D2016-10- > 11T14%3A36%3A00.000Z&date=%3C2016-10-18T14%3A36%3A00. > 000Z&_columns=date&_columns=product&_columns=version&_columns=build_id&_colum > ns=platform&_columns=reason&_columns=address&_sort=-date&page=1 Kaspersky is not in almost all crashes, but in ~50% of them. (In reply to Andrew Overholt [:overholt] from comment #20) > How do we blacklist specific versions of binary add-ons? We have a DLL blocklist, we would need to test if it's effective with Kaspersky. It's possible that the presence of Kaspersky makes this crash more likely or that Kaspersky causes a subset of the crashes with this signature. Should we try to block it?
Comment 22•7 years ago
|
||
Here are the numbers for 50.0b over the last week:
> 1332 crash reports have been analyzed and the following versions have been found:
> - product_info.dll
> - 16.0.1.445: 708
> - 16.0.0.694: 27
> - 17.0.0.643: 7
Note that blocklisting the addon as opposed to the DLL might block the injection of the DLL too, but wouldn't require a code change. So, if we want to proceed with blocking, perhaps we can try with the addon blocklist first.
Comment 23•7 years ago
|
||
(In reply to Marco Castelluccio [:marco] from comment #21) > (In reply to Loic from comment #19) > > Crash report from comment #0 and the French user indicate that Kaspersky > > 2016 is implied (see product_info.dll in tab "Modules"), 16.0.0.694 and > > 16.0.1.445. > > Upgrading to Kaspersky 2017 (17.0.0.611) should fix the issue. > > Did upgrading to Kaspersky 17.0.0.611 fix the issue for the reporter? If the answer to ^ turns out to be "yes", then ... > It's possible that the presence of Kaspersky makes this crash more likely or > that Kaspersky causes a subset of the crashes with this signature. Should we > try to block it? ... I think we should at least try.
Flags: needinfo?(benjamin)
Comment 24•7 years ago
|
||
Hello guys, Not sure how related my issue is to this, but I've been getting a similar crash. https://crash-stats.mozilla.com/report/index/902dbed4-73d9-4b6d-8c3d-7b6182161021 Can someone more knowledgeable look at the crash reports and check if they have the same root cause? Anyway, I don't have Kaspersky installed. What I did notice is that the crash occurring on my end is pretty much 100% preceded in a crash of the media player on the Bloomberg live stream (at least I don't ever remember Firefox crashing without watching Bloomberg). Let me know if I can provide you with more info.
Comment 25•7 years ago
|
||
(In reply to Denis Rosca from comment #24) > Hello guys, > Not sure how related my issue is to this, but I've been getting a similar > crash. > https://crash-stats.mozilla.com/report/index/902dbed4-73d9-4b6d-8c3d- > 7b6182161021 > > Can someone more knowledgeable look at the crash reports and check if they > have the same root cause? > Anyway, I don't have Kaspersky installed. > > What I did notice is that the crash occurring on my end is pretty much 100% > preceded in a crash of the media player on the Bloomberg live stream (at > least I don't ever remember Firefox crashing without watching Bloomberg). > > Let me know if I can provide you with more info. Hello Denis, thanks for the report. Can you reproduce this crash consistently? Can you reproduce it with addons disabled? When did the crash start to happen? After updating Firefox?
Comment 26•7 years ago
|
||
(In reply to Marco Castelluccio [:marco] from comment #25) > > Hello Denis, thanks for the report. > Can you reproduce this crash consistently? Can you reproduce it with addons > disabled? > When did the crash start to happen? After updating Firefox? My problems started 1 or 2 months ago (can't tell you more specific dates because I didn't pay too much attention to it at first). I can pretty reliably reproduce the *Firefox crashing* issue when running with plugins enabled. The problem is that the failure modes differ a bit: 1. Firefox just freezes so I have to kill the process 2. Firefox crashes and asks to submit a crash report. So I'm not exactly sure if this is the same problem or I just happen to hit two different issues somehow. I'll try to see if it reproduces without addons. I have a question about how I could debug case 1 above? How do I get more info about the reason Firefox froze? If I just kill the process no crash report is generated.
Comment 27•7 years ago
|
||
Have you ever used mozregression? If this is a regression in Firefox, this tool will help finding what caused it. You can download it here: https://github.com/mozilla/mozregression/releases. Simply select an estimated good date (e.g. 2016-01-01) and select today as "bad". It will download several builds of Firefox and let you test if the build is good (unaffected) or bad (affected) and at the end it will tell you what caused the regression. (In reply to Denis Rosca from comment #26) > So I'm not exactly sure if this is the same problem or I just happen to hit > two different issues somehow. I bet they are the same, but I'm not sure. > I'll try to see if it reproduces without addons. Great, let us know. > I have a question about how I could debug case 1 above? How do I get more > info about the reason Firefox froze? If I just kill the process no crash > report is generated. What happens if you let Firefox be frozen for a while? In theory, it should be automatically killed after a while and a crash report should be generated.
Comment 28•7 years ago
|
||
(In reply to Denis Rosca from comment #24) > Hello guys, > Not sure how related my issue is to this, but I've been getting a similar > crash. > https://crash-stats.mozilla.com/report/index/902dbed4-73d9-4b6d-8c3d- > 7b6182161021 You have LastPass installed in your profile. Do you have the latest version? If not, can you update LP, please.
Comment 29•7 years ago
|
||
(In reply to Denis Rosca from comment #26) > I have a question about how I could debug case 1 above? How do I get more > info about the reason Firefox froze? If I just kill the process no crash > report is generated. The simplest way is for you to download the crashfirefox.exe binary we provide: https://ftp.mozilla.org/pub/utilities/crashfirefox-intentionally/crashfirefox.exe If you run that it will locate a running Firefox and crash it in a way such that a crash report will be generated.
Comment 30•7 years ago
|
||
So... I've hit this again: https://crash-stats.mozilla.com/report/index/c9cf876b-1a48-4e6f-84b2-d050b2161102 I have to say that my initial statement of being able to reproduce it consistently doesn't seem to be true. Firefox is crashing quite a lot, but with different causes it seems. (In reply to Loic from comment #28) > (In reply to Denis Rosca from comment #24) > > Hello guys, > > Not sure how related my issue is to this, but I've been getting a similar > > crash. > > https://crash-stats.mozilla.com/report/index/902dbed4-73d9-4b6d-8c3d- > > 7b6182161021 > > You have LastPass installed in your profile. Do you have the latest version? > If not, can you update LP, please. I did update LastPass, the add-on reports to be on version 4.1.29a which seems to be the most recent one?
Reporter | ||
Comment 31•7 years ago
|
||
(In reply to Loic from comment #19) > Crash report from comment #0 and the French user indicate that Kaspersky > 2016 is implied (see product_info.dll in tab "Modules"), 16.0.0.694 and > 16.0.1.445. > Upgrading to Kaspersky 2017 (17.0.0.611) should fix the issue. > > A possibility is to blacklist the version 16.0.1.445 wich is almost implied > in all crashes: > https://crash-stats.mozilla.com/signature/ > ?signature=nsXPCWrappedJS%3A%3AnsXPCWrappedJS&date=%3E%3D2016-10- > 11T14%3A36%3A00.000Z&date=%3C2016-10-18T14%3A36%3A00. > 000Z&_columns=date&_columns=product&_columns=version&_columns=build_id&_colum > ns=platform&_columns=reason&_columns=address&_sort=-date&page=1 Loic: Can we get the user to agree to share his crash dump with Kaspersky?
Flags: needinfo?(epinal99-bugzilla2)
Comment 32•7 years ago
|
||
I asked him but I'm not sure if the user will respond me, so feel free to send the report (or not).
Flags: needinfo?(epinal99-bugzilla2)
Comment 33•7 years ago
|
||
(In reply to Loic from comment #32) > I asked him but I'm not sure if the user will respond me, so feel free to > send the report (or not). We've already shared the report, which is public and can be accessed by anyone, but we can't share the dump (which might contain personal information) without explicit permission from the user. Unfortunately the dump is often what the third-party developers need in order to debug the problem.
Comment 34•7 years ago
|
||
I've managed to reproduce the issue and have got mini dumps and memory reports from the crash reporter and full dumps from WER. Email me to get a full dump. It is not an easy one to reproduce. I made a test that simulates a user: navigates, opens/closes tabs (but opens no more than 10 tabs), clicks at random places. Eventually the browser crashes for some reason, OOM is the most common one. But sometimes I get a crash in the same place as in this bug. So it may take a day or two to reproduce it again. The hash that failed to add an entry looks a little bit weird to me. I found it using xul!nsXPConnect::gSelf->mRuntime->mWrappedJSMap->mTable->impl. It has similar state in all dumps. For example: [+0x000] gen : 0x8 [+0x000] hashShift : 0x15 [+0x00c] entryCount : 0x3dc [+0x010] removedCount : 0x17b Correct me if I am wrong but given these values shouldCompressTable() => false overloaded() => false so there was no need to rehash and adding an entry should not cause big memory allocations. The number of entries is pretty small too. Why did it fail to add an entry then? It is possible that we are dealing with a memory leak here, so it is reasonable to collect more memory related data. Is there a way to configure the crash reporter to write verbose and not anonymized memory reports? What else would you recommend to collect?
Reporter | ||
Comment 35•7 years ago
|
||
ni on Andrew to see if he can find someone to answer the questions from Sergei in Comment 34.
Flags: needinfo?(overholt)
Comment 36•7 years ago
|
||
Andrew said he could help.
Flags: needinfo?(overholt) → needinfo?(continuation)
Updated•7 years ago
|
Component: XPConnect → XPCOM
Flags: needinfo?(continuation)
Comment 37•7 years ago
|
||
(In reply to Sergei from comment #34) > I've managed to reproduce the issue and have got mini dumps and memory > reports from the crash reporter and full dumps from WER. Email me to get a > full dump. Thanks for looking at this. Could you at least upload the memory report here? The actual URLs probably aren't a big deal, so an anonymized one should be okay. > It is not an easy one to reproduce. I made a test that simulates a user: > navigates, opens/closes tabs (but opens no more than 10 tabs), clicks at > random places. Eventually the browser crashes for some reason, OOM is the > most common one. But sometimes I get a crash in the same place as in this > bug. So it may take a day or two to reproduce it again. > > The hash that failed to add an entry looks a little bit weird to me. I found > it using xul!nsXPConnect::gSelf->mRuntime->mWrappedJSMap->mTable->impl. It > has similar state in all dumps. For example: > > [+0x000] gen : 0x8 > [+0x000] hashShift : 0x15 > [+0x00c] entryCount : 0x3dc > [+0x010] removedCount : 0x17b > > Correct me if I am wrong but given these values > shouldCompressTable() => false > overloaded() => false > so there was no need to rehash and adding an entry should not cause big > memory allocations. The number of entries is pretty small too. Why did it > fail to add an entry then? Hmm. Unfortunately I'm not very familiar with the internals of the PLDHashtable. Maybe njn could look into this a bit? It does feel like there might be some kind of bug in PLDHashtable when an allocation fails (like an allocation that doesn't check for failure), but I've read over the code and I wasn't able to find anything. > It is possible that we are dealing with a memory leak here, so it is > reasonable to collect more memory related data. Is there a way to configure > the crash reporter to write verbose and not anonymized memory reports? What > else would you recommend to collect? The reports should be verbose, aside from the anonymization. They would at least give us a bit of a sense of what might be going wrong.
Flags: needinfo?(n.nethercote)
![]() |
||
Comment 38•7 years ago
|
||
> The hash that failed to add an entry looks a little bit weird to me. I found > it using xul!nsXPConnect::gSelf->mRuntime->mWrappedJSMap->mTable->impl. It > has similar state in all dumps. For example: > > [+0x000] gen : 0x8 > [+0x000] hashShift : 0x15 > [+0x00c] entryCount : 0x3dc > [+0x010] removedCount : 0x17b > > Correct me if I am wrong but given these values > shouldCompressTable() => false > overloaded() => false > so there was no need to rehash and adding an entry should not cause big > memory allocations. The number of entries is pretty small too. Why did it > fail to add an entry then? If hashShift is 0x15 then the capacity should be 1 << (32 - 0x15) which is 2048. The number of used entries is 988 and removed entries is 379. I agree that the table shouldn't resize in this situation. I don't have full context here... what hash table operation failed? Did a hash table operation lead to OOM? Did you get that from a minidump? If you can email me the minidump I can take a look. > It is possible that we are dealing with a memory leak here, so it is > reasonable to collect more memory related data. Is there a way to configure > the crash reporter to write verbose and not anonymized memory reports? What > else would you recommend to collect? You can get non-anonymized memory reports by visiting about:memory and clicking on the "Measure and save..." button.
Flags: needinfo?(n.nethercote) → needinfo?(sergey.kogan)
Comment 39•7 years ago
|
||
Sent the full dump to Nicholas. (In reply to Andrew McCreight [:mccr8] from comment #37) > Could you at least upload the memory report here? I uploaded the report (except the full dump), here it is: https://crash-stats.mozilla.com/report/index/22753c66-5511-40a1-9aa1-6c4ec2161123
Flags: needinfo?(sergey.kogan) → needinfo?(n.nethercote)
Comment 40•7 years ago
|
||
Reproduced it again on a VM with freshly installed (from MSDN .iso) Windows 7 SP1. Kaspersky was also installed but was not loaded. Kaspersky addon was disabled. There was no other software or addons, just clean Windows and Firefox 50. The hash is even smaller: [+0x000] gen : 0x4 [+0x000] hashShift : 0x16 [+0x00c] entryCount : 0x1fb [+0x010] removedCount : 0xef Full dump (written by the crash reporter this time) and memory report can be downloaded here: https://1drv.ms/u/s!Avfdd6lsYwdUgpZqJktPsq1EINvQaQ Some observations: - Running the test on a VM with 2GB of memory gives more stable results. Tried 1GB, 1.5GB but 2GB is better. - Eventually (after 15-30 minutes) Firefox consumes all available memory and crashes. - Almost all of the crashes are caused by OOM asserts. Others - by access violations and other asserts, but almost all of them occur in the OOM situation, too.
Comment 41•7 years ago
|
||
Set some breakpoints and reproduced the issue again. The actual failure stack looks like this: 00 xul!JS::Zone::getUniqueId 01 xul!js::MovableCellHasher<JSObject *>::ensureHash 02 xul!js::detail::HashTable<...>::lookupForAdd 03 xul!js::HashMap<...>::lookupForAdd 04 xul!JSObject2WrappedJSMap::Add 05 xul!nsXPCWrappedJS::nsXPCWrappedJS xul!JS::Zone::getUniqueId returned false because xul!JS::Zone::uniqueIds_ failed to add an entry. This hash had the following state in the full dump I posted yesterday: ((xul!nsXPCWrappedJS*)0x2af0e380)->mJSObj->ptr->group_->value->compartment_->zone_->uniqueIds_->impl [+0x000] gen : 0x35 [+0x000] hashShift : 0x11 [+0x00c] entryCount : 0x5ce8 [+0x010] removedCount : 0x318 So it was overloaded and a big allocation was needed to recreate the table and add an entry.
Comment 42•7 years ago
|
||
Oh, so that makes more sense: we have an OOM failure inside a JS hashtable, not a PLDHashtable. The JS engine tries to handle OOMs, sometimes, so you can end up with odd crashes in OOM situations. It would be nice to improve that somehow. The other question is, why are we hitting OOM? Looking at the memory report in the crash dump in comment 39, nothing looks too out of the ordinary to me. It is using about 1.5GB of memory, and there's little system memory left, but the usage is spread out between a number of categories. About 320MB of chrome JS, including 114MB for JS runtime stuff, 320 MB of window stuff, 200MB of heap-unclassified, etc. I do see that for chrome JS the unique-id-map is 3,145,728, which is fairly large given that it is a hash table. It is certainly believable that you can't grow that when memory usage is at 95%.
Assignee | ||
Comment 43•7 years ago
|
||
(In reply to Andrew McCreight [:mccr8] from comment #42) > Oh, so that makes more sense: we have an OOM failure inside a JS hashtable, > not a PLDHashtable. The JS engine tries to handle OOMs, sometimes, so you > can end up with odd crashes in OOM situations. It would be nice to improve > that somehow. This is crashing because of the release assert here: http://searchfox.org/mozilla-central/source/js/xpconnect/src/XPCWrappedJS.cpp#419 I think we could refactor this to so that this part of initialisation is fallible which should improve things. From the size of that hash table is sounds like something strange is going on though.
Comment 44•7 years ago
|
||
I can share another crash report with full dump and memory report, maybe it can be helpful. Only the Kaspersky addon was installed this time. Download link: https://1drv.ms/u/s!Avfdd6lsYwdUgpZr_d5za1yeOlA9EA The same unique-id-map that was mentioned in comment 42 has the following size in the reports that I have access to: - A lot of 3d-party addons (report from comment 39): hash size = 3,145,728 - No 3d-party addons (report from comment 40): hash size = 786,432 - Kaspersky addon only, no other 3d-party addons (report from this comment): hash size = 786,432
![]() |
||
Updated•7 years ago
|
Flags: needinfo?(n.nethercote)
Assignee | ||
Comment 45•7 years ago
|
||
Oh, that constructor is already fallible since it has an out parameter for the return code. How about setting this if we can't add the wrapper to the map? Hopefully this will allow the error to get handled somewhere further up the stack rather than crashing.
Attachment #8816489 -
Flags: review?(continuation)
Comment 46•7 years ago
|
||
Comment on attachment 8816489 [details] [diff] [review] bug1295033-nsXPCWrappedJS-failure Review of attachment 8816489 [details] [diff] [review]: ----------------------------------------------------------------- I guess this is a bit of an improvement. I do see one caller of GetNewOrUsed that doesn't handle failure correctly and will crash with a null deref, nsXPCWrappedJSClass::DelegatedQueryInterface, but the rest look okay.
Attachment #8816489 -
Flags: review?(continuation) → review+
Assignee | ||
Comment 47•7 years ago
|
||
Updated patch to check for failure in nsXPCWrappedJSClass::DelegatedQueryInterface.
Attachment #8816489 -
Attachment is obsolete: true
Attachment #8817262 -
Flags: review?(continuation)
Updated•7 years ago
|
Attachment #8817262 -
Flags: review?(continuation) → review+
Comment 48•7 years ago
|
||
Pushed by jcoppeard@mozilla.com: https://hg.mozilla.org/integration/mozilla-inbound/rev/a4417f62daa8 Handle failure in nsXPCWrappedJS constructor r=mccr8
Comment 49•7 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/a4417f62daa8
Status: NEW → RESOLVED
Closed: 7 years ago
status-firefox53:
--- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla53
Comment 50•7 years ago
|
||
We might need to wait until this moves to Aurora or even Beta to see the effect. Should we uplift?
Comment 51•7 years ago
|
||
i think the crash volume is high enough on aurora to quickly gauge if the fix has an effect on this channel. what do you think about uplifting?
Flags: needinfo?(jcoppeard)
Assignee | ||
Comment 52•7 years ago
|
||
Comment on attachment 8817262 [details] [diff] [review] bug1295033-nsXPCWrappedJS-failure v2 Approval Request Comment [Feature/Bug causing the regression]: The release assert was added in bug 1214961. [User impact if declined]: Possible crashes. [Is this code covered by automated tests?]: It's exercised by our tests. I'm not sure if it's tested specifically. [Has the fix been verified in Nightly?]: Requesting uplift to check whether it affects crash volumes. [Needs manual test from QE? If yes, steps to reproduce]: No. [List of other uplifts needed for the feature/fix]: None. [Is the change risky?]: No. [Why is the change risky/not risky?]: Simple change, adds handling of OOM condition. [String changes made/needed]: None.
Flags: needinfo?(jcoppeard)
Attachment #8817262 -
Flags: approval-mozilla-aurora?
Assignee | ||
Updated•7 years ago
|
Assignee: nobody → jcoppeard
Comment 53•7 years ago
|
||
Comment on attachment 8817262 [details] [diff] [review] bug1295033-nsXPCWrappedJS-failure v2 crash fix for aurora52
Attachment #8817262 -
Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Assignee | ||
Comment 54•7 years ago
|
||
https://hg.mozilla.org/releases/mozilla-aurora/rev/ea07499d13936886f4b10455628a2f951798bf9c
Comment 55•7 years ago
|
||
(In reply to Jon Coppeard (:jonco) from comment #54) > https://hg.mozilla.org/releases/mozilla-aurora/rev/ > ea07499d13936886f4b10455628a2f951798bf9c setting flags
Comment 56•7 years ago
|
||
Hi :jonco, Do you think this is worth uplifting to Beta51?
Flags: needinfo?(jcoppeard)
Comment 57•7 years ago
|
||
The last affected Aurora build ID is 20161213004019, but the volume has always been very low so it might still be too early to tell if the crash is fully fixed. It is the #5 top crasher on release at the moment, so it would be pretty nice to fix it in 51. https://crash-stats.mozilla.com/search/?signature=%3DnsXPCWrappedJS%3A%3AnsXPCWrappedJS&product=Firefox&version=52.0a2&date=%3E%3D2016-12-06T08%3A31%3A56.000Z&date=%3C2016-12-20T08%3A31%3A56.000Z&_sort=-date&_facets=signature&_facets=build_id&_columns=date&_columns=signature&_columns=product&_columns=version&_columns=build_id&_columns=platform#facet-build_id
Assignee | ||
Comment 58•7 years ago
|
||
Comment on attachment 8817262 [details] [diff] [review] bug1295033-nsXPCWrappedJS-failure v2 Approval Request Comment [Feature/Bug causing the regression]: The release assert was added in bug 1214961. [User impact if declined]: Possible crashes. [Is this code covered by automated tests?]: It's exercised by our tests. I'm not sure if it's tested specifically. [Has the fix been verified in Nightly?]: No crashes on nightly since this landed, but it was low volume anyway. [List of other uplifts needed for the feature/fix]: None. [Is the change risky?]: No, this has been on m-c since 8th December. [Why is the change risky/not risky?]: Simple change, adds handling of OOM condition. [String changes made/needed]: None.
Flags: needinfo?(jcoppeard)
Attachment #8817262 -
Flags: approval-mozilla-beta?
Comment 59•7 years ago
|
||
Comment on attachment 8817262 [details] [diff] [review] bug1295033-nsXPCWrappedJS-failure v2 Fix a crash. Beta51+. Should be in 51 beta 10.
Attachment #8817262 -
Flags: approval-mozilla-beta? → approval-mozilla-beta+
Assignee | ||
Comment 60•7 years ago
|
||
https://hg.mozilla.org/releases/mozilla-beta/rev/c05b211ecbdcbacd8860750186e20cd9d430eca5
Comment 61•7 years ago
|
||
Jon, when landing a patch, please update the status flags ;) Thanks
Comment 62•7 years ago
|
||
No crashes with 51.0b10 so far.
Comment 63•7 years ago
|
||
Sergei, could you verify that you aren't able to crash anymore with your setup?
Flags: needinfo?(sergey.kogan)
Comment 64•7 years ago
|
||
Sorry for the delay. Yes I can confirm that Firefox does not crash in my environment anymore. Tested version https://ftp.mozilla.org/pub/firefox/releases/51.0b12/win32/en-US/Firefox%20Setup%2051.0b12.exe Ran the test for 4.5 hours on exactly the same machines - no crashes at all.
Flags: needinfo?(sergey.kogan)
Updated•7 years ago
|
Comment 65•7 years ago
|
||
Thanks Sergei, really appreciated!
Comment 66•7 years ago
|
||
Also, thanks for your investigation into this crash that let us figure out how to fix it!
You need to log in
before you can comment on or make changes to this bug.
Description
•