Closed
Bug 1308800
Opened 8 years ago
Closed 8 years ago
Crash [@ js::IsDerivedProxyObject ]
Categories
(Core :: JavaScript Engine: JIT, defect, P1)
Tracking
()
People
(Reporter: esthermonchari, Unassigned)
References
Details
(4 keywords, Whiteboard: [platform-rel-Facebook])
Crash Data
This bug was filed from the Socorro interface and is
report bp-14f4c0a6-629a-418b-8d79-62f072160925.
=============================================================
Seen while looking at nightly crash stats. crash started with build id 20160925030226 on 2016-09-25 22:04:16 and still continuing.
adding ni on:Jon maybe help figure out if i placed that on the right component and if you have some ideas on what might be causing the crash.
Flags: needinfo?(jcoppeard)
Comment 1•8 years ago
|
||
I don't really understand this. I can't see how we can go from nsDOMCSSDeclaration::RemoveProperty to js::IsDerivedProxyObject, but I guess it's possible there is a lot of inlining going on.
Maybe someone in layout would be better placed to look at this?
Component: JavaScript: Standard Library → Layout
Flags: needinfo?(jcoppeard)
Comment 2•8 years ago
|
||
There *is* a static call chain that would get you between those, but it seems unlikely that it would be inlined all the way through:
uint32 nsDOMCSSDeclaration::RemoveProperty(int32)
uint32 nsDOMCSSAttributeDeclaration::SetCSSDeclaration(mozilla::css::Declaration*)
uint32 nsStyledElementNotElementCSSInlineStyle::SetInlineStyleDeclaration(mozilla::css::Declaration*, nsAString_internal*, uint8)
uint32 mozilla::dom::Element::SetAttrAndNotify(int32, nsIAtom*, nsIAtom*, nsAttrValue*, nsAttrValue*, uint8, uint8, uint8, uint8)
void nsDocument::EnqueueLifecycleCallback(uint32, mozilla::dom::Element*, mozilla::dom::LifecycleCallbackArgs*, mozilla::dom::CustomElementDefinition*)
void nsContentUtils::AddScriptRunner(nsIRunnable*)
void nsContentUtils::AddScriptRunner(struct already_AddRefed<nsIRunnable>)
uint32 WindowDestroyedEvent::Run()
uint8 js::NukeCrossCompartmentWrappers(JSContext*, js::CompartmentFilter*, js::CompartmentFilter*, uint32)
void js::NukeCrossCompartmentWrapper(JSContext*, JSObject*)
uint8 js::IsDeadProxyObject(JSObject*)
uint8 js::IsDerivedProxyObject(JSObject*, js::BaseProxyHandler*)
Updated•8 years ago
|
Severity: normal → critical
Component: Layout → DOM: CSS Object Model
Comment 3•8 years ago
|
||
Boris, I don't know who else to ask to investigate here.
Flags: needinfo?(bzbarsky)
![]() |
||
Comment 4•8 years ago
|
||
There is no way the call chain in comment 2 is actually being inlined all the way through.
Not only that, but EnqueueLifecycleCallback calling AddScriptRunner can't trigger WindowDestroyedEvent::Run. It can only trigger Run() on the runnable EnqueueLifecycleCallback passes in, which is not a WindowDestroyedEvent.
I looked at crash-stats crashes in js::IsDerivedProxyObject, and most of them seem to be calling it from js::jit::GetPropertyIC::tryAttachStub, which is a much more sane claim. Specifically, I looked at the first 50 of the 168 reports we have in the past week claiming a crash in js::IsDerivedProxyObject (across all branches), and I found:
* 1 report (<https://crash-stats.mozilla.com/report/index/f15fb217-9b62-4c53-a301-a361f2161011>) which only has js::IsDerivedProxyObject on the stack.
* 1 report (<https://crash-stats.mozilla.com/report/index/0ccfb967-9518-4b59-bed8-904fd2161011>) which claims CSS2PropertiesBinding::set_maxWidth called js::IsDerivedProxyObject.
* 1 report (<https://crash-stats.mozilla.com/report/index/9617af00-5d39-46da-8504-691af2161010>) which claims js::jit::GetPropertyIC::tryAttachModuleNamespace called js::IsDerivedProxyObject. Note that the tryAttachStub callsites all actually point to the line where tryAttachStub calls tryAttachModuleNamespace, so this is really the tryAttachStub thing.
* 47 reports (e.g. <https://crash-stats.mozilla.com/report/index/493b1955-5064-44cd-8be4-f00b42161011>) which claim calls from tryAttachStub.
I tried looking at branches more specifically. I see 11 hits in the past week on 52.0a1. All have the tryAttachStub/tryAttachModuleNamespace callstack.
I will now claim that the CSS2Properties things are at the very least much lower frequency, and more likely simply bogus stacks, and what needs fixing here is the tryAttachStub path.
Component: DOM: CSS Object Model → JavaScript Engine: JIT
Flags: needinfo?(bzbarsky) → needinfo?(jdemooij)
![]() |
||
Comment 5•8 years ago
|
||
Oh, the two stacks I saw so far that show us coming through tryAttachModuleNamespace claim we come through here:
if (!obj->is<ModuleNamespaceObject>())
return true;
and are crashing under that first line with the is() call, and that this line calls js::IsDerivedProxyObject. That seems plausible, given:
JSObject::is<js::ModuleNamespaceObject>() const
{
return js::IsDerivedProxyObject(this, &js::ModuleNamespaceObject::proxyHandler);
}
If the crash address can be trusted, by the way, these are all null-pointer crashes. I don't know how much I would trust it, since I don't see how obj can be null in tryAttachStub there.....
Comment 6•8 years ago
|
||
I looked at one of these crashes in the debugger, it seems we pass a nullptr |obj| to GetPropertyIC::update, and tryAttachModuleNamespace is simply the first place where we dereference this object.
It's interesting that most URLs are for Facebook's "photo theater view" (viewing a single image almost full-screen). These URLs look like this:
https://www.facebook.com/photo.php?fbid=...&set=...&type=3&theater
I tried some of these URLs on Windows but no crashes. Maybe QA can do more testing here.
Comment 7•8 years ago
|
||
Crash volume for signature 'js::IsDerivedProxyObject':
- nightly (version 52): 56 crashes from 2016-09-19.
- aurora (version 51): 317 crashes from 2016-09-19.
- beta (version 50): 89 crashes from 2016-09-20.
- release (version 49): 203 crashes from 2016-09-05.
- esr (version 45): 4 crashes from 2016-07-25.
Crash volume on the last weeks (Week N is from 10-17 to 10-23):
W. N-1 W. N-2 W. N-3 W. N-4
- nightly 12 8 19 13
- aurora 105 76 95 18
- beta 22 25 26 10
- release 53 52 56 20
- esr 0 0 0 0
Affected platform: Windows
Crash rank on the last 7 days:
Browser Content Plugin
- nightly #127 #69
- aurora #33 #12
- beta #857 #390
- release #1292 #492
- esr
status-firefox49:
--- → affected
status-firefox50:
--- → affected
status-firefox51:
--- → affected
status-firefox-esr45:
--- → affected
Comment 8•8 years ago
|
||
Hi Kanru,
This becomes worse in 51.0b1. Can you help take a look at this one?
Flags: needinfo?(kchen)
Comment 9•8 years ago
|
||
The correlation tab for 52 shows:
(35.29% in signature vs 08.13% overall) "DXVA2D3D9+" in app_notes = true
(35.29% in signature vs 08.87% overall) "DXVA2D3D9?" in app_notes = true
(32.35% in signature vs 06.89% overall) "DXVA2D3D11?" in app_notes = true
(29.41% in signature vs 06.05% overall) "DXVA2D3D11+" in app_notes = true
And for 51
(98.70% in signature vs 06.41% overall) address = 0x0
(100.0% in signature vs 20.19% overall) reason = EXCEPTION_ACCESS_VIOLATION_READ
(07.98% in signature vs 62.64% overall) submitted_from_infobar = true
(57.00% in signature vs 15.68% overall) os_arch = amd64
(100.0% in signature vs 66.61% overall) plugin = false
(100.0% in signature vs 66.61% overall) plugin_version = null
(100.0% in signature vs 78.46% overall) shutdown_progress = null
(43.00% in signature vs 21.22% overall) os_arch = x86
(37.30% in signature vs 16.60% overall) "DXVA2D3D9?" in app_notes = true
(100.0% in signature vs 83.23% overall) ipc_channel_error = null
(34.69% in signature vs 14.63% overall) "DXVA2D3D9+" in app_notes = true
(04.56% in signature vs 37.15% overall) startup_crash = null ∧ submitted_from_infobar = true
(54.56% in signature vs 21.66% overall) "D3D11 Layers+" in app_notes = true ∧ startup_crash = 0
(54.56% in signature vs 21.66% overall) startup_crash = 0 ∧ "D3D11 Layers?" in app_notes = true
(02.28% in signature vs 31.16% overall) submitted_from_infobar = true ∧ "D3D11 Layers?" in app_notes = true
(02.28% in signature vs 31.16% overall) "D3D11 Layers+" in app_notes = true ∧ submitted_from_infobar = true
(03.42% in signature vs 25.46% overall) submitted_from_infobar = true ∧ startup_crash = 0
(04.56% in signature vs 26.69% overall) dom_ipc_enabled = 1 ∧ submitted_from_infobar = true
(20.03% in signature vs 03.43% overall) platform_version = 10.0.14393 ∧ os_arch = amd64
Looks like it's somehow more reproducible on DXVA2D3D9 and DXVA2D3D11 enabled platform.
Comment 10•8 years ago
|
||
some of the module correlations on beta also indicate that this could be related to media-playback:
js::IsDerivedProxyObject|EXCEPTION_ACCESS_VIOLATION_READ (267 crashes)
97% (259/267) vs. 35% (18271/51599) mozavcodec.dll
97% (259/267) vs. 35% (18272/51599) mozavutil.dll
89% (238/267) vs. 32% (16616/51599) msmpeg2vdec.dll
89% (238/267) vs. 33% (16849/51599) mf.dll
89% (238/267) vs. 33% (16861/51599) evr.dll
89% (238/267) vs. 33% (17114/51599) dxva2.dll
89% (238/267) vs. 33% (17224/51599) mfplat.dll
87% (231/267) vs. 33% (17222/51599) avrt.dll
82% (220/267) vs. 29% (15113/51599) dxgi.dll
93% (248/267) vs. 41% (21070/51599) NapiNSP.dll
93% (248/267) vs. 41% (21075/51599) nlaapi.dll
Comment 11•8 years ago
|
||
also adding the regression keyword, since the signature is spiking since 51 builds...
Keywords: regression
Version: 52 Branch → 51 Branch
Comment 14•8 years ago
|
||
Blake, the recent spike of this signature maybe related to a media-playback regression. Does comment 9 and comment 10 ring any bells?
Comment 15•8 years ago
|
||
My team is not very familiar with window platform yet.
Redirect ni to the expert, Chris. :-)
Flags: needinfo?(bwu) → needinfo?(cpearce)
Too late to fix in 50.1.0 release
Comment 17•8 years ago
|
||
Hi Anthony,
Can you help find someone to look at this if cpearce is not around since this crash is getting worse on Beta51?
Flags: needinfo?(ajones)
Hi Jean-Yves, Gerald, Jandem, we need a bit of help on this one. It's been a top crasher for a while and hoping we can get some sort of resolution soon.
Flags: needinfo?(jyavenard)
Flags: needinfo?(gsquelart)
Comment 19•8 years ago
|
||
As noted above, the majority of crash reports have URLs which are the facebook photo viewer. So I don't think that this is necessarily media related. The fact that video is or has been active when people visit facebook is not at all surprising; every third entry in my facebook feed is a video.
Flags: needinfo?(cpearce)
Comment 20•8 years ago
|
||
It's impossible to say much here. If we get a nullptr JSObject* it could be some (DOM) native returning a bogus value, it could be a regalloc bug, it could be a GC bug, it could be a TI bug, etc.
I'll try to repro this again, but it would be great if we could get some QA testing as well for the Facebook photo viewer. Do we have Facebook contacts?
I'm going to assume this is not a video issue for now.
Flags: needinfo?(jyavenard)
Flags: needinfo?(gsquelart)
Flags: needinfo?(ajones)
Updated•8 years ago
|
Comment 22•8 years ago
|
||
Tracking 52/53+ so we can address this crash. ni on Andrei to see if this is something (Facebook photo viewer) that SV can try to reproduce.
Comment 23•8 years ago
|
||
I tried to reproduce the issue on Windows 10 x64 & x86, Windows 8.1 x64 & x86, Windows 7 x64 & x86, using latest Nightly 53.0a1 (2017-01-03), latest Aurora 52.0a2 (2017-01-03) and 51.0b10 (20161222080852), focusing on facebook photo viewer and video player and also on other popular media websites but with no luck.
Flags: needinfo?(andrei.vaida)
Comment 24•8 years ago
|
||
This spike in crashes in 51 (and later) is nearing release. Not all the crashes are nullptrs. I agree with Anthony there's no indication this is media related.
Comment 25•8 years ago
|
||
There are probably different crashes here, the ones with address = 0x0 (99.6% of the crashes) spiked: https://crash-stats.mozilla.com/signature/?address=%3D0x0&signature=js%3A%3AIsDerivedProxyObject&date=%3E%3D2016-07-05T16%3A18%3A18.000Z&date=%3C2017-01-05T16%3A18%3A18.000Z#graphs.
The ones with address != 0x0 are pretty much stable: https://crash-stats.mozilla.com/signature/?address=%21%3D0x0&signature=js%3A%3AIsDerivedProxyObject&date=%3E%3D2016-07-05T16%3A18%3A18.000Z&date=%3C2017-01-05T16%3A18%3A18.000Z#graphs.
Keywords: topcrash,
topcrash-win
Comment 26•8 years ago
|
||
This is the most recurring stack trace: https://crash-stats.mozilla.com/report/index/53401a17-240d-467d-9962-2574e2170105#tab-details.
Comment 28•8 years ago
|
||
crashes with the address 0x0 were first recorded on 51.0a1 20160826030226 and occurred in regular frequency in subsequent builds: https://crash-stats.mozilla.com/search/?signature=%3Djs%3A%3AIsDerivedProxyObject&version=51.0a1&date=%3E%3D2016-07-05T21%3A43%3A08.000Z&_sort=-build_id&_facets=signature&_facets=build_id&_facets=platform_pretty_version&_facets=useragent_locale#facet-build_id (sort the list in that link by build id)
this would be the pushlog of patches landing in 51.0a1 20160826030226 -1 day:
https://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=01748a2b1a463f24efd9cd8abad9ccfd76b037b8&tochange=a551f534773cf2d6933f78ce7d82a7a33a99643e
and back a further 2 days:
https://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=24763f58772d45279a935790f732d80851924b46&tochange=01748a2b1a463f24efd9cd8abad9ccfd76b037b8
so maybe there are some clues about the cause of these crashes in this regression range...
Given the severity of this crash, we'd like to track this a s release blocking for Fx51. I hope we have a fix/mitigation in place soon that does not jeopardize the 51 go live date.
Comment 30•8 years ago
|
||
(In reply to [:philipp] from comment #28)
> so maybe there are some clues about the cause of these crashes in this
> regression range...
The main thing in that regression range is Shu's frontend rewrite, bug 1263355. Unfortunately that patch is so huge that (1) backout is impossible for sure, (2) we're not going to find this by looking at the patch. Bug 1324773 is a regression from bug 1263355 that still has to land on beta, but the odds of it being related are small.
As I said before, most of these crashes are on Facebook's photo viewer, with URLs that look like this:
https://www.facebook.com/photo.php?fbid=...&set=...&type=3&theater
I know we have contacts at Facebook. As I said 2 weeks ago, we need to get them involved - maybe they have better ways to test this or they can tell us what's special about that page. Who can work on that?
I'll try to reproduce this today, but it has been impossible so far.
(In reply to Marco Castelluccio [:marco] from comment #31)
> Dees, see comment 30.
I've just added both :marco and :jandem to the Facebook DL. Please use that channel of communication and start a new thread. They are super responsive.
Flags: needinfo?(dchinniah)
Updated•8 years ago
|
platform-rel: --- → ?
Whiteboard: [platform-rel-Facebook]
Comment 33•8 years ago
|
||
(In reply to Desigan Chinniah [:cyberdees] [:dees] [London - GMT] from comment #32)
> I've just added both :marco and :jandem to the Facebook DL. Please use that
> channel of communication and start a new thread. They are super responsive.
I didn't get any emails from that. Could you please email me the details?
Flags: needinfo?(dchinniah)
Comment 34•8 years ago
|
||
Oh nevermind, I found it on the Google Groups page.
Flags: needinfo?(dchinniah)
In case this helps, I hit this crash on my win10 laptop a few days back while looking at FB photos and zooming in/out. This was my comment "Crashed when zooming in using Ctrl+ shortcut key on FB feed." and the attached crash report is from my laptop from Jan 2nd.
Please lemme know if I can help in anyway. I will try to spend more time on FB (now officially) to repro this :)
This is the crash report I was referring to in comment 35 https://crash-stats.mozilla.com/report/index/bp-9f16da3d-dbf9-4c5f-918e-7bebd2170103
Comment 37•8 years ago
|
||
Crashes with the signature "js::SetLengthProperty" are also often happening with the Facebook photo viewer.
Crash Signature: [@ js::IsDerivedProxyObject ] → [@ js::IsDerivedProxyObject ]
[@ js::SetLengthProperty ]
A lot of folks at Mozilla have been trying to repro this crash including me and we've had no luck so far. Given that, I am removing the release blocking tag from this bug. The crash volume is still a bit worrisome but not worth blocking the release on.
Ran into the crash again! lots of ctrl+, ctrl-, scrolling up and down on the FB feed. I noticed that this is harder to repro with ctrl+mouse scroll but rather repros with ctrl+, ctrl-
https://crash-stats.mozilla.com/report/index/d3349291-32f0-4af3-a44b-316182170111
I was at zoom level 50% when it crashes and was changing zoom levels from 30% - 120% while going in & out of photo viewer mode. Doing all of these in rapid succession while scrolling up and down on my FB feed :)
Comment 42•8 years ago
|
||
(In reply to Ritu Kothari (:ritu) from comment #40)
> Ran into the crash again! lots of ctrl+, ctrl-, scrolling up and down on the
> FB feed. I noticed that this is harder to repro with ctrl+mouse scroll but
> rather repros with ctrl+, ctrl-
>
> https://crash-stats.mozilla.com/report/index/d3349291-32f0-4af3-a44b-
> 316182170111
Are you doing Ctrl+/- from within the photo theater view or on the feed page itself?
This time (3rd time) I hit it while rapidly clicking on -> key in FB photo viewer, zoom level at 30% https://crash-stats.mozilla.com/report/index/2582ca3a-b7de-4316-a08e-3f7df2170111
(In reply to Shu-yu Guo [:shu] from comment #42)
> (In reply to Ritu Kothari (:ritu) from comment #40)
> > Ran into the crash again! lots of ctrl+, ctrl-, scrolling up and down on the
> > FB feed. I noticed that this is harder to repro with ctrl+mouse scroll but
> > rather repros with ctrl+, ctrl-
> >
> > https://crash-stats.mozilla.com/report/index/d3349291-32f0-4af3-a44b-
> > 316182170111
>
> Are you doing Ctrl+/- from within the photo theater view or on the feed page
> itself?
In both, so zoom in/out while scrolling my fb feed and zoom in/out while going back/forward using <-/-> key on my fb photos
Can we annotate the crash dump with the filename/lineno of the JS script? That way maybe the Facebook people could work around this somehow.
Maybe in GetPropertyIC::update we could call CrashReporter::AnnotateCrashReport with the filename/lineno? That function isn't very fast, but we would only have to leave it in for a few days. We would just have to add a JS engine hook for AnnotateCrashReport.
Comment 46•8 years ago
|
||
We've talked in the past about getting the profiler pseudostack into the crash reports since it's likely to not be corrupted. With the changes to how the JS profiling information is stored it's more complicated now since you have to call to the JS engine to resolve the JS IP. Shu would be a good person to ask but it may be really risky to do from a crash handler with the newer architecture.
Comment 47•8 years ago
|
||
In bug 1295918 we plan to retrieve content JS stacks from the parent process. But it might be too late for this bug.
(In reply to Bill McCloskey (:billm) from comment #45)
> Can we annotate the crash dump with the filename/lineno of the JS script?
> That way maybe the Facebook people could work around this somehow.
>
> Maybe in GetPropertyIC::update we could call
> CrashReporter::AnnotateCrashReport with the filename/lineno? That function
> isn't very fast, but we would only have to leave it in for a few days. We
> would just have to add a JS engine hook for AnnotateCrashReport.
I don't know whit is GetPropertyIC::update. Will annotating in GetPropertyIC::update cause significant slow down? Maybe we can record the filename/lineno when we enter JS engine.
Flags: needinfo?(kchen)
Comment 48•8 years ago
|
||
Since ritu can repro... would it help to have her run under GDB and dump the JS stack on a crash? Or in general give another engineer access to drive GDB (or MSVC) once she has a crash?
Comment 49•8 years ago
|
||
(In reply to Randell Jesup [:jesup] from comment #48)
> Since ritu can repro... would it help to have her run under GDB and dump the
> JS stack on a crash? Or in general give another engineer access to drive
> GDB (or MSVC) once she has a crash?
I'm heading to MV tomorrow to take a look with her.
Comment 50•8 years ago
|
||
I spent a few hours with Ritu trying to reproduce the crash again. She was unable to reproduce again with the crashing build from comment 40 after an hour or two. :(
(In reply to Kan-Ru Chen [:kanru] (UTC+8) from comment #47)
> In bug 1295918 we plan to retrieve content JS stacks from the parent
> process. But it might be too late for this bug.
>
> (In reply to Bill McCloskey (:billm) from comment #45)
> > Can we annotate the crash dump with the filename/lineno of the JS script?
> > That way maybe the Facebook people could work around this somehow.
> >
> > Maybe in GetPropertyIC::update we could call
> > CrashReporter::AnnotateCrashReport with the filename/lineno? That function
> > isn't very fast, but we would only have to leave it in for a few days. We
> > would just have to add a JS engine hook for AnnotateCrashReport.
>
> I don't know whit is GetPropertyIC::update.
It's a function that's higher up in the crash stack.
> Will annotating in
> GetPropertyIC::update cause significant slow down?
Quite possibly. We could use a faster annotation, but I'm not sure it's worth it. The crash annotation function iterates over a hashtable with maybe 100 things and builds a string out of them. I don't have a sense of how often we update GetProp ICs, but this seems like a tolerable cost for a few days on Nightly.
> Maybe we can record the
> filename/lineno when we enter JS engine.
That could be very far away from where the problem occurs.
Comment 52•8 years ago
|
||
I have pretty reliable STR now and I have an rr recording.
Comment 53•8 years ago
|
||
We have a patch for this.
Let's wait for confirmation from crash-stats before we close this.
Flags: needinfo?(jdemooij)
Updated•8 years ago
|
Priority: -- → P1
Comment 54•8 years ago
|
||
Fixed in bug 1331058.
Verified the crash signature dropped to zero in last 7 days.
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
You need to log in
before you can comment on or make changes to this bug.
Description
•