Closed Bug 883973 Opened 7 years ago Closed 7 years ago

crash in JSFunction::getExistingScript

Categories

(Core :: JavaScript Engine, defect, critical)

24 Branch
x86
All
defect
Not set
critical

Tracking

()

RESOLVED FIXED
mozilla25
Tracking Status
firefox23 --- unaffected
firefox24 + verified
firefox25 --- verified

People

(Reporter: scoobidiver, Assigned: bhackett)

References

()

Details

(5 keywords)

Crash Data

Attachments

(2 files, 1 obsolete file)

It first showed up in 24.0a1/20130617 and is #2 top crasher in today's build. The regression range is:
http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=36da3cb92193&tochange=834c8941ae24
It's likely a regression from bug 883630.

Signature 	JSFunction::getExistingScript() More Reports Search
UUID	0f753241-d613-4aea-a1a6-1ccee2130617
Date Processed	2013-06-17 17:56:05
Uptime	3072
Last Crash	2.1 weeks before submission
Install Age	51.2 minutes since version was first installed.
Install Time	2013-06-17 17:04:32
Product	Firefox
Version	24.0a1
Build ID	20130617031112
Release Channel	nightly
OS	Windows NT
OS Version	6.1.7601 Service Pack 1
Build Architecture	x86
Build Architecture Info	GenuineIntel family 15 model 1 stepping 3
Crash Reason	EXCEPTION_ACCESS_VIOLATION_READ
Crash Address	0xffffffffffffff99
App Notes 	
AdapterVendorID: 0x8086, AdapterDeviceID: 0x2772, AdapterSubsysID: 27728086, AdapterDriverVersion: 8.15.10.1930
D3D10 Layers? D3D10 Layers- D3D9 Layers? D3D9 Layers- 
Processor Notes 	sp-processor07_phx1_mozilla_com_26539:2012
EMCheckCompatibility	True
Adapter Vendor ID	0x8086
Adapter Device ID	0x2772
Total Virtual Memory	2147352576
Available Virtual Memory	1698312192
System Memory Use Percentage	90
Available Page File	1016975360
Available Physical Memory	96198656

Frame 	Module 	Signature 	Source
0 	mozjs.dll 	JSFunction::getExistingScript 	js/src/jsfuninlines.h:230
1 	mozjs.dll 	js::ion::InlineFrameIteratorMaybeGC<1>::InlineFrameIteratorMaybeGC<1> 	js/src/ion/IonFrameIterator-inl.h:66
2 	mozjs.dll 	js::ion::GetPcScript 	js/src/ion/IonFrames.cpp:1019
3 	mozjs.dll 	js::GetPropertyHelper 	js/src/jsobj.cpp:4019
4 	mozjs.dll 	js::ion::GetPropertyIC::update 	js/src/ion/IonCaches.cpp:1516
5 		@0x17673cf0

More reports at:
https://crash-stats.mozilla.com/report/list?signature=JSFunction%3A%3AgetExistingScript%28%29
FWIW, triggered reliably for me by middle-wheel scrolling Salon pages such as http://www.salon.com/2013/05/17/power_tool_industry_too_powerful_to_regulate_partner/
I can't reproduce.  This crash doesn't make much sense to me, and looking at the reports 100% of the crashes are on Windows.  Has anyone (else) tried to reproduce on a non-Windows platform?  I am wondering if this is a PGO bug.
Attached file test PGO hypothesis
Disable PGO around the offending function, to see if that does anything.
Attachment #763823 - Flags: review+
Kannan pointed out that this is very similar to bug 882433, and looking at the signatures I am quite sure this is just a morphed signature for that bug.  That bug's crashes are also 100% on Windows.
Assigning to Brian as he has done the work here so far.
Assignee: general → bhackett1024
This seems to have fixed the issue - it's dropped from topcrasher.  Also, I'm pretty sure bug 882433 is a dup of this.
Duplicate of this bug: 882433
(In reply to Kannan Vijayan [:djvj] from comment #9)
> This seems to have fixed the issue - it's dropped from topcrasher.
No. There are still crashes in 24.0a1/20130619 at the same pace, 15 crashes an hour.
This is highly explosive on nightly.  In addition to crashing in this signature on Windows, Bughunter crashed on Linux with the signature from dupe bug 882433 at:

http://www.walmart.ca/en/ip/xbox-one-day-one-pre-order-edition/10234475?trail=&fromPLP=true&ancestorID=&searchString=&startSearch=&fromSearchBox=&addFacet=

Confirmed this does not affect beta22 or aurora23.
I've made some progress in determining what the issue is here.  Something is going wrong with snapshot iteration.  My debug-fu with Visual Studio is not great, so I've been having some issues, but here is what I've managed to find:

In |InlineFrameIteratorMaybeGC<allowGC>::findNextFrame()|, we copy the initial SnapshotIterator, and step through the frames.

The crash is happening at the "callee_->getExistingScript()".
"callee_" in this case, is 0xffffff85.

Here's that entire block of code for reference:
    si_ = start_;

    // Read the initial frame.
    callee_ = frame_->maybeCallee();
    script_ = frame_->script();
    pc_ = script_->code + si_.pcOffset();
#ifdef DEBUG
    numActualArgs_ = 0xbadbad;
#endif

    // This unfortunately is O(n*m), because we must skip over outer frames
    // before reading inner ones.
    unsigned remaining = start_.frameCount() - framesRead_ - 1;
    for (unsigned i = 0; i < remaining; i++) {
        JS_ASSERT(js_CodeSpec[*pc_].format & JOF_INVOKE);

        // Recover the number of actual arguments from the script.
        if (JSOp(*pc_) != JSOP_FUNAPPLY)
            numActualArgs_ = GET_ARGC(pc_);

        JS_ASSERT(numActualArgs_ != 0xbadbad);

        // Skip over non-argument slots, as well as |this|.
        unsigned skipCount = (si_.slots() - 1) - numActualArgs_ - 1;
        for (unsigned j = 0; j < skipCount; j++)
            si_.skip();

        Value funval = si_.read();

        // Skip extra slots.
        while (si_.moreSlots())
            si_.skip();

        si_.nextFrame();

        callee_ = funval.toObject().toFunction();

        // Inlined functions may be clones that still point to the lazy script
        // for the executed script, if they are clones. The actual script
        // exists though, just make sure the function points to it.
        script_ = callee_->getExistingScript();

        pc_ = script_->code + si_.pcOffset();
    }

    framesRead_++;

callee_ is getting initialized from |funval|.  Which is getting read off of the snapshotIterator (si_).  Clearly it's reading some bogus data.

One confusing thing I see is the initialization of |numActualArgs_|:
        if (JSOp(*pc_) != JSOP_FUNAPPLY)
            numActualArgs_ = GET_ARGC(pc_);

numActualArgs_ doesn't get set anywhere else.  If the inlined function is on a FUNAPPLY op, then we will skip initialization of numActualArgs_ and it'll keep the value from the previous loop iteration (i.e. the enclosing inlined frame).  I don't understand how this can work at all.

Am I missing something?
Flags: needinfo?(bhackett1024)
Clearing needinfo.  Just figured out what the reasoning behind the FUNAPPLY logic is.
Flags: needinfo?(bhackett1024)
I have a breakpoint set at the right position.  Tomorrow I'll sit down and manually walk through the encoded snapshot and cross-reference it with the corresponding Ion stack.
Regression window(m-i)
Goofd:
http://hg.mozilla.org/integration/mozilla-inbound/rev/03c829d7d4e7
Mozilla/5.0 (Windows NT 6.1; WOW64; rv:24.0) Gecko/20130619 Firefox/24.0 ID:20130619044223
Crash:
http://hg.mozilla.org/integration/mozilla-inbound/rev/01068ed464ca
Mozilla/5.0 (Windows NT 6.1; WOW64; rv:24.0) Gecko/20130619 Firefox/24.0 ID:20130619060920
Pushlog:
http://hg.mozilla.org/integration/mozilla-inbound/pushloghtml?fromchange=03c829d7d4e7&tochange=01068ed464ca
Blocks: 884298
OS: Windows 7 → All
Copying the crash signatures from the dupe.
Crash Signature: [@ JSFunction::getExistingScript()] → [@ JSFunction::getExistingScript()] [@ js::ion::InlineFrameIteratorMaybeGC<int>::findNextFrame()] [@ js::ion::InlineFrameIteratorMaybeGC<(js::AllowGC)1>::findNextFrame() ] [@ js::ion::SnapshotReader::readSlot() ] [@ js::ion::CompactBufferReader::readS…
It accounts for 29% of crashes for the last three builds.

The signature has morphed because of bug 883630.

Reports at:
https://crash-stats.mozilla.com/report/list?signature=JSFunction%3A%3AexistingScript%28%29
Crash Signature: js::ion::CompactBufferReader::readSigned() ] [@ js::GetPropertyHelper(JSContext*, JS::Handle<JSObject*>, JS::Handle<int>, unsigned int, JS::MutableHandle<JS::Value>)] → js::ion::CompactBufferReader::readSigned() ] [@ js::GetPropertyHelper(JSContext*, JS::Handle<JSObject*>, JS::Handle<int>, unsigned int, JS::MutableHandle<JS::Value>)] [@ JSFunction::existingScript()]
I think bug 884310 has increased how often we hit this. Also I don't know if this is way above average or not. So I don't know what the options are. Especially since Monday is merge day and this will get into aurora ...

If needed I can back out bug 884310. Than we should have the same crash-rate as before June 20th. That bug is only a performance improvement. Would a bit annoying the have it only in FF25, instead of FF24, but nothing is blocked on it...
(In reply to Hannes Verschore [:h4writer] from comment #20)
> I think bug 884310 has increased how often we hit this.
Don't know yet as it landed in today's build.

The backout before switching channels, if required, would be for a bug in the regression range of bug 882433.
Got this crash going to http://www.walmart.ca/en
Alice, can you determine the initial regression range, the one of bug 882433?
(In reply to Scoobidiver from comment #23)
> Alice, can you determine the initial regression range, the one of bug 882433?

STR please.
(In reply to Alice0775 White from comment #24)
> STR please.
Go to the URL of this bug or the dupe and scroll down.
(In reply to Scoobidiver from comment #25)
> (In reply to Alice0775 White from comment #24)
> > STR please.
> Go to the URL of this bug or the dupe and scroll down.

On Windows7:
Regression window(m-i)
Good:
http://hg.mozilla.org/integration/mozilla-inbound/rev/1df122edcf0d
Mozilla/5.0 (Windows NT 6.1; WOW64; rv:24.0) Gecko/20130611 Firefox/24.0 ID:20130611113217
Bad:
http://hg.mozilla.org/integration/mozilla-inbound/rev/1684c32be328
Mozilla/5.0 (Windows NT 6.1; WOW64; rv:24.0) Gecko/20130611 Firefox/24.0 ID:20130611125115
Pushlog:
http://hg.mozilla.org/integration/mozilla-inbound/pushloghtml?fromchange=1df122edcf0d&tochange=1684c32be328


On Ubuntu12.04
Regression window(m-i)
Good:
http://hg.mozilla.org/integration/mozilla-inbound/rev/3a6cd8d533b7
Mozilla/5.0 (X11; Linux i686; rv:24.0) Gecko/20130611 Firefox/24.0 ID:20130611123054
Bad:
http://hg.mozilla.org/integration/mozilla-inbound/rev/1684c32be328
Mozilla/5.0 (X11; Linux i686; rv:24.0) Gecko/20130611 Firefox/24.0 ID:20130611125115
Pushlog:
http://hg.mozilla.org/integration/mozilla-inbound/pushloghtml?fromchange=3a6cd8d533b7&tochange=1684c32be328

Regressed by:
1684c32be328	Kannan Vijayan — Bug 859609 - Inline functions that use the scope chain, and also inline call sites with monomorphic cloned lambdas. r=h4writer
Blocks: 859609
Removing bug 883630 from the dependencies as it's only responsible of the signature morphing.
No longer blocks: 883630
(In reply to Alice0775 White from comment #26)
> Regressed by:
> 1684c32be328	Kannan Vijayan — Bug 859609 - Inline functions that use the
> scope chain, and also inline call sites with monomorphic cloned lambdas.
> r=h4writer

Kannan, I think we should back this out. The most suspicious change is that we no longer require inline frames to have a constant callee, but InlineFrameIterator::findNextFrame expects a constant JSFunction (to reconstruct the stack).
Flags: needinfo?(kvijayan)
(In reply to Jan de Mooij [:jandem] from comment #28)
> (In reply to Alice0775 White from comment #26)
> > Regressed by:
> > 1684c32be328	Kannan Vijayan — Bug 859609 - Inline functions that use the
> > scope chain, and also inline call sites with monomorphic cloned lambdas.
> > r=h4writer
> 
> Kannan, I think we should back this out. The most suspicious change is that
> we no longer require inline frames to have a constant callee, but
> InlineFrameIterator::findNextFrame expects a constant JSFunction (to
> reconstruct the stack).

::findNextFrame doesn't expect a constant function, see:
        Value funval = si_.read();

I ran through this on Friday and discovered that either the snapshot info or the register dump was getting corrupted.

I agree with your sentiments on backing it out to remove the topcrash, but I don't want to back out the logic - I'd prefer to disable it to remove the topcrash, and then fix the issue with the code, and then re-enable it once we know that it's addressed.

Now that the bug is easily reproducible, it should be easy to test the fix.
Flags: needinfo?(kvijayan)
This just disables inlining of heavyweight functions.  It resolves the crash on AndroidPolice.com, and presumably the others.  I'll re-enable it after I can fix the bug that's causing this.
Attachment #766777 - Flags: review?(dvander)
Comment on attachment 766777 [details] [diff] [review]
Disable heavyweight inlining to resolve topcrash.

Review of attachment 766777 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/ion/IonBuilder.cpp
@@ +198,5 @@
>              JS_ASSERT(appendOk);
>          } else {
> +            /* Temporarily disabling this logic. */
> +            return true;
> +            /*

#if 0 instead
Attachment #766777 - Flags: review?(dvander) → review+
Please make sure this gets landed on Aurora as well. This makes my Firefox unusable on some sites.
https://hg.mozilla.org/integration/mozilla-inbound/rev/d37d553cbb3f

If inbound is green, I'll ask for aurora uplift.
The leave open whiteboard was for the previous patch.

Does that one need to be backed out?
Status: NEW → RESOLVED
Closed: 7 years ago
Flags: needinfo?(bhackett1024)
Resolution: --- → FIXED
Whiteboard: [leave open]
Target Milestone: --- → mozilla25
(In reply to Scoobidiver from comment #35)
> The leave open whiteboard was for the previous patch.
> 
> Does that one need to be backed out?

Given that it didn't fix the issue, yes, it should be.

But leave the 'leave open' around.  I still need to fix the bug and re-enable the inlining of heavyweight functions.  I have some other higher priority work to take care of in the next week, so I'll need to punt on that for a bit.
Attaching actual patch that was pushed.
Attachment #766777 - Attachment is obsolete: true
Comment on attachment 767194 [details] [diff] [review]
Disable heavyweight inlining to resolve topcrash.

[Approval Request Comment]
Bug caused by (feature/regressing bug #):
Bug 859609

User impact if declined: 
Severe crashiness in browser.

Testing completed (on m-c, etc.): 
Green on m-i for one night.

Risk to taking this patch (and alternatives if risky): 
Low risk.  Disables one single optimization.

String or IDL/UUID changes made by this patch:
None.
Attachment #767194 - Flags: approval-mozilla-aurora?
Attachment #767194 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Attachment #767194 - Flags: checkin?
Attachment #767194 - Flags: checkin?
Duplicate of this bug: 885726
Blocks: 884053
(In reply to Scoobidiver from comment #35)
> The leave open whiteboard was for the previous patch.
> 
> Does that one need to be backed out?

Yes.

https://hg.mozilla.org/integration/mozilla-inbound/rev/88a8fe760189
Flags: needinfo?(bhackett1024)
Bug needs to be reopened and [leave open] as per comment #36.
(In reply to Florian Bender from comment #43)
> Bug needs to be reopened and [leave open] as per comment #36.
Crashes are fixed. Please file a new bug for the underlying issue.
I'd rather leave that to Kannan, he probably knows better than me ;).
Flags: needinfo?(kvijayan)
It's fine either way.  I'll go ahead and file a new issue for re-enabling heavyweight inlining.  I'm not sure when I'll be able to get to it.
Flags: needinfo?(kvijayan)
Blocks: 897572
Verified as fixed with Firefox 24 beta 8 (build ID: 20130902131354), on Mac OSX 10.7.5 in 32bit mode, Ubuntu 12.10 32bit and Win 8 32bit. No more crashing with the URLs from comment 1, comment 12 and comment 22.

Reports from Socorro, regarding last month:

1) for the first signature, there aren't any crashes regarding last month

2) for the second signature, there is 1 crash with 24.0b4

https://crash-stats.mozilla.com/report/list?signature=js%3A%3Aion%3A%3AInlineFrameIteratorMaybeGC%3Cint%3E%3A%3AfindNextFrame%28%29&product=Firefox&query_type=contains&range_unit=weeks&process_type=any&hang_type=any&date=2013-09-03+09%3A00%3A00&range_value=4

3) for the third signature, there aren't any crashes regarding last month

4) for the 4th signature, there are 2 crashes with 24.0b7 and 1 crash with 24.0b8

https://crash-stats.mozilla.com/report/list?signature=js%3A%3Aion%3A%3ASnapshotReader%3A%3AreadSlot%28%29&product=Firefox&query_type=contains&range_unit=weeks&process_type=any&hang_type=any&date=2013-09-03+09%3A00%3A00&range_value=4

5) for the 5th signature, there aren't any crashes regarding last month

6) for the 6th signature, there are 3 crashes with 24.0b7

https://crash-stats.mozilla.com/report/list?signature=js%3A%3AGetPropertyHelper%28JSContext%2A%2C+JS%3A%3AHandle%3CJSObject%2A%3E%2C+JS%3A%3AHandle%3Cint%3E%2C+unsigned+int%2C+JS%3A%3AMutableHandle%3CJS%3A%3AValue%3E%29&product=Firefox&query_type=contains&range_unit=weeks&process_type=any&hang_type=any&date=2013-09-03+09%3A00%3A00&range_value=4

7) for the 7th signature, there is 1 crash with 24.0b5

https://crash-stats.mozilla.com/report/list?signature=JSFunction%3A%3AexistingScript%28%29&product=Firefox&query_type=contains&range_unit=weeks&process_type=any&hang_type=any&date=2013-09-03+09%3A00%3A00&range_value=4
QA Contact: manuela.muntean
(In reply to Manuela Muntean [:Manuela] [QA] from comment #47)
> Verified as fixed with Firefox 24 beta 8 (build ID: 20130902131354)

Please also check Firefox 25.
Keywords: verifyme
Verified as fixed with latest Aurora (build ID: 20130910004002), on Ubuntu 12.10 32bit and Win XP 32bit. No more crashing with the URLs from comment 1, comment 12 and comment 22.

Reports from Socorro, regarding last month:

1) for the first signature, there aren't any crashes

2) for the second signature, there are 29 crashes with 25.0a2

https://crash-stats.mozilla.com/report/list?signature=js%3A%3Aion%3A%3AInlineFrameIteratorMaybeGC%3Cint%3E%3A%3AfindNextFrame%28%29&product=Firefox&query_type=contains&range_unit=weeks&process_type=any&hang_type=any&date=2013-09-03+09%3A00%3A00&range_value=4

3) for the third signature, there aren't any crashes

4) for the 4th signature, there are 5 crashes with 25.0a2

https://crash-stats.mozilla.com/report/list?signature=js%3A%3Aion%3A%3ASnapshotReader%3A%3AreadSlot%28%29&product=Firefox&query_type=contains&range_unit=weeks&process_type=any&hang_type=any&date=2013-09-03+09%3A00%3A00&range_value=4

5) for the 6th signature, there are 4 crashes with 25.0a2

https://crash-stats.mozilla.com/report/list?signature=js%3A%3AGetPropertyHelper%28JSContext%2A%2C+JS%3A%3AHandle%3CJSObject%2A%3E%2C+JS%3A%3AHandle%3Cint%3E%2C+unsigned+int%2C+JS%3A%3AMutableHandle%3CJS%3A%3AValue%3E%29&product=Firefox&query_type=contains&range_unit=weeks&process_type=any&hang_type=any&date=2013-09-03+09%3A00%3A00&range_value=4

6) for the 7th signature, there are 43 crashes with 25.0a2

https://crash-stats.mozilla.com/report/list?signature=JSFunction%3A%3AexistingScript%28%29&product=Firefox&query_type=contains&range_unit=weeks&process_type=any&hang_type=any&date=2013-09-03+09%3A00%3A00&range_value=4
Hi all,
whether the fix is indeed being back-ported to a Firefox 24.x ESR release?
If you look at the tracking flags, this fix should already be on Firefox 24.  If you are seeing a similar crash, please file a new bug in the Javascript Engine component.
I have the similar crash in Firefox 24.4.0 ESR (the 7th signature [@ JSFunction::existingScript()]).
could you tell me about the version FF is fixed?
Additional info:
Discussion @ https://support.mozilla.org/en-US/questions/990856#answer-547750
Crash Report from Duong Nguyen
Firefox 24.4.0esr Crash Report [@ JSFunction::existingScript() ]
https://crash-stats.mozilla.com/report/index/bp-bd57776b-b975-41ac-8593-cdd802140324
You need to log in before you can comment on or make changes to this bug.