Closed Bug 694752 Opened 9 years ago Closed 9 years ago

crash JSC::Yarr::Interpreter::interpret

Categories

(Core :: JavaScript Engine, defect, critical)

x86
All
defect
Not set
critical

Tracking

()

RESOLVED WORKSFORME
mozilla10

People

(Reporter: wgianopoulos, Assigned: cdleary)

References

Details

(Keywords: crash, regression)

Crash Data

Attachments

(1 file)

This bug was filed from the Socorro interface and is 
report bp-fa4c7ffd-6ccb-4267-98c7-8db562111015 .
============================================================= 

This has happened 2 times within about 10 minutes of updating to the current nightly.

10.0a1 (2011-10-15)
Built from http://hg.mozilla.org/mozilla-central/rev/6d5fd5a30c71
Assignee: nobody → general
Component: General → JavaScript Engine
Product: Firefox → Core
QA Contact: general → general
First crash was this report.

bp-0c1d9cbe-0fd8-4597-b6eb-092812111015
This crash also seems related although the stack is a little different.

bp-59375960-a360-4e5e-a8e5-0fa8c2111015

All of these crashes seem to be related to using tbpl.mozilla.org which I do all the time.

The previous nightly definitely did not crash, so a gross regression window is:

http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=9545b88eed82&tochange=6d5fd5a30c71
Steps to reproduce:

1.  Launch a new copy of Firefox.
2.  Navigate to https://tbpl.mozilla.org/
3.  Click on the new tab button to open a second tab with about:blank
4.  Do not click back to TBPL so that it is NOT the active tab.
5.  Wait a few minutes.
This has happened to me twice within the first 15 mins of using:
Mozilla/5.0 (Windows NT 6.1; WOW64; rv:10.0a1) Gecko/20111015 Firefox/10.0a1

bp-ea68531d-b22b-438b-a677-66f9a2111015
bp-a3c70293-b913-4685-9878-723c42111015

As for Bill, just had TBPL (+gmail open) & had worked fine with the nightly from the day before. TBPL was pinned and not the active tab at the time.
OS: Linux → All
Crash Signature: [@ JSC::Yarr::Interpreter::interpret] → [@ JSC::Yarr::Interpreter::interpret] [@ JSC::Yarr::Interpreter::interpret()]
Using mozilla-central tinderbox builds...

Last known good (worked for as long as I tried it [15 mins], but pretty confident):
http://hg.mozilla.org/mozilla-central/rev/68799855e853

First known bad (crashed within 2-3 mins):
http://hg.mozilla.org/mozilla-central/rev/349f3d4b2d87

Reduced range:
http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=68799855e853&tochange=349f3d4b2d87

This range + the stacktrace makes me suspect bug 673188; CC'ing Chris.
Blocks: 673188
I should point out that i have had this occur while TBPL was in the active tab, just having the tab be inactive was the most reliable way to reproduce.  Not exactly sure why.
Changing importance to blocker since it impact the ability to use TBPL.
Severity: critical → blocker
Whilst we were waiting for the comment 8 try run to complete, myself/Bill tried the tinderbox m-c tip build, which worked fine. So sometime between http://hg.mozilla.org/mozilla-central/rev/6d5fd5a30c71 and http://hg.mozilla.org/mozilla-central/rev/3b58a9df4c8c , the issue was resolved. The only candidate I can think of is bug 694527.

Anyway, I've respun nightlies on m-c tip (3b58a9df4c8c), which should mean we can use TBPL at least.

Chris, I don't know if there are any edge cases left that could still be causing problems, so can I leave it to you to:
- reduce the TBPL testcase & add a crashtest if you think it's appropriate
- confirm bug 694527 fixed the crash in its entirety

Leaving this bug open for now, for that investigation.

Thanks :-)
Assignee: general → cdleary
Severity: blocker → critical
I too am having crashes related to having tbpl is a background tab, but my crash sig is different: 

https://crash-stats.mozilla.com/report/index/bp-3277f865-dae0-4194-a9bb-fed362111015

usually will crash in first 6-7 mins if you open the browser and let it sit... interaction seems to extend the period of time before the browser crashes, but I do suspect that on 'idle' something is doing a clean-up, GC or CC ?  

and just now got a crash with the yarr...

https://crash-stats.mozilla.com/report/index/bp-d59c318c-154b-4524-a875-f13492111015

today's nightly: 
Mozilla/5.0 (Windows NT 6.1; WOW64; rv:10.0a1) Gecko/20111015 Firefox/10.0a1

cset: http://hg.mozilla.org/mozilla-central/rev/6d5fd5a30c71

closing the tab with tbpl seems to ease the crashing.
Many thanks Ed! I think I know what the problem was, I'll try to get a crasher test case at that changeset to confirm (and for addition to the suite).
Using an hourly build with the same cset as the respin noted above, I've not had any crashes in over an hour, looks like the fix for bug 694527 fixed the crash in this bug.
This segfaults on the specified revision because the incref path was removed.

In detail: using the script's RegExpObject directly causes it to create a RegExpPrivate in the first loop. In the second loop, the inference can't determine that the builtin RegExp.prototype.test is the one being called, so we compile the path that doesn't incref the non-null RegExpPrivate. Then, gc() to run out of the interpreter, which performs a clone on an invalid RegExpPrivate source.
Attachment #567314 - Flags: review?
That was quick! :-)

(The review wasn't targeted at anyone, was this intentional?)
Attachment #567314 - Flags: review? → review?(bhackett1024)
(In reply to Ed Morley [:edmorley] from comment #14)
> (The review wasn't targeted at anyone, was this intentional?)

Nope, thanks for pointing that out.
Comment on attachment 567314 [details] [diff] [review]
Tricky recompilation path.

The patch is missing code changes, and just has the added test.
Attachment #567314 - Flags: review?(bhackett1024)
Or, looking at comment 13, are things correct on tip?  If so, I don't think you need review to add tests.
Comment on attachment 567314 [details] [diff] [review]
Tricky recompilation path.

I want to make sure that my understanding of the recompilation path is correct.

Here I do an eval that mutates the regexp prototype, but that may be way overkill (the source of the bug didn't do anything so nuclear). What's the spectrum of things I can do to get recompilation to fall back from the "one known native use" path in the mjit compiler to the "unknown use but inline clone" path?
Attachment #567314 - Flags: feedback?(bhackett1024)
(In reply to Chris Leary [:cdleary] from comment #18)
> Comment on attachment 567314 [details] [diff] [review] [diff] [details] [review]
> Tricky recompilation path.
> 
> I want to make sure that my understanding of the recompilation path is
> correct.
> 
> Here I do an eval that mutates the regexp prototype, but that may be way
> overkill (the source of the bug didn't do anything so nuclear). What's the
> spectrum of things I can do to get recompilation to fall back from the "one
> known native use" path in the mjit compiler to the "unknown use but inline
> clone" path?

Using eval() is a good way of making sure TI doesn't see a side effect until it actually occurs.  In this case it's not necessary, though, as the barriers on the property accesses in 'RegExp.prototype' mean that the effects will not show up in type information until the code executes.
(In reply to Robert Kaiser (:kairo@mozilla.com) from comment #20)
> We have some other Yarr crashes on Nightly since 2011-10-15

All of the crashes I looked at only had the build id 2011-10-15-03-10-37 and we seem to have spun three new nightlies since then. If you see something that contradicts that, definitely let me know.
Attachment #567314 - Flags: feedback?(bhackett1024)
https://hg.mozilla.org/mozilla-central/rev/b152ca691213
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Resolution: FIXED → WORKSFORME
I am confused.  This was a bug, and code was checked in (see comment 22) to fix it.

Why is FIXED not the correct resolution.
(In reply to Bill Gianopoulos from comment #24)
> I am confused.  This was a bug, and code was checked in (see comment 22) to
> fix it.
> 
> Why is FIXED not the correct resolution.

That checkin adds a test case. I assumed it was a test case to cover regressions on the original bug, which presumably was fixed elsewhere.
(In reply to David Mandelin from comment #25)
> (In reply to Bill Gianopoulos from comment #24)
> > I am confused.  This was a bug, and code was checked in (see comment 22) to
> > fix it.
> > 
> > Why is FIXED not the correct resolution.
> 
> That checkin adds a test case. I assumed it was a test case to cover
> regressions on the original bug, which presumably was fixed elsewhere.

Silly me.  You are correct.  I remember now we speculated this was fixed by the check-in for Bug 694527, but were not sure.  WORKSFORME is indeed the correct resolution.
I just filed bug 701396 as this is not completely gone, and actually seems to have significantly regressed in the last days.
You need to log in before you can comment on or make changes to this bug.