Windows 8 on Aurora spontaneously started permafailing in bc3 and dt and jp mochitests with too much recursion

RESOLVED FIXED in Firefox 40

Status


defect
--
blocker
RESOLVED FIXED
4 years ago
4 years ago

People

(Reporter: KWierso, Assigned: jandem)

Tracking

41 Branch
mozilla42

Firefox Tracking Flags

(firefox40 fixed, firefox41+ fixed, firefox42 fixed)

Details

bc3: https://treeherder.mozilla.org/logviewer.html#?job_id=936760&repo=mozilla-aurora
dt: https://treeherder.mozilla.org/logviewer.html#?job_id=936761&repo=mozilla-aurora
jp: https://treeherder.mozilla.org/logviewer.html#?job_id=936868&repo=mozilla-aurora

These tests started permafailing this morning, even though the only thing that has changed since Saturday is bug 1160441, which doesn't seem related.

The common failure between all three suites is "too much recursion".

I've triggered a new Windows 8 build on the previous push to see if it fails, too.

I guess if nothing else we can try backing out bug 1160441, but I don't see how it could have caused this.


I'm closing Aurora until this can get looked at.
Naveed, do you have someone who can look into this soonish? :)
Severity: normal → blocker
Flags: needinfo?(nihsanullah)
The fact that these are all recursion errors, and all on PGO, suggests that PGO builds keep getting larger and larger stack frames that are messing with the native stack limit. My hunch is that since JIT code and C++ share the same stack, the PGO frames are affecting the effective JS stack limit.
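
To illustrate the hunch above, here is a rough back-of-the-envelope sketch. It assumes a fixed stack budget and a constant per-call frame size. The function name and all numbers are hypothetical and are not measured from any Firefox build.

```python
def max_recursion_depth(stack_budget_bytes: int, frame_bytes: int) -> int:
    """Return how many nested calls fit in a fixed stack budget.

    Simplified model: every call consumes the same number of bytes.
    """
    if frame_bytes <= 0:
        raise ValueError("frame_bytes must be positive")
    return stack_budget_bytes // frame_bytes


# Hypothetical stack budget shared by JIT code and C++ frames.
STACK_BUDGET = 1024 * 1024  # 1 MiB, an assumed value

# Hypothetical per-call frame sizes for two builds of the same code.
depth_small_frames = max_recursion_depth(STACK_BUDGET, 256)
depth_large_frames = max_recursion_depth(STACK_BUDGET, 512)

print(depth_small_frames, depth_large_frames)
```

In this model, doubling the frame size halves the reachable depth, so a test that recursed safely on one build could hit "too much recursion" on another without any change to the test itself.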

I'm at a conference and PTO for the next 2 weeks. Jan, do you think you could take a look?
See comment 2.
Flags: needinfo?(jdemooij)
See also bug 1167883 on Aurora Win64 PGO stack overflow issues.
Depends on: 1181040
I just started a PGO build locally to work around bug 1181040, hopefully it will finish. Fingers crossed I'll be able to reproduce one of these issues; apparently they are intermittent.
(In reply to Jan de Mooij [:jandem] from comment #22)
> I just started a PGO build locally to work around bug 1181040, hopefully it
> will finish. Fingers crossed I'll be able to reproduce one of these issues;
> apparently they are intermittent.

With this build I can't repro the bc3 or dt failures, but I can reproduce the Google Calendar issue in bug 1167883 comment 6. I'll focus on that one for now and hope these issues are related.
Setting Tracking for 41 Aurora and affected.
Depends on: 1181558
I posted a patch in bug 1167883, hopefully it will fix this too.
Flags: needinfo?(nihsanullah)
(In reply to Jan de Mooij [:jandem] from comment #25)
> I posted a patch in bug 1167883, hopefully it will fix this too.

Clearing needinfo, let's see if this comes back after that lands.
Flags: needinfo?(jdemooij)
Bug 1167883 landed on Aurora so we can probably reopen the tree now.
Except e10s tests were turned on in the meantime and there's a Linux32 permafail :(
Ryan, I am trying to follow up on the FF41+ tracked bugs. Is this test failure still showing up on Beta41? If it is e10s tests only, I would like to move this bug to the FF42 tracking list. Please let me know.
Flags: needinfo?(ryanvm)
Yes, Linux32 still has a permafail being tracked in bug 1193096. This bug is safe to close, though. It was fixed by bug 1167883.
Assignee: nobody → jdemooij
Status: NEW → RESOLVED
Closed: 4 years ago
Flags: needinfo?(ryanvm)
Resolution: --- → FIXED
Target Milestone: --- → mozilla42