Closed Bug 431910 Opened 17 years ago Closed 17 years ago

boxset crashing [@ gfxASurface::AddRef()] on Tp since about 5am 2 May

Categories

(Camino Graveyard :: General, defect)

PowerPC
macOS
defect
Not set
blocker

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: alqahira, Unassigned)

Details

(Keywords: crash)

Crash Data

Attachments

(1 file)

boxset stopped reporting to the graph server after 4:49 am $MOZ this morning. mento checked the logs and is finding a crash 8 secs after launch/start of Tp. Last good build was MOZ_CO_DATE=2008:05:02:04:17:00. Checkins immediately following that CO_DATE are bug 430624 and bug 406730. I can't repro the crash on 10.5.2/Intel with the latest cb-x1 tinderbuild, but that could just mean it's specific to 10.4, or PPC (or possibly just boxset). Filing in Camino for now until we get a little more info, and as a blocker because lack of T* results closes the Camino tree.
Chris says the tinderbuild doesn't crash for him on 10.5.2/PPC, either, so we're now down to 10.4 or boxset. mento's going to reboot boxset.
Rebooting didn't seem to help. Someone should try backing out one or both of bug 430624 and bug 406730 locally on boxset and seeing if that fixes things.
I rebuilt my trunk this evening and I'm seeing this in the console with every page-load (and it's new console spam):

May 3 01:39:19 Qalaat-Samaan [0x0-0x510510].org.mozilla.camino[72178]: WARNING: NS_ENSURE_TRUE(mainWidget) failed: file /Users/smokey/Camino/dev/trunk/mozilla/dom/src/base/nsGlobalWindow.cpp, line 2244

Blame for http://bonsai.mozilla.org/cvsblame.cgi?file=/mozilla/dom/src/base/nsGlobalWindow.cpp&rev=1.1013#2244 is bug 406730. I don't know how this might relate to the crash, but there's some new data for you.
There was some talk in #macdev today about that bug, actually, and apparently it hasn't "fully landed" yet, I guess. I don't think any of the theme work is going to fix that warning, though...
BTW, I don't see any of that in Console from when I ran the tinderbuild earlier on 10.5.
(In reply to comment #4)
> May 3 01:39:19 Qalaat-Samaan [0x0-0x510510].org.mozilla.camino[72178]:
> WARNING: NS_ENSURE_TRUE(mainWidget) failed: file
> /Users/smokey/Camino/dev/trunk/mozilla/dom/src/base/nsGlobalWindow.cpp, line
> 2244
>
> Blame for
> http://bonsai.mozilla.org/cvsblame.cgi?file=/mozilla/dom/src/base/nsGlobalWindow.cpp&rev=1.1013#2244
> is bug 406730. I don't know how this might relate to the crash

The very next thing that happens is that mainWidget is dereferenced, so if it's null that would certainly explain a crash.
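For illustration only, here's a minimal sketch of the difference between guarding a possibly-null pointer and dereferencing it blindly. Everything in it is an assumption made up for this example: `Widget`, `ENSURE_TRUE_SKETCH` and `GuardedWidth` are hypothetical names, and the macro is not the real NS_ENSURE_TRUE definition or the code at line 2244 of nsGlobalWindow.cpp. It only shows the general warn-and-return shape that a warning like the one quoted above suggests.

```cpp
#include <cstdio>

// Hypothetical stand-in for a widget type; not the real Gecko class.
struct Widget {
    int width;
};

// Hypothetical guard macro (an assumption for this sketch, NOT the real
// NS_ENSURE_TRUE): print a warning and return early when the condition fails.
#define ENSURE_TRUE_SKETCH(cond, ret)                                   \
    do {                                                                \
        if (!(cond)) {                                                  \
            std::fprintf(stderr,                                        \
                         "WARNING: ENSURE_TRUE_SKETCH(%s) failed\n",    \
                         #cond);                                        \
            return (ret);                                               \
        }                                                               \
    } while (0)

// Returns the widget's width, or -1 if the pointer is null.
int GuardedWidth(const Widget* mainWidget) {
    ENSURE_TRUE_SKETCH(mainWidget, -1);
    // Only reached when mainWidget is non-null, so this dereference is safe.
    return mainWidget->width;
}
```

Calling `GuardedWidth(nullptr)` prints the warning and returns -1. Any code path that skips such a guard and reads `mainWidget->width` through a null pointer has undefined behavior, which in practice usually means a crash.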
I checked in Markus's bustage-fix patch in bug 406730; it definitely gets rid of the warning for me, so let's hope it fixes the crash on boxset, too.
I had to back out the patch because it caused orange on one of the Linux unit-test boxen. In the hour in which it was in, boxset didn't report a single successful Tp run (the cycle time before the crash had been about 35 minutes), so I'm not sure the bustage fix helped. Our best course of action here is still to do a local backout of bug 406730 (and possibly bug 430624 as well) to see if we can get a green cycle, as well as checking the crashlogs from about the past hour (I checked in at 19:10 and out at 20:23 according to bonsai) to see if the crash stacks changed at all with the bustage patch in.
The Linux orange appeared spurious, so I checked that patch back in thanks to philor's charitable offer of tree-sitting. Still nothing from boxset, so something ("else") is still wrong, but at least we're one warning less :)
Before I fell asleep late last night, here's what I discovered:

1) As of that point, it was still possible to back bug 430624 out cleanly.
2) It is no longer possible to back out the patches from bug 406730 cleanly. The bustage fix comes off OK, but the other one starts failing pretty quickly before it dies because of a corrupt patch.

Given that so much has landed that depends on that fix, we're either going to have to back out a mess of patches or make boxset do MOZ_CO_DATE (and then apply that patch back to confirm breakage) if backing out bug 430624 first doesn't fix it.
I enabled Tp on cb-x1 tonight, and it didn't crash (and when it didn't, I enabled Tdhtml, which also didn't crash). Running both roughly doubles the trunk cycle time there, from 13 mins to 28 mins. On one hand, this means only boxset is crashing (and as of yet we don't know why) :( On the other, if we want to live with "perf" data from a fast machine that historically hasn't caught regressions and re-open the trunk tomorrow for checkins, we could (and face a doubled cycle time).
(In reply to comment #12)
> On one hand, this means only boxset is crashing (and as of yet we don't know
> why) :(

It's still possible that this crash is a 10.4-PPC or 10.4-shared crash. I used the in-tree CLOBBER support to issue a clobber to boxset to see if there's maybe just some crap in its objdir (unlikely, but hey); I think that was the only thing left we can do without someone who has access to the box investigating themselves. As always, I'll check in on boxset via my perf dashboard in the morning, and I'll leave cb-x1 running Tp/Tdhtml at least overnight so that we can get a number of perf runs if we decide to go that route.
CLOBBER++

I'd like to leave boxset producing perf data unmolested for another hour or so, but then we'll consider the trunk/tree open for mozilla/camino checkins again. Likewise, I'll turn off Tp/Tdhtml on cb-x1 when I get in so that we can regain the fast cycle time there.
Status: NEW → RESOLVED
Closed: 17 years ago
Resolution: --- → FIXED
Crash Signature: [@ gfxASurface::AddRef()]