Closed Bug 488596 Opened 11 years ago Closed 10 years ago

[moz2-darwin9-slave*] xpcshell-tests: test_crashreporter.js crashes intermittently, near/on shutdown

Product/Component: Toolkit :: Crash Reporting
Type: defect
Priority: Not set
Severity: major
Hardware: x86
OS: macOS
Status: RESOLVED FIXED
Target Milestone: mozilla1.9.2a1
Reporter: sdwilsh
Assignee: benjamin
Keywords: crash, intermittent-failure
Whiteboard: [See comment 32-34]
Attachments: 2 files, 1 obsolete file

TEST-UNEXPECTED-FAIL | /builds/slave/mozilla-central-macosx-unittest/build/objdir/_tests/xpcshell/crashreporter/unit/test_crashreporter.js | test failed, see following log:

http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1239824865.1239834216.30943.gz&fulltext=1#err0
OS X 10.5.2 mozilla-central unit test on 2009/04/15 12:47:45

which looks like a crash to me :/
Whiteboard: [orange]
Hrm, that's pretty odd. I guess it could be a shutdown crash. Need to get bug 483062 finished and landed.
Also, for reference the full output looks like:
TEST-UNEXPECTED-FAIL | /builds/slave/mozilla-central-macosx-unittest/build/objdir/_tests/xpcshell/crashreporter/unit/test_crashreporter.js | test failed, see following log:
  >>>>>>>
  ### XPCOM_MEM_LEAK_LOG defined -- logging leaks to /var/folders/TL/TLg3RrMbFAur2hBCXvCeqk+++TM/-Tmp-/runxpcshelltests_leaks.log
*** test pending
*** test finished
*** exiting
*** PASS ***

  <<<<<<<
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1241844110.1241849344.2394.gz
OS X 10.5.2 mozilla-central unit test on 2009/05/08 21:41:50  

TEST-UNEXPECTED-FAIL | /builds/slave/mozilla-central-macosx-unittest/build/objdir/_tests/xpcshell/crashreporter/unit/test_crashreporter.js | test failed (with xpcshell return code: -10), see following log:
(In reply to comment #1)
> Need to get bug 483062 finished and landed.

That part landed at 'May 12 04:05:30 2009 -0700'.

Was this bug seen since then, or could it be resolved as WFM?
Summary: Random Test Failure (test_crashreporter.js) → xpcshell-tests: Random Test Failure (test_crashreporter.js)
No, dbaron's report in comment 4 is the last time this has happened, apparently. It's possible my other patch fixed this accidentally, or just made it stop happening. Regardless, if it happens again we should get a stack, and can reopen this.
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → WORKSFORME
Apparently this is still happening.
Status: RESOLVED → REOPENED
Resolution: WORKSFORME → ---
Duplicate of this bug: 495801
Depends on: 495730
I pushed: http://hg.mozilla.org/mozilla-central/rev/4050d514fb37
which re-enables crash reporting at the end of that test, to try to get a stack the next time this happens.
Assignee: nobody → ted.mielczarek
(In reply to comment #11)
> I pushed: http://hg.mozilla.org/mozilla-central/rev/4050d514fb37

Not sure why you pushed this rather than review my patch in bug 495730 :-|
a) I forgot that was there.
b) Your patch is unnecessarily complicated anyway.
:-/ There's absolutely no useful info in that log still. I note that this does seem to only be occurring on OS X, so I'll see if I can reproduce.
(In reply to comment #14)

Ftr, all previous logs reported this test as "*** PASS ***" (before crashing).

> http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1244698045.1244705824.666.gz

The new detailed log:
{
test_crashreporter.js | test failed (with xpcshell return code: 1), see following log:

[...]
TEST-PASS | /builds/slave/mozilla-central-macosx-unittest/build/objdir/_tests/xpcshell/crashreporter/unit/test_crashreporter.js | [run_test : 51] false == false
}

Current code is
{
49   // check that we can disable the crashreporter
50   cr.enabled = false;
51   do_check_false(cr.enabled);
52   // ensure that double-disabling doesn't error
53   cr.enabled = false;
54   // leave it enabled at the end in case of shutdown crashes
55   cr.enabled = true;
56 }
}
(i.e. it lacks the additional checks from my bug 495730 patch, which would pinpoint where this one crashed!)

(In reply to comment #15)
> :-/ There's absolutely no useful info in that log still.

Not true. There is no crash stack
(perhaps because it lacks the correct minidumpPath from my bug 495730 patch?!),
but in this occurrence:
1) the return code is 1 instead of -10;
2) the log lacks "TEST-PASS | ... | check(s) passed".

Both are worth knowing, as they differ from previous cases.

> I note that this does
> seem to only be occurring on OS X, so I'll see if I can reproduce.

Indeed! I don't know why it was marked as All/All in the first place.
Status: REOPENED → NEW
OS: All → Mac OS X
Hardware: All → x86
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1244740786.1244748662.24470.gz&fulltext=1
OS X 10.5.2 mozilla-central unit test on 2009/06/11 10:19:46
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1244743454.1244752411.1304.gz&fulltext=1
OS X 10.5.2 mozilla-central unit test on 2009/06/11 11:04:14

Same as previous, but with rc=-10 again.
Yeah, I'll review your patch ASAP so we can hopefully get a handle on this.

I think the different return code means Breakpad fired, instead of the app just crashing without Breakpad.
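
To make the return-code distinction concrete, here is a minimal sketch. It assumes the harness launches xpcshell through Python's `subprocess` module, where a negative return code -N means the child was terminated by signal N; `describe_return_code` is a hypothetical helper, not part of the real test runner. Signal names are platform-specific (signal 10 is SIGBUS on Mac OS X), so the name lookup is best-effort.

```python
import signal


def describe_return_code(rc):
    """Classify a subprocess-style return code (hypothetical helper)."""
    if rc is None:
        # subprocess reports None while the child is still running.
        return "still running"
    if rc < 0:
        # Negative values mean "terminated by signal -rc" on POSIX.
        signum = -rc
        try:
            name = signal.Signals(signum).name
        except ValueError:
            # The number is not a known signal on this platform.
            name = "unknown signal"
        return f"killed by signal {signum} ({name})"
    if rc == 0:
        return "exited normally"
    return f"exited with status {rc}"


if __name__ == "__main__":
    # The two return codes reported in this bug.
    for code in (-10, 1):
        print(code, "->", describe_return_code(code))
```

Under that assumption, -10 would be a process killed by a signal, while 1 would be an ordinary non-zero exit, which would fit two different failure paths.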
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1244900001.1244905713.27144.gz
OS X 10.5.2 mozilla-central unit test on 2009/06/13 06:33:21
Keywords: crash
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1245435790.1245443923.9465.gz
OS X 10.5.2 mozilla-central unit test on 2009/06/19 11:23:10
Summary: xpcshell-tests: Random Test Failure (test_crashreporter.js) → [MacOSX] xpcshell-tests: test_crashreporter.js crashes intermittently, near/on shutdown
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1245599350.1245605386.22746.gz
OS X 10.5.2 mozilla-central unit test on 2009/06/21 08:49:10
TEST-UNEXPECTED-FAIL | /builds/slave/mozilla-central-macosx-unittest/build/objdir/_tests/xpcshell/crashreporter/unit/test_crashreporter.js | test failed (with xpcshell return code: -10), see following log:

http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1245661907.1245667157.32432.gz
Depends on: 474688
First build with bug 495730 test patch = first crash :->
(though not the expected one :-|)

{
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox-Unittest/1245709371.1245711161.3017.gz&fulltext=1
OS X 10.5.2 mozilla-central test everythingelse on 2009/06/22 15:22:51

TEST-UNEXPECTED-FAIL | ...| test failed (with xpcshell return code: -10)
...
INFO | test_crashreporter.js | Disable crashreporter.
TEST-PASS | ...| [run_test : 20] false == false
}
So it crashed in the next check block, "Getting serverURL when disabled",
which obviously can't give a stack :-/
(In reply to comment #28)
> http://tinderbox.mozilla.org/showlog.cgi?log=Firefox-Unittest/1245709371.1245711161.3017.gz&fulltext=1
> OS X 10.5.2 mozilla-central test everythingelse on 2009/06/22 15:22:51
> 
> So it crashed in the next check block, "Getting serverURL when disabled",

And it passed on
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1245707688.1245714105.8420.gz
OS X 10.5.2 mozilla-central unit test on 2009/06/22 14:54:48
:->

*****

(In reply to comment #29)
> I don't see how that getter could crash, but maybe I'm missing something:

I have no idea what exactly is crashing, or how, either.
For now, we'll just have to see if it happens again or not...
(In reply to comment #30)
> For now, we'll just have to see if it happens again or not...

2nd 'E' build = second (same) crash:
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox-Unittest/1245715374.1245717212.14149.gz&fulltext=1
OS X 10.5.2 mozilla-central test everythingelse on 2009/06/22 17:02:54

(Still too soon to say whether this is a packaging-related issue...)
{
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox-Unittest/1245729925.1245731723.4184.gz&fulltext=1
OS X 10.5.2 mozilla-central test everythingelse on 2009/06/22 21:05:25

TEST-PASS | test_crashreporter.js | [run_test : 27] 3253927937 == 3253927937
}
== crashed 1 test later: "Getting minidumpPath when disabled".

"Fix" timeframe:
http://hg.mozilla.org/mozilla-central/pushloghtml?startdate=2009-06-22+16%3A01%3A43&enddate=2009-06-22+20%3A50%3A54
with nothing obvious.

***

{
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox-Unittest/1245731854.1245733651.6999.gz&fulltext=1
OS X 10.5.2 mozilla-central test everythingelse on 2009/06/22 21:37:34
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox-Unittest/1245740055.1245741778.19981.gz&fulltext=1
OS X 10.5.2 mozilla-central test everythingelse on 2009/06/22 23:54:15

TEST-PASS | test_crashreporter.js | [run_test : 43] 3253927937 == 3253927937
}
== crashed yet 2 tests later: "Calling appendAppNotesToCrashReport() when disabled".

"Fix" timeframe:
http://hg.mozilla.org/mozilla-central/pushloghtml?startdate=2009-06-22+20%3A50%3A52&enddate=2009-06-22+21%3A28%3A20
with a WinCE checkin only.

***

Interestingly, only the 'E' box has this intermittent crash, not the 'U' one.
As if there were something wrong in the "packaging"...

Oh! Let's see which VM the 'E' box ran on:
(2009/06/22 14:59:04)
moz2-darwin9-slave02 "Getting serverURL when disabled"
bm-xserve16
moz2-darwin9-slave07 "Getting serverURL when disabled"
bm-xserve17
bm-xserve18
moz2-darwin9-slave05 "Getting minidumpPath when disabled"
moz2-darwin9-slave02 "Calling appendAppNotesToCrashReport() when disabled"
moz2-darwin9-slave02 "Calling appendAppNotesToCrashReport() when disabled"
bm-xserve17
bm-xserve17
bm-xserve19
bm-xserve17
bm-xserve22
bm-xserve22
(2009/06/23 05:51:08)
It looks like a VM issue on moz2-darwin9-slave__ VMs compared to bm-xserve__ ones!

(But don't ask me what is different for moz2-darwin9-slave__ VMs running 'E' tests :-()
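
The grouping above can be tallied mechanically. This is a minimal sketch, assuming that the entries listed without a quoted check name are runs that did not crash (the list does not say so explicitly); `RUNS`, `pool_of`, and `tally` are hypothetical names used only for illustration.

```python
import re
from collections import defaultdict

# (host, crashed) pairs in the order listed above.
# Assumption: a host listed without a quoted check name did not crash.
RUNS = [
    ("moz2-darwin9-slave02", True),
    ("bm-xserve16", False),
    ("moz2-darwin9-slave07", True),
    ("bm-xserve17", False),
    ("bm-xserve18", False),
    ("moz2-darwin9-slave05", True),
    ("moz2-darwin9-slave02", True),
    ("moz2-darwin9-slave02", True),
    ("bm-xserve17", False),
    ("bm-xserve17", False),
    ("bm-xserve19", False),
    ("bm-xserve17", False),
    ("bm-xserve22", False),
    ("bm-xserve22", False),
]


def pool_of(host):
    """Strip the trailing machine number to get the host pool name."""
    return re.sub(r"\d+$", "", host)


def tally(runs):
    """Return {pool: (total_runs, crashed_runs)}."""
    counts = defaultdict(lambda: [0, 0])
    for host, crashed in runs:
        entry = counts[pool_of(host)]
        entry[0] += 1
        entry[1] += int(crashed)
    return {pool: tuple(values) for pool, values in counts.items()}


if __name__ == "__main__":
    for pool, (total, crashed) in tally(RUNS).items():
        print(f"{pool}: {crashed} crashed out of {total} runs")
```

On the 14 runs listed, this gives 5 crashes out of 5 runs for moz2-darwin9-slave and 0 out of 9 for bm-xserve.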
Severity: normal → major
Flags: wanted1.9.2?
Whiteboard: [orange] → [See comment 32] [orange]
No longer blocks: 383136
(In reply to comment #32)
> Interestingly, only the 'E' box has this intermittent crash, not the 'U' one.

This test has not crashed (at shutdown) on the 'U' box since comment 28 landing.
Do we care to try and understand why? (Maybe not, given the following:)

> It looks like a VM issue on moz2-darwin9-slave__ VMs compared to bm-xserve__
> ones!

I checked all previously reported builds in this bug:
the shutdown crash always happened on moz2-darwin9-slave__ VMs too!
Whiteboard: [See comment 32] [orange] → [See comment 32+33] [orange]
Note that none of these are VMs, they're all physical mac machines. The bm-xserveNN machines are Xserves, whereas the moz2-darwin9-slaveNN machines are Mac Minis.
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox-Unittest/1247531225.1247533066.29139.gz

Note that the effects of bug 503976 appear in this log as well, although as far as I know they're unrelated.
Duplicate of this bug: 503988
Duplicate of this bug: 503976
This memory leak was identified originally in bug 503988.  It turns out that there is a very similar leak in all the logs attached to this bug, so I'm concluding that it is simply another symptom of this bug.
(In reply to comment #52)
> Created an attachment (id=388392) [details]
> Large memory leak originally identified in bug 503988.
> 
> This memory leak was identified originally in bug 503988.  It turns out that
> there is a very similar leak in all the logs attached to this bug, so I'm
> concluding that it is simply another symptom of this bug.

s/503988/503976

Sorry for the confusion.
Comment on attachment 388392 [details]
Large memory leak originally identified in bug 503988.


Long leak logs like this one are better as an attachment than as a comment :-)

But this leak (and that bug) is unrelated to the current _crash_ bug.

Fwiw, I noticed 3 "nsHtml5*" in this log...
Attachment #388392 - Attachment is obsolete: true
(In reply to comment #49)
> http://tinderbox.mozilla.org/showlog.cgi?log=Firefox-Unittest/1247531225.1247533066.29139.gz
> 
> Note that the effects of bug 503976 appear in this log as well, although as far
> as I know they're unrelated.

Plain wrong comment: see bug 503976 comment 7.
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox-Unittest/1248091605.1248093507.5336.gz
OS X 10.5.2 mozilla-central test everythingelse on 2009/07/20 05:06:45  

TEST-UNEXPECTED-FAIL | /builds/slave/mozilla-central-macosx-unittest-everythingelse/build/xpcshell/tests/crashreporter/unit/test_crashreporter.js | test failed (with xpcshell return code: -10), see following log:
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox-Unittest/1248076020.1248077906.27649.gz
OS X 10.5.2 mozilla-central test everythingelse on 2009/07/20 00:47:00  

http://tinderbox.mozilla.org/showlog.cgi?log=Firefox-Unittest/1248083741.1248085627.12653.gz
OS X 10.5.2 mozilla-central test everythingelse on 2009/07/20 02:55:41  

http://tinderbox.mozilla.org/showlog.cgi?log=Firefox-Unittest/1248065520.1248067358.12149.gz
OS X 10.5.2 mozilla-central test everythingelse on 2009/07/19 21:52:00
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox-Unittest/1248256961.1248259167.30724.gz
OS X 10.5.2 mozilla-central test everythingelse on 2009/07/22 03:02:41  
TEST-UNEXPECTED-FAIL | /builds/slave/mozilla-central-macosx-unittest-everythingelse/build/xpcshell/tests/crashreporter/unit/test_crashreporter.js | test failed (with xpcshell return code: -10), see following log:
http://tinderbox.mozilla.org/showlog.cgi?log=MozillaTry/1248231285.1248239919.17902.gz
OS X 10.5.2 try hg unit test on 2009/07/21 19:54:45
Whiteboard: [See comment 32+33] [orange] → [See comment 32-34] [orange]
woot for sayrer
Assignee: ted.mielczarek → benjamin
Attachment #390229 - Flags: review?(ted.mielczarek)
Comment on attachment 390229 [details] [diff] [review]
Fix race unsetting the exception handler, rev. 1

   if (!gExceptionHandler)
     return NS_ERROR_NOT_INITIALIZED;
 
-  delete gExceptionHandler;
   gExceptionHandler = nsnull;

Should we be moving this whole block up there? Maybe it doesn't matter. Thanks for fixing this, and thanks to sayrer for finding the root cause!
Attachment #390229 - Flags: review?(ted.mielczarek) → review+
http://hg.mozilla.org/mozilla-central/rev/16267f092342
Status: NEW → RESOLVED
Closed: 11 years ago → 11 years ago
Resolution: --- → FIXED
Flags: wanted1.9.2? → in-testsuite+
Summary: [MacOSX] xpcshell-tests: test_crashreporter.js crashes intermittently, near/on shutdown → [moz2-darwin9-slave*] xpcshell-tests: test_crashreporter.js crashes intermittently, near/on shutdown
Target Milestone: --- → mozilla1.9.2a1
I think we hit this again today, on moz2-darwin9-slave05
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1250178113.1250189451.4140.gz
OS X 10.5.2 mozilla-central unit test on 2009/08/13 08:41:53

Full output looks like this:

TEST-UNEXPECTED-FAIL | /builds/moz2_slave/mozilla-central-macosx-unittest/build/objdir/_tests/xpcshell/crashreporter/unit/test_crashreporter.js | test failed (with xpcshell return code: -10), see following log:
  >>>>>>>
  ### XPCOM_MEM_LEAK_LOG defined -- logging leaks to /var/folders/TL/TLg3RrMbFAur2hBCXvCeqk+++TM/-Tmp-/runxpcshelltests_leaks.log
TEST-INFO | (xpcshell/head.js) | test 1 pending
INFO | test_crashreporter.js | Get crashreporter service.
TEST-PASS | /builds/moz2_slave/mozilla-central-macosx-unittest/build/objdir/_tests/xpcshell/crashreporter/unit/test_crashreporter.js | [run_test : 11] [xpconnect wrapped nsICrashReporter] != null
INFO | test_crashreporter.js | Disable crashreporter.
TEST-PASS | /builds/moz2_slave/mozilla-central-macosx-unittest/build/objdir/_tests/xpcshell/crashreporter/unit/test_crashreporter.js | [run_test : 20] false == false
TEST-PASS | /builds/moz2_slave/mozilla-central-macosx-unittest/build/objdir/_tests/xpcshell/crashreporter/unit/test_crashreporter.js | [run_test : 27] 3253927937 == 3253927937
TEST-PASS | /builds/moz2_slave/mozilla-central-macosx-unittest/build/objdir/_tests/xpcshell/crashreporter/unit/test_crashreporter.js | [run_test : 35] 3253927937 == 3253927937

  <<<<<<<
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
You know, I didn't look closely at your log before, but this looks like a different bug. It appears to be crashing somewhere in the middle of the test, as opposed to on shutdown, which was what this bug was about. I'd like to spin that off to a new bug, so as not to confuse the issue here further, since bsmedberg fixed the particular issue from this bug.
Status: REOPENED → RESOLVED
Closed: 11 years ago → 10 years ago
Resolution: --- → FIXED
I filed bug 514397 on this new crash.
Whiteboard: [See comment 32-34] [orange] → [See comment 32-34]