Closed Bug 522729 Opened 10 years ago Closed 10 years ago

mochitest-browser-chrome: intermittent "browser_aboutCrashesResubmit.js | Timed out"

Categories

(Toolkit :: Crash Reporting, defect)

x86
All
defect
Not set

Tracking

()

RESOLVED FIXED
mozilla1.9.3a1
Tracking Status
status1.9.2 --- beta5-fixed
status1.9.1 --- .8-fixed

People

(Reporter: sgautherie, Assigned: ted)

References

()

Details

(Keywords: intermittent-failure)

Attachments

(1 file)

{
http://tinderbox.mozilla.org/showlog.cgi?log=SeaMonkey2.0/1255705925.1255711260.31970.gz
Linux comm-1.9.1 unit test on 2009/10/16 08:12:05

Running chrome://mochikit/content/browser/toolkit/crashreporter/test/browser/browser_aboutCrashesResubmit.js...
TEST-PASS | chrome://mochikit/content/browser/toolkit/crashreporter/test/browser/browser_aboutCrashes.js | about:crashes loaded
TEST-INFO | chrome://mochikit/content/browser/toolkit/crashreporter/test/browser/browser_aboutCrashesResubmit.js | Console message: [JavaScript Error: "doc.getElementById("tbody") is null" {file: "chrome://mochikit/content/browser/toolkit/crashreporter/test/browser/browser_aboutCrashes.js" line: 8}]
Document about:crashes loaded successfully
TEST-PASS | chrome://mochikit/content/browser/toolkit/crashreporter/test/browser/browser_aboutCrashesResubmit.js | about:crashes lists correct number of crash reports
TEST-PASS | chrome://mochikit/content/browser/toolkit/crashreporter/test/browser/browser_aboutCrashesResubmit.js | 0: crash ID is correct
TEST-PASS | chrome://mochikit/content/browser/toolkit/crashreporter/test/browser/browser_aboutCrashesResubmit.js | 1: crash ID is correct
TEST-PASS | chrome://mochikit/content/browser/toolkit/crashreporter/test/browser/browser_aboutCrashesResubmit.js | pending URL links to the correct static page
TEST-UNEXPECTED-FAIL | chrome://mochikit/content/browser/toolkit/crashreporter/test/browser/browser_aboutCrashesResubmit.js | Timed out
}

First time I notice this.
The issue seems to be that browser_aboutCrashes.js leaked to browser_aboutCrashesResubmit.js...

Fwiw,
http://mxr.mozilla.org/mozilla1.9.1/source/toolkit/crashreporter/test/browser/browser_aboutCrashesResubmit.js
Blocks: 438871
Whiteboard: [orange]
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox-Unittest/1256484292.1256487074.15853.gz
Linux mozilla-central test everythingelse on 2009/10/25 08:24:52
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox-Unittest/1256610180.1256612565.24420.gz
OS X 10.5.2 mozilla-central test everythingelse on 2009/10/26 19:23:00
Not sure what if anything it means, but http://tinderbox.mozilla.org/showlog.cgi?log=Firefox-Unittest/1256684328.1256687368.3684.gz is the first time I've seen it *not* leaking to the next test.
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox-Unittest/1256856351.1256858818.26094.gz
OS X 10.5.2 mozilla-central test everythingelse on 2009/10/29 15:45:51
(Looks like this one did leak the next test -- "leaked 1214191 bytes during test execution")
OS: Linux → All
Version: 1.9.1 Branch → Trunk
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox-Unittest/1256937293.1256940385.8434.gz
OS X 10.5.2 mozilla-central test everythingelse on 2009/10/30 14:14:53
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox-Unittest/1256958625.1256960827.8704.gz
OS X 10.5.2 mozilla-central test everythingelse on 2009/10/30 20:10:25
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox-Unittest/1256952654.1256955388.14863.gz
Linux mozilla-central test everythingelse on 2009/10/30 18:30:54
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox-Unittest/1256980319.1256982879.19967.gz
Linux mozilla-central test opt everythingelse on 2009/10/31 02:11:59
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox-Unittest/1257137387.1257139692.28785.gz#err2
OS X 10.5.2 mozilla-central test everythingelse on 2009/11/01 20:49:47
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox-Unittest/1257163928.1257167174.9002.gz
Linux mozilla-central test opt everythingelse on 2009/11/02 04:12:08
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox-Unittest/1257154734.1257157741.30525.gz#err1
Linux mozilla-central test opt everythingelse on 2009/11/02 01:38:54
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox-Unittest/1257190211.1257193681.30333.gz
Linux mozilla-central test opt everythingelse on 2009/11/02 11:30:11  Linux mozilla-central test opt everythingelse on 2009/11/02 11:30:11
Note that the log from comment 17 did not involve a leak.

This one did, however:
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox-Unittest/1257293097.1257295897.21122.gz
OS X 10.5.2 mozilla-central test everythingelse on 2009/11/03 16:04:57
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox-Unittest/1257287386.1257291227.611.gz#err1
Linux mozilla-central test opt everythingelse on 2009/11/03 14:29:46
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox-Unittest/1257359346.1257361818.25566.gz
OS X 10.5.2 mozilla-central test everythingelse on 2009/11/04 10:29:06
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox-Unittest/1257389238.1257392521.10691.gz
Linux mozilla-central test opt everythingelse on 2009/11/04 18:47:18
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox-Unittest/1257649073.1257651823.12409.gz
Linux mozilla-central opt test everythingelse on 2009/11/07 18:57:53
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox-Unittest/1257680684.1257686323.13142.gz
Linux mozilla-central debug test everythingelse on 2009/11/08 03:44:44
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox-Unittest/1257707836.1257710150.29420.gz
OS X 10.5.2 mozilla-central test everythingelse on 2009/11/08 11:17:16
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox-Unittest/1257718245.1257720487.12115.gz
OS X 10.5.2 mozilla-central test everythingelse on 2009/11/08 14:10:45
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox-Unittest/1257718677.1257721005.17484.gz
OS X 10.5.2 mozilla-central test everythingelse on 2009/11/08 14:17:57
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox-Unittest/1257817018.1257819472.21109.gz
OS X 10.5.2 mozilla-central test everythingelse on 2009/11/09 17:36:58
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox-Unittest/1257847425.1257849781.10289.gz
OS X 10.5.2 mozilla-central test everythingelse on 2009/11/10 02:03:45
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox-Unittest/1257855541.1257857825.7211.gz
OS X 10.5.2 mozilla-central test everythingelse on 2009/11/10 04:19:01
None of the logs I've looked at show browser_aboutCrashes.js failing, so I'm taking that out of the summary. All the failures look like this:
Running chrome://mochikit/content/browser/toolkit/crashreporter/test/browser/browser_aboutCrashesResubmit.js...
TEST-PASS | chrome://mochikit/content/browser/toolkit/crashreporter/test/browser/browser_aboutCrashesResubmit.js | about:crashes lists correct number of crash reports
TEST-PASS | chrome://mochikit/content/browser/toolkit/crashreporter/test/browser/browser_aboutCrashesResubmit.js | 0: crash ID is correct
TEST-PASS | chrome://mochikit/content/browser/toolkit/crashreporter/test/browser/browser_aboutCrashesResubmit.js | 1: crash ID is correct
TEST-PASS | chrome://mochikit/content/browser/toolkit/crashreporter/test/browser/browser_aboutCrashesResubmit.js | pending URL links to the correct static page
TEST-UNEXPECTED-FAIL | chrome://mochikit/content/browser/toolkit/crashreporter/test/browser/browser_aboutCrashesResubmit.js | Timed out

This means that something in check_submit_pending is failing:
http://mxr.mozilla.org/mozilla-central/source/toolkit/crashreporter/test/browser/browser_aboutCrashesResubmit.js#50

check_submit_pending tries to send a mouse click to the pending crash link in about:crashes, then waits for the onload of the result to check that it submitted correctly. However, the onload apparently isn't firing, so it times out. I'm going to add a few debug dump statements to the test so we can at least get some more info here.
Summary: mochitest-browser-chrome: intermittent "browser_aboutCrashesResubmit.js | Timed out", related to browser_aboutCrashes.js (failing and) leaking to next test → mochitest-browser-chrome: intermittent "browser_aboutCrashesResubmit.js | Timed out"
So, this failure has my debugging output:

Running chrome://mochikit/content/browser/toolkit/crashreporter/test/browser/browser_aboutCrashesResubmit.js...
TEST-PASS | chrome://mochikit/content/browser/toolkit/crashreporter/test/browser/browser_aboutCrashesResubmit.js | about:crashes lists correct number of crash reports
TEST-PASS | chrome://mochikit/content/browser/toolkit/crashreporter/test/browser/browser_aboutCrashesResubmit.js | 0: crash ID is correct
TEST-PASS | chrome://mochikit/content/browser/toolkit/crashreporter/test/browser/browser_aboutCrashesResubmit.js | 1: crash ID is correct
TEST-PASS | chrome://mochikit/content/browser/toolkit/crashreporter/test/browser/browser_aboutCrashesResubmit.js | pending URL links to the correct static page
check_submit_pending: trying to submit crash 43893d06-7059-604c-b75d-8a8e9d1ba8b
check_submit_pending: sent mouse event to 43893d06-7059-604c-b75d-8a8e9d1ba8b
csp_onload, location = about:crashes
csp_onload, location = about:crashes
TEST-UNEXPECTED-FAIL | chrome://mochikit/content/browser/toolkit/crashreporter/test/browser/browser_aboutCrashesResubmit.js | Timed out

I'm not sure exactly what happened there. We got two onloads for about:crashes, then nothing. I guess it's possible the submit failed somehow. I think I'll add some debug dumps to the actual crashes.js code next.
Assignee: nobody → ted.mielczarek
(In reply to comment #4)
> http://tinderbox.mozilla.org/showlog.cgi?log=Firefox-Unittest/1256684328.1256687368.3684.gz
> is the first time I've seen it *not* leaking to the next test.

Not sure what you meant:
browser_aboutCrashes.js leaked to browser_aboutCrashesResubmit.js with comment 0 build only.


(In reply to comment #5)
> OS X 10.5.2 mozilla-central test everythingelse on 2009/10/29 15:45:51
> (Looks like this one did leak the next test -- "leaked 1214191 bytes during
> test execution")

That memory leak is intermittent, I assume it is not related to this bug...


(In reply to comment #29)
> http://tinderbox.mozilla.org/showlog.cgi?log=Firefox-Unittest/1257680684.1257686323.13142.gz
> Linux mozilla-central debug test everythingelse on 2009/11/08 03:44:44

This is the only case reported with a debug build: I don't know if that one can help.


(In reply to comment #37)
> None of the logs I've looked at show browser_aboutCrashes.js failing, so I'm
> taking that out of the summary.

(Indeed. And it leaked to the next test with comment 0 build only.)
Status: NEW → ASSIGNED
I was able to reproduce this locally (with extra logging code), something interesting is happening for sure:
check_submit_pending: trying to submit crash f7379507-7fc3-a54d-b34f-4d516598f4a
*** about:crashes *** submitPendingReport called
check_submit_pending: sent mouse event to f7379507-7fc3-a54d-b34f-4d516598f4a
csp_onload, location = about:crashes
*** about:crashes *** createAndSubmitForm: iframe loaded
TEST-INFO | chrome://mochikit/content/browser/toolkit/crashreporter/test/browser/browser_aboutCrashesResubmit.js | Console message: [JavaScript Error: "this._storage is null" {file: "file:///Users/luser/build/obj-firefox/dist/Minefield.app/Contents/MacOS/components/nsLoginManager.js" line: 547}]
TEST-INFO | chrome://mochikit/content/browser/toolkit/crashreporter/test/browser/browser_aboutCrashesResubmit.js | Console message: [JavaScript Error: "uncaught exception: 2147500037"]
TEST-INFO | chrome://mochikit/content/browser/toolkit/crashreporter/test/browser/browser_aboutCrashesResubmit.js | Console message: [JavaScript Error: "uncaught exception: 2147500037"]
*** error running SJS at /Users/luser/build/obj-firefox/_tests/testing/mochitest/./browser/toolkit/crashreporter/test/browser/crashreport.sjs: TypeError: invalid 'in' operand id on line 148
*** about:crashes *** submitForm: request returned HTTP status 500
TEST-UNEXPECTED-FAIL | chrome://mochikit/content/browser/toolkit/crashreporter/test/browser/browser_aboutCrashesResubmit.js | Timed out

More digging needed, but at least I can repro locally.
I'm not sure that this is the exact same failure mode, of course, since we don't see these JS console error messages in the tinderbox logs.
jorendorff determined that that error message from the SJS is faulty (and filed a bug on it), but the real error there is that formData is undefined:
http://mxr.mozilla.org/mozilla-central/source/toolkit/crashreporter/test/browser/crashreport.sjs#146

I'm not sure what's happening with those console messages, they're spewing NS_ERROR_FAILURE, but with no source info.
So I'm not sure that the failure mode I'm seeing locally is the same thing at all. Talking to bz, he notes that since nsIWebProgressListeners are held with a weak reference, they can be GCed, which would fit the failure mode we're seeing on tinderbox. This little patch makes it held in a global var instead.
Attachment #412040 - Flags: review?(dtownsend)
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox-Unittest/1258096293.1258099064.25538.gz
OS X 10.5.2 mozilla-central test everythingelse on 2009/11/12 23:11:33
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox-Unittest/1258130323.1258133650.8828.gz
OS X 10.5.2 mozilla-central test everythingelse on 2009/11/13 08:38:43
s: moz2-darwin9-slave13
Attachment #412040 - Flags: review?(dtownsend) → review+
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox-Unittest/1258285350.1258288873.5966.gz
Linux mozilla-central opt test everythingelse on 2009/11/15 03:42:30
s: moz2-linux-slave05
Pushed to m-c:
http://hg.mozilla.org/mozilla-central/rev/59c16df179cd

Please reopen if this reoccurs.
Status: ASSIGNED → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Comment on attachment 412040 [details] [diff] [review]
hold the webprogresslistener in a global var

Requesting branch approvals. This fixes an actual bug in the about:crashes resubmit feature, that manifests as a test suite failure. It probably happens to actual users as well, manifesting as the feature just failing to work properly. We have test coverage on this, it just took a while to track down why the test was intermittently failing (because of this bug).
Attachment #412040 - Flags: approval1.9.2?
Attachment #412040 - Flags: approval1.9.1.7?
Comment on attachment 412040 [details] [diff] [review]
hold the webprogresslistener in a global var

a192 and 1917=beltzner, the latter should land after we're clear on 1916
Attachment #412040 - Flags: approval1.9.2?
Attachment #412040 - Flags: approval1.9.2+
Attachment #412040 - Flags: approval1.9.1.7?
Attachment #412040 - Flags: approval1.9.1.7+
The tree is open and ready for 1.9.1.7 checkins. Feel free to land anytime.
Target Milestone: --- → mozilla1.9.3a1
Blocks: 533902
(In reply to comment #40)
> (In reply to comment #4)
> > http://tinderbox.mozilla.org/showlog.cgi?log=Firefox-Unittest/1256684328.1256687368.3684.gz
> > is the first time I've seen it *not* leaking to the next test.
> 
> Not sure what you meant:
> browser_aboutCrashes.js leaked to browser_aboutCrashesResubmit.js with comment
> 0 build only.

I filed bug 533902 about the report leak, as it can be reproduced (again).

> (In reply to comment #37)
> > None of the logs I've looked at show browser_aboutCrashes.js failing, so I'm
> > taking that out of the summary.
> 
> (Indeed. And it leaked to the next test with comment 0 build only.)

Fwiw, I can get some failure (now): see bug 533902 too.
Changed my mind about those two, and filed bug 577409 instead.
Whiteboard: [orange]
You need to log in before you can comment on or make changes to this bug.