Closed Bug 766264 Opened 9 years ago Closed 9 years ago

Permanent orange: TEST-UNEXPECTED-FAIL | test_0200_app_launch_apply_update.js | Exceeded MAX_TIMEOUT_RUNS whilst... also in test_0201_app_launch_apply_update.js and test_0203_app_launch_apply_update.js

Categories

(Toolkit :: Application Update, defect)

defect
Not set
critical

Tracking

()

RESOLVED FIXED
mozilla16
Tracking Status
firefox15 --- verified

People

(Reporter: standard8, Assigned: standard8)

References

Details

(Keywords: intermittent-failure)

Attachments

(1 file)

Since background updates (bug 307181) landed, we've been seeing permanent timeouts on the Thunderbird builders. Bug 757794 has converted these into failures. This bug is to track fixing those failures.

The failures only occur on Windows, here's some examples:

https://tbpl.mozilla.org/php/getParsedLog.php?id=12798852&tree=Thunderbird-Trunk#error3

TEST-UNEXPECTED-FAIL | c:\talos-slave\test\build\xpcshell\tests\toolkit\mozapps\update\test\unit\test_0200_app_launch_apply_update.js | test failed (with xpcshell return code: 0), see following log:
>>>>>>>
### XPCOM_MEM_LEAK_LOG defined -- logging leaks to c:\docume~1\cltbld\locals~1\temp\tmpzsbozw\runxpcshelltests_leaks.log

TEST-INFO | (xpcshell/head.js) | test 1 pending

TEST-INFO | (xpcshell/head.js) | test 2 pending
TEST-INFO | c:/talos-slave/test/build/xpcshell/tests/toolkit/mozapps/update/test/unit/test_0200_app_launch_apply_update.js | [run_test : 78] update test directory path: c:\talos-slave\test\build\thunderbird\update_test
TEST-INFO | c:/talos-slave/test/build/xpcshell/tests/toolkit/mozapps/update/test/unit/test_0200_app_launch_apply_update.js | [run_test : 117] launching C:\WINDOWS\System32\cmd.exe /D /Q /C c:\talos-slave\test\build\thunderbird\0200_aus_test_app.exe -no-remote -process-updates 1> c:\talos-slave\test\build\xpcshell\tests\toolkit\mozapps\update\test\unit\0200_app_console_log 2>&1
TEST-INFO | c:/talos-slave/test/build/xpcshell/tests/toolkit/mozapps/update/test/unit/head_update.js | [setEnvironment : 2788] setting the XRE_NO_WINDOWS_CRASH_DIALOG environment variable to 1... previously it didn't exist
TEST-INFO | c:/talos-slave/test/build/xpcshell/tests/toolkit/mozapps/update/test/unit/head_update.js | [setEnvironment : 2835] removing the XPCOM_MEM_LEAK_LOG environment variable... previous value c:\docume~1\cltbld\locals~1\temp\tmpzsbozw\runxpcshelltests_leaks.log
TEST-INFO | c:/talos-slave/test/build/xpcshell/tests/toolkit/mozapps/update/test/unit/head_update.js | [setEnvironment : 2842] setting the XPCOM_DEBUG_BREAK environment variable to warn... previous value stack-and-abort
TEST-INFO | c:/talos-slave/test/build/xpcshell/tests/toolkit/mozapps/update/test/unit/head_update.js | [setEnvironment : 2868] setting MOZ_NO_SERVICE_FALLBACK environment variable to 1
TEST-INFO | c:/talos-slave/test/build/xpcshell/tests/toolkit/mozapps/update/test/unit/head_update.js | [resetEnvironment : 2888] setting the XPCOM_MEM_LEAK_LOG environment variable back to c:\docume~1\cltbld\locals~1\temp\tmpzsbozw\runxpcshelltests_leaks.log
TEST-INFO | c:/talos-slave/test/build/xpcshell/tests/toolkit/mozapps/update/test/unit/head_update.js | [resetEnvironment : 2894] setting the XPCOM_DEBUG_BREAK environment variable back to stack-and-abort
TEST-INFO | c:/talos-slave/test/build/xpcshell/tests/toolkit/mozapps/update/test/unit/head_update.js | [resetEnvironment : 2928] removing the XRE_NO_WINDOWS_CRASH_DIALOG environment variable
TEST-INFO | c:/talos-slave/test/build/xpcshell/tests/toolkit/mozapps/update/test/unit/head_update.js | [resetEnvironment : 2950] removing MOZ_NO_SERVICE_FALLBACK environment variable

TEST-INFO | (xpcshell/head.js) | test 2 finished

TEST-INFO | (xpcshell/head.js) | running event loop
TEST-INFO | c:/talos-slave/test/build/xpcshell/tests/toolkit/mozapps/update/test/unit/test_0200_app_launch_apply_update.js | [PO_observe : 195] topic process-finished, process exitValue 0

TEST-UNEXPECTED-FAIL | c:/talos-slave/test/build/xpcshell/tests/toolkit/mozapps/update/test/unit/test_0200_app_launch_apply_update.js | Exceeded MAX_TIMEOUT_RUNS whilst waiting for updates log to be created - See following stack:
JS frame :: c:\talos-slave\test\build\xpcshell\head.js :: do_throw :: line 440
JS frame :: c:/talos-slave/test/build/xpcshell/tests/toolkit/mozapps/update/test/unit/test_0200_app_launch_apply_update.js :: checkUpdateFinished :: line 250
JS frame :: c:\talos-slave\test\build\xpcshell\head.js :: <TOP_LEVEL> :: line 121

TEST-INFO | (xpcshell/head.js) | exiting test

TEST-UNEXPECTED-FAIL | c:\talos-slave\test\build\xpcshell\head.js | exception thrown from do_timeout callback: 2147500036 - See following stack:
JS frame :: c:\talos-slave\test\build\xpcshell\head.js :: do_throw :: line 440
JS frame :: c:\talos-slave\test\build\xpcshell\head.js :: <TOP_LEVEL> :: line 123

TEST-INFO | (xpcshell/head.js) | exiting test
TEST-INFO | c:/talos-slave/test/build/xpcshell/tests/toolkit/mozapps/update/test/unit/test_0200_app_launch_apply_update.js | [end_test : 167] removing update test directory c:\talos-slave\test\build\thunderbird\update_test
WARNING: nsExceptionService ignoring thread destruction after shutdown: file e:/builds/moz2_slave/tb-comm-cen-w32-dbg/build/mozilla/xpcom/base/nsExceptionService.cpp, line 166
WARNING: OOPDeinit() without successful OOPInit(): file e:/builds/moz2_slave/tb-comm-cen-w32-dbg/build/mozilla/toolkit/crashreporter/nsExceptionHandler.cpp, line 1992
nsStringStats
 => mAllocCount:           3042
 => mReallocCount:          511
 => mFreeCount:            3042
 => mShareCount:           8486
 => mAdoptCount:            133
 => mAdoptFreeCount:        133
<<<<<<<

TEST-UNEXPECTED-FAIL | c:\talos-slave\test\build\xpcshell\tests\toolkit\mozapps\update\test\unit\test_0201_app_launch_apply_update.js | test failed (with xpcshell return code: 3), see following log:
>>>>>>>
### XPCOM_MEM_LEAK_LOG defined -- logging leaks to c:\docume~1\cltbld\locals~1\temp\tmpi2rtk0\runxpcshelltests_leaks.log

TEST-INFO | (xpcshell/head.js) | test 1 pending

TEST-INFO | (xpcshell/head.js) | test 2 pending
TEST-INFO | c:/talos-slave/test/build/xpcshell/tests/toolkit/mozapps/update/test/unit/test_0201_app_launch_apply_update.js | [adjustPathsOnWindows : 285] Using this new bin directory: c:\docume~1\cltbld\locals~1\temp\tmpi2rtk0\ExecutableDir.tmp\bin

TEST-PASS | c:/talos-slave/test/build/xpcshell/tests/toolkit/mozapps/update/test/unit/test_0201_app_launch_apply_update.js | [run_test : 159] true == true

TEST-INFO | (xpcshell/head.js) | test 2 finished

TEST-INFO | (xpcshell/head.js) | running event loop

TEST-UNEXPECTED-FAIL | c:/talos-slave/test/build/xpcshell/tests/toolkit/mozapps/update/test/unit/test_0201_app_launch_apply_update.js | Exceeded MAX_TIMEOUT_RUNS whilst waiting for update to be applied - See following stack:
JS frame :: c:\talos-slave\test\build\xpcshell\head.js :: do_throw :: line 440
JS frame :: c:/talos-slave/test/build/xpcshell/tests/toolkit/mozapps/update/test/unit/test_0201_app_launch_apply_update.js :: checkUpdateApplied :: line 338
JS frame :: c:\talos-slave\test\build\xpcshell\head.js :: <TOP_LEVEL> :: line 121

TEST-INFO | (xpcshell/head.js) | exiting test

TEST-UNEXPECTED-FAIL | c:\talos-slave\test\build\xpcshell\head.js | exception thrown from do_timeout callback: 2147500036 - See following stack:
JS frame :: c:\talos-slave\test\build\xpcshell\head.js :: do_throw :: line 440
JS frame :: c:\talos-slave\test\build\xpcshell\head.js :: <TOP_LEVEL> :: line 123

TEST-INFO | (xpcshell/head.js) | exiting test
c:/talos-slave/test/build/xpcshell/tests/toolkit/mozapps/update/test/unit/test_0201_app_launch_apply_update.js:194: TypeError: gProcess is undefined
WARNING: nsExceptionService ignoring thread destruction after shutdown: file e:/builds/moz2_slave/tb-comm-cen-w32-dbg/build/mozilla/xpcom/base/nsExceptionService.cpp, line 166
WARNING: OOPDeinit() without successful OOPInit(): file e:/builds/moz2_slave/tb-comm-cen-w32-dbg/build/mozilla/toolkit/crashreporter/nsExceptionHandler.cpp, line 1992
nsStringStats
 => mAllocCount:           3693
 => mReallocCount:          296
 => mFreeCount:            3693
 => mShareCount:           9531
 => mAdoptCount:            143
 => mAdoptFreeCount:        143
<<<<<<<

TEST-UNEXPECTED-FAIL | c:\talos-slave\test\build\xpcshell\tests\toolkit\mozapps\update\test\unit\test_0203_app_launch_apply_update.js | test failed (with xpcshell return code: 3), see following log:
>>>>>>>
### XPCOM_MEM_LEAK_LOG defined -- logging leaks to c:\docume~1\cltbld\locals~1\temp\tmpw1_qhz\runxpcshelltests_leaks.log

TEST-INFO | (xpcshell/head.js) | test 1 pending

TEST-INFO | (xpcshell/head.js) | test 2 pending
TEST-INFO | c:/talos-slave/test/build/xpcshell/tests/toolkit/mozapps/update/test/unit/test_0203_app_launch_apply_update.js | [adjustPathsOnWindows : 313] Using this new bin directory: c:\docume~1\cltbld\locals~1\temp\tmpw1_qhz\ExecutableDir.tmp\bin

TEST-PASS | c:/talos-slave/test/build/xpcshell/tests/toolkit/mozapps/update/test/unit/test_0203_app_launch_apply_update.js | [run_test : 166] true == true

TEST-INFO | (xpcshell/head.js) | test 2 finished

TEST-INFO | (xpcshell/head.js) | running event loop

TEST-UNEXPECTED-FAIL | c:/talos-slave/test/build/xpcshell/tests/toolkit/mozapps/update/test/unit/test_0203_app_launch_apply_update.js | Exceeded whilst waiting for update to be applied - See following stack:
JS frame :: c:\talos-slave\test\build\xpcshell\head.js :: do_throw :: line 440
JS frame :: c:/talos-slave/test/build/xpcshell/tests/toolkit/mozapps/update/test/unit/test_0203_app_launch_apply_update.js :: checkUpdateApplied :: line 366
JS frame :: c:\talos-slave\test\build\xpcshell\head.js :: <TOP_LEVEL> :: line 121

TEST-INFO | (xpcshell/head.js) | exiting test

TEST-UNEXPECTED-FAIL | c:\talos-slave\test\build\xpcshell\head.js | exception thrown from do_timeout callback: 2147500036 - See following stack:
JS frame :: c:\talos-slave\test\build\xpcshell\head.js :: do_throw :: line 440
JS frame :: c:\talos-slave\test\build\xpcshell\head.js :: <TOP_LEVEL> :: line 123

TEST-INFO | (xpcshell/head.js) | exiting test
c:/talos-slave/test/build/xpcshell/tests/toolkit/mozapps/update/test/unit/test_0203_app_launch_apply_update.js:222: TypeError: gProcess is undefined
WARNING: nsExceptionService ignoring thread destruction after shutdown: file e:/builds/moz2_slave/tb-comm-cen-w32-dbg/build/mozilla/xpcom/base/nsExceptionService.cpp, line 166
WARNING: OOPDeinit() without successful OOPInit(): file e:/builds/moz2_slave/tb-comm-cen-w32-dbg/build/mozilla/toolkit/crashreporter/nsExceptionHandler.cpp, line 1992
nsStringStats
 => mAllocCount:           3734
 => mReallocCount:          299
 => mFreeCount:            3734
 => mShareCount:           9568
 => mAdoptCount:            145
 => mAdoptFreeCount:        145
<<<<<<<
Whiteboard: [tb-orange]
I think I'm getting a handle on at least one of the test issues. I've currently got a patch running on try server for it.
Assignee: nobody → mbanner
I've got test_0201 and test_0203 to run further on try server with:

+++ b/toolkit/mozapps/update/test/unit/head_update.js.in
+#ifdef MOZ_MAINTENANCE_SERVICE
 const STATE_APPLIED_PLATFORM = IS_WIN ? STATE_APPLIED_SVC : STATE_APPLIED;
+#else
+const STATE_APPLIED_PLATFORM = STATE_APPLIED;
+#endif

This obviously didn't quite take account of the maintenance service not being in use.

With this fix the broken tests all get to the same state where checkUpdateFinished fails due to not seeing the succeeded state. I've got some more debug running for that on try.
(In reply to Mark Banner (:standard8) from comment #26)
> With this fix the broken tests all get to the same state where
> checkUpdateFinished fails due to not seeing the succeeded state. I've got
> some more debug running for that on try.

So this step is stuck because the state is "applied" and not "succeeded". It seems like the second run of the app to complete the update isn't progressing to the succeeded state. I did try setting app.update.stage.enabled to true, but that didn't work either.

Hence currently I'm at a bit of a loss as to why this isn't working (just on windows as well).
This doesn't fix it fully, but we might as well progress this a bit further.

As mentioned in comment 26, this takes account of the maintenance service not necessarily being in use on Windows and then changes the correct state. This gets test_0201 and test_0203 through the checkUpdateApplied state to the checkUpdateFinished state.

I've also added a little bit more debug for when we're timing out - so that we know what state we're in, and made the debug more consistent across all the files.
Attachment #638280 - Flags: review?(ehsan)
Oh and any hints to the checkUpdateFinished issues would be greatly appreciated, I'm not sure what's going on there.

Try server runs with the latest patch will appear here in the next couple of hours:

https://tbpl.mozilla.org/?tree=Thunderbird-Try&noignore=1&rev=b2d1c1f4b982
Comment on attachment 638280 [details] [diff] [review]
Partial fix and debug clean up

Review of attachment 638280 [details] [diff] [review]:
-----------------------------------------------------------------

::: toolkit/mozapps/update/test/unit/head_update.js.in
@@ +292,4 @@
>  const STATE_APPLIED_PLATFORM = IS_WIN ? STATE_APPLIED_SVC : STATE_APPLIED;
> +#else
> +const STATE_APPLIED_PLATFORM = STATE_APPLIED;
> +#endif

This is not what we want.  So it turns out that Thunderbird doesn't define MOZ_MAINTENANCE_SERVICE at all, so we should just disable these tests on Thunderbird, I think!

Instead of this, at the beginning of shouldRunServiceTest, just do an ifndef MOZ_MAINTENANCE_SERVICE, and return false.  :-)

::: toolkit/mozapps/update/test/unit/test_0200_app_launch_apply_update.js
@@ +247,4 @@
>    log.append(FILE_UPDATE_LOG);
>    if (!log.exists()) {
>      if (gTimeoutRuns > MAX_TIMEOUT_RUNS)
> +      do_throw("Exceeded MAX_TIMEOUT_RUNS whilst waiting for updates log to be created at" + log.path);

Nit: space before ".

::: toolkit/mozapps/update/test_svc/unit/test_0200_app_launch_apply_update_svc.js
@@ +204,4 @@
>    log.append(FILE_UPDATE_LOG);
>    if (!log.exists()) {
>      if (++gTimeoutRuns > MAX_TIMEOUT_RUNS)
> +      do_throw("Exceeded MAX_TIMEOUT_RUNS whilst waiting for updates log to be created at" + log.path);

Ditto.
Attachment #638280 - Flags: review?(ehsan) → review-
Comment on attachment 638280 [details] [diff] [review]
Partial fix and debug clean up

OK, I didn't realize this patch also affects the non _svc tests.  r=me then!
Attachment #638280 - Flags: review- → review+
Ok, latest update is that I think I may have found the issue.

In confvars.sh Thunderbird doesn't specify MOZ_APP_BASENAME. A side-effect of this is that update-settings.ini doesn't get generated.

Although it looks like the unit tests should pass without this (as at least in some places they generate it themselves), I think I may have missed something with it being missing, or something else due to MOZ_APP_BASENAME as a try run with MOZ_APP_BASENAME set has passed the test_0200_app_launch_apply_update.js run.

test_0201 & test_0203 didn't pass because the run didn't include the patch from this bug. I'm now rebuilding with that patch overnight to check this out in more detail.
Blocks: 771085
Checked in:

https://hg.mozilla.org/mozilla-central/rev/675f55c4310c
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla16
Comment on attachment 638280 [details] [diff] [review]
Partial fix and debug clean up

[Approval Request Comment]
Bug caused by (feature/regressing bug #): bug 307181
User impact if declined: None, test-only
Testing completed (on m-c, etc.): Landed on m-c
Risk to taking this patch (and alternatives if risky): None
String or UUID changes made by this patch: None

Test-only change, last in a series of permanent orange fixes for apps which don't yet use the maintenance service.
Attachment #638280 - Flags: approval-mozilla-aurora?
Comment on attachment 638280 [details] [diff] [review]
Partial fix and debug clean up

[Triage Comment]
Test-only orange fix for fallout from bg updates. Approved for Aurora 15.
Attachment #638280 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Whiteboard: [tb-orange]
You need to log in before you can comment on or make changes to this bug.