Closed Bug 1233522 Opened 4 years ago Closed 4 years ago

Intermittent TEST-UNEXPECTED-FAIL | apps/ftu/test/marionette/late_customization_test.js | First Time Use > perform late customization with apps "before each" hook for "free apps panel is shown"

Categories

(Firefox OS Graveyard :: Gaia::First Time Experience, defect)

ARM
Gonk (Firefox OS)
defect
Not set

Tracking

(Not tracked)

RESOLVED FIXED
2.6 S5 - 1/15

People

(Reporter: mikehenrty, Assigned: sfoster)

Details

(Keywords: intermittent-failure, Whiteboard: [MJS][systemsfe])

Attachments

(2 files)

TEST-UNEXPECTED-FAIL | apps/ftu/test/marionette/late_customization_test.js | First Time Use > perform late customization with apps "before each" hook for "free apps panel is shown"
ScriptTimeout: ScriptTimeout: timed out
Remote Stack:
<none>
    at Error.MarionetteError (node_modules/marionette-client/lib/marionette/error.js:105:13)
    at Object.Client._handleCallback (node_modules/marionette-client/lib/marionette/client.js:529:19)
    at Object.<anonymous> (node_modules/marionette-client/lib/marionette/client.js:572:23)
    at TcpSync.sendRaw (node_modules/marionette-client/lib/marionette/drivers/tcp-sync.js:226:10)
    at TcpSync.sendMessage (node_modules/marionette-client/lib/marionette/drivers/tcp-sync.js:209:15)
    at TcpSync.send (node_modules/marionette-client/lib/marionette/drivers/tcp-sync.js:190:17)
    at Object.send (node_modules/marionette-client/lib/marionette/client.js:492:34)
    at Object.Client._sendCommand (node_modules/marionette-client/lib/marionette/client.js:553:21)
    at Object._executeScript (node_modules/marionette-client/lib/marionette/client.js:1597:19)
    at Object.executeAsyncScript (node_modules/marionette-client/lib/marionette/client.js:1385:19)
    at SettingsApi.set (node_modules/marionette-settings-api/index.js:45:29)
    at apps/ftu/test/marionette/late_customization_test.js:43:29
    at apps/ftu/test/marionette/lib/latecustomization.js:58:9
Sam could you take a look here?
Flags: needinfo?(sfoster)
(In reply to Michael Henretty [:mhenretty] from comment #2)
> Sam could you take a look here?

Yeah. I'm currently looking into bug 1233520. This test spawns a child process for the mock server and at a glance seems to be the same cause.
Assignee: nobody → sfoster
Flags: needinfo?(sfoster)
Whiteboard: [MJS] → [MJS][systemsfe]
Target Milestone: --- → 2.6 S5 - 1/15
Comment on attachment 8705751 [details] [review]
[gaia] sfoster:ftu-latecustom-gij-intermittent-bug-1233522 > mozilla-b2g:master

I've re-triggered this a bunch of times, but so far it looks stable. It appears to be another instance of the client.settings.set call in setup timing out. So this patch just bumps up that timeout. ISTM that 20s should be more than enough and it would be good to get a better understanding of what is actually taking all that time. But provided it stays green, I think this is good to go for this particular test.
Attachment #8705751 - Flags: review?(mhenretty)
Comment on attachment 8705751 [details] [review]
[gaia] sfoster:ftu-latecustom-gij-intermittent-bug-1233522 > mozilla-b2g:master

I have a hard time believing this helps the situation. The default script timeout right now is 60 seconds [1], so you are actually decreasing the timeout with this change. I'd like to see a test run only running this test, and re-triggering something like 40 times.

1.) https://github.com/mozilla-b2g/gaia/blob/59229f2e82f2e841edd99d79cb1c5055260304ea/tests/jsmarionette/client/marionette-client/lib/marionette/client.js#L9
Attachment #8705751 - Flags: review?(mhenretty)
(In reply to Michael Henretty [:mhenretty] from comment #6)
> Comment on attachment 8705751 [details] [review]
> [gaia] sfoster:ftu-latecustom-gij-intermittent-bug-1233522 >
> mozilla-b2g:master
> 
> I have a hard time believing this helps the situation. The default script
> timeout right now is 60 seconds [1], so you are actually decreasing the
> timeout with this change. 

Ooops, yeah I must have first looked into this before your patch landed to increase that timeout. So I wonder if this is even still intermittent? That only failure I've seen associated with this test is a timeout in the client.settings.set call, which I believe is directly associated with the scriptTimeout and the normal heavy load in the main process during startup. I've updated this PR to just run that single test and re-triggered and load of times: 

https://treeherder.mozilla.org/#/jobs?repo=gaia&revision=e6304d8f2fb1ffaf876b586b2c3133ef99924d3c&selectedJob=3351147
Gij6 failed 1/40 in that last run, in the FTU late customization > apps are installed test. It looks like the crash occurred when trying to launch the newly installed app from the homescreen. We got an 'unload was called exception'. Here's the relevant part of the stack: 

    at Object.Client.waitFor (/home/worker/git_checkout/node_modules/marionette-client/lib/marionette/client.js:727:60)
    at Object.Homescreen.launchIcon (/home/worker/git_checkout/apps/homescreen/test/marionette/lib/homescreen.js:278:12)
    at Context.<anonymous> (/home/worker/git_checkout/apps/ftu/test/marionette/late_customization_test.js:123:14)
    at callFn (/home/worker/git_checkout/node_modules/mocha/lib/runnable.js:286:21)
    at Test.Runnable.run (/home/worker/git_checkout/node_modules/mocha/lib/runnable.js:279:7)
    at Test.MarionetteTest.run (/home/worker/git_checkout/node_modules/marionette-js-runner/lib/ui.js:25:31)
    at Runner.runTest (/home/worker/git_checkout/node_modules/mocha/lib/runner.js:421:10)
    at /home/worker/git_checkout/node_modules/mocha/lib/runner.js:528:12
    at next (/home/worker/git_checkout/node_modules/mocha/lib/runner.js:341:14)
    at /home/worker/git_checkout/node_modules/mocha/lib/runner.js:351:7
    at next (/home/worker/git_checkout/node_modules/mocha/lib/runner.js:283:14)
    at /home/worker/git_checkout/node_modules/mocha/lib/runner.js:314:7
    at done (/home/worker/git_checkout/node_modules/mocha/lib/runnable.js:247:5)
    at callFn (/home/worker/git_checkout/node_modules/mocha/lib/runnable.js:301:7)
    at Hook.Runnable.run (/home/worker/git_checkout/node_modules/mocha/lib/runnable.js:279:7)
    at next (/home/worker/git_checkout/node_modules/mocha/lib/runner.js:297:10)
    at /home/worker/git_checkout/node_modules/mocha/lib/runner.js:314:7
    at done (/home/worker/git_checkout/node_modules/mocha/lib/runnable.js:247:5)
    at /home/worker/git_checkout/node_modules/mocha/lib/runnable.js:317:7
    at /home/worker/git_checkout/apps/ftu/test/marionette/late_customization_test.js:52:11
    at /home/worker/git_checkout/apps/ftu/test/marionette/lib/latecustomization.js:58:9',
       message: 'JavaScriptError: unload was called
       Remote Stack:
       <none>'

I'm not sure what to make of it. My best guess is that in calling home.launchIcon - which clicks on the icon to launch the test app - some other process was killed or reference otherwise lost. If that's the case, we can expect similar issues with any test that launches apps? Does that sound familiar :mhenretty? I do see this code in Contacts apparently working around a similar issue: https://github.com/mozilla-b2g/gaia/blob/master/apps/communications/contacts/test/marionette/lib/contacts.js#L257:L271 - but I would want to understand it all better before considering something similar.
Flags: needinfo?(mhenretty)
Where did you get that remote stack from? Are you able to reproduce that locally? When I look at the failed run from treeherder [1], all I see is "Crash detected but error running stackwalk" which is due to bug 1175116.

1.) https://treeherder.mozilla.org/logviewer.html#?job_id=3352642&repo=gaia
Flags: needinfo?(mhenretty) → needinfo?(sfoster)
(In reply to Michael Henretty [:mhenretty] from comment #9)
> Where did you get that remote stack from? Are you able to reproduce that
> locally? When I look at the failed run from treeherder [1], all I see is
> "Crash detected but error running stackwalk" which is due to bug 1175116.

The stack is available in https://public-artifacts.taskcluster.net/UeF0OEzdSHuOiG6Uhc7_1A/0/public/marionette_js_tests/debug.log - its about 1/2 way down, just find in page for "unload was called". Some TH tasks produce this debug.log, but not all? I'm not sure where it is configured. 

I've not yet reproduced this locally. I'll try a few more times.
Flags: needinfo?(sfoster) → needinfo?(mhenretty)
Weird. We really need a screenshot here to see what's going on. You might tree console.log(client.screenshot) in launchIcon to see what's going on with the homescreen when it's timing out. Other than that I'm not sure how to help here.
Flags: needinfo?(mhenretty)
Comment on attachment 8705751 [details] [review]
[gaia] sfoster:ftu-latecustom-gij-intermittent-bug-1233522 > mozilla-b2g:master

That's 40 consecutive green runs at: https://treeherder.mozilla.org/#/jobs?repo=gaia&revision=c10c31fdbde6d72495d9451178976c1b3c30f8a5 .. The patch just re-enables the test and some minor logging changes. I've updated the PR again to remove the marionette_js_tests.yml changes and will retrigger gij block 6 some more. But as I can't reproduce this locally either I'm inclined to just land it and watch it carefully - unless you have other suggestions.
Attachment #8705751 - Flags: review?(mhenretty)
Comment on attachment 8705751 [details] [review]
[gaia] sfoster:ftu-latecustom-gij-intermittent-bug-1233522 > mozilla-b2g:master

Works for me. Better error logging is always useful. Let's keep an eye out.
Attachment #8705751 - Flags: review?(mhenretty) → review+
Merged to master to re-enable this test: https://github.com/mozilla-b2g/gaia/commit/b1455a3233b072b7e31560d5308ffaa944d3736a
Status: NEW → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.