Intermittent Windows/Mac e10s startup hang leading to REFTEST PROCESS-CRASH | reftest | application crashed [@ CrashingThread(void *)], | application timed out after 330 seconds with no output

RESOLVED FIXED in Firefox 50

Status

P3
normal
RESOLVED FIXED
3 years ago
2 years ago

People

(Reporter: RyanVM, Assigned: jimm)

Tracking

(Blocks: 1 bug, {crash, intermittent-failure})

Trunk
mozilla52
Unspecified
Windows
crash, intermittent-failure

Firefox Tracking Flags

(e10s?, firefox46 unaffected, firefox47 wontfix, firefox48 wontfix, firefox49 wontfix, firefox50 fixed, firefox51 fixed, firefox52 fixed)

Details

(Whiteboard: [e10s-orangeblockers])

Attachments

(2 attachments, 1 obsolete attachment)

(Reporter)

Description

3 years ago
Created attachment 8744762 [details]
screenshot

I'd noticed this on Ash but haven't had time to narrow it down. Then I noticed today that it was affecting Aurora as well. After a giant pile of retriggering, I can say with a high level of certainty that it's caused by bug 1235633. It also fits the regression range on Ash (and I assume it would on other trunk branches as well, but I'll leave that to the full-time sheriffs to confirm).

Screenshot always shows a blank window. Affects Win7/Win8 reftests & crashtests with e10s enabled.

https://treeherder.mozilla.org/logviewer.html#?job_id=2460175&repo=mozilla-aurora
Flags: needinfo?(wmccloskey)
(Reporter)

Comment 1

3 years ago
(In reply to Ryan VanderMeulen [:RyanVM] from comment #0)
> but I'll leave that to the full-time sheriffs to confirm).

Going back through the history, I see failures matching this starred as "intermittent" without a bug being filed going back for about two weeks, so there's that...
(Assignee)

Updated

3 years ago
Blocks: 984139
tracking-e10s: ? → +
Can you provide more information about the priority here, Ryan? How often does it happen? Does it happen on inbound/central? I don't understand why it wasn't caught when my patch originally landed.
Flags: needinfo?(wmccloskey) → needinfo?(ryanvm)
(Reporter)

Comment 3

3 years ago
It's intermittent. Yes, happens on production as well:
https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1267106&endday=2016-04-28&startday=2016-04-01&tree=trunk

Mostly PGO-only, for whatever that's worth. My main concern is whether this'll manifest in a user-facing way or not.
Flags: needinfo?(ryanvm)
12 automation job failures were associated with this bug in the last 7 days.

Repository breakdown:
* mozilla-aurora: 5
* mozilla-inbound: 3
* mozilla-central: 3
* try: 1

Platform breakdown:
* windows7-32: 12

For more details, see:
https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1267106&startday=2016-04-25&endday=2016-05-01&tree=all
6 automation job failures were associated with this bug in the last 7 days.

Repository breakdown:
* mozilla-inbound: 3
* mozilla-central: 1
* mozilla-beta: 1
* fx-team: 1

Platform breakdown:
* windows7-32: 4
* osx-10-10: 2

For more details, see:
https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1267106&startday=2016-05-30&endday=2016-06-05&tree=all
6 automation job failures were associated with this bug in the last 7 days.

Repository breakdown:
* mozilla-inbound: 3
* mozilla-aurora: 2
* fx-team: 1

Platform breakdown:
* windows7-32: 4
* windows8-64: 1
* osx-10-10: 1

For more details, see:
https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1267106&startday=2016-06-06&endday=2016-06-12&tree=all
5 automation job failures were associated with this bug in the last 7 days.

Repository breakdown:
* mozilla-inbound: 2
* fx-team: 2
* ash: 1

Platform breakdown:
* windows7-32: 4
* osx-10-10: 1

For more details, see:
https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1267106&startday=2016-06-13&endday=2016-06-19&tree=all
12 automation job failures were associated with this bug in the last 7 days.

Repository breakdown:
* mozilla-beta: 4
* mozilla-aurora: 3
* try: 1
* mozilla-inbound: 1
* mozilla-central: 1
* fx-team: 1
* autoland: 1

Platform breakdown:
* windows8-64: 6
* windows7-32: 5
* windows7-32-vm: 1

For more details, see:
https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1267106&startday=2016-06-20&endday=2016-06-26&tree=all
6 automation job failures were associated with this bug in the last 7 days.

Repository breakdown:
* mozilla-inbound: 3
* mozilla-beta: 2
* fx-team: 1

Platform breakdown:
* windows7-32: 6

For more details, see:
https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1267106&startday=2016-06-27&endday=2016-07-03&tree=all
Intermittent test crash
Priority: -- → P3
18 automation job failures were associated with this bug in the last 7 days.

Repository breakdown:
* try: 4
* mozilla-inbound: 3
* mozilla-beta: 3
* fx-team: 3
* autoland: 3
* mozilla-central: 1
* ash: 1

Platform breakdown:
* windows7-32: 11
* linux64: 4
* windows8-64: 2
* windowsxp: 1

For more details, see:
https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1267106&startday=2016-07-04&endday=2016-07-10&tree=all
6 automation job failures were associated with this bug in the last 7 days.

Repository breakdown:
* ash: 5
* mozilla-inbound: 1

Platform breakdown:
* windows8-64: 5
* windows7-32: 1

For more details, see:
https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1267106&startday=2016-07-11&endday=2016-07-17&tree=all
(Reporter)

Comment 13

2 years ago
 09:30:07 INFO - JavaScript error: , line 0: uncaught exception: undefined
15 automation job failures were associated with this bug yesterday.

Repository breakdown:
* mozilla-inbound: 15

Platform breakdown:
* windows7-32: 15

For more details, see:
https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1267106&startday=2016-07-28&endday=2016-07-28&tree=all
39 automation job failures were associated with this bug in the last 7 days.

Repository breakdown:
* mozilla-inbound: 15
* autoland: 13
* ash: 9
* try: 2

Platform breakdown:
* windows7-32: 38
* windows8-64: 1

For more details, see:
https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1267106&startday=2016-07-25&endday=2016-07-31&tree=all
20 automation job failures were associated with this bug yesterday.

Repository breakdown:
* mozilla-inbound: 10
* fx-team: 5
* ash: 3
* autoland: 2

Platform breakdown:
* windows7-32: 20

For more details, see:
https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1267106&startday=2016-08-02&endday=2016-08-02&tree=all
33 automation job failures were associated with this bug yesterday.

Repository breakdown:
* autoland: 22
* mozilla-inbound: 11

Platform breakdown:
* windows7-32: 33

For more details, see:
https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1267106&startday=2016-08-03&endday=2016-08-03&tree=all
42 automation job failures were associated with this bug yesterday.

Repository breakdown:
* autoland: 19
* mozilla-inbound: 17
* fx-team: 6

Platform breakdown:
* windows7-32: 41
* windows7-32-vm: 1

For more details, see:
https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1267106&startday=2016-08-04&endday=2016-08-04&tree=all
26 automation job failures were associated with this bug yesterday.

Repository breakdown:
* mozilla-inbound: 12
* autoland: 8
* fx-team: 5
* mozilla-central: 1

Platform breakdown:
* windows7-32-vm: 15
* windows7-32: 11

For more details, see:
https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1267106&startday=2016-08-05&endday=2016-08-05&tree=all
189 automation job failures were associated with this bug in the last 7 days.

Repository breakdown:
* mozilla-inbound: 79
* autoland: 61
* fx-team: 32
* mozilla-central: 10
* ash: 5
* mozilla-aurora: 2

Platform breakdown:
* windows7-32: 152
* windows7-32-vm: 35
* windows8-64: 2

For more details, see:
https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1267106&startday=2016-08-01&endday=2016-08-07&tree=all
40 automation job failures were associated with this bug yesterday.

Repository breakdown:
* autoland: 24
* mozilla-inbound: 13
* fx-team: 2
* mozilla-central: 1

Platform breakdown:
* windows7-32-vm: 23
* windows7-32: 16
* windows8-64: 1

For more details, see:
https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1267106&startday=2016-08-08&endday=2016-08-08&tree=all
19 automation job failures were associated with this bug yesterday.

Repository breakdown:
* mozilla-inbound: 11
* autoland: 6
* fx-team: 2

Platform breakdown:
* windows7-32-vm: 11
* windows7-32: 6
* windowsxp: 1
* android-4-3-armv7-api15: 1

For more details, see:
https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1267106&startday=2016-08-09&endday=2016-08-09&tree=all
20 automation job failures were associated with this bug yesterday.

Repository breakdown:
* mozilla-inbound: 10
* autoland: 8
* mozilla-central: 1
* fx-team: 1

Platform breakdown:
* windows7-32-vm: 12
* windows7-32: 8

For more details, see:
https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1267106&startday=2016-08-10&endday=2016-08-10&tree=all
118 automation job failures were associated with this bug in the last 7 days.

Repository breakdown:
* autoland: 52
* mozilla-inbound: 45
* fx-team: 12
* mozilla-central: 9

Platform breakdown:
* windows7-32-vm: 66
* windows7-32: 48
* windowsxp: 1
* windows8-64: 1
* osx-10-10: 1
* android-4-3-armv7-api15: 1

For more details, see:
https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1267106&startday=2016-08-08&endday=2016-08-14&tree=all
(Reporter)

Comment 25

2 years ago
This is extremely frequent across the board on Windows these days. Would be great if we could get someone to look into it. Affects all Windows e10s suites that use the reftest harness.
status-firefox47: affected → wontfix
status-firefox48: affected → wontfix
status-firefox49: --- → affected
status-firefox50: --- → affected
status-firefox51: --- → affected
Flags: needinfo?(jmathies)
40 automation job failures were associated with this bug yesterday.

Repository breakdown:
* autoland: 21
* mozilla-inbound: 17
* mozilla-beta: 2

Platform breakdown:
* windows7-32-vm: 20
* windows7-32: 18
* windows8-64: 2

For more details, see:
https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1267106&startday=2016-08-16&endday=2016-08-16&tree=all
15 automation job failures were associated with this bug yesterday.

Repository breakdown:
* autoland: 7
* mozilla-inbound: 3
* fx-team: 3
* mozilla-central: 2

Platform breakdown:
* windows7-32: 8
* windows7-32-vm: 7

For more details, see:
https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1267106&startday=2016-08-17&endday=2016-08-17&tree=all
24 automation job failures were associated with this bug yesterday.

Repository breakdown:
* mozilla-inbound: 8
* autoland: 7
* mozilla-aurora: 6
* mozilla-central: 2
* try: 1

Platform breakdown:
* windows7-32: 12
* windows7-32-vm: 8
* windows8-64: 4

For more details, see:
https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1267106&startday=2016-08-18&endday=2016-08-18&tree=all
107 automation job failures were associated with this bug in the last 7 days.

Repository breakdown:
* autoland: 42
* mozilla-inbound: 39
* mozilla-aurora: 12
* mozilla-central: 6
* mozilla-beta: 4
* fx-team: 3
* try: 1

Platform breakdown:
* windows7-32: 50
* windows7-32-vm: 47
* windows8-64: 10

For more details, see:
https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1267106&startday=2016-08-15&endday=2016-08-21&tree=all
(Assignee)

Comment 30

2 years ago
Tracy, can you look into this? Ping me if you have questions.
Flags: needinfo?(jmathies) → needinfo?(twalker)
Per IRC, re-NIing Jim.
Flags: needinfo?(twalker) → needinfo?(jmathies)
(Reporter)

Comment 32

2 years ago
Bug 1265229 is probably the same issue.
See Also: → bug 1265229
I'm willing to believe that the OS X failures which started a few days ago are somehow different, but until someone explains how by filing a separate bug, I'm going to just dump them in here.
Summary: Intermittent Windows e10s startup hang leading to REFTEST PROCESS-CRASH | reftest | application crashed [@ CrashingThread(void *)] → Intermittent Windows/Mac e10s startup hang leading to REFTEST PROCESS-CRASH | reftest | application crashed [@ CrashingThread(void *)], | application timed out after 330 seconds with no output
53 automation job failures were associated with this bug yesterday.

Repository breakdown:
* autoland: 20
* mozilla-inbound: 16
* mozilla-aurora: 8
* mozilla-central: 4
* mozilla-beta: 4
* fx-team: 1

Platform breakdown:
* osx-10-10: 23
* windows7-32: 20
* windows8-64: 7
* windows7-32-vm: 3

For more details, see:
https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1267106&startday=2016-08-25&endday=2016-08-25&tree=all
37 automation job failures were associated with this bug yesterday.

Repository breakdown:
* mozilla-inbound: 20
* autoland: 11
* mozilla-central: 3
* mozilla-aurora: 3

Platform breakdown:
* windows7-32: 19
* osx-10-10: 12
* windows7-32-vm: 5
* windows8-64: 1

For more details, see:
https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1267106&startday=2016-08-26&endday=2016-08-26&tree=all
119 automation job failures were associated with this bug in the last 7 days.

Repository breakdown:
* autoland: 43
* mozilla-inbound: 42
* mozilla-central: 16
* mozilla-aurora: 11
* mozilla-beta: 4
* fx-team: 3

Platform breakdown:
* windows7-32: 66
* osx-10-10: 35
* windows7-32-vm: 10
* windows8-64: 8

For more details, see:
https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1267106&startday=2016-08-22&endday=2016-08-28&tree=all
24 automation job failures were associated with this bug yesterday.

Repository breakdown:
* mozilla-inbound: 10
* mozilla-central: 6
* autoland: 5
* fx-team: 3

Platform breakdown:
* windows7-32: 14
* osx-10-10: 10

For more details, see:
https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1267106&startday=2016-08-30&endday=2016-08-30&tree=all
(Assignee)

Updated

2 years ago
tracking-e10s: + → ?
Flags: needinfo?(jmathies)
24 automation job failures were associated with this bug yesterday.

Repository breakdown:
* mozilla-inbound: 10
* autoland: 6
* mozilla-central: 4
* mozilla-aurora: 3
* mozilla-beta: 1

Platform breakdown:
* windows7-32: 11
* osx-10-10: 9
* windows8-64: 2
* windows7-32-vm: 2

For more details, see:
https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1267106&startday=2016-08-31&endday=2016-08-31&tree=all

Comment 39

2 years ago
I looked at this, and the logs say that we are failing to launch the child process entirely, which triggers some assertions, but we don't have stacks or reasons for the launch failure.
19 automation job failures were associated with this bug yesterday.

Repository breakdown:
* mozilla-inbound: 6
* mozilla-beta: 5
* fx-team: 3
* autoland: 3
* mozilla-central: 2

Platform breakdown:
* windows7-32: 13
* windows8-64: 5
* windows7-32-vm: 1

For more details, see:
https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1267106&startday=2016-09-01&endday=2016-09-01&tree=all
17 automation job failures were associated with this bug yesterday.

Repository breakdown:
* mozilla-aurora: 7
* mozilla-inbound: 3
* mozilla-beta: 2
* fx-team: 2
* autoland: 2
* mozilla-central: 1

Platform breakdown:
* windows7-32: 8
* windows8-64: 7
* windows7-32-vm: 2

For more details, see:
https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1267106&startday=2016-09-02&endday=2016-09-02&tree=all
17 automation job failures were associated with this bug yesterday.

Repository breakdown:
* mozilla-inbound: 6
* mozilla-central: 3
* mozilla-beta: 3
* mozilla-aurora: 3
* fx-team: 1
* autoland: 1

Platform breakdown:
* windows7-32: 11
* windows8-64: 3
* windows7-32-vm: 3

For more details, see:
https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1267106&startday=2016-09-03&endday=2016-09-03&tree=all
136 automation job failures were associated with this bug in the last 7 days.

Repository breakdown:
* mozilla-inbound: 42
* autoland: 35
* mozilla-central: 17
* mozilla-aurora: 16
* mozilla-beta: 11
* fx-team: 9
* ash: 6

Platform breakdown:
* windows7-32: 78
* osx-10-10: 24
* windows8-64: 18
* windows7-32-vm: 16

For more details, see:
https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1267106&startday=2016-08-29&endday=2016-09-04&tree=all
33 automation job failures were associated with this bug yesterday.

Repository breakdown:
* mozilla-inbound: 15
* autoland: 11
* mozilla-release: 3
* mozilla-central: 2
* fx-team: 2

Platform breakdown:
* windows7-32-vm: 30
* windows8-64: 3

For more details, see:
https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1267106&startday=2016-09-05&endday=2016-09-05&tree=all
41 automation job failures were associated with this bug yesterday.

Repository breakdown:
* mozilla-inbound: 21
* autoland: 15
* mozilla-central: 4
* fx-team: 1

Platform breakdown:
* windows7-32-vm: 41

For more details, see:
https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1267106&startday=2016-09-06&endday=2016-09-06&tree=all
I dragged a handy mozilla-central push off into a corner and retriggered the Win7 opt e10s reftest-noaccel 25 times, and got this 13 times on it, so that suite is unacceptable and needs to be hidden. Not sure how many other flavors also need to be hidden.
Only 7 of 25 on PGO, but then, 28% is also completely unacceptable.
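
For reference, the two retrigger results above work out as follows. This is only a sketch of the arithmetic; the helper name `failureRatePercent` is made up for illustration and is not part of the harness or of Treeherder.

```javascript
// Hypothetical helper (not part of any Mozilla tooling): percentage of
// retriggered runs that hit the failure.
function failureRatePercent(failures, runs) {
  if (runs <= 0) {
    throw new RangeError("runs must be positive");
  }
  // Multiply before dividing so the small counts used here stay exact.
  return (failures * 100) / runs;
}

const optRate = failureRatePercent(13, 25); // Win7 opt e10s reftest-noaccel
const pgoRate = failureRatePercent(7, 25);  // the PGO retriggers
console.log(`opt: ${optRate}%, PGO: ${pgoRate}%`);
```

That is 52% on opt, alongside the 28% quoted for PGO.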
90 automation job failures were associated with this bug yesterday.

Repository breakdown:
* mozilla-inbound: 32
* mozilla-central: 24
* autoland: 22
* fx-team: 7
* mozilla-beta: 2
* mozilla-aurora: 2
* ash: 1

Platform breakdown:
* windows7-32-vm: 88
* windows8-64: 2

For more details, see:
https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1267106&startday=2016-09-07&endday=2016-09-07&tree=all
Hiding it is temporarily blocked by bug 1301260.

Fun story: the Mac failures completely stopped September 1st, and since that time there have been 4 failures on regular accelerated reftest, and 245 on reftest-noaccel.
(Assignee)

Updated

2 years ago
Flags: needinfo?(jmathies)
38 automation job failures were associated with this bug yesterday.

Repository breakdown:
* mozilla-inbound: 13
* autoland: 11
* mozilla-central: 5
* fx-team: 5
* mozilla-aurora: 2
* ash: 2

Platform breakdown:
* windows7-32-vm: 37
* windows8-64: 1

For more details, see:
https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1267106&startday=2016-09-08&endday=2016-09-08&tree=all
35 automation job failures were associated with this bug yesterday.

Repository breakdown:
* autoland: 14
* mozilla-inbound: 8
* try: 4
* mozilla-central: 3
* mozilla-aurora: 3
* fx-team: 2
* mozilla-release: 1

Platform breakdown:
* windows7-32-vm: 32
* windows8-64: 3

For more details, see:
https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1267106&startday=2016-09-09&endday=2016-09-09&tree=all
Win7 opt/PGO e10s Ru hidden on all trunk trees (and try, though it won't actually be running there until next week), click the "Excluded Jobs" link in the gray bar at the top of treeherder to see it.
Blocks: 1301905
269 automation job failures were associated with this bug in the last 7 days.

Repository breakdown:
* mozilla-inbound: 96
* autoland: 76
* mozilla-central: 44
* fx-team: 26
* mozilla-aurora: 8
* ash: 7
* mozilla-release: 5
* try: 4
* mozilla-beta: 3

Platform breakdown:
* windows7-32-vm: 256
* windows8-64: 11
* linux64: 1
* linux32: 1

For more details, see:
https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1267106&startday=2016-09-05&endday=2016-09-11&tree=all
15 automation job failures were associated with this bug in the last 7 days.

Repository breakdown:
* mozilla-aurora: 10
* mozilla-inbound: 2
* mozilla-release: 1
* mozilla-central: 1
* autoland: 1

Platform breakdown:
* windows8-64: 6
* windows7-32-vm: 6
* windows7-32: 3

For more details, see:
https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1267106&startday=2016-09-12&endday=2016-09-18&tree=all
Interesting. I was about to update the exclusion to hide it on aurora, since we merged whatever is busted there this morning, but 10 retriggers there gave me 10 green runs.
25 automation job failures were associated with this bug yesterday.

Repository breakdown:
* mozilla-inbound: 10
* mozilla-beta: 10
* autoland: 3
* fx-team: 2

Platform breakdown:
* windows7-32-vm: 15
* windows8-64: 7
* windows7-32: 3

For more details, see:
https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1267106&startday=2016-09-23&endday=2016-09-23&tree=all
70 automation job failures were associated with this bug in the last 7 days.

Repository breakdown:
* mozilla-beta: 38
* mozilla-inbound: 14
* try: 8
* mozilla-aurora: 4
* autoland: 4
* fx-team: 2

Platform breakdown:
* windows8-64: 28
* windows7-32-vm: 28
* windows7-32: 12
* osx-10-10: 2

For more details, see:
https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1267106&startday=2016-09-19&endday=2016-09-25&tree=all
16 automation job failures were associated with this bug yesterday.

Repository breakdown:
* mozilla-beta: 7
* mozilla-inbound: 4
* autoland: 4
* mozilla-central: 1

Platform breakdown:
* windows7-32-vm: 9
* windows8-64: 4
* windows7-32: 3

For more details, see:
https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1267106&startday=2016-09-27&endday=2016-09-27&tree=all
21 automation job failures were associated with this bug yesterday.

Repository breakdown:
* mozilla-beta: 8
* mozilla-inbound: 7
* autoland: 4
* try: 1
* fx-team: 1

Platform breakdown:
* windows7-32-vm: 12
* windows8-64: 3
* windows7-32: 3
* osx-10-10: 2
* windows10-64: 1

For more details, see:
https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1267106&startday=2016-09-28&endday=2016-09-28&tree=all
63 automation job failures were associated with this bug in the last 7 days.

Repository breakdown:
* mozilla-beta: 24
* mozilla-inbound: 14
* autoland: 14
* mozilla-aurora: 4
* mozilla-central: 3
* fx-team: 3
* try: 1

Platform breakdown:
* windows7-32-vm: 36
* windows8-64: 14
* windows7-32: 10
* osx-10-10: 2
* windows10-64: 1

For more details, see:
https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1267106&startday=2016-09-26&endday=2016-10-02&tree=all
(Assignee)

Comment 61

2 years ago
An update here - on Windows the child process *is* launching, but for some reason it fails to load up content and operate correctly. Will continue digging.

07:24:46     INFO - REFTEST INFO | Checking for orphan ssltunnel processes...
07:24:46     INFO - REFTEST INFO | Checking for orphan xpcshell processes...
07:24:47     INFO - REFTEST INFO | Running with e10s: True
07:24:47     INFO - REFTEST INFO | Application command: C:\slave\test\build\application\firefox\firefox.exe -marionette -profile c:\users\cltbld\appdata\local\temp\tmpwydbjz.mozrunner
07:24:51     INFO - 1475763891578	Marionette	INFO	Listening on port 2828
07:24:52     INFO - GeckoChildProcessHost::LaunchAndWaitForProcessHandle()
07:24:52     INFO - GeckoChildProcessHost::PrepareLaunch()
07:24:52     INFO - GeckoChildProcessHost::RunPerformAsyncLaunch()
07:24:52     INFO - GeckoChildProcessHost::PerformAsyncLaunch()
07:24:52     INFO - GeckoChildProcessHost::PerformAsyncLaunchInternal()
07:24:52     INFO - PerformAsyncLaunchInternal: GeckoProcessType_Content
07:24:52     INFO - call mSandboxBroker.LaunchApp...
07:24:52     INFO - ==> process 2160 launched child process 3124 ("C:\slave\test\build\application\firefox\firefox.exe" -contentproc --channel="2160.0.121296096\921340549" -greomni "C:\slave\test\build\application\firefox\omni.ja" -appomni "C:\slave\test\build\application\firefox\browser\omni.ja" -appdir "C:\slave\test\build\application\firefox\browser"  2160 "\\.\pipe\gecko-crash-server-pipe.2160" tab)
(timeout)
Flags: needinfo?(jmathies)
(Assignee)

Comment 62

2 years ago
This is caused by a mysteriously missing 'load' event that reftest-content.js registers for when it loads [1]. I couldn't find any existing bugs on frame scripts failing to receive 'load', so maybe this is something new. I need to dig a bit further.

try log [2]:
05:48:05     INFO - REFTEST INFO | Checking for orphan ssltunnel processes...
05:48:05     INFO - REFTEST INFO | Checking for orphan xpcshell processes...
05:48:06     INFO - REFTEST INFO | Running with e10s: True
05:48:06     INFO - REFTEST INFO | Application command: C:\slave\test\build\application\firefox\firefox.exe -marionette -profile c:\users\cltbld\appdata\local\temp\tmpffuyrr.mozrunner
05:48:06     INFO - **** firefox.exe child process launched
05:48:10     INFO - 1475930890470	Marionette	INFO	Listening on port 2828
05:48:11     INFO - GeckoChildProcessHost::LaunchAndWaitForProcessHandle()
05:48:11     INFO - GeckoChildProcessHost::PrepareLaunch()
05:48:11     INFO - GeckoChildProcessHost::RunPerformAsyncLaunch()
05:48:11     INFO - GeckoChildProcessHost::PerformAsyncLaunch()
05:48:11     INFO - GeckoChildProcessHost::PerformAsyncLaunchInternal()
05:48:11     INFO - PerformAsyncLaunchInternal: GeckoProcessType_Content
05:48:11     INFO - call mSandboxBroker.LaunchApp...
05:48:11     INFO - ==> process 3420 launched child process 1840 ("C:\slave\test\build\application\firefox\firefox.exe" -contentproc --channel="3420.0.1033849503\316985519" -greomni "C:\slave\test\build\application\firefox\omni.ja" -appomni "C:\slave\test\build\application\firefox\browser\omni.ja" -appdir "C:\slave\test\build\application\firefox\browser"  3420 "\\.\pipe\gecko-crash-server-pipe.3420" tab)
05:48:11     INFO - TabParent::SendLoadRemoteScript url='chrome://global/content/browser-content.js'
05:48:11     INFO - TabParent::SendLoadRemoteScript url='resource://gre/modules/addons/Content.js'
05:48:11     INFO - TabParent::SendLoadRemoteScript url='chrome://satchel/content/formSubmitListener.js'
05:48:11     INFO - **** firefox.exe child process launched
05:48:11     INFO - TabParent::SendLoadRemoteScript url='chrome://gfxsanity/content/gfxFrameScript.js'
05:48:11     INFO - dropping into loop->Run()
05:48:11     INFO - TabChild::RecvLoadRemoteScript url='chrome://global/content/browser-content.js'
05:48:11     INFO - TabChild::RecvLoadRemoteScript url='resource://gre/modules/addons/Content.js'
05:48:11     INFO - TabChild::RecvLoadRemoteScript url='chrome://satchel/content/formSubmitListener.js'
05:48:11     INFO - TabChild::RecvLoadRemoteScript url='chrome://gfxsanity/content/gfxFrameScript.js'
05:48:12     INFO - TabParent::SendLoadRemoteScript url='chrome://global/content/browser-content.js'
05:48:12     INFO - TabParent::SendLoadRemoteScript url='resource://gre/modules/addons/Content.js'
05:48:12     INFO - TabParent::SendLoadRemoteScript url='chrome://satchel/content/formSubmitListener.js'
05:48:12     INFO - TabParent::SendLoadRemoteScript url='chrome://global/content/browser-child.js'
05:48:12     INFO - TabParent::SendLoadRemoteScript url='chrome://global/content/select-child.js'
05:48:12     INFO - TabParent::SendLoadRemoteScript url='chrome://browser/content/tab-content.js'
05:48:12     INFO - TabParent::SendLoadRemoteScript url='chrome://browser/content/content.js'
05:48:12     INFO - TabParent::SendLoadRemoteScript url='chrome://browser/content/content-UITour.js'
05:48:12     INFO - TabParent::SendLoadRemoteScript url='chrome://global/content/manifestMessages.js'
05:48:12     INFO - TabParent::SendLoadRemoteScript url='chrome://browser/content/content-sessionStore.js'
05:48:12     INFO - TabParent::SendLoadRemoteScript url='chrome://marionette/content/listener.js'
05:48:12     INFO - TabChild::RecvLoadRemoteScript url='chrome://global/content/browser-content.js'
05:48:12     INFO - TabChild::RecvLoadRemoteScript url='resource://gre/modules/addons/Content.js'
05:48:12     INFO - TabChild::RecvLoadRemoteScript url='chrome://satchel/content/formSubmitListener.js'
05:48:12     INFO - TabChild::RecvLoadRemoteScript url='chrome://global/content/browser-child.js'
05:48:12     INFO - TabChild::RecvLoadRemoteScript url='chrome://global/content/select-child.js'
05:48:12     INFO - TabChild::RecvLoadRemoteScript url='chrome://browser/content/tab-content.js'
05:48:12     INFO - TabChild::RecvLoadRemoteScript url='chrome://browser/content/content.js'
05:48:12     INFO - TabChild::RecvLoadRemoteScript url='chrome://browser/content/content-UITour.js'
05:48:12     INFO - TabChild::RecvLoadRemoteScript url='chrome://global/content/manifestMessages.js'
05:48:12     INFO - TabChild::RecvLoadRemoteScript url='chrome://browser/content/content-sessionStore.js'
05:48:12     INFO - TabChild::RecvLoadRemoteScript url='chrome://marionette/content/listener.js'
05:48:14     INFO - REFTEST INFO | function OnRefTestLoad(win)
05:48:14     INFO - TabParent::SendLoadRemoteScript url='chrome://global/content/browser-content.js'
05:48:14     INFO - TabParent::SendLoadRemoteScript url='resource://gre/modules/addons/Content.js'
05:48:14     INFO - TabParent::SendLoadRemoteScript url='chrome://satchel/content/formSubmitListener.js'
05:48:14     INFO - TabChild::RecvLoadRemoteScript url='chrome://global/content/browser-content.js'
05:48:14     INFO - TabChild::RecvLoadRemoteScript url='resource://gre/modules/addons/Content.js'
05:48:14     INFO - TabChild::RecvLoadRemoteScript url='chrome://satchel/content/formSubmitListener.js'
05:48:14     INFO - REFTEST INFO | function RegisterMessageListenersAndLoadContentScript()
05:48:14     INFO - REFTEST INFO | Loading reftest-content.js frame script...
05:48:14     INFO - TabParent::SendLoadRemoteScript url='chrome://reftest/content/reftest-content.js'
05:48:14     INFO - TabChild::RecvLoadRemoteScript url='chrome://reftest/content/reftest-content.js'
05:48:14     INFO - REFTEST INFO | [CONTENT] reftest-content.js: i'm loaded!
05:53:44    ERROR - REFTEST ERROR | reftest | application timed out after 330 seconds with no output

[1] http://searchfox.org/mozilla-central/rev/c635b8c61d648bb8a0317c19f8905b3be8132a8a/layout/tools/reftest/reftest-content.js#1157
[2] https://treeherder.mozilla.org/#/jobs?repo=try&revision=29f88c9234fbbeee8a19d1d06489b6b205e29d53
45 automation job failures were associated with this bug in the last 7 days.

Repository breakdown:
* mozilla-beta: 36
* fx-team: 3
* try: 2
* mozilla-release: 1
* mozilla-inbound: 1
* mozilla-central: 1
* mozilla-aurora: 1

Platform breakdown:
* windows8-64: 24
* windows7-32: 9
* windows7-32-vm: 8
* osx-10-10: 4

For more details, see:
https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1267106&startday=2016-10-03&endday=2016-10-09&tree=all
(Assignee)

Comment 64

2 years ago
Created attachment 8799781 [details] [diff] [review]
wip
Assignee: nobody → jmathies
(Assignee)

Comment 65

2 years ago
Created attachment 8799889 [details] [diff] [review]
fix

This isn't a browser bug, it's a harness bug. The reftest-content.js frame script is sometimes loaded after the load event for content has already fired, so its load listener never runs, reftest-content.js never gets started, and we hang on test startup. The fix is to check whether the load event has already fired and, if so, go ahead and get the test suite running. Otherwise, keep the old load event as the starting trigger.
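
The pattern described here can be sketched roughly as below. This is a minimal illustration, not the attached patch: `startWhenLoaded`, `makeFakeTarget`, and the `start` callback are hypothetical names, and the fake event target exists only so the sketch can run outside a browser. The one detail taken from the bug is the check of the document's `readyState` against `"complete"`.

```javascript
// Hypothetical helper: run `start` once the document has loaded, even if the
// 'load' event fired before this script was injected.
function startWhenLoaded(doc, target, start) {
  if (doc.readyState === "complete") {
    // 'load' has already fired; waiting for it would hang forever.
    start();
    return "immediate";
  }
  // Otherwise keep the original trigger: wait for 'load', once.
  const onLoad = () => {
    target.removeEventListener("load", onLoad, true);
    start();
  };
  target.addEventListener("load", onLoad, true);
  return "deferred";
}

// Minimal stand-in for an event target, for demonstration only.
function makeFakeTarget() {
  const listeners = new Set();
  return {
    addEventListener(type, fn) {
      if (type === "load") listeners.add(fn);
    },
    removeEventListener(type, fn) {
      if (type === "load") listeners.delete(fn);
    },
    fireLoad() {
      // Copy first so listeners can remove themselves while we iterate.
      for (const fn of [...listeners]) fn();
    },
  };
}

// Late injection: the document is already complete, so start right away.
startWhenLoaded({ readyState: "complete" }, makeFakeTarget(), () => {
  console.log("started without waiting for 'load'");
});
```

In the early-injection case the same helper falls back to the listener, so behavior is unchanged for runs that were already passing.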

https://treeherder.mozilla.org/#/jobs?repo=try&revision=e36bf38727ce3bdadb91ae29525c2f5fbd0f5b5e
Attachment #8799781 - Attachment is obsolete: true
Attachment #8799889 - Flags: review?(jmaher)
Comment on attachment 8799889 [details] [diff] [review]
fix

Review of attachment 8799889 [details] [diff] [review]:
-----------------------------------------------------------------

good find!
Attachment #8799889 - Flags: review?(jmaher) → review+
(Assignee)

Updated

2 years ago
Keywords: checkin-needed

Comment 68

2 years ago
Pushed by ryanvm@gmail.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/bb296adbce78
Reftest harness should check readyState and continue if the document state is 'complete'. r=jmaher
Keywords: checkin-needed
(Assignee)

Comment 69

2 years ago
(In reply to Phil Ringnalda (:philor) from comment #52)
> Win7 opt/PGO e10s Ru hidden on all trunk trees (and try, though it won't
> actually be running there until next week), click the "Excluded Jobs" link
> in the gray bar at the top of treeherder to see it.

Should I file a bug on undoing hiding of tests now that we (hopefully) have this fixed?
Flags: needinfo?(philringnalda)
Nope, that would be bug 1301905. You just have to wait until the fix has at least made it to mozilla-central and thus around to other branches, if not also given it a day or so to bake after that, and then poke me there.
Flags: needinfo?(philringnalda)

Comment 71

2 years ago
bugherder
https://hg.mozilla.org/mozilla-central/rev/bb296adbce78
Status: NEW → RESOLVED
Last Resolved: 2 years ago
status-firefox52: --- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla52
(Reporter)

Comment 72

2 years ago
Thanks for tracking this down, Jim!
status-firefox49: affected → wontfix
Whiteboard: [e10s-orangeblockers][checkin-needed-aurora][checkin-needed-beta]

Comment 73

2 years ago
bugherder uplift
https://hg.mozilla.org/releases/mozilla-aurora/rev/5f981bfee893
status-firefox51: affected → fixed
Whiteboard: [e10s-orangeblockers][checkin-needed-aurora][checkin-needed-beta] → [e10s-orangeblockers][checkin-needed-beta]
https://hg.mozilla.org/releases/mozilla-beta/rev/9646fff5e1c7
Whiteboard: [e10s-orangeblockers][checkin-needed-beta] → [e10s-orangeblockers]

Updated

2 years ago
status-firefox50: affected → fixed
33 automation job failures were associated with this bug in the last 7 days.

Repository breakdown:
* mozilla-beta: 26
* mozilla-release: 3
* mozilla-aurora: 2
* fx-team: 1
* autoland: 1

Platform breakdown:
* windows8-64: 15
* windows7-32: 10
* osx-10-10: 4
* windows7-32-vm: 3
* linux64: 1

For more details, see:
https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1267106&startday=2016-10-10&endday=2016-10-16&tree=all
5 automation job failures were associated with this bug in the last 7 days.

Repository breakdown:
* mozilla-release: 5

Platform breakdown:
* windows8-64: 5

For more details, see:
https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1267106&startday=2016-10-24&endday=2016-10-30&tree=all