Closed Bug 1330550 Opened 7 years ago Closed 7 years ago

stylo: Talos fail with "Could not find report in browser output"

Categories

(Testing :: Talos, defect, P3)

x86_64
Linux
defect

Tracking

(Not tracked)

RESOLVED WORKSFORME

People

(Reporter: shinglyu, Unassigned)

References

Details

Talos running on Stylo

https://treeherder.mozilla.org/#/jobs?repo=try&revision=cc113d4a2d9acd3eb7cd30542bf1f079e7ae5705&selectedJob=66082395

The g1, g2 and tp tests failed with:

Could not find report in browser output: [('tsformat', ('__start_report', '__end_report')), ('tpformat', ('__start_tp_report', '__end_tp_report'))]
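For readers unfamiliar with this error: the message lists the start/end marker pairs that the harness searches for in the browser's output. The following is a minimal sketch of that kind of search, not Talos's actual parser. The marker strings are copied from the error above; the function name find_report and its return shape are assumptions made for illustration.

```python
import re

# Marker pairs copied verbatim from the error message in this bug.
# Everything else here is a hypothetical illustration, not Talos code.
REPORT_FORMATS = [
    ("tsformat", ("__start_report", "__end_report")),
    ("tpformat", ("__start_tp_report", "__end_tp_report")),
]


def find_report(browser_output):
    """Return (format_name, report_text) for the first marker pair found.

    Raises ValueError when no pair of markers is present, which is what
    happens when the browser dies before printing its results.
    """
    for name, (start, end) in REPORT_FORMATS:
        # Non-greedy match so only the text between one pair is captured.
        pattern = re.escape(start) + r"(.*?)" + re.escape(end)
        match = re.search(pattern, browser_output, re.DOTALL)
        if match:
            return name, match.group(1).strip()
    raise ValueError(
        "Could not find report in browser output: %r" % (REPORT_FORMATS,)
    )
```

If the browser crashes or is killed before it prints the closing marker, a search like this finds nothing, which would be consistent with the failure reported here.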
Component: CSS Parsing and Computation → Talos
Product: Core → Testing
Will, since you are involved in the mail thread, could you take a look at this or redirect to the right person? Thank you.
Flags: needinfo?(wlachance)
Blocks: stylo-perf-test
No longer blocks: stylo
It looks like Talos isn't running successfully due to errors in the browser. On g2 (damp test) I see this in the log:
02:51:49     INFO -  PROCESS | 9209 | ERROR:geckoservo::glue: Unnecessary call to traverse_subtree
02:51:50     INFO -  PROCESS | 9209 | stylo: skipping declaration without ParserContextExtraData
02:51:50     INFO -  PROCESS | 9209 | ERROR:geckoservo::glue: Unnecessary call to traverse_subtree
02:51:50     INFO -  PROCESS | 9209 | ERROR:geckoservo::glue: Unnecessary call to traverse_subtree
02:51:50     INFO -  PROCESS | 9209 | console.error:
02:51:50    ERROR -  PROCESS | 9209 |   Message: TypeError: doc.documentElement is null
02:51:50     INFO -  PROCESS | 9209 |   Stack:
02:51:50     INFO -  PROCESS | 9209 |     supportsHighlighters@resource://gre/modules/commonjs/toolkit/loader.js -> resource://devtools/server/actors/inspector.js:2857:9
02:51:50     INFO -  PROCESS | 9209 | handler@resource://gre/modules/commonjs/toolkit/loader.js -> resource://devtools/shared/protocol.js:1082:19
02:51:50     INFO -  PROCESS | 9209 | onPacket@resource://gre/modules/commonjs/toolkit/loader.js -> resource://devtools/server/main.js:1752:15
02:51:50     INFO -  PROCESS | 9209 | receiveMessage@resource://gre/modules/commonjs/toolkit/loader.js -> resource://devtools/shared/transport/transport.js:761:7
02:51:50     INFO -  PROCESS | 9209 |
02:51:50     INFO -  PROCESS | 9209 | supportsHighlighters@resource://gre/modules/commonjs/toolkit/loader.js -> resource://devtools/server/actors/inspector.js:2857:9
02:51:50     INFO -  PROCESS | 9209 | handler@resource://gre/modules/commonjs/toolkit/loader.js -> resource://devtools/shared/protocol.js:1082:19
02:51:50     INFO -  PROCESS | 9209 | onPacket@resource://gre/modules/commonjs/toolkit/loader.js -> resource://devtools/server/main.js:1752:15
02:51:50     INFO -  PROCESS | 9209 | receiveMessage@resource://gre/modules/commonjs/toolkit/loader.js -> resource://devtools/shared/transport/transport.js:761:7
02:51:50     INFO -  PROCESS | 9209 | ERROR:geckoservo::glue: Unnecessary call to traverse_subtree
02:51:50     INFO -  PROCESS | 9209 | ERROR:geckoservo::glue: Unnecessary call to traverse_subtree
02:51:50     INFO -  PROCESS | 9209 | Assertion failure: mSource.IsGeckoRuleNode(), at /home/worker/workspace/build/src/layout/style/nsStyleContext.h:315
02:51:50     INFO -  PROCESS | 9209 | #01: ???[/builds/slave/test/build/application/firefox/libxul.so +0xa258ce]
02:51:50     INFO -  PROCESS | 9209 | ExceptionHandler::GenerateDump cloned child 9330
02:51:50     INFO -  PROCESS | 9209 | ExceptionHandler::SendContinueSignalToChild sent continue signal to child
02:51:50     INFO -  PROCESS | 9209 | ExceptionHandler::WaitForContinueSignal waiting for continue signal...
02:51:50     INFO -  Terminating psutil.Process(pid=9209, name='firefox')
02:51:51     INFO -  TEST-INFO | 9209: exit 11


The assertion is probably what terminates the browser before it produces the results we want. In tp and g1, I see hundreds of these messages: |ERROR:geckoservo::glue: Unnecessary call to traverse_subtree|. Talos expects output and progress at reasonable intervals (usually within 5 seconds) or it will quit.

I would run this locally and compare Firefox vs. Stylo; that seems like the best bet for watching what is going on.
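When comparing a Firefox log against a Stylo log, it can help to reduce each one to the few signals discussed above: assertion failures, the count of repeated traverse_subtree messages, and the process exit code. Below is a rough sketch of such a triage helper. The patterns are taken from the log excerpt above; the function name triage and the summary dictionary layout are hypothetical, and other logs may need different patterns.

```python
import re

# Patterns taken from the log excerpt in this bug; the helper itself is
# a hypothetical illustration and not part of Talos or mozharness.
ASSERTION_RE = re.compile(r"Assertion failure: (.*)")
EXIT_RE = re.compile(r"TEST-INFO \| \d+: exit (\d+)")
NOISE_MESSAGE = "ERROR:geckoservo::glue: Unnecessary call to traverse_subtree"


def triage(log_text):
    """Summarise crash evidence found in a Talos log."""
    summary = {"assertions": [], "noise_count": 0, "exit_codes": []}
    for line in log_text.splitlines():
        # Count the repeated Stylo message rather than storing every copy.
        if NOISE_MESSAGE in line:
            summary["noise_count"] += 1
        assertion = ASSERTION_RE.search(line)
        if assertion:
            summary["assertions"].append(assertion.group(1).strip())
        exit_code = EXIT_RE.search(line)
        if exit_code:
            summary["exit_codes"].append(int(exit_code.group(1)))
    return summary
```

Running it over both logs and comparing the two summaries gives a quick view of whether the Stylo build asserts or exits abnormally where the Firefox build does not.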
Flags: needinfo?(wlachance)
I agree with Joel's advice to run locally, though I would personally start with a conceptually simpler test. tp5o (find it under the "tp" symbol in Treeherder) is probably my favorite (https://wiki.mozilla.org/Buildbot/Talos/Tests#tp5). Looking there, I see some pretty weird output before it fails:

02:50:49     INFO -  TEST-INFO | started process 29436 (/builds/slave/test/build/application/firefox/firefox -profile /tmp/tmp83QyoG/profile http://localhost:52623/getInfo.html)
02:50:50     INFO -  PROCESS | 29436 | 1483527050288	addons.xpi	WARN	Exception running bootstrap method install on shield-recipe-client@mozilla.org: [Exception... "Component returned failure code: 0x8000ffff (NS_ERROR_UNEXPECTED) [nsIPrefBranch.getBoolPref]"  nsresult: "0x8000ffff (NS_ERROR_UNEXPECTED)"  location: "JS frame :: resource://gre/modules/addons/XPIProvider.jsm -> jar:file:///builds/slave/test/build/application/firefox/browser/features/shield-recipe-client@mozilla.org.xpi!/bootstrap.js :: this.install :: line 38"  data: no] Stack trace: this.install()@resource://gre/modules/addons/XPIProvider.jsm -> jar:file:///builds/slave/test/build/application/firefox/browser/features/shield-recipe-client@mozilla.org.xpi!/bootstrap.js:38 < callBootstrapMethod()@resource://gre/modules/addons/XPIProvider.jsm:4986 < processFileChanges()@resource://gre/modules/addons/XPIProvider.jsm -> resource://gre/modules/addons/XPIProviderUtils.js:2097 < checkForChanges()@resource://gre/modules/addons/XPIProvider.jsm:3827 < startup()@resource://gre/modules/addons/XPIProvider.jsm:2831 < callProvider()@resource://gre/modules/AddonManager.jsm:264 < _startProvider()@resource://gre/modules/AddonManager.jsm:771 < startup()@resource://gre/modules/AddonManager.jsm:957 < startup()@resource://gre/modules/AddonManager.jsm:2923 < observe()@resource://gre/components/addonManager.js:65
02:50:55     INFO -  PROCESS | 29436 | [Parent 29436] WARNING: pipe error (66): Connection reset by peer: file /home/worker/workspace/build/src/ipc/chromium/src/chrome/common/ipc_channel_posix.cc, line 346
02:50:55     INFO -  PROCESS | 29436 |
02:50:55     INFO -  PROCESS | 29436 | ###!!! [Parent][MessageChannel] Error: (msgtype=0x46001A,name=PContent::Msg_PreferenceUpdate) Channel error: cannot send/recv
02:50:55     INFO -  PROCESS | 29436 |
02:50:55     INFO -  PROCESS | 29436 |
02:50:55     INFO -  PROCESS | 29436 | ###!!! [Parent][MessageChannel] Error: (msgtype=0x2C0084,name=PBrowser::Msg_Destroy) Channel error: cannot send/recv
02:50:55     INFO -  PROCESS | 29436 |
02:50:55     INFO -  PROCESS | 29436 | ERROR:geckoservo::glue: Unnecessary call to traverse_subtree
02:50:55     INFO -  PROCESS | 29436 | ERROR:geckoservo::glue: Unnecessary call to traverse_subtree
02:50:55     INFO -  PROCESS | 29436 | ERROR:geckoservo::glue: Unnecessary call to traverse_subtree
02:51:53     INFO -  PROCESS | 29436 | 1483527113649	addons.productaddons	ERROR	Request failed certificate checks: [Exception... "SSL is required and URI scheme is not https."  nsresult: "0x8000ffff (NS_ERROR_UNEXPECTED)"  location: "JS frame :: resource://gre/modules/CertUtils.jsm :: checkCert :: line 145"  data: no]
02:55:50     INFO -  PROCESS | 29436 | *************************
02:55:50     INFO -  PROCESS | 29436 | A coding exception was thrown and uncaught in a Task.
02:55:50     INFO -  PROCESS | 29436 |
02:55:50     INFO -  PROCESS | 29436 | Full message: ReferenceError: fetch is not defined
02:55:50     INFO -  PROCESS | 29436 | Full stack: apiCall@resource://shield-recipe-client/lib/NormandyApi.jsm:37:5
02:55:50     INFO -  PROCESS | 29436 | get@resource://shield-recipe-client/lib/NormandyApi.jsm:44:12
02:55:50     INFO -  PROCESS | 29436 | this.NormandyApi.fetchRecipes<@resource://shield-recipe-client/lib/NormandyApi.jsm:52:34
02:55:50     INFO -  PROCESS | 29436 | TaskImpl_run@resource://gre/modules/Task.jsm:319:42
02:55:50     INFO -  PROCESS | 29436 | TaskImpl@resource://gre/modules/Task.jsm:277:3
02:55:50     INFO -  PROCESS | 29436 | asyncFunction@resource://gre/modules/Task.jsm:252:14
02:55:50     INFO -  PROCESS | 29436 | this.RecipeRunner.start<@resource://shield-recipe-client/lib/RecipeRunner.jsm:64:23
02:55:50     INFO -  PROCESS | 29436 | TaskImpl_run@resource://gre/modules/Task.jsm:319:42
02:55:50     INFO -  PROCESS | 29436 | TaskImpl@resource://gre/modules/Task.jsm:277:3
02:55:50     INFO -  PROCESS | 29436 | asyncFunction@resource://gre/modules/Task.jsm:252:14
02:55:50     INFO -  PROCESS | 29436 | setTimeout_timer@resource://gre/modules/Timer.jsm:30:5
02:55:50     INFO -  PROCESS | 29436 |
02:55:50     INFO -  PROCESS | 29436 | *************************

https://treeherder.mozilla.org/logviewer.html#?job_id=66082404&repo=try&lineNumber=1176
Is this still a problem?
I ran the tp5o test locally. The first few runs were OK until chinaz.com, which failed to report its number. I believe the report is generated by an add-on? Where can I find it and debug it?
Also
Depends on: 1329919
Found it: testing/talos/talos/pageloader/, but I'm not sure where to start debugging.
Is there any way I can manually trigger the pageloader add-on and maybe debug the JS?
Flags: needinfo?(jmaher)
I found that the gDisableE10S parameter is not set correctly in the pageloader add-on: even when I use a non-e10s test suite, the add-on still uses the e10s handlers. But fixing that doesn't fix this bug.
OK, after some debugging, the error is clearly about cnn.com crashing the stylo build. Blocked on Bug 1329919.
Flags: needinfo?(jmaher)
Depends on: 1337305
Priority: -- → P3
Summary: [Stylo] Talos fail with "Could not find report in browser output" → stylo: Talos fail with "Could not find report in browser output"
I am not sure this is a problem anymore. :shinglyu, can you confirm this is still a problem or resolve this bug?
Flags: needinfo?(shing.lyu)
https://treeherder.mozilla.org/#/jobs?repo=mozilla-central&filter-searchStr=stylo&selectedJob=104379964

All the Talos tests on the stylo platform seem to be running successfully now. Resolving this bug. Thanks for the reminder, Joel.
Flags: needinfo?(shing.lyu)
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → WORKSFORME