Closed Bug 1090434 Opened 9 years ago Closed 9 years ago

e10s talos tp5o is failing on holly - appears to be hung on


(Testing :: Talos, defect)




Tracking Status
e10s + ---


(Reporter: jmaher, Assigned: billm)


(Blocks 1 open bug)



(1 file)

looking at the latest push:

we have a lot of red tp jobs.  Looking at the logs in detail (except for osx), we see a consistent failure pattern:

15:30:50     INFO -  __WARNTimeout (18/20) exceeded on http://localhost/page_load_test/tp5n/
15:30:50     INFO -  RSS: Main: 106823680
15:30:50     INFO -  __WARNTimeout (19/20) exceeded on http://localhost/page_load_test/tp5n/
15:30:50     INFO -  RSS: Main: 105926656

for some reason we hit our internal timeout of 5 seconds with no pageload or mozafterpaint event and terminate the run after 20 attempts.
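
To make the failure pattern in the log easier to reason about, here is a rough sketch of the timeout bookkeeping described above. It is not the pageloader's actual code: the function name, the simulated input, and the assumption that timeouts are counted cumulatively across the run are all hypothetical. Only the 5 second timeout and the limit of 20 come from this bug.

```javascript
// Hypothetical sketch, not the real Talos pageloader code.
const PAGE_TIMEOUT_MS = 5000; // internal timeout of 5 seconds (from this bug)
const MAX_TIMEOUTS = 20;      // the run is terminated after 20 timeouts (from this bug)

// loadTimesMs: simulated time for each load attempt to deliver its
// load/mozafterpaint event, or null if the event never arrives.
// Assumption: timeouts are counted cumulatively over the whole run.
function simulateRun(loadTimesMs) {
  let timeouts = 0;
  const log = [];
  for (const t of loadTimesMs) {
    if (t !== null && t <= PAGE_TIMEOUT_MS) {
      continue; // the event arrived in time, nothing to record
    }
    timeouts += 1;
    log.push(`Timeout (${timeouts}/${MAX_TIMEOUTS}) exceeded`);
    if (timeouts >= MAX_TIMEOUTS) {
      return { aborted: true, timeouts, log }; // give up on the run
    }
  }
  return { aborted: false, timeouts, log };
}

// A page that never delivers its event exhausts the budget and aborts the run.
const hung = simulateRun(new Array(25).fill(null));
console.log(hung.aborted, hung.timeouts);
```

With a page that never reports back, every attempt burns the full timeout, which matches the steadily climbing (18/20), (19/20) counters in the log.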

So why is this happening now when it wasn't before?

I think a next step might be to download a build and run this locally.  Are we executing too many e10s calls?  Is there something special about the webpage that is causing a hang or exposing a bug?
good news! ok, reproduced this locally, I get a failure on the page.  I will have to compare the console output with/without e10s as well as try to see if events are making it back.
and I get this in the console window:
JavaScript error: http://localhost:15707/page_load_test/tp5n/, line 18: Error: Permission denied to access property 'apply'

turning e10s off, it works fine.  So somewhere along the line either this Apple website is doing something wrong, or Firefox is doing something wrong.

since coherent.js is minified js, it is hard to tell what is really going on, here is the source for it all:

:billm, is there anything you can think of as to why we are getting this error and hanging tp5?
Flags: needinfo?(wmccloskey)
hacking around in the .js file, I see this as the root cause:
var K=I.prototype.__factory__.apply(I,J);

commenting out that line, I am able to continue on and test away.
some steps to reproduce this:
1) get talos and have it running:
2) get the tp5 pageset (, copy it to the talos/page_load_test/ directory, then unzip it there
3) cd talos/
4) run 'python -v -e /home/jmaher/dump/firefox/firefox --e10s -a tp5o --develop --output tp5.yml --results_url t.out --datazilla-url t.json'  <- NOTE: put a path to your firefox
5) edit page_load_test/tp5n/tp5o.manifest.develop, remove everything but the entry containing
6) run 'python -d -n tp5.yml'

you will see that the page doesn't load and we hit the internal pageloader timeout.  We actually receive the load event, but we never set up the mozafterpaint event.  If you repeat steps 4-6 but without --e10s, you will see it work just fine.
Attached patch fix-apple.com
Sorry this took me so long to get to. The problem is that the frame script is using content.wrappedJSObject.setTimeout to do its timing. The coherent.js code is overriding setTimeout with its own version. However, since its version is unprivileged, it's not allowed to call our privileged timeout handler--that's where the failure comes from.

The solution is to use an X-ray wrapper (i.e., don't go through wrappedJSObject) so that we get the default version of setTimeout. With this change, it seems to work.
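
As an illustration only, the difference between the two code paths can be simulated in plain JavaScript. Nothing below is real Gecko code: the PRIVILEGED marker, callFromPage, and makeContentWindow are hypothetical stand-ins for the privilege boundary, and the timers fire immediately to keep the sketch synchronous. The only things taken from this bug are the wrappedJSObject path, the overridden setTimeout, and the error message.

```javascript
// Simulation only: plain objects standing in for Firefox's privilege boundary.
const PRIVILEGED = Symbol("privileged"); // hypothetical marker for privileged functions

function privileged(fn) {
  fn[PRIVILEGED] = true;
  return fn;
}

// Stand-in for unprivileged page code trying to invoke a function.
function callFromPage(fn, args) {
  if (fn[PRIVILEGED]) {
    throw new Error("Permission denied to access property 'apply'");
  }
  return fn.apply(null, args);
}

function makeContentWindow() {
  // Default setTimeout; it runs the callback right away in this sketch.
  const native = { setTimeout: (cb) => cb() };
  // The page's view: setTimeout replaced by a page-defined wrapper that
  // invokes the callback from unprivileged code.
  const pageView = {
    setTimeout: (cb) => native.setTimeout(() => callFromPage(cb, [])),
  };
  // Direct access mimics the X-ray view; wrappedJSObject exposes page overrides.
  return { setTimeout: native.setTimeout, wrappedJSObject: pageView };
}

const content = makeContentWindow();
let handled = 0;
const handler = privileged(() => { handled += 1; });

let viaWrapped;
try {
  content.wrappedJSObject.setTimeout(handler, 0); // goes through the page override
  viaWrapped = "ok";
} catch (e) {
  viaWrapped = e.message;
}

content.setTimeout(handler, 0); // X-ray style: default setTimeout, handler runs
console.log(viaWrapped, handled);
```

In this simulation the call through wrappedJSObject fails with the permission error and the handler never runs, while the direct call reaches the default setTimeout and the handler runs once.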
Assignee: nobody → wmccloskey
Flags: needinfo?(wmccloskey)
Attachment #8517077 - Flags: review?(jmaher)
Comment on attachment 8517077

Review of attachment 8517077:

nice, I am glad you could figure this out!
Attachment #8517077 - Flags: review?(jmaher) → review+
please land this on talos and I will work on getting this in production.
Blocks: 1094961
this is ready for deployment, will be updated this week on inbound.
Closed: 9 years ago
Resolution: --- → FIXED