Open Bug 1264278 Opened 3 years ago Updated 2 years ago

New a11y test for long page performance bug 1123039

Categories: Testing :: Talos, defect
Version: Unspecified
Platform: All
Priority: Not set

Tracking: Not tracked

Status: REOPENED

People

(Reporter: surkov, Unassigned)

References

(Blocks 1 open bug)

Details

Attachments

(3 files, 5 obsolete files)

Attached patch patch (obsolete) — Splinter Review
that's the best I could do, but I haven't even tried it, since I'm not sure there's a way to run talos for a single test without installation. It'd be awesome if somebody could get it working. I'm happy to help with or fix the a11y part if it works in an unexpected way (or doesn't work at all).
Joel how do we test this out? (Or what is the next step?)
Flags: needinfo?(jmaher)
it appears we are looking to run pdf.worker.js_index.html, but when I load that up with this patch applied, nothing shows up.

:surkov, can you verify the url to load and that it will load for you locally?
Flags: needinfo?(jmaher) → needinfo?(surkov.alexander)
(In reply to Joel Maher (:jmaher) from comment #2)
> it appears we are looking to run pdf.worker.js_index.html, but when I load
> that up with this patch applied, nothing shows up.

this file is in the patch, https://bugzilla.mozilla.org/page.cgi?id=splinter.html&bug=1264278&attachment=8740927

> :surkov, can you verify the url to load and that it will load for you
> locally?

how can I do this?
Flags: needinfo?(surkov.alexander)
Attached patch part1: infra (obsolete) — Splinter Review
keep web page files and infrastructure files separate
Attachment #8740927 - Attachment is obsolete: true
Attached patch part1: infra (obsolete) — Splinter Review
fix a11y.manifest
Attachment #8740974 - Attachment is obsolete: true
Attached patch part2: test files (obsolete) — Splinter Review
I assume it is:
file:///home/jmaher/mozilla/mozilla-inbound/testing/talos/talos/tests/a11y/pdf.worker.js_index.html

I don't understand pdf.worker.js_index.html vs pdf.worker.js.html

if you can verify via a file:/// uri that the page loads as expected with no errors, then I will see how to get it running properly in talos.
(In reply to Joel Maher (:jmaher) from comment #7)
> I assume it is:
> file:///home/jmaher/mozilla/mozilla-inbound/testing/talos/talos/tests/a11y/
> pdf.worker.js_index.html

yeah, this one; I updated a11y.manifest in the latest patch

> I don't understand pdf.worker.js_index.html vs pdf.worker.js.html

pdf.worker.js.html is the page itself; the index file initializes accessibility and loads that page into a frame. See the 'part1: infra' patch.

> if you can verify via a file:/// uri that the page loads as expected with no
> errors, then I will see how to get running properly in talos.

the file: protocol doesn't allow the enhanced privileges required by a11y.js, but otherwise the file exists and does load over the file: protocol.
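The index harness described here could look roughly like the sketch below. This is purely illustrative, not the actual patch contents: `createFrame`, `now`, and `record` are hypothetical parameters standing in for frame creation, timing, and the talos `tpRecordTime` reporting hook.

```javascript
// Hypothetical sketch of the index page's logic: start a timer, load the
// heavy page into a frame, and report the elapsed time once it has loaded.
// (All names here are illustrative; see the actual 'part1: infra' patch.)
function loadAndTime(createFrame, now, record) {
  const start = now();
  const frame = createFrame("pdf.worker.js.html"); // loads the real test page
  return frame.loaded.then(() => {
    const elapsed = now() - start;
    record(elapsed); // e.g. tpRecordTime(elapsed) when run under talos
    return elapsed;
  });
}
```

The key point is that the measurement brackets the frame load, so whatever the a11y engine does during document load is included in the reported number.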
a couple of issues I see, though they might not be related to why this isn't working for me locally on linux64:
* nsIAccessibleEvent is not defined: a11y.js:88:1
* can't access lexical declaration `EVENT_DOCUMENT_LOAD_COMPLETE' before initialization: pdf.worker.js_index.html:19:9

I don't know if these are really related to the test not running.  I suspect they are, as we don't have nsIAccessibleEvent defined.  I defined it and then found Services was not defined.  Overall I have added these two lines to a11y.js:
var nsIAccessibleEvent = Components.interfaces.nsIAccessibleEvent;
var Services = Components.utils.import("resource://gre/modules/Services.jsm", {}).Services;

After doing that, I ran |./mach talos-test -a a11yr| again and got no test completion after 10 minutes.  In fact, it runs for a little bit and then seems to halt.  I don't see any related errors in the browser console or in the terminal where I ran mach.

Let me know if there are other tricks to get this running.  I would appreciate it if this would run using:
./mach talos-test -a a11yr

then I could validate it amongst different platforms, reliability, etc.
(In reply to Joel Maher (:jmaher) from comment #9)

> After doing that, I run again |./mach talos-test -a a11yr| and I get no test
> complete after 10 minutes.

it is slow; in bug 1123039 Marco mentioned that it took about 10 minutes, in a release build I think. Does the browser ever become responsive in your case?
Given 10 minutes, it didn't do anything.  It took about 2 minutes to load, and then the browser was usable, and was still usable 10 minutes later.
Attached patch part1: infra v2 (obsolete) — Splinter Review
this one should be working
Attachment #8740976 - Attachment is obsolete: true
I put both patches in a queue locally and ran |mach talos-test -a a11yr|, and the page just kept spinning while loading the pdf worker bits (10+ minutes).  I couldn't pull up an error console to see if anything was in there, nor did I see anything in the command-line console where I ran mach.

I am wondering if this might be a lost cause; can we simplify this test case, possibly?
(In reply to Joel Maher (:jmaher) from comment #13)
> I put both patches in a queue locally and ran |mach talos-test -a a11yr| and
> the page just kept spinning while loading the pdf worker bits (10+ minutes).
> I couldn't pull up an error console to see if anything was in there, nor did
> I see anything in the commandline console where I ran mach.

Did it finally stop spinning? Was that a release build? Are there any timeouts for the talos tests?

> I am wondering if this might be a lost cause- can we simplify this test case
> possibly?

What I like about this test is that it is a real web case. I think it can be made as small as you want, but then the numbers would probably not be representative, as it may change the under-the-hood processing. Anyway, for testing/debugging purposes, the content can be shrunk.
it never stopped spinning, and there are no errors, warnings, or other messages to indicate something is happening.  This is a local opt build on linux64.

Talos requires collecting a lot of data points and returning a value; this is automated and needs to be very stable on all operating systems.  I am testing on a higher-end development machine, whereas our automation machines are a bit lower end and might never complete.

If this is a conformance test to ensure this runs, it should be a mochitest.

Can you outline what exactly you are trying to test here?  That might help determine how small to make this.
this is a perf test; we need numbers for how fast/slow we are. The goal is that the page shouldn't take >10 minutes. Maybe we can comment out a portion of the page to make the test run stably, and then re-enable those portions as the a11y engine improves.
a perf test needs at least 10 iterations to be useful, so if it takes >1 minute to load the page, I don't think it will work.  Can we reduce the page to load in <1 minute?
(In reply to Joel Maher (:jmaher) from comment #17)
> a perf test needs at least 10 iterations to be useful, so if it takes >1
> minute to load the page, I don't think it will work.  Can we reduce the page
> to load in <1minute?

commenting out code should do the trick; do you want to try, or do you want me to play with that?
please give it a go and upload the patches; I am happy to help validate them.  Please ensure they run with |mach talos-test -a a11yr|
Attachment #8740977 - Attachment is obsolete: true
Attached patch part1: infra v3Splinter Review
Attachment #8746557 - Attachment is obsolete: true
(In reply to Joel Maher (:jmaher) from comment #19)
> please give it a go and upload the patches, I am happy to help validate
> them.  Please ensure they run with |mach talos-test -a a11yr|

the v3 version loads in less than one minute for me and shows an alert with a number when I do |mach talos-test -a a11yr|
awesome stuff! :rwood, you said you wanted to try out a new talos test; let's work on this together!
The new test is attempting to access external resources (but talos tests have no external network):

14:02:27     INFO -  PROCESS | 46769 | FATAL ERROR: Non-local network connections are disabled and a connection attempt to mxr.mozilla.org (63.245.215.42) was made.
(In reply to Robert Wood [:rwood] from comment #24)
> The new test is attempting to access external resources (but talos tests
> have no external network):
> 
> 14:02:27     INFO -  PROCESS | 46769 | FATAL ERROR: Non-local network
> connections are disabled and a connection attempt to mxr.mozilla.org
> (63.245.215.42) was made.

apparently I forgot to remove 

<link rel="icon" href="https://mxr.mozilla.org/mxr.png" type="image/png">

from the html file before uploading the patch
Thanks Alexander, when I removed that line the test runs but is eventually blocked by a browser pop-up (that just displays a number, i.e. '54521', with an OK button) and hangs there.
(In reply to Robert Wood [:rwood] from comment #26)
> Thanks Alexander, when I removed that line the test runs but eventually is
> blocked by a browser-pop up (that just displays a number i.e. '54521' with
> an OK button) and hangs there.

I think that was an alert showing, in ms, how long the test takes; that alert should be replaced by a function call that records the test results.
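The replacement being suggested might look like the sketch below. `tpRecordTime` is the talos pageloader reporting hook mentioned later in this bug; the standalone fallback branch is an assumption for running the page outside talos.

```javascript
// Report the elapsed time to talos instead of showing a blocking alert().
// Sketch only: tpRecordTime is provided by the talos pageloader at runtime.
function reportElapsed(startDate, endDate) {
  const elapsed = endDate - startDate; // Date subtraction yields milliseconds
  if (typeof tpRecordTime === "function") {
    tpRecordTime(elapsed); // talos records this as the test measurement
  } else {
    console.log("elapsed ms:", elapsed); // standalone fallback, no dialog
  }
  return elapsed;
}
```

Unlike alert(), this never blocks the harness waiting for a button click, which is what was hanging the automated run.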
I found the alert at pdf.worker.js_index.html:33: alert(new Date() - s);
I commented that out; the dialog no longer appears, but the test still hangs, at least for me (OSX 10.11.4). Hmmm.
Thanks guys, yes I just uncommented the call to 'tpRecordTime' below the alert, and the test worked.

Ran the test suite with: ./mach talos-test -a a11yr

The new pdf.worker.js_index.html test runs 4 cycles of 25, for a total of 100 page loads. I'm assuming this will need to be changed to match the others (?) i.e. dhtml.html and tablemutation.html each run 1 cycle of 25 page loads. As it stands, the a11y suite took 32 minutes to finish for me on OSX.

17:35:16     INFO -  SUITE-END | took 1913s
1 cycle of 25 is great; if we need 1 cycle of fewer, that is OK, just 12+.

next steps would be to run all 3 of these pages together and see if they run on try!
I triggered some jobs:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=5443ea23f35e&filter-searchStr=talos%20other

try is backlogged very badly, so the windows jobs will take 48 hours or so to complete.

Outside of that, it is nice to see this run. We have some timeouts, which isn't a good sign; possibly we are doing too much. It would be nice to investigate this further.
There's a problem running the new test on non-e10s. I can reproduce the timeout locally on OSX 10.11.4 with --disable-e10s, on the very first iteration. Nightly (and talos) hangs with this url displayed: file:///Users/rwood/mozilla-central/testing/talos/talos/tests/a11y/pdf.worker.js.html#1136

Alexander, any idea why this test would hang on non-e10s? Maybe a process is hanging/crashing and e10s can handle it, but non-e10s cannot? Perhaps we need to scale down the size of the .html.
Flags: needinfo?(surkov.alexander)
(In reply to Robert Wood [:rwood] from comment #33)
> There's a problem running the new test on non-e10s. I can reproduce the
> timeout locally on OSX 10.11.4 with --disable-e10s, on the very first
> iteration. Nightly (and talos) hangs with this url displayed:
> file:///Users/rwood/mozilla-central/testing/talos/talos/tests/a11y/pdf.
> worker.js.html#1136
> 
> Alexander, any idea why this test would hang on non-e10s? Maybe a process is
> hanging/crashing and e10s can handle it, but non-e10s cannot? 

No good ideas so far. Could you try running the test with the A11YLOG=events environment variable set ('export A11YLOG=events') to check which events were fired? IIRC the test expects two events: document_loaded_complete and scrolling_start

> Perhaps we
> need to scale down the size of the .html.

right, I commented out part of the file, so we could comment out more code to make the test run faster.
Flags: needinfo?(surkov.alexander)
Attached file a11y_events.txt
A11y events when running pdf.worker.js_index.html

Alexander, 'scrolling start' is missing when the test is run with --disable-e10s
(In reply to Robert Wood [:rwood] from comment #35)
> Created attachment 8757014 [details]
> a11y_events.txt
> 
> A11y events when running pdf.worker.js_index.html
> 
> Alexander, 'scrolling start' is missing when the test is run with
> --disable-e10s

that's weird; did you try commenting out more of the page's content, by any chance? Curious if that helps.
> it is weird, did you try to comment out more content of the page by any chance, curious if that helps.

Yes, I tried commenting out more and more and it didn't make a difference, even when I commented out almost the entire body. I also actually removed all of the commented-out code, to see if loading so many thousands of lines of comments was itself a problem, but that didn't make a difference either. With NON-e10s, the test always hangs waiting for the 'scrolling start' a11y event.
(In reply to Robert Wood [:rwood] from comment #37)
> > it is weird, did you try to comment out more content of the page by any chance, curious if that helps.
> 
> Yes, tried commenting out more and more and it didn't make a difference,
> even when I commented out almost the entire body.

item 1136 has to be in the document; otherwise there's no scroll event, I assume, as there's nothing to scroll to

> Also actually removed all
> of the commented out code to see if it was a problem loading so many
> thousands of lines of comments themselves, but that didn't make a difference
> either. With NON e10s, the test always hangs waiting for the 'scrolling
> start' a11y event.

it'd be good to run the test with that scrolling event promise removed. If the alert shows the same time, then that's probably the way to go.
Thanks. With everything below item 1136 commented out (lines 1228 to 41633), it still hangs with e10s disabled, but works fine with e10s.

I took the scrolling promise out of pdf.worker.js_index.html, and using the original pdf.worker.js.html (with lines 14828 to 41633 commented out), the test runs fine with or without e10s.

Suite a11yr including the new test, on default e10s:
Completed test suite (00:10:05)

Same with --disable-e10s:
Completed test suite (00:02:16)

That's on my OSX 10.11.4, will push to try and see what it looks like.
(In reply to Robert Wood [:rwood] from comment #41)
> https://treeherder.mozilla.org/#/jobs?repo=try&revision=8da198e4d6e4

With the modifications mentioned above (mainly the scrolling promise removed from pdf.worker.js_index.html) the test runs fine.

Linux x64 non-e10s:
07:46:21     INFO -  TEST-OK | a11yr | took 295029ms

Linux x64 e10s:
07:48:09     INFO -  TEST-OK | a11yr | took 404382ms

OSX 10.10 non-e10s:
07:44:47     INFO -  TEST-OK | a11yr | took 167771ms

OSX 10.10 e10s:
07:50:06     INFO -  TEST-OK | a11yr | took 486165ms

A couple of questions:

1) Is it a bug that the 'scrolling start' a11y event never fires during the test when the test is run on non-e10s? Should another bug be filed for that?

2) I can get the test to run on e10s and non-e10s by removing the promise awaiting the 'scrolling start' a11y event, but doesn't that jeopardize the integrity of the test itself? I just want to be 100% sure the new test is legitimate, in that the test measurements are accurate and valuable.
Flags: needinfo?(surkov.alexander)
(In reply to Robert Wood [:rwood] from comment #42)

> 1) Is it a bug that the fact that the 'scrolling start' a11y event never
> fires during the test when the test is run on non-e10s? Should another bug
> be filed for that?

probably not; scrolling_start is fired when the document is focused and has an anchor target (a scrolled-to element). If the document gets focused beforehand, then there's no scrolling_start event as far as I can tell. In the latter case I think a screen reader can request the anchor target separately.

> 2) I can get the test to run on e10s and non-e10s by removing the promise
> awaiting the 'scrolling start' a11y event, but doesn't that jeopardize the
> integrity of the test itself? I just want to be 100% sure the new test is
> legitimate in that the test measurements are accurate and valuable.

technically it should be enough to wait for document_loaded_complete; that's what all screen readers do. I worried, though, that the layout/accessibility engine may do extra processing after the document load. While scrolling_start doesn't guarantee that there's no immediate processing in the queue after docload, it was supposed to add an extra level of confidence here.

It'd be interesting to check manually whether there are "hangs" after the test reports. If the browser continues to hang, that would explain the big difference between e10s and non-e10s builds, and would probably mean we need to listen for some extra notifications to catch that extra processing. If not, then we are likely in good shape, and e10s is just slow.

Either way, the test is a real web page and it takes significant time to load. It's good to have it running, I think, even if it doesn't measure "the full page load", just to see how we evolve.
Flags: needinfo?(surkov.alexander)
Rob, this one might have fallen off the radar; please ping me if you have more questions.
Flags: needinfo?(rwood)
(In reply to alexander :surkov from comment #44)
> Rob, this one might felt out of radar, pls ping me if you have more
> questions.

Hi, sorry for the delay.

"It'd be interesting to check out manually if there are "hangs" after the test reports. If the browser continues to hang then it would explain that big difference between e10s and non e10s builds, and would probably mean, we need to listen some extra notifications to catch that extra processing. If not, then we are likely in a good shape, and e10s is just slow."

When I run the test (with all promises in) manually on non-e10s, it just seems to hang forever and never reports back. I'm doing an updated build now and will try it again.

Question: instead of having 26,000+ lines commented out of 'pdf.worker.js.html', I just removed those lines from the file completely (in the last try run in comment 42 above). Is that OK? I don't see the point in checking in 26,000+ lines that are just commented out.
Flags: needinfo?(rwood)
(In reply to Robert Wood [:rwood] from comment #45)
> (In reply to alexander :surkov from comment #44)
> > Rob, this one might felt out of radar, pls ping me if you have more
> > questions.
> 
> Hi, sorry for the delay.
> 
> "It'd be interesting to check out manually if there are "hangs" after the
> test reports. If the browser continues to hang then it would explain that
> big difference between e10s and non e10s builds, and would probably mean, we
> need to listen some extra notifications to catch that extra processing. If
> not, then we are likely in a good shape, and e10s is just slow."
> 
> When I run the test (with all promises in) manually on non-e10s it just
> seems to hang forever and never report back. I'm doing an updated build now
> and will try it again.

I meant whether there's a spinner showing that the browser is doing something.

> Question - instead of having 26,000+ lines commented out from
> 'pdf.worker.js.html', I just removed those lines from the file completely
> (on the last try run in comment 42 above). Is that ok? I don't see the point
> in checking in 26,000+ lines that are just commented out.

I would keep them, since I hope we won't be that slow forever. I wanted to keep uncommenting those lines as we improve, to end up with the original web page in the tests eventually.
Ok thanks Alexander,

"I meant if there's a spinner showing that the browser is doing something."

I tried it again; no, there is no spinner, it just completely hangs. Talos then eventually tries to kill the firefox process.

"I would keep them, since I hope we won't be that slow forever. I wanted to keep uncommenting those lines, while improving, to get an original web page in the tests in the end."

Ok, I will put the commented lines back and see how long the test takes on e10s and non-e10s (with the scrolling event promise removed). Do you mean that you want to keep changing this test as time goes on? If the test changes, though, then any historic numbers wouldn't be valid anymore... is that going to be a concern?

(Including :jmaher as perhaps he has done this with some talos tests in the past?)
Flags: needinfo?(surkov.alexander)
Flags: needinfo?(jmaher)
(In reply to Robert Wood [:rwood] from comment #47)
> Ok thanks Alexander,
> 
> "I meant if there's a spinner showing that the browser is doing something."
> 
> I tried it again, no there is no spinner, it just completely hangs. Talos
> then eventually tries to kill the firefox process.

so, the browser is still responsive, correct?

> "I would keep them, since I hope we won't be that slow forever. I wanted to
> keep uncommenting those lines, while improving, to get an original web page
> in the tests in the end."
> 
> Ok I will put the commented lines back, and see how long the test takes on
> e10s and non-e10s (with the scrolling event promise removed).

thank you

> Do you mean
> that you want to keep changing this test as time goes on?

correct

> If the test
> changes though, then any historic numbers wouldn't be valid anymore... 

true, but once we are fast enough to run the whole test, there will be no more changes after that.
Flags: needinfo?(surkov.alexander)
"so, the browser is still responsive, correct?"

I tried the test again; yes, you are correct, while the test 'hangs' the browser itself is still responsive.
here are the load times in ms for pdfworker:

linux64:
[1098446.0, 272.0, 264.0, 299.0, 275.0, 283.0, 301.0, 280.0, 298.0, 285.0, 285.0, 281.0, 302.0, 308.0, 299.0, 318.0, 280.0, 287.0, 297.0, 298.0, 303.0, 300.0, 331.0, 300.0, 306.0]

linux64-e10s:
[357.0, 303.0, 307.0, 293.0, 300.0, 331.0, 306.0, 287.0, 265.0, 262.0, 322.0, 304.0, 313.0, 310.0, 317.0, 321.0, 334.0, 323.0, 311.0, 312.0, 316.0, 284.0, 307.0, 327.0, 307.0]

osx 10.10:
[211811.0, 503.0, 493.0, 504.0, 501.0, 522.0, 553.0, 576.0, 506.0, 509.0, 505.0, 520.0, 509.0, 504.0, 520.0, 524.0, 502.0, 513.0, 506.0, 504.0, 513.0, 505.0, 504.0, 505.0, 507.0]

osx-e10s:
[425.0, 475.0, 419.0, 415.0, 484.0, 408.0, 397.0, 426.0, 416.0, 421.0, 443.0, 412.0, 418.0, 415.0, 410.0, 494.0, 423.0, 417.0, 416.0, 426.0, 481.0, 441.0, 408.0, 402.0, 394.0]

win8:
[160554.0, 376.0, 367.0, 383.0, 379.0, 377.0, 388.0, 384.0, 382.0, 384.0, 390.0, 392.0, 391.0, 402.0, 401.0, 392.0, 403.0, 402.0, 402.0, 407.0, 406.0, 413.0, 414.0, 410.0, 414.0]

win8-e10s:
[347.0, 346.0, 344.0, 354.0, 344.0, 336.0, 348.0, 339.0, 334.0, 331.0, 333.0, 336.0, 353.0, 340.0, 336.0, 360.0, 483.0, 337.0, 332.0, 342.0, 348.0, 367.0, 338.0, 342.0, 348.0]

This test isn't measuring the right things: it takes 30+ minutes longer to run in e10s, and there is no data to indicate why, as the reported times do not account for this time.  I suspect we are recording a completed page_load and the browser then takes a few minutes to finish loading (i.e., it is responsive, but not done).

regarding a talos test which updates its pages frequently, there is no problem with that.  We will see changes on the graphs and will be able to annotate those and associate them with bugs, so we can easily determine that they are test-related changes.

In short, as this test currently runs (even with the large majority commented out), the runtime is too long for the first load on non-e10s and for all page loads on e10s.
Flags: needinfo?(jmaher)
(In reply to Joel Maher ( :jmaher ) from comment #51)

> This test isn't measuring the right things- it takes 30+ minutes longer to
> run in e10s and there is no data to indicate why that is the case as the
> reported times do not take into account this time.  I suspect we are
> recording a completed page_load and the browser then takes a few minutes to
> finish loading (i.e. is responsive, but not done).

How do we get real numbers for page load?

Also, it isn't clear to me why we can't reduce the test to run much shorter while still measuring the desired performance problem...
Summary: new test for bug 1123039 → New a11y test for long page performance bug 1123039
it is hard to guarantee the exact page load time; it can mean different things to different people.  Talos currently looks for two events:
1) 'load'
2) 'mozafterpaint'

When both of those events have fired, we mark the page as loaded and record the time.  We can optionally skip waiting for the 'mozafterpaint' event if that would help here.
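That two-event gate might be sketched as follows, using the standard EventTarget API as a stand-in for the browser window. The `waitForPaint` flag is an assumption modeling the "optionally not wait" behavior described above; "MozAfterPaint" is the DOM event name corresponding to 'mozafterpaint'.

```javascript
// Consider the page "loaded" only after both events have fired.
// Sketch: in talos this target would be the content window.
function pageLoaded(target, waitForPaint = true) {
  const once = name =>
    new Promise(resolve => target.addEventListener(name, resolve, { once: true }));
  const waits = [once("load")];
  if (waitForPaint) waits.push(once("MozAfterPaint")); // optional, per above
  return Promise.all(waits);
}
```

Promise.all makes the ordering irrelevant: whichever event fires last completes the gate, and only then is the page-load time recorded.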

I would think we could reduce this page.  I assume there is some construct or pattern in the page that we need to validate with a11y.  Ideally this is something that the dev team can verify works great locally, and either :rwood or I can help push it to try and determine its validity across different platforms/configs.
Blocks: a11yperf
Looks like this wasn't a priority and was dropped; if it becomes a priority in the future (and an updated test is made available, as per comment 51), then please re-open this bug or file a new one. Thanks!
Status: NEW → RESOLVED
Closed: 2 years ago
Resolution: --- → WONTFIX
(In reply to Robert Wood [:rwood] from comment #54)
> Looks like this wasn't a priority and was dropped; if it becomes a priority
> in the future (and an updated test is made available, as per comment 51)
> then please re-open this bug at this time or file a new one. Thanks!

I suspect bug 1321960 caused intermittent page load time spikes. Since that bug was fixed recently, can we give this one more spin, please?
Status: RESOLVED → REOPENED
Flags: needinfo?(jmaher)
Resolution: WONTFIX → ---
thanks for reopening this bug; please point me to a try run if there are issues, and :rwood or I can help look into them.
Flags: needinfo?(jmaher)