Closed Bug 1331552 (stylo-alexa) Opened 9 years ago Closed 8 years ago

[Stylo] Stylo vs Gecko diffing on real sites

Tracking

()

Status:

RESOLVED FIXED

People

(Reporter: shinglyu, Assigned: shinglyu)

References

Details

Attachments

(3 files, 2 obsolete files)

stylo-diff.zip 9 years ago Shing Lyu [:shinglyu] 8.45 MB, application/zip		Details
List of failing and crashing sites 8 years ago Shing Lyu [:shinglyu] 40.27 KB, text/plain		Details
Failures categorization list 8 years ago Shing Lyu [:shinglyu] 6.12 KB, text/markdown		Details
Failures categorization list 20170612 8 years ago Shing Lyu [:shinglyu] 3.08 KB, text/markdown		Details
Failures categorization list 20170612 8 years ago Shing Lyu [:shinglyu] 2.49 KB, text/markdown		Details

Shing Lyu [:shinglyu]

Assignee

Description

•

9 years ago

I use a script to download alexa top 25 sites wget -E -H -k -p http://google.com and run them in reftest using ./mach reftest --log-html stylo-reftest.html --disable-e10s alexa/reftest.list --setpref=reftest.compareStyloToGecko=true You can see the results in the attached html log. I removed duplicated google sites for different country, also I removed some bad test cases: amazon.com: Fail to download twitter.com: Loads forever instagram.com: Loads forever

Shing Lyu [:shinglyu]

Assignee

Comment 1

•

9 years ago

Bobby you might be interested in this.

Flags: needinfo?(bobbyholley)

Chris Peterson [:cpeterson]

Updated

•

9 years ago

Blocks: stylo-nightly
No longer blocks: stylo

Bobby Holley (:bholley)

Comment 2

•

9 years ago

(In reply to Shing Lyu [:shinglyu] from comment #0) > I use a script to download alexa top 25 sites > > wget -E -H -k -p http://google.com > > and run them in reftest using > > ./mach reftest --log-html stylo-reftest.html --disable-e10s > alexa/reftest.list --setpref=reftest.compareStyloToGecko=true > > You can see the results in the attached html log. (I think the log is missing). In general this is awesome. Do any of the sites pass the test? Are the results reproducible? It's probably not worth spending too much time debugging these now, since we have lots of reftests to fix first (which are presumably easier to debug). But if this technique gives reproducible results, it could be hugely valuable in a few months to validate that stylo renders real-world websites correctly.

Flags: needinfo?(bobbyholley)

Shing Lyu [:shinglyu]

Assignee

Comment 3

•

9 years ago

Attached file stylo-diff.zip — Details

(The file was too big so it failed to upload in the first time) Only 2 out of 18 tests have passed.

Shing Lyu [:shinglyu]

Assignee

Comment 4

•

9 years ago

You can see the screenshot in the attachment, it might give you a general sense of how good/bad Stylo is doing.

Bobby Holley (:bholley)

Comment 5

•

9 years ago

Nice! I think this will be really helpful when we get closer to full rendering fidelity. Can you get this added to our test plan? It looks like you probably want to spoof the UA to Gecko's when you crawl. The facebook screenshot seems to be giving us the "unsupported browser" page.

Chris Peterson [:cpeterson]

Comment 6

•

9 years ago

(In reply to Bobby Holley (:bholley) (busy with Stylo) from comment #5) > Nice! I think this will be really helpful when we get closer to full > rendering fidelity. Can you get this added to our test plan? I added these tests (and this bug) to our test plan and release criteria.

Shing Lyu [:shinglyu]

Assignee

Comment 7

•

9 years ago

I put the script I use in here: https://github.com/shinglyu/stylo-reftest-scripts/tree/master/real_page_diff

Shing Lyu [:shinglyu]

Assignee

Comment 8

•

9 years ago

I create a test set using the list from [1], sadly for the 24 test cases we only passed one (google map). I'm attaching the reftest html log and the plain text log which you can view with reftest-analyzer. I'll also update the tips for how to prepare the test case in my repo [2]. [1] https://docs.google.com/document/d/1HkkEzufEFbyJl5M7DGVVQH9rtvZihcOs9_hVkiPVO8U/edit# [2] https://github.com/shinglyu/stylo-reftest-scripts/

Flags: needinfo?(cpeterson)

Chris Peterson [:cpeterson]

Comment 9

•

9 years ago

(In reply to Shing Lyu [:shinglyu] from comment #8) > I create a test set using the list from [1], sadly for the 24 test cases we > only passed one (google map). One is better than none. <:) We still have thousands of other test failures, so this isn't too surprising. I'm guessing Google Maps does use much CSS since most of the page is a canvas map. Your tests here will be a big step for improving Stylo's real web compatibility beyond what our artificial tests measure.

Flags: needinfo?(cpeterson)

Shing Lyu [:shinglyu]

Assignee

Comment 10

•

9 years ago

here are the files: HTML log (screenshots) https://drive.google.com/open?id=0B7PInO5B2dItcGRxSllVMG5uekk Plain Text log (for reftest-analyzer) https://drive.google.com/open?id=0B7PInO5B2dItcEstMFZTYWd4M2s Test suite (web pages) https://drive.google.com/open?id=0B7PInO5B2dItYXV5alJEeTExQTQ Feel free to share the test suite within the organization, but keep in mind that we can't publish it due to copyright issue.

Shing Lyu [:shinglyu]

Assignee

Updated

•

8 years ago

Depends on: 1341997

Shing Lyu [:shinglyu]

Assignee

Comment 11

•

8 years ago

New results based on Chris' ~200 downloaded pages: HTML log with screenshot: https://drive.google.com/open?id=0B7PInO5B2dItLTdGa2JZLUVybVk Text log for reftest analyzer: https://drive.google.com/open?id=0B7PInO5B2dItblRJU3ZRY29YRUE Sadly we didn't pass any. Looks like we have some problem with fonts and text centering.

Shing Lyu [:shinglyu]

Assignee

Comment 12

•

8 years ago

Because we want to avoid accessing external resources (reftest forbids it), we change all "http(s)://" string to "bogus://". But Chris points out that this causes some itermittents: > I see some pages show (expected) error messages about the "bogus://" URL scheme, > but others just show a blank frame. Some of the error messages show up in Stylo > images and others in Gecko images, so this looks like a timing issue, not a Stylo > bug. Should the reftest loader wait longer before taking the screenshot? Or > should I try to strip out the external resource URLs entirely, replacing them > with an empty string "" instead of changing "https?://" to "bogus://"? Scripting > an empty string replacement would be trickier, but should give us consistent > blank frames. I believe changing "https?://" to "" may not fix the problem. For example if "https://foo.com/bar.html" becomes "foo.com/bar.html", it will be treated as a relative path, and it might still end up being a 404 page or other error. And to put it more generally, I think we should focus on quantity rather then quality for this test. We already have reftest and mochitest for more fine-grain test. If we go down the path of refining the test case, we'll spend a lot of time on individual test cases but loose the whole picture. I believe the developer who try to fix these errors will use the real site instead of the saved version.

Chris Peterson [:cpeterson]

Comment 13

•

8 years ago

(In reply to Shing Lyu [:shinglyu] from comment #12) > I believe changing "https?://" to "" may not fix the problem. For example if > "https://foo.com/bar.html" becomes "foo.com/bar.html", it will be treated as > a relative path, and it might still end up being a 404 page or other error. I meant replacing the entire URL "https://foo.com/bar.html" with "", not just removing the URL scheme like "foo.com/bar.html". > And to put it more generally, I think we should focus on quantity rather > then quality for this test. We already have reftest and mochitest for more > fine-grain test. If we go down the path of refining the test case, we'll > spend a lot of time on individual test cases but loose the whole picture. I > believe the developer who try to fix these errors will use the real site > instead of the saved version. Sounds good. I will continue saving more sites and then post them to Google Drive.

Shing Lyu [:shinglyu]

Assignee

Comment 14

•

8 years ago

(In reply to Chris Peterson [:cpeterson] from comment #13) > I meant replacing the entire URL "https://foo.com/bar.html" with "", not > just removing the URL scheme like "foo.com/bar.html". That might work for iframes, but probably not for those in JavaScript. Also you'll have to write complicated regular expression to match those hidden in some ugly minified JS code. Let's not worry too much about it. The more valuable information is the screenshots, not the pass rate number.

Chris Peterson [:cpeterson]

Updated

•

8 years ago

Priority: -- → P2

Shing Lyu [:shinglyu]

Assignee

Comment 15

•

8 years ago

April 24th run: https://treeherder.mozilla.org/#/jobs?repo=try&revision=125e0d61d3353476bdde95cc433707c03e8f4349

Shing Lyu [:shinglyu]

Assignee

Comment 16

•

8 years ago

FAIL 165 PASS 0 TIMEOUT 0 CRASH 52 TOTAL 169 Note: Some crashes are also counted as UNEXPECTED-FAILS, so the FAIL + CRASH > TOTAL

Shing Lyu [:shinglyu]

Assignee

Comment 17

•

8 years ago

Attached file List of failing and crashing sites — Details

Hiroyuki Ikezoe (:hiro)

Comment 18

•

8 years ago

As far as I can tell scrollbars now work fine, and some of failures in comment 15 will be passed.

Bobby Holley (:bholley)

Comment 19

•

8 years ago

The other major fixes we're waiting for are bug 1349651 and bug 1340484. When those land, I'd expect a number of these to start passing.

Chris Peterson [:cpeterson]

Comment 20

•

8 years ago

Assigning to Shing because he has been working on the real website diffing.

Assignee: nobody → shing.lyu

Shing Lyu [:shinglyu]

Assignee

Comment 21

•

8 years ago

Attached file Failures categorization list (obsolete) — Details

I analyzed recent failures and categorized them, I'll file bugs according to this and block this meta bug.

Shing Lyu [:shinglyu]

Assignee

Updated

•

8 years ago

Alias: stylo-alexa

Shing Lyu [:shinglyu]

Assignee

Updated

•

8 years ago

Depends on: 1366977

Shing Lyu [:shinglyu]

Assignee

Comment 22

•

8 years ago

Attached file Failures categorization list 20170612 (obsolete) — Details

43 tests fixed since last time I checked. Here are the new list of failures.

Attachment #8870706 - Attachment is obsolete: true

Shing Lyu [:shinglyu]

Assignee

Comment 23

•

8 years ago

Attached file Failures categorization list 20170612 — Details

(Updating the previous result after manual inspection) These issues are GONE in the recent test: * Scrollbar missing * SVG icon not loading * Placeholder style not applied * Link color style not applied Top remaining issues are: * Color difference abound blurry corners * Color difference in vertical strip pattern on images * Color gradient has the wrong color

Attachment #8876585 - Attachment is obsolete: true

Bobby Holley (:bholley)

Comment 24

•

8 years ago

(In reply to Shing Lyu [:shinglyu] from comment #23) > Top remaining issues are: > * Color difference abound blurry corners > * Color difference in vertical strip pattern on images > * Color gradient has the wrong color Are these "real" rendering bugs, in the sense that the rendering difference is perceptible, or are these differences only visible via the direct pixel-per-pixel comparison we're doing? What happens if we mark those reftests as 'fuzzy'? Manish was looking into some of these and had the opinion that there were certain rounding/blending cases where it was not worthwhile to try to make Gecko/Stylo match exactly. I don't know enough about the details to have an opinion on that, but I _do_ have the opinion that fixing such issues should be low-priority for us right now relative to other things. So if we can mark those tests to run with the reftest fuzzy-comparison logic, I would prefer to do that for now, and focus on the visually-observable rendering differences. We can file a P3 bug to make a decision about this later.

Flags: needinfo?(shing.lyu)

Shing Lyu [:shinglyu]

Assignee

Updated

•

8 years ago

Depends on: 1372821

Chris Peterson [:cpeterson]

Comment 25

•

8 years ago

We have completed our differential testing of Stylo vs Gecko on a few hundred top sites.

Status: NEW → RESOLVED

Closed: 8 years ago

Flags: needinfo?(shing.lyu)

Resolution: --- → FIXED

You need to log in before you can comment on or make changes to this bug.