Closed Bug 1472237 Opened Last year Closed 9 months ago

Intermittent dom/base/test/test_bug590812.html | noxul domain same as ref

Categories

(Core :: DOM: Core & HTML, defect, P1)

defect

Tracking

()

RESOLVED FIXED
mozilla65
Tracking Status
firefox65 --- fixed

People

(Reporter: intermittent-bug-filer, Assigned: timdream)

Details

(Keywords: intermittent-failure, Whiteboard: [stockwell disable-recommended])

Attachments

(3 files, 1 obsolete file)

Filed by: btara [at] mozilla.com

https://treeherder.mozilla.org/logviewer.html#?job_id=185586643&repo=mozilla-inbound

https://queue.taskcluster.net/v1/task/F2DDaQXXSDSHMmhscnRgyA/runs/0/artifacts/public/logs/live_backing.log

[task 2018-06-29T14:50:25.194Z]     INFO - TEST-START | dom/base/test/test_bug590812.html
[task 2018-06-29T14:50:25.415Z]     INFO - TEST-INFO | started process screentopng
[task 2018-06-29T14:50:26.482Z]     INFO - TEST-INFO | screentopng: exit 0
[task 2018-06-29T14:50:26.483Z]     INFO - TEST-UNEXPECTED-FAIL | dom/base/test/test_bug590812.html | noxul domain same as ref 
[task 2018-06-29T14:50:26.483Z]     INFO -     runTest@dom/base/test/test_bug590812.html:25:3
[task 2018-06-29T14:50:26.483Z]     INFO -     onload@dom/base/test/test_bug590812.html:1:1
[task 2018-06-29T14:50:26.483Z]     INFO - TEST-PASS | dom/base/test/test_bug590812.html | xul supporting domain same as ref 
[task 2018-06-29T14:50:26.483Z]     INFO - GECKO(4612) | MEMORY STAT | vsize 1925MB | residentFast 189MB | heapAllocated 22MB
[task 2018-06-29T14:50:26.483Z]     INFO - TEST-OK | dom/base/test/test_bug590812.html | took 235ms
Hi Brian, I wonder... with XUL going away, I wonder if we should just let this test get disabled? A lot of failures tho, so unsure what to do.
Flags: needinfo?(bgrinstead)
Priority: -- → P3
(In reply to Marcos Caceres [:marcosc] (triage duty) from comment #17)
> Hi Brian, I wonder... with XUL going away, I wonder if we should just let
> this test get disabled? A lot of failures tho, so unsure what to do.

Looking at the test, I don't think this has a lot to do with XUL. It seems to just be testing the XML pretty print feature on two different domains (one which allows remote xul and one which doesn't). If it was only failing on the one that allowed remote xul that'd be one thing (given Bug 1460732), but the description it seems to indicate the failure is on the "non-remote-xul" domain which would be basically like a normal website (where this feature should work).

Though I wonder if this has to do with the iframe loaded over http://noxul.example.com/ not being having finished loading when the snapshots are compared? I notice we are only waiting [onload] on the test's window, and maybe there's a race in which we should be waiting for the subframes to load before doing the checks.
Flags: needinfo?(bgrinstead)
Wow! Thanks for the great analysis - I think you are right. Ok, I'll put up a quick patch that waits for the iframes to explicitly load.
Assignee: nobody → mcaceres
Priority: P3 → P1
Waits for each iframe to load before actually doing comparisons.
Unfortunately, the patch didn't fix it. I ran it 5000 times and the error is still showing up:

Ran 20004 checks (5001 tests, 10002 subtests, 5001 asserts)
Expected results: 19978
Unexpected results: 26

  FAIL noxul domain same as ref
    runTest@dom/base/test/test_bug590812.html:41:3
    async*onload@dom/base/test/test_bug590812.html:1:1
What's the failing window look like? Is it just the non-pretty-printed version of the XML?
(In reply to Brian Grinstead [:bgrins] from comment #23)
> What's the failing window look like? Is it just the non-pretty-printed
> version of the XML?

So, not sure if this is expected, but the fail window doesn't show the results (it's just blank). However, I was able to dump the ref image, and indeed it's not being pretty printed: it just has "Here be sea hags", black text, white background. 

Another unfortunate thing... it's a Heisenbug. When the debugger console is open, it's not reproducible.
Attached image Dumped image
Pushed by mcaceres@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/6a72d0e28508
Intermittent dom/base/test/test_bug590812.html | noxul domain same as ref. r=bgrins
heh, amazing. Made it worse :) Ok, will think of different approach.
Flags: needinfo?(mcaceres)
:marcos are there any updates?
Flags: needinfo?(mcaceres)
No, unfortunately. We might need to move a level deeper to someone that knows the syntax highlighting code.
Flags: needinfo?(mcaceres)
Un-assigning myself, as I've probably reached the limit of what I can try without delving much deeper.  

@overholt, unfortunately the test was created by someone who is no longer at Mozilla. Do you have any suggestions for someone else we might as for help? This is related to XML pretty printing.
Assignee: mcaceres → nobody
Flags: needinfo?(overholt)
Hmm, is this a regression from bug 1437956
Flags: needinfo?(overholt) → needinfo?(timdream)
My hypnosis of the cause would be that the xmlprettyprint CSS didn't load when runTest() runs, given that we are using normal Shadow Root here.

I am going to mark the Shadow Root as UA Widget Shadow Root, so we could profit from bug 1431255 comment XIII.
Assignee: nobody → timdream
Status: NEW → ASSIGNED
Flags: needinfo?(timdream)
(In reply to Tim Guan-tin Chien [:timdream] (please needinfo) from comment #37)
> I am going to mark the Shadow Root as UA Widget Shadow Root, so we could
> profit from bug 1431255 comment XIII.

*commit* Part XIII, https://hg.mozilla.org/mozilla-central/rev/dca187f7c72c
That didn't work...

I don't think we need to wait for onload event of the iframes because the parent frame should have its onload runs before that, but I might be wrong and it doesn't hurt trying.

https://treeherder.mozilla.org/#/jobs?repo=try&revision=8542279565a102c29d9920c29dae77a93a82b85b
Still failing unfortunately 😿
Hum, I've just reproduced the finding in comment 24. I should have the read the bug more carefully.

So if that's the case, either something had triggered the XML pretty print to bail out, or it was never applied for some reason. A few printf() in nsXMLPrettyPrint.cpp should allow us to dive deeper. I will prepare the patch shortly.
The result from the try push shows that nsXMLPrettyPrinter::PrettyPrint() is never called, at least not to line 130.

This is strange because I didn't change things a lot before line 130

https://searchfox.org/mozilla-central/diff/b9a40e754b2d29dbbe83ec66a0963803876e4646/dom/xml/nsXMLPrettyPrinter.cpp#132

This is going to take a lot more printf() ...
https://treeherder.mozilla.org/#/jobs?repo=try&revision=4811ecc33c980d8bd6c316ff90a3d9ea6776eef8

My recollection is that XML document rendering always reaches nsXMLContentSink, so these printf should allow us to find out why it had given up pretty printing.
The alternative fix would be simply drop the lines that tap into nsIPresShell, since we are no longer using XBL binding for pretty printing and the related code was just removed in bug 1503019, there is no reason to not pretty print if there isn't a presshell.
(In reply to Tim Guan-tin Chien [:timdream] (please needinfo) from comment #52)

https://treeherder.mozilla.org/#/jobs?repo=try&revision=3e70dbe529a485e60ae2fab39b40f0947ad5b0ab

Yet another good clean-up resulting from XBL removal :)
ML pretty print, as it was implemented in bug 64945, skips itself when the loaded document is not visible. There is no explanation why it does that, other than the fact that it is backed by an XBL binding, and XBL bindings always bound to DOM with layout frames. A later change in bug 1400618 made the existence of pressshell necessary before binding the XBL binding.

With bug 1437956 and bug 1503019, XML pretty print is now backed by Shadow DOM. That leaves no reason for keeping this behavior.

The test file test_bug590812.html also revealed that XML pretty print might incorrectly opt-out itself because of the said behavior. We don't know if this something only happens in the test or it is something that may happen in the wild. Nonetheless, it's a better idea to simplify the behavior here.
I've made my best attempt to justify the cleanup. Please let me know if the above commit message is incorrect or unclear.

https://treeherder.mozilla.org/#/jobs?repo=try&revision=02b7fbd0f4ad9c07aa4b3da2901134c5ff6c260f

:mats did the change in bug 1400618 related to presshell so I guess he should review this. Let me know if I should flag someone else (though I would like to avoid overloading :bz for this rather localized change).
Attachment #9024535 - Attachment is obsolete: true
Pushed by tchien@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/0e6465e2cdf4
Always do XML Pretty print even if the frame is hidden r=mats
https://hg.mozilla.org/mozilla-central/rev/0e6465e2cdf4
Status: ASSIGNED → RESOLVED
Closed: 9 months ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla65
Component: DOM → DOM: Core & HTML
You need to log in before you can comment on or make changes to this bug.