Closed Bug 1554777 Opened 5 years ago Closed 5 years ago

Categories

(Core :: Layout: Images, Video, and HTML Frames, defect, P5)

defect

Tracking

()

RESOLVED FIXED
mozilla70
Tracking Status
firefox-esr60 --- unaffected
firefox-esr68 --- unaffected
firefox68 --- unaffected
firefox69 --- disabled
firefox70 --- fixed

People

(Reporter: intermittent-bug-filer, Assigned: emilio)

References

(Depends on 1 open bug)

Details

(Keywords: intermittent-failure, regression, Whiteboard: [retriggered][stockwell disabled])

Attachments

(5 files)

Whiteboard: [retriggered]

It looks like an intermittent that started last month when we turned on these tests for GeckoView. It needs investigation from the appropriate team (Layout?). Feel free to update the metadata if it becomes a problem, but for now it doesn't seem that frequent and doesn't warrant disabling in my opinion.

Flags: needinfo?(mjzffr)

This test should show a green box. If it fails, it shows a red box. It depends on an image resource.

I can reproduce the intermittent failure outside of the wpt harness by running:

 adb shell am start -W -n org.mozilla.geckoview.test/org.mozilla.geckoview.test.TestRunnerActivity -a android.intent.action.MAIN -es args "http://web-platform.test:8000/css/css-backgrounds/border-image-width-006.xht"

Every 5-10 runs I see that the green box doesn't appear at all until I click/touch the window. When it fails, the page load takes a bit longer and emulator screen flickers black. The image in the test always loads fine, so that's not the issue. Looking at logcat, I think it might be related to Android resource management and/or activity focus? Maybe we lose focus?

In the attached log, you can see that when the green box does not appear (FAIL)
there messages below appear before we get a pageshow event:

  • WindowManager: Destroying surface...
  • dispatch GeckoView:SetActive, data={"active":false}
  • Choreographer: Skipped 66 frames! The application may be doing too much work on its main thread

In a failing run, the last logcat messages shown once we see a red square are:

06-28 21:19:37.770  2938  2953 D GeckoViewContent[C]: handleEvent: pageshow
06-28 21:19:38.132  2877  2892 W GeckoEventDispatcher: No listener for GeckoView:StateUpdated

and then when I click/touch the window, green square appears and I get:

06-28 21:20:45.475  2877  2922 D EGL_emulation: eglMakeCurrent: 0xc6d9fa60: ver 2 0 (tinfo 0xd1fcbdf0)

I'm not sure why this test in particular is affected. Presumably this kind of situation occurs all the time on Android...

Whiteboard: [retriggered][stockwell disable-recommended] → [retriggered][stockwell needswork]
Pushed by csabou@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/d34e20b62877
Disable test on Android because of flakiness and frequent failures. a=testonly

I've disabled this because of it's flakiness. On the push that Andreea updated the expectations it started failing as TEST-UNEXPECTED-PASS https://treeherder.mozilla.org/#/jobs?repo=autoland&selectedJob=254152507&resultStatus=testfailed%2Cbusted%2Cexception&revision=48d35c2a2dd5cd8301cee595c0238366d67a34c4&searchStr=wr%2Candroid

Keywords: leave-open
Whiteboard: [retriggered][stockwell needswork] → [retriggered][stockwell disabled]

As Maja observed in bug 1567639, this test (viewable at http://web-platform-tests.live/css/css-backgrounds/border-image-width-006.xht ) often fails on first load even in Firefox Nightly on Desktop, until you click or interact with the page.

I suspect this may have to do with the our image decoding -- not sure whether it's the off-main-thread-decoding or some other aspect of lazy image decoding, but at least if I toggle the pref image.decode-immediately.enabled to true then I can't reproduce anymore (on Desktop).

emilio, heycam mentioned that you'd worked on a similar issue where image async-decode wasn't working properly with onload-unblocking... Did you run into anything like this where images weren't getting painted/invalidated when decoded?

Component: Layout → Layout: Images, Video, and HTML Frames
Flags: needinfo?(emilio)

FWIW here's a profile of me loading the web-hosted testcase twice, in desktop Nightly. The bug does not reproduce the first time, but it does reproduce the second time.
Profile: https://perfht.ml/2TaRbEQ

On the first load, the image shows up on its own a little after 5 seconds. Then I shift+reloaded, and the image never showed up on its own (i.e. the bug reproduced!) though it did show up when I clicked the page at around 15 seconds.

This doesn't reproduce on mac for me, but is really easy to reproduce on linux.

I'm guessing the problem has something to do with calling StartDecoding during the paint here https://searchfox.org/mozilla-central/rev/e0b0c38ee83f99d3cf868bad525ace4a395039f1/layout/painting/nsImageRenderer.cpp#110 and then painting the image, and then delivering the notifications for the decode being finishing after the paint. My guess would be that when the notifications arrive the receiver probably thinks it already painted and so ignores them? But we need to invalidate so the pixels make it to the screen and not just layer contents.

This only repros with WebRender disabled.

I can try to take a look.

So the issue here is that this test-case has a zero-sized border, but still with
a border-image (which we should paint).

So the first time we get here, the image is not loaded, and thus we don't get
here:

https://searchfox.org/mozilla-central/rev/e0b0c38ee83f99d3cf868bad525ace4a395039f1/layout/painting/nsDisplayList.h#4254

Which means that we end up with zero bounds and thus we don't even get to the
border painting code.

However, when the image loads, we get to MarkNeedsDisplayItemRebuild(), but that
doesn't schedule any paint, only marks the frames as modified in order for
display items to be rebuilt eventually.

Thus eventually, where we force a repaint by other means, we paint correctly.

This only works in more general cases because we get to the nsImageRenderer code
which does vastly different stuff.

InvalidateFrame() seems to do the right thing and schedule a paint, so use it.
It's not clear to me which one of nsIFrame::InvalidateFrame() or
nsIFrame::SchedulePaint() we should use...

If I understand correctly, InvalidateFrame() will only do something iff there
are display items for the frame, so that should make the IsVisible() check
redundant.

Note however that I think there's still a race condition, if we get to paint in
between the SIZE_AVAILABLE notification (the one where we actually invalidate
the display items), and the LOAD notification (the one the border-image code
checks).

I'll send a separate patch for that, I think SIZE_AVAILABLE should be a
strong-enough hint and that allows us to remove nsStyleImage::IsLoaded()...

The RequestReflow stuff also looks highly suspicious... shape-outside
sync-decodes, and it seems we could end up invalidating reflow from the reflow
code...

The current code needs to handle incomplete draws already due to async decoding,
and this prevents the race condition where we paint between the size-available
and load notifications, as the CSS image loader only invalidates for the first.

Once we try to decode the image at least once, then we'd repaint properly from
ImageLoader::OnFrameComplete.

Flags: needinfo?(emilio)
Assignee: nobody → emilio

I think they should pass after the last few patches.

Thanks, Emilio!

Attachment #9083664 - Attachment description: Bug 1554777 - Call InvalidateFrame() rather than MarkNeedsDisplayItemRebuild() when we get the size available notification for a style image. → Bug 1554777 - Call SchedulePaint() rather than MarkNeedsDisplayItemRebuild() when we get the size available notification for a style image.
Attachment #9083664 - Attachment description: Bug 1554777 - Call SchedulePaint() rather than MarkNeedsDisplayItemRebuild() when we get the size available notification for a style image. → Bug 1554777 - Call SchedulePaint() rather than MarkNeedsDisplayItemRebuild() when we get the size available notification for a style image. r=mattwoodrow,tnikkel
Pushed by ealvarez@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/da9ae31b5000
Call SchedulePaint() rather than MarkNeedsDisplayItemRebuild() when we get the size available notification for a style image. r=mattwoodrow,tnikkel
https://hg.mozilla.org/integration/autoland/rev/87ed773c0a80
Use SIZE_AVAILABLE rather than loaded to figure out whether to try to draw a border image. r=tnikkel
Status: NEW → RESOLVED
Closed: 5 years ago
Keywords: leave-open
Resolution: --- → FIXED
Flags: in-testsuite+
Target Milestone: --- → mozilla70
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: