nsIconChannel on Windows can flush layout at a bad time.

RESOLVED FIXED in Firefox 69

Status

()

defect
P1
normal
RESOLVED FIXED
4 months ago
15 days ago

People

(Reporter: emilio, Assigned: tnikkel)

Tracking

(Blocks 2 bugs, {regression})

unspecified
mozilla69
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(firefox-esr60 wontfix, firefox66 wontfix, firefox67 wontfix, firefox68 wontfix, firefox69 fixed)

Details

(crash signature)

Attachments

(8 attachments, 2 obsolete attachments)

47 bytes, text/x-phabricator-request
Details | Review
47 bytes, text/x-phabricator-request
Details | Review
47 bytes, text/x-phabricator-request
Details | Review
47 bytes, text/x-phabricator-request
Details | Review
47 bytes, text/x-phabricator-request
Details | Review
47 bytes, text/x-phabricator-request
Details | Review
47 bytes, text/x-phabricator-request
Details | Review
47 bytes, text/x-phabricator-request
Details | Review

This is the nsIconChannel issue in bug 1438939. I filed a different bug for the other issue I'm aware of, which is bug 1530188.

Updated

4 months ago
Priority: -- → P3
Reporter

Updated

3 months ago
Duplicate of this bug: 1541427
Crash Signature: [@ mozilla::RestyleManager::ContentStateChanged]
Assignee: nobody → tnikkel
Reporter

Updated

2 months ago
Duplicate of this bug: 1544818

Updated

2 months ago
Crash Signature: [@ mozilla::RestyleManager::ContentStateChanged] → [@ mozilla::RestyleManager::ContentStateChanged] [@ mozilla::PresShell::ContentStateChanged ]
Keywords: regression

Changing the priority to p1 as the bug is tracked by a release manager for the current beta.
See How Do You Triage for more information

Priority: P3 → P1
Reporter

Comment 4

2 months ago

FWIW I can probably paper over the crash by downgrading the assertion to a diagnostic assert... Do we is accessible from content?

It doesn't seem to be... data:text/html,<img width=16 height=16 src="moz-icon://foo.txt?size=16">

If it's not, and if this is the only offender of the assertion, probably downgrading it at least for now makes sense... WDYT?

Tim, are you working on this bug? Thanks

Flags: needinfo?(tnikkel)
Assignee

Comment 6

2 months ago

Yeah, but if the expectation is an upliftable patch to beta for this cycle I'm not sure that is a reasonable goal.

Flags: needinfo?(tnikkel)
Reporter

Updated

2 months ago
Depends on: 1545842
Assignee

Comment 7

2 months ago

There are two reasons for this. One is performance, it can be slow. The second is that is can spin the event loop which can re-enter into things like layout.

This patch uses the image decoder thread pool for that. But this is unsuitable (reason in next patch), the next patch changes it to use the img io thread used for network to pass data to imglib off main thread.

Assignee

Comment 8

2 months ago

This mutex only protects the iothread, not the rest of the decodepool. There is a different monitor for that in DecodePoolImpl.

Depends on D28419

Assignee

Comment 9

2 months ago

The unifed tweak I make in the next patch exposes this.

Depends on D28420

Assignee

Comment 10

2 months ago

We need to call CoInitialize, so we need to include the objbase.h header. That defines some generic boolean stuff, which conflicts with something that libjpeg defines. I tried working around it, but I gave up, this solution seems fine.

Depends on D28421

Assignee

Comment 11

2 months ago

Windows doesn't like more than one thread (at the same time?) calling SHGetFileInfoW. So we use the same thread to avoid this.

We also need to call CoInitialize on the thread we do this.

Depends on D28422

Assignee

Comment 12

2 months ago

The system call we need to use (SHGetFileInfo) can't be called from more than one thread at a time. So without adding a bunch of extra unnecessary complexity we can't support both Open and AsyncOpen. Further, SHGetFileInfo can spin the event loop (and we open icon channels in the middle of layout) so it's not even safe to call it synchronously.

nsIconChannel::Open isn't used on a full try run, I don't know enough about networking to know if we can entirely rule out that it is used or not.

Depends on D28423

Assignee

Comment 13

2 months ago

With nsIconChannel::Open no longer implemented we never pass false for aNonBlocking. We can remove that code and assert that SHFileGetInfo is called off main thread. Note that I would like to assert that we always call it on the same thread, but there isn't a super simple way to assert that.

Depends on D28424

Assignee

Comment 14

2 months ago

[Tracking Requested - why for this release]:
I don't think we need to track this for 67 anymore because bug 1545842 downgraded the release assert to diagnostic assert.

It's also late in the beta cycle to uplift significant work, so marking as wontfix for 67 since we have a short term solution in place with bug 1545842. Thanks

Assignee

Comment 16

2 months ago

Would you be able to shed any light on who uses nsIChannel::Open and if any of those users would use it on an icon channel? re https://phabricator.services.mozilla.com/D28424

Flags: needinfo?(jkt)
Assignee

Comment 17

2 months ago

As stated in other patches in this bug, the system call has to happen off main thread, and it has to always be on the same thread. So we dispatch the task to that thread and synchronously wait on the main thread for it to finish.

Depends on D28423

Sorry I don't really know much more than you do, I was able to assess the code paths and see that we weren't using the legacy format in my change in Bug 1520868 however I don't see anything different to your assement here:
https://phabricator.services.mozilla.com/D28424#860308

Flags: needinfo?(jkt)
Attachment #9059966 - Attachment is obsolete: true
Attachment #9059965 - Attachment is obsolete: true
Assignee

Comment 19

Last month

Looks like this is left over from a partial attempt at making GetStockIcon async like GetHIconFromFile.

But I don't think it needs to be async, the only system call SHGetStockIconInfo doesn't have anything about background thread like SHGetFileInfo.

Depends on D29626

Assignee

Comment 20

Last month

We frequently get more asserts in this test after we increased the async-ness of AsyncOpen.

Before the patches of this bug, in this test we would see 15 calls to nsIconChannel::AsyncOpen, and then a flurry of reflow asserts ("ASSERTION: bad inline size").

After the patches for this bug, sometimes we get all 15 AsyncOpen calls to complete (FinishAsyncOpen) before the asserts start. In that case we get an expected number of asserts. But frequently we get a case where some FinishAsyncOpen calls don't happen until after all of the asserts. In this case we get more asserts than we are expecting.

So it seems the lack of a complete icon image is causing extra reflow churn (it's the same asserts, just more).

Depends on D31013

Comment 21

Last month
Pushed by tnikkel@gmail.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/71bf2b9ca6ea
Call SHGetFileInfo off the main thread in nsIconChannel on Windows. r=tnikkel
https://hg.mozilla.org/integration/mozilla-inbound/rev/bb340bd557d4
Change description of DecodePool::mMutex to be more accurate. r=aosmond
https://hg.mozilla.org/integration/mozilla-inbound/rev/875d43d35a71
Fix unified build bustage in image/Downscaler.cpp. r=aosmond
https://hg.mozilla.org/integration/mozilla-inbound/rev/8010faead0fd
De-unify image/DecodePool.cpp on Windows. r=aosmond
https://hg.mozilla.org/integration/mozilla-inbound/rev/2d9c02e85710
Use the image io thread to call SHGetFileInfo. r=aosmond
https://hg.mozilla.org/integration/mozilla-inbound/rev/642316b68e6a
Allow nsIconChannel::Open to work on Windows. r=aosmond
https://hg.mozilla.org/integration/mozilla-inbound/rev/1d13cc5dbb8a
Cleanup GetStockIcon from first patch. r=aosmond
https://hg.mozilla.org/integration/mozilla-inbound/rev/17018058233c
Adjust assert expectations on  toolkit/content/tests/chrome/test_bug437844.xul. r=aosmond

\p/

That was supposed to be \o/ :-)

Thanks for fixing this Timothy

Assignee

Updated

Last month
Blocks: 1553061

Taking care of the assert in bug 1545842 for 68.

No longer blocks: 1553061
Regressions: 1553061
Regressions: 1556360
You need to log in before you can comment on or make changes to this bug.