Open Bug 1677170 Opened 5 years ago Updated 10 months ago

[meta] Asynchronous, out-of-process Win32 file picker

Categories

(Core :: Widget: Win32, task, P3)

Product: Core
Component: Widget: Win32
Platform: Unspecified, Windows
Type: task
Priority: P3
Severity: S3

Tracking

Status: NEW

People

(Reporter: bugzilla, Assigned: rkraesig)

References

(Depends on 1 open bug, Blocks 3 open bugs)

Details

(Keywords: meta, Whiteboard: [bhr:CFileOpenSave::Show][win:stability])

Attachments

(6 files)

fuzzy query, 4 years ago, Stephen A Pohl [:spohl], 658 bytes, patch
preffed on for nightly, 4 years ago, Stephen A Pohl [:spohl], 943 bytes, patch
installer, 4 years ago, Stephen A Pohl [:spohl], 8.86 KB, patch
OOP and async nsFilePicker, 4 years ago, Stephen A Pohl [:spohl], 26.26 KB, patch
gkwinoop, 4 years ago, Stephen A Pohl [:spohl], 19.77 KB, patch
gkwinoop_ps, 4 years ago, Stephen A Pohl [:spohl], 5.65 KB, patch

(No longer employed by Mozilla) Aaron Klotz

Reporter

Description

•

5 years ago

I've got a plan for moving the COM bits of nsFilePicker out-of-process. The benefits are twofold:

1. It serves as a DLL injection mitigation, as pickers load shell extensions.
2. It gives us the opportunity to show pickers asynchronously, which keeps our parent process's main thread (mostly) active.

We can pull this off because Windows provides proxy/stubs for the IFileOpenDialog, IFileSaveDialog, and IFileDialog interfaces. We can create our own implementation of these interfaces that lives out of process, and forwards calls to the "real" object implementation.

Note that for the async stuff we actually need to generate our own async interface, but I think that it's worth it for the responsiveness gains we get on the main thread.
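
To make the forwarding idea concrete, here is a minimal sketch in Python rather than COM. It only shows the shape of "an object that implements the dialog interface and forwards each call to the real implementation"; it does not model the process boundary or marshaling. Every name in it (FileDialog, RealDialog, ForwardingDialog) is hypothetical and is not taken from the patches attached to this bug.

```python
from typing import Optional, Protocol


class FileDialog(Protocol):
    """Hypothetical stand-in for the IFileDialog family of interfaces."""

    def set_title(self, title: str) -> None: ...

    def show(self) -> Optional[str]: ...


class RealDialog:
    """Stand-in for the system-provided ("real") dialog object."""

    def __init__(self, canned_result: Optional[str]) -> None:
        self.title = ""
        self._canned_result = canned_result

    def set_title(self, title: str) -> None:
        self.title = title

    def show(self) -> Optional[str]:
        # A real dialog would block here until the user picks a file or cancels.
        return self._canned_result


class ForwardingDialog:
    """Implements the same interface and forwards every call to a wrapped object."""

    def __init__(self, inner: FileDialog) -> None:
        self._inner = inner
        self.calls = []  # names of forwarded calls, kept only for illustration

    def set_title(self, title: str) -> None:
        self.calls.append("set_title")
        self._inner.set_title(title)

    def show(self) -> Optional[str]:
        self.calls.append("show")
        return self._inner.show()
```

In the plan described above, the forwarding object would live in a separate process and callers would reach it through the Windows-provided proxy/stubs; the sketch collapses all of that into one process.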

Toshihito Kikuchi [:toshi]

Updated

•

5 years ago

See Also: → 1468250

Gabriele Svelto [:gsvelto]

Comment 1

•

5 years ago

Are you really doing this now? Remind me when we meet at the next all hands because this needs a serious toast and the tab's on me.

(No longer employed by Mozilla) Aaron Klotz

Reporter

Comment 2

•

5 years ago

I've landed the prerequisites :-)

(No longer employed by Mozilla) Aaron Klotz

Reporter

Updated

•

5 years ago

Blocks: 1675831

(No longer employed by Mozilla) Aaron Klotz

Reporter

Comment 3

•

5 years ago

Hey Jamie, would you mind trying out this try build to see whether or not I broke file picker a11y? Please note that you will need to run its installer as it registers some components.

(No longer employed by Mozilla) Aaron Klotz

Reporter

Comment 4

•

5 years ago

Flags: needinfo?(jteh)

James Teh [:Jamie]

Comment 5

•

5 years ago

I tried quite hard, but I couldn't break it. Nicely done.

I was mostly worried about focus not being restored correctly when you dismiss a file picker, but it does get restored correctly.

Flags: needinfo?(jteh)

(No longer employed by Mozilla) Aaron Klotz

Reporter

Comment 6

•

5 years ago

Here are some profiles with this patchset:

Preffed off (i.e., still runs synchronously)
Preffed on

(No longer employed by Mozilla) Aaron Klotz

Reporter

Updated

•

5 years ago

Depends on: 1683425

Toshihito Kikuchi [:toshi]

Updated

•

5 years ago

See Also: → 1683033

Toshihito Kikuchi [:toshi]

Updated

•

5 years ago

See Also: → 1321097

Gian-Carlo Pascutto [:gcp]

Comment 7

•

4 years ago

Toshi, can we grab Aaron's WIP patches before they disappear from the try server? (And attach them to this bug)

Flags: needinfo?(tkikuchi)

Stephen A Pohl [:spohl]

Comment 8

•

4 years ago

Attached patch fuzzy query — Details — Splinter Review

Stephen A Pohl [:spohl]

Comment 9

•

4 years ago

Attached patch preffed on for nightly — Details — Splinter Review

Stephen A Pohl [:spohl]

Comment 10

•

4 years ago

Attached patch installer — Details — Splinter Review

Stephen A Pohl [:spohl]

Updated

•

4 years ago

Attachment #9261053 - Attachment is patch: true

Stephen A Pohl [:spohl]

Updated

•

4 years ago

Attachment #9261056 - Attachment is patch: true

Stephen A Pohl [:spohl]

Updated

•

4 years ago

Attachment #9261057 - Attachment is patch: true

Stephen A Pohl [:spohl]

Comment 11

•

4 years ago

Attached patch OOP and async nsFilePicker — Details — Splinter Review

Stephen A Pohl [:spohl]

Comment 12

•

4 years ago

Attached patch gkwinoop — Details — Splinter Review

Stephen A Pohl [:spohl]

Comment 13

•

4 years ago

Attached patch gkwinoop_ps — Details — Splinter Review

Stephen A Pohl [:spohl]

Comment 14

•

4 years ago

(In reply to Gian-Carlo Pascutto [:gcp] from comment #7)

Toshi, can we grab Aaron's WIP patches before they disappear from the try server? (And attach them to this bug)

I was curious to find out how to do this when I saw this request come through bugmail. It was easier than expected. Toshi, I figured I'd just attach the patches instead of making you go through the same exercise.

Flags: needinfo?(tkikuchi)

Toshihito Kikuchi [:toshi]

Comment 15

•

4 years ago

(In reply to Stephen A Pohl [:spohl] from comment #14)

I was curious to find out how to do this when I saw this request come through bugmail. It was easier than expected. Toshi, I figured I'd just attach the patches instead of making you go through the same exercise.

Thank you for doing it. I was thinking of getting commits with hg pull, rebasing, and uploading them to phabricator.

Stephen A Pohl [:spohl]

Comment 16

•

4 years ago

(In reply to Toshihito Kikuchi [:toshi] from comment #15)

Thank you for doing it. I was thinking of getting commits with hg pull, rebasing, and uploading them to phabricator.

Good point. If we end up sending these patches for review (rather than writing new patches based on Aaron's work), then yes, that might be the better option and we can remove the attached patches in bugzilla.

(No longer employed by Mozilla) Aaron Klotz

Reporter

Comment 17

•

4 years ago

I've still got the patchset, which is probably more up to date than whatever was on try. Let me see if I can dig them up.

Stephen A Pohl [:spohl]

Updated

•

4 years ago

Severity: -- → S3

Priority: -- → P2

Florian Quèze [:florian]

Updated

•

4 years ago

Whiteboard: [bhr:CFileOpenSave::Show]

Ray Kraesig [:rkraesig]

Assignee

Updated

•

3 years ago

See Also: → 1743393

Ray Kraesig [:rkraesig]

Assignee

Comment 18

•

3 years ago

(In reply to (No longer employed by Mozilla) Aaron Klotz from comment #17)

I've still got the patchset, which is probably more up to date than whatever was on try. Let me see if I can dig them up.

Any luck?

Assignee: bugzilla → rkraesig

Gian-Carlo Pascutto [:gcp]

Updated

•

3 years ago

See Also: → 1771638

Stephen A Pohl [:spohl]

Comment 19

•

3 years ago

(In reply to Ray Kraesig [:rkraesig] from comment #18)

(In reply to (No longer employed by Mozilla) Aaron Klotz from comment #17)

I've still got the patchset, which is probably more up to date than whatever was on try. Let me see if I can dig them up.

Any luck?

Let's proceed under the assumption that the answer is no, and adjust any patches if Aaron comes through after all.

Gian-Carlo Pascutto [:gcp]

Updated

•

3 years ago

See Also: → 1670411

Stephen A Pohl [:spohl]

Updated

•

3 years ago

Depends on: 1743393

See Also: 1743393 →

Stephen A Pohl [:spohl]

Updated

•

3 years ago

Blocks: 1743393

No longer depends on: 1743393

Marco Bonardo [:mak]

Updated

•

3 years ago

See Also: → 1794264

Updated

•

3 years ago

See Also: → 1798791

Stephen A Pohl [:spohl]

Updated

•

3 years ago

Whiteboard: [bhr:CFileOpenSave::Show] → [bhr:CFileOpenSave::Show][win:stability]

Stephen A Pohl [:spohl]

Comment 20

•

3 years ago

Ray, are you still working on this? Could you provide a status update on what's missing before we can land?

Flags: needinfo?(rkraesig)

Ray Kraesig [:rkraesig]

Assignee

Comment 21

•

3 years ago

I'm afraid this has almost completely fallen by the wayside. (Although the recent commentary on bug 1798791 got me to poke at my notes a bit.)

The conclusion I came to some months ago was that :aklotz's original idea of using the system DLL surrogate resulted in too much complexity at both install and (outside of the happy path) invocation time due to tying us to the global registration mechanism of COM. (See, relatedly, what happened with AccessibleHandler.dll in bug 1670147.) While the problems there are solvable, it would be simpler just to have our own custom host-process in the Firefox installation directory -- and if we have that, there's not much reason to use COM at all.

I've done some initial investigation into using either a) an auxiliary Firefox process or b) a custom host process. Both appear feasible, and I vaguely prefer b), but the metaphorical shovel has not yet struck earth.

Removing myself as the assignee because of that. Feel free to reassign me to it, though, and I'll requeue it and reprioritize things.

Assignee: rkraesig → nobody

Status: ASSIGNED → NEW

Flags: needinfo?(rkraesig)

Ray Kraesig [:rkraesig]

Assignee

Updated

•

3 years ago

Depends on: 1704500

Ray Kraesig [:rkraesig]

Assignee

Updated

•

3 years ago

Depends on: 1816740

Ray Kraesig [:rkraesig]

Assignee

Comment 22

•

2 years ago

Removing myself as the assignee because of that. Feel free to reassign me to it, though, and I'll requeue it and reprioritize things.

Taking, because I'm working on this again.

Assignee: nobody → rkraesig

Ray Kraesig [:rkraesig]

Assignee

Updated

•

2 years ago

Depends on: 1833450

Ray Kraesig [:rkraesig]

Assignee

Updated

•

2 years ago

See Also: → 1810341

Ray Kraesig [:rkraesig]

Assignee

Updated

•

2 years ago

Blocks: 1582795

Gregory Pappas [:gregp]

Updated

•

2 years ago

See Also: → 112134

Ray Kraesig [:rkraesig]

Assignee

Updated

•

2 years ago

See Also: → 1837008

Ray Kraesig [:rkraesig]

Assignee

Updated

•

2 years ago

Depends on: 1837079

Ray Kraesig [:rkraesig]

Assignee

Updated

•

2 years ago

Blocks: 1845379

Ray Kraesig [:rkraesig]

Assignee

Updated

•

2 years ago

Depends on: 1858225

Ray Kraesig [:rkraesig]

Assignee

Updated

•

2 years ago

Depends on: 1862712

Ray Kraesig [:rkraesig]

Assignee

Comment 23

•

2 years ago

(Bug 1683425 appears to have been a blocker for :aklotz's externally-hosted COM implementation, but does not block the current XPIDL-based approach.)

No longer depends on: 1683425

Ray Kraesig [:rkraesig]

Assignee

Updated

•

2 years ago

Blocks: 1810341

See Also: 1810341 →

Ray Kraesig [:rkraesig]

Assignee

Updated

•

2 years ago

Blocks: 705190

Ray Kraesig [:rkraesig]

Assignee

Updated

•

2 years ago

No longer blocks: 1810341

Ray Kraesig [:rkraesig]

Assignee

Updated

•

2 years ago

No longer blocks: 1675831

Oliver Medhurst [:canadahonk]

Updated

•

2 years ago

See Also: → 1866517

Ray Kraesig [:rkraesig]

Assignee

Updated

•

2 years ago

Summary: Asynchronous, out-of-process Win32 file picker → [meta] Asynchronous, out-of-process Win32 file picker

BugBot (nomail) [:suhaib / :marco/ :calixte]

Updated

•

2 years ago

Keywords: meta

Ray Kraesig [:rkraesig]

Assignee

Updated

•

2 years ago

Depends on: 1872397

Ray Kraesig [:rkraesig]

Assignee

Updated

•

2 years ago

Blocks: 1837008

Takanori MATSUURA

Updated

•

2 years ago

See Also: 1837008 →

Updated

•

2 years ago

No longer blocks: 705190

Depends on: 705190

Updated

•

2 years ago

No longer blocks: 1661752

See Also: → 1661752

Kagami Rosylight [:saschanaz] (they/them)

Updated

•

2 years ago

See Also: → 1749130

Ray Kraesig [:rkraesig]

Assignee

Updated

•

2 years ago

No longer depends on: 705190

Jens Stutte [:jstutte]

Comment 24

•

1 year ago

Hi :rkraesig, I was trying to understand the status of this bug. Given it is a meta-bug, I'd have expected to see some open bugs it depends on, but that seems not to be the case. And FWIW I do not see this bug anymore as top-bug in the background hang monitor data - does this mean it's fixed?

Flags: needinfo?(rkraesig)

Ray Kraesig [:rkraesig]

Assignee

Comment 25

•

1 year ago

The status is that it's currently in Nightly and is pref-blocked from riding the trains. This is partly to flush out papercut bugs, and partly because we — well, I — want to reduce the failures we're seeing before letting it go out. The latter is blocked on getting more information about failures via telemetry; and getting that telemetry in has been hovering at the top of my personal to-do list for a while, but it keeps getting pushed out by P2s. There should indeed be some open bugs it depends on; I just haven't been very good about linking bugs to it. I'll see about doing that.

On the other hand, the fact that this issue ever showed up in the background hang monitor data was probably an issue with the hang-detector itself — or alternatively, that I'm not understanding what it's detecting. (The latter is quite likely, since I don't think I've ever seen that dashboard before.) The stack from its entry on 2023-10-01 suggests that a) the process wasn't actually "hung", just in a nested (but alien) modal event loop; and b) that it stopped being detected when bug 1858225 landed. That was the "async" part of this bug, which pushed the uncontrolled modal event loop into a separate thread.
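
As a rough illustration of what "pushing the modal event loop into a separate thread" buys, here is a minimal Python sketch. It is not the Gecko implementation: show_picker_async and fake_modal_show are hypothetical names, and the sleep merely stands in for a blocking, modal Show() call.

```python
import threading
import time
from concurrent.futures import Future
from typing import Callable, Optional


def show_picker_async(blocking_show: Callable[[], Optional[str]]) -> Future:
    """Run a blocking, modal-style call on its own thread; return a Future."""
    result: Future = Future()

    def worker() -> None:
        try:
            result.set_result(blocking_show())
        except Exception as exc:  # hand failures back to the caller
            result.set_exception(exc)

    threading.Thread(target=worker, name="file-picker", daemon=True).start()
    return result


def fake_modal_show() -> Optional[str]:
    """Hypothetical stand-in for the dialog's blocking Show() call."""
    time.sleep(0.1)  # pretend the user takes a moment to choose
    return "example.txt"


if __name__ == "__main__":
    pending = show_picker_async(fake_modal_show)
    # The calling thread returns immediately and can keep servicing its own events.
    print("picker still open:", not pending.done())
    print("picked:", pending.result(timeout=5))
```

The calling thread only blocks if and when it asks for the result, so a monitor that watches that thread alone would no longer see time spent inside the modal loop.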

Flags: needinfo?(rkraesig)

Jens Stutte [:jstutte]

Comment 26

•

1 year ago

Cool, thanks!

There should indeed be some open bugs it depends on; ... I'll see about doing that.

Yeah, that would be nice.

b) that it stopped being detected when bug 1858225 landed. That was the "async" part of this bug, which pushed the uncontrolled modal event loop into a separate thread.

My (also limited) understanding of the BHM is that it only detects things blocking the main thread. So moving something to a different thread will definitely make it less harmful for main-thread responsiveness, and it will no longer show up here.

The stack from its entry on 2023-10-01 suggests that a) the process wasn't actually "hung", just in a nested (but alien) modal event loop;

I think the important piece here is the "alien" loop. IIUC that means we are processing native windows events but no normal thread runnables/events, which from our responsiveness point of view counts as hanging while waiting, I think. This might or might not also be an issue for the thread you moved this to, depending on what the thread is supposed to do and how you handle it now. An important edge case to consider here is thread shutdown, where the thread needs to be responsive to some extent; otherwise it might block process shutdown. (I did not look at what you actually did here, so take these as hints rather than concrete concerns.)

Florian Quèze [:florian]

Comment 27

•

1 year ago

(In reply to Ray Kraesig [:rkraesig] from comment #25)

On the other hand, the fact that this issue ever showed up in the background hang monitor data was probably an issue with the hang-detector itself — or alternatively, that I'm not understanding what it's detecting. (The latter is quite likely, since I don't think I've ever seen that dashboard before.) The stack from its entry on 2023-10-01 suggests that a) the process wasn't actually "hung", just in a nested (but alien) modal event loop; and b) that it stopped being detected when bug 1858225 landed. That was the "async" part of this bug, which pushed the uncontrolled modal event loop into a separate thread.

There was some discussion in bug 1586922 about whether detecting the File Picker as a hang was a false positive. I initially thought it was, but changed my mind with the profile in bug 1586922 comment 1 that shows the main thread being blocked by OS code doing main thread I/O, and the BHR markers being in that time frame. We should not be reporting hangs during the nested event loop while waiting for the user to act on the prompt; the reported hangs should be the time it takes to open the prompt.

The entry from 2023-10-01 you pointed to shows 124,770 hangs reported within a day for a total time of 42,691s. That's an average of 342ms per hang, way too short for it to be waiting for a user action.

For context, BHR reports a hang and captures a stack whenever the main thread has been blocked for 128ms.
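
The arithmetic behind the 342ms figure can be checked with a few lines of Python. The two input numbers and the 128ms threshold are the ones quoted above; the function and constant names are just illustrative.

```python
BHR_THRESHOLD_MS = 128  # hang-report threshold quoted in the comment above


def average_hang_ms(total_hang_seconds: float, hang_count: int) -> float:
    """Mean duration of one reported hang, in milliseconds."""
    return total_hang_seconds * 1000.0 / hang_count


if __name__ == "__main__":
    avg = average_hang_ms(42_691, 124_770)
    print(f"average hang: {avg:.0f} ms")  # average hang: 342 ms
    print(f"multiple of threshold: {avg / BHR_THRESHOLD_MS:.1f}x")  # 2.7x
```

An average under three times the reporting threshold is consistent with short stalls while the prompt opens, not with time spent waiting on a user.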

Ray Kraesig [:rkraesig]

Assignee

Updated

•

1 year ago

Depends on: 1883943

Ray Kraesig [:rkraesig]

Assignee

Updated

•

1 year ago

Depends on: 1884221

Ray Kraesig [:rkraesig]

Assignee

Updated

•

1 year ago

Depends on: 1884426

Ray Kraesig [:rkraesig]

Assignee

Updated

•

1 year ago

Depends on: 1893868

Gian-Carlo Pascutto [:gcp]

Updated

•

1 year ago

Blocks: 1899111

Yannis Juglaret [:yannis]

Updated

•

1 year ago

No longer blocks: 1899111

Yannis Juglaret [:yannis]

Updated

•

1 year ago

Blocks: 1899111

Yannis Juglaret [:yannis]

Updated

•

1 year ago

See Also: → 1901230

Alice0775 White

Updated

•

1 year ago

Depends on: 1911706

Ray Kraesig [:rkraesig]

Assignee

Updated

•

10 months ago

No longer blocks: 1582795

Ray Kraesig [:rkraesig]

Assignee

Comment 28

•

10 months ago

With bug 1883943 and the release of Fx130 a few months ago, the main part of this bug is complete. There's a bit of cleanup work yet to do, but that's not P2. Reducing priority accordingly.

Priority: P2 → P3
