Closed Bug 1792466 Opened 2 years ago Closed 1 year ago

Firefox is taking considerable amount of time to batch open PDFs (video: 0:15 - 0:22)

Categories

(Toolkit :: Startup and Profile System, defect)

Firefox 105
defect

Tracking

()

RESOLVED FIXED
109 Branch
Performance Impact medium
Tracking Status
firefox106 --- wontfix
firefox107 --- wontfix
firefox108 --- wontfix
firefox109 --- fixed

People

(Reporter: anzepintar, Assigned: mossop)

References

(Depends on 1 open bug)

Details

(Keywords: perf:startup, reproducible)

Attachments

(3 files)

Attached video Video about the issue

User Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:105.0) Gecko/20100101 Firefox/105.0

Steps to reproduce:

Batch opening PDFs

Actual results:

First PDF opens normaly, but then firefox waits to load all of the PDFs and then open them at the same time

Expected results:

It should open all PDF in tabs and then load them when user opens tab

Component: Untriaged → File Handling
Summary: Firefox is taking considerable amount of time to batch open PDFs → Firefox is taking considerable amount of time to batch open PDFs (video from 0:15 - 0:22)
Summary: Firefox is taking considerable amount of time to batch open PDFs (video from 0:15 - 0:22) → Firefox is taking considerable amount of time to batch open PDFs (video: 0:15 - 0:22)
Component: File Handling → PDF Viewer

The severity field is not set for this bug.
:calixte, could you have a look please?

For more information, please visit auto_nag documentation.

Flags: needinfo?(cdenizet)
Status: UNCONFIRMED → NEW
Performance Impact: --- → ?
Component: PDF Viewer → Shell Integration
Ever confirmed: true
Flags: needinfo?(cdenizet)
Component: Shell Integration → File Handling

OK, we should really stop shuffling this along components without adding any information or rationale as to why.

I reproduced this issue once, and then promptly stopped being able to reproduce it, with 8 or more tabs being opened promptly. Reporter (or Marco, as you confirmed the bug), do you see something similar? Or does this reproduce reliably for you? I'm wondering if disk caching or caching on the Firefox side is somehow making a difference.

FWIW, even when opening the files is quick, I'm sometimes seeing multiple windows being opened, rather than tabs. And the tabs not being opened at all in the video suggests that the slowdown is somehow before the PDFs end up in PDF.js, but rather in how we handle the incoming files being handed to us by Windows. I think the latter points to Shell Integration, but Kirk, you moved it out of that component as well - can you elaborate as to why you think this isn't related to our Windows DDE (or whatever it's called) code? I'm assuming Windows is spinning up more than 1 process for each file handled, and they all have to consolidate into the first one...

Flags: needinfo?(mcastelluccio)
Flags: needinfo?(bytesized)
Flags: needinfo?(anze.pintar2)

Despite the broad name, the Shell Integration component was originally only intended for bugs related to determining if Firefox is the default browser, setting Firefox as the default browser, and setting the desktop background.

Over time, our team has started putting a few related issues in here, like registering Firefox as the default PDF viewer and making sure files have the correct icons when Firefox is set as the default handler for them. But I still don't think that Shell Integration is the right component for this.

can you elaborate as to why you think this isn't related to our Windows DDE (or whatever it's called) code?

A search engine tells me that Windows DDE is "Dynamic Data Exchange". I'm guessing that that is what Firefox uses to have new processes sorta just hand off their arguments to an existing process? It does seem possible that that is involved with whatever is going on here, but as far as I know, our team has never owned that functionality. It looks to me like that functionality lives in toolkit/components/remote, which specifies that its component is Toolkit::Startup and Profile System.

Flags: needinfo?(bytesized)

I can reproduce it every time.

But, I found another bug related to this one:
Steps to reproduce:
Have two firefox windows open. Batch open multiple PDF files.
When the first one opens in the last focused window, click on the second firefox window.
Result:
First PDF opens in one window, and the remaining open in the second window.
Video

Flags: needinfo?(anze.pintar2)

(In reply to anze.pintar2 from comment #4)

But, I found another bug related to this one:
Steps to reproduce:
Have two firefox windows open. Batch open multiple PDF files.
When the first one opens in the last focused window, click on the second firefox window.
Result:
First PDF opens in one window, and the remaining open in the second window.
Video

Could you file a new bug for this problem?

Flags: needinfo?(mcastelluccio)

(In reply to Marco Castelluccio [:marco] from comment #5)

(In reply to anze.pintar2 from comment #4)

But, I found another bug related to this one:
Steps to reproduce:
Have two firefox windows open. Batch open multiple PDF files.
When the first one opens in the last focused window, click on the second firefox window.
Result:
First PDF opens in one window, and the remaining open in the second window.
Video

Could you file a new bug for this problem?

I have made new bug report: bug 1795590

I managed to reproduce this issue on:

  • Firefox 106.0;
  • Firefox 107.0b1;
  • Nightly 108.0a1;

Tested on:

  • Ubuntu 22;
  • Windows 10;
  • macOS 10.12;

On macOS the behaviour seems to be more like bug 1795590, since the files were opened in more than one window. Apart from this, macOS seems to be the least affected platform out of the three.
Since this bug is already set as NEW I'm just going to update the tracking status.

The severity field is not set for this bug.
:Gijs, could you have a look please?

For more information, please visit auto_nag documentation.

Flags: needinfo?(gijskruitbosch+bugs)

So I don't really understand what is responsible for the slowdown here which means this isn't actionable. But as far as I can tell from repro'ing on a VM, the following happens:

  1. the OS creates N Firefox instances to open the N pdfs. This is a consequence of how Firefox's "open" verbs are registered, and how the PDFs are opened from Windows Explorer - this is just how Windows works, AIUI. If you used the commandline and simply passed 5 paths/URLs to one instance I think the problem would not occur (though I haven't verified this).
  2. the first pdf opens in instance A
  3. all the other N-1 PDFs lead to background instances B,C, ... N being opened
  4. I don't know how to work out what is being slow in the extra instances being opened, or if it's just instance A being slow processing messages, or what - but if I profile instance A, I see almost no activity in the parent process for a few seconds, until suddenly all the tabs open at once.

This is the profile from instance A: https://share.firefox.dev/3gQTJYL .

But this doesn't show what is happening in the other instances. Florian, is there some way to get that information?

Meanwhile, given that the slowdown seems to be due to how long it takes for the other instances to hand their URLs to instance A, passing this to a startup component.

I will file a separate bug to investigate using MultiSelectModel so we just get passed more than 1 file at the same time, which would make our life a lot easier.

Component: File Handling → Startup and Profile System
Flags: needinfo?(gijskruitbosch+bugs) → needinfo?(florian)
Product: Firefox → Toolkit
Depends on: 1797979
See Also: → 1795590

When Firefox is launched we do some careful locking to make sure that only one instance can be starting up at the same time, any other instances delay until the existing instance has selected its profile and is ready for remote urls to be passed to it. So here whichever secures the lock first will complete startup and load and then the other three instances should send their urls to it. The first instance is meant to release its lock fairly immediately after securing the profile and starting to open the new window, the other instances are meant to poll to try to get the lock every 100ms so really should be passing the url along near enough instantly. But here there is a delay of about five seconds before the urls are opened which interestingly is the time we wait before giving up attempting to get the startup lock and proceeding anyway. So my guess is that the first instance isn't releasing that startup lock correctly.

(In reply to Dave Townsend [:mossop] from comment #10)

So my guess is that the first instance isn't releasing that startup lock correctly.

This is an interesting hypothesis - how would you validate this?

I wasn't sure how to startup profiling here because we don't control the startup - we're started by Windows. So I couldn't work out how to get the profiler to start from startup (which I guess might then work with profiler markers or similar)

(In reply to :Gijs (he/him) from comment #11)

(In reply to Dave Townsend [:mossop] from comment #10)

So my guess is that the first instance isn't releasing that startup lock correctly.

This is an interesting hypothesis - how would you validate this?

We should be able to use Process Monitor and watch for file accesses to files named parent.lock in the temporary directory. Getting a list of those accesses when this issue occurs should confirm/debunk this theory.

(In reply to :Gijs (he/him) from comment #9)

This is the profile from instance A: https://share.firefox.dev/3gQTJYL .

But this doesn't show what is happening in the other instances. Florian, is there some way to get that information?

You can apply the patch from bug 1528859, then set the MOZ_PROFILER_STARTUP and MOZ_PROFILER_SHUTDOWN environment variables globally (ie. as part of the Windows user session). In the value of the MOZ_PROFILER_SHUTDOWN environment variable, use the new %p magic placeholder to have the profiler write to disk each profile with its pid in the name.

Flags: needinfo?(florian)

(In reply to Dave Townsend [:mossop] from comment #12)

(In reply to :Gijs (he/him) from comment #11)

(In reply to Dave Townsend [:mossop] from comment #10)

So my guess is that the first instance isn't releasing that startup lock correctly.

This is an interesting hypothesis - how would you validate this?

We should be able to use Process Monitor and watch for file accesses to files named parent.lock in the temporary directory. Getting a list of those accesses when this issue occurs should confirm/debunk this theory.

Ok I've just done this and I think I was half right. It is the startup lock that is the problem but while the first instance is releasing the lock file correctly it appears that the other instances are polling for the lock incorrectly. I have a suspicion for what is going on here and can dig into this next week.

Assignee: nobody → dtownsend

Oh I see it. When the first instance releases the lock file it also deletes the temporary directory created to hold the lock file. Other instances when polling then attempt to re-create the lock file in a now non-existent directory 🤦‍♂️

The Performance Impact Calculator has determined this bug's performance impact to be medium. If you'd like to request re-triage, you can reset the Performance Impact flag to "?" or needinfo the triage sheriff.

Platforms: Windows
Impact on browser: Causes noticeable startup delay
[x] Able to reproduce locally

Performance Impact: ? → medium

I have a fix that works for this but all the pdf files end up in separate windows. I suspect that now the other instances acquire the lock so quickly that when they pass their file paths over the initial instance hasn't finished opening the first window yet and so decides to open new windows.

(In reply to Dave Townsend [:mossop] from comment #17)

I have a fix that works for this but all the pdf files end up in separate windows. I suspect that now the other instances acquire the lock so quickly that when they pass their file paths over the initial instance hasn't finished opening the first window yet and so decides to open new windows.

🤡

It's hacky but could we check the process start time in BrowserContentHandler, and if the parent process was started <15s ago, use a ChromeUtils.idleDispatch limited to 3-5s or so before opening another window, if we can't find one?

Flags: needinfo?(dtownsend)

(In reply to :Gijs (he/him) from comment #18)

(In reply to Dave Townsend [:mossop] from comment #17)

I have a fix that works for this but all the pdf files end up in separate windows. I suspect that now the other instances acquire the lock so quickly that when they pass their file paths over the initial instance hasn't finished opening the first window yet and so decides to open new windows.

🤡

It's hacky but could we check the process start time in BrowserContentHandler, and if the parent process was started <15s ago, use a ChromeUtils.idleDispatch limited to 3-5s or so before opening another window, if we can't find one?

Yes that is a hacky solution. I want to see if there is something slightly nicer we can do before resorting to that though.

Flags: needinfo?(dtownsend)

(In reply to Dave Townsend [:mossop] from comment #17)

I have a fix that works for this but all the pdf files end up in separate windows. I suspect that now the other instances acquire the lock so quickly that when they pass their file paths over the initial instance hasn't finished opening the first window yet and so decides to open new windows.

Looks like this is basically bug 1600153, which is mac-only but "only" has 20 dupes... 😬

See Also: → 1600153

(In reply to :Gijs (he/him) from comment #20)

(In reply to Dave Townsend [:mossop] from comment #17)

I have a fix that works for this but all the pdf files end up in separate windows. I suspect that now the other instances acquire the lock so quickly that when they pass their file paths over the initial instance hasn't finished opening the first window yet and so decides to open new windows.

Looks like this is basically bug 1600153, which is mac-only but "only" has 20 dupes... 😬

I feel like you're nerd sniping me at this point...

Ideally we would use the window mediator to just find new browser windows that
are in the process of opening but while we can find the windows they just appear
as about:blank with no way to verify that they are browser windows.

This just takes the straightforward approach of forcing code that opens browser
windows to register them with the BrowserWindowTracker. Perhaps we should add an
API to BrowserWindowTracker to actually open the window? We sure do that in
enough places that it might be useful to centralise it. The only niggle is that
when windows start being tracked by BrowserWindowTracker they haven't yet
started opening any initial urls and so if we immediately allow other urls to be
opened those initial urls clobber the others. So there is still a small race
here but one that already exists.

Depends on D161075

Attachment #9301627 - Attachment description: WIP: Bug 1792466: Ensure the startup lock directory exists each time we attempt to acquire the startup lock. r=Gijs! → Bug 1792466: Ensure the startup lock directory exists each time we attempt to acquire the startup lock. r=mhowell!
Attachment #9301628 - Attachment description: WIP: Bug 1792466: Track browser windows that are in the process of being opened so we don't open multiple windows when trying to open many files/urls quickly. r=Gijs! → Bug 1792466: Track browser windows that are in the process of being opened so we don't open multiple windows when trying to open many files/urls quickly. r=Gijs!

The severity field is not set for this bug.
:mossop, could you have a look please?

For more information, please visit auto_nag documentation.

Flags: needinfo?(dtownsend)
Severity: -- → S4
Flags: needinfo?(dtownsend)
Pushed by dtownsend@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/418d1ede2ae3
Ensure the startup lock directory exists each time we attempt to acquire the startup lock. r=mhowell
https://hg.mozilla.org/integration/autoland/rev/36cc09178d76
Track browser windows that are in the process of being opened so we don't open multiple windows when trying to open many files/urls quickly. r=Gijs
Backout by sstanca@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/e459fd828f16
Backed out 2 changesets for causing mch fails on test_mochitest_integration.py. CLOSED TREE

Backed out for causing mch fails on test_mochitest_integration.py.

Flags: needinfo?(dtownsend)
Pushed by dtownsend@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/3e3bee6750ff
Ensure the startup lock directory exists each time we attempt to acquire the startup lock. r=mhowell
https://hg.mozilla.org/integration/autoland/rev/531ea5f0643b
Track browser windows that are in the process of being opened so we don't open multiple windows when trying to open many files/urls quickly. r=Gijs
Status: NEW → RESOLVED
Closed: 1 year ago
Resolution: --- → FIXED
Target Milestone: --- → 109 Branch
Flags: needinfo?(dtownsend)

The patch landed in nightly and beta is affected.
:mossop, is this bug important enough to require an uplift?

  • If yes, please nominate the patch for beta approval.
  • If no, please set status-firefox108 to wontfix.

For more information, please visit auto_nag documentation.

Flags: needinfo?(dtownsend)

I don't think this is critical enough to warrant uplift.

Flags: needinfo?(dtownsend)
Flags: qe-verify+

Reproducible on Firefox 108.0 on Windows 10.
On Firefox 109.0b4(build ID: 20221218190303) , tabs open as expected. However, on 110.0a1 Nightly(latest), sometimes one of the opened tabs that should’ve contained the PDF content is redirectioned to a somewhat non-functional New Tab Page. Here’s a recording of that behaviour on Windows 10.
Browser console shows the following:

can't access property "get", linkMap is undefined 4 NewTabUtils.sys.mjs:2027
Links_onLinkChanged resource://gre/modules/NewTabUtils.sys.mjs:2027
PlacesProvider__callObservers resource://gre/modules/NewTabUtils.sys.mjs:720
handlePlacesEvents resource://gre/modules/NewTabUtils.sys.mjs:702
handlePlacesEvents self-hosted:1494

Flags: needinfo?(dtownsend)

(In reply to Ardelean Oana from comment #32)

Reproducible on Firefox 108.0 on Windows 10.
On Firefox 109.0b4(build ID: 20221218190303) , tabs open as expected. However, on 110.0a1 Nightly(latest), sometimes one of the opened tabs that should’ve contained the PDF content is redirectioned to a somewhat non-functional New Tab Page. Here’s a recording of that behaviour on Windows 10.
Browser console shows the following:

can't access property "get", linkMap is undefined 4 NewTabUtils.sys.mjs:2027
Links_onLinkChanged resource://gre/modules/NewTabUtils.sys.mjs:2027
PlacesProvider__callObservers resource://gre/modules/NewTabUtils.sys.mjs:720
handlePlacesEvents resource://gre/modules/NewTabUtils.sys.mjs:702
handlePlacesEvents self-hosted:1494

Please file a new bug for this.

Flags: needinfo?(dtownsend) → needinfo?(oana.ardelean)

I filled bug 1808321 for the issue described above.

Flags: needinfo?(oana.ardelean)
Blocks: 713713
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: