Open Bug 1646986 Opened 5 years ago Updated 3 years ago

Firefox freezes when opening an item from download drop-down with no associated program

Categories

(Core :: XPCOM, defect, P3)

77 Branch
defect

Tracking

()

UNCONFIRMED

People

(Reporter: brandonjhope, Unassigned)

References

Details

(Whiteboard: [win:stability])

Attachments

(4 files)

User Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:77.0) Gecko/20100101 Firefox/77.0

Steps to reproduce:

Context I'm a dev and work on a windows application that does a COM registration to handle double clicks on our custom file types. Sometimes when doing dev work the COM registrations will no work correctly and the file types are left "un-associated". Generally when opening our files directly from Firefox it will open in our application, but sometimes firefox will end up freezing (blocking call on a thread?). This freezes the entire GUI/Window and windows may even report it as being "unresponsive".
I think this is a defect, though not super likely or widespread. This has happened on previous versions of firefox as well so not a "new" issue by any means.

  1. Download a file that currently has no program associated to open it.
  2. click on the file from the download manager drop down (or download manager)

Actual results:

  1. Clicked on file.
  2. firefox freezes for 30-60s.
  3. A windows error message pops up.
  4. hit okay, get an unresponsive script error in firefox.

Expected results:

  1. Download a file that currently has no program associated to open it.
  2. click on the file from the download manager drop down (or download manager)
  3. Firefox should continue to function and spawn an error message or open the "choose app dialogue"

Bugbug thinks this bug should belong to this component, but please revert this change in case of error.

Component: Untriaged → File Handling

image 1 from zipfile

Attached image firefox_freeze.JPG

We won't be able to reproduce this without more details. Can you elaborate on what state the registry / file registration is in when this happens?

When opening files, we try to ask Windows Explorer to ask the file (so as not to inherit restrictive settings into the new process, which can break the other process), see https://searchfox.org/mozilla-central/source/widget/windows/ShellHeaderOnlyUtils.h#48 . If that doesn't work, we will call ShellExecuteW - https://searchfox.org/mozilla-central/rev/a87a1c3b543475276e6d57a7a80cb02f3e42b6ed/xpcom/io/nsLocalFileWin.cpp#3057 .

It's not clear to me why this would hang. Are you able to get a profile using the builtin profiler tool? ( https://profiler.firefox.com/ )

Flags: needinfo?(brandonjhope)

Here's an attached profile https://spacebear.rocks/index.php/s/fSsAide9JHzHTnW (link will expire in 1 year)

For the registration, on start up with add registry keys which link to our application. I'm assuming for debug builds this process is incomplete which is creating a handler that doesn't point to a valid application/executable. The keys are being placed in HKEY_CLASSES_ROOT and HKEY_LOCAL_MACHINE.

Comparing the behaviour to windows explorer, I get a similar error message, but it doesn't lock up explorer, and stil takes 30s or so to spawn. If I would guess the win32 function handling this is taking it's sweet time, but Firefox is calling it in a way that blocks the whole process.

Flags: needinfo?(brandonjhope)

The severity field is not set for this bug.
:Gijs, could you have a look please?

For more information, please visit auto_nag documentation.

Flags: needinfo?(gijskruitbosch+bugs)

The profile shows us spending like half a minute with a stack that looks like this:

NtUserMsgWaitForMultipleObjectsEx
unsigned long RealMsgWaitForMultipleObjectsEx(unsigned long,void * const *,unsigned long,unsigned long,unsigned long)
CCliModalLoop::BlockFn(void**, unsigned long, unsigned long*)
ClassicSTAThreadWaitForCall(CSyncClientCall*, WaitForCallReason, unsigned long)
ThreadSendReceive(tagRPCOLEMESSAGE*, CSyncClientCall*, _GUID const&)
CSyncClientCall::SendReceive2(tagRPCOLEMESSAGE*, unsigned long*)
ClassicSTAThreadSendReceive(CSyncClientCall*, tagRPCOLEMESSAGE*, unsigned long*)
CSyncClientCall::SendReceive(tagRPCOLEMESSAGE*, unsigned long*)
CClientChannel::SendReceive(tagRPCOLEMESSAGE*, unsigned long*)
NdrExtpProxySendReceive(void*, _MIDL_STUB_MESSAGE*)
union _CLIENT_CALL_RETURN NdrpClientCall2(struct _MIDL_STUB_DESC const *,unsigned char const *,unsigned char *,unsigned char)
ObjectStublessClient(void*, long long*, long)
ObjectStubless()
mozilla::ShellExecuteByExplorer(_bstr_t const&, _variant_t const&, _variant_t const&, _variant_t const&, _variant_t const&)
nsLocalFile::Launch()
nsMIMEInfoWin::LaunchDefaultWithFile(nsIFile*)
nsMIMEInfoWin::LaunchWithFile(nsIFile*)
XPTC__InvokebyIndex
static XPCWrappedNative::CallMethod(XPCCallContext&, XPCWrappedNative::CallMode)
XPC_WN_CallMethod(JSContext*, unsigned int, JS::Value*)
js::InternalCallOrConstruct(JSContext*, JS::CallArgs const&, js::MaybeConstruct, js::CallReason)
Interpret(JSContext*, js::RunState&)
InternalCall(JSContext*, js::AnyInvokeArgs const&, js::CallReason)
launchFile
launchDownload
launch
downloadsCmd_open

:toshi, any idea what's going on there?

Component: File Handling → XPCOM
Flags: needinfo?(gijskruitbosch+bugs) → needinfo?(tkikuchi)
Product: Firefox → Core

(In reply to brandonjhope from comment #0)

Context I'm a dev and work on a windows application that does a COM registration to handle double clicks on our custom file types. Sometimes when doing dev work the COM registrations will no work correctly and the file types are left "un-associated".

Does your windows application end up injecting code/DLLs in Firefox in any way?

Flags: needinfo?(brandonjhope)

(In reply to :Gijs (he/him) from comment #8)

(In reply to brandonjhope from comment #0)

Context I'm a dev and work on a windows application that does a COM registration to handle double clicks on our custom file types. Sometimes when doing dev work the COM registrations will no work correctly and the file types are left "un-associated".

Does your windows application end up injecting code/DLLs in Firefox in any way?

No. We don't touch/inject code into any other applications.

Flags: needinfo?(brandonjhope)

(In reply to :Gijs (he/him) from comment #7)

:toshi, any idea what's going on there?

Basically IShellDispatch2.ShellExecute should be asynchronous, but that callstack tells us it can block the thread, waiting on some internal objects.

(In reply to brandonjhope from comment #5)

Comparing the behaviour to windows explorer, I get a similar error message, but it doesn't lock up explorer, and stil takes 30s or so to spawn. If I would guess the win32 function handling this is taking it's sweet time, but Firefox is calling it in a way that blocks the whole process.

So this means with your application, executing a file somehow takes 30-60s. Do you think it's a bug in Windows, or your application does something wrong about file association?

Can you try the repro steps without running explorer.exe (i.e. kill explorer.exe before opening a file)? In this scenario, we fall back to ShellExecuteExW API instead of IShellDispatch2.ShellExecute.

Flags: needinfo?(tkikuchi)

Needinfo for comment #10...

Flags: needinfo?(brandonjhope)

Can you try the repro steps without running explorer.exe (i.e. kill explorer.exe before opening a file)? In this scenario, we fall back to ShellExecuteExW API instead of IShellDispatch2.ShellExecute.

I repeated the steps with explorer.exe killed. Firefox doesn't lock up and I get the "file does not have an app..." message about a minute later. I got another profile for you of this scenario. https://spacebear.rocks/index.php/s/jszYcxM2eZQx4FL

Do you think it's a bug in Windows, or your application does something wrong about file association?
It only happens on developer debug builds so I'm going with it's our applications problem. Unfortunately the code has a lot of inter dependencies so I can't make a little stand alone program for you guys.

Flags: needinfo?(brandonjhope)

(In reply to brandonjhope from comment #12)

Can you try the repro steps without running explorer.exe (i.e. kill explorer.exe before opening a file)? In this scenario, we fall back to ShellExecuteExW API instead of IShellDispatch2.ShellExecute.

I repeated the steps with explorer.exe killed. Firefox doesn't lock up and I get the "file does not have an app..." message about a minute later. I got another profile for you of this scenario. https://spacebear.rocks/index.php/s/jszYcxM2eZQx4FL

Do you think it's a bug in Windows, or your application does something wrong about file association?

It only happens on developer debug builds so I'm going with it's our applications problem. Unfortunately the code has a lot of inter dependencies so I can't make a little stand alone program for you guys.

Thanks. It seems ShellExecuteExW with SEE_MASK_ASYNCOK is perfectly async while ShellExecuteByExplorer is not, unexpectedly. We may consider to move a part of this operation off the main thread, but it's not a quick fix.

Basically ShellExecuteByExplorer delegates a task of executing a file to explorer.exe. In this repro scenario, explorer.exe might be busy with interacting with your application, or some other heavy task..?

Basically ShellExecuteByExplorer delegates a task of executing a file to explorer.exe. In this repro scenario, explorer.exe might be busy with interacting with your application, or some other heavy task..?

I suspect it's calling our file handler, but it's non existent in this case and timing out, since we have C++ and a custom language interacting in a symbiotic manner. At least the problem is now documented :).

Depends on: 1654100

(In reply to Toshihito Kikuchi [:toshi] from comment #13)

Thanks. It seems ShellExecuteExW with SEE_MASK_ASYNCOK is perfectly async while ShellExecuteByExplorer is not, unexpectedly. We may consider to move a part of this operation off the main thread, but it's not a quick fix.

It's been a few months, and we now have IOUtils. I think it should be reasonably straightforward to have a static launch method on there that takes a path, constructs a file, and then runs the launch command on the background threadpool (which is broadly also how some of the other primitives are implemented, and there are convenient lambdas and error handling helpers already available). We could then use that method from the downloads code. Do you think that'd help here?

Flags: needinfo?(tkikuchi)

(In reply to :Gijs (he/him) from comment #15)

It's been a few months, and we now have IOUtils. I think it should be reasonably straightforward to have a static launch method on there that takes a path, constructs a file, and then runs the launch command on the background threadpool (which is broadly also how some of the other primitives are implemented, and there are convenient lambdas and error handling helpers already available). We could then use that method from the downloads code. Do you think that'd help here?

That's a great idea. Yes, having the async version of IOUtils::Launch is what I exactly meant in comment 13. Now that we learned ShellExecuteByExplorer takes ~100ms from bug 1654100, I think running it in the background will improve user experience even in normal cases. I think I can write up a patch for IOUtils.cpp. Can you take care of a JS part like DownloadIntegration.jsm?

Flags: needinfo?(tkikuchi)

(In reply to Toshihito Kikuchi [:toshi] from comment #16)

(In reply to :Gijs (he/him) from comment #15)

It's been a few months, and we now have IOUtils. I think it should be reasonably straightforward to have a static launch method on there that takes a path, constructs a file, and then runs the launch command on the background threadpool (which is broadly also how some of the other primitives are implemented, and there are convenient lambdas and error handling helpers already available). We could then use that method from the downloads code. Do you think that'd help here?

That's a great idea. Yes, having the async version of IOUtils::Launch is what I exactly meant in comment 13. Now that we learned ShellExecuteByExplorer takes ~100ms from bug 1654100, I think running it in the background will improve user experience even in normal cases. I think I can write up a patch for IOUtils.cpp. Can you take care of a JS part like DownloadIntegration.jsm?

I could, though switching it should be straightforward to do tbh - you just want to replace instances of file.launch(); (I think there are only 1 or 2) with IOUtils.launch(file.path) (or whatever you ended up calling the method).

To open a downloaded file on Windows, we call IShellDispatch2.ShellExecute
via ShellExecuteByExplorer by default. It's expected to be asynchronous,
but bug 1646986 reported a situation where the thread was blocked inside
that method.

This patch introduces a new "launch" method to IOUtils which invokes
ShellExecuteByExplorer in the background thread, and use it to launch
a downloaded file on Windows, so that the main thread is not frozen even
though the launching operation takes an unusually long time.

Assignee: nobody → tkikuchi

Postponing this for a bit because the solution is non-trivial.

Severity: -- → S4
Priority: -- → P2
Priority: P2 → P3
Whiteboard: [win:stability]

Sorry, there was a problem with the detection of inactive users. I'm reverting the change.

Assignee: nobody → tokikuc
Flags: needinfo?(nika)

The bug assignee is inactive on Bugzilla, so the assignee is being reset.

Assignee: tokikuc → nobody
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: