Open Bug 1796514 Opened 2 years ago Updated 2 months ago

firefox 106.0 freezes when opening not accessible html-file on smb share

Categories

(Core :: Networking, defect, P2)

Firefox 106
defect

Tracking

()

People

(Reporter: jp.richters, Unassigned)

References

(Blocks 1 open bug)

Details

(Whiteboard: [necko-triaged])

User Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:106.0) Gecko/20100101 Firefox/106.0

Steps to reproduce:

Prerequisite:

  • local HMTL-file on smb-share
  • on my setup: the smb-share is connected via a VPN with the VPN being disconnected
    (I currently cannot check if this problem also exists if there is no VPN involved, as I cannot set up a local SMB server at the moment.)

I can reproduce the problem using the following steps:

  • Open a new firefox window
  • make sure the local file is not accessible (VPN disabled)
  • Enter url to the local html-file on an smb-share using the following url-style file://///192.168.xx.xx/path/to/file.html

Notes:

  • This behavior is independent on the used firefox profile.
  • This behavior can be reproduced using firefox nightly (108.0a1 (2022-10-19) (64-Bit))
  • User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:106.0) Gecko/20100101 Firefox/106.0

Actual results:

firefox completely freezes for several seconds, all tabs + all windows. After a certain time, all windows/tabs become responsive again. No network-timeout information is displayed.

Expected results:

The tab concering the file to open should not freeze the whole browser, and a network-timeout information should be displayed.
This network-timeout information is actually displeyed when I try access a webpage hosted on a webserver via the disconnected VPN (not using file://///). In this case, the browser does not freeze.

The Bugbug bot thinks this bug should belong to the 'Core::Networking' component, and is moving the bug to that component. Please correct in case you think the bot is wrong.

Component: Untriaged → Networking
Product: Firefox → Core

Is this a recent regression?
If yes, could you use mozregression to help us find out the regression change?
Thanks.

Flags: needinfo?(jp.richters)

OK, this is definitely not a recent regression.

Using mozregression I found there has been actually a two step change of behavior:

  1. up to firefox release 2016-11-25:
    Immediate display of 'access denied' page + no freeze of other tabs
  2. from firefox release 2016-11-25 until 2019-12-14:
    indefinite loading of file (spinning ball in tab and still no message after ~ 10 min) + no freeze of other tabs
  3. firefox release 2019-12-15 until recent:
    complete freeze of firefox
Flags: needinfo?(jp.richters)

I caught this in gdb (for me it froze permanently)

#0  ___lxstat64 (vers=1, name=0x7eff45d4d048 "/mnt/local_share/example.htm", buf=0x7ffe40f29c80) at ../sysdeps/unix/sysv/linux/lxstat64.c:50
#1  0x00007eff81c4bb31 in nsLocalFile::IsSymlink(bool*) (this=<optimized out>, aResult=0x7ffe40f29d87) at /mozilla-central/xpcom/io/nsLocalFileUnix.cpp:1819
#2  0x00007eff8218c024 in nsFileChannel::Init() (this=0x7eff45d965c0) at /mozilla-central/netwerk/protocol/file/nsFileChannel.cpp:256
#3  0x00007eff8218daad in nsFileProtocolHandler::NewChannel(nsIURI*, nsILoadInfo*, nsIChannel**)
    (this=<optimized out>, uri=0x7eff45d87d60, aLoadInfo=0x7eff46050c80, result=0x7ffe40f29f28)
    at /mozilla-central/netwerk/protocol/file/nsFileProtocolHandler.cpp:200
#4  0x00007eff81e497c1 in mozilla::net::nsIOService::NewChannelFromURIWithProxyFlagsInternal(nsIURI*, nsIURI*, unsigned int, nsILoadInfo*, nsIChannel**)
    (this=<optimized out>, aURI=0x7eff45d87d60, aProxyURI=0x0, aProxyFlags=0, aLoadInfo=0x7eff46050c80, result=0x7ffe40f2a020)
    at /mozilla-central/netwerk/base/nsIOService.cpp:1169
#5  0x00007eff81e59f0e in NS_NewChannelInternal(nsIChannel**, nsIURI*, nsILoadInfo*, mozilla::dom::PerformanceStorage*, nsILoadGroup*, nsIInterfaceRequestor*, unsigned int, nsIIOService*)
   
     (outChannel=outChannel@entry=0x7ffe40f2a0e0, aUri=aUri@entry=0x7eff45d87d60, aLoadInfo=aLoadInfo@entry=0x7eff46050c80, aPerformanceStorage=aPerformanceStorage@entry=0x0, aLoadGroup=aLoadGroup@entry=0x0, aCallbacks=0x7eff45d53be0, aLoadFlags=2686980, aIoService=0x7eff45d4d048)
    at /mozilla-central/netwerk/base/nsNetUtil.cpp:277
#6  0x00007eff86567d73 in nsDocShell::CreateRealChannelForDocument(nsIChannel**, nsIURI*, nsILoadInfo*, nsIInterfaceRequestor*, unsigned int, nsTSubstring<char16_t> const&, nsIURI*) (aChannel=aChannel@entry=0x7ffe40f2a1f8, aURI=0x7eff45d87d60, aLoadInfo=aLoadInfo@entry=0x7eff46050c80, aCallbacks=0x277751cf3c4d, 
    aCallbacks@entry=0x7eff4670c760, aLoadFlags=aLoadFlags@entry=2686980, aSrcdoc=..., aBaseURI=0x0) at /mozilla-central/docshell/base/nsDocShell.cpp:9776
#7  0x00007eff8656845f in nsDocShell::CreateAndConfigureRealChannelForLoadState(mozilla::dom::BrowsingContext*, nsDocShellLoadState*, mozilla::net::LoadInfo*, nsIInterfaceRequestor*, nsDocShell*, mozilla::OriginAttributes const&, unsigned int, unsigned int, nsresult&, nsIChannel**)

I haven't looked at this long enough to figure out how to fix it, but blocking main thread IO is bad :(

This is far from ideal, and I can't think of an easy way to fix it.
Maybe Nika has an idea for what to do here. Fortunately this doesn't seem to affect too many people.

Severity: -- → S3
Flags: needinfo?(nika)
Priority: -- → P2
Whiteboard: [necko-triaged]

Looks like this is happening because of https://searchfox.org/mozilla-central/rev/968bd894205cf4f579d94ac4e175cc3187458605/netwerk/protocol/file/nsFileChannel.cpp#242-244,256. It seems like we quite explicitly immediately check for a symlink target, and then replace the URI of the channel as we're creating it when loading a symlink.

Perhaps this is something which could instead be handled within the file channel by doing a redirect, rather than checking for the symlink synchronously on the main thread? After doing an AsyncOpen we could do the IsSymlink and GetNativeTarget calls on a background thread before calling OnStartRequest, and then trigger a redirect to a different channel after following it rather than mutating the channel in-place.

I imagine this would require some changes to the current setup, as I think we currently open the channel with OpenContentStream (https://searchfox.org/mozilla-central/rev/968bd894205cf4f579d94ac4e175cc3187458605/netwerk/protocol/file/nsFileChannel.cpp#327), which I believe will also run on the main thread, making it tricky to do async work like that before OnStartRequest.

That being said, I don't think that is the entirely of the main thread I/O which we can theoretically do when opening a file URI. A quick look at the implementation of OpenContentStream shows two calls:

https://searchfox.org/mozilla-central/rev/968bd894205cf4f579d94ac4e175cc3187458605/netwerk/protocol/file/nsFileChannel.cpp#340-341

if (NS_SUCCEEDED(fileHandler->ReadURLFile(file, getter_AddRefs(newURI))) ||
    NS_SUCCEEDED(fileHandler->ReadShellLink(file, getter_AddRefs(newURI)))) {

These appear to synchronously open and read the given files if they are windows .lnk/.url or xdg .desktop files, returning a new URI to redirect to instead of loading the original file URI. This would also probably be best to do off-main-thread, as we presumably don't want to be doing sync I/O to read these files on the main thread either.

If there happens to be an upload stream, the call to NS_NewLocalFileOutputStream also appears to not specify nsIFileOutputStream::DEFER_OPEN, meaning it will also open/create the file on the main thread (https://searchfox.org/mozilla-central/rev/968bd894205cf4f579d94ac4e175cc3187458605/netwerk/protocol/file/nsFileChannel.cpp#363-365).

Fortunately it does appear that we do content length fixups off-main-thread when async (https://searchfox.org/mozilla-central/rev/968bd894205cf4f579d94ac4e175cc3187458605/netwerk/protocol/file/nsFileChannel.cpp#394-402), so for most file URI loads, (which don't end in .link/.url or .desktop), the only main thread I/O we should probably do is checking if it's a symlink.

I don't quite remember off the top of my head how difficult it would be to do this work before OnStartRequest and handle it uniformly as a redirect, and it seems like whatever we do we should probably add a test-case for a symlink, as it appears uncovered right now. A quick scan doesn't show any existing channels which do this with nsBaseChannel, but it might be possible to do with some combination of ListenerBlockingPromise and Redirect. It might also be possible to do using BeginAsyncRead, but I don't know how that interacts with redirects off the top of my head.

ni? back to :valentin for any insight into what the best strategy for this kind of situation would be from the necko side.

EDIT: Actually it looks like we do some more main-thread I/O even in the async case to check if the URL is referencing a file or a directory: https://searchfox.org/mozilla-central/rev/968bd894205cf4f579d94ac4e175cc3187458605/netwerk/protocol/file/nsFileChannel.cpp#293, so it's not just the symlink.

Flags: needinfo?(nika) → needinfo?(valentin.gosu)

Perhaps this is something which could instead be handled within the file channel by doing a redirect, rather than checking for the symlink synchronously on the main thread? After doing an AsyncOpen we could do the IsSymlink and GetNativeTarget calls on a background thread before calling OnStartRequest, and then trigger a redirect to a different channel after following it rather than mutating the channel in-place.

I like this suggestion - redirects with nsBaseChannel isn't a common place thing, but I don't see any other way around the main thread IO.

Flags: needinfo?(valentin.gosu)
Status: UNCONFIRMED → RESOLVED
Closed: 2 years ago
Resolution: --- → FIXED
See Also: → 862383
Status: RESOLVED → REOPENED
Ever confirmed: true
Resolution: FIXED → ---
Status: REOPENED → NEW
Whiteboard: [necko-triaged] → [necko-triaged][necko-priority-review]
Whiteboard: [necko-triaged][necko-priority-review] → [necko-triaged]
Duplicate of this bug: 1913810
You need to log in before you can comment on or make changes to this bug.