Open Bug 1795657 Opened 5 months ago Updated 5 months ago

Nightly start fails to restore - no websites load - popup force quit or wait

Categories

(Core :: Networking, defect)

Firefox 107
x86_64
Linux
defect

Tracking

()

UNCONFIRMED
Performance Impact ?

People

(Reporter: gjunk1, Unassigned, NeedInfo)

Details

User Agent: Mozilla/5.0 (X11; Linux x86_64; rv:107.0) Gecko/20100101 Firefox/107.0

Steps to reproduce:

Nightly build 20221017 on Linux: on start, windows and tabs are restored but all are blank and nothing loads. A popup periodically offers to wait or force quit; waiting doesn't help.

Actual results:

No websites load at all.
Going back to Nightly 20221014 works normally.

The Bugbug bot thinks this bug should belong to the 'Core::Networking' component, and is moving the bug to that component. Please correct in case you think the bot is wrong.

Component: Untriaged → Networking
Product: Firefox → Core
OS: Unspecified → Linux
Hardware: Unspecified → x86_64

Hi Reporter,

Could you try to capture an HTTP log?
We'd like to know what happens during startup, so please set the environment variables before starting Firefox.
Note that the log file may contain private information, so you can send it to necko@mozilla.com directly.
Thanks.
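
For reference, a minimal sketch of what setting those variables could look like. It assumes the MOZ_LOG and MOZ_LOG_FILE variables and the module list from Mozilla's HTTP logging guide; the log path and the Firefox binary name are placeholders, not values taken from this bug.

```python
import os
import subprocess

# Module list assumed from Mozilla's HTTP logging guide; adjust as needed.
HTTP_LOG_MODULES = (
    "timestamp,rotate:200,nsHttp:5,cache2:5,"
    "nsSocketTransport:5,nsHostResolver:5"
)


def build_logging_env(log_file, base_env=None):
    """Return a copy of the environment with HTTP logging enabled."""
    env = dict(os.environ if base_env is None else base_env)
    env["MOZ_LOG"] = HTTP_LOG_MODULES
    env["MOZ_LOG_FILE"] = log_file
    return env


def launch_with_http_log(firefox_binary, log_file):
    """Start Firefox with logging enabled (the binary path is a placeholder)."""
    return subprocess.Popen([firefox_binary], env=build_logging_env(log_file))
```

A call such as launch_with_http_log("firefox", "/tmp/firefox-http.log") would then start the browser with logging active from the first moment of startup; both arguments are hypothetical.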

Flags: needinfo?(gjunk1)

Yes, I can create a log, but first I was trying to find a minimal case that triggers the problem (to help deal with privacy). In doing so I discovered that reducing the number of windows and tabs that auto-start (each window has multiple tabs) makes the problem go away, so it's related to the total number of windows and/or tabs at startup. I have window restore enabled, obviously.

Original case with problem has 22 windows and this number of tabs in each window:
(13, 9, 5, 1, 18, 5, 9, 7, 43, 6, 4, 4, 8, 6, 7, 1, 7, 1, 3, 11, 3, 1)

After reducing to 14 windows, it starts up and restores everything as usual:
(11, 9, 3, 1, 43, 5, 18, 6, 4, 7, 7, 1, 1, 3)

It appears to me that connections are only made from 1 (selected) tab per window, and I assume the rest are pulled from cache (?). So my guess is that the triggering condition is more the number of windows than the number of tabs, assuming cache restore doesn't interfere in some way with the connections, or that it isn't a threads issue etc.
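
To put numbers on the two cases, here is a small sketch that totals the per-window tab counts listed above. The variable and function names are only illustrative.

```python
# Tabs per window, copied from the two session layouts in this comment.
failing = [13, 9, 5, 1, 18, 5, 9, 7, 43, 6, 4, 4, 8, 6, 7, 1, 7, 1, 3, 11, 3, 1]
working = [11, 9, 3, 1, 43, 5, 18, 6, 4, 7, 7, 1, 1, 3]


def summarize(tabs_per_window):
    """Return the window count and total tab count for one session layout."""
    return {"windows": len(tabs_per_window), "tabs": sum(tabs_per_window)}


print(summarize(failing))  # {'windows': 22, 'tabs': 172}
print(summarize(working))  # {'windows': 14, 'tabs': 119}
```

Both totals drop between the failing and working cases (22 to 14 windows, 172 to 119 tabs), so these two data points alone cannot separate a window-count threshold from a tab-count threshold.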

Flags: needinfo?(gjunk1)

For the failed case with 22 windows there are 12 log files created, 6 of which are empty: 11 are xx-child-N and 1 is xxx.moz_log.0.
For the smaller case with 14 windows, which succeeded, there are 35 log files, 5 of which are empty: 31 are xx-child and 4 are xxx.moz_log.[0 - 3].
I need to carefully go through for privacy before sending anything.
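
A counting pass like the one above could be scripted roughly as follows. This is only a sketch: it assumes all log files sit in one directory and that child-process logs can be recognized by "child" in the file name, as in the xx-child-N names; the function name is made up for illustration.

```python
from pathlib import Path


def summarize_logs(log_dir):
    """Count log files in log_dir, split by emptiness and by process type."""
    files = [p for p in Path(log_dir).iterdir() if p.is_file()]
    empty = [p for p in files if p.stat().st_size == 0]
    # Assumption: child-process logs contain "child" in their file name.
    child = [p for p in files if "child" in p.name]
    return {
        "total": len(files),
        "empty": len(empty),
        "child": len(child),
        "parent": len(files) - len(child),
    }
```

Running it on the log directory of each case would give the same kind of breakdown without opening any file, so nothing private has to be read to produce it.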

Perhaps the clue about number of windows is enough

If this is about the number of opened windows and tabs, maybe this is not a networking bug. I think Firefox only tries to load the foreground tab at startup (please correct me if I am wrong).
I am not sure if a profiler link would be helpful.

Andrew, do you have any idea about how to proceed?
Thanks.

Flags: needinfo?(acreskey)

I agree, Kershaw: a startup profile would be a good next step.

Gene, if you are able to capture a startup profile, it would be helpful in isolating the problem.
The instructions are here

It looks like this started happening only recently, so another option would be to use mozregression to pinpoint the regression range.
Instructions here:
https://mozilla.github.io/mozregression/quickstart.html
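
As a rough illustration, the sketch below turns nightly build IDs into a mozregression invocation using the --good and --bad date options described in the quickstart. The good build 20221014 comes from the report; the bad build is assumed here to be 20221017, and the helper names are hypothetical.

```python
from datetime import datetime


def build_id_to_date(build_id):
    """Convert a YYYYMMDD nightly build ID to the YYYY-MM-DD date form."""
    return datetime.strptime(build_id, "%Y%m%d").strftime("%Y-%m-%d")


def mozregression_command(good_build, bad_build):
    """Build the argument list for a bisection between two nightlies."""
    return [
        "mozregression",
        "--good", build_id_to_date(good_build),
        "--bad", build_id_to_date(bad_build),
    ]


# Assumed bad build; the good build is the one reported as working.
print(" ".join(mozregression_command("20221014", "20221017")))
```

The printed line can be pasted into a terminal, or the list passed to subprocess.run, once mozregression is installed.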

Either of those would be very helpful.

Performance Impact: --- → ?
Flags: needinfo?(acreskey)

Well, I have the log from the original failed case, and while I have not had time to check and cleanse it, I do see DNS succeeding and what might be HTTP exchanges: things like 'X/nsHttp' where X is one of (D, E, V) and 'Y/nsSocketXXX' with Y one of (D, E), including many for normandy.cdn.mozilla.net as well as others. With 1/2 million lines, I'm not doing justice to the extent of it, which includes things like V/nsHttp pruning [ci=.S.......[tlsflags0x00000000]www.xxx

Anyway, if there's something specific I can look for in the logs that you think might be useful I'd be happy to grep stuff out. Again I apologize for not yet finding time to go through and check it all.
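
One low-risk way to summarize a log of that size without sharing its contents might be to tally lines by level and module. The sketch below assumes each log line carries a tag of the form level/module (for example D/nsHttp), as in the snippets quoted above; the regular expression and function name are illustrative only.

```python
import re
from collections import Counter

# Matches tags such as "D/nsHttp" or "E/nsSocketTransport".
TAG_RE = re.compile(r"\b([DEVIW])/(\w+)")


def tally_tags(lines):
    """Count log lines per (level, module) pair, using the first tag found."""
    counts = Counter()
    for line in lines:
        match = TAG_RE.search(line)
        if match:
            counts[(match.group(1), match.group(2))] += 1
    return counts
```

Passing an open log file to tally_tags and printing counts.most_common() would show which modules and levels dominate, and the result contains no URLs or hostnames.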

Also, I cannot vouch that it is a regression, in the sense that I recently "increased" the number of windows, so it's entirely possible this has been a longer-standing problem that went unnoticed until the number of windows crossed a threshold somewhere around 20. Possibly.
At this point I'd need to re-create the larger number of windows, as in my efforts to find the minimal case, I no longer have 22 windows.

I would think it might be helpful if anyone else could also try starting up 22 windows with a few tabs in each, then quit and restart, and see whether it's reproducible that way (and yes, I should do the same too :)). Since it was blocked and unable to load, it appears that 22 empty windows may not cut it; it's a bit more subtle than that.

profiling looks like it will generate an even larger file for me to go through for privacy etc.
Be nice if there was a clean simple reproducer - which I don't yet have sorry to say.

Sorry, typo: it's only 1/4 million lines, and many seem to be quite benign ... I still need to exercise some care on my end, hope you understand.

(In reply to gene bug from comment #8)

profiling looks like it will generate an even larger file for me to go through for privacy etc.
Be nice if there was a clean simple reproducer - which I don't yet have sorry to say.

When sharing the profile you have the option of scrubbing private info by leaving these options unchecked, under "Include additional data that may be identifiable":

  • Include hidden threads
  • Include hidden time range
  • Include screenshots
  • Include resource URLs and paths
  • Include extension information

Even a scrubbed-down profile is a good place to start. You don't have to download it; instead, you can upload it and share the link (via email if you are still concerned).

Btw, it would be good to make a copy of the profile that triggers this problem, in case it becomes harder to reproduce in the future.
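
A minimal sketch of such a copy, assuming Firefox is closed while it runs. The function name and the "-backup" suffix are arbitrary choices for illustration, and the profile location has to be supplied by the caller.

```python
import shutil
from pathlib import Path


def backup_profile(profile_dir, backup_root):
    """Copy a profile directory into backup_root and return the new path."""
    source = Path(profile_dir)
    destination = Path(backup_root) / (source.name + "-backup")
    # copytree raises FileExistsError if the destination already exists,
    # which avoids silently overwriting an earlier backup.
    shutil.copytree(source, destination)
    return destination
```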

Alternatively, you could also create a dump of the hanging Firefox process [1] and analyze the backtrace yourself with WinDbg [2].
[1] https://support.kaspersky.com/common/diagnostics/12401
[2] https://learn.microsoft.com/en-us/windows-hardware/drivers/debugger/debugger-download-tools

Thanks!

Flags: needinfo?(gjunk1)