Closed Bug 1964030 Opened 18 days ago Closed 11 days ago

Intermittently, pages just stall loading - they keep on loading and loading. Page opened in Chrome load instantly.

Categories

(Core :: Networking, defect, P1)

Tracking

RESOLVED FIXED
Tracking Status
firefox139 --- fixed
firefox140 --- fixed

People

(Reporter: mayankleoboy1, Assigned: valentin)

References

Details

(Whiteboard: [necko-triaged])

Attachments

(3 files, 1 obsolete file)

For maybe the past 2 weeks, pages intermittently stop loading: they keep on loading and loading. If I open the same page in Chrome, it loads normally.

I have captured a partial log: it was started when I was already seeing the pageload stall, but it does capture the moment when the pages started loading again.
Profile: https://share.firefox.dev/44fTKvS

Flags: needinfo?(valentin.gosu)
Summary: Intermittently, pages just stop loading (most observed on BMO) → Intermittently, pages just stall loading - they keep on loading and loading. Page opened in Chrome load instantly.
Attached file about:support

Usually, if I close the browser while pages have stalled loading, the browser won't quit cleanly. When I restart, I get the message "Another instance of firefox is running. do you want to close that".
I had filed bug 1962022 for some crashes I was seeing at the same time. So the stalling, the dirty shutdown and the occasional crash all happen together. The crashes have been fixed since bug 1962022 though.

See Also: → 1962022

Thank you for the profile. It's really odd that nothing is really happening on either the cache or the socket threads.
I did see a request for http://wpad/wpad.dat at the end, so I do wonder if there's another proxy request going on in the background, or if this is just another issue similar to bug 1937367.

@JanErik, in the profile here I do see that setupTelemetry is lasting for a very long time - Is it possible that addon initialization is waiting for telemetry and in turn that's blocking network requests?
https://share.firefox.dev/4cXKnU1

Flags: needinfo?(valentin.gosu) → needinfo?(jrediger)
Assignee: nobody → valentin.gosu
Status: NEW → ASSIGNED

Additionally I noticed that https://phabricator.services.mozilla.com/D243076 doesn't include the NS_DISPATCH_EVENT_MAY_BLOCK when dispatching the potentially blocking runnable. I don't think that would block the entire thread pool, but it's better to be safe.

Severity: -- → S3
Keywords: leave-open
Priority: -- → P1

(In reply to Valentin Gosu [:valentin] (he/him) from comment #3)

@JanErik, in the profile here I do see that setupTelemetry is lasting for a very long time - Is it possible that addon initialization is waiting for telemetry and in turn that's blocking network requests?
https://share.firefox.dev/4cXKnU1

Uhm ... not that I know of? That code has been essentially untouched for a long time and we haven't had any other reports about it.

Flags: needinfo?(jrediger)
See Also: → 1866944

(In reply to Valentin Gosu [:valentin] (he/him) from comment #5)

Additionally I noticed that https://phabricator.services.mozilla.com/D243076 doesn't include the NS_DISPATCH_EVENT_MAY_BLOCK when dispatching the potentially blocking runnable. I don't think that would block the entire thread pool, but it's better to be safe.

Well, it just directs the event to a different pool with a few more threads. But once all those threads are blocked by really blocking events, the entire pool will block.
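The exhaustion described in this comment can be pictured with a small model. This is only a sketch in standalone C++, not Gecko code: `PoolModel` and its method names are hypothetical, and the worker count used in the note below is an arbitrary assumption.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical model of a fixed-size worker pool. It only counts workers;
// it is not Gecko's thread pool and starts no real threads.
class PoolModel {
 public:
  explicit PoolModel(std::size_t workers) : mWorkers(workers) {
    assert(workers > 0);
  }

  // A task that does not return (for example a hung OS call) pins one
  // worker. Returns false if no worker was free to pick it up.
  bool StartHungTask() {
    if (mHung == mWorkers) {
      return false;
    }
    ++mHung;
    return true;
  }

  // Any other task can make progress only while a worker is still free.
  bool CanRunNormalTask() const { return mHung < mWorkers; }

  // The hung call finally returns and frees its worker.
  void HungTaskReturned() {
    if (mHung > 0) {
      --mHung;
    }
  }

 private:
  std::size_t mWorkers;
  std::size_t mHung = 0;
};
```

With `PoolModel pool(4)`, four calls to `StartHungTask()` succeed, after which `CanRunNormalTask()` returns false until `HungTaskReturned()` is called. A larger pool only raises the number of hung tasks needed to reach that state.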

Depends on: 1964064

Comment on attachment 9485021 [details]
Bug 1964030 - Add NS_DISPATCH_EVENT_MAY_BLOCK for GetPACFromDHCP runnable r=#necko

Revision D247584 was moved to bug 1964064. Setting attachment 9485021 [details] to obsolete.

Attachment #9485021 - Attachment is obsolete: true

Dispatching multiple GetPACFromDHCP runnables during a period when the system doesn't
return from the DhcpRequestParams call could potentially block all
background thread pool tasks.
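The idea of keeping at most one such runnable active can be sketched with an atomic flag. This is a minimal standalone illustration, not the landed patch: `SingleFlight`, `RunIfIdle` and `InFlight` are hypothetical names, and the task runs inline here, whereas the real work is dispatched to a background thread.

```cpp
#include <atomic>
#include <cassert>
#include <functional>

// Hypothetical guard that lets at most one task be active at a time.
class SingleFlight {
 public:
  // Runs `task` unless an earlier call is still in progress.
  // Returns true if the task ran, false if it was skipped.
  bool RunIfIdle(const std::function<void()>& task) {
    bool expected = false;
    if (!mInFlight.compare_exchange_strong(expected, true)) {
      return false;  // One task is already active; do not pile up another.
    }
    Release release{mInFlight};  // Clears the flag even if task() throws.
    task();
    return true;
  }

  bool InFlight() const { return mInFlight.load(); }

 private:
  // Resets the flag when the active task finishes.
  struct Release {
    std::atomic<bool>& flag;
    ~Release() {
      assert(flag.load());  // The flag must still be set when we release it.
      flag.store(false);
    }
  };

  std::atomic<bool> mInFlight{false};
};
```

A caller that finds the guard busy gets `false` back and can fall back to whatever it would do without a DHCP-provided PAC URL, instead of tying up another pool thread on the same hung call.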

Whiteboard: [necko-triaged]
Pushed by valentin.gosu@gmail.com: https://hg.mozilla.org/integration/autoland/rev/f3a372d30948 Make sure only one GetPACFromDHCP runnable is active at any time r=necko-reviewers,kershaw
Pushed by abutkovits@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/1e64a48e3517 Revert "Bug 1964030 - Make sure only one GetPACFromDHCP runnable is active at any time r=necko-reviewers,kershaw" for causing high frequency Gtest failures complaining about TestPACMan.
Pushed by valentin.gosu@gmail.com: https://hg.mozilla.org/integration/autoland/rev/605517b61bbd Make sure only one GetPACFromDHCP runnable is active at any time r=necko-reviewers,kershaw
Flags: needinfo?(valentin.gosu)
Status: ASSIGNED → RESOLVED
Closed: 11 days ago
Keywords: leave-open
Resolution: --- → FIXED

Dispatching multiple GetPACFromDHCP runnables during a period when the system doesn't
return from the DhcpRequestParams call could potentially block all
background thread pool tasks.

Original Revision: https://phabricator.services.mozilla.com/D247615

Attachment #9486855 - Flags: approval-mozilla-beta?

firefox-beta Uplift Approval Request

  • User impact if declined: The runnables dispatched to avoid hanging in an OS call may themselves hang, blocking the user from making any network requests.
  • Code covered by automated testing: no
  • Fix verified in Nightly: yes
  • Needs manual QE test: no
  • Steps to reproduce for manual QE testing: It's unclear exactly what causes the DhcpRequestParams call to hang on Windows - probably network changes. Not easily reproducible (at least on my machine)
  • Risk associated with taking this patch: Low
  • Explanation of risk level: These two patches apply the NS_DISPATCH_EVENT_MAY_BLOCK flag to the runnable, so it ends up in the correct thread pool, and make sure not to dispatch another one if we already have one in progress.
  • String changes made/needed: none
  • Is Android affected?: no
Attachment #9486855 - Flags: approval-mozilla-beta? → approval-mozilla-beta+