Closed Bug 1540917 Opened 6 years ago Closed 3 years ago

With strict content-blocking, course pages at business.unsw.edu.au trigger a content-process hang (with error-handling recursion death-spiral, in site's JS)

Categories

(Web Compatibility :: Site Reports, defect, P3)

Tracking

(firefox87 affected)

RESOLVED WORKSFORME

People

(Reporter: bmo, Assigned: twisniewski)

Details

(4 keywords)

User Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:68.0) Gecko/20100101 Firefox/68.0

Steps to reproduce:

Visit "https://www.business.unsw.edu.au/degrees-courses/course-outlines/INFS1609".

Actual results:

"Webpage is slowing down your browser" info bar appears while loading the page.

Expected results:

Page loads and works normally.

Tested the page to work normally on Edge.

Confirmed.

Component: Untriaged → Performance
Product: Firefox → Core
Whiteboard: [qf]

Here are some profiles I captured while reproducing:

regular profile, with WebRender + uBlock origin: https://perfht.ml/2U72VqD
fresh profile: https://perfht.ml/2U771PF

I could not reproduce with a fresh profile.

Here's a profile I captured (though I could not reproduce the info-bar -- I just saw a slow-ish pageload, possibly just due to slow network roundtrips): https://perfht.ml/2Z0GtmU

Component: Performance → JavaScript Engine
Keywords: topperf
Whiteboard: [qf] → [qf:p3:responsiveness]

Winson & Virtual_ManPL, do you have a content-blocker (like uBlock Origin or Adblock Plus) installed? Can you still reproduce the issue if you disable those?

Flags: needinfo?(winson.wen1)

No, I don't have a content-blocker installed. Strangely enough, in a new profile I can only reproduce this in private mode; maybe caching is affecting it? On my existing profile it still freezes on both my laptop and desktop, in normal and private mode.

Flags: needinfo?(winson.wen1)

Thanks! That is a really useful observation -- I can indeed reproduce in Private Browsing mode in a new profile.

I can also reproduce in a normal window (non-Private) if I enable "Strict" tracking protection in Firefox preferences, which makes normal windows as strict as Private windows, from a content-blocking perspective.

So, it seems we're blocking some tracking/analytics library that the site uses, and the site has a pretty catastrophic failure mode if that resource is blocked. It's one of these two resources, based on my browser console:

The resource at “https://www.googletagmanager.com/gtm.js?id=GTM-5JB423” was blocked because content blocking is enabled.
The resource at “https://ws.sharethis.com/button/buttons.js” was blocked because content blocking is enabled.

For now, I think it's safe to assume that other browsers would have similar issues if they blocked the same tracking resource.

This is reminiscent of bug 1516552 (though more catastrophic)...

--> Reclassifying as Core/Privacy:Anti-tracking, since that's what's causing this (combined with pathological/fragile behavior on the part of this site itself). It's possible we can ameliorate this with a surrogate as discussed in bug 1516552.

Component: JavaScript Engine → Privacy: Anti-Tracking
See Also: → 1516552
Summary: Basic page consistently triggering "Webpage is slowing down your browser" info bar → With strict content-blocking, course pages at business.unsw.edu.au trigger a content-process hang (with horrible error-handling recursion in site's JS)

Might be more of a developer outreach/web compat thing?

Yeah, really it probably is.

To add a bit more detail: this is effectively an "A calls B which calls A which calls B" recursive death spiral.

So what happens here is:
(1) We block the ShareThis buttons JS file (via strict content blocking)
(2) The site uses the variable from that JS file without checking whether it's defined, which triggers the following JS error:
"ReferenceError: stLight is not defined"
(3) This invokes the site's window.onerror handler (set up in their init.js file), which is called ULSOnError, which is really just a wrapper for their function ULSSendExceptionImpl. So now we're in that function to handle the error.
(4) They skip the majority of that function because ULS.enable is undefined (and the majority of the function is wrapped in a check for it), so they jump straight to the final line which is:

  return Boolean(ULS) && Boolean(ULS.OriginalOnError) ? ULS.OriginalOnError(c, a, String(b))  : false

This ends up being a recursive call into ULS.OriginalOnError, which is an alias for the ULSOnError function. So: we're recursively calling into the function from (3) above. And we continue into a recursive death spiral.

I'm guessing the site doesn't expect that ULS.OriginalOnError will be an alias for ULSOnError, so it's trying to invoke some other error-handler in a non-recursive way. But that is in fact what ULS.OriginalOnError is in this case.
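To make the loop concrete, here is a stripped-down sketch of the pattern described in steps (3) and (4). It is not the site's code: `ULS` is modeled as a plain object, the `ULSOnError.caller` argument is replaced with `null`, and the `calls` counter and `outcome` variable are added purely for illustration. It assumes an engine without tail-call elimination (true of Node and Firefox as far as I know).

```javascript
// Hypothetical reduction of the two functions; ULS is a stand-in object.
const ULS = { enable: undefined, OriginalOnError: null };
let calls = 0; // illustration only: counts entries into ULSSendExceptionImpl

function ULSSendExceptionImpl(c, a, b, d) {
  calls++;
  // ULS.enable is undefined, so the reporting branch is skipped entirely.
  if (Boolean(ULS) && ULS.enable) {
    ULS.enable = false;
  }
  // The final line: chain to the "original" handler if there is one.
  return Boolean(ULS) && Boolean(ULS.OriginalOnError)
    ? ULS.OriginalOnError(c, a, String(b))
    : false;
}

function ULSOnError(b, c, a) {
  return ULSSendExceptionImpl(b, c, a, null);
}

// The alias that closes the loop: the "original" handler is the handler itself.
ULS.OriginalOnError = ULSOnError;

let outcome;
try {
  ULSOnError("ReferenceError: stLight is not defined", "page.js", 1);
  outcome = "returned";
} catch (e) {
  // The recursion only stops when the engine gives up on the stack.
  outcome = e.name;
}
console.log(outcome, calls);
```

Neither function has a base case once `ULS.OriginalOnError` points back at `ULSOnError`, so the call never returns normally; it ends only when the engine aborts it with a stack-overflow error.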

Here's the full snippet of JS in question (the two ping-ponging functions):

function ULSSendExceptionImpl(c, a, b, d) {
  if (Boolean(ULS) && ULS.enable) {
    ULS.enable = false;
    window.onerror = ULS.OriginalOnError;
    ULS.WebServiceNS = 'http://schemas.microsoft.com/sharepoint/diagnostics/';
    try {
      ULS.message = c;
      if (a.indexOf('?') != - 1) a = a.substr(0, a.indexOf('?'));
      ULS.file = a.substr(a.lastIndexOf('/') + 1);
      ULS.line = b;
      ULS.teamName = '';
      ULS.originalFile = '';
      ULS.callStack = '<stack>\n' + ULSGetCallstack(d) + '</stack>';
      ULS.clientInfo = '<client>\n' + ULSGetClientInfo() + '</client>';
      ULSSendReport(true)
    } catch (e) {
    }
  }
  return Boolean(ULS) && Boolean(ULS.OriginalOnError) ? ULS.OriginalOnError(c, a, String(b))  : false
}
function ULSOnError(b, c, a) {
  return ULSSendExceptionImpl(b, c, a, ULSOnError.caller)
}

https://www.business.unsw.edu.au/_layouts/15/init.js

Note that each function's return statement invokes the other function (since ULS.OriginalOnError is an alias for ULSOnError).
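For what it's worth, one way a site could defend against this kind of self-aliasing is to refuse to chain to itself and to bail out if it is re-entered. The sketch below is only an illustration under that assumption; `state`, `onErrorSafe`, and the `reports` array are hypothetical names, not anything from init.js.

```javascript
// Hypothetical handler state; `previous` plays the role of ULS.OriginalOnError.
const state = { previous: null, handling: false, reports: [] };

function onErrorSafe(message, file, line) {
  // Reentrancy guard: if we are already handling an error, do not recurse.
  if (state.handling) return false;
  state.handling = true;
  try {
    state.reports.push({ message, file, line });
    const prev = state.previous;
    // Chain only to a handler that is genuinely a different function.
    return typeof prev === "function" && prev !== onErrorSafe
      ? prev(message, file, line)
      : false;
  } finally {
    state.handling = false;
  }
}

// Simulate the problematic setup: the "previous" handler is the handler itself.
state.previous = onErrorSafe;
const selfAliased = onErrorSafe("ReferenceError: stLight is not defined", "page.js", 1);

// With a distinct previous handler, chaining still works.
state.previous = () => true;
const chained = onErrorSafe("some other error", "page.js", 2);

console.log(selfAliased, chained, state.reports.length);
```

Either check alone would break the cycle in this reduced case; having both also covers the situation where the previous handler is a different wrapper that eventually calls back in.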

Component: Privacy: Anti-Tracking → Desktop
Product: Core → Web Compatibility
Summary: With strict content-blocking, course pages at business.unsw.edu.au trigger a content-process hang (with horrible error-handling recursion in site's JS) → With strict content-blocking, course pages at business.unsw.edu.au trigger a content-process hang (with error-handling recursion death-spiral, in site's JS)
Version: 68 Branch → unspecified

(In reply to Daniel Holbert [:dholbert] from comment #6)
> The resource at “https://www.googletagmanager.com/gtm.js?id=GTM-5JB423” was blocked because content blocking is enabled.
> The resource at “https://ws.sharethis.com/button/buttons.js” was blocked because content blocking is enabled.
>
> For now, I think it's safe to assume that other browsers would have similar issues if they blocked the same tracking resource.
>
> This is reminiscent of bug 1516552 (though more catastrophic)...
>
> --> Reclassifying as Core/Privacy:Anti-tracking, since that's what's causing this (combined with pathological/fragile behavior on the part of this site itself). It's possible we can ameliorate this with a surrogate as discussed in bug 1516552.

No, totally unrelated to that bug. The first script listed here is a GTM script which is basically some arbitrary site-specific JS, not something that can be shimmed in a generic way. There is no shim for the sharethis.com URL in https://github.com/uBlockOrigin/uAssets/blob/master/filters/resources.txt. So bug 1516552 will be of no help here whatsoever.

See Also: 1516552
Priority: -- → P3

The issue is not reproducible with ETP - Standard, but still occurs with ETP - Strict.
https://prnt.sc/xtfocn

Tested with:
Browser / Version: Firefox Nightly 87.0a1 (2021-01-29)
Operating System: Windows 10 Pro

Whiteboard: [qf:p3:responsiveness] → [qf:p3:responsiveness]
Assignee: nobody → twisniewski
Performance Impact: --- → P3
Whiteboard: [qf:p3:responsiveness]

I was not able to reproduce the issue anymore with ETP - Strict enabled or Private Window.
https://prnt.sc/yCqmYsh2FcHz

Tested with:
Browser / Version: Firefox Nightly 103.0a1 (2022-06-15)
Operating System: Windows 10 Pro

Winson, can you still reproduce it on your side?

Status: NEW → RESOLVED
Closed: 3 years ago
Performance Impact: P3 → ---
Flags: needinfo?(winson.wen1)
Resolution: --- → FIXED

Just tested with Nightly 103.0a1, the page still freezes for a while when loading with strict tracking protection enabled, and loads almost instantly with tracking protection disabled.

I don't see the "Webpage is slowing down your browser" info bar anymore, though I've upgraded my computer since filing the bug so maybe my computer is just fast enough now that it doesn't reach the threshold to trigger.

Flags: needinfo?(winson.wen1)

I do see messages about too much recursion here in Strict ETP mode, so I'll take a look ASAP (it might be that something is being blocked by ETP which their code isn't tolerant of being blocked).

Status: RESOLVED → REOPENED
Resolution: FIXED → ---

No, actually this is happening regardless of ETP. In fact I see it happening in Chrome as well. Apparently their scripts rely on an Intercom variable which doesn't exist, and their error-handling scripts get into an infinite loop trying to handle that error, which is eventually cancelled by the browser. I'm not sure what can be done here to improve the situation on the browser's end; this is something the site should fix.
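As a rough illustration of the kind of site-side fix meant here: checking for the third-party global before using it avoids the initial ReferenceError altogether. The identifier `Intercom`, the `"update"` argument, and the function name `callIfDefined` below are placeholders for whatever the site's scripts actually expect; I have not checked which variable or call they use.

```javascript
// Hypothetical guard around an optional third-party global.
function callIfDefined() {
  // `typeof` on an undeclared identifier yields "undefined" instead of throwing.
  if (typeof Intercom === "undefined") {
    return "skipped";
  }
  Intercom("update"); // placeholder call
  return "called";
}

// The global is absent (blocked or never loaded): no error is raised.
const whenMissing = callIfDefined();

// Simulate the third-party script having loaded and defined the global.
const received = [];
globalThis.Intercom = (command) => { received.push(command); };
const whenPresent = callIfDefined();

console.log(whenMissing, whenPresent, received);
```

With a guard like this the page's own code never throws, so the error handler (and its recursion) is never entered for this case.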

Status: REOPENED → RESOLVED
Closed: 3 years ago
Resolution: --- → WORKSFORME