Open Bug 1843454 Opened 8 months ago Updated 1 month ago

Untenable to diagnose a situation on a webpage due to debugger performance issues

Categories

(DevTools :: Debugger, defect, P2)

defect

Tracking

(Not tracked)

People

(Reporter: twisniewski, Unassigned)

References

(Depends on 8 open bugs, Blocks 1 open bug, )

Details

I can easily lock up Firefox by debugging the scripts at https://ageor.dipot.com/2020/03/Covid-19-in-Greece.html . For instance, this is what I wanted to do, to investigate why the page throws a SecurityError on a stock nightly build, but not on a local (non-debug) artifact build:

  1. visit the site on a stock nightly build and open the debugger
  2. full-text search for "dataStudioOfflineCache"
  3. open the result
  4. pretty print (optional)
  5. put a breakpoint on this.H=(this.D=d.indexedDB)&&
  6. reload the page
  7. wait for the first breakpoint to trip
  8. see what the value of d is (it is probably a restricted window)
  9. repeat the same steps on the other build

However, the moment I start trying to set the breakpoint, Firefox starts freezing. Sometimes it recovers after 10 seconds (on my 2018 macbook), but before I can read the value of d it has almost certainly locked up completely, my CPU fans spin up, and I have to crash the app. I managed to get the nightly build, but an artifact build consistently locks up by the time I'm waiting for it to show me the scope variables.

I wonder if it would be feasible for it to not lock the process up entirely while it's doing whatever it's doing, or at least provide some sort of progress feedback so I can judge whether it has locked up entirely? By contrast the Chrome and Safari devtools manage to get me to the results with only a few seconds of delay here and there.

Update: after about 5 minutes the artifact build managed to get the scopes listed

I second bomsy conclusions on Slack.
Let me help by providing in-depth analysis.

The opened source https://www.gstatic.com/_/datastudio/_/js/k=datastudio.datastudio.fr.lSd1ASPaFTk.es6.O/d=0/rs=AHAPuoNbbYWNZkijuD11cSSB6oLLrtzh-A/m=sy6,sy3,sya,syb,sy5,sy4,syj,syk,sy9,syn,syp,syg,syx,sy14,syc,sy7,sym,syi,sy12,syf,syo,sy18,sy1e,sy1b,syd,sy15,syq,sy16,sy13,syr,sys,syt,syw,syh,sy11,sy1c,sy1l,sy2,sy8,sye,syl,syu,syv,syy,syz,sy10,sy17,sy19,sy1a,sy1d,sy1f,sy1g,sy1v,sy1k,sy23,sy1o,sy1,sy1m,sy1p,sy1x,sy22,sy24,sy1t,sy1n,sy27,sy20,sy1w,sy1j,sy28,sy1i,sy1r,sy2e,sy2j,sy2k,sy2i,pm_ng2report where dataStudioOfflineCache is defined is 1.6MB large.
This is large, but nothing really impressive for a bundle.

The STR can be simplier and stop at 3. You just need to wait for a few seconds (many tens!) and firefox will freeze.
This STR will reproduce on reload, which makes it quite bad.

Here is a profile record: https://share.firefox.dev/46WuhGM
This highlights very well what is going wrong.
The parser worker spends 13 seconds parsing this 1.6MB JS file.
The worker sends a message via postMessage so large that it takes 3seconds to complete!
This causes a 570MB memory increase in the parent proces!!!
This is quickly followed by a 3s GC pause in the worker... but also a 34s GC pause in the parent process freeing around 300MB.

While we can probably investigate and make the parser worker pass smaller data,
we should probably investigate into avoiding parsing large sources entirely, or make that on-demand.

In a some cleanup I simplified the story around worker parser by having a very unique callsite where we request the worker to parse the source.
It is now only done when selecting a source, over here:
https://searchfox.org/mozilla-central/rev/c615dc4db129ece5cce6c96eb8cab8c5a3e26ac3/devtools/client/debugger/src/actions/sources/select.js#203
We should probably review all the places where we use the parser worker data. i.e. all usages of getSymbols:
https://searchfox.org/mozilla-central/search?q=getSymbols%28&path=devtools%2Fclient%2Fdebugger%2Fsrc&case=true&regexp=false

It is notably used for:

Now, it would be great to review all the data emitted by the worker.
I think it is defined over there:
https://searchfox.org/mozilla-central/rev/c615dc4db129ece5cce6c96eb8cab8c5a3e26ac3/devtools/client/debugger/src/workers/parser/getSymbols.js#98-111
There might be attributes, which are store very large data set, which aren't used in the main thread.
Dropping all of those would be a easy and significant improvement.
For example comments doesn't seem to be used outside of the worker itself:
https://searchfox.org/mozilla-central/search?q=symbol%3A%23comments&path=debugger%2Fsrc&case=false&regexp=false

Here is some statistics about each attribute of symbols:
functions = 5510810 (5M)
callExpressions = 18516555 (18M)
memberExpressions = 42169971 (42M)
objectProperties = 8261303 (8M)
comments = 662
identifiers = 107075003 (107M !!!)
classes = 2
imports = 2
literals = 639923 (600k)
hasJsx = 5
hasTypes = 5
framework = 9

This is computed by counting the JSON size of each attribute, like this:

diff --git a/devtools/client/debugger/src/reducers/ast.js b/devtools/client/debugger/src/reducers/ast.js
index 9796815727e5c..e7257ab1a6c8d 100644
--- a/devtools/client/debugger/src/reducers/ast.js
+++ b/devtools/client/debugger/src/reducers/ast.js
@@ -40,6 +40,10 @@ function update(state = initialASTState(), action) {
         value: action.value,
         threadActorId: location.sourceActor?.thread,
       };
+      dump(" >> SYMBOLS:\n");
+      for(let i in action.value) {
+        dump(" " + i + " = "+JSON.stringify(action.value[i]).length);
+      }
       if (location.source.isOriginal) {
         state.mutableOriginalSourcesSymbols[location.source.id] = entry;
       } else {

The largest is, by far, identifiers, which is only used by inline/popup preview...
Followed by memberExpressions, same usage.
And callExpressions, which isn't used in the main thread.

Then, once such cleanups are done within the main thread, we can start investigating simplifying the worker itself.
We may have dropped unused data and be able to simplify it and make it faster to respond.

Blocks: dbg-perf
Severity: -- → S3
Priority: -- → P2
Depends on: 1851522
Depends on: 1851566

With bug 1851522 and bug 1851566, the main thread no longer freezes. We are no longer sending across thread the large parsing data.
But we still have a heavy 20s parsing operation happening in the parsing worker thread.
This may still make the browser feel slow and may delay inline preview very significantly.
Also on this page, we are still passing the 5M functions array. This may still introduce a freeze, but much smaller as we used to pass 175M arrays before. But this one array is being used by more callsites and would require more analysis.

I think the next step here is to investigate bomsy idea to limit parsing to only a subset of the source content (bug 1844318).

Depends on: 1844318

Bug 1852979, bug 1853124 and bug 1853123 may also help in a lower extend.
This should avoid a bunch of uncessary work, which may actually end up being significant if applied to a very large bundle.

Depends on: 1852979, 1853124, 1853123
Depends on: 1853294
Depends on: 1853296

Outside of the parser worker, we also have some issues around getColumnBreakpoints selector:
https://searchfox.org/mozilla-central/rev/077fc34d03b85b09add26b5f99f1a3a3a72c8720/devtools/client/debugger/src/selectors/visibleColumnBreakpoints.js#124-151
I can see many seconds spent computing the column breakpoints.
This code has some logic to filter out data in an attempt to reduce computation,
but unfortunately it ends up duplicating large data set many times and is, at the end, very slow and causes lots of GC overhead.
All this is in the main thread.

Depends on: 1854421
Depends on: 1854423
Depends on: 1854941
Depends on: 1855019
Depends on: 1855404
Depends on: 1855484

I reduced this test case on http://techno-barje.fr/fission/large-bundle/
With only one 7MB bundle. Note that using pretty printing is an important factor here.

STR:

  • open http://techno-barje.fr/fission/large-bundle/
  • open "large-bundle.js"
  • pretty print it (you may later no longer need this step as this is save over sessions)
  • set a breakpoint on line 7 var window = this;
  • reload to pause on that line

=> The debugger will take ages to fully update.

First issue is that we open the bundle first and do lots of computations on it,
later on we display the pretty printed version,
much later the breakpoint positions gets updated (lines in codemirror gutter get updated to reflect the breakable lines)
and then later you may see the inline previews.

I opened bug 1855484 to bypass all the computations made against the bundle.
It may easily make things twice as fast.

Here is a profiler record for comment 5's STR (with new fixes avoiding lots of computation against the bundle source):
https://share.firefox.dev/3RCkvo0
Important detail, I wait for some significant time before reloading so that the workers have time to do full GCs which can last for 10s+.

Here is another record, where I reload promptly after opening the debugger or setting the BP:
https://share.firefox.dev/3EWPIuk
We can see at the beginning a 10s+ GC pause in the worker.
These GC pauses are the next biggest challenge.

When we reload, we delete many objects via this code:
https://searchfox.org/mozilla-central/rev/118283c2e3eb393e9b1ca00a0c328e32eeab1c40/devtools/client/debugger/src/workers/parser/worker.js#16-21
We do remove items from various Maps.

Important notice, this worker receive a 7MB Javascript file text content, which we pass to Babel in order to compute an AST and traverse it.
So this ends up generating tons of objects, with probably a complex GC graph.
I tried instantiating a new worker on each new navigation, but I was still having very long GC pauses.

This code in devtools/client/debugger/src/workers/ has to be manually updated via:

$ cd devtools/client/debugger/src/workers/
$ yarn
$ node bin/bundle.js

We can probably try to be more conservative and try to avoid instantiating some set of objects... but Babel will always end up spawning an handful of them, which we will want to keep allocated for a bit.
So it would be interesting if there is anything we can do to help get faster GC here.

mccr8, do you happen to have any insight on how to improve the 10s spent building the CC in https://share.firefox.dev/3EWPIuk?

Flags: needinfo?(continuation)

That's interesting because it is a worker cycle collector. I think we've done approximately no work on improving worker CC performance because it hasn't come up before. 10 seconds is a lot of time! That's usually how long it takes to CC when we've leaked an entire web page. It sounds like the test case here is freeing a lot of data because of some kind of reload so maybe that's similar. It is harder to optimize the CC in a case where things are supposed to be freed, if that's the case here. The first step would be looking at a CC log to figure out what is being traversed.

Flags: needinfo?(continuation)
Depends on: 1856658

I've started writing a doc to summarize the overall quest to address the numerous performance issues highlighted by this bug report:
https://docs.google.com/document/d/1NRL-8c7o-cQdPwWLon5fu0PzShsFClgeveKxZypLkrM/edit

Depends on: 1857214
Depends on: 1859243
Depends on: 1859531
Depends on: 1873432
Depends on: 1873657
Depends on: 1874656
Depends on: 1874698
You need to log in before you can comment on or make changes to this bug.