Closed Bug 1596390 Opened 6 years ago Closed 2 years ago

Very poor performance for connecting to Firefox and closing the connection

Categories

(Remote Protocol :: CDP, defect, P3)

defect

Tracking

(Not tracked)

RESOLVED WONTFIX

People

(Reporter: whimboo, Unassigned)

References

(Blocks 1 open bug)

Details

(Whiteboard: [puppeteer-beta2-mvp])

I run the unit tests for the chrome-remote-interface client and our performance in connecting and closing a connection is very poor:

  closing a connection
    with callback
      ✓ should allow a subsequent new connection (1676ms)
    without callback
      ✓ should allow a subsequent new connection (1608ms)

  connecting to Chrome
    with callback
      with default parameters
        ✓ should succeed with "connect" callback passed as an argument (781ms)
      with custom parameters
        ✓ should succeed with "connect" callback passed as an argument (925ms)
        ✓ should succeed with custom target by index (773ms)
        ✓ should succeed with custom target by function (793ms)
        ✓ should succeed with custom target by object (783ms)
        ✓ should succeed with custom target by full URL (779ms)
        ✓ should succeed with custom target by partial URL (776ms)

With Chrome no timings are printed given that it is that fast:

  closing a connection
    with callback
      ✓ should allow a subsequent new connection (55ms)
    without callback
      ✓ should allow a subsequent new connection

  connecting to Chrome
    with callback
      with default parameters
        ✓ should succeed with "connect" callback passed as an argument
      with custom parameters
        ✓ should succeed with "connect" callback passed as an argument
        ✓ should succeed with custom target by index
        ✓ should succeed with custom target by function
        ✓ should succeed with custom target by object
        ✓ should succeed with custom target by full URL
        ✓ should succeed with custom target by partial URL
        ✓ should succeed with custom target by id
      with custom (wrong) parameters
        ✓ should fail (wrong port)
        ✓ should fail (wrong host)
        ✓ should fail (wrong target)

Ok, the above numbers weren't that fair. Those have been for a debug artifact build of Firefox. Here the numbers for a Firefox Nightly build:

  closing a connection
    with callback
      ✓ should allow a subsequent new connection (215ms)
    without callback
      ✓ should allow a subsequent new connection (206ms)

  connecting to Chrome
    with callback
      with default parameters
        ✓ should succeed with "connect" callback passed as an argument (100ms)
      with custom parameters
        ✓ should succeed with "connect" callback passed as an argument (101ms)
        ✓ should succeed with custom target by index (100ms)
        ✓ should succeed with custom target by function (98ms)
        ✓ should succeed with custom target by object (98ms)
        ✓ should succeed with custom target by full URL (99ms)
        ✓ should succeed with custom target by partial URL (96ms)

But it's still way slower than Chrome.

Here is a profile as generated with the Gecko profiler: https://perfht.ml/2QfXvLE

As it looks like most of the time is spent in httpd.js close() that itself fully seems to wait on a 1000ms GC.

Going up one more frame which is _connectionClosed, we have a forced GC at the very end, which perfectly explains this massive delay.

Andrew, is there a way to make this GC async? Or which other option might we have to not have to wait for it?

It's a bummer that we have to depend on this server..

Flags: needinfo?(continuation)

And there is still the question why the GC takes that long. Is there a GC logger via MOZ_LOG?

It looks like it was added in bug 508128 because the GC was not running enough and it was using a ton of memory. Balancing running the GC enough but not too much is the age old problem of GC, especially in testing. I guess it makes sense to do some cleanup when you close a connection. Maybe you could start an incremental GC if one hasn't started instead? IGC was not available in 2009. It'll still end up doing the same amount of work, but it will be spread out more. It might also be more efficient if you are closing a bunch of connections at once. :Jonco might have some more ideas.

Flags: needinfo?(continuation)

How would I force an incremental GC? Looks like I cannot find a synonymous method in xpccomponents.idl.

So it is not unusual that it would run 1s? I mean it's quiet a long time, and I really wonder if we have some code in the Remote Protocol which performs badly when it comes to GC.

I think we also wouldn't be worried at all to just have an forceGC option for HTTPd, so that we can turn it off completely for our case. Or the other way around, and test harnesses would have to opt-in to a GC. Also I wonder why it actually cannot be triggered by the test harnesses itself but has to be done in _connectionClosed of httpd.js? Maybe we haven't had this option in the past?

Flags: needinfo?(josh)
Flags: needinfo?(gbrown)

I don't have anything useful to add here; it's been ~9 years since I last looked at httpd.js :)

Flags: needinfo?(josh)

Sorry, I know very little about httpd.js.

Flags: needinfo?(gbrown)

Yes, we still make use of httpd.js inside of xpcshell:
https://searchfox.org/mozilla-central/search?q=httpd.js&case=false&regexp=false&path=xpcshell

Jonco, any ideas from your side? Please see comments 4, 5 and 7. Thanks

Flags: needinfo?(jcoppeard)

(In reply to Henrik Skupin (:whimboo) [⌚️UTC+2] from comment #9)
Maybe someone can try removing the forceGC() call and see whether it makes any difference to memory usage. Nowadays we should trigger GC automatically and this shouldn't be necessary.

Flags: needinfo?(jcoppeard)
Depends on: 1597227
Priority: P2 → P3
Whiteboard: [puppeteer-alpha] → [puppeteer-alpha-reserve]
Priority: P3 → P2
Whiteboard: [puppeteer-alpha-reserve] → [puppeteer-beta-mvp]

With the upcoming changes to httpd.js we will be twice that fast. Compare with the numbers in comment 1:

  closing a connection
    with callback
      ✓ should allow a subsequent new connection (124ms)
    without callback
      ✓ should allow a subsequent new connection (101ms)

  connecting to Chrome
    with callback
      with default parameters
        ✓ should succeed with "connect" callback passed as an argument (49ms)
      with custom parameters
        ✓ should succeed with "connect" callback passed as an argument (51ms)
        ✓ should succeed with custom target by index (55ms)
        ✓ should succeed with custom target by function (48ms)
        ✓ should succeed with custom target by object (49ms)
        ✓ should succeed with custom target by full URL (48ms)
        ✓ should succeed with custom target by partial URL (51ms)

But it's still slow and we need further tweaks.

Here updated numbers for debug builds:

  closing a connection
    with callback
      ✓ should allow a subsequent new connection (746ms)
    without callback
      ✓ should allow a subsequent new connection (739ms)

  connecting to Chrome
    with callback
      with default parameters
        ✓ should succeed with "connect" callback passed as an argument (349ms)
      with custom parameters
        ✓ should succeed with "connect" callback passed as an argument (352ms)
        ✓ should succeed with custom target by index (352ms)
        ✓ should succeed with custom target by function (348ms)
        ✓ should succeed with custom target by object (350ms)
        ✓ should succeed with custom target by full URL (353ms)
        ✓ should succeed with custom target by partial URL (354ms)
Depends on: 1608370
Priority: P2 → P3
Whiteboard: [puppeteer-beta-mvp] → [puppeteer-beta-reserve]
Blocks: 1624785

When running the Puppeteer unit tests on Ubuntu 18.04 (see bug 1623128) we have a connection timeout after 15s!

Depends on: 1623128
Blocks: 1634146
No longer depends on: 1623128
Whiteboard: [puppeteer-beta-reserve] → [puppeteer-beta2-mvp]
Component: CDP: Page → CDP
Severity: normal → S3

It's not needed anymore.

Status: NEW → RESOLVED
Closed: 2 years ago
Resolution: --- → WONTFIX
Blocks: 1827818
No longer blocks: 1634146
You need to log in before you can comment on or make changes to this bug.