Closed Bug 1529610 Opened 2 years ago Closed 7 months ago

Can we somehow reduce the overhead caused by HandletoFilename when the mainthreadio feature is enabled?

Categories

(Core :: Gecko Profiler, enhancement, P2)

67 Branch
enhancement

Tracking

()

RESOLVED FIXED
mozilla80
Tracking Status
firefox80 --- fixed

People

(Reporter: florian, Assigned: gerald)

References

(Blocks 3 open bugs)

Details

Attachments

(2 files)

In a profile like https://perfht.ml/2DZXunC the amount of overhead caused by HandletoFilename is significant; I suspect it of doing I/O itself.

I'm wonder if there are better APIs we could use instead of what's currently used, or if we could somehow cache the results to pay the cost only once when there are plenty of consecutive IO markers for the same file (eg. in https://perfht.ml/2DZJESb 72% of the samples are in HandletoFilename).

I just read the Mac version, and it does look up the filename:

On Mac: https://searchfox.org/mozilla-central/source/xpcom/build/PoisonIOInterposerMac.cpp#91

Also, here is the windows usage:

On Windows: https://searchfox.org/mozilla-central/source/xpcom/build/PoisonIOInterposerWin.cpp#166

https://searchfox.org/mozilla-central/rev/fbb251448feb7276f9b1d0a88f9c0cb1cd144ce4/xpcom/build/PoisonIOInterposerWin.cpp#161-163 looks like it was meant to return early. But that would only help if we access the filename more than once for the same IO marker, which I hope we don't do.

Blocks: 1329212
Priority: -- → P2
Version: unspecified → 67 Branch

Caching filenames in 32-entry LRU array covers >95% of calls, and makes the average Filename() call 5 to 10 times cheaper.

browser_start_content_mainthreadio.js needed to be updated to handle operations that now have a filename thanks to the cache.

Assignee: nobody → gsquelart
Status: NEW → ASSIGNED
Pushed by gsquelart@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/e33817395845
LRU cache of Windows filenames in WinIOAutoObservation - r=canaltinova,florian
Regressions: 1646707
Backout by abutkovits@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/989539bbabb2
Backed out changeset e33817395845 as requested by dev. CLOSED TREE

Gerald, looks like these asan failures also started from the push in comment 4.

Th: https://treeherder.mozilla.org/#/jobs?repo=autoland&fromchange=954c49edcc7627faf941e273c88143a29c55ebbb&searchStr=windows%2C10%2Cx64%2Casan%2Cmochitests%2Ctest-windows10-64-asan%2Fopt-mochitest-plain-gpu-e10s%2Cm%28gpu%29&tochange=5cd8ce0bfd95bb71f8280446bb605f95b409a3cd&selectedTaskRun=OXFUDEKQSXSkL_5Psy0-cg.0

Failure log: https://treeherder.mozilla.org/logviewer.html#/jobs?job_id=306771133&repo=autoland&lineNumber=1178

[task 2020-06-18T12:24:36.553Z] 12:24:36     INFO - TEST-START | dom/bindings/test/test_sequence_wrapping.html
[task 2020-06-18T12:24:38.222Z] 12:24:38     INFO - GECKO(11100) | JavaScript error: resource://specialpowers/WrapPrivileged.jsm, line 272: NS_ERROR_NOT_AVAILABLE: Component returned failure code: 0x80040111 (NS_ERROR_NOT_AVAILABLE) [nsIMemoryReporterManager.heapAllocated]
[task 2020-06-18T12:24:38.222Z] 12:24:38     INFO - GECKO(11100) | MEMORY STAT heapAllocated not supported in this build configuration.
[task 2020-06-18T12:24:38.223Z] 12:24:38     INFO - GECKO(11100) | MEMORY STAT | vsize 19405887MB | vsizeMaxContiguous 73580156MB | residentFast 348MB
[task 2020-06-18T12:24:38.273Z] 12:24:38     INFO - GECKO(11100) | JavaScript error: http://mochi.test:8888/tests/SimpleTest/SimpleTest.js, line 605: TypeError: Node.appendChild: Argument 1 is not an object.
[task 2020-06-18T12:24:38.292Z] 12:24:38     INFO - TEST-OK | dom/bindings/test/test_sequence_wrapping.html | took 1738ms
[task 2020-06-18T12:24:38.311Z] 12:24:38     INFO - TEST-START | Shutdown
[task 2020-06-18T12:24:38.312Z] 12:24:38     INFO - Passed:  4
[task 2020-06-18T12:24:38.312Z] 12:24:38     INFO - Failed:  0
[task 2020-06-18T12:24:38.313Z] 12:24:38     INFO - Todo:    0
[task 2020-06-18T12:24:38.313Z] 12:24:38     INFO - Mode:    e10s
[task 2020-06-18T12:24:38.313Z] 12:24:38     INFO - Slowest: 1738ms - /tests/dom/bindings/test/test_sequence_wrapping.html
[task 2020-06-18T12:24:38.314Z] 12:24:38     INFO - SimpleTest FINISHED
[task 2020-06-18T12:24:38.316Z] 12:24:38     INFO - TEST-INFO | Ran 1 Loops
[task 2020-06-18T12:24:38.316Z] 12:24:38     INFO - SimpleTest FINISHED
[task 2020-06-18T12:24:39.078Z] 12:24:39     INFO - GECKO(11100) | ###!!! [Parent][RunMessage] Error: Channel closing: too late to send/recv, messages will be lost
[task 2020-06-18T12:24:39.078Z] 12:24:39     INFO - GECKO(11100) | ###!!! [Parent][RunMessage] Error: Channel closing: too late to send/recv, messages will be lost
[task 2020-06-18T12:24:39.861Z] 12:24:39     INFO - GECKO(11100) | 1592483079859	Marionette	TRACE	Received observer notification xpcom-will-shutdown
[task 2020-06-18T12:24:39.861Z] 12:24:39     INFO - GECKO(11100) | 1592483079859	Marionette	INFO	Stopped listening on port 2828
[task 2020-06-18T12:24:39.861Z] 12:24:39     INFO - GECKO(11100) | 1592483079860	Marionette	DEBUG	Marionette stopped listening
[task 2020-06-18T12:24:40.313Z] 12:24:40     INFO - GECKO(11100) | ###!!! [Child][RunMessage] Error: Channel closing: too late to send/recv, messages will be lost
[task 2020-06-18T12:24:40.385Z] 12:24:40     INFO - GECKO(11100) | [Parent 8208, Gecko_IOThread] WARNING: pipe error: 232: file z:/build/build/src/ipc/chromium/src/chrome/common/ipc_channel_win.cc, line 520
[task 2020-06-18T12:24:40.469Z] 12:24:40     INFO - GECKO(11100) | [2020-06-18T12:24:40Z ERROR xulstore::persist] removeDocument error: unavailable
[task 2020-06-18T12:24:42.306Z] 12:24:42     INFO - GECKO(11100) | =================================================================
[task 2020-06-18T12:24:42.306Z] 12:24:42    ERROR - GECKO(11100) | ==8208==ERROR: AddressSanitizer: access-violation on unknown address 0x000000000000 (pc 0x7ffddec3108f bp 0x0057315fe600 sp 0x0057315fe410 T0)
[task 2020-06-18T12:24:42.307Z] 12:24:42     INFO - GECKO(11100) | ==8208==The signal is caused by a WRITE memory access.
[task 2020-06-18T12:24:42.307Z] 12:24:42     INFO - GECKO(11100) | ==8208==Hint: address points to the zero page.
[task 2020-06-18T12:24:44.209Z] 12:24:44     INFO - TEST-INFO | Main app process: exit c0000005
[task 2020-06-18T12:24:44.209Z] 12:24:44     INFO - Buffered messages finished
[task 2020-06-18T12:24:44.210Z] 12:24:44    ERROR - TEST-UNEXPECTED-FAIL | Last test finished | application terminated with exit code 3221225477
[task 2020-06-18T12:24:44.210Z] 12:24:44     INFO - runtests.py | Application ran for: 0:00:28.844000
[task 2020-06-18T12:24:44.210Z] 12:24:44     INFO - zombiecheck | Reading PID log: c:\users\task_1592480269\appdata\local\temp\tmplknmxepidlog
[task 2020-06-18T12:24:44.211Z] 12:24:44     INFO - ==> process 8208 launched child process 9572 ("Z:\task_1592480269\build\application\firefox\firefox.exe" -contentproc --channel="8208.0.754065884\1738572493" -parentBuildID 20200618082718 -prefsHandle 2140 -prefMapHandle 2116 -prefsLen 1 -prefMapSize 235503 -appdir "Z:\task_1592480269\build\application\firefox\browser" - 8208  gpu)
[task 2020-06-18T12:24:44.211Z] 12:24:44     INFO - ==> process 8208 launched child process 9924 ("Z:\task_1592480269\build\application\firefox\firefox.exe" -contentproc --channel="8208.6.1503304258\41181699" -childID 1 -isForBrowser -prefsHandle 2784 -prefMapHandle 2780 -prefsLen 1779 -prefMapSize 235503 -parentBuildID 20200618082718 -appdir "Z:\task_1592480269\build\application\firefox\browser" - 8208  tab)
[task 2020-06-18T12:24:44.211Z] 12:24:44     INFO - ==> process 8208 launched child process 5580 ("Z:\task_1592480269\build\application\firefox\firefox.exe" -contentproc --channel="8208.13.525552173\1448406682" -childID 2 -isForBrowser -prefsHandle 4624 -prefMapHandle 4620 -prefsLen 2193 -prefMapSize 235503 -parentBuildID 20200618082718 -appdir "Z:\task_1592480269\build\application\firefox\browser" - 8208  tab)
[task 2020-06-18T12:24:44.211Z] 12:24:44     INFO - ==> process 8208 launched child process 9856 ("Z:\task_1592480269\build\application\firefox\firefox.exe" -contentproc --channel="8208.20.381930125\1013809698" -childID 3 -isForBrowser -prefsHandle 4620 -prefMapHandle 4800 -prefsLen 2233 -prefMapSize 235503 -parentBuildID 20200618082718 -appdir "Z:\task_1592480269\build\application\firefox\browser" - 8208  tab)
[task 2020-06-18T12:24:44.212Z] 12:24:44     INFO - ==> process 8208 launched child process 10752 ("Z:\task_1592480269\build\application\firefox\firefox.exe" -contentproc --channel="8208.27.687960204\996384937" -childID 4 -isForBrowser -prefsHandle 4552 -prefMapHandle 4548 -prefsLen 10841 -prefMapSize 235503 -parentBuildID 20200618082718 -appdir "Z:\task_1592480269\build\application\firefox\browser" - 8208  tab)
[task 2020-06-18T12:24:44.212Z] 12:24:44     INFO - zombiecheck | Checking for orphan process with PID: 9856
[task 2020-06-18T12:24:44.212Z] 12:24:44     INFO - zombiecheck | Checking for orphan process with PID: 9924
[task 2020-06-18T12:24:44.213Z] 12:24:44     INFO - zombiecheck | Checking for orphan process with PID: 5580
[task 2020-06-18T12:24:44.213Z] 12:24:44     INFO - zombiecheck | Checking for orphan process with PID: 9572
[task 2020-06-18T12:24:44.213Z] 12:24:44     INFO - zombiecheck | Checking for orphan process with PID: 10752

Maybe this gives you more info.

Flags: needinfo?(gsquelart)
Blocks: 1647107

The ClearPoisonIOInterposer clean-up function is actually never called (I've just filed bug 1647107 for that), which means IO interposition may still happen past ShutdownPhase::ShutdownFinal so we can't just destroy the LRUCache at that point because it could still be used.
I will add another way to safely remove the cache during shutdown...

Flags: needinfo?(gsquelart)
Pushed by gsquelart@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/6862b2fa8dbb
SmallArrayLRUCache - r=canaltinova
https://hg.mozilla.org/integration/autoland/rev/7c6ea9680bd5
LRU cache of Windows filenames in WinIOAutoObservation - r=canaltinova,florian
Status: ASSIGNED → RESOLVED
Closed: 7 months ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla80
You need to log in before you can comment on or make changes to this bug.