Closed Bug 1607505 Opened 3 months ago Closed 2 months ago

Improve shutdown performance of StartupCache

Categories

(Core :: XPCOM, enhancement)

enhancement
Not set

Tracking

()

RESOLVED FIXED
mozilla75
Tracking Status
firefox75 --- fixed

People

(Reporter: dthayer, Assigned: dthayer)

References

(Blocks 2 open bugs)

Details

Attachments

(2 files)

I don't think we have solid evidence for deciding to block shutdown in order to write the startup cache. We already write the startup cache 60s after startup (we should probably do something more sophisticated here, but I think that's for another bug). Effectively, we're blocking shutdown for the cases where a user starts up their browser and closes it within a minute. One such case in which users would do this would be when the browser wants to apply an update, in which case writing the startup cache is redundant anyway because it will be invalidated by the update. We could do fancier things here to avoid writing the startup cache if we know we're applying an update, but I think we should just keep it simple and remove the write on shutdown, as it's likely not getting us much value anyway.

Summary: Don't block shutdown to write StartupCache file → Improve shutdown performance of StartupCache
Assignee: nobody → dothayer
Status: NEW → ASSIGNED

The initial thought for getting the StartupCache out of the shutdown
path was to simply not write it during shutdown, as it should write
60 seconds after startup, and the theory was that if the user shut
down within the first 60 seconds of use, they were likely updating or
something and it shouldn't matter. However, considering how many of
our users only ever open one tab, I think it's rather likely that
users are starting up firefox to go to a web site, then closing it
when done with that website, and then maybe opening up a new instance
in order to go to a different website. Accordingly it still makes
sense to continue writing it. However, we may as well leverage a
background thread for this and get it kicked off earlier during
shutdown, so we don't sit there blocking in the destructor late
during shutdown.

Since we've decoupled the writing of the StartupCache from the freeing
of the StartupCache's tables (which takes nontrivial time), this patch
moves the StartupCache's destructor later in shutdown so it will be
skipped in the early _exit(0) efforts. There is a remaining
WaitOnWriteThread in the StartupCache's destructor, but this is a
minor sanity check to avoid use-after-frees in the write thread during
shutdown. In general it won't matter as we already wait for the write
thread in the xpcom-shutdown observer, and if we shut down during write,
the system is set up to be okay with that (because it already happens).

Depends on D62294

Pushed by dothayer@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/db2b0b9a9886
Start writing StartupCache earlier in shutdown r=froydnj
https://hg.mozilla.org/integration/autoland/rev/eb423c859513
Move StartupCache destructor later in shutdown r=froydnj
Blocks: 1614795
Backout by csabou@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/acfa1e334a45
Backed out 2 changesets for causing bc leaks in browser_startupcache_telemetry.

Failures: https://treeherder.mozilla.org/#/jobs?repo=autoland&searchStr=linux%2Cx64%2Casan%2Cmochitests%2Ctest-linux64-asan%2Fopt-mochitest-browser-chrome-e10s-9%2Cm%28bc9%29&fromchange=eb423c8595139c36622ae9e42e1f62fb99120019&selectedJob=288438090

Failure log: https://treeherder.mozilla.org/logviewer.html#?job_id=288456134&repo=autoland

Backout link: https://hg.mozilla.org/integration/autoland/rev/acfa1e334a45

[task 2020-02-12T00:48:58.472Z] 00:48:58 INFO - GECKO(4044) | -----------------------------------------------------
[task 2020-02-12T00:48:59.417Z] 00:48:59 INFO - GECKO(4044) | =================================================================
[task 2020-02-12T00:48:59.418Z] 00:48:59 ERROR - GECKO(4044) | ==4044==ERROR: LeakSanitizer: detected memory leaks
[task 2020-02-12T00:48:59.418Z] 00:48:59 INFO - GECKO(4044) | Indirect leak of 1024 byte(s) in 1 object(s) allocated from:
[task 2020-02-12T00:48:59.419Z] 00:48:59 INFO - GECKO(4044) | #0 0x56145568cd12 in calloc /builds/worker/fetches/llvm-project/llvm/projects/compiler-rt/lib/asan/asan_malloc_linux.cc:154:3
[task 2020-02-12T00:48:59.420Z] 00:48:59 INFO - GECKO(4044) | #1 0x7fe7d5cce27d in PR_SetThreadPrivate /builds/worker/workspace/build/src/nsprpub/pr/src/threads/prtpd.c:162:27
[task 2020-02-12T00:48:59.420Z] 00:48:59 INFO - GECKO(4044) | #2 0x7fe7b8f64c10 in nsThread::InitCurrentThread() /builds/worker/workspace/build/src/xpcom/threads/nsThread.cpp:696:26
[task 2020-02-12T00:48:59.422Z] 00:48:59 INFO - GECKO(4044) | #3 0x7fe7b8f65f83 in nsThreadManager::GetCurrentThread() /builds/worker/workspace/build/src/xpcom/threads/nsThreadManager.cpp:583:18
[task 2020-02-12T00:48:59.422Z] 00:48:59 INFO - GECKO(4044) | #4 0x7fe7b8f7350b in NS_GetCurrentThread() /builds/worker/workspace/build/src/xpcom/threads/nsThreadUtils.cpp:497:33
[task 2020-02-12T00:48:59.423Z] 00:48:59 INFO - GECKO(4044) | #5 0x7fe7c2eca717 in profiler_register_thread(char const*, void*) /builds/worker/workspace/build/src/tools/profiler/core/platform.cpp:4520:9
[task 2020-02-12T00:48:59.424Z] 00:48:59 INFO - GECKO(4044) | #6 0x7fe7c36dc1ab in AutoProfilerRegisterThread /builds/worker/workspace/build/src/obj-firefox/dist/include/GeckoProfiler.h:1030:5
[task 2020-02-12T00:48:59.424Z] 00:48:59 INFO - GECKO(4044) | #7 0x7fe7c36dc1ab in mozilla::scache::StartupCache::ThreadedPrefetch(void*) /builds/worker/workspace/build/src/startupcache/StartupCache.cpp:644:3
[task 2020-02-12T00:48:59.425Z] 00:48:59 INFO - GECKO(4044) | #8 0x7fe7d5cd8d0e in _pt_root /builds/worker/workspace/build/src/nsprpub/pr/src/pthreads/ptthread.c:201:5
[task 2020-02-12T00:48:59.426Z] 00:48:59 INFO - GECKO(4044) | #9 0x7fe7d99ff6b9 in start_thread (/lib/x86_64-linux-gnu/libpthread.so.0+0x76b9)
[task 2020-02-12T00:48:59.426Z] 00:48:59 INFO - GECKO(4044) | Indirect leak of 13 byte(s) in 1 object(s) allocated from:
[task 2020-02-12T00:48:59.426Z] 00:48:59 INFO - GECKO(4044) | #0 0x56145568cb9d in malloc /builds/worker/fetches/llvm-project/llvm/projects/compiler-rt/lib/asan/asan_malloc_linux.cc:145:3
[task 2020-02-12T00:48:59.427Z] 00:48:59 INFO - GECKO(4044) | #1 0x7fe7d5ccc211 in PR_SetCurrentThreadName /builds/worker/workspace/build/src/nsprpub/pr/src/pthreads/ptthread.c:1668:28
[task 2020-02-12T00:48:59.428Z] 00:48:59 INFO - GECKO(4044) | #2 0x7fe7b8f587ad in NS_SetCurrentThreadName(char const*) /builds/worker/workspace/build/src/xpcom/threads/nsThreadUtils.cpp:490:3
[task 2020-02-12T00:48:59.429Z] 00:48:59 INFO - GECKO(4044) | #3 0x7fe7c36dc1b3 in mozilla::scache::StartupCache::ThreadedPrefetch(void*) /builds/worker/workspace/build/src/startupcache/StartupCache.cpp:645:3
[task 2020-02-12T00:48:59.429Z] 00:48:59 INFO - GECKO(4044) | #4 0x7fe7d5cd8d0e in _pt_root /builds/worker/workspace/build/src/nsprpub/pr/src/pthreads/ptthread.c:201:5
[task 2020-02-12T00:48:59.430Z] 00:48:59 INFO - GECKO(4044) | #5 0x7fe7d99ff6b9 in start_thread (/lib/x86_64-linux-gnu/libpthread.so.0+0x76b9)
[task 2020-02-12T00:48:59.430Z] 00:48:59 INFO - GECKO(4044) | -----------------------------------------------------
[task 2020-02-12T00:48:59.430Z] 00:48:59 INFO - GECKO(4044) | Suppressions used:
[task 2020-02-12T00:48:59.432Z] 00:48:59 INFO - GECKO(4044) | count bytes template
[task 2020-02-12T00:48:59.432Z] 00:48:59 INFO - GECKO(4044) | 27 816 nsComponentManagerImpl
[task 2020-02-12T00:48:59.432Z] 00:48:59 INFO - GECKO(4044) | 611 17509 libfontconfig.so
[task 2020-02-12T00:48:59.433Z] 00:48:59 INFO - GECKO(4044) | 1 29 libglib-2.0.so
[task 2020-02-12T00:48:59.433Z] 00:48:59 INFO - GECKO(4044) | 4 832 mozJSComponentLoader
[task 2020-02-12T00:48:59.434Z] 00:48:59 INFO - GECKO(4044) | 2 360 XPCWrappedNativeJSOps
[task 2020-02-12T00:48:59.434Z] 00:48:59 INFO - GECKO(4044) | -----------------------------------------------------
[task 2020-02-12T00:48:59.436Z] 00:48:59 INFO - GECKO(4044) | SUMMARY: AddressSanitizer: 1037 byte(s) leaked in 2 allocation(s).
[task 2020-02-12T00:48:59.520Z] 00:48:59 INFO - TEST-INFO | Main app process: exit 0
[task 2020-02-12T00:48:59.520Z] 00:48:59 INFO - TEST-INFO | LeakSanitizer | To show the addresses of leaked objects add report_objects=1 to LSAN_OPTIONS
[task 2020-02-12T00:48:59.521Z] 00:48:59 INFO - TEST-INFO | LeakSanitizer | This can be done in testing/mozbase/mozrunner/mozrunner/utils.py
[task 2020-02-12T00:48:59.522Z] 00:48:59 ERROR - TEST-UNEXPECTED-FAIL | LeakSanitizer | leak at PR_SetThreadPrivate, nsThread::InitCurrentThread, nsThreadManager::GetCurrentThread, NS_GetCurrentThread
[task 2020-02-12T00:48:59.523Z] 00:48:59 ERROR - TEST-UNEXPECTED-FAIL | LeakSanitizer | leak at PR_SetCurrentThreadName, NS_SetCurrentThreadName, mozilla::scache::StartupCache::ThreadedPrefetch, _pt_root

Flags: needinfo?(dothayer)
Pushed by dothayer@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/7b7b147b6955
Start writing StartupCache earlier in shutdown r=froydnj
https://hg.mozilla.org/integration/autoland/rev/d79f1d283acd
Move StartupCache destructor later in shutdown r=froydnj
Backout by shindli@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/e89e4f374d45
Backed out 2 changesets for causing xpcshell permafailures in netwerk/test/unit/test_alt-data_simple.js CLOSED TREE

(In reply to Stefan Hindli [:stefan_hindli] from comment #8)

Backed out 2 changesets (Bug 1607505) for causing xpcshell permafailures in netwerk/test/unit/test_alt-data_simple.js CLOSED TREE

https://treeherder.mozilla.org/#/jobs?repo=autoland&resultStatus=testfailed%2Cbusted%2Cexception&classifiedState=unclassified&fromchange=7adc2fc208abb42d832813ab55c949a16c770547&selectedJob=288579099

https://treeherder.mozilla.org/logviewer.html#/jobs?job_id=288579099&repo=autoland&lineNumber=2107

Can you help me understand how this is a permafailure? So far it looks quite intermittent - and the failure is an existing intermittent failure, no? Am I misunderstanding something?

Flags: needinfo?(dothayer) → needinfo?(shindli)
Pushed by ncsoregi@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/a727a8fe08f5
Start writing StartupCache earlier in shutdown r=froydnj a=reland CLOSED TREE
https://hg.mozilla.org/integration/autoland/rev/95e06979aeba
Move StartupCache destructor later in shutdown r=froydnj a=reland CLOSED TREE
Status: ASSIGNED → RESOLVED
Closed: 2 months ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla75

:dthayer, you are correct, it wasn't a permafailure.

Flags: needinfo?(shindli)
Regressions: 1615045
Regressions: 1616186
You need to log in before you can comment on or make changes to this bug.