Closed
Bug 1163049
Opened 8 years ago
Closed 8 years ago
Autophone - 2015-05-08 Throbber stop regression on fx-team
Categories
(Firefox for Android Graveyard :: General, defect)
Tracking
(firefox40 affected, firefox41 fixed, fennec40+)
RESOLVED
FIXED
Firefox 41
People
(Reporter: bc, Assigned: benjamin)
References
(Blocks 1 open bug)
Details
(Keywords: regression)
Attachments
(1 file, 1 obsolete file)
2.95 KB,
patch
|
rnewman
:
review+
|
Details | Diff | Splinter Review |
http://phonedash.mozilla.org/#/org.mozilla.fennec/throbberstop/local-blank/norejected/2015-05-07/2015-05-08/notcached/noerrorbars/standarderror/notry Shows an increase in the throbber stop time for all tests and for first and second runs. It first appears in this range: https://hg.mozilla.org/integration/fx-team/pushloghtml?fromchange=3e85a97162d9&tochange=663616cbe447 bug 1121013 or bug 1140132 seem like candidates.
Reporter | ||
Comment 2•8 years ago
|
||
mfinkle has a good description in https://wiki.mozilla.org/Auto-tools/Projects/Autophone/Autophone_for_developers#S1S2_Test
Flags: needinfo?(bob)
Assignee | ||
Comment 3•8 years ago
|
||
From reading the description, I think that this is supposed to be a pageload test and unrelated to startup. Is that correct? The difference that bug 1140132 may have caused is that we load the addon manager earlier in the startup process than we did previously. But that is an affect that shouldn't affect pageload performance, so I'm wondering if/how the test waits for startup to complete.
Reporter | ||
Comment 4•8 years ago
|
||
The test page is loaded and once the load event is fired, the quitter extension is used to shut down fennec. The time from the Start proc message to the time the Throbber stop message is emitted in the logcat is displayed Throbber stop time. From mfinkle's comment: <quote> "Throbber Start and Throbber Stop should map to the points where Gecko nsIWebProgressListener fires START|NETWORK and STOP|NETWORK notifications, respectively. In the UI we use those to control the visibility of the "page progress" indicator. It used to be a throbber(spinner) but is now a simple progress line. "The time between Throbber Start and Throbber Stop is a combination of Gecko networking and page parsing & loading, and rendering. We also have some Java UI affects too. </quote> This regression was confirmed by its appearance on the other repos as it was merged.
Assignee | ||
Comment 5•8 years ago
|
||
I got the finkle quote. What are the details of "once the load event is fired"? mfinkle, does fennec use the toolkit addon manager code at all?
Flags: needinfo?(mark.finkle)
Reporter | ||
Comment 6•8 years ago
|
||
(In reply to Benjamin Smedberg [:bsmedberg] from comment #5) > I got the finkle quote. What are the details of "once the load event is > fired"? <script> function quit() { var evt = document.createEvent('Events'); evt.initEvent('please-quit', true, false); document.dispatchEvent(evt); } window.addEventListener("load", function () { setTimeout(quit, 1000); }, false); window.onerror = quit; </script>
Comment 7•8 years ago
|
||
(In reply to Benjamin Smedberg [:bsmedberg] from comment #5) > I got the finkle quote. What are the details of "once the load event is > fired"? > > mfinkle, does fennec use the toolkit addon manager code at all? Yes. Fennec uses the Toolkit Add-on Manager.
Flags: needinfo?(mark.finkle)
Comment 8•8 years ago
|
||
Throbber Stop includes startup and pageload. Throbber Start includes startup. The startup is almost 100% Java related code. Code that happens during Gecko's startup can appear after Throbber Start. The range is fairly small. It should be simple to backout any suspects and see how the graph is affected.
Comment 9•8 years ago
|
||
Sounds like bug 1140132 causes an early initialization of NSS, which we have tried to avoid on mobile, and could explain the regression.
Updated•8 years ago
|
tracking-fennec: ? → 40+
Comment 10•8 years ago
|
||
Benjamin, looks like your bug 1140132 caused this
Assignee: nobody → benjamin
Assignee | ||
Comment 11•8 years ago
|
||
Changing the order of NSS initialization is almost certainly what's going on here. This happens via the extension-manager calling into nsICryptoHash, probably here: http://mxr.mozilla.org/mozilla-central/source/toolkit/mozapps/extensions/internal/PluginProvider.jsm#35 Do the test machines have Flash installed? This hash would only be calculated if Flash is installed. I can think of the following solutions, in rought order of preference: * Because there's only ever Flash on Android, don't use crypto hashing for that plugin * Make it so nsICryptoHash doesn't have to load the certificate DBs (doing main-thread IO) * Implement nsICryptoHash on android in terms of Java instead of NSS * Delay plugin data collection on Android
Status: NEW → ASSIGNED
Flags: needinfo?(blassey.bugs)
Reporter | ||
Comment 12•8 years ago
|
||
Flash is not installed on these devices.
Comment 13•8 years ago
|
||
I think Bob answered the question you NI'd me for
Flags: needinfo?(blassey.bugs)
Assignee | ||
Comment 14•8 years ago
|
||
If I just build Fennec myself and break at the first call to EnsureNSSInitialized, this is the JS stack that triggers it: $1 = 0x920cb000 "0 LoginManagerCrypto_SDR.prototype.init() [\"jar:jar:file:///data/app/org.mozilla.fennec_bsmedberg-1/base.apk!/assets/omni.ja!/components/crypto-SDR.js\":79]\n this = [object Object]\n1 LoginManagerCry"... (gdb) printf "%s", $s 0 LoginManagerCrypto_SDR.prototype.init() ["jar:jar:file:///data/app/org.mozilla.fennec_bsmedberg-1/base.apk!/assets/omni.ja!/components/crypto-SDR.js":79] 1 LoginManagerCrypto_SDR() ["jar:jar:file:///data/app/org.mozilla.fennec_bsmedberg-1/base.apk!/assets/omni.ja!/components/crypto-SDR.js":14] 2 XPCOMUtils__getFactory/factory.createInstance(outer = null, iid = {2030770e-542e-40cd-8061-cd9d4ad4227f}) ["resource://gre/modules/XPCOMUtils.jsm":302] 3 LoginManagerStorage_mozStorage.prototype._crypto() ["jar:jar:file:///data/app/org.mozilla.fennec_bsmedberg-1/base.apk!/assets/omni.ja!/components/storage-mozStorage.js":75] 4 LoginManagerStorage_mozStorage.prototype.initialize() ["jar:jar:file:///data/app/org.mozilla.fennec_bsmedberg-1/base.apk!/assets/omni.ja!/components/storage-mozStorage.js":202] 5 LoginManager.prototype._initStorage() ["jar:jar:file:///data/app/org.mozilla.fennec_bsmedberg-1/base.apk!/assets/omni.ja!/components/nsLoginManager.js":154] 6 LoginManager.prototype.init() ["jar:jar:file:///data/app/org.mozilla.fennec_bsmedberg-1/base.apk!/assets/omni.ja!/components/nsLoginManager.js":131] 7 LoginManager() ["jar:jar:file:///data/app/org.mozilla.fennec_bsmedberg-1/base.apk!/assets/omni.ja!/components/nsLoginManager.js":55] 8 XPCOMUtils__getFactory/factory.createInstance(outer = null, iid = {38c7f6af-7df9-49c7-b558-2776b24e6cc1}) ["resource://gre/modules/XPCOMUtils.jsm":302] 9 startup/BrowserApp_delayedStartup/<() ["chrome://browser/content/browser.js":381] So either my hypothesis is incorrect or the autophone test is using a different configuration than I would. Bob, if I provided a patch which dumped the JS stack to logcat when NSS is initialized, how hard would it be to get that running in the autophone environment and get the logcat output?
Flags: needinfo?(bob)
Reporter | ||
Comment 15•8 years ago
|
||
Not hard. You can do it your self via a try build. Use try: -b o -p android-api-9,android-api-11 -u autophone-s1s2 -t none as the check in comment. The logcat will be available in the Job Info panel on Treeherder and the measurements will be available on phonedash.mozilla.org (select try only).
Flags: needinfo?(bob)
Assignee | ||
Comment 16•8 years ago
|
||
bc is my hero. The JS stack initializing NSS as a result of this patch is: 05-20 10:34:24.443 I/Gecko ( 403): InitializeNSS 05-20 10:34:24.482 I/Gecko ( 403): 0 loadManifestFromRDF(aUri = [xpconnect wrapped nsIURI], aStream = [xpconnect wrapped (nsISupports, nsIInputStream, nsIBufferedInputStream, nsISeekableStream, nsIStreamBufferAccess)]) ["resource://gre/modules/addons/XPIProvider.jsm":1012] 05-20 10:34:24.482 I/Gecko ( 403): 1 loadManifestFromDir() ["resource://gre/modules/addons/XPIProvider.jsm":1063] 05-20 10:34:24.482 I/Gecko ( 403): 2 next(val = undefined) ["self-hosted":624] 05-20 10:34:24.482 I/Gecko ( 403): this = [object Generator] 05-20 10:34:24.482 I/Gecko ( 403): 3 TaskImpl_run(aSendResolved = true) ["resource://gre/modules/Task.jsm":314] 05-20 10:34:24.482 I/Gecko ( 403): this = [object Object] 05-20 10:34:24.482 I/Gecko ( 403): 4 TaskImpl(iterator = [object Generator]) ["resource://gre/modules/Task.jsm":275] 05-20 10:34:24.482 I/Gecko ( 403): this = [object Object] 05-20 10:34:24.482 I/Gecko ( 403): 5 createAsyncFunction/asyncFunction([xpconnect wrapped nsIFile]) ["resource://gre/modules/Task.jsm":249] 05-20 10:34:24.482 I/Gecko ( 403): 6 loadManifestFromFile(aFile = [xpconnect wrapped nsIFile]) ["resource://gre/modules/addons/XPIProvider.jsm":1158] 05-20 10:34:24.482 I/Gecko ( 403): 7 syncLoadManifestFromFile(aFile = [xpconnect wrapped nsIFile]) ["resource://gre/modules/addons/XPIProvider.jsm":1169] 05-20 10:34:24.482 I/Gecko ( 403): 8 XPI_processPendingFileChanges/<(aLocation = [object Object], 0, [object Objec Which is where we create a random GUID for sync here: http://hg.mozilla.org/mozilla-central/annotate/8d8df22fe72d/toolkit/mozapps/extensions/internal/XPIProvider.jsm#l1012 I'm not sure what this ID is for or whether it needs strong cryptography. Could we use Math.random for this instead of NSS? If not, could we stub nsIRandomGenerator in Fennec using Java instead of initializing NSS?
Assignee | ||
Comment 17•8 years ago
|
||
Attachment #8608821 -
Flags: review?(rnewman)
Attachment #8608821 -
Flags: review?(MattN+bmo)
Comment 18•8 years ago
|
||
Comment on attachment 8608821 [details] [diff] [review] 1163049-faster-random Review of attachment 8608821 [details] [diff] [review]: ----------------------------------------------------------------- My first response to this was: why not make AddonInternal.guid lazy, and avoid the need to do any work at all at this stage? If that's a new random GUID, we don't need it until we sync. (It might need to be present sooner to avoid changing a bunch of code, of course.) Did you consider doing that, Benjamin? ::: toolkit/mozapps/extensions/internal/XPIProvider.jsm @@ +1008,5 @@ > let storage = Services.storage; > > // Generate random GUID used for Sync. > + // This does NOT use cryptographically secure algorithms because it is the > + // startup path of Fennec and loading NSS causes a startup regression. Pointer to the bug for further discussion? @@ +1014,1 @@ > // This was lifted from util.js:makeGUID() from services-sync. This comment is no longer correct. @@ +1014,4 @@ > // This was lifted from util.js:makeGUID() from services-sync. > + function* range(i) { for (let j = 0; j < i; ++j) { yield j; } } > + let byte_string = [String.fromCharCode(Math.floor(Math.random() * 0x100)) > + for (i of range(9))].join(""); Out of curiosity, do we optimize JS generators enough that this beats let bytes = []; for (i = 0; i < 9; ++i) { bytes.push(Math.floor(Math.random() * 0x100)) } let byte_string = ... or some other phrasing?
Attachment #8608821 -
Flags: review?(rnewman) → review+
Comment 19•8 years ago
|
||
Comment on attachment 8608821 [details] [diff] [review] 1163049-faster-random Review of attachment 8608821 [details] [diff] [review]: ----------------------------------------------------------------- I don't really have anything new to add on top of what rnewman said but making syncGUID a lazy getter seems like a good solution assuming it's not accessed during startup.
Attachment #8608821 -
Flags: review?(MattN+bmo)
Assignee | ||
Comment 20•8 years ago
|
||
Lazy like this? Here's the tryserver run that includes this patch: https://treeherder.mozilla.org/#/jobs?repo=try&revision=267662c3866d logcat:http://mozilla-releng-blobs.s3.amazonaws.com/blobs/try/sha512/d94732cfc08cfa5f350367f81f62dc7563a30c8aee31a7e01a424b967748d30d9ecc1263ffa3ca34f4fe918d52ae27d8ef5a76f1307a25d36e3d25d533b06e67 And the NSS init comes from verifyDirSignedState which looks late enough to me.
Attachment #8609591 -
Flags: review?(rnewman)
Comment 21•8 years ago
|
||
Comment on attachment 8609591 [details] [diff] [review] 1163049-lazy-guid Review of attachment 8609591 [details] [diff] [review]: ----------------------------------------------------------------- Yeah, that looks good to me. If add-on sync continues to work with this change, then great!
Attachment #8609591 -
Flags: review?(rnewman) → review+
Assignee | ||
Updated•8 years ago
|
Attachment #8608821 -
Attachment is obsolete: true
Assignee | ||
Comment 22•8 years ago
|
||
https://treeherder.mozilla.org/#/jobs?repo=try&revision=267662c3866d
Keywords: checkin-needed
Comment 23•8 years ago
|
||
https://hg.mozilla.org/integration/fx-team/rev/622af20a746a
Keywords: checkin-needed
https://hg.mozilla.org/integration/fx-team/rev/622af20a746a
Whiteboard: [fixed-in-fx-team]
Comment 25•8 years ago
|
||
https://hg.mozilla.org/mozilla-central/rev/622af20a746a
Status: ASSIGNED → RESOLVED
Closed: 8 years ago
status-firefox41:
--- → fixed
Resolution: --- → FIXED
Whiteboard: [fixed-in-fx-team]
Target Milestone: --- → Firefox 41
Assignee | ||
Comment 26•8 years ago
|
||
bc, please verify that this patch actually fixes the autophone regression
Flags: needinfo?(bob)
Reporter | ||
Comment 27•8 years ago
|
||
bmsedberg: Sorry for the delay. Had to deal with device issues and a test back log. Comparing http://phonedash.mozilla.org/#/org.mozilla.fennec/throbberstop/local-blank/norejected/2015-05-07/2015-05-10/notcached/noerrorbars/standarderror/notry to http://phonedash.mozilla.org/#/org.mozilla.fennec/throbberstop/local-blank/norejected/2015-05-26/2015-06-01/notcached/noerrorbars/standarderror/notry I don't see any improvement.
Flags: needinfo?(bob)
Updated•2 years ago
|
Product: Firefox for Android → Firefox for Android Graveyard
You need to log in
before you can comment on or make changes to this bug.
Description
•