TelemetryEnvironment takes up 160ms on startup on Android

NEW
Unassigned

Status

()

Toolkit
Telemetry
P4
normal
2 years ago
2 months ago

People

(Reporter: jchen, Unassigned)

Tracking

(Blocks: 1 bug)

Firefox Tracking Flags

(Not tracked)

Details

(Whiteboard: [measurement:client])

On my Nexus 4, annotateEnvironment/TelemetryEnvironment.currentEnvironment takes 160ms to run during early startup. I'm not sure what it does exactly, but can annotateEnvironment be made to run later?
EnvironmentCache._startWatchingPrefs takes up another 80ms a bit later during startup.
Blocks: 1140132
This is for annotating crashes with data potentially relevant for crash analysis.
Maybe we could cut down a bit on the time there, but at least on desktop we want that data.
I assume that we want the same on Android? But i'm not sure who makes that call.

For _startWatchingPrefs, we could actually currently skip this on Android - based on the unified pref like [0], which is false for Fennec.

0: https://hg.mozilla.org/mozilla-central/annotate/7d4ab4a9febd/toolkit/components/telemetry/TelemetryController.jsm#l46
Blocks: 1122482
I think honestly we'd rather have the 160ms (240ms?) back rather than have this data. AFAIK, I've never relied on that when trying to solve a crash.
Assignee: nobody → nchen
Note that this is pretty new.
Benjamin, are there plans to use this outside desktop crash-analysis yet?
Flags: needinfo?(benjamin)

Comment 5

2 years ago
Well I'd sure like to be able to use this for android, and have a unified system! But it's perfectly fine to delay it if it really costs that much.

Do we know *why* it costs that much? Is it possible that something stupid about collecting the GFX info, profile creation time, or something like that is causing the jank here, and we could just delay *that*?

I'm sorry, I didn't see this bug until just now.
Flags: needinfo?(benjamin)

Comment 6

2 years ago
Breakdown of time within TelemetryStartup according to a trace view that mconley sent me:

71ms - TelemetryStartup.js toplevel importing things
258ms - TelemetryStartup.prototype.observe
* 51ms - TelemetryController.setupTelemetry
** Almost all of this is setting up and initializing SessionRecorder
* 204ms - TelemetryEnvironment.currentEnvironment
** 44ms - _getSystem
*** this is mostly _getCpuData (16%) and _getGfxData (70%)
** 108ms - updateSettings
*** 18ms importing UpdateChannel.jsm
*** 22ms - getPrefData
*** 64ms - _updateSearchEngine; mostly importing and calling into nsSearchService
** 40ms - EnvironmentAddonBuilder.prototype.init, most time spent in AddonManager

So in terms of time, I think we're hurting in the following areas:

* importing things earlier than we normally would: the search service, UpdateChannel.jsm (if we use that on Android at all!)
* getting data earlier: _getGfxData
* getting data we might not get at all: _getCpuData
* Doing something with the addon manager. Not sure wtf this is.

I'd suggest that delaying collection of _getGfxData and _getCpuData until we've got stuff on the screen is probably fine. And delaying the call into the EnvironmentAddonBuilder is also probably fine, since it's intended to be async anyway. Is there a good notification to use for this on Android? Does final-ui-startup do something useful in an Android context?

Also need to understand whether UpdateChannel.jsm actually returns meaningful results on Android at all.

Comment 7

2 years ago
Georg can you confirm that delaying those pieces of the environment collection seems reasonable?
Flags: needinfo?(gfritzsche)
* delaying gfx & cpu data: i'm not sure on that myself - how badly do we want that early in the environment for crash analysis or other reasons?
* we only call into the search engine after it initialized (hanging off its observer topic), is that firing very early on Android?
* the addon manager is weird - if we don't need the data immediately, can we just delay doing any addon work until N sec after startup?

In general, if things are not needed right away on Android and are expensive, lets definitely delay it there.
If they are not useful on Android, lets special-case it there.
Flags: needinfo?(gfritzsche)
I think we should just trigger collecting the "expensive" pieces from the delayed Telemetry init, i.e. around here:
https://hg.mozilla.org/mozilla-central/annotate/e7e69cc8c07b/toolkit/components/telemetry/TelemetryController.jsm#l670
Jim, is this still being worked on?
I don't know what the future importance of Telemetry on Android is, so maybe we can cut at least part of the expensive data out for now and revisit later.
Flags: needinfo?(nchen)
Blocks: 1201022
No longer blocks: 1122482
(In reply to Georg Fritzsche [:gfritzsche] from comment #10)
> Jim, is this still being worked on?
> I don't know what the future importance of Telemetry on Android is, so maybe
> we can cut at least part of the expensive data out for now and revisit later.

Sounds like a good plan. I'm not actively working on this though, and don't really know the code, so feel free to take it.
Assignee: nchen → nobody
Flags: needinfo?(nchen)
Points: --- → 2
Priority: -- → P3
Whiteboard: [measurement:client]
Margaret, do you know how important this is?
Is it something the mobile team could look into?
Flags: needinfo?(margaret.leibovic)
Priority: P3 → P4

Comment 13

2 years ago
(In reply to Georg Fritzsche [:gfritzsche] from comment #12)
> Margaret, do you know how important this is?
> Is it something the mobile team could look into?

This individual bug isn't a necessarily high priority, but improving startup time is an important ongoing effort.

According to comment 3, snorp says we'd rather have the startup time back than this data. Is anyone even looking at this data? I know my team isn't. snorp, are you using this for any crash investigation work? 

I don't have anyone on my team who can work on this right now. Can you help explain what changes need to happen? Maybe someone from snorp's team could pick this up.
Flags: needinfo?(margaret.leibovic) → needinfo?(snorp)
We aren't actively working on this now, but might have a chance in the medium term.
Flags: needinfo?(snorp)
You need to log in before you can comment on or make changes to this bug.