Closed Bug 1167710 Opened 7 years ago Closed 6 years ago

[Settings] Advanced telemetry setting

Categories

(Firefox OS Graveyard :: Gaia::Settings, defect)

ARM
Gonk (Firefox OS)
defect
Not set
normal

Tracking

(feature-b2g:2.5+)

RESOLVED FIXED
feature-b2g 2.5+

People

(Reporter: rnicoletti, Assigned: rnicoletti)

References

Details

Attachments

(1 file, 2 obsolete files)

We need an "Advanced Telemetry" panel on the 'developer' settings page. This will enable the generation of advanced telemetry metrics from gecko. Because gecko is using the hud (hud.js) to generate the advanced telemetry metrics, the "Advanced Telemetry" setting needs to enable the gecko hud. The assumption is it will be enabled in the same fashion as it is enabled by the "Developer HUD" panel.
The "Advanced Telemetry" panel would have an option like 'enable advanced telemetry', which gecko's hud.js would observe to determine when it should collect advanced telemetry, and then settings similar to what is in the Developer HUD panel corresponding to which metrics are to be collected.
The Advanced Telemetry panel is providing two potential benefits:

* Don't tie enabling the Developer HUD to enabling advanced telemetry 
* Enable specific metrics to be collected by advanced telemetry separate from what is collected/generated by the Developer HUD.
Jan, can you explain the relationship between enabling "Developer Tools" and the resulting performance hit that you described in an earlier email [1]? I'm asking partly to know what needs to be done when enabling Advanced Telemetry in the settings app to cause the developer hud to be enabled. And it would help to know more about how and when the "devtools performance hit" happens.

[1] "Activating the devtools actors for the developer hud (necessary for most metrics like errors, reflows, and especially memory) does cause a noticeable performance hit. However, the actors have recently been heavily optimized (mainly by Alexandre Poirot). It used to be that activating the Developer HUD would add 10MB to every app's footprint (hence Vivien's worries), but I believe that bump is now less than 2MB per app."
Flags: needinfo?(janx)
Sure!

Note: I use "web view", "frame" and "docshell" interchangeably. Basically, a Firefox OS app is composed of one or more "frames".

- A "developer tools actor" is a tiny privileged JS script that can be loaded into a web view to report about something (e.g. memory, errors, event-loop lag...) or cause some changes (e.g. draw a dotted pink border around a currently inspected element). They live here: [1].

- In the Developer HUD menu, activating "Developer tools" installs a set of devtools actors into all existing Firefox OS web views, and any new web view created afterwards.

- It used to be that installing all these actors into a web view would bump the USS footprint of its process 10MB higher. This was terrible and always caused a big performance hit, but Alex (:ochameau) spent many months fighting the actors' footprint. Today, I believe it to be much lower (we need to measure it again), and specific actors are only being installed if they're actually needed, so if you only track one metric, only the one related actor will be installed into the web views. TL;DR - Today there is no "big performance hit" anymore, and tracking fewer metrics causes less stress to Firefox OS.

- Not all Developer HUD metrics need actors to work: "Frames per second" and "Time to load" will still work even if you don't enable "Developer tools".

- If you need to track any of the following metrics, you should enable "Developer tools" actors: Warnings, Errors, Security issues, Reflows, Jank, USS, App memory. If you start tracking all these metrics at once, you might (or might not) notice that the device becomes a little slower. This is mainly because you'll start installing more actors into each web view, and some of these actors will start polling data and computing metrics. The impact shouldn't be horrible, but in some cases you might notice it.

- I'd advise against using "App memory", because it only shows a small fraction of the memory footprint, and the backend it uses to get this information does a lot of heavy computation. Instead, using "USS" gives a very precise measure of the memory footprint by regularly probing a lightweight linux kernel feature, so you won't feel its impact.

- I'd prefer if your Telemetry control takes the form of a single checkbox in the HUD menu (like "Log changes in ADB") instead of a separate panel. I think an "Advanced Telemetry" panel would look very much like the Developer HUD panel, and it seems hard to maintain separate, overlapping lists of which metrics are being tracked by the HUD, and which by Telemetry.

[1] The devtools actors live here: https://dxr.mozilla.org/mozilla-central/source/toolkit/devtools/server/actors
Flags: needinfo?(janx)
Jan, Tamara and I agree with your suggestion of having a single "advanced telemetry" checkbox in the HUD menu instead of a separate panel. As such, I'm changing the title of this bug. Also, our intention is to disable the HUD display when the "advanced telemetry" setting is enabled. This is because those interested in recording advanced telemetry are most likely not interested in seeing the information on the screen and since we are sending a different event from the hud backend for advanced telemetry information we don't want to be sending 2x the events when "advanced telemetry" is enabled.
Summary: [Settings] Advanced telemetry settings panel → [Settings] Advanced telemetry setting
Hi Jan, following up on your feedback from comment 4, I'd like your opinion on two issues related to how recording telemetry information is different from viewing HUD information on the device:

1. My guess is the HUD is typically enabled for short periods whereas advanced telemetry would be enabled for longer periods and therefore having the HUD display enabled for long periods may not be desirable for users. Therefore, we'd like to be able to record telemetry information without having to display the corresponding metric in the HUD. Perhaps our "advanced telemetry" setting, when enabled, could have an additional "disable HUD display" setting? Should the "advanced telemetry" setting automatically disable the HUD display?

2. Regarding having only one settings panel, which I agree is desirable in terms of limiting the complexity of the advanced telemetry, it nonetheless may be desirable to collect advanced telemetry data for one set of metrics while displaying in the HUD a different set of metrics. I'm thinking it may be desirable to collect telemetry data for more metrics than is desirable to display on the HUD.

I'm trying to find a good balance between being simplicity and flexibility. Your thoughts?
Flags: needinfo?(janx)
Depends on: 1173346
Hi Russ,

1. The HUD display is also supposed to be used for longer periods of time: The widgets are meant to stay out of the user's way in order to detect problems arising in normal everyday use of an app (e.g. memory leaks, corner case bugs, rare security issues, etc). However, I agree that there is value in making the widgets' display optional, so I filed bug 1173346 for it. I also agree that when using advanced telemetry, you'll probably not want to see the widgets all the time, but a checkbox that automatically unchecks another would be bad/surprising UX, and an exclusive switch between telemetry and widgets is probably too strict. Let's agree on sensible defaults (e.g. in eng builds, should advanced telemetry be on / widgets off by default? Which metrics should be tracked?) and let the user further tweak the HUD to preference.

2. That is my worry too: Widgets are useful for rare / unusual problems (e.g. you might only ever want to see JS errors and security issues if they happen) while advanced telemetry is better when tracking many things. My gut feeling is that the complexity of defining separate sets of metrics for widgets / telemetry / logs outweighs the benefits, but if you have ideas on how to make such a thing simpler or more useful, I'd love to know. Maybe we can have slightly different sets, but hide the added complexity, by making telemetry (if enabled) always report about some metrics regardless if their widget is enabled?

For example, if USS polling is low-impact enough, we could have telemetry always report USS even if the widget is off. Also, for engineering builds, we could enable HUD + advanced telemetry by default, but with just a few widgets enabled (e.g. JS errors and security issues) so that developers only see a widget when there is a critical problem, while telemetry tracks more things.
Flags: needinfo?(janx)
I found the setting needed to exist in order to have a settings observer for the setting so I added the advanced telemetry setting to the Developer HUD settings panel. I updated my hud.js to use the new setting. This all works: when the setting is enabled, hud.js sends the advanced telemetry events; when the setting is not enabled, no advanced telemetry events are set.

Also, I added an ADVANCED_TELEMETRY build but when I use it is not setting the "advanced telemetry" setting, so it seems I'm missing something.
Attachment #8620783 - Flags: review?(thills)
(In reply to Russ Nicoletti [:russn] from comment #8)
> Created attachment 8620783 [details] [review]
> PR for creation of advanced telemetry setting
> Also, I added an ADVANCED_TELEMETRY build but when I use it is not setting
> the "advanced telemetry" setting, so it seems I'm missing something.

Apparently, creating new custom build flags is not recommended. Therefore, I'm using DOGFOOD.
Attachment #8620783 - Flags: review?(thills)
Assignee: nobody → rnicoletti
Status: NEW → ASSIGNED
No longer depends on: 1173346
Duplicate of this bug: 1166379
Depends on: 1178923
No longer depends on: 1178923
Attachment #8631832 - Attachment is obsolete: true
Comment on attachment 8631844 [details] [review]
[gaia] russnicoletti:bug-1167710 > mozilla-b2g:master

Hi Evelyn,

This patch is for adding an "Advanced Telemetry", aka "NGA Telemetry" [1], setting to the developer HUD settings panel. The setting is queried by the gecko HUD when determining if it should sent telemetry information to the system app (bug 1178512).

[1] https://wiki.mozilla.org/Gaia/Architecture_Transition_Validation#Telemetry
Attachment #8631844 - Flags: review?(ehung)
Comment on attachment 8631844 [details] [review]
[gaia] russnicoletti:bug-1167710 > mozilla-b2g:master

LGTM, thanks.
Attachment #8631844 - Flags: review?(ehung) → review+
Hi Evelyn, I'm wondering if there is a protocol for adding a setting, to let people know about the new setting, via email or something. Or is ok to simply land this code without any kind of notification?
Flags: needinfo?(ehung)
I guess the protocol is dev-gaia mailing list, you could send a mail to tell people and get feedback there.
Flags: needinfo?(ehung)
Blocks: 1180699
feature-b2g: --- → 2.5+
Attachment #8620783 - Attachment is obsolete: true
Master: https://github.com/mozilla-b2g/gaia/commit/1305afffca13c3cd1d1125e4d5e51ef306f8ea4f
Status: ASSIGNED → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.