1584889 - [Thunderbird Telemetry] Collect plain text vs. html composition usage

Assignee

Description

•

6 years ago

What:
Add a telemetry probe to collect stats on whether user is using html or plain text as the default message composition format.

Why:
Presumably this would help guide our assumptions on which format was the best to set as the default (does anyone else have a more compelling argument to add?).

Expiration:
None - keep collecting this indefinitely.
(although obviously it'll become redundant when everyone realizes that html email is wrong and we drop support for it, thus making the world a better place ;- ).

Ben Campbell

Assignee

Comment 1

•

6 years ago

OK, so this is my first run through adding a real telemetry probe for Thunderbird, so lots of unknowns to work out.

The main thing I'm not sure about is how to handle multiple identities - users can have multiple accounts, and accounts can have multiple identities, and identities are where the composition setting is held (compose_html bool in prefs).

It feels like the "right" way is to use a Categorical Histogram (with "html" and "plain" categories, say). So I might have 3 accounts set up - a "plain" count of 2 and an "html" count of 1.
However, I'd want to update that histogram whenever the compose_html prefs were changed or new identities added/removed.
But there doesn't seem to be to a way to replace histogram counts, only to further accumulate. (there is a clear() method on histograms, but that's only for use in testing).
So I'm guessing a histogram isn't appropriate here.

The alternative is to use integer scalars - eg one for each of "html", "plain", to hold the number of identities using each setting.
But I'm not quite sure how useful that'd be on the data-analysis side... I guess you'd just be summing up all the "html" counts against the "plain" counts and that's your html/plain proportion... but I really don't know.

Chris: Got any words of wisdom for us?

Flags: needinfo?(chutten)

Jorg K (CEST = GMT+2)

Comment 2

•

6 years ago

•

Edited

Add a telemetry probe to collect stats on whether user is using html or plain text as the default message composition format.

I also what to know the following:

Is paragraph style used
Are custom colour used
How many dictionaries installed
Signature used (together with some details)
How do people reply, below or above the quote, etc.

I'm sure I could think of much more.

Chris H-C :chutten

Comment 3

•

6 years ago

Integer scalars would work except that scalars are sent exactly once and then are cleared. (This makes them more suited for counting the number of times the setting is changed instead of collecting its current value).

Where these preference-types of collections in Firefox usually end up (to my chagrin given its lack of expiry, documentation, testing utilities, and data collection ownership information) is the Environment in the userPrefs section. This is a non-scalar way of reporting any non-default setting with each and every ping (no clearing).

A further option would be a custom ping of your own devising. If it fits in a JS object, it can be sent as a payload via submitExternalPing, so you can manage the representation, lifetime, and reporting schedule manually. Using "multi-store" (the record_into_stores property available on Histograms and Scalars) you can still take advantage of Scalars' and Histograms' features (documentation, definitions files, IPC, test utilities, etc.) for scalar- and histogram-shaped data while also controlling when things are sent and how.

For this example specifically you could have those scalars

identity.pref:
  composes_html:
    kind: uint
    record_into_store: ["tb_prefs"]
    ...
  composes_plain_text:
    kind: uint
    record_into_store: ["tb_prefs"]
    ...

Then your identity prefs code can scalarSet those values appropriately. At shutdown you can build your ping payload and submit it

let payload = {};
let scalarSnapshot = Telemetry.getSnapshotForScalars("tb_prefs", true /* clear */, true /* filter test probes */);
payload.composes_html = scalarSnapshot.parent.composes_html / (scalarSnapshot.parent.composes_html + scalarSnapshot.parent.composes_plain_text); // if you wanted the ratio. You could also just report the raw values.
TelemetryController.submitExternalPing(TB_IDENTITY_PREFS_PING_NAME, payload, options);

We have some docs about custom pings.

From a Data Stewardship POV I would recommend getting in the habit of asking "What questions are these data required to answer?" before adding more measurements. A nice exercise is examining hypothetical actions you might take for probably outcomes. For example, using "How do people reply, below or above the quote", what actions will we take if:

98% of identities reply above the quote? 50%? 1%?
The number of identities replying above the quote sharply declines? Sharply increases? Gradually increases?

If we don't have concrete actions that will result from this information, maybe we don't need the data after all. The best quality of data is the data you don't collect.

Hope this helps!

Flags: needinfo?(chutten)

Ben Campbell

Assignee

Updated

•

6 years ago

Assignee: nobody → benc

Ben Campbell

Assignee

Updated

•

6 years ago

Depends on: 1597179

Ben Campbell

Assignee

Updated

•

6 years ago

Depends on: 1597180

Magnus Melin [:mkmelin]

Comment 4

•

6 years ago

For the html vs. plain text composition I'm not sure the prefs are that interesting, I'd just collect actual usage and don't care about the prefs. So maybe just a probe that would increment each time the compose window comes up? (Let's do one bug per probe.)

Magnus Melin [:mkmelin]

Updated

•

6 years ago

Summary: [Thunderbird Telemetry] Collect plain/html composition settings. → [Thunderbird Telemetry] Collect plain text vs. html composition usage

Jorg K (CEST = GMT+2)

Comment 5

•

6 years ago

Sure. That can then also transmit whether they're using paragraph mode or background colours, and format flowed, dictionaries, signature, where the reply is (see comment #2).

Ben Campbell

Assignee

Comment 6

•

6 years ago

Attached patch 1584889-add-html-text-compose-telemetry-1.patch (obsolete) — Details — Splinter Review

Ben Campbell

Assignee

Comment 7

•

6 years ago

(In reply to Magnus Melin [:mkmelin] from comment #4)

For the html vs. plain text composition I'm not sure the prefs are that interesting, I'd just collect actual usage and don't care about the prefs. So maybe just a probe that would increment each time the compose window comes up? (Let's do one bug per probe.)

That makes sense to me. Just uploaded a patch to do this.
(seems to work fine in the unit test, but not seeing any telemetry reporting in my local test setup, so will hold out on review for now)

Magnus Melin [:mkmelin]

Updated

•

6 years ago

Status: NEW → ASSIGNED

Ben Campbell

Assignee

Comment 8

•

6 years ago

Comment on attachment 9111125 [details] [diff] [review] 1584889-add-html-text-compose-telemetry-1.patch This one should be good to go, now that Bug 1597180 has landed. The patch still applies cleanly and it all seems to work fine ([try build](https://treeherder.mozilla.org/#/jobs?repo=try-comm-central&selectedJob=286559173))

Attachment #9111125 - Flags: review?(mkmelin+mozilla)

Jorg K (CEST = GMT+2)

Comment 9

•

6 years ago

Comment on attachment 9111125 [details] [diff] [review] 1584889-add-html-text-compose-telemetry-1.patch What is the argument against collecting more data, see comment #2? Can you send telemetry from JS code? Then you should do it from MsgComposeCommands.js.

Attachment #9111125 - Flags: feedback-

Magnus Melin [:mkmelin]

Comment 10

•

6 years ago

Jörg, I think those would be separate probes. They should have own bugs, each with justifications of why we'd want them. Some of the data (at least dictionaries) may already be available.

Magnus Melin [:mkmelin]

Comment 11

•

6 years ago

Comment on attachment 9111125 [details] [diff] [review] 1584889-add-html-text-compose-telemetry-1.patch Review of attachment 9111125 [details] [diff] [review]: ----------------------------------------------------------------- Looks pretty ok with the below changed ::: mail/components/telemetry/Scalars.yaml @@ +54,5 @@ > +tb.compose: > + format_html: > + bug_numbers: > + - 1584889 > + description: How many times nsMsgCompose has been invoked for HTML composition. A bit too technical perhaps. Maybe "How many times messages were written in HTML composition mode." (And corresponding change below.) @@ +60,5 @@ > + products: > + - 'thunderbird' > + kind: uint > + notification_emails: > + - "telemetry-client-dev@thunderbird.net" Was this already set up? If not, contact sancus ::: mailnews/compose/test/unit/test_telemetry.js @@ +1,4 @@ > +/* Any copyright is dedicated to the Public Domain. > + * http://creativecommons.org/publicdomain/zero/1.0/ */ > + > +/* Test telemetry related to message composition */ /** Style documentation please The test file should be named more specifically. Maybe test_telemetry_compose.js @@ +2,5 @@ > + * http://creativecommons.org/publicdomain/zero/1.0/ */ > + > +/* Test telemetry related to message composition */ > + > +const Telemetry = Services.telemetry; short enough that I 'd just use Services.telemetry where needed @@ +19,5 @@ > +add_task(async function test_compose_format() { > + Telemetry.clearScalars(); > + > + // Bare-bones code to initiate composing a message in given format. > + let compose = function(fmt) { for findability, would be preferabe with a more distinct name, like createCompose @@ +24,5 @@ > + let msgCompose = Cc[ > + "@mozilla.org/messengercompose/compose;1" > + ].createInstance(Ci.nsIMsgCompose); > + > + var params = Cc[ let @@ +34,5 @@ > + }; > + > + // Start composing arbitrary numbers of messages in each format. > + const num_html = 7; > + const num_plain = 13; For constants, would prefer uppercased, as NUM_HTML

Attachment #9111125 - Flags: review?(mkmelin+mozilla)

Jorg K (CEST = GMT+2)

Comment 12

•

6 years ago

Can someone please answer:

Can you send telemetry from JS code? Then you should do it from MsgComposeCommands.js.

Chris H-C :chutten

Comment 13

•

6 years ago

Hi! You can definitely record Telemetry from JS code, so long as you can get to the Telemetry service (the same one the test's using). It has JS bindings (the one you want is probably Services.telemetry.scalarAdd("tb.compose.format_{html|plain_text}", 1);) for both recording and testing.

Ben Campbell

Assignee

Comment 14

•

6 years ago

Attached patch 1584889-add-html-text-compose-telemetry-2.patch — Details — Splinter Review

Here's a patch which incorporates the suggestions from Magnus (holding off on review until the email address is confirmed and the JS issue is decided).

I did the probe on the C++ side because I'm pretty hazy about possible entry points from JS.
I'm happy to move it to the JS side instead, if anyone can suggest a definitive place for it (although remember that adding probes causes C++ code generation, so doing it all in JS doesn't save us any compilation, unfortunately).

Skimming through MsgComposeCommands.js, ComposeStartup() looks like the obvious candidate... but there's a lot of stuff in there tied to the GUI, which might make it an arse to unit-test?

Attachment #9111125 - Attachment is obsolete: true

Jorg K (CEST = GMT+2)

Comment 15

•

6 years ago

Looking at comment #2, you can see that I asked for a whole lot more information to be captured. Yes, it's nice to know whether plaintext or HTML is used, but if you want to tweak other compose functionality, need have to have more data. So the idea for me was to grab all that information in a centralised place. For example, C++ code never sees and dictionary count, colour information, etc. But it should all be available in the file I pointed out.

I see that you put the probe in nsMsgCompose::Initialize() which is called via InitCompose() in ComposeStartup(). So for me, the natural place would be to put it there.

In this context, may I point you to attachment 8819691 [details]. It was never completed, but is useful to some extent.

Jorg K (CEST = GMT+2)

Comment 16

•

6 years ago

Another thing you should definitely capture is the compose type, so new, reply, "edit draft", etc., to get some idea what people are doing. It would also be good to know that "type" of message they are composing. Isn't plaintext the default for news messages?

Jorg K (CEST = GMT+2)

Comment 17

•

6 years ago

•

Edited

Since you got me started on this: To argue for or against bug 862292 (the one that lobbies to use UTF-8 for outgoing messages exclusively), you should capture the charset used. There is some editor initialization code which should be able to get it from the editor.

Magnus Melin [:mkmelin]

Comment 18

•

6 years ago

(In reply to Jorg K (GMT+1) from comment #17)
That's not the correct logic, since people don't understand charsets. It will be whatever the default is. And since everything can be sent in UTF-8, you can't detect it that way either.

I think some of the other things mentioned to collect would be useful. Still, they should each have separate bugs with justifications on why to have them.

Ben Campbell

Assignee

Comment 19

•

6 years ago

I've just been looking at moving the probe over to the JS side, but ComposeStartup() seems to be the last stop before we're into C++ land, and it's too tied in to the GUI to unit test. So I'm going to leave the patch as is.

Ben Campbell

Assignee

Updated

•

6 years ago

Attachment #9123439 - Flags: review?(mkmelin+mozilla)

Jorg K (CEST = GMT+2)

Comment 20

•

6 years ago

Well, surely there are tests for composing messages. Besides, if you want to take other probes, the information might only be available in the JS code.

Ben Campbell

Assignee

Comment 21

•

6 years ago

(In reply to Jorg K (GMT+1) from comment #20)

Well, surely there are tests for composing messages.

Yes... but they all bypass ComposeStartUp() and call the C++-implemented nsIMsgCompose directly (instantiated via MailServices.compose.initCompose()).

Besides, if you want to take other probes, the information might only be available in the JS code.

That's fine. Both JS and C++ can invoke the telemetry-recording functions. In general, I'd lean toward putting them on JS side where easier, but doing in them in C++ is perfectly legit.
(for a rough count comparison, M-C looks like it's got about 50 or so scalarAdd() calls in JS and roughly the same in C++).

Magnus Melin [:mkmelin]

Comment 22

•

6 years ago

Comment on attachment 9123439 [details] [diff] [review] 1584889-add-html-text-compose-telemetry-2.patch Review of attachment 9123439 [details] [diff] [review]: ----------------------------------------------------------------- Looks good. For testing I have gzipServer, and run ` ./gzipServer.py -p 8080` Then after a while `grep -r tb.compose.format_html .` in the working directory will find a report with the number of compositions in html

Attachment #9123439 - Flags: review?(mkmelin+mozilla) → review+

Pulsebot

Comment 23

•

6 years ago

Pushed by mkmelin@iki.fi:
https://hg.mozilla.org/comm-central/rev/27929debc910
Add telemetry to count use of HTML vs plaintext message composition. r=mkmelin

Status: ASSIGNED → RESOLVED

Closed: 6 years ago

Resolution: --- → FIXED

Magnus Melin [:mkmelin]

Updated

•

6 years ago

Target Milestone: --- → Thunderbird 74.0

Jorg K (CEST = GMT+2)

Comment 24

•

6 years ago

(In reply to Magnus Melin [:mkmelin] from comment #18)

I think some of the other things mentioned to collect would be useful. Still, they should each have separate bugs with justifications on why to have them.

Can you file those bugs then?

Frank-Rainer Grahl (:frg)

Comment 25

•

6 years ago

Attached patch 1584889-suite-bustage.patch — Details — Splinter Review

Tis bug broke compiling SeaMonkey. If you add mail only telemetry please either surround with not NOZ_SUITE or put the declares into the shared mailnews code.

The patch from Bill fixes it. Fixing tests is not a priority. Suite tests are unfortunately broken for a long time.

Please check in or let me know when I should do it.

Flags: needinfo?(benc)

Attachment #9124709 - Flags: review+

mac198442

Comment 26

•

6 years ago

Since it would seem the intention here was to capture what Tunderbird users were using and not any other mail client that might be now, or later on developed using mailnews. This should probably really be within an #ifdef MOZ_THUNDERBIRD rather than an #ifndef MOZ_SUITE

mac198442

Comment 27

•

6 years ago

So that you only capture relevant data.

mac198442

Comment 28

•

6 years ago

TO make it clear, if the intent is to find what Thunderbird users do, then collecting data from other mail clients is NOT what you what to be doing.

Magnus Melin [:mkmelin]

Comment 29

•

6 years ago

Any Thunderbird derivative would also have MOZ_THUNDERBIRD set, so that's not an issue. I'll land the patch shortly.

Pulsebot

Comment 30

•

6 years ago

Pushed by mkmelin@iki.fi: https://hg.mozilla.org/comm-central/rev/4a30d3d764d2 Make Telemetry ifndef MOZ_SUITE to fix suite build bustage. r=frg

Ben Campbell

Assignee

Comment 31

•

6 years ago

(In reply to Frank-Rainer Grahl (:frg) from comment #25)

Tis bug broke compiling SeaMonkey. If you add mail only telemetry please either surround with not NOZ_SUITE or put the declares into the shared mailnews code.

Sorry about that, will keep it in mind.
I guess that currently applies to all the c-c telemetry. All the probe #defines are generated at build time, so none of them will be available in SeaMonkey without some build fiddling...

Flags: needinfo?(benc)

Frank-Rainer Grahl (:frg)

Comment 32

•

6 years ago

Sorry about that, will keep it in mind.

Thanks

I guess that currently applies to all the c-c telemetry. All the probe #defines are generated at build time, so none of them will be available in SeaMonkey without some build fiddling...

I doubt we will or want to enable telemetry any time soon. If it just needs build changes to avoid ifdefs around each location in the shared components let us know. Currently the problem is that everything / the defines for mailnews is in mail only.

Magnus Melin [:mkmelin]

Comment 33

•

5 years ago

(In reply to Jorg K (GMT+1) from comment #24)

Can you file those bugs then?

If filed bug 1615996 (and a bunch of others unrelated). For the other suggestions, thinking about them I don't know what they could collect: it would only be the more or less the default values.

1584889-add-html-text-compose-telemetry-1.patch 6 years ago Ben Campbell 5.29 KB, patch	jorgk-bmo : feedback-	Details \| Diff \| Splinter Review
1584889-add-html-text-compose-telemetry-2.patch 6 years ago Ben Campbell 5.31 KB, patch	mkmelin : review+	Details \| Diff \| Splinter Review
1584889-suite-bustage.patch 6 years ago Frank-Rainer Grahl (:frg) 1.31 KB, patch	frg : review+	Details \| Diff \| Splinter Review