Shutdown time is read in the main thread

RESOLVED FIXED in mozilla20

Status

()

Toolkit
Telemetry
RESOLVED FIXED
5 years ago
5 years ago

People

(Reporter: espindola, Assigned: espindola)

Tracking

unspecified
mozilla20
x86
Mac OS X
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(2 attachments, 6 obsolete attachments)

Comment hidden (empty)
Status: NEW → ASSIGNED
QA Contact: respindola
Assignee: nobody → respindola
QA Contact: respindola
Depends on: 817012
Created attachment 688947 [details] [diff] [review]
patch

Most of the patch is just moving code to Telemetry.cpp. Let me know if you want me to split that into an independent patch.

The reason I am moving the code to Telemetry.cpp is that mCachedTelemetryData will also be used to guard the reading of late write stacks.
Attachment #688947 - Flags: review?(vdjeric)
https://tbpl.mozilla.org/?tree=Try&rev=11838ef9cd3c
Comment on attachment 688947 [details] [diff] [review]
patch

Benjamin should probably comment on the nsAppStartup bits.
Attachment #688947 - Flags: feedback?(benjamin)
Created attachment 689212 [details] [diff] [review]
test fixes

This should fix the test failures of the previous run. I will try to add the logic for reading the data during startup next.

https://tbpl.mozilla.org/?tree=Try&rev=9ffc149d0116
Attachment #688947 - Attachment is obsolete: true
Attachment #688947 - Flags: review?(vdjeric)
Attachment #688947 - Flags: feedback?(benjamin)
Attachment #689212 - Flags: review?(vdjeric)
Attachment #689212 - Flags: feedback?(benjamin)
Created attachment 689286 [details] [diff] [review]
Now with timer

This one includes the timer Vladan, Ehsan and I discussed. We already had an observer for xul-window-visible, so I reused that.
Attachment #689212 - Attachment is obsolete: true
Attachment #689212 - Flags: review?(vdjeric)
Attachment #689212 - Flags: feedback?(benjamin)
Attachment #689286 - Flags: review?(vdjeric)
Attachment #689286 - Flags: feedback?(benjamin)
Created attachment 690468 [details] [diff] [review]
Just move code to Telemetry.cpp

This patch just moves the code to Telemetry (and the nsITelemetry interface). This is hopefully easy to review and will make the "real" patch less noisy.

The reason moving this is that we also want to read stacks for the late writes and we want to read both asynchronously. Having all reads in telemetry makes it easier to move them to an async callback.

https://tbpl.mozilla.org/?tree=Try&rev=1960880d6d62
Attachment #690468 - Flags: review?(benjamin)
Attachment #690468 - Attachment is patch: true
Created attachment 690905 [details] [diff] [review]
Move code to Telemetry.cpp and s/NULL/nullptr/.
Attachment #690468 - Attachment is obsolete: true
Attachment #690468 - Flags: review?(benjamin)
Attachment #690905 - Flags: review?(vdjeric)
Created attachment 690967 [details] [diff] [review]
Patch for async read, rebased on top of the move patch
Attachment #689286 - Attachment is obsolete: true
Attachment #689286 - Flags: review?(vdjeric)
Attachment #689286 - Flags: feedback?(benjamin)
Attachment #690967 - Flags: review?(vdjeric)
https://tbpl.mozilla.org/?tree=Try&rev=d4836b7ed9b8
Comment on attachment 690967 [details] [diff] [review]
Patch for async read, rebased on top of the move patch

>diff --git a/toolkit/components/telemetry/Telemetry.cpp b/toolkit/components/telemetry/Telemetry.cpp
>   bool mCachedShutdownTime;
>   uint32_t mLastShutdownTime;
>+  std::vector<nsCOMPtr<nsIReadShutdownTimeCallback> > mCallbacks;
>+  friend class nsReadShutdownTime;

I'd prefer to have setter functions for these member variables instead of this class making friends. 
Why is there a vector of callbacks? We should only have one outstanding shutdown read at a time.. maybe make mCachedShutdownTime into a tri-state bool called mShutdownTimeState or something, and give it states READ|UNREAD|PENDING.

>+static uint32_t
>+ReadLastShutdownDuration(const char *filename) {
>+  FILE *f = fopen(filename, "r");
>+  if (!f) {
>+    return 0;
>+  }

Let's return -1 here and change return type to signed int. We don't want to have a shutdownDuration field in TelemetryPing anyway if we can't read the time from the file. Reporting 0 in case of error makes our shutdown times look better than they are in reality.

Also, why is ReadLastShutdownDuration a separate function from nsReadShutdownTime::Run? I'm not saying it's worse on its own, I'm wondering if you are intending to call it from somewhere else?

>+  int r = fscanf(f, "%d\n", &shutdownTime);

Nit: I know you didn't write this code, but I'm not a fan of single-letter variable names outside loop variables. So r -> ret, f -> shutdownTimeFile.
Apply throughout file, e.g. "nsCOMPtr<nsIRunnable> e"

>+class nsReadShutdownTime : public nsRunnable
>+{
>+  const char *mFilename;
>+  TelemetryImpl *mTelemetry;

Move nsReadShutdownTime and ReadLastShutdownDuration into an anonymous namespace. 
We always have an explicit access specifier for member variables and we put the public section with the class's constructor first.

>+  NS_IMETHOD Run() {
>+    uint32_t val = ReadLastShutdownDuration(mFilename);

Rename "val" to "duration".

>+    mTelemetry->mLastShutdownTime = val;

Make this into a setter method in TelemetryImpl. Same for mCachedShutdownTime

>+    nsCOMPtr<nsIRunnable> e =
>+      NS_NewRunnableMethod(this, &nsReadShutdownTime::MainThread);
>+    NS_DispatchToMainThread(e, NS_DISPATCH_NORMAL);
>+    return NS_OK;

NS_ENSURE_STATE(e) after the call to NS_NewRunnableMethod.
Is it possible for nsReadShutdownTime::MainThread to run after mTelemetry has been freed? If so, mTelemetry should be an nsCOMPtr<TelemetryImpl>

> NS_IMETHODIMP
> TelemetryImpl::GetLastShutdownDuration(uint32_t *aResult)
> {
>+  // The user must call ReadShutdownTime first. We return zero instead of
>+  // reporting a failure so that the rest of telemetry can uniformly handle
>+  // the read not being available yet.
>+  if (!mCachedShutdownTime) {
>     *aResult = 0;
>     return NS_OK;
>   }

As mentioned earlier, return -1 here

>+NS_IMETHODIMP
>+TelemetryImpl::ReadShutdownTime(nsIReadShutdownTimeCallback *aCallback)
>+{
>+  // We have finished reading the data already, just call the callback.
>+  if (mCachedShutdownTime) {
>+    aCallback->Complete();
>+    return NS_OK;
>+  }

I don't like having these callbacks just for tests.. Can't we just do a "sleep" call in tests?

>+  nsCOMPtr<nsIEventTarget> target =
>+    do_GetService(NS_STREAMTRANSPORTSERVICE_CONTRACTID);

Rename "target" to "targetThread" and add a comment explaining why we're getting the stream transport service here

>diff --git a/toolkit/components/telemetry/TelemetryPing.js b/toolkit/components/telemetry/TelemetryPing.js
>-    case "xul-window-visible":
>+    case "xul-window-visible": {

We don't use braces around case conditions in this file. It's ugly imho

>+      let timer = Cc["@mozilla.org/timer;1"]
>+                    .createInstance(Ci.nsITimer);

Nit: this can be on one line

>+      timer.initWithCallback(function() {
>+        Telemetry.readShutdownTime(function () {
>+        });
>+      }, 5413, Ci.nsITimer.TYPE_ONE_SHOT);

Put the magic number in a const variable at the top of the file.

> diff --git a/toolkit/components/telemetry/nsITelemetry.idl b/toolkit/components/telemetry/nsITelemetry.idl
>+[scriptable,function, uuid(5c25e4b0-398f-11e2-81c1-0800200c9a66)]
>+interface nsIReadShutdownTimeCallback : nsISupports
>+{
>+    void complete();
>+};

I don't think we really need this callback if it's only used in tests

> [scriptable, uuid(de54f594-4c20-4968-a27a-83b38ff952b9)]
> interface nsITelemetry : nsISupports
> {
...
>+
>+  void readShutdownTime(in nsIReadShutdownTimeCallback aCallback);
> };

Nit: rename to asyncReadShutdownTime

>diff --git a/toolkit/content/aboutTelemetry.js b/toolkit/content/aboutTelemetry.js
>   // Get the Telemetry Ping payload
>-  let ping = TelemetryPing.getPayload();
>+  Telemetry.readShutdownTime(function() {
>+    let ping = TelemetryPing.getPayload();

I don't think we need this.. xul-window-visible will have fired before we get here, and even if it hasn't, we can just display the default value (0 in current patch, -1 in my proposal). Either way, it accurately reflects the Telemetry data currently gathered
Attachment #690967 - Flags: review?(vdjeric) → review-
I'll review the "move" patch after we address the questions above
> >+  friend class nsReadShutdownTime;
> 
> I'd prefer to have setter functions for these member variables instead of
> this class making friends. 

Why? It is an important invariant that these member variables are not arbitrarily modified from the outside. If there is a style ban on friend I can make nsReadShutdownTime an inner class if you want.

> Why is there a vector of callbacks? We should only have one outstanding
> shutdown read at a time.. maybe make mCachedShutdownTime into a tri-state
> bool called mShutdownTimeState or something, and give it states
> READ|UNREAD|PENDING.

There can be only one pending read, but an arbitrary number of pending callbacks. Just imagine opening 100 about:telemetry  tabs before the read completes. When it does, the 100 tabs have to be notified. We could return a busy state, but that would just push the complexity to the caller.

> >+static uint32_t
> >+ReadLastShutdownDuration(const char *filename) {
> >+  FILE *f = fopen(filename, "r");
> >+  if (!f) {
> >+    return 0;
> >+  }
> 
> Let's return -1 here and change return type to signed int. We don't want to
> have a shutdownDuration field in TelemetryPing anyway if we can't read the
> time from the file. Reporting 0 in case of error makes our shutdown times
> look better than they are in reality.

This is fairly unrelated to making the read async, isn't it? Why does it have to done in this patch? Note that this change would not be "user" facing anyway, TelemetryPing.js has

  let shutdownDuration = Services.startup.lastShutdownDuration;
  if (shutdownDuration)
    ret.shutdownDuration = shutdownDuration;

So 0 is not reported.


> Also, why is ReadLastShutdownDuration a separate function from
> nsReadShutdownTime::Run? I'm not saying it's worse on its own, I'm wondering
> if you are intending to call it from somewhere else?

No, it was just easier for the refactoring.

> >+  int r = fscanf(f, "%d\n", &shutdownTime);
> 
> Nit: I know you didn't write this code, but I'm not a fan of single-letter
> variable names outside loop variables. So r -> ret, f -> shutdownTimeFile.
> Apply throughout file, e.g. "nsCOMPtr<nsIRunnable> e"

Again, this is unrelated to making the read async. Lets please focus on that in this change, the process is already way too slow as is.

> >+class nsReadShutdownTime : public nsRunnable
> >+{
> >+  const char *mFilename;
> >+  TelemetryImpl *mTelemetry;
> 
> Move nsReadShutdownTime and ReadLastShutdownDuration into an anonymous
> namespace. 

They are in one. It starts in line 38 and ends in line 1819. This is line 693 to 734.

> We always have an explicit access specifier for member variables and we put
> the public section with the class's constructor first.

Will do.

> >+  NS_IMETHOD Run() {
> >+    uint32_t val = ReadLastShutdownDuration(mFilename);
> 
> Rename "val" to "duration".

Was only used once, removed.

> >+    mTelemetry->mLastShutdownTime = val;
> 
> Make this into a setter method in TelemetryImpl. Same for mCachedShutdownTime

See above.

> >+    nsCOMPtr<nsIRunnable> e =
> >+      NS_NewRunnableMethod(this, &nsReadShutdownTime::MainThread);
> >+    NS_DispatchToMainThread(e, NS_DISPATCH_NORMAL);
> >+    return NS_OK;
> 
> NS_ENSURE_STATE(e) after the call to NS_NewRunnableMethod.

done

> Is it possible for nsReadShutdownTime::MainThread to run after mTelemetry
> has been freed? If so, mTelemetry should be an nsCOMPtr<TelemetryImpl>

done.
 
> > NS_IMETHODIMP
> > TelemetryImpl::GetLastShutdownDuration(uint32_t *aResult)
> > {
> >+  // The user must call ReadShutdownTime first. We return zero instead of
> >+  // reporting a failure so that the rest of telemetry can uniformly handle
> >+  // the read not being available yet.
> >+  if (!mCachedShutdownTime) {
> >     *aResult = 0;
> >     return NS_OK;
> >   }
> 
> As mentioned earlier, return -1 here

This patch is not changing this.

> >+NS_IMETHODIMP
> >+TelemetryImpl::ReadShutdownTime(nsIReadShutdownTimeCallback *aCallback)
> >+{
> >+  // We have finished reading the data already, just call the callback.
> >+  if (mCachedShutdownTime) {
> >+    aCallback->Complete();
> >+    return NS_OK;
> >+  }
> 
> I don't like having these callbacks just for tests.. Can't we just do a
> "sleep" call in tests?

Sorry, which callbacks are just for tests? The one you quoted is not, if you open about:telemetry, wait for the read and open it again that is the callback that is called.

> >+  nsCOMPtr<nsIEventTarget> target =
> >+    do_GetService(NS_STREAMTRANSPORTSERVICE_CONTRACTID);
> 
> Rename "target" to "targetThread" and add a comment explaining why we're
> getting the stream transport service here

done.

> >diff --git a/toolkit/components/telemetry/TelemetryPing.js b/toolkit/components/telemetry/TelemetryPing.js
> >-    case "xul-window-visible":
> >+    case "xul-window-visible": {
> 
> We don't use braces around case conditions in this file. It's ugly imho

What should I do for creating the "let timer"? Without the brace its scope will be the function no? Should I create a helper function and do

case "xul-window-visibile":
   this.xulWindowVisible();

?


> >+      let timer = Cc["@mozilla.org/timer;1"]
> >+                    .createInstance(Ci.nsITimer);
> 
> Nit: this can be on one line

Done.
 
> >+      timer.initWithCallback(function() {
> >+        Telemetry.readShutdownTime(function () {
> >+        });
> >+      }, 5413, Ci.nsITimer.TYPE_ONE_SHOT);
> 
> Put the magic number in a const variable at the top of the file.

Done.

> > diff --git a/toolkit/components/telemetry/nsITelemetry.idl b/toolkit/components/telemetry/nsITelemetry.idl
> >+[scriptable,function, uuid(5c25e4b0-398f-11e2-81c1-0800200c9a66)]
> >+interface nsIReadShutdownTimeCallback : nsISupports
> >+{
> >+    void complete();
> >+};
> 
> I don't think we really need this callback if it's only used in tests

See above.

> > [scriptable, uuid(de54f594-4c20-4968-a27a-83b38ff952b9)]
> > interface nsITelemetry : nsISupports
> > {
> ...
> >+
> >+  void readShutdownTime(in nsIReadShutdownTimeCallback aCallback);
> > };
> 
> Nit: rename to asyncReadShutdownTime

Done.
 
> >diff --git a/toolkit/content/aboutTelemetry.js b/toolkit/content/aboutTelemetry.js
> >   // Get the Telemetry Ping payload
> >-  let ping = TelemetryPing.getPayload();
> >+  Telemetry.readShutdownTime(function() {
> >+    let ping = TelemetryPing.getPayload();
> 
> I don't think we need this.. xul-window-visible will have fired before we
> get here, and even if it hasn't, we can just display the default value (0 in
> current patch, -1 in my proposal). Either way, it accurately reflects the
> Telemetry data currently gathered

Lets please not do this. Really. There is no point in presenting inaccurate data to our users. Remember that stacks for the late writes will be read in here to. It would be really surprising to open about:telemetry, se that there were no late writes, refresh and see that now there are some.
(In reply to Vladan Djeric (:vladan) from comment #11)
> I'll review the "move" patch after we address the questions above

Please no. The reason for splitting the patches is to speed up our tremendously slow code review process, not slow it down!
Created attachment 691827 [details] [diff] [review]
new patch on top of the move patch

https://tbpl.mozilla.org/?tree=Try&rev=f28f7441497a
Attachment #690967 - Attachment is obsolete: true
Attachment #691827 - Flags: review?(vdjeric)
Attachment #690905 - Flags: review?(vdjeric) → review?(ehsan)
Comment on attachment 690905 [details] [diff] [review]
Move code to Telemetry.cpp and s/NULL/nullptr/.

Review of attachment 690905 [details] [diff] [review]:
-----------------------------------------------------------------

::: toolkit/components/startup/public/nsIAppStartup.idl
@@ -45,5 @@
>      /**
> -     * The amount of time, in milliseconds, that the last session took
> -     * to shutdown.  Reads as 0 to indicate failure.
> -     */
> -    readonly attribute uint32_t lastShutdownDuration;

You need to rev the uuid.

::: toolkit/components/telemetry/nsITelemetry.idl
@@ +39,5 @@
> +  /**
> +   * The amount of time, in milliseconds, that the last session took
> +   * to shutdown.  Reads as 0 to indicate failure.
> +   */
> +  readonly attribute uint32_t lastShutdownDuration;

Please rev the uuid here as well.
Attachment #690905 - Flags: review?(ehsan) → review+
https://hg.mozilla.org/integration/mozilla-inbound/rev/2e34b7c8a339
Created attachment 691887 [details] [diff] [review]
fix it. Rebased on top of inbound.

https://tbpl.mozilla.org/?tree=Try&rev=eb3ebcf79643
Attachment #691827 - Attachment is obsolete: true
Attachment #691827 - Flags: review?(vdjeric)
Attachment #691887 - Flags: review?(vdjeric)
Comment on attachment 691887 [details] [diff] [review]
fix it. Rebased on top of inbound.

After our convo, I've come to recognize the benefit of having callbacks for async operations in test code, so that invalidates a lot of my previous comments.

(In reply to Rafael Ávila de Espíndola (:espindola) from comment #12)
> Why? It is an important invariant that these member variables are not
> arbitrarily modified from the outside. If there is a style ban on friend I
> can make nsReadShutdownTime an inner class if you want.

No, there is no style ban. I would prefer an inner class over friend, but I guess that is really a matter of personal taste, so let's leave it.

> > >+  int r = fscanf(f, "%d\n", &shutdownTime);
> > 
> > Nit: I know you didn't write this code, but I'm not a fan of single-letter
> > variable names outside loop variables. So r -> ret, f -> shutdownTimeFile.
> > Apply throughout file, e.g. "nsCOMPtr<nsIRunnable> e"
> 
> Again, this is unrelated to making the read async. Lets please focus on that
> in this change, the process is already way too slow as is.

We can fix the variable names in a follow up bug or in a no-bug commit, but we might as well do it now since we're changing the code. I'll leave the decision up to you.

> What should I do for creating the "let timer"? Without the brace its scope
> will be the function no? Should I create a helper function and do

As we discussed, it has essentially the same lifetime with & without the braces. For consistency with rest of the code in this function, I would vote for just taking out the curly braces.

> > Is it possible for nsReadShutdownTime::MainThread to run after mTelemetry
> > has been freed? If so, mTelemetry should be an nsCOMPtr<TelemetryImpl>
> 
> done.

Since we're keeping these classes as friends, you could also just directly set mTelemetry = TelemetryImpl::sTelemetry. The nsCOMPtr is still important for keeping TelemetryImpl alive for accesses off the main thread.

> const SHUTDOWN_TIME_READ_DELAY = 5413;

Nit: SHUTDOWN_TIME_READ_DELAY_MS

> void asyncReadShutdownTime(in nsIReadShutdownTimeCallback aCallback);

Add a description of the function like the others in the IDL file

>   Telemetry.asyncReadShutdownTime(function() {

Let's move the code for displaying "addon histogram data" before the asyncReadShutdownTime call and the rest of the callback code that relies on the ping payload into its own function.
Attachment #691887 - Flags: review?(vdjeric)
Attachment #691887 - Flags: review+
https://hg.mozilla.org/integration/mozilla-inbound/rev/babbf864b10c
https://hg.mozilla.org/mozilla-central/rev/2e34b7c8a339
https://hg.mozilla.org/mozilla-central/rev/babbf864b10c
Status: ASSIGNED → RESOLVED
Last Resolved: 5 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla20
You need to log in before you can comment on or make changes to this bug.