Last Comment Bug 815709 - Shutdown time is read in the main thread
: Shutdown time is read in the main thread
Status: RESOLVED FIXED
:
Product: Toolkit
Classification: Components
Component: Telemetry (show other bugs)
: unspecified
: x86 Mac OS X
: -- normal (vote)
: mozilla20
Assigned To: Rafael Ávila de Espíndola (:espindola) (not reading bugmail)
:
Mentors:
Depends on: 817012
Blocks: 814765
  Show dependency treegraph
 
Reported: 2012-11-27 10:29 PST by Rafael Ávila de Espíndola (:espindola) (not reading bugmail)
Modified: 2012-12-14 07:34 PST (History)
4 users (show)
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---


Attachments
patch (17.11 KB, patch)
2012-12-05 13:48 PST, Rafael Ávila de Espíndola (:espindola) (not reading bugmail)
no flags Details | Diff | Review
test fixes (24.46 KB, patch)
2012-12-06 07:51 PST, Rafael Ávila de Espíndola (:espindola) (not reading bugmail)
no flags Details | Diff | Review
Now with timer (19.09 KB, patch)
2012-12-06 10:36 PST, Rafael Ávila de Espíndola (:espindola) (not reading bugmail)
no flags Details | Diff | Review
Just move code to Telemetry.cpp (15.51 KB, patch)
2012-12-10 10:51 PST, Rafael Ávila de Espíndola (:espindola) (not reading bugmail)
no flags Details | Diff | Review
Move code to Telemetry.cpp and s/NULL/nullptr/. (15.53 KB, patch)
2012-12-11 08:44 PST, Rafael Ávila de Espíndola (:espindola) (not reading bugmail)
ehsan: review+
Details | Diff | Review
Patch for async read, rebased on top of the move patch (13.82 KB, patch)
2012-12-11 11:02 PST, Rafael Ávila de Espíndola (:espindola) (not reading bugmail)
vladan.bugzilla: review-
Details | Diff | Review
new patch on top of the move patch (14.60 KB, patch)
2012-12-13 07:49 PST, Rafael Ávila de Espíndola (:espindola) (not reading bugmail)
no flags Details | Diff | Review
fix it. Rebased on top of inbound. (14.42 KB, patch)
2012-12-13 10:33 PST, Rafael Ávila de Espíndola (:espindola) (not reading bugmail)
vladan.bugzilla: review+
Details | Diff | Review

Description Rafael Ávila de Espíndola (:espindola) (not reading bugmail) 2012-11-27 10:29:38 PST

    
Comment 1 Rafael Ávila de Espíndola (:espindola) (not reading bugmail) 2012-12-05 13:48:15 PST
Created attachment 688947 [details] [diff] [review]
patch

Most of the patch is just moving code to Telemetry.cpp. Let me know if you want me to split that into an independent patch.

The reason I am moving the code to Telemetry.cpp is that mCachedTelemetryData will also be used to guard the reading of late write stacks.
Comment 2 Rafael Ávila de Espíndola (:espindola) (not reading bugmail) 2012-12-05 13:52:13 PST
https://tbpl.mozilla.org/?tree=Try&rev=11838ef9cd3c
Comment 3 Nathan Froyd [:froydnj] 2012-12-05 13:54:09 PST
Comment on attachment 688947 [details] [diff] [review]
patch

Benjamin should probably comment on the nsAppStartup bits.
Comment 4 Rafael Ávila de Espíndola (:espindola) (not reading bugmail) 2012-12-06 07:51:23 PST
Created attachment 689212 [details] [diff] [review]
test fixes

This should fix the test failures of the previous run. I will try to add the logic for reading the data during startup next.

https://tbpl.mozilla.org/?tree=Try&rev=9ffc149d0116
Comment 5 Rafael Ávila de Espíndola (:espindola) (not reading bugmail) 2012-12-06 10:36:58 PST
Created attachment 689286 [details] [diff] [review]
Now with timer

This one includes the timer Vladan, Ehsan and I discussed. We already had an observer for xul-window-visible, so I reused that.
Comment 6 Rafael Ávila de Espíndola (:espindola) (not reading bugmail) 2012-12-10 10:51:44 PST
Created attachment 690468 [details] [diff] [review]
Just move code to Telemetry.cpp

This patch just moves the code to Telemetry (and the nsITelemetry interface). This is hopefully easy to review and will make the "real" patch less noisy.

The reason moving this is that we also want to read stacks for the late writes and we want to read both asynchronously. Having all reads in telemetry makes it easier to move them to an async callback.

https://tbpl.mozilla.org/?tree=Try&rev=1960880d6d62
Comment 7 Rafael Ávila de Espíndola (:espindola) (not reading bugmail) 2012-12-11 08:44:04 PST
Created attachment 690905 [details] [diff] [review]
Move code to Telemetry.cpp and s/NULL/nullptr/.
Comment 8 Rafael Ávila de Espíndola (:espindola) (not reading bugmail) 2012-12-11 11:02:02 PST
Created attachment 690967 [details] [diff] [review]
Patch for async read, rebased on top of the move patch
Comment 9 Rafael Ávila de Espíndola (:espindola) (not reading bugmail) 2012-12-11 11:03:30 PST
https://tbpl.mozilla.org/?tree=Try&rev=d4836b7ed9b8
Comment 10 Vladan Djeric (:vladan) 2012-12-12 16:32:19 PST
Comment on attachment 690967 [details] [diff] [review]
Patch for async read, rebased on top of the move patch

>diff --git a/toolkit/components/telemetry/Telemetry.cpp b/toolkit/components/telemetry/Telemetry.cpp
>   bool mCachedShutdownTime;
>   uint32_t mLastShutdownTime;
>+  std::vector<nsCOMPtr<nsIReadShutdownTimeCallback> > mCallbacks;
>+  friend class nsReadShutdownTime;

I'd prefer to have setter functions for these member variables instead of this class making friends. 
Why is there a vector of callbacks? We should only have one outstanding shutdown read at a time.. maybe make mCachedShutdownTime into a tri-state bool called mShutdownTimeState or something, and give it states READ|UNREAD|PENDING.

>+static uint32_t
>+ReadLastShutdownDuration(const char *filename) {
>+  FILE *f = fopen(filename, "r");
>+  if (!f) {
>+    return 0;
>+  }

Let's return -1 here and change return type to signed int. We don't want to have a shutdownDuration field in TelemetryPing anyway if we can't read the time from the file. Reporting 0 in case of error makes our shutdown times look better than they are in reality.

Also, why is ReadLastShutdownDuration a separate function from nsReadShutdownTime::Run? I'm not saying it's worse on its own, I'm wondering if you are intending to call it from somewhere else?

>+  int r = fscanf(f, "%d\n", &shutdownTime);

Nit: I know you didn't write this code, but I'm not a fan of single-letter variable names outside loop variables. So r -> ret, f -> shutdownTimeFile.
Apply throughout file, e.g. "nsCOMPtr<nsIRunnable> e"

>+class nsReadShutdownTime : public nsRunnable
>+{
>+  const char *mFilename;
>+  TelemetryImpl *mTelemetry;

Move nsReadShutdownTime and ReadLastShutdownDuration into an anonymous namespace. 
We always have an explicit access specifier for member variables and we put the public section with the class's constructor first.

>+  NS_IMETHOD Run() {
>+    uint32_t val = ReadLastShutdownDuration(mFilename);

Rename "val" to "duration".

>+    mTelemetry->mLastShutdownTime = val;

Make this into a setter method in TelemetryImpl. Same for mCachedShutdownTime

>+    nsCOMPtr<nsIRunnable> e =
>+      NS_NewRunnableMethod(this, &nsReadShutdownTime::MainThread);
>+    NS_DispatchToMainThread(e, NS_DISPATCH_NORMAL);
>+    return NS_OK;

NS_ENSURE_STATE(e) after the call to NS_NewRunnableMethod.
Is it possible for nsReadShutdownTime::MainThread to run after mTelemetry has been freed? If so, mTelemetry should be an nsCOMPtr<TelemetryImpl>

> NS_IMETHODIMP
> TelemetryImpl::GetLastShutdownDuration(uint32_t *aResult)
> {
>+  // The user must call ReadShutdownTime first. We return zero instead of
>+  // reporting a failure so that the rest of telemetry can uniformly handle
>+  // the read not being available yet.
>+  if (!mCachedShutdownTime) {
>     *aResult = 0;
>     return NS_OK;
>   }

As mentioned earlier, return -1 here

>+NS_IMETHODIMP
>+TelemetryImpl::ReadShutdownTime(nsIReadShutdownTimeCallback *aCallback)
>+{
>+  // We have finished reading the data already, just call the callback.
>+  if (mCachedShutdownTime) {
>+    aCallback->Complete();
>+    return NS_OK;
>+  }

I don't like having these callbacks just for tests.. Can't we just do a "sleep" call in tests?

>+  nsCOMPtr<nsIEventTarget> target =
>+    do_GetService(NS_STREAMTRANSPORTSERVICE_CONTRACTID);

Rename "target" to "targetThread" and add a comment explaining why we're getting the stream transport service here

>diff --git a/toolkit/components/telemetry/TelemetryPing.js b/toolkit/components/telemetry/TelemetryPing.js
>-    case "xul-window-visible":
>+    case "xul-window-visible": {

We don't use braces around case conditions in this file. It's ugly imho

>+      let timer = Cc["@mozilla.org/timer;1"]
>+                    .createInstance(Ci.nsITimer);

Nit: this can be on one line

>+      timer.initWithCallback(function() {
>+        Telemetry.readShutdownTime(function () {
>+        });
>+      }, 5413, Ci.nsITimer.TYPE_ONE_SHOT);

Put the magic number in a const variable at the top of the file.

> diff --git a/toolkit/components/telemetry/nsITelemetry.idl b/toolkit/components/telemetry/nsITelemetry.idl
>+[scriptable,function, uuid(5c25e4b0-398f-11e2-81c1-0800200c9a66)]
>+interface nsIReadShutdownTimeCallback : nsISupports
>+{
>+    void complete();
>+};

I don't think we really need this callback if it's only used in tests

> [scriptable, uuid(de54f594-4c20-4968-a27a-83b38ff952b9)]
> interface nsITelemetry : nsISupports
> {
...
>+
>+  void readShutdownTime(in nsIReadShutdownTimeCallback aCallback);
> };

Nit: rename to asyncReadShutdownTime

>diff --git a/toolkit/content/aboutTelemetry.js b/toolkit/content/aboutTelemetry.js
>   // Get the Telemetry Ping payload
>-  let ping = TelemetryPing.getPayload();
>+  Telemetry.readShutdownTime(function() {
>+    let ping = TelemetryPing.getPayload();

I don't think we need this.. xul-window-visible will have fired before we get here, and even if it hasn't, we can just display the default value (0 in current patch, -1 in my proposal). Either way, it accurately reflects the Telemetry data currently gathered
Comment 11 Vladan Djeric (:vladan) 2012-12-12 16:34:59 PST
I'll review the "move" patch after we address the questions above
Comment 12 Rafael Ávila de Espíndola (:espindola) (not reading bugmail) 2012-12-13 07:22:31 PST
> >+  friend class nsReadShutdownTime;
> 
> I'd prefer to have setter functions for these member variables instead of
> this class making friends. 

Why? It is an important invariant that these member variables are not arbitrarily modified from the outside. If there is a style ban on friend I can make nsReadShutdownTime an inner class if you want.

> Why is there a vector of callbacks? We should only have one outstanding
> shutdown read at a time.. maybe make mCachedShutdownTime into a tri-state
> bool called mShutdownTimeState or something, and give it states
> READ|UNREAD|PENDING.

There can be only one pending read, but an arbitrary number of pending callbacks. Just imagine opening 100 about:telemetry  tabs before the read completes. When it does, the 100 tabs have to be notified. We could return a busy state, but that would just push the complexity to the caller.

> >+static uint32_t
> >+ReadLastShutdownDuration(const char *filename) {
> >+  FILE *f = fopen(filename, "r");
> >+  if (!f) {
> >+    return 0;
> >+  }
> 
> Let's return -1 here and change return type to signed int. We don't want to
> have a shutdownDuration field in TelemetryPing anyway if we can't read the
> time from the file. Reporting 0 in case of error makes our shutdown times
> look better than they are in reality.

This is fairly unrelated to making the read async, isn't it? Why does it have to done in this patch? Note that this change would not be "user" facing anyway, TelemetryPing.js has

  let shutdownDuration = Services.startup.lastShutdownDuration;
  if (shutdownDuration)
    ret.shutdownDuration = shutdownDuration;

So 0 is not reported.


> Also, why is ReadLastShutdownDuration a separate function from
> nsReadShutdownTime::Run? I'm not saying it's worse on its own, I'm wondering
> if you are intending to call it from somewhere else?

No, it was just easier for the refactoring.

> >+  int r = fscanf(f, "%d\n", &shutdownTime);
> 
> Nit: I know you didn't write this code, but I'm not a fan of single-letter
> variable names outside loop variables. So r -> ret, f -> shutdownTimeFile.
> Apply throughout file, e.g. "nsCOMPtr<nsIRunnable> e"

Again, this is unrelated to making the read async. Lets please focus on that in this change, the process is already way too slow as is.

> >+class nsReadShutdownTime : public nsRunnable
> >+{
> >+  const char *mFilename;
> >+  TelemetryImpl *mTelemetry;
> 
> Move nsReadShutdownTime and ReadLastShutdownDuration into an anonymous
> namespace. 

They are in one. It starts in line 38 and ends in line 1819. This is line 693 to 734.

> We always have an explicit access specifier for member variables and we put
> the public section with the class's constructor first.

Will do.

> >+  NS_IMETHOD Run() {
> >+    uint32_t val = ReadLastShutdownDuration(mFilename);
> 
> Rename "val" to "duration".

Was only used once, removed.

> >+    mTelemetry->mLastShutdownTime = val;
> 
> Make this into a setter method in TelemetryImpl. Same for mCachedShutdownTime

See above.

> >+    nsCOMPtr<nsIRunnable> e =
> >+      NS_NewRunnableMethod(this, &nsReadShutdownTime::MainThread);
> >+    NS_DispatchToMainThread(e, NS_DISPATCH_NORMAL);
> >+    return NS_OK;
> 
> NS_ENSURE_STATE(e) after the call to NS_NewRunnableMethod.

done

> Is it possible for nsReadShutdownTime::MainThread to run after mTelemetry
> has been freed? If so, mTelemetry should be an nsCOMPtr<TelemetryImpl>

done.
 
> > NS_IMETHODIMP
> > TelemetryImpl::GetLastShutdownDuration(uint32_t *aResult)
> > {
> >+  // The user must call ReadShutdownTime first. We return zero instead of
> >+  // reporting a failure so that the rest of telemetry can uniformly handle
> >+  // the read not being available yet.
> >+  if (!mCachedShutdownTime) {
> >     *aResult = 0;
> >     return NS_OK;
> >   }
> 
> As mentioned earlier, return -1 here

This patch is not changing this.

> >+NS_IMETHODIMP
> >+TelemetryImpl::ReadShutdownTime(nsIReadShutdownTimeCallback *aCallback)
> >+{
> >+  // We have finished reading the data already, just call the callback.
> >+  if (mCachedShutdownTime) {
> >+    aCallback->Complete();
> >+    return NS_OK;
> >+  }
> 
> I don't like having these callbacks just for tests.. Can't we just do a
> "sleep" call in tests?

Sorry, which callbacks are just for tests? The one you quoted is not, if you open about:telemetry, wait for the read and open it again that is the callback that is called.

> >+  nsCOMPtr<nsIEventTarget> target =
> >+    do_GetService(NS_STREAMTRANSPORTSERVICE_CONTRACTID);
> 
> Rename "target" to "targetThread" and add a comment explaining why we're
> getting the stream transport service here

done.

> >diff --git a/toolkit/components/telemetry/TelemetryPing.js b/toolkit/components/telemetry/TelemetryPing.js
> >-    case "xul-window-visible":
> >+    case "xul-window-visible": {
> 
> We don't use braces around case conditions in this file. It's ugly imho

What should I do for creating the "let timer"? Without the brace its scope will be the function no? Should I create a helper function and do

case "xul-window-visibile":
   this.xulWindowVisible();

?


> >+      let timer = Cc["@mozilla.org/timer;1"]
> >+                    .createInstance(Ci.nsITimer);
> 
> Nit: this can be on one line

Done.
 
> >+      timer.initWithCallback(function() {
> >+        Telemetry.readShutdownTime(function () {
> >+        });
> >+      }, 5413, Ci.nsITimer.TYPE_ONE_SHOT);
> 
> Put the magic number in a const variable at the top of the file.

Done.

> > diff --git a/toolkit/components/telemetry/nsITelemetry.idl b/toolkit/components/telemetry/nsITelemetry.idl
> >+[scriptable,function, uuid(5c25e4b0-398f-11e2-81c1-0800200c9a66)]
> >+interface nsIReadShutdownTimeCallback : nsISupports
> >+{
> >+    void complete();
> >+};
> 
> I don't think we really need this callback if it's only used in tests

See above.

> > [scriptable, uuid(de54f594-4c20-4968-a27a-83b38ff952b9)]
> > interface nsITelemetry : nsISupports
> > {
> ...
> >+
> >+  void readShutdownTime(in nsIReadShutdownTimeCallback aCallback);
> > };
> 
> Nit: rename to asyncReadShutdownTime

Done.
 
> >diff --git a/toolkit/content/aboutTelemetry.js b/toolkit/content/aboutTelemetry.js
> >   // Get the Telemetry Ping payload
> >-  let ping = TelemetryPing.getPayload();
> >+  Telemetry.readShutdownTime(function() {
> >+    let ping = TelemetryPing.getPayload();
> 
> I don't think we need this.. xul-window-visible will have fired before we
> get here, and even if it hasn't, we can just display the default value (0 in
> current patch, -1 in my proposal). Either way, it accurately reflects the
> Telemetry data currently gathered

Lets please not do this. Really. There is no point in presenting inaccurate data to our users. Remember that stacks for the late writes will be read in here to. It would be really surprising to open about:telemetry, se that there were no late writes, refresh and see that now there are some.
Comment 13 Rafael Ávila de Espíndola (:espindola) (not reading bugmail) 2012-12-13 07:27:38 PST
(In reply to Vladan Djeric (:vladan) from comment #11)
> I'll review the "move" patch after we address the questions above

Please no. The reason for splitting the patches is to speed up our tremendously slow code review process, not slow it down!
Comment 14 Rafael Ávila de Espíndola (:espindola) (not reading bugmail) 2012-12-13 07:49:26 PST
Created attachment 691827 [details] [diff] [review]
new patch on top of the move patch

https://tbpl.mozilla.org/?tree=Try&rev=f28f7441497a
Comment 15 :Ehsan Akhgari (busy, don't ask for review please) 2012-12-13 08:56:23 PST
Comment on attachment 690905 [details] [diff] [review]
Move code to Telemetry.cpp and s/NULL/nullptr/.

Review of attachment 690905 [details] [diff] [review]:
-----------------------------------------------------------------

::: toolkit/components/startup/public/nsIAppStartup.idl
@@ -45,5 @@
>      /**
> -     * The amount of time, in milliseconds, that the last session took
> -     * to shutdown.  Reads as 0 to indicate failure.
> -     */
> -    readonly attribute uint32_t lastShutdownDuration;

You need to rev the uuid.

::: toolkit/components/telemetry/nsITelemetry.idl
@@ +39,5 @@
> +  /**
> +   * The amount of time, in milliseconds, that the last session took
> +   * to shutdown.  Reads as 0 to indicate failure.
> +   */
> +  readonly attribute uint32_t lastShutdownDuration;

Please rev the uuid here as well.
Comment 16 Rafael Ávila de Espíndola (:espindola) (not reading bugmail) 2012-12-13 09:08:06 PST
https://hg.mozilla.org/integration/mozilla-inbound/rev/2e34b7c8a339
Comment 17 Rafael Ávila de Espíndola (:espindola) (not reading bugmail) 2012-12-13 10:33:17 PST
Created attachment 691887 [details] [diff] [review]
fix it. Rebased on top of inbound.

https://tbpl.mozilla.org/?tree=Try&rev=eb3ebcf79643
Comment 18 Vladan Djeric (:vladan) 2012-12-13 17:46:04 PST
Comment on attachment 691887 [details] [diff] [review]
fix it. Rebased on top of inbound.

After our convo, I've come to recognize the benefit of having callbacks for async operations in test code, so that invalidates a lot of my previous comments.

(In reply to Rafael Ávila de Espíndola (:espindola) from comment #12)
> Why? It is an important invariant that these member variables are not
> arbitrarily modified from the outside. If there is a style ban on friend I
> can make nsReadShutdownTime an inner class if you want.

No, there is no style ban. I would prefer an inner class over friend, but I guess that is really a matter of personal taste, so let's leave it.

> > >+  int r = fscanf(f, "%d\n", &shutdownTime);
> > 
> > Nit: I know you didn't write this code, but I'm not a fan of single-letter
> > variable names outside loop variables. So r -> ret, f -> shutdownTimeFile.
> > Apply throughout file, e.g. "nsCOMPtr<nsIRunnable> e"
> 
> Again, this is unrelated to making the read async. Lets please focus on that
> in this change, the process is already way too slow as is.

We can fix the variable names in a follow up bug or in a no-bug commit, but we might as well do it now since we're changing the code. I'll leave the decision up to you.

> What should I do for creating the "let timer"? Without the brace its scope
> will be the function no? Should I create a helper function and do

As we discussed, it has essentially the same lifetime with & without the braces. For consistency with rest of the code in this function, I would vote for just taking out the curly braces.

> > Is it possible for nsReadShutdownTime::MainThread to run after mTelemetry
> > has been freed? If so, mTelemetry should be an nsCOMPtr<TelemetryImpl>
> 
> done.

Since we're keeping these classes as friends, you could also just directly set mTelemetry = TelemetryImpl::sTelemetry. The nsCOMPtr is still important for keeping TelemetryImpl alive for accesses off the main thread.

> const SHUTDOWN_TIME_READ_DELAY = 5413;

Nit: SHUTDOWN_TIME_READ_DELAY_MS

> void asyncReadShutdownTime(in nsIReadShutdownTimeCallback aCallback);

Add a description of the function like the others in the IDL file

>   Telemetry.asyncReadShutdownTime(function() {

Let's move the code for displaying "addon histogram data" before the asyncReadShutdownTime call and the rest of the callback code that relies on the ping payload into its own function.
Comment 19 Rafael Ávila de Espíndola (:espindola) (not reading bugmail) 2012-12-13 20:16:42 PST
https://hg.mozilla.org/integration/mozilla-inbound/rev/babbf864b10c

Note You need to log in before you can comment on or make changes to this bug.