Closed Bug 814765 Opened 10 years ago Closed 10 years ago

Include late writes in the Telemetry ping

Categories

(Toolkit :: Telemetry, defect)

x86
macOS
defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla21

People

(Reporter: espindola, Assigned: espindola)

References

Details

Attachments

(1 file, 2 obsolete files)

Attached patch patch (obsolete) — Splinter Review
This patch goes on top of the one in bug 814704. It doesn't include code for displaying the info in about:telemetry. Let me know if it should or if it is ok for that to be a followup bug.
Attachment #684764 - Flags: review?(vdjeric)
Comment on attachment 684764 [details] [diff] [review]
patch

The about:telemetry changes can be a separate bug, but we should land both of them together.

--- a/toolkit/components/telemetry/Telemetry.cpp	
+++ a/toolkit/components/telemetry/Telemetry.cpp	

> #include <algorithm>
>+#include <fstream>
>+#include <dirent.h>

Usually I see our code use the XPCOM versions of file/directory operations. Are these calls portable and can they throw exceptions?

>+// Read a stack from the given file name. In case of any error silently
>+// return an empty stack.

I would prefer we return nsresult and pass a ProcessedStack reference or pointer as an argument

>+static Telemetry::ProcessedStack
>+ReadStack(const char *fileName)
>+{

Prefix parameter names with the letter "a", i.e. aFileName. Change throughout

+    Telemetry::ProcessedStack::Module module = {
+      moduleName,
+      0,  // mPdbAge
+      "", // mPdbSignature
+      ""  // mPdbName
+    };
+    stack.AddModule(module);

So late-write reporting is a Mac-only feature for now?

>+  for (unsigned i = 0; i < numModules; ++i) {

We mostly use size_t in this file for this. Change throughout

>+  unsigned numFrames;
>+  file >> numFrames;
>+  file.ignore(std::numeric_limits<std::streamsize>::max(), '\n');

I don't think we should blindly fast-forward through the file. If there's unexpected data between value and newline, we should indicate an error

>+  struct dirent *ent;
>+  CombinedStacks stacks;
>+  while ((ent = readdir(dir))) {
>+    const char *prefix = "Telemetry.LateWriteFinal-";
>+    unsigned int prefixLen = strlen(prefix);
>+    if (strncmp(prefix, ent->d_name, prefixLen) != 0) {
>+      continue;
>+    }

Move prefix =.. and length calculation outside the loop

>+
>+    nsCOMPtr<nsIFile> stackFile;
>+    rv = mozFile->Clone(getter_AddRefs(stackFile));
>+    if (!NS_SUCCEEDED(rv)) {
>+      continue;
>+    }

if (NS_FAILED(rv)) instead, change throughout.
I'd also prefer to "break;" out of the loop on error, instead of returning partial results and potentially masking bugs

>+
>+    stackFile->AppendNative(nsDependentCString(ent->d_name));
>+    nsAutoCString stackNativePath;
>+    rv = stackFile->GetNativePath(stackNativePath);
>+    if (!NS_SUCCEEDED(rv)) {
>+      continue;
>+    }
>+
>+    Telemetry::ProcessedStack Stack = ReadStack(stackNativePath.get());
>+    stacks.AddStack(Stack);
>+    PR_Delete(stackNativePath.get());

nsAutoCString will take care of this free for you


>diff --git a/toolkit/components/telemetry/TelemetryPing.js b/toolkit/components/telemetry/TelemetryPing.js
>index 794eba3..883c206 100644
>--- a/toolkit/components/telemetry/TelemetryPing.js
>+++ b/toolkit/components/telemetry/TelemetryPing.js
>@@ -499,6 +499,7 @@ TelemetryPing.prototype = {
>       histograms: this.getHistograms(Telemetry.histogramSnapshots),
>       slowSQL: Telemetry.slowSQL,
>       chromeHangs: Telemetry.chromeHangs,
>+      lateWrites: Telemetry.lateWrites,
>       addonHistograms: this.getAddonHistograms()
>     };
> 
>diff --git a/toolkit/components/telemetry/nsITelemetry.idl b/toolkit/components/telemetry/nsITelemetry.idl
>index 7b4c07b..c0696d8 100644
>--- a/toolkit/components/telemetry/nsITelemetry.idl
>+++ b/toolkit/components/telemetry/nsITelemetry.idl
>@@ -75,6 +75,18 @@ interface nsITelemetry : nsISupports
>   [implicit_jscontext]
>   readonly attribute jsval chromeHangs;
> 
>+  /*
>+   * An object with two fields: memoryMap and stacks.
>+   * * memoryMap is a list of loaded libraries.
>+   * * stacks is a list of stacks. Each stack is a list of pairs of the form
>+   *   [moduleIndex, offset]. The moduleIndex is an index into the memoryMap and
>+   *   offset is an offset in the library at memoryMap[moduleIndex].
>+   * This format is used to make it easier to send the stacks to the
>+   * symbolication server.
>+   */
>+  [implicit_jscontext]
>+  readonly attribute jsval lateWrites;
>+
>   /**
>    * An object whose properties are the names of histograms defined in
>    * TelemetryHistograms.h and whose corresponding values are the textual
Attachment #684764 - Flags: review?(vdjeric) → review-
I don't think we're doing enough to root mLateWritesReport to protect it from GC.

Take a look at the example below or talk to jorendorff about this.. and please share any documentation/insights with me :)

http://mxr.mozilla.org/mozilla-central/source/content/base/src/nsDOMFileReader.cpp#441
Depends on: 814704
> The about:telemetry changes can be a separate bug, but we should land both
> of them together.

I will just put them in one patch if you are OK with it.
 
> Usually I see our code use the XPCOM versions of file/directory operations.
> Are these calls portable and can they throw exceptions?

They are only used on OS X so far, but they are part of the standard. They can be configured to throw, but they don't by default.
 
> >+// Read a stack from the given file name. In case of any error silently
> >+// return an empty stack.
> 
> I would prefer we return nsresult and pass a ProcessedStack reference or
> pointer as an argument

I will change it to take a reference, but return void as we discussed.

> +    Telemetry::ProcessedStack::Module module = {
> +      moduleName,
> +      0,  // mPdbAge
> +      "", // mPdbSignature
> +      ""  // mPdbName
> +    };
> +    stack.AddModule(module);
> 
> So late-write reporting is a Mac-only feature for now?

Yes, see bug 820852.
 
> I'd also prefer to "break;" out of the loop on error, instead of returning
> partial results and potentially masking bugs

That would also produce a partial result. I can read into a temporary CombinedStacks and only update mLateWritesStacks if all reads succeed if you want.
 

> >+    PR_Delete(stackNativePath.get());
> 
> nsAutoCString will take care of this free for you

This deletes the file, no the memory.
 
I will add an updated patch in a second.
Attached patch patch (obsolete) — Splinter Review
Jason, asking for feedback for the big FIXME in TelemetryImpl::GetLateWrites.
Attachment #684764 - Attachment is obsolete: true
Attachment #695692 - Flags: review?(vdjeric)
Attachment #695692 - Flags: feedback?(jorendorff)
Comment on attachment 695692 [details] [diff] [review]
patch

>+// Read a stack from the given file name. In case of any error, aStack is
>+// unchanged.

We shouldn't report empty stacks to Telemetry when there are errors

>+static void
>+ReadStack(const char *aFileName, Telemetry::ProcessedStack &aStack)
>+{
..
>+  char newline = file.get();
>+  if (file.fail() || newline != '\n') {
>+    return;
>+  }

When you add late writes on Windows, make sure you don't output a carriage return + a linefeed, otherwise this will be wrong. I'm guessing the SHA1Stream probably does the right thing and only outputs the linefeed

>+void
>+TelemetryImpl::ReadLateWritesStacks()
>+{
>+  nsCOMPtr<nsIFile> mozFile;

Let's call this profileDir or something similar

>+  NS_GetSpecialDirectory(NS_APP_USER_PROFILE_50_DIR, getter_AddRefs(mozFile));
>+  if (!mozFile) {
>+    return;
>+  }

Check rv from NS_GetSpecialDirectory instead

>+  struct dirent *ent;
>+  const char *prefix = "Telemetry.LateWriteFinal-";
>+  unsigned int prefixLen = strlen(prefix);
>+  while ((ent = readdir(dir))) {
>+    if (strncmp(prefix, ent->d_name, prefixLen) != 0) {
>+      continue;
>+    }
>+
>+    nsCOMPtr<nsIFile> stackFile;
>+    rv = mozFile->Clone(getter_AddRefs(stackFile));
>+    if (NS_FAILED(rv)) {
>+      continue;
>+    }

You don't actually need to use nsIFile here, you're only after the string paths. You could just do string concatenation in an nsAutoCString and use XPCOM_FILE_PATH_SEPARATOR for the platform-independent separator

>+    PR_Delete(stackNativePath.get());

If you switch to the strings-only approach I suggested, add a comment about what this is doing. Otherwise, nsIFile has a remove() method

>+  if (report == nullptr)
>+    return NS_ERROR_FAILURE;

New style rules require curly braces around all if statements

>+function renderLateWrites(lateWrites) {

I was assuming you would make this data part of an existing section, but it's better this way. So, make this into a singleton to keep with the singleton-per-page-section idea
Attachment #695692 - Flags: review?(vdjeric) → review-
> We shouldn't report empty stacks to Telemetry when there are errors

Changed the code to check the size of the stack.
 
> When you add late writes on Windows, make sure you don't output a carriage
> return + a linefeed, otherwise this will be wrong. I'm guessing the
> SHA1Stream probably does the right thing and only outputs the linefeed

It just passes the bytes to AppendPrintf. If that adds a \r when given a \n I guess the best would be to adjust this code to accept a \r too.
 
> >+void
> >+TelemetryImpl::ReadLateWritesStacks()
> >+{
> >+  nsCOMPtr<nsIFile> mozFile;
> 
> Let's call this profileDir or something similar

Done.

> >+  NS_GetSpecialDirectory(NS_APP_USER_PROFILE_50_DIR, getter_AddRefs(mozFile));
> >+  if (!mozFile) {
> >+    return;
> >+  }
> 
> Check rv from NS_GetSpecialDirectory instead

Done.

> >+  struct dirent *ent;
> >+  const char *prefix = "Telemetry.LateWriteFinal-";
> >+  unsigned int prefixLen = strlen(prefix);
> >+  while ((ent = readdir(dir))) {
> >+    if (strncmp(prefix, ent->d_name, prefixLen) != 0) {
> >+      continue;
> >+    }
> >+
> >+    nsCOMPtr<nsIFile> stackFile;
> >+    rv = mozFile->Clone(getter_AddRefs(stackFile));
> >+    if (NS_FAILED(rv)) {
> >+      continue;
> >+    }
> 
> You don't actually need to use nsIFile here, you're only after the string
> paths. You could just do string concatenation in an nsAutoCString and use
> XPCOM_FILE_PATH_SEPARATOR for the platform-independent separator

Much better, thanks!

> >+    PR_Delete(stackNativePath.get());
> 
> If you switch to the strings-only approach I suggested, add a comment about
> what this is doing. Otherwise, nsIFile has a remove() method

Done.

> >+  if (report == nullptr)
> >+    return NS_ERROR_FAILURE;
> 
> New style rules require curly braces around all if statements

Done.

> >+function renderLateWrites(lateWrites) {
> 
> I was assuming you would make this data part of an existing section, but
> it's better this way. So, make this into a singleton to keep with the
> singleton-per-page-section idea

This is insane, but whatever.

A new patch will be up in 30m or so.
Attached patch New patchSplinter Review
https://tbpl.mozilla.org/?tree=Try&rev=5432133afb51
Attachment #695692 - Attachment is obsolete: true
Attachment #695692 - Flags: feedback?(jorendorff)
Attachment #696786 - Flags: review?(vdjeric)
Attachment #696786 - Flags: feedback?(jorendorff)
Comment on attachment 696786 [details] [diff] [review]
New patch

>+#ifdef XP_MACOSX
>+  void ReadLateWritesStacks();
>+#else
>+  void ReadLateWritesStacks() { }
>+#endif

I would not look for platform-specific implementations in a header. Maybe do something like:

void ReadLateWritesStacks()
{
#ifndef XP_MACOX
  return;
#endif
  ...
}

Would that have problems building? If it works, you could get rid of the big "#ifdef XP_MACOSX" around the implementation in the cpp file

>+    if (stack.GetStackSize() != 0)
>+      mLateWritesStacks.AddStack(stack);

Curly braces
Attachment #696786 - Flags: review?(vdjeric) → review+
(In reply to Rafael Ávila de Espíndola (:espindola) from comment #7)
> > When you add late writes on Windows, make sure you don't output a carriage
> > return + a linefeed, otherwise this will be wrong. I'm guessing the
> > SHA1Stream probably does the right thing and only outputs the linefeed
> 
> It just passes the bytes to AppendPrintf. If that adds a \r when given a \n
> I guess the best would be to adjust this code to accept a \r too.

Can you test this on Windows?
Comment on attachment 696786 [details] [diff] [review]
New patch

Review of attachment 696786 [details] [diff] [review]:
-----------------------------------------------------------------

JS data still doesn't need to be rooted as long as there is a live pointer on the stack.

This will change someday, and we'll all do explicit rooting again, but don't worry--we won't change it without first fixing up a ton of code throughout the tree, adding C++ helper classes for rooting, static analysis to catch GC hazards, the whole nine yards. This patch is doing the Right Thing for now: nothing.
Attachment #696786 - Flags: feedback?(jorendorff) → feedback+
> JS data still doesn't need to be rooted as long as there is a live pointer
> on the stack.
> 
> This will change someday, and we'll all do explicit rooting again, but don't
> worry--we won't change it without first fixing up a ton of code throughout
> the tree, adding C++ helper classes for rooting, static analysis to catch GC
> hazards, the whole nine yards. This patch is doing the Right Thing for now:
> nothing.

Note that the patch that we thought had a problem was one that was caching the result of GetLateWrites. The value was cached on the heap, not on the stack. I couldn't figure out how to root it, so that is why the current patch just recomputes the value every time (and has a FIXME about it).
https://hg.mozilla.org/mozilla-central/rev/6ff7c9712ff2
Status: ASSIGNED → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla21
You need to log in before you can comment on or make changes to this bug.