Closed Bug 815418 Opened 7 years ago Closed 7 years ago

telemetry on what proportion of attempted firefox startups result in 'firefox is running and not responding'

Categories

(Toolkit :: Startup and Profile System, defect)

x86_64
Windows 7
defect
Not set

Tracking

()

RESOLVED FIXED
mozilla21

People

(Reporter: taras.mozilla, Assigned: aklotz)

Details

Attachments

(1 file, 5 obsolete files)

This should help figure out priority of bug 286355 vs other startup fixes.
Try run for a13ce166ad80 is complete.
Detailed breakdown of the results available here:
    https://tbpl.mozilla.org/?tree=Try&rev=a13ce166ad80
Results (out of 19 total builds):
    exception: 9
    failure: 10
Builds (or logs if builds failed) available at:
http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/aklotz@mozilla.com-a13ce166ad80
Try build for this patch is currently running at https://tbpl.mozilla.org/?tree=Try&rev=ef822b2ecb73
Attachment #693565 - Flags: review?(vdjeric)
Try run for 9f228d49d7fb is complete.
Detailed breakdown of the results available here:
    https://tbpl.mozilla.org/?tree=Try&rev=9f228d49d7fb
Results (out of 19 total builds):
    exception: 6
    failure: 13
Builds (or logs if builds failed) available at:
http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/aklotz@mozilla.com-9f228d49d7fb
Try run for ef822b2ecb73 is complete.
Detailed breakdown of the results available here:
    https://tbpl.mozilla.org/?tree=Try&rev=ef822b2ecb73
Results (out of 317 total builds):
    exception: 4
    success: 284
    warnings: 25
    failure: 4
Builds (or logs if builds failed) available at:
http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/aklotz@mozilla.com-ef822b2ecb73
Try run for 86c0068534cf is complete.
Detailed breakdown of the results available here:
    https://tbpl.mozilla.org/?tree=Try&rev=86c0068534cf
Results (out of 84 total builds):
    exception: 2
    success: 62
    warnings: 4
    failure: 16
Builds (or logs if builds failed) available at:
http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/aklotz@mozilla.com-86c0068534cf
Try run for 9f228d49d7fb is complete.
Detailed breakdown of the results available here:
    https://tbpl.mozilla.org/?tree=Try&rev=9f228d49d7fb
Results (out of 20 total builds):
    exception: 7
    failure: 13
Builds (or logs if builds failed) available at:
http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/aklotz@mozilla.com-9f228d49d7fb
Try run for a13ce166ad80 is complete.
Detailed breakdown of the results available here:
    https://tbpl.mozilla.org/?tree=Try&rev=a13ce166ad80
Results (out of 20 total builds):
    exception: 10
    failure: 10
Builds (or logs if builds failed) available at:
http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/aklotz@mozilla.com-a13ce166ad80
Comment on attachment 693565 [details] [diff] [review]
Telemetry for failed profile lock attempts + unit test

>+  "STARTUP_PROFILE_LOCK_FAILURES": {
>+    "kind": "linear",
>+    "high": "10",
>+    "n_buckets": "9",
>+    "description": "The user's profile was unable to be locked, preventing startup"
>   }

Shound't n_buckets be 11?

>+static bool
>+GetFailedLockCount(nsIInputStream* inStream, unsigned int &result)
>+{

Doesn't need to be static, it's in an anonymous namespace

>+  char buf[kMaxFailedProfileLockFileSize + 1] = {0};
>+  void *buffer = &buf;
>+  NS_ReadInputStreamToBuffer(inStream, &buffer, kMaxFailedProfileLockFileSize);
>+  result = strtoul(buf, NULL, 10);
>+  return result != ULONG_MAX;
>+}

Technically, you should return false if strtoul returns 0. If the file exists, it should contain a positive number + strtoul returns 0 if it can't perform the conversion.

You can also do the same thing with an nsAutoCString's buffer by using BeginWriting and ToInteger methods

>+class nsReadFailedProfileLockCount : public nsRunnable
>+{
>+public:

This should be integrated into nsFetchTelemetryData

>+    if (doesFileExist) {
>+      {
>+        nsCOMPtr<nsIInputStream> inStream;
>+        rv = NS_NewLocalFileInputStream(getter_AddRefs(inStream), mFile,
>+                                        PR_RDONLY);
>+        NS_ENSURE_SUCCESS(rv, rv);
>+
>+        NS_ENSURE_STATE(GetFailedLockCount(inStream, mFailedLockCount));
>+      }

why do you need this nested block?

>+      mFile->Remove(false);
>+      if (mFailedLockCount) {
>+        nsCOMPtr<nsIRunnable> workItem = NS_NewRunnableMethod(this,
>+                                     &nsReadFailedProfileLockCount::MainThread);
>+        NS_ENSURE_STATE(workItem);
>+        NS_DispatchToMainThread(workItem, NS_DISPATCH_NORMAL);
>+      }

You can call accumulate directly from the helper thread

>+NS_IMETHODIMP
>+TelemetryImpl::AsyncReadFailedProfileLockCount()
>+{
>+  if (!CanRecord()) {
>+    return NS_OK;
>+  }

We don't need this check since we're not recording data here

>+void
>+WriteFailedProfileLock(nsIFile* aProfileDir)
>+{
>+  if (!CanRecord()) {
>+    return;
>+  }

Is this check redundant? CanRecord returns true if Telemetry service is not initialized yet

>+  int64_t fileSize = 0;
>+  file->GetFileSize(&fileSize);

Check rv 

>+  nsCOMPtr<nsIFileStream> fileStream;
>+  rv = NS_NewLocalFileStream(getter_AddRefs(fileStream), file,
>+                             PR_RDWR | PR_CREATE_FILE, 0640);

You could get rid of seekable stream and QIs if you use NS_NewLocalFileInputStream here and then NS_NewLocalFileOutputStream later with truncate flag. Just a suggestion, make your own call

>+  char buf[kMaxFailedProfileLockFileSize + 1] = {0};
>+#ifdef _MSC_VER
>+  int chars = _snprintf(buf, ArrayLength(buf) - 1, "%u", failedLockCount);
>+#else
>+  int chars = snprintf(buf, ArrayLength(buf) - 1, "%u", failedLockCount);
>+#endif
>+  if (chars < 0) {
>+    return;
>+  }

You can simplify this by using an nsAutoCString and its AppendInt method

>+  uint32_t written;
>+  rv = outStream->Write(buf, chars, &written);
>+  if (NS_FAILED(rv)) {
>+    return;
>+  }

I think you will need to check that written bytes == input size and retry until it's fully written out

>+
>+  /**
>+   * Read how many failed attempts to lock the user's profile have occurred
>+   * prior to this run.
>+   */
>+  void asyncReadFailedProfileLockCount();

We should make this part of asyncFetchTelemetryData

>diff --git a/toolkit/components/telemetry/tests/unit/test_TelemetryPing.js b/toolkit/components/telemetry/tests/unit/test_TelemetryPing.js
>--- a/toolkit/components/telemetry/tests/unit/test_TelemetryPing.js
>+++ b/toolkit/components/telemetry/tests/unit/test_TelemetryPing.js

Thank you for writing this test, it's very thorough :)
Attachment #693565 - Flags: review?(vdjeric) → review-
(In reply to Vladan Djeric (:vladan) from comment #8)
> Comment on attachment 693565 [details] [diff] [review]
> Telemetry for failed profile lock attempts + unit test
> 
> >+  "STARTUP_PROFILE_LOCK_FAILURES": {
> >+    "kind": "linear",
> >+    "high": "10",
> >+    "n_buckets": "9",
> >+    "description": "The user's profile was unable to be locked, preventing startup"
> >   }
> 
> Shound't n_buckets be 11?

high must be > n_buckets or else we fail with a static assert. Would it be better if I bumped n_buckets to 10 and high to 11?

> 
> >+static bool
> >+GetFailedLockCount(nsIInputStream* inStream, unsigned int &result)
> >+{
> 
> Doesn't need to be static, it's in an anonymous namespace

Done

> 
> >+  char buf[kMaxFailedProfileLockFileSize + 1] = {0};
> >+  void *buffer = &buf;
> >+  NS_ReadInputStreamToBuffer(inStream, &buffer, kMaxFailedProfileLockFileSize);
> >+  result = strtoul(buf, NULL, 10);
> >+  return result != ULONG_MAX;
> >+}
> 
> Technically, you should return false if strtoul returns 0. If the file
> exists, it should contain a positive number + strtoul returns 0 if it can't
> perform the conversion.
> 
> You can also do the same thing with an nsAutoCString's buffer by using
> BeginWriting and ToInteger methods
> 

Rewritten to use nsAutoCString instead.

> >+class nsReadFailedProfileLockCount : public nsRunnable
> >+{
> >+public:
> 
> This should be integrated into nsFetchTelemetryData

Done.

> 
> >+    if (doesFileExist) {
> >+      {
> >+        nsCOMPtr<nsIInputStream> inStream;
> >+        rv = NS_NewLocalFileInputStream(getter_AddRefs(inStream), mFile,
> >+                                        PR_RDONLY);
> >+        NS_ENSURE_SUCCESS(rv, rv);
> >+
> >+        NS_ENSURE_STATE(GetFailedLockCount(inStream, mFailedLockCount));
> >+      }
> 
> why do you need this nested block?

I wanted to release the file input stream before deleting the file that it references in the next line following the nested block.

> 
> >+      mFile->Remove(false);
> >+      if (mFailedLockCount) {
> >+        nsCOMPtr<nsIRunnable> workItem = NS_NewRunnableMethod(this,
> >+                                     &nsReadFailedProfileLockCount::MainThread);
> >+        NS_ENSURE_STATE(workItem);
> >+        NS_DispatchToMainThread(workItem, NS_DISPATCH_NORMAL);
> >+      }
> 
> You can call accumulate directly from the helper thread

Done.

> >+NS_IMETHODIMP
> >+TelemetryImpl::AsyncReadFailedProfileLockCount()
> >+{
> >+  if (!CanRecord()) {
> >+    return NS_OK;
> >+  }
> 
> We don't need this check since we're not recording data here
> 

Removed when I migrated everything into nsFetchTelemetryData.

> >+void
> >+WriteFailedProfileLock(nsIFile* aProfileDir)
> >+{
> >+  if (!CanRecord()) {
> >+    return;
> >+  }
> 
> Is this check redundant? CanRecord returns true if Telemetry service is not
> initialized yet

Yeah, looks like it. Removed.

> 
> >+  int64_t fileSize = 0;
> >+  file->GetFileSize(&fileSize);
> 
> Check rv 

Done.

> 
> >+  nsCOMPtr<nsIFileStream> fileStream;
> >+  rv = NS_NewLocalFileStream(getter_AddRefs(fileStream), file,
> >+                             PR_RDWR | PR_CREATE_FILE, 0640);
> 
> You could get rid of seekable stream and QIs if you use
> NS_NewLocalFileInputStream here and then NS_NewLocalFileOutputStream later
> with truncate flag. Just a suggestion, make your own call

I left this as-is

> 
> >+  char buf[kMaxFailedProfileLockFileSize + 1] = {0};
> >+#ifdef _MSC_VER
> >+  int chars = _snprintf(buf, ArrayLength(buf) - 1, "%u", failedLockCount);
> >+#else
> >+  int chars = snprintf(buf, ArrayLength(buf) - 1, "%u", failedLockCount);
> >+#endif
> >+  if (chars < 0) {
> >+    return;
> >+  }
> 
> You can simplify this by using an nsAutoCString and its AppendInt method
> 

Done

> >+  uint32_t written;
> >+  rv = outStream->Write(buf, chars, &written);
> >+  if (NS_FAILED(rv)) {
> >+    return;
> >+  }
> 
> I think you will need to check that written bytes == input size and retry
> until it's fully written out

Done

> 
> >+
> >+  /**
> >+   * Read how many failed attempts to lock the user's profile have occurred
> >+   * prior to this run.
> >+   */
> >+  void asyncReadFailedProfileLockCount();
> 
> We should make this part of asyncFetchTelemetryData

Done

> 
> >diff --git a/toolkit/components/telemetry/tests/unit/test_TelemetryPing.js b/toolkit/components/telemetry/tests/unit/test_TelemetryPing.js
> >--- a/toolkit/components/telemetry/tests/unit/test_TelemetryPing.js
> >+++ b/toolkit/components/telemetry/tests/unit/test_TelemetryPing.js
> 
> Thank you for writing this test, it's very thorough :)

You're welcome!
Attachment #693565 - Attachment is obsolete: true
Attachment #697607 - Flags: review?(vdjeric)
Comment on attachment 697607 [details] [diff] [review]
Telemetry for failed profile lock attempts + unit test rev. 2

Looks good, but this change touches code also modified by Rafael in bug 814765 so you will have to rebase. 

Rafael, could I also ask you take a quick look over this patch?

(In reply to Aaron Klotz [:aklotz] from comment #9)
> high must be > n_buckets or else we fail with a static assert. Would it be
> better if I bumped n_buckets to 10 and high to 11?

Nathan explained to me that he added this assert to prevent a common off-by-one mistake where a developer would declare a linear histogram with both "high" and "n_buckets" set to N. This would lead a to a step interval of (N + 1)/N and unexpected bucket configurations. So instead, we should be using an enumerated histogram for this purpose. I updated the wiki page to reflect this info.

> I wanted to release the file input stream before deleting the file that it
> references in the next line following the nested block.

Can't you just call Close() on the input stream?

>+bool
>+GetFailedLockCount(nsIInputStream* inStream, unsigned int &result)
>+{
>+  nsAutoCString bufStr;
>+  NS_ReadInputStreamToString(inStream, bufStr, kMaxFailedProfileLockFileSize);

Check rv

>   NS_IMETHOD Run() {
>-    mTelemetry->mLastShutdownTime = ReadLastShutdownDuration(mFilename);
>+    bool doesFailedLockCountFileExist = false;
>+    ...

Move the file reading/removing into a separate function

>+void
>+WriteFailedProfileLock(nsIFile* aProfileDir)
>+{
>+  nsCOMPtr<nsIFile> file;
>+  nsresult rv = GetFailedProfileLockFile(getter_AddRefs(file), aProfileDir);
>+  if (NS_FAILED(rv)) {
>+    return;
>+  }

You can use NS_ENSURE_SUCCESS_VOID(rv) to shorten the code a bit if you like. The downside is that this macro will cause warnings to be printed to the console on failure.
A bit of background on the two views on this question: https://groups.google.com/forum/#!msg/mozilla.dev.platform/ArFWYZUyOTE/zLY1QShxFwMJ

>+  unsigned int failedLockCount = 0;
>+  if (fileSize > 0) {
>+    nsCOMPtr<nsIInputStream> inStream = do_QueryInterface(fileStream);
>+    if (!GetFailedLockCount(inStream, failedLockCount)) {
>+      failedLockCount = 0;
>+    }
>+  }

You need to check if do_QueryInterface returned null here and for other QI calls. You could use NS_ENSURE_TRUE_VOID

>+  do {
>+    uint32_t written = 0;
>+    rv = outStream->Write(bytes, bytesLeft, &written);
>+    if (NS_FAILED(rv)) {
>+      break;
>+    }
>+    bytes += written;
>+    bytesLeft -= written;
>+  } while (bytesLeft);

bytesLeft > 0 just in case
Attachment #697607 - Flags: review?(vdjeric) → feedback?(respindola)
Comment on attachment 697607 [details] [diff] [review]
Telemetry for failed profile lock attempts + unit test rev. 2

Review of attachment 697607 [details] [diff] [review]:
-----------------------------------------------------------------

Don't worry about conflicts. If you push this first it should be easy for me to rebase.

::: toolkit/components/telemetry/Telemetry.cpp
@@ +730,5 @@
> +  return NS_SUCCEEDED(rv) && result > 0;
> +}
> +
> +nsresult
> +GetFailedProfileLockFile(nsIFile* *aFile, nsIFile* aProfileDir = nullptr)

Can you have this return an nsIFile*?

@@ +748,5 @@
>  class nsFetchTelemetryData : public nsRunnable
>  {
>  public:
> +  nsFetchTelemetryData(const char* aShutdownTimeFilename,
> +                       nsIFile* aFailedProfileLockFile)

It is a bit inconsistent to have a name for one file and a nsIFile for another, but that is OK with me if vladan is OK with it too.

@@ +775,5 @@
> +    nsresult rv = mFailedProfileLockFile->Exists(&doesFailedLockCountFileExist);
> +    NS_ENSURE_SUCCESS(rv, rv);
> +
> +    if (doesFailedLockCountFileExist) {
> +      unsigned int failedLockCount = 0;

Why do you need to do?

if (exist)
  open(..)

It is more efficient to try to open and check the error, no?

@@ +787,5 @@
> +        NS_ENSURE_STATE(GetFailedLockCount(inStream, failedLockCount));
> +      }
> +      mFailedProfileLockFile->Remove(false);
> +      if (failedLockCount) {
> +        Telemetry::Accumulate(Telemetry::STARTUP_PROFILE_LOCK_FAILURES,

This is safe to call from non-main threads, right?

@@ +2184,5 @@
> +    return;
> +  }
> +  bool fileExists = false;
> +  rv = file->Exists(&fileExists);
> +  if (NS_FAILED(rv)) {

Same here, why do you need to check for existence first?

@@ +2214,5 @@
> +  ++failedLockCount;
> +  nsAutoCString bufStr;
> +  bufStr.AppendInt(static_cast<int>(failedLockCount));
> +  nsCOMPtr<nsISeekableStream> seekStream = do_QueryInterface(fileStream);
> +  rv = seekStream->Seek(nsISeekableStream::NS_SEEK_SET, 0);

You just created the file, why do you need to seek?
Attachment #697607 - Flags: feedback?(respindola) → feedback+
(In reply to Vladan Djeric (:vladan) from comment #10)
> Comment on attachment 697607 [details] [diff] [review]
> Telemetry for failed profile lock attempts + unit test rev. 2
> 
> Looks good, but this change touches code also modified by Rafael in bug
> 814765 so you will have to rebase. 
> 
> Rafael, could I also ask you take a quick look over this patch?
> 
> (In reply to Aaron Klotz [:aklotz] from comment #9)
> > high must be > n_buckets or else we fail with a static assert. Would it be
> > better if I bumped n_buckets to 10 and high to 11?
> 
> Nathan explained to me that he added this assert to prevent a common
> off-by-one mistake where a developer would declare a linear histogram with
> both "high" and "n_buckets" set to N. This would lead a to a step interval
> of (N + 1)/N and unexpected bucket configurations. So instead, we should be
> using an enumerated histogram for this purpose. I updated the wiki page to
> reflect this info.

OK, done.

> 
> > I wanted to release the file input stream before deleting the file that it
> > references in the next line following the nested block.
> 
> Can't you just call Close() on the input stream?

Sure, done.

> 
> >+bool
> >+GetFailedLockCount(nsIInputStream* inStream, unsigned int &result)
> >+{
> >+  nsAutoCString bufStr;
> >+  NS_ReadInputStreamToString(inStream, bufStr, kMaxFailedProfileLockFileSize);
> 
> Check rv

Done.

> 
> >   NS_IMETHOD Run() {
> >-    mTelemetry->mLastShutdownTime = ReadLastShutdownDuration(mFilename);
> >+    bool doesFailedLockCountFileExist = false;
> >+    ...
> 
> Move the file reading/removing into a separate function
> 

Done.

> >+void
> >+WriteFailedProfileLock(nsIFile* aProfileDir)
> >+{
> >+  nsCOMPtr<nsIFile> file;
> >+  nsresult rv = GetFailedProfileLockFile(getter_AddRefs(file), aProfileDir);
> >+  if (NS_FAILED(rv)) {
> >+    return;
> >+  }
> 
> You can use NS_ENSURE_SUCCESS_VOID(rv) to shorten the code a bit if you
> like. The downside is that this macro will cause warnings to be printed to
> the console on failure.
> A bit of background on the two views on this question:
> https://groups.google.com/forum/#!msg/mozilla.dev.platform/ArFWYZUyOTE/
> zLY1QShxFwMJ

Okay. (Whoa, that's quite the thread!) Based on that, I'll use the macros to shorten the code when it's reasonable to log the failure.

> 
> >+  unsigned int failedLockCount = 0;
> >+  if (fileSize > 0) {
> >+    nsCOMPtr<nsIInputStream> inStream = do_QueryInterface(fileStream);
> >+    if (!GetFailedLockCount(inStream, failedLockCount)) {
> >+      failedLockCount = 0;
> >+    }
> >+  }
> 
> You need to check if do_QueryInterface returned null here and for other QI
> calls. You could use NS_ENSURE_TRUE_VOID

Done.

> 
> >+  do {
> >+    uint32_t written = 0;
> >+    rv = outStream->Write(bytes, bytesLeft, &written);
> >+    if (NS_FAILED(rv)) {
> >+      break;
> >+    }
> >+    bytes += written;
> >+    bytesLeft -= written;
> >+  } while (bytesLeft);
> 
> bytesLeft > 0 just in case

Done.

(In reply to Rafael Ávila de Espíndola (:espindola) from comment #11)
> Comment on attachment 697607 [details] [diff] [review]
> Telemetry for failed profile lock attempts + unit test rev. 2
> 
> Review of attachment 697607 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> Don't worry about conflicts. If you push this first it should be easy for me
> to rebase.
> 
> ::: toolkit/components/telemetry/Telemetry.cpp
> @@ +730,5 @@
> > +  return NS_SUCCEEDED(rv) && result > 0;
> > +}
> > +
> > +nsresult
> > +GetFailedProfileLockFile(nsIFile* *aFile, nsIFile* aProfileDir = nullptr)
> 
> Can you have this return an nsIFile*?

Not sure I want to do that. That would appear to conflict with our style guide.

> 
> @@ +748,5 @@
> >  class nsFetchTelemetryData : public nsRunnable
> >  {
> >  public:
> > +  nsFetchTelemetryData(const char* aShutdownTimeFilename,
> > +                       nsIFile* aFailedProfileLockFile)
> 
> It is a bit inconsistent to have a name for one file and a nsIFile for
> another, but that is OK with me if vladan is OK with it too.
> 

I agree about the inconsistency; I need a nsIFile object again later so that's why it's being kept around.

> @@ +775,5 @@
> > +    nsresult rv = mFailedProfileLockFile->Exists(&doesFailedLockCountFileExist);
> > +    NS_ENSURE_SUCCESS(rv, rv);
> > +
> > +    if (doesFailedLockCountFileExist) {
> > +      unsigned int failedLockCount = 0;
> 
> Why do you need to do?
> 
> if (exist)
>   open(..)
> 
> It is more efficient to try to open and check the error, no?
> 

Fair enough.

> @@ +787,5 @@
> > +        NS_ENSURE_STATE(GetFailedLockCount(inStream, failedLockCount));
> > +      }
> > +      mFailedProfileLockFile->Remove(false);
> > +      if (failedLockCount) {
> > +        Telemetry::Accumulate(Telemetry::STARTUP_PROFILE_LOCK_FAILURES,
> 
> This is safe to call from non-main threads, right?

Vladan said so in his first review of the patch.

> 
> @@ +2184,5 @@
> > +    return;
> > +  }
> > +  bool fileExists = false;
> > +  rv = file->Exists(&fileExists);
> > +  if (NS_FAILED(rv)) {
> 
> Same here, why do you need to check for existence first?

That was an attempt at clarifying the code path around the subsequent call to GetFileSize but I'll take a better approach.

> 
> @@ +2214,5 @@
> > +  ++failedLockCount;
> > +  nsAutoCString bufStr;
> > +  bufStr.AppendInt(static_cast<int>(failedLockCount));
> > +  nsCOMPtr<nsISeekableStream> seekStream = do_QueryInterface(fileStream);
> > +  rv = seekStream->Seek(nsISeekableStream::NS_SEEK_SET, 0);
> 
> You just created the file, why do you need to seek?

There are two possible cases here:

(1) WriteFailedProfileLock() is called when there are no previous failures. The file doesn't exist yet. In that case, the Seek is redundant. I can add a check here to avoid Seeking in this case.

(2) WriteFailedProfileLock() is called when there already have been previous failures. We're going to be doing a read-modify-write to the profile lock file. We've just done the read so the file pointer is > 0. We need to reset that back to the beginning of the file.
Attachment #697607 - Attachment is obsolete: true
Attachment #699336 - Flags: review?(vdjeric)
Comment on attachment 699336 [details] [diff] [review]
Telemetry for failed profile lock attempts + unit test rev. 3

Please rebase and I'll check it in
Attachment #699336 - Flags: review?(vdjeric)
Attachment #699336 - Flags: review+
Attachment #699336 - Flags: checkin+
Attachment #699336 - Flags: checkin+
Attached patch Rebased patch (obsolete) — Splinter Review
Attachment #699336 - Attachment is obsolete: true
Comment on attachment 699929 [details] [diff] [review]
Rebased patch

Carrying forward r+
Attachment #699929 - Flags: review+
Attachment #699929 - Flags: checkin?(vdjeric)
https://hg.mozilla.org/integration/mozilla-inbound/rev/23bb4a03ef34

(Please use the checkin-needed keyword instead of setting the checkin? flag.)
Attachment #699929 - Flags: checkin?(vdjeric)
Backed out because of possible xpcshell bustage: https://hg.mozilla.org/integration/mozilla-inbound/rev/4f18799588dc
Those histogram values in that failed test are bizarre and make no sense. I just rebased the patch again ran it again on my laptop and it was fine. :-( Investigating...
Attached patch Patch with bug fix (obsolete) — Splinter Review
This makes some changes to nsFetchTelemetryData::LoadFailedLockCount that should take care of that test failure.
Attachment #699929 - Attachment is obsolete: true
Attachment #702501 - Flags: review?(vdjeric)
Comment on attachment 702501 [details] [diff] [review]
Patch with bug fix

>+  "STARTUP_PROFILE_LOCK_FAILURES": {
>+    "kind": "enumerated",
>+    "n_values": 11,
>+    "description": "The user's profile was unable to be locked, preventing startup"
>   }

Since this is a startup measurement of sorts + we are only ever going to store a single value per session, I think we should make this a simpleMeasurement instead of a histogram.
You would only need to change the measurement recording in Telemetry to make it more similar to Rafael's approach for shutdown duration.

>+  nsresult
>+  LoadFailedLockCount(uint32_t& failedLockCount)
>+  {
>+    failedLockCount = 0;
>+    int64_t fileSize = 0;
>+    nsresult rv = mFailedProfileLockFile->GetFileSize(&fileSize);
>+    NS_ENSURE_SUCCESS(rv, rv);
>+    NS_ENSURE_STATE(fileSize <= kMaxFailedProfileLockFileSize);
>+    nsCOMPtr<nsIInputStream> inStream;
>+    rv = NS_NewLocalFileInputStream(getter_AddRefs(inStream),
>+                                    mFailedProfileLockFile,
>+                                    PR_RDONLY);
>+    NS_ENSURE_SUCCESS(rv, rv);
>+    NS_ENSURE_STATE(GetFailedLockCount(inStream, fileSize, failedLockCount));

NS_ENSURE_TRUE seems better suited for this than NS_ENSURE_STATE since we're not checking any state for null-ness. Same with the fileSize check above
Attachment #702501 - Flags: review?(vdjeric)
Attachment #702501 - Attachment is obsolete: true
Attachment #707799 - Flags: review?(vdjeric)
Comment on attachment 707799 [details] [diff] [review]
Patch that provides results in simple measures

LGTM. 


>+  nsresult
>+  LoadFailedLockCount(uint32_t& failedLockCount)
>+  {
>+    failedLockCount = 0;
      [...]
>+    NS_ENSURE_SUCCESS(rv, rv);
>+    NS_ENSURE_TRUE(GetFailedLockCount(inStream, fileSize, failedLockCount),
>+                   NS_ERROR_UNEXPECTED);

Why return nsresult here if we're never going to check it? Is it to output warnings with the NS_ENSURE macros?
Attachment #707799 - Flags: review?(vdjeric) → review+
(In reply to Vladan Djeric (:vladan) from comment #24)
> Comment on attachment 707799 [details] [diff] [review]
> Patch that provides results in simple measures

> Why return nsresult here if we're never going to check it? Is it to output
> warnings with the NS_ENSURE macros?

That, and I wanted to minimize the footprint of the changes since I already knew that this code ran well as-is.
Keywords: checkin-needed
Backed-out and relanded because I misspelled Aaron's name in the original commit (sorry, aklotz):

Backout:
https://hg.mozilla.org/integration/mozilla-inbound/rev/4df72bf4cd1c

Re-Inbound:
https://hg.mozilla.org/integration/mozilla-inbound/rev/4d956c4b8279
https://hg.mozilla.org/mozilla-central/rev/4d956c4b8279
Status: NEW → RESOLVED
Closed: 7 years ago
Flags: in-testsuite+
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.