Note: There are a few cases of duplicates in user autocompletion which are being worked on.

Stop client-side Windows "throttling"

RESOLVED FIXED in mozilla1.9.3a1

Status

()

Toolkit
Crash Reporting
P2
normal
RESOLVED FIXED
8 years ago
8 years ago

People

(Reporter: Samuel Sidler (old account; do not CC), Assigned: ddahl)

Tracking

Trunk
mozilla1.9.3a1
Points:
---
Bug Flags:
blocking1.9.2 +

Firefox Tracking Flags

(status1.9.2 beta4-fixed)

Details

Attachments

(1 attachment, 5 obsolete attachments)

Right now, we client-side "throttle" by only checking the submission box 10% of the time on Windows. We then do server-side throttling and only process 15% of crash reports. Instead of this, we should do no client-side throttling at all and rely on server-side throttling.

We should also *reset* this preference to be checked starting with Firefox 3.6.

This is specifically for Firefox 3.6 and we won't land it for 3.5 or 3.0.

Lars, I'm CCing you to this bug because when it happens (and, more specifically, when Firefox 3.6 goes live) you'll need to throttle Windows crashes server-side more than 15% (probably more like 1%, but we'll play with it when we release).
Flags: blocking1.9.2?
(Reporter)

Updated

8 years ago
OS: Mac OS X → All
Hardware: x86 → All
3.6b1 will not set the client-side throttle in the mozconfig (bug 518496). If you'd like to stop throttling 3.5.x then we'll need to update the mozconfig's there too.
I'd like to not do it for 3.6 final and 3.6.x. We'll leave things the same for 3.5.x and 3.0.x, as I said.
Sounds like it should be duped over there if the work happened there.
Component: Breakpad Integration → Release Engineering
Product: Toolkit → mozilla.org
QA Contact: breakpad.integration → release
Version: unspecified → other
(In reply to comment #0)
> We should also *reset* this preference to be checked starting with Firefox 3.6.

This would be an in-app change, right ? Whether this belongs back in Toolkit:Breakpad Integration or elsewhere IDK.
(In reply to comment #0)
> We should also *reset* this preference to be checked starting with Firefox 3.6.

The preference retains whatever value was last there. I'd rather not reset it for users that intentionally opted out.
(In reply to comment #5)
> (In reply to comment #0)
> > We should also *reset* this preference to be checked starting with Firefox 3.6.
> 
> The preference retains whatever value was last there. I'd rather not reset it
> for users that intentionally opted out.

But we have no way of knowing if they intentionally opted out or if we just didn't check the box for them (which happened for 90% of users, the first time around). I'd rather reset it once, turn off all throttling, and then users can truly opt out without us opting out for them.
Do we have any pending patches (or desires) to change the crashreporter UI? That would be an ideal time to reset it, from the POV of users being more likely to read it when it looks different.
Any change we might want to make would be more text-related than organization-related and string freeze has already past for 3.6.

Comment 9

8 years ago
Is there anything left to do here?
Has this bug been fixed by bug 518496 ?
This bug is really about resetting the pref (and whether we should). Moving it out of RelEng.
Component: Release Engineering → Preferences
Flags: blocking1.9.2?
Product: mozilla.org → Firefox
QA Contact: release → preferences
Version: other → 3.6 Branch

Updated

8 years ago
Flags: blocking-firefox3.6?
We should reset the pref, users can opt out again if they need to.
We should have zero client side throttling on Windows when Firefox 3.6 launches, and for the forseeable future.

Who can take this blocker?
Flags: blocking-firefox3.6? → blocking-firefox3.6+
Priority: -- → P2
(Assignee)

Comment 12

8 years ago
What is the name of the preference? I can take this.
It's not a pref, it's a registry entry, checked here:
http://mxr.mozilla.org/mozilla-central/source/toolkit/crashreporter/client/crashreporter_win.cpp#901

Looks like it's "Software\\Mozilla\\Crash Reporter\\Enabled", piecing together the strings from the top of that file.
Component: Preferences → Breakpad Integration
Flags: blocking-firefox3.6+
Product: Firefox → Toolkit
QA Contact: preferences → breakpad.integration
Version: 3.6 Branch → Trunk
Flags: blocking1.9.2+
Why should we do this? Comment #0 is not clear about what problem we're trying to solve: as far as I can tell we already have more than enough crash reports on Windows to be statistically significant in terms of top crash and regression analysis: is there other data we're trying to glean that would make this (and the potential user annoyance of resetting a pref) worthwhile?
Benjamin: As I told you before, we're going to start processing 100% of reports. If you have questions about that, feel free to ask shaver.

Regardless, implementing this bug will at least allow us to have a better guess at what percentage of users is submitting reports. If we don't want to process 100%, we can just server-side throttle more. There's no reason not to fix this bug.
You've said that before, I questioned why and got no good answers. From what I see, we're swimming in Windows data now and I don't think trying to extract more data from Windows release users is an especially good use of everyone's time. Anyone who could fix this RFE could also help fix multi-process crash reporting for 1.9.2.
Fixing this bug is as easy as renaming the registry entry we use. (Build already knows to set the build flag to 100%.) I think that's quite a bit different than working on multi-process crash reporting.
(Why do we need multi-process crash reporting for 1.9.2?)

I don't know why you think we're swimming in Windows data, or why you think that having a larger set of crashes on which to run our correlation and other tools will hurt us, tbh.

Why do you believe that the combination of client- and server- side throttling (which amount to 1.5% of total crashes, if the math in comment 0 is correct) is statistically significant, and for what analyses?  I mean, it's clear that the sample size is enough that the coarse distribution isn't likely to have occurred by chance, but in order to have statistically significant (read: distinguishable from chance) distribution information for _specific_ crashes, I don't think it's as clear that what we have right now is sufficient.
(Assignee)

Updated

8 years ago
Assignee: nobody → ddahl
(Assignee)

Comment 19

8 years ago
Created attachment 409199 [details] [diff] [review]
v 1.0 Changed Key Name to SubmitCrashReport

I tested this by enabling and disabling the crash reporter and rebuilding with the key name changed to "SubmitCrashReport" instead of "SubmitReport".
Attachment #409199 - Flags: review?
(Assignee)

Updated

8 years ago
Attachment #409199 - Flags: review? → review?(ted.mielczarek)
(Assignee)

Comment 20

8 years ago
Comment on attachment 409199 [details] [diff] [review]
v 1.0 Changed Key Name to SubmitCrashReport

WIP - going to add old key removal
Attachment #409199 - Flags: review?(ted.mielczarek)
(Assignee)

Comment 21

8 years ago
Created attachment 409224 [details] [diff] [review]
v 1.1 adding reg key removal - WIP

Just curious if I am on the right track - this didn't seem to remove the old key at all. how can I attach a debugger to the crashreporter? It tells me it shold not be run directly
Attachment #409199 - Attachment is obsolete: true
Attachment #409224 - Flags: review?(ted.mielczarek)
Whiteboard: [has patch][needs review ted]
(Assignee)

Comment 22

8 years ago
Created attachment 410099 [details] [diff] [review]
v 1.2 Working Patch

Tested by crashing with ted's crash extension and ms debugger/ regedit - what test suite might there be?
Attachment #409224 - Attachment is obsolete: true
Attachment #410099 - Flags: review?(ted.mielczarek)
Attachment #409224 - Flags: review?(ted.mielczarek)
There aren't any tests for the native crash reporter client, sadly. (It's hard to write tests for platform-native UI currently.)
(Assignee)

Updated

8 years ago
Attachment #410099 - Flags: review?(ted.mielczarek) → review?(robert.bugzilla)
(Assignee)

Updated

8 years ago
Attachment #410099 - Flags: review?(robert.bugzilla) → review?(jmathies)

Comment 24

8 years ago
Is there a chance for the key to be set in both local machine and current user? (Maybe we should remove from both.) Also, I would use RegOpenKeyExW so you can set access permission properly, RegOpenKey is depreciated. (I know we use it all over the place here, but in new code we should do it right.)
(Assignee)

Comment 25

8 years ago
(In reply to comment #24)
> Is there a chance for the key to be set in both local machine and current user?
> (Maybe we should remove from both.) 

According to the SetBoolKey method, we do not set the key in both places, only current user, however, I thought there might be some legacy reason to look in both places for this value.

> Also, I would use RegOpenKeyExW so you can
> set access permission properly, RegOpenKey is depreciated. (I know we use it
> all over the place here, but in new code we should do it right.)

ok, sounds good.
(Assignee)

Comment 26

8 years ago
Changed the code to:
...
  HKEY hRegKey;
  LONG removeSuccess;

  if (RegOpenKeyEx(HKEY_LOCAL_MACHINE, key, 0, KEY_READ, &hRegKey) 
      == ERROR_SUCCESS) {
    removeSuccess = RegDeleteValue(hRegKey, valueName);
  }
...

when I go to test this via ted's crashme extension, the crashreporter does not run, instead, I get the dialog to debug. 

reverting to the previous patch, the same thing happens. Very strange. I guess a clobber build is in order?
(Assignee)

Comment 27

8 years ago
whoops, looks like the application.ini setting for CrashReporter = Enabled was overwritten.
(Assignee)

Comment 28

8 years ago
Created attachment 411322 [details] [diff] [review]
v 1.3 Non-working patch

WIP: Doesn't remove the old vlaue, need to debug.
(Assignee)

Comment 29

8 years ago
Created attachment 411488 [details] [diff] [review]
v 1.4 working patch - comments addressed

Was opening the registry entry in read-only mode. the old value is removed now.
Attachment #410099 - Attachment is obsolete: true
Attachment #411322 - Attachment is obsolete: true
Attachment #411488 - Flags: review?
Attachment #410099 - Flags: review?(jmathies)
(Assignee)

Updated

8 years ago
Attachment #411488 - Flags: review? → review?(jmathies)

Comment 30

8 years ago
What's the point of removeSuccess? Also, (I think) the point here is to remove the key if it exists in either hive, so I would just do:

+  if (RegOpenKeyEx(HKEY_LOCAL_MACHINE, key, 0, KEY_SET_VALUE, &hRegKey) 
+      == ERROR_SUCCESS) {
+    RegDeleteValue(hRegKey, valueName);
+    RegCloseKey(hRegKey);
+  }
+
+  if (RegOpenKeyEx(HKEY_CURRENT_USER, key, 0, KEY_SET_VALUE, &hRegKey) 
+      == ERROR_SUCCESS) {
+    RegDeleteValue(hRegKey, valueName);
+    RegCloseKey(hRegKey);
+  }

..and touch up that function:

// Removes a value from HKEY_LOCAL_MACHINE and HKEY_CURRENT_USER, if it exists.
static void RemoveUnusedValues(const wchar_t* key, LPCTSTR valueName)
Whiteboard: [has patch][needs review ted]
(Assignee)

Comment 31

8 years ago
Created attachment 411517 [details] [diff] [review]
v 1.5 Comments addressed

The removeSuccess is now removed. That was only for debugging purposes. CloseKey is now called, added comment and changed function name to 'removeUnusedValues'.
Attachment #411488 - Attachment is obsolete: true
Attachment #411517 - Flags: review?(jmathies)
Attachment #411488 - Flags: review?(jmathies)
I am pleased to see the successful removal of removeSuccess.

Updated

8 years ago
Attachment #411517 - Flags: review?(jmathies) → review+
(Assignee)

Updated

8 years ago
Keywords: checkin-needed
http://hg.mozilla.org/mozilla-central/rev/5b9389a84c2a
Status: NEW → RESOLVED
Last Resolved: 8 years ago
Keywords: checkin-needed
Resolution: --- → FIXED
Whiteboard: baking
Target Milestone: --- → mozilla1.9.3a1
(Assignee)

Comment 34

8 years ago
Can we land this for branch now? Not sure how we would catch a regression or issue other than trunk nightly users
I don't think we'd catch anything until actually shipping a release. This is blocking, so it's ready to land whenever.
Keywords: checkin-needed
Whiteboard: baking → [can land branch]
http://hg.mozilla.org/releases/mozilla-1.9.2/rev/784785a78a5a
status1.9.2: --- → final-fixed
Keywords: checkin-needed
Whiteboard: [can land branch]
You need to log in before you can comment on or make changes to this bug.