Closed Bug 1134016 Opened 9 years ago Closed 7 years ago

crash in pref_ValueChanged

Categories

(Core :: Preferences: Backend, defect)

defect
Not set
critical

Tracking

()

RESOLVED INCOMPLETE
Tracking Status
firefox48 --- wontfix
firefox49 --- wontfix
firefox-esr45 --- wontfix
firefox50 --- wontfix
firefox51 --- wontfix
firefox52 --- wontfix
firefox-esr52 --- wontfix
firefox56 --- wontfix
firefox57 --- unaffected
firefox58 --- unaffected

People

(Reporter: MatsPalmgren_bugz, Unassigned)

References

Details

(Keywords: crash, topcrash)

Crash Data

Attachments

(1 obsolete file)

This bug was filed from the Socorro interface and is 
report bp-8ab26458-526c-4196-b5ca-436a12150128.
=============================================================

This signature isn't new but the volume is quite high in v35.
It's currently at #50 in "Top Crashers for Firefox 35.0.1" with
1377 reported crashes for the past 7 days.

pref_ValueChanged
pref_savePref(PLDHashTable*, PLDHashEntryHdr*, unsigned int, void*)
xul.dll@0x18e034b
_SEH_epilog
@0x1
xpsp2res.dll@0x1d7364
xpsp2res.dll@0x1d2c66
xpsp2res.dll@0x1d2e72
xpsp2res.dll@0x1d2a1f
xpsp2res.dll@0x1d2f29
mozilla::Preferences::SavePrefFileInternal(nsIFile*)
mozilla::Preferences::SavePrefFile(nsIFile*)
NS_InvokeByIndex
XPC_WN_CallMethod(JSContext*, unsigned int, JS::Value*)
js::Invoke(JSContext*, JS::CallArgs, js::MaybeConstruct)
Interpret
js::RunScript(JSContext*, js::RunState&)
js::Invoke(JSContext*, JS::CallArgs, js::MaybeConstruct)
js_fun_apply(JSContext*, unsigned int, JS::Value*)
js::Invoke(JSContext*, JS::CallArgs, js::MaybeConstruct)
js::Invoke(JSContext*, JS::Value const&, JS::Value const&, unsigned int, JS::Value const*, JS::MutableHandle<JS::Value>)
js::DirectProxyHandler::call(JSContext*, JS::Handle<JSObject*>, JS::CallArgs const&)
js::CrossCompartmentWrapper::call(JSContext*, JS::Handle<JSObject*>, JS::CallArgs const&)
js::Proxy::call(JSContext*, JS::Handle<JSObject*>, JS::CallArgs const&)
js::proxy_Call(JSContext*, unsigned int, JS::Value*)
js::Invoke(JSContext*, JS::CallArgs, js::MaybeConstruct)
Interpret


It still occurs in 38.0a1: bp-d07c6547-ddff-4e8b-bd9c-93db42150216

98% of crashes are on Windows, 2% on Linux.  None on OSX (in the
past 28 days at least).
We seem to get these also during the startup sanity check (e.g., https://crash-stats.mozilla.com/report/index/6a7b01a4-6853-4301-a89f-22b812160211), called here http://hg.mozilla.org/releases/mozilla-release/annotate/e51f09575766/gfx/src/DriverCrashGuard.cpp#l364 - David, is this done on the main thread, and should we guard against Preferences::GetService failing before we make that call?
Flags: needinfo?(dvander)
I'm not sure what further guards we can do - we already check that the pref service exists (i.e. is non-null). Do we know what percentage of these crashes occur within the crash guard?
Flags: needinfo?(dvander)
Around a quarter in the most recent count (244 out of 1037)
Flags: needinfo?(milan)
Crash volume for signature 'pref_ValueChanged':
 - nightly (version 51): 1 crash from 2016-08-01.
 - aurora  (version 50): 0 crashes from 2016-08-01.
 - beta    (version 49): 76 crashes from 2016-08-02.
 - release (version 48): 532 crashes from 2016-07-25.
 - esr     (version 45): 38 crashes from 2016-05-02.

Crash volume on the last weeks (Week N is from 08-22 to 08-28):
            W. N-1  W. N-2  W. N-3
 - nightly       1       0       0
 - aurora        0       0       0
 - beta         33      24       5
 - release     178     159     100
 - esr           0       0       0

Affected platform: Windows

Crash rank on the last 7 days:
           Browser     Content   Plugin
 - nightly
 - aurora
 - beta    #554
 - release #149
 - esr
Crash volume for signature 'pref_ValueChanged':
 - nightly (version 52): 0 crashes from 2016-09-19.
 - aurora  (version 51): 0 crashes from 2016-09-19.
 - beta    (version 50): 7 crashes from 2016-09-20.
 - release (version 49): 388 crashes from 2016-09-05.
 - esr     (version 45): 32 crashes from 2016-06-01.

Crash volume on the last weeks (Week N is from 10-03 to 10-09):
            W. N-1  W. N-2
 - nightly       0       0
 - aurora        0       0
 - beta          7       0
 - release     281     107
 - esr           0       0

Affected platform: Windows

Crash rank on the last 7 days:
           Browser   Content     Plugin
 - nightly
 - aurora
 - beta    #1598
 - release #177      #11491
 - esr
Crash volume for signature 'pref_ValueChanged':
 - nightly (version 52): 6 crashes from 2016-09-19.
 - aurora  (version 51): 4 crashes from 2016-09-19.
 - beta    (version 50): 41 crashes from 2016-09-20.
 - release (version 49): 1138 crashes from 2016-09-05.
 - esr     (version 45): 0 crashes from 2016-07-25.

Crash volume on the last weeks (Week N is from 10-17 to 10-23):
            W. N-1  W. N-2  W. N-3  W. N-4
 - nightly       6       0       0       0
 - aurora        4       0       0       0
 - beta         14      12       7       0
 - release     312     339     281     107
 - esr           0       0       0       0

Affected platform: Windows

Crash rank on the last 7 days:
           Browser     Content   Plugin
 - nightly #158
 - aurora
 - beta    #834
 - release #173
 - esr
The majority of the crashes coming from release and beta seems to be because of the E10S differences between them, looks like majority of these are with E10S off.
Flags: needinfo?(milan)
(In reply to Milan Sreckovic [:milan] from comment #8)
> Created attachment 8804368 [details] [diff] [review]
> Bug 1134016: Initialize the global hash table pointer.
> 
> Review commit: https://reviewboard.mozilla.org/r/88368/diff/#index_header
> See other reviews: https://reviewboard.mozilla.org/r/88368/

I don't really think this will take care of the problem, but it's an easy thing to do and not worry about.
(In reply to Milan Sreckovic [:milan] from comment #10)
> Comment on attachment 8804368 [details] [diff] [review]
> Bug 1134016: Initialize the global has table pointer, check for the default
> preference before testing its value.
> 
> Review request updated; see interdiff:
> https://reviewboard.mozilla.org/r/88368/diff/1-2/

This could actually take care of the problem.
> PLDHashTable*       gHashTable = nullptr;

This initialization is redundant.  Globals are always zero by default in C/C++.

Can you explain why swapping the order of HasDefault/pref_ValueChanged fixes the problem?
Flags: needinfo?(milan)
If there is no default today, we will still call pref_ValueChanged, which accesses the default value.  When we swap the two lines, if there is no default, we will never actually make a pref_ValueChanged call.
Flags: needinfo?(milan)
If you think "pref" points to allocated memory but that "pref->defaultPref"
(or userPref) contains a bogus stringVal, then the suggested fix seems like
wallpaper to me.  It will only "work" in the limited cases where the corrupted
memory happens to not have the PREF_FLAG_HAS_DEFAULT bit set in the prefFlags
field.  And there's no guarantee the program won't do something worse after
we avoided this illegal memory read.
I suppose.

Initializing a variable, is at least going to stop people from arguing whether all platforms zero out all global variables.  It's easier to read.

For the other part, if the default value is not initialized, there is no point in checking its value, even if there is no crash in doing it.  It's a better thing to do.  Probably should change the one other place where we check against the default value before we check if there is a default value.

The bug hasn't moved for 20 months, we're getting ~100 crashes per day, but if this going to make things worse, I certainly don't want it landed.  Thus the review request.
Comment on attachment 8804368 [details] [diff] [review]
Bug 1134016: Initialize the global has table pointer, check for the default preference before testing its value.

https://reviewboard.mozilla.org/r/88368/#review88446

::: modules/libpref/prefapi.cpp:70
(Diff revision 2)
>  
>      const char *otherKey = reinterpret_cast<const char*>(key);
>      return (strcmp(prefEntry->key, otherKey) == 0);
>  }
>  
> -PLDHashTable*       gHashTable;
> +PLDHashTable*       gHashTable = nullptr;

Please don't make this change. Init to 0 is part of the language; Brendan preferred that we didn't add the explicit initializer and I don't see any reason to change that.

::: modules/libpref/prefapi.cpp:357
(Diff revision 2)
>  
>          // where we're getting our pref from
>          PrefValue* sourcePref;
>  
>          if (pref->prefFlags.HasUserValue() &&
> -            (pref_ValueChanged(pref->defaultPref,
> +            (!(pref->prefFlags.HasDefault()) ||

I don't expect this to make any change. So you're welcome to land it, but we shouldn't resolve the bug unless you see that it actually caused things to improve.
Attachment #8804368 - Flags: review?(benjamin) → review+
Attachment #8804368 - Attachment is patch: true
Attachment #8804368 - Attachment mime type: text/x-review-board-request → text/plain
It'd be nice if PREF_ClearUserPref and pref_SavePrefs were clashing, but everything should be on the main thread, so there is not a lot of chance of that.
This is highly correlated to some addons, e.g. (100.0% in signature vs 00.24% overall) Addon "{cd617375-6743-4ee8-bac4-fbf10f35729e}" = true.

{cd617375-6743-4ee8-bac4-fbf10f35729e} is RightToClick. Apparently it has been removed from AMO (https://addons.mozilla.org/en-Us/firefox/addon/righttoclick/).
Too late for firefox 52, mass-wontfix.
(In reply to Marco Castelluccio [:marco] from comment #18)
> This is highly correlated to some addons, e.g. (100.0% in signature vs
> 00.24% overall) Addon "{cd617375-6743-4ee8-bac4-fbf10f35729e}" = true.
> 
> {cd617375-6743-4ee8-bac4-fbf10f35729e} is RightToClick. Apparently it has
> been removed from AMO
> (https://addons.mozilla.org/en-Us/firefox/addon/righttoclick/).

There are no instances of this crash signature in 57 or 58, which supports the theory that this crash is triggered by some legacy add-ons (which are not supported in 57+).

If we aren't going to fix this in a 56.0.x dot release, then we can probably resolve this as WORKSFORME in 57+.
Recent refactorings of libpref have changed this code sufficiently that it's going to show up with a different signature if it's still present. If that happens, we might as well just open a new bug report.
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → INCOMPLETE
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: