Closed Bug 1036454 Opened 5 years ago Closed 5 years ago

Failing assertion when calling resetUserPrefs

Categories

(Core :: Preferences: Backend, defect)

x86_64
Linux
defect
Not set

Tracking

()

RESOLVED FIXED
mozilla39
Tracking Status
firefox39 --- fixed

People

(Reporter: marco, Assigned: matthew.spraggs, Mentored)

Details

(Whiteboard: [lang=c++])

Attachments

(9 files)

I'm using nsIPrefService::resetUserPrefs in a mochitest, there's a failing assertion:

LoadPlugin() /tmp/tmp3OvKyX.mozrunner/plugins/libnptestjava.so returned 7f13c364aeb0
LoadPlugin() /tmp/tmp3OvKyX.mozrunner/plugins/libnpsecondtest.so returned 7f13c36acac0
LoadPlugin() /tmp/tmp3OvKyX.mozrunner/plugins/libnptest.so returned 7f13c36bf730
Assertion failure: aOp == PL_DHASH_LOOKUP || aTable->recursionLevel == 0, at xpcom/glue/pldhash.cpp:529
pref_HashPref (modules/libpref/src/prefapi.cpp:757)
PREF_SetIntPref (modules/libpref/src/prefapi.cpp:265)
nsPrefBranch::SetIntPref(char const*, int) (modules/libpref/src/nsPrefBranch.cpp:211)
NS_InvokeByIndex (xpcom/reflect/xptcall/md/unix/xptcinvoke_x86_64_unix.cpp:166)
XPCCallContext::operator JSContext*() const (js/xpconnect/src/xpcprivate.h:801)
XPCWrappedNative::CallMethod(XPCCallContext&, XPCWrappedNative::CallMode) (js/xpconnect/src/XPCWrappedNative.cpp:1703)
XPC_WN_CallMethod(JSContext*, unsigned int, JS::Value*) (js/xpconnect/src/XPCWrappedNativeJSOps.cpp:1272)
js::CallJSNative(JSContext*, bool (*)(JSContext*, unsigned int, JS::Value*), JS::CallArgs const&) (js/src/jscntxtinlines.h:231)
js::Invoke(JSContext*, JS::CallArgs, js::MaybeConstruct) (js/src/vm/Interpreter.cpp:455)
Interpret (js/src/vm/Interpreter.cpp:2551)
js::RunScript(JSContext*, js::RunState&) (js/src/vm/Interpreter.cpp:402 (discriminator 1))
js::Invoke(JSContext*, JS::CallArgs, js::MaybeConstruct) (js/src/vm/Interpreter.cpp:474)
js::Invoke(JSContext*, JS::Value const&, JS::Value const&, unsigned int, JS::Value const*, JS::MutableHandle<JS::Value>) (js/src/vm/Interpreter.cpp:511)
~AutoLastFrameCheck (js/src/jsapi.cpp:4248)
nsXPCWrappedJSClass::CallMethod(nsXPCWrappedJS*, unsigned short, XPTMethodDescriptor const*, nsXPTCMiniVariant*) (js/xpconnect/src/XPCWrappedJSClass.cpp:1259)
PrepareAndDispatch (xpcom/reflect/xptcall/md/unix/xptcstubs_x86_64_linux.cpp:124)
SharedStub (xptcstubs_x86_64_linux.cpp:?)
Can you please include the full stacktrace all the way back to main()?
Flags: needinfo?(mar.castelluccio)
Attached file log.txt
Flags: needinfo?(mar.castelluccio)
Bug here is that we're dispatching observers while enumerating the table:

18  libxul.so!nsPrefBranch::NotifyObserver [nsPrefBranch.cpp:84a5992a5d4d : 697 + 0x1d]
19  libxul.so!pref_DoCallback [prefapi.cpp:84a5992a5d4d : 975 + 0x9]
20  libxul.so!pref_ClearUserPref [prefapi.cpp:84a5992a5d4d : 642 + 0x4]
21  libxul.so!PL_DHashTableEnumerate [pldhash.cpp:84a5992a5d4d : 684 + 0x13]
22  libxul.so!PREF_ClearAllUserPrefs [prefapi.cpp:84a5992a5d4d : 657 + 0x14]

We likely need to batch up all the notifications and perform them after the enumeration is complete.
Mentor: benjamin
Whiteboard: [lang=c++]
I will like to work on this. Please suggest where to look.
Is anyone working on this? Akshendra are you still planning on working on it? If not I'd be interested in fixing this.
@Matt No. I haven't started yet. So have a go.
Great, I'll take this bug then.
Marco would you be able to give me the steps that reproduce this bug? Which mochitest did you run?
Flags: needinfo?(mar.castelluccio)
I was writing a new mochitest, unfortunately I don't recall precisely what I was doing.
You can probably reproduce the problem by writing a new mochitest that calls the resetUserPref function.
Flags: needinfo?(mar.castelluccio)
Ok thanks. I'm going to write an xpcshell test for this, after reading the mochitest guidance, which recommends against using mochitest for many cases.
Attached file Chrome mochitest
This chrome mochitest reproduces the specified bug.
Attached file test_bug1036454.js
This xpcshell test does not reproduce the specified bug, even though the resetUserPrefs function is called.
Ok I've had a look at the implementations of pref_ClearUserPref and PLDHashTable::Enumerate, and superficially it should be easy to add in some container to pref_ClearUserPref to track all the preference names, then run through the container at the end of Enumerate and call pref_DoCallback on each of them. However, it seems adding in an extra argument to pref_ClearUserPref is going to mess up the whole PLDHashEnumerator interface. The second option is putting in a global variable, which isn't a good idea for obvious reasons.

Benjamin, do you have some advice on how to proceed? Will I be lynched if I modify the PLDHashEnumerator interface to accept an additional container argument?
You mustn't add another arg, but you don't need to. In the case of pref_ClearUserPref, why can't you use the existing `arg` void*? It appears to be unused, and that's exactly what it's for.
Update:

Thanks Benjamin for the advice. I was over-complicating the problem I think. Following this is the draft patch of my changes. The assertion now passes, but the test I've written now fails at a different point in the execution. It's a little strange because the backtrace differs between runs (I've attached the test output for two example cases). Perhaps this is because of different threads failing? I'm not sure if this is a direct result of the fix I've implemented, or whether it's a separate bug that needs to be logged. I'll keep exploring the source of the bug to see if I can ascertain what the issue is.
Attached file Draft proposed changes
Draft changes
Attached file fail.log
Example new exception
Attached file fail2.log
Comment on attachment 8517720 [details]
fail2.log

Example new exception
I've been poking around in the preferences API, trying to find the reason for the aforementioned crashes. Because the assertion errors come after the ClearAllUserPrefs function has finished executing, I don't really understand why it's happening, mainly due to my patchy knowledge of the preferences API. I have been able to prevent the errors by commenting out the lines "pref->flags &= ~PREF_USERSET;" and "nextOp = PL_DHASH_REMOVE;" in function pref_ClearUserPref, file prefapi.cpp. I expect this will break other tests and cause problems elsewhere, and it still doesn't give me much insight into what's happening. Benjamin are you able to offer some insight into what might be happening? My next approach will be to try and locate where these failures occur in the javascript processing sequence. Hopefully this will give more insight.
Attachment #8517720 - Attachment mime type: text/x-log → text/plain
The new assertion is not related to the pref API itself. The assertion is failing because you cleared a pref which is required for the test automation to work correctly:

http://hg.mozilla.org/mozilla-central/annotate/21b745197618/js/xpconnect/wrappers/WrapperFactory.cpp#l423
calls http://hg.mozilla.org/mozilla-central/annotate/21b745197618/js/xpconnect/src/xpcprivate.h#l3749

That pref is typically set by release automation here:
http://dxr.mozilla.org/mozilla-central/source/testing/profiles/prefs_general.js#116 along with many other "user" prefs.

So basically the assertion is caused by you calling .resetUserPrefs from within an automated test.
Ok sure, thanks. I had a feeling it wasn't part of the preferences API due to the location of the stack trace, but I wasn't sure exactly how it fitted into what I was doing in the test. I've saved and restored the preferences before and after calling resetUserPrefs (see new chrome test attachment).

I still get an assertion error relating to the access of non-local network addresses (see second attachment). I've investigated this with a breakpoint here: http://hg.mozilla.org/mozilla-central/annotate/21b745197618/netwerk/base/src/nsSocketTransport2.cpp#l1198. If I comment out the call to resetUserPrefs in the test, then everything runs well, the breakpoint is hit, and the IP that's connected to is local (127.0.0.1 in all cases). However, if resetUserPrefs is called, then the first time this breakpoint is hit after the the user prefs are reset, the IP is not local (54.148.85.183, or tiles.services.mozilla.com) and the assertion error occurs.

On the javascript side, I put an alert in right after I called resetUserPrefs in the js to see what happens. The dialogue appears then immediately disappears. It looks like this is happening on another thread, and is happening before any other js really gets a chance to run. So it looks like I don't even get a chance to put all the test preferences back after the code executes, unless I add some code into the C++ API to put them back in debug mode. I can't see any other way to fix this issue.

Finally, I can set a breakpoint here: http://hg.mozilla.org/mozilla-central/annotate/21b745197618/netwerk/base/src/nsSocketTransport2.cpp#l1194 and manually set crashOnNonLocalConnections to 0 in gdb. The test then proceeds ok and doesn't raise anymore assertion errors. Obviously this doesn't fix this case, but it shows that restoring the user preferences in the js after resetting them solves the previous assertion error (this network error sometimes comes before the preferences assertion error).
Attached file Ammended test
Ok, I'm totally stuck. Immediately running pref_DoCallback on each saved preference name, I set the user preferences back to the values in dxr.mozilla.org/mozilla-central/source/testing/profiles/prefs_general.js. When I say immediately after, I mean on the subsequent lines in the same function, using PREF_SetXXXPref. Perhaps I'm doing it wrong, but I still get the exception relating to a non-local address. Totally stumped!
Hi Benjamin,

Would you be able to advise of where in the code base I might be able to find the code that handles URL routing? As far as I understand, something (a user setting I suppose) is being changed so that urls are no longer routed through the test server correctly, resulting in the request hitting the network and the test failing. I'm hoping that if I can find where this url-handling takes place, I might be able to determine what has changed that results in the aforementioned failure.

Best,

Matt
I already explained what's causing the error. You are clearing user prefs set by the test framework which are required for the test framework to work properly.  I think the answer here is that it is not possible to use .resetUserPrefs from a mochitest.
Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → WONTFIX
Yes, but I also tried immediately restoring the preferences in user_prefs.js within the C++ code immediately after they're reset, and the error still occurs. Do you want me to create a patch for the original bug (observers being dispatched while the table is being enumerated)?
Oh yes. Sorry, we just can't really write a good automated test for it.
Status: RESOLVED → REOPENED
Resolution: WONTFIX → ---
Here's the proposed fix for this bug.
Attachment #8579201 - Flags: review?(benjamin)
Comment on attachment 8579201 [details] [diff] [review]
Proposed fix for bug 1036454

It's unusual to be using std::vector and std::string instead of nsTArray/nsString, although in this case it's probably fine, since neither the producer nor the consumer is using XPCOM strings, so there's no benefit in the XPCOM stringbuffer sharing.
Attachment #8579201 - Flags: review?(benjamin) → review+
Sure, I'll bear that in mind for the future. What do I need to do now? The instructions for check-in on MDN just tell me how to format the patch properly, which I *think* I've already done.
Flags: needinfo?(benjamin)
aha! do you have tryserver access? If not I can push this to tryserver for you, and assuming this passes try it will be ready to set checkin-needed and sheriffs will push it to the main tree for you.
Flags: needinfo?(benjamin)
Assignee: nobody → matthew.spraggs
Flags: needinfo?(matthew.spraggs)
I don't have tryserver access unfortunately. Could you push for me? I tried pushing to tryserver with "try: -b do -p linux64 -u xpcshell,mochitests -t none", but I'm not sure if those are the right tests to run.
Flags: needinfo?(matthew.spraggs)
Flags: needinfo?(benjamin)
https://hg.mozilla.org/integration/mozilla-inbound/rev/5bbd37d1d21a there is only one current test that uses .resetUserPrefs, so I ran that test locally and just pushed this. Thank you for the work!
Flags: needinfo?(benjamin)
Target Milestone: --- → mozilla39
No problem. Thanks for your help.
https://hg.mozilla.org/mozilla-central/rev/5bbd37d1d21a
Status: REOPENED → RESOLVED
Closed: 5 years ago5 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.