Intermittent browser/components/migration/tests/unit/test_automigration.js | xpcshell return code: -11

RESOLVED FIXED in Firefox 60

Status

()

defect
P5
normal
RESOLVED FIXED
a year ago
a year ago

People

(Reporter: intermittent-bug-filer, Assigned: dthayer)

Tracking

({crash, intermittent-failure})

unspecified
mozilla61
Points:
---

Firefox Tracking Flags

(firefox-esr52 unaffected, firefox59 unaffected, firefox60 fixed, firefox61 fixed)

Details

(Whiteboard: [fxperf:p1])

Attachments

(1 attachment)

Reporter

Description

a year ago
treeherder
Filed by: ncsoregi [at] mozilla.com

https://treeherder.mozilla.org/logviewer.html#?job_id=166769929&repo=mozilla-beta

https://queue.taskcluster.net/v1/task/AM6EzIvfS3qWwfRnoScfAQ/runs/0/artifacts/public/logs/live_backing.log

task 2018-03-08T16:02:08.968Z] 16:02:08     INFO -  TEST-START | browser/components/migration/tests/unit/test_automigration.js
[task 2018-03-08T16:02:09.738Z] 16:02:09  WARNING -  TEST-UNEXPECTED-FAIL | browser/components/migration/tests/unit/test_automigration.js | xpcshell return code: -11
[task 2018-03-08T16:02:09.738Z] 16:02:09     INFO -  TEST-INFO took 759ms
[task 2018-03-08T16:02:09.739Z] 16:02:09     INFO -  >>>>>>>
[task 2018-03-08T16:02:09.740Z] 16:02:09     INFO -  PID 7852 | [7852, Main Thread] WARNING: Couldn't get the user appdata directory. Crash events may not be produced.: file /builds/worker/workspace/build/src/toolkit/crashreporter/nsExceptionHandler.cpp, line 2678
[task 2018-03-08T16:02:09.740Z] 16:02:09     INFO -  (xpcshell/head.js) | test MAIN run_test pending (1)
[task 2018-03-08T16:02:09.740Z] 16:02:09     INFO -  (xpcshell/head.js) | test run_next_test 0 pending (2)
[task 2018-03-08T16:02:09.740Z] 16:02:09     INFO -  (xpcshell/head.js) | test MAIN run_test finished (2)
[task 2018-03-08T16:02:09.741Z] 16:02:09     INFO -  running event loop
[task 2018-03-08T16:02:09.741Z] 16:02:09     INFO -  browser/components/migration/tests/unit/test_automigration.js | Starting checkMigratorPicking
[task 2018-03-08T16:02:09.741Z] 16:02:09     INFO -  (xpcshell/head.js) | test checkMigratorPicking pending (2)
[task 2018-03-08T16:02:09.741Z] 16:02:09     INFO -  (xpcshell/head.js) | test run_next_test 0 finished (2)
[task 2018-03-08T16:02:09.742Z] 16:02:09     INFO -  TEST-PASS | browser/components/migration/tests/unit/test_automigration.js | checkMigratorPicking - [checkMigratorPicking : 53] true == true
[task 2018-03-08T16:02:09.742Z] 16:02:09     INFO -  PID 7852 | JavaScript strict warning: resource:///modules/MigrationUtils.jsm, line 718: ReferenceError: reference to undefined property "@mozilla.org/profile/migrator;1?app=browser&type=gobbledygook"
[task 2018-03-08T16:02:09.743Z] 16:02:09     INFO -  TEST-PASS | browser/components/migration/tests/unit/test_automigration.js | checkMigratorPicking - [checkMigratorPicking : 53] true == true
[task 2018-03-08T16:02:09.744Z] 16:02:09     INFO -  TEST-PASS | browser/components/migration/tests/unit/test_automigration.js | checkMigratorPicking - [checkMigratorPicking : 53] true == true
[task 2018-03-08T16:02:09.745Z] 16:02:09     INFO -  TEST-PASS | browser/components/migration/tests/unit/test_automigration.js | checkMigratorPicking - [checkMigratorPicking : 53] true == true
[task 2018-03-08T16:02:09.746Z] 16:02:09     INFO -  TEST-PASS | browser/components/migration/tests/unit/test_automigration.js | checkMigratorPicking - [checkMigratorPicking : 53] true == true
[task 2018-03-08T16:02:09.747Z] 16:02:09     INFO -  (xpcshell/head.js) | test run_next_test 1 pending (2)
[task 2018-03-08T16:02:09.748Z] 16:02:09     INFO -  (xpcshell/head.js) | test checkMigratorPicking finished (2)
[task 2018-03-08T16:02:09.750Z] 16:02:09     INFO -  "CONSOLE_MESSAGE: (warn) [JavaScript Warning: "ReferenceError: reference to undefined property "@mozilla.org/profile/migrator;1?app=browser&type=gobbledygook"" {file: "resource:///modules/MigrationUtils.jsm" line: 718}]"
[task 2018-03-08T16:02:09.751Z] 16:02:09     INFO -  "CONSOLE_MESSAGE: (error) [JavaScript Error: "Cc[("@mozilla.org/profile/migrator;1?app=browser&type=" + aKey)] is undefined" {file: "resource:///modules/MigrationUtils.jsm" line: 718}]
[task 2018-03-08T16:02:09.752Z] 16:02:09     INFO -  MU_getMigrator@resource:///modules/MigrationUtils.jsm:718:1
[task 2018-03-08T16:02:09.753Z] 16:02:09     INFO -  pickMigrator@resource:///modules/AutoMigrate.jsm:153:26
[task 2018-03-08T16:02:09.754Z] 16:02:09     INFO -  checkMigratorPicking/<@/builds/worker/workspace/build/tests/xpcshell/tests/browser/components/migration/tests/unit/test_automigration.js:64:29
[task 2018-03-08T16:02:09.755Z] 16:02:09     INFO -  promiseThrows@/builds/worker/workspace/build/tests/xpcshell/tests/browser/components/migration/tests/unit/test_automigration.js:49:11
[task 2018-03-08T16:02:09.756Z] 16:02:09     INFO -  checkMigratorPicking@/builds/worker/workspace/build/tests/xpcshell/tests/browser/components/migration/tests/unit/test_automigration.js:64:9
[task 2018-03-08T16:02:09.757Z] 16:02:09     INFO -  _do_main@/builds/worker/workspace/build/tests/xpcshell/head.js:221:3
[task 2018-03-08T16:02:09.758Z] 16:02:09     INFO -  _execute_test@/builds/worker/workspace/build/tests/xpcshell/head.js:533:5
[task 2018-03-08T16:02:09.758Z] 16:02:09     INFO -  @-e:1:1
[task 2018-03-08T16:02:09.759Z] 16:02:09     INFO -  "
[task 2018-03-08T16:02:09.760Z] 16:02:09     INFO -  browser/components/migration/tests/unit/test_automigration.js | Starting checkProfilePicking
[task 2018-03-08T16:02:09.761Z] 16:02:09     INFO -  (xpcshell/head.js) | test checkProfilePicking pending (2)
[task 2018-03-08T16:02:09.762Z] 16:02:09     INFO -  (xpcshell/head.js) | test run_next_test 1 finished (2)
Comment hidden (Intermittent Failures Robot)
Comment hidden (Intermittent Failures Robot)

Comment 3

a year ago
This crash has

[task 2018-03-08T16:02:17.454Z] 16:02:17     INFO -   3  libxul.so!mozilla::detail::RunnableFunction<BackgroundSdrEncryptStrings(const nsTArray<nsTString<char> >&, RefPtr<mozilla::dom::Promise>&)::<lambda()> >::~RunnableFunction [nsThreadUtils.h:9b028aef17cb86ef1ca4a291ed33c92634f7c190 : 524 + 0x5]

in the stack.

Doug, I don't suppose your experience with OMT login stuff could help in determining what is calling into this stuff and/or why there's a thread mismatch of some sort?
Flags: needinfo?(dothayer)
Assignee

Comment 4

a year ago
Hmm, from the stack it looks like we're releasing this[1] nsIRunnable, which holds a reference to the moved promise. This shouldn't normally be a problem, as the background thread should win the race to release its reference to the runnable, meaning it doesn't actually run the destructor. But if it loses the race, it runs the destructor and tries to release the reference to the promise, which fails because the current thread needs to own the promise to do ref counting on it.

So I think line 60 just needs to read:
  NS_DispatchToMainThread(runnable.forget());

Does that seem reasonable, Gijs? If so I can throw up a patch.

[1] https://searchfox.org/mozilla-central/rev/877c99c523a054419ec964d4dfb3f0cadac9d497/security/manager/ssl/SecretDecoderRing.cpp#51
Flags: needinfo?(dothayer)

Comment 5

a year ago
(In reply to Doug Thayer [:dthayer] from comment #4)
> Hmm, from the stack it looks like we're releasing this[1] nsIRunnable, which
> holds a reference to the moved promise. This shouldn't normally be a
> problem, as the background thread should win the race to release its
> reference to the runnable, meaning it doesn't actually run the destructor.
> But if it loses the race, it runs the destructor and tries to release the
> reference to the promise, which fails because the current thread needs to
> own the promise to do ref counting on it.
> 
> So I think line 60 just needs to read:
>   NS_DispatchToMainThread(runnable.forget());
> 
> Does that seem reasonable, Gijs? If so I can throw up a patch.

It does not seem *un*reasonable, but I am not an expert in async multithreaded C++ code and lifetimes. I looked at https://dxr.mozilla.org/mozilla-central/search?q=ns_dispatchtomainthread&=mozilla-central and there are a more than a couple other places that use this pattern - but even more that don't seem to do that. I don't know if we should update more places, or change some signature or whatever, that would make this type of mistake (if that really is what it is) harder to make?


:keeler or :froydnj might be better people to give you accurate feedback and/or a patch review here.

I'm on the fence about moving this bug - it's not really a migration bug per se, even if the test is what triggers it. Tentatively moving somewhere to security-land, feel free to move this bug.
Component: Migration → Security: PSM
Flags: needinfo?(dothayer)
Product: Firefox → Core
Comment hidden (mozreview-request)
Assignee

Updated

a year ago
Flags: needinfo?(dothayer)
Assignee

Updated

a year ago
Assignee: nobody → dothayer
Status: NEW → ASSIGNED
Whiteboard: [fxperf:p1]

Comment 7

a year ago
mozreview-review
Comment on attachment 8961454 [details]
Bug 1444290 - Forget bg SDR runnable reference when dispatching

https://reviewboard.mozilla.org/r/230210/#review236080

The usual way this was solved in the past (especially when we didn't have an `already_AddRefed` overload for dispatching events to threads) was to have the `Run()` method explicitly drop the reference to the held refcounted thing (promise, in this case), using a local `RefPtr` that was `.forget()`'d into, or `Move()`'d into nowadays.  That way you're guaranteed to be releasing the reference on the right thread, regardless of whether the dispatching thread or the target thread wins the race to actually destroy the runnable.

I agree that the change proposed here ensures that only one thread has access to the runnable, and therefore the runnable (and its contents) are guaranteed to be destroyed on the proper thread.  It feels a little like spooky action-at-a-distance, but I think that's just my personal hangup.  r=me, and thanks for looking into this!
Attachment #8961454 - Flags: review?(nfroyd) → review+

Comment 8

a year ago
Pushed by dothayer@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/7566bce984f9
Forget bg SDR runnable reference when dispatching r=froydnj

Comment 9

a year ago
bugherder
https://hg.mozilla.org/mozilla-central/rev/7566bce984f9
Status: ASSIGNED → RESOLVED
Last Resolved: a year ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla61
Please request Beta approval on this.
Flags: needinfo?(dothayer)
Keywords: crash
Assignee

Comment 11

a year ago
Comment on attachment 8961454 [details]
Bug 1444290 - Forget bg SDR runnable reference when dispatching

Approval Request Comment
[Feature/Bug causing the regression]: Bug 1426721
[User impact if declined]: Infrequent crashes when importing logins from Chrome.
[Is this code covered by automated tests?]: Yes.
[Has the fix been verified in Nightly?]: No.
[Needs manual test from QE? If yes, steps to reproduce]: No.
[List of other uplifts needed for the feature/fix]: None.
[Is the change risky?]: No.
[Why is the change risky/not risky?]: This change simply removes a redundant AddRef / Release pair on an object which in the wrong circumstances could lead to an object being released on the wrong thread.
[String changes made/needed]: None.
Flags: needinfo?(dothayer)
Attachment #8961454 - Flags: approval-mozilla-beta?
Comment on attachment 8961454 [details]
Bug 1444290 - Forget bg SDR runnable reference when dispatching

crash fix, should be in 60.0b8
Attachment #8961454 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
Comment hidden (Intermittent Failures Robot)
Comment hidden (Intermittent Failures Robot)
You need to log in before you can comment on or make changes to this bug.