Closed Bug 1076597 Opened 10 years ago Closed 10 years ago

Race condition in settings api during app shutdown

Categories

(Core :: DOM: Device Interfaces, defect)

25 Branch
x86_64
Linux
defect
Not set
normal

Tracking

()

RESOLVED FIXED
2.1 S6 (10oct)
blocking-b2g 2.1+
Tracking Status
firefox33 --- wontfix
firefox34 --- wontfix
firefox35 --- fixed
b2g-v2.0 --- unaffected
b2g-v2.0M --- unaffected
b2g-v2.1 --- fixed
b2g-v2.2 --- fixed

People

(Reporter: qdot, Assigned: gerard-majax)

References

Details

(Whiteboard: [systemsfe])

Attachments

(2 files, 15 obsolete files)

9.78 KB, patch
gerard-majax
: review+
Details | Diff | Splinter Review
46.75 KB, text/plain
Details
When an app shuts down, the SettingsManager will try to close all of its locks during dom-window-destroyed, by sending the Finalize message for each lock to the SettingsRequestManager. However, at the time of child message manager death, the SettingsRequestManager receives a child-process-shutdown event through the observer, and calls removeMessageManager, which removes all tracked locks and tasks for that message manager. This means that if we have things that shut down quickly while also running settings calls (like, say, the callscreen in bug 1070066), settings calls can be made that will never actually commit to the database, finish, and be broadcast to observers.

The fix here is an interesting question. I /think/ the right solution is that instead of just killing locks/tasks in removeMessageManager, we somehow make sure that finalize is going to be called on the remaining locks for the message manager. The app itself won't get anything back, but we'll still let the rest of the system update.
Here's my cleaned up WIP. Honestly, this could probably land this way.

This changes what happens when we remove a message manager due to child process death. It used to be that we would just clear out all of the tasks, and transactions would die unfinished. Now, we make sure that all locks have a finalize task. If they don't, we queue one for them. This means that all locks for the dead message manager will still be in the queue, but should finish their transaction, which means observers will be called.

Note that I have NOT run mochis on this yet, just ran it on the phone, and it seems ok there. Doing a desktop build now, will post mochi results when that's done.
Attachment #8498652 - Flags: feedback?(lissyx+mozillians)
Patch clears mochis fine.
Comment on attachment 8498652 [details] [diff] [review]
Patch 1 (v1) - WIP: Always finalize open locks on child process death

Review of attachment 8498652 [details] [diff] [review]:
-----------------------------------------------------------------

I have been looking at this since yesterday, and it all makes a lot of sense. I have been testing this on my Xperia ZR and now on my Flame. As far as I can say, for the ZR at least, I could not reproduce anymore most of the settings-related bugs once applied.

I would like, however, to be able to write some tests to make this nicer.
Attachment #8498652 - Flags: feedback?(lissyx+mozillians) → feedback+
[Blocking Requested - why for this release]: because triggered by STRs of bug 1070066 which is 2.1+
Assignee: nobody → lissyx+mozillians
blocking-b2g: --- → 2.1?
Whiteboard: [systemsfe]
Blocks: 1074379
Blocks: 1070066
Attachment #8500406 - Attachment is obsolete: true
Attachment #8500408 - Attachment is obsolete: true
There is few changes on the fix itself, mostly cosmetic. Tests are getting into shape, and now all bits should be in place to implement the core logic exposing the race condition there.
Regression
blocking-b2g: 2.1? → 2.1+
No longer blocks: 1070066
See Also: → 1070066
(In reply to Alexandre LISSY :gerard-majax from comment #11)
> Try with tests and fix:
> https://treeherder.mozilla.org/ui/#/jobs?repo=try&revision=a9bf91bc0468
> Try with only tests:
> https://treeherder.mozilla.org/ui/#/jobs?repo=try&revision=1a879a93cd4f

And those are invalid because one need to add the xpcshell.ini to http://dxr.mozilla.org/mozilla-central/source/testing/xpcshell/xpcshell_b2g.ini :(
Attachment #8498652 - Attachment is obsolete: true
Attachment #8500480 - Attachment is obsolete: true
Attachment #8500481 - Attachment is obsolete: true
With those two patches, we expect to have:
 - try green with the fix
 - try red (timeout) without
When child message manager dies, it sends a child-process-shutdown
message to the SettingsRequestManager. This would just close the locks
and tasks of this message manager. The race happens when some
applications can shutdown quickly: settings requests will never be
committed to the database. One example is the callscreen. The fix,
provided by qDot, simply put those tasks in a finalize state to make
sure they are properly executed and committed.
Comment on attachment 8501072 [details] [diff] [review]
Fix Settings API shutdown race condition r=bent

Ben, according to git log you have been reviewing a lot of patches on dom/settings. Since qDot is away, I'm asking you for review on this. This patch is the same as the two others one, it's just a squashed as one commit and with debug disabled.

If someone else is more suited to review this, feel free to transfer :)
Attachment #8501072 - Flags: review?(bent.mozilla)
Target Milestone: --- → 2.1 S6 (10oct)
When child message manager dies, it sends a child-process-shutdown
message to the SettingsRequestManager. This would just close the locks
and tasks of this message manager. The race happens when some
applications can shutdown quickly: settings requests will never be
committed to the database. One example is the callscreen. The fix,
provided by qDot, simply put those tasks in a finalize state to make
sure they are properly executed and committed.
Attachment #8501072 - Attachment is obsolete: true
Attachment #8501072 - Flags: review?(bent.mozilla)
Comment on attachment 8501192 [details] [diff] [review]
Fix Settings API shutdown race condition r=bent

Changed the name of the setting used in the test since the previous one exists and made the first test to fail (since it expects it does not exists)
Attachment #8501192 - Flags: review?(bent.mozilla)
Comment on attachment 8501192 [details] [diff] [review]
Fix Settings API shutdown race condition r=bent

Review of attachment 8501192 [details] [diff] [review]:
-----------------------------------------------------------------

::: dom/settings/SettingsRequestManager.jsm
@@ +751,5 @@
>      let closedLockIDs = [];
>      let lockIDs = Object.keys(this.lockInfo);
>      for (let i in lockIDs) {
> +      let lock = this.lockInfo[lockIDs[i]];
> +      if (lock._mm == aMsgMgr) {

Can you also verify that the principal is the same?

@@ +753,5 @@
>      for (let i in lockIDs) {
> +      let lock = this.lockInfo[lockIDs[i]];
> +      if (lock._mm == aMsgMgr) {
> +        let is_finalizing = false;
> +        for (let task_index in lock.tasks) {

Can you iterate backwards? Finalize should be at the end.
Attachment #8501192 - Flags: review?(bent.mozilla) → review+
(In reply to Alexandre LISSY :gerard-majax from comment #21)
> Re-triggered try because of failure.
> Try with tests:
> https://treeherder.mozilla.org/ui/#/jobs?repo=try&revision=2577906394be
> Try with tests and fix:
> https://treeherder.mozilla.org/ui/#/jobs?repo=try&revision=1d6aa022af4a

Tests worked as expected, but we have an issue on https://treeherder.mozilla.org/ui/#/jobs?repo=try&revision=1d6aa022af4a (which should be green): it's broken on emulator debug, with an IndexedDB error :(
When child message manager dies, it sends a child-process-shutdown
message to the SettingsRequestManager. This would just close the locks
and tasks of this message manager. The race happens when some
applications can shutdown quickly: settings requests will never be
committed to the database. One example is the callscreen. The fix,
provided by qDot, simply put those tasks in a finalize state to make
sure they are properly executed and committed.
Comment on attachment 8501537 [details] [diff] [review]
Fix Settings API shutdown race condition r=bent

Carrying r+ from attachment 8501192 [details] [diff] [review]
Adding principal check and backward iteration on tasks.
Attachment #8501537 - Flags: review+
Attachment #8501192 - Attachment is obsolete: true
The error seems to be 100% reproductible on b2g emulator debug. Looking at full log, I see:
> 00:41:00     INFO -  10-08 06:06:50.791 I/Gecko   ( 1567): -*- SettingsRequestManager: Making READWRITE transaction for 435d2192-4f21-48d4-90b7-285f147a56be
> 00:41:00     INFO -  10-08 06:06:50.841 I/Gecko   ( 1567): -*- SettingsRequestManager: Making get request for xpcshell.language.current
> 00:41:00     INFO -  10-08 06:06:50.981 I/Gecko   ( 1567): [1567] WARNING: 'NS_FAILED(rv)', file ../../../gecko/dom/indexedDB/ActorsParent.cpp, line 6582
> 00:41:00     INFO -  10-08 06:06:51.001 I/Gecko   ( 1567): [1567] WARNING: 'NS_FAILED(rv)', file ../../../gecko/dom/indexedDB/ActorsParent.cpp, line 12877
> 00:41:00     INFO -  10-08 06:06:51.011 I/Gecko   ( 1567): [1567] WARNING: Converting non-IndexedDB error code (0x80630001) to NS_ERROR_DOM_INDEXEDDB_UNKNOWN_ERR: file ../../../gecko/dom/indexedDB/ActorsParent.cpp, line 408
> 00:41:00  WARNING -  10-08 06:06:51.021 W/GeckoConsole( 1567): [JavaScript Error: "IndexedDB UnknownErr: ActorsParent.cpp:414"]
> 00:41:00     INFO -  10-08 06:06:51.101 I/Gecko   ( 1567): -*- SettingsRequestManager: Transaction for lock 435d2192-4f21-48d4-90b7-285f147a56be aborted
> 00:41:00     INFO -  10-08 06:06:51.141 I/Gecko   ( 1567): -*- SettingsRequestManager: getRequest FAILED xpcshell.language.current

Doing a local run against b2g desktop in debug mode, I don't reproduce :(
When child message manager dies, it sends a child-process-shutdown
message to the SettingsRequestManager. This would just close the locks
and tasks of this message manager. The race happens when some
applications can shutdown quickly: settings requests will never be
committed to the database. One example is the callscreen. The fix,
provided by qDot, simply put those tasks in a finalize state to make
sure they are properly executed and committed.
Comment on attachment 8501613 [details] [diff] [review]
Fix Settings API shutdown race condition r=bent

Carrying r+ from attachment 8501192 [details] [diff] [review]
Fixing some js warning/errors triggered in debug build. Also changed the Settings:Get, hoping this will make emulator debug happy.
Try: https://treeherder.mozilla.org/ui/#/jobs?repo=try&revision=98ad1b0fb7e5
Attachment #8501613 - Flags: review+
(In reply to Alexandre LISSY :gerard-majax from comment #24)
> New set of try with the addressed review comments:
>  https://treeherder.mozilla.org/ui/#/jobs?repo=try&revision=0afae351c3de
>  https://treeherder.mozilla.org/ui/#/jobs?repo=try&revision=49b4f89be8b5

Apart from the emulator-debug failure still present, those try are ok: red without the fix, green with.
When child message manager dies, it sends a child-process-shutdown
message to the SettingsRequestManager. This would just close the locks
and tasks of this message manager. The race happens when some
applications can shutdown quickly: settings requests will never be
committed to the database. One example is the callscreen. The fix,
provided by qDot, simply put those tasks in a finalize state to make
sure they are properly executed and committed.
(In reply to Alexandre LISSY :gerard-majax from comment #31)
> Created attachment 8501718 [details] [diff] [review]
> Fix Settings API shutdown race condition r=bent
> 
> When child message manager dies, it sends a child-process-shutdown
> message to the SettingsRequestManager. This would just close the locks
> and tasks of this message manager. The race happens when some
> applications can shutdown quickly: settings requests will never be
> committed to the database. One example is the callscreen. The fix,
> provided by qDot, simply put those tasks in a finalize state to make
> sure they are properly executed and committed.

Changed the lock id to use "{uuid}" instead of "uuid".

New try: https://treeherder.mozilla.org/ui/#/jobs?repo=try&revision=58f9248a4c99

Locally, this is passing in my emulator debug :(
Comment on attachment 8501718 [details] [diff] [review]
Fix Settings API shutdown race condition r=bent

Carrying r+
Attachment #8501718 - Flags: review+
Attachment #8501537 - Attachment is obsolete: true
Attachment #8501613 - Attachment is obsolete: true
Attachment #8501718 - Attachment is obsolete: true
Try at: https://treeherder.mozilla.org/ui/#/jobs?repo=try&revision=98ad1b0fb7e5 is green for opt, with some known unrelated intermittents.
Try: https://treeherder.mozilla.org/ui/#/jobs?repo=try&revision=58f9248a4c99
This one just fixes the lock id, adding curly bracsed around.
Try: https://treeherder.mozilla.org/ui/#/jobs?repo=try&revision=c12c7577df9c
This one just removes debug, in case it could explain the failures
Try: https://treeherder.mozilla.org/ui/#/jobs?repo=try&revision=839f49403404
This one blacklist the added test for gonk debug, to avoid red on b2g emulator debug builds
(In reply to Alexandre LISSY :gerard-majax from comment #27)
> The error seems to be 100% reproductible on b2g emulator debug. Looking at
> full log, I see:
> > 00:41:00     INFO -  10-08 06:06:50.791 I/Gecko   ( 1567): -*- SettingsRequestManager: Making READWRITE transaction for 435d2192-4f21-48d4-90b7-285f147a56be
> > 00:41:00     INFO -  10-08 06:06:50.841 I/Gecko   ( 1567): -*- SettingsRequestManager: Making get request for xpcshell.language.current
> > 00:41:00     INFO -  10-08 06:06:50.981 I/Gecko   ( 1567): [1567] WARNING: 'NS_FAILED(rv)', file ../../../gecko/dom/indexedDB/ActorsParent.cpp, line 6582
> > 00:41:00     INFO -  10-08 06:06:51.001 I/Gecko   ( 1567): [1567] WARNING: 'NS_FAILED(rv)', file ../../../gecko/dom/indexedDB/ActorsParent.cpp, line 12877
> > 00:41:00     INFO -  10-08 06:06:51.011 I/Gecko   ( 1567): [1567] WARNING: Converting non-IndexedDB error code (0x80630001) to NS_ERROR_DOM_INDEXEDDB_UNKNOWN_ERR: file ../../../gecko/dom/indexedDB/ActorsParent.cpp, line 408
> > 00:41:00  WARNING -  10-08 06:06:51.021 W/GeckoConsole( 1567): [JavaScript Error: "IndexedDB UnknownErr: ActorsParent.cpp:414"]
> > 00:41:00     INFO -  10-08 06:06:51.101 I/Gecko   ( 1567): -*- SettingsRequestManager: Transaction for lock 435d2192-4f21-48d4-90b7-285f147a56be aborted
> > 00:41:00     INFO -  10-08 06:06:51.141 I/Gecko   ( 1567): -*- SettingsRequestManager: getRequest FAILED xpcshell.language.current
> 
> Doing a local run against b2g desktop in debug mode, I don't reproduce :(

Ben, I could not even reproduce locally on a debug b2g in emulator. See above, I've sent a couple of try with slight changes. My plan is to land the fix with the test running on b2g emulator opt, and file a followup for this issue.
Flags: needinfo?(bent.mozilla)
The error code here says that the database is locked by another process... I don't know how that could happen since we should only touch the file from the parent process.
Flags: needinfo?(bent.mozilla)
(In reply to Alexandre LISSY :gerard-majax from comment #37)
> Try: https://treeherder.mozilla.org/ui/#/jobs?repo=try&revision=839f49403404
> This one blacklist the added test for gonk debug, to avoid red on b2g
> emulator debug builds
>
> Try: https://treeherder.mozilla.org/ui/#/jobs?repo=try&revision=c12c7577df9c
> This one just removes debug, in case it could explain the failures

Both are green, so it would have been fixed by removing debug \o/

I retriggered 10 jobs for each xpcshell tests in each build, just to make sure. But if we can land this without the disabling for b2g emulator debug, it's a good news :)
Try with skip: https://treeherder.mozilla.org/ui/#/jobs?repo=try&revision=a92a8ee50547
Try without skip: https://treeherder.mozilla.org/ui/#/jobs?repo=try&revision=5c80e298220a

From the second link, it seems that removing the SettingsRequestManager debug improved the situation but it is still failing often enough for this not to be reliable for sherrifs. So I think the safe way is to disable this test for b2g emulator debug builds.
Attachment #8501037 - Attachment is obsolete: true
Attachment #8501038 - Attachment is obsolete: true
When child message manager dies, it sends a child-process-shutdown
message to the SettingsRequestManager. This would just close the locks
and tasks of this message manager. The race happens when some
applications can shutdown quickly: settings requests will never be
committed to the database. One example is the callscreen. The fix,
provided by qDot, simply put those tasks in a finalize state to make
sure they are properly executed and committed.
Comment on attachment 8502305 [details] [diff] [review]
Fix Settings API shutdown race condition r=bent

Carrying r+ from :bent
Attachment #8502305 - Flags: review+
Try is green for b2g emulator opt, after retriggers: https://treeherder.mozilla.org/ui/#/jobs?repo=try&revision=a92a8ee50547

The only failure are known intermittents.
Keywords: checkin-needed
When child message manager dies, it sends a child-process-shutdown
message to the SettingsRequestManager. This would just close the locks
and tasks of this message manager. The race happens when some
applications can shutdown quickly: settings requests will never be
committed to the database. One example is the callscreen. The fix,
provided by qDot, simply put those tasks in a finalize state to make
sure they are properly executed and committed.
Comment on attachment 8502305 [details] [diff] [review]
Fix Settings API shutdown race condition r=bent

obsolete by backout
Attachment #8502305 - Attachment is obsolete: true
Attachment #8502421 - Attachment is obsolete: true
When child message manager dies, it sends a child-process-shutdown
message to the SettingsRequestManager. This would just close the locks
and tasks of this message manager. The race happens when some
applications can shutdown quickly: settings requests will never be
committed to the database. One example is the callscreen. The fix,
provided by qDot, simply put those tasks in a finalize state to make
sure they are properly executed and committed.
Comment on attachment 8502423 [details] [diff] [review]
Fix Settings API shutdown race condition r=bent

Carrying r+ from :bent

Disabling the test on non b2g builds and on emulator/debug.
Attachment #8502423 - Flags: review+
Finger crossed, but this one should be green ...
Keywords: checkin-needed
When child message manager dies, it sends a child-process-shutdown
message to the SettingsRequestManager. This would just close the locks
and tasks of this message manager. The race happens when some
applications can shutdown quickly: settings requests will never be
committed to the database. One example is the callscreen. The fix,
provided by qDot, simply put those tasks in a finalize state to make
sure they are properly executed and committed.
Attachment #8502423 - Attachment is obsolete: true
Comment on attachment 8502447 [details] [diff] [review]
Fix Settings API shutdown race condition r=bent

Carrying r+, fixing typo in skip-if
Attachment #8502447 - Flags: review+
Keywords: checkin-needed
Please request b2g34 approval on this patch when you get a chance.
Flags: needinfo?(lissyx+mozillians)
Sure.
Flags: needinfo?(lissyx+mozillians)
Comment on attachment 8502447 [details] [diff] [review]
Fix Settings API shutdown race condition r=bent

NOTE: Please see https://wiki.mozilla.org/Release_Management/B2G_Landing to better understand the B2G approval process and landings.

[Approval Request Comment]
Bug caused by (feature/regressing bug #): Settings OOP
User impact if declined: improper api settings behavior on quickly shutting down applications
Testing completed: adding xpcshell tests to ensure the race is covered, merged in master since a couple of days, multiple QA (:jlorenzo) checking of the patch to assert it was fixing the depends
Risk to taking this patch (and alternatives if risky): low
String or UUID changes made by this patch: none
Attachment #8502447 - Flags: approval-mozilla-b2g34?
Attachment #8502447 - Flags: approval-mozilla-b2g34? → approval-mozilla-b2g34+
OK, so this land on b-i and around the same time, Mnw goes basically permafail (with errors that I swear I remember qDot hitting at one point too). Now it gets uplifted to b2g34 and also permafail. Coincidence? :)
https://treeherder.mozilla.org/ui/logviewer.html#?job_id=3660&repo=mozilla-b2g34_v2_1

I'm backing this out from b2g34 for now to see if Mnw indeed greens up. If so, I'm going to recommend doing the same on trunk as well.
https://hg.mozilla.org/releases/mozilla-b2g34_v2_1/rev/76b4a13e4fc7
Yep, backing it out got us back from permafail to our more usual just fails most of the time state. I'd recommend backing it out from m-c as well, but seeing as how Mnw isn't a sheriffed test suite at this point, I'll defer to the bug owners here to decide.
Well, it sort of fixed the perma-fail. I did indeed make the cascade of failures from comment 61 go away. However, test_dom_BluetoothManager_enabled.js is still perma-failing afterwards.
(In reply to Ryan VanderMeulen [:RyanVM UTC-4] from comment #63)
> Well, it sort of fixed the perma-fail. I did indeed make the cascade of
> failures from comment 61 go away. However,
> test_dom_BluetoothManager_enabled.js is still perma-failing afterwards.

So test_dom_BluetoothManager_enabled.js is perma-fail without my patch, how can we be sure that it's my patch that is causing this mess then?
Mach does not work. This will be hard to work on:

alex@portable-alex:~/codaz/Mozilla/b2g/devices/Emulator/B2G$ ./mach help marionette-webapi
usage: mach [global arguments] marionette-webapi [command arguments]

Run a Marionette webapi test (test WebAPIs using marionette).

Global Arguments:
  -v, --verbose         Print verbose output.
  -l FILENAME, --log-file FILENAME
                        Filename to write log data to.
  --log-interval        Prefix log line with interval from last message rather
                        than relative time. Note that this is NOT execution
                        time if there are parallel operations.
  --log-no-times        Do not prefix log lines with times. By default, mach
                        will prefix each output line with the time since
                        command start.
  -h, --help            Show this help message.

Command Arguments:
  --type TESTTYPE  Test type, usually one of: browser, b2g, b2g-qemu.
  TESTS            Path to test(s) to run.
alex@portable-alex:~/codaz/Mozilla/b2g/devices/Emulator/B2G$ ./mach --log-file bluetooth.log --log-interval marionette-webapi dom/bluetooth/tests/marionette/test_dom_BluetoothManager_enabled.js
Usage: mach [options]

mach: error: no such option: --log-file
alex@portable-alex:~/codaz/Mozilla/b2g/devices/Emulator/B2G$ git pull 
Already up-to-date.
alex@portable-alex:~/codaz/Mozilla/b2g/devices/Emulator/B2G$ ./mach -l bluetooth.log --log-interval marionette-webapi dom/bluetooth/tests/marionette/test_dom_BluetoothManager_enabled.js
Usage: mach [options]

mach: error: no such option: -l
alex@portable-alex:~/codaz/Mozilla/b2g/devices/Emulator/B2G$ ./mach -l bluetooth.log marionette-webapi dom/bluetooth/tests/marionette/test_dom_BluetoothManager_enabled.js
Usage: mach [options]

mach: error: no such option: -l
Depends on: 1083685
My patch landed in:
 - https://tbpl.mozilla.org/?rev=c0a98d9b9a33 with green Mnw
 - https://tbpl.mozilla.org/?rev=95d1486223f7 with green Mnw
 - https://tbpl.mozilla.org/?rev=b91b22431613 is the first failure.
 - https://tbpl.mozilla.org/?rev=50b689feab5f is then back to green Mnw
 - https://tbpl.mozilla.org/?rev=097821fd89ed green Mnw
 - https://tbpl.mozilla.org/?rev=708b45d9b1b1 orange Mnw but not similar failure
Starting with https://tbpl.mozilla.org/?rev=999ad2f1e108 I do see a perma red Mnw matching pattern.
I ran the marionette-webapi test locally a dozen of times against a B2G Emulator Debug build, could not reproduce any problem ...
(In reply to Alexandre LISSY :gerard-majax from comment #67)
> I ran the marionette-webapi test locally a dozen of times against a B2G
> Emulator Debug build, could not reproduce any problem ...

No luck with opt build locally when running: |./mach marionette-webapi gecko/dom/bluetooth/tests/marionette/|
Retriggered a set of try jobs for master (where it has not been reverted): https://tbpl.mozilla.org/?tree=Try&rev=66345e60b0e2
(In reply to Alexandre LISSY :gerard-majax from comment #69)
> Retriggered a set of try jobs for master (where it has not been reverted):
> https://tbpl.mozilla.org/?tree=Try&rev=66345e60b0e2

And there are errors on master, but intermittent.
Flags: needinfo?(ryanvm)
Yes, Mnw is known to be incredibly flaky. Always has been. Not sure what specifically you're expecting from me about it.
Flags: needinfo?(ryanvm)
Failed run: https://tbpl.mozilla.org/php/getParsedLog.php?id=50360599&tree=Try&full=1
Successful run: https://tbpl.mozilla.org/php/getParsedLog.php?id=50378525&tree=Try&full=1

In the later case, we can see that we have the logcat of the emulator, and all the debug that has been switched. We don't have any logcat at all in the first case, so I cannot even know if emulator properly started or not ...
We're seeing increasing failures on Mnw (Bug 1081529), and that busted build problem is becoming more and more frequent. The problem is we can't get the logcat or any other build artifact b/c they don't get uploaded if buildbot kills the mozharness script before it can upload the artifacts, so we have no visibility into the core problem. I'm requesting this feature Bug 1069641, but in the mean time, I'll make time next week to try to isolate the cause of the busted builds.
Alexandre just pointed out to me that the patch that landed on m-c was the wrong one. I'm not sure how that happened (qimportbz should have Just Worked here AFAICT), but I've gone ahead and backed out and re-landed at least.
https://hg.mozilla.org/integration/b2g-inbound/rev/0197ff8217ec
https://hg.mozilla.org/integration/b2g-inbound/rev/cf4d42ad0275
Attached file logcat
By running the test with some additional parameters in mozharness, I have the logcat, but I don't see anything suspicious, especially not with marionette (we would have seen marionette error logs or something suspicious with the xpcom interfaces it uses).

You can see line 280 corresponds with https://mxr.mozilla.org/mozilla-central/source/dom/bluetooth/tests/marionette/head.js#551, so it got that far in execution, but https://mxr.mozilla.org/mozilla-central/source/dom/bluetooth/tests/marionette/test_dom_BluetoothManager_enabled.js#30 is never logged, so I suspect that a script you've been using is now timing out in your test's head.js or something before it can get there.
interestingly, the subsequent tests are failing because a setup script in marionette's test harness is timing out... I wonder if the emulator is just incredibly slow.
Also:
> 17:13:46     INFO -  10-29 00:12:42.145 W/(Preallocated app)(  271): [JavaScript Error: "TypeError: infos is null" {file: "chrome://global/content/BrowserElementChild.js" line: 57}]
> 17:13:46     INFO -  10-29 00:12:45.085 W/GeckoConsole(   44): [JavaScript Error: "TypeError: actor.target is undefined" {file: "resource://gre/modules/InterAppCommService.jsm" line: 810}]
Do you think you could capture a logcat after enabling all debug in dom/settings ?

Just switching the |DEBUG| to true in all js/jsm files in this directory, maybe it could help :)
Flags: needinfo?(mdas)
Unfortunately, I can't yet. That logcat output was a test run of a mozharness script. You can't modify any other code when testing out mozharness changes, so I can't change the test/gecko/anything. In the meantime, I filed https://bugzilla.mozilla.org/show_bug.cgi?id=1090978 to get logcat output interleaved in the stdout. When that lands, you'll be able to see logcat output in MnW, so you can try anything out and see what happens.
Flags: needinfo?(mdas)
Ok. Well the log already show us that bluetooth test starts and that after this, we properly get the permissions. The very next step here is to read bluetooth.enabled setting, so we cannot go further until we can trigger debugging of this code :)
Landing of bug 1092941 fixed Mnw.
Depends on: 1092941
You need to log in before you can comment on or make changes to this bug.