Closed
Bug 1076597
Opened 10 years ago
Closed 10 years ago
Race condition in settings api during app shutdown
Categories
(Core :: DOM: Device Interfaces, defect)
Tracking
()
Tracking | Status | |
---|---|---|
firefox33 | --- | wontfix |
firefox34 | --- | wontfix |
firefox35 | --- | fixed |
b2g-v2.0 | --- | unaffected |
b2g-v2.0M | --- | unaffected |
b2g-v2.1 | --- | fixed |
b2g-v2.2 | --- | fixed |
People
(Reporter: qdot, Assigned: gerard-majax)
References
Details
(Whiteboard: [systemsfe])
Attachments
(2 files, 15 obsolete files)
9.78 KB,
patch
|
gerard-majax
:
review+
bajaj
:
approval-mozilla-b2g34+
|
Details | Diff | Splinter Review |
46.75 KB,
text/plain
|
Details |
When an app shuts down, the SettingsManager will try to close all of its locks during dom-window-destroyed, by sending the Finalize message for each lock to the SettingsRequestManager. However, at the time of child message manager death, the SettingsRequestManager receives a child-process-shutdown event through the observer, and calls removeMessageManager, which removes all tracked locks and tasks for that message manager. This means that if we have things that shut down quickly while also running settings calls (like, say, the callscreen in bug 1070066), settings calls can be made that will never actually commit to the database, finish, and be broadcast to observers. The fix here is an interesting question. I /think/ the right solution is that instead of just killing locks/tasks in removeMessageManager, we somehow make sure that finalize is going to be called on the remaining locks for the message manager. The app itself won't get anything back, but we'll still let the rest of the system update.
Reporter | ||
Comment 1•10 years ago
|
||
Here's my cleaned up WIP. Honestly, this could probably land this way. This changes what happens when we remove a message manager due to child process death. It used to be that we would just clear out all of the tasks, and transactions would die unfinished. Now, we make sure that all locks have a finalize task. If they don't, we queue one for them. This means that all locks for the dead message manager will still be in the queue, but should finish their transaction, which means observers will be called. Note that I have NOT run mochis on this yet, just ran it on the phone, and it seems ok there. Doing a desktop build now, will post mochi results when that's done.
Attachment #8498652 -
Flags: feedback?(lissyx+mozillians)
Reporter | ||
Comment 2•10 years ago
|
||
Patch clears mochis fine.
Assignee | ||
Comment 3•10 years ago
|
||
Comment on attachment 8498652 [details] [diff] [review] Patch 1 (v1) - WIP: Always finalize open locks on child process death Review of attachment 8498652 [details] [diff] [review]: ----------------------------------------------------------------- I have been looking at this since yesterday, and it all makes a lot of sense. I have been testing this on my Xperia ZR and now on my Flame. As far as I can say, for the ZR at least, I could not reproduce anymore most of the settings-related bugs once applied. I would like, however, to be able to write some tests to make this nicer.
Attachment #8498652 -
Flags: feedback?(lissyx+mozillians) → feedback+
Assignee | ||
Comment 4•10 years ago
|
||
[Blocking Requested - why for this release]: because triggered by STRs of bug 1070066 which is 2.1+
Assignee: nobody → lissyx+mozillians
blocking-b2g: --- → 2.1?
Updated•10 years ago
|
Whiteboard: [systemsfe]
Assignee | ||
Comment 5•10 years ago
|
||
Assignee | ||
Comment 6•10 years ago
|
||
Assignee | ||
Comment 7•10 years ago
|
||
Assignee | ||
Comment 8•10 years ago
|
||
Assignee | ||
Updated•10 years ago
|
Attachment #8500406 -
Attachment is obsolete: true
Assignee | ||
Updated•10 years ago
|
Attachment #8500408 -
Attachment is obsolete: true
Assignee | ||
Comment 9•10 years ago
|
||
There is few changes on the fix itself, mostly cosmetic. Tests are getting into shape, and now all bits should be in place to implement the core logic exposing the race condition there.
Updated•10 years ago
|
Assignee | ||
Comment 11•10 years ago
|
||
Try with tests and fix: https://treeherder.mozilla.org/ui/#/jobs?repo=try&revision=a9bf91bc0468 Try with only tests: https://treeherder.mozilla.org/ui/#/jobs?repo=try&revision=1a879a93cd4f
Assignee | ||
Comment 12•10 years ago
|
||
(In reply to Alexandre LISSY :gerard-majax from comment #11) > Try with tests and fix: > https://treeherder.mozilla.org/ui/#/jobs?repo=try&revision=a9bf91bc0468 > Try with only tests: > https://treeherder.mozilla.org/ui/#/jobs?repo=try&revision=1a879a93cd4f And those are invalid because one need to add the xpcshell.ini to http://dxr.mozilla.org/mozilla-central/source/testing/xpcshell/xpcshell_b2g.ini :(
Assignee | ||
Comment 13•10 years ago
|
||
New set of try: - fix and tests: https://treeherder.mozilla.org/ui/#/jobs?repo=try&revision=2552a4baec12 - tests only: https://treeherder.mozilla.org/ui/#/jobs?repo=try&revision=e569aefef30e
Assignee | ||
Updated•10 years ago
|
Attachment #8498652 -
Attachment is obsolete: true
Assignee | ||
Updated•10 years ago
|
Attachment #8500480 -
Attachment is obsolete: true
Assignee | ||
Updated•10 years ago
|
Attachment #8500481 -
Attachment is obsolete: true
Assignee | ||
Comment 14•10 years ago
|
||
Assignee | ||
Comment 15•10 years ago
|
||
Assignee | ||
Comment 16•10 years ago
|
||
With those two patches, we expect to have: - try green with the fix - try red (timeout) without
Assignee | ||
Comment 17•10 years ago
|
||
When child message manager dies, it sends a child-process-shutdown message to the SettingsRequestManager. This would just close the locks and tasks of this message manager. The race happens when some applications can shutdown quickly: settings requests will never be committed to the database. One example is the callscreen. The fix, provided by qDot, simply put those tasks in a finalize state to make sure they are properly executed and committed.
Assignee | ||
Comment 18•10 years ago
|
||
Comment on attachment 8501072 [details] [diff] [review] Fix Settings API shutdown race condition r=bent Ben, according to git log you have been reviewing a lot of patches on dom/settings. Since qDot is away, I'm asking you for review on this. This patch is the same as the two others one, it's just a squashed as one commit and with debug disabled. If someone else is more suited to review this, feel free to transfer :)
Attachment #8501072 -
Flags: review?(bent.mozilla)
Updated•10 years ago
|
Target Milestone: --- → 2.1 S6 (10oct)
Assignee | ||
Comment 19•10 years ago
|
||
When child message manager dies, it sends a child-process-shutdown message to the SettingsRequestManager. This would just close the locks and tasks of this message manager. The race happens when some applications can shutdown quickly: settings requests will never be committed to the database. One example is the callscreen. The fix, provided by qDot, simply put those tasks in a finalize state to make sure they are properly executed and committed.
Assignee | ||
Updated•10 years ago
|
Attachment #8501072 -
Attachment is obsolete: true
Attachment #8501072 -
Flags: review?(bent.mozilla)
Assignee | ||
Comment 20•10 years ago
|
||
Comment on attachment 8501192 [details] [diff] [review] Fix Settings API shutdown race condition r=bent Changed the name of the setting used in the test since the previous one exists and made the first test to fail (since it expects it does not exists)
Attachment #8501192 -
Flags: review?(bent.mozilla)
Assignee | ||
Comment 21•10 years ago
|
||
Re-triggered try because of failure. Try with tests: https://treeherder.mozilla.org/ui/#/jobs?repo=try&revision=2577906394be Try with tests and fix: https://treeherder.mozilla.org/ui/#/jobs?repo=try&revision=1d6aa022af4a
Comment on attachment 8501192 [details] [diff] [review] Fix Settings API shutdown race condition r=bent Review of attachment 8501192 [details] [diff] [review]: ----------------------------------------------------------------- ::: dom/settings/SettingsRequestManager.jsm @@ +751,5 @@ > let closedLockIDs = []; > let lockIDs = Object.keys(this.lockInfo); > for (let i in lockIDs) { > + let lock = this.lockInfo[lockIDs[i]]; > + if (lock._mm == aMsgMgr) { Can you also verify that the principal is the same? @@ +753,5 @@ > for (let i in lockIDs) { > + let lock = this.lockInfo[lockIDs[i]]; > + if (lock._mm == aMsgMgr) { > + let is_finalizing = false; > + for (let task_index in lock.tasks) { Can you iterate backwards? Finalize should be at the end.
Attachment #8501192 -
Flags: review?(bent.mozilla) → review+
Assignee | ||
Comment 23•10 years ago
|
||
(In reply to Alexandre LISSY :gerard-majax from comment #21) > Re-triggered try because of failure. > Try with tests: > https://treeherder.mozilla.org/ui/#/jobs?repo=try&revision=2577906394be > Try with tests and fix: > https://treeherder.mozilla.org/ui/#/jobs?repo=try&revision=1d6aa022af4a Tests worked as expected, but we have an issue on https://treeherder.mozilla.org/ui/#/jobs?repo=try&revision=1d6aa022af4a (which should be green): it's broken on emulator debug, with an IndexedDB error :(
Assignee | ||
Comment 24•10 years ago
|
||
New set of try with the addressed review comments: https://treeherder.mozilla.org/ui/#/jobs?repo=try&revision=0afae351c3de https://treeherder.mozilla.org/ui/#/jobs?repo=try&revision=49b4f89be8b5
Assignee | ||
Comment 25•10 years ago
|
||
When child message manager dies, it sends a child-process-shutdown message to the SettingsRequestManager. This would just close the locks and tasks of this message manager. The race happens when some applications can shutdown quickly: settings requests will never be committed to the database. One example is the callscreen. The fix, provided by qDot, simply put those tasks in a finalize state to make sure they are properly executed and committed.
Assignee | ||
Comment 26•10 years ago
|
||
Comment on attachment 8501537 [details] [diff] [review] Fix Settings API shutdown race condition r=bent Carrying r+ from attachment 8501192 [details] [diff] [review] Adding principal check and backward iteration on tasks.
Attachment #8501537 -
Flags: review+
Assignee | ||
Updated•10 years ago
|
Attachment #8501192 -
Attachment is obsolete: true
Assignee | ||
Comment 27•10 years ago
|
||
The error seems to be 100% reproductible on b2g emulator debug. Looking at full log, I see:
> 00:41:00 INFO - 10-08 06:06:50.791 I/Gecko ( 1567): -*- SettingsRequestManager: Making READWRITE transaction for 435d2192-4f21-48d4-90b7-285f147a56be
> 00:41:00 INFO - 10-08 06:06:50.841 I/Gecko ( 1567): -*- SettingsRequestManager: Making get request for xpcshell.language.current
> 00:41:00 INFO - 10-08 06:06:50.981 I/Gecko ( 1567): [1567] WARNING: 'NS_FAILED(rv)', file ../../../gecko/dom/indexedDB/ActorsParent.cpp, line 6582
> 00:41:00 INFO - 10-08 06:06:51.001 I/Gecko ( 1567): [1567] WARNING: 'NS_FAILED(rv)', file ../../../gecko/dom/indexedDB/ActorsParent.cpp, line 12877
> 00:41:00 INFO - 10-08 06:06:51.011 I/Gecko ( 1567): [1567] WARNING: Converting non-IndexedDB error code (0x80630001) to NS_ERROR_DOM_INDEXEDDB_UNKNOWN_ERR: file ../../../gecko/dom/indexedDB/ActorsParent.cpp, line 408
> 00:41:00 WARNING - 10-08 06:06:51.021 W/GeckoConsole( 1567): [JavaScript Error: "IndexedDB UnknownErr: ActorsParent.cpp:414"]
> 00:41:00 INFO - 10-08 06:06:51.101 I/Gecko ( 1567): -*- SettingsRequestManager: Transaction for lock 435d2192-4f21-48d4-90b7-285f147a56be aborted
> 00:41:00 INFO - 10-08 06:06:51.141 I/Gecko ( 1567): -*- SettingsRequestManager: getRequest FAILED xpcshell.language.current
Doing a local run against b2g desktop in debug mode, I don't reproduce :(
Assignee | ||
Comment 28•10 years ago
|
||
When child message manager dies, it sends a child-process-shutdown message to the SettingsRequestManager. This would just close the locks and tasks of this message manager. The race happens when some applications can shutdown quickly: settings requests will never be committed to the database. One example is the callscreen. The fix, provided by qDot, simply put those tasks in a finalize state to make sure they are properly executed and committed.
Assignee | ||
Comment 29•10 years ago
|
||
Comment on attachment 8501613 [details] [diff] [review] Fix Settings API shutdown race condition r=bent Carrying r+ from attachment 8501192 [details] [diff] [review] Fixing some js warning/errors triggered in debug build. Also changed the Settings:Get, hoping this will make emulator debug happy. Try: https://treeherder.mozilla.org/ui/#/jobs?repo=try&revision=98ad1b0fb7e5
Attachment #8501613 -
Flags: review+
Assignee | ||
Comment 30•10 years ago
|
||
(In reply to Alexandre LISSY :gerard-majax from comment #24) > New set of try with the addressed review comments: > https://treeherder.mozilla.org/ui/#/jobs?repo=try&revision=0afae351c3de > https://treeherder.mozilla.org/ui/#/jobs?repo=try&revision=49b4f89be8b5 Apart from the emulator-debug failure still present, those try are ok: red without the fix, green with.
Assignee | ||
Comment 31•10 years ago
|
||
When child message manager dies, it sends a child-process-shutdown message to the SettingsRequestManager. This would just close the locks and tasks of this message manager. The race happens when some applications can shutdown quickly: settings requests will never be committed to the database. One example is the callscreen. The fix, provided by qDot, simply put those tasks in a finalize state to make sure they are properly executed and committed.
Assignee | ||
Comment 32•10 years ago
|
||
(In reply to Alexandre LISSY :gerard-majax from comment #31) > Created attachment 8501718 [details] [diff] [review] > Fix Settings API shutdown race condition r=bent > > When child message manager dies, it sends a child-process-shutdown > message to the SettingsRequestManager. This would just close the locks > and tasks of this message manager. The race happens when some > applications can shutdown quickly: settings requests will never be > committed to the database. One example is the callscreen. The fix, > provided by qDot, simply put those tasks in a finalize state to make > sure they are properly executed and committed. Changed the lock id to use "{uuid}" instead of "uuid". New try: https://treeherder.mozilla.org/ui/#/jobs?repo=try&revision=58f9248a4c99 Locally, this is passing in my emulator debug :(
Assignee | ||
Comment 33•10 years ago
|
||
Comment on attachment 8501718 [details] [diff] [review] Fix Settings API shutdown race condition r=bent Carrying r+
Attachment #8501718 -
Flags: review+
Assignee | ||
Updated•10 years ago
|
Attachment #8501537 -
Attachment is obsolete: true
Assignee | ||
Updated•10 years ago
|
Attachment #8501613 -
Attachment is obsolete: true
Assignee | ||
Updated•10 years ago
|
Attachment #8501718 -
Attachment is obsolete: true
Assignee | ||
Comment 34•10 years ago
|
||
Try at: https://treeherder.mozilla.org/ui/#/jobs?repo=try&revision=98ad1b0fb7e5 is green for opt, with some known unrelated intermittents.
Assignee | ||
Comment 35•10 years ago
|
||
Try: https://treeherder.mozilla.org/ui/#/jobs?repo=try&revision=58f9248a4c99 This one just fixes the lock id, adding curly bracsed around.
Assignee | ||
Comment 36•10 years ago
|
||
Try: https://treeherder.mozilla.org/ui/#/jobs?repo=try&revision=c12c7577df9c This one just removes debug, in case it could explain the failures
Assignee | ||
Comment 37•10 years ago
|
||
Try: https://treeherder.mozilla.org/ui/#/jobs?repo=try&revision=839f49403404 This one blacklist the added test for gonk debug, to avoid red on b2g emulator debug builds
Assignee | ||
Comment 38•10 years ago
|
||
(In reply to Alexandre LISSY :gerard-majax from comment #27) > The error seems to be 100% reproductible on b2g emulator debug. Looking at > full log, I see: > > 00:41:00 INFO - 10-08 06:06:50.791 I/Gecko ( 1567): -*- SettingsRequestManager: Making READWRITE transaction for 435d2192-4f21-48d4-90b7-285f147a56be > > 00:41:00 INFO - 10-08 06:06:50.841 I/Gecko ( 1567): -*- SettingsRequestManager: Making get request for xpcshell.language.current > > 00:41:00 INFO - 10-08 06:06:50.981 I/Gecko ( 1567): [1567] WARNING: 'NS_FAILED(rv)', file ../../../gecko/dom/indexedDB/ActorsParent.cpp, line 6582 > > 00:41:00 INFO - 10-08 06:06:51.001 I/Gecko ( 1567): [1567] WARNING: 'NS_FAILED(rv)', file ../../../gecko/dom/indexedDB/ActorsParent.cpp, line 12877 > > 00:41:00 INFO - 10-08 06:06:51.011 I/Gecko ( 1567): [1567] WARNING: Converting non-IndexedDB error code (0x80630001) to NS_ERROR_DOM_INDEXEDDB_UNKNOWN_ERR: file ../../../gecko/dom/indexedDB/ActorsParent.cpp, line 408 > > 00:41:00 WARNING - 10-08 06:06:51.021 W/GeckoConsole( 1567): [JavaScript Error: "IndexedDB UnknownErr: ActorsParent.cpp:414"] > > 00:41:00 INFO - 10-08 06:06:51.101 I/Gecko ( 1567): -*- SettingsRequestManager: Transaction for lock 435d2192-4f21-48d4-90b7-285f147a56be aborted > > 00:41:00 INFO - 10-08 06:06:51.141 I/Gecko ( 1567): -*- SettingsRequestManager: getRequest FAILED xpcshell.language.current > > Doing a local run against b2g desktop in debug mode, I don't reproduce :( Ben, I could not even reproduce locally on a debug b2g in emulator. See above, I've sent a couple of try with slight changes. My plan is to land the fix with the test running on b2g emulator opt, and file a followup for this issue.
Flags: needinfo?(bent.mozilla)
The error code here says that the database is locked by another process... I don't know how that could happen since we should only touch the file from the parent process.
Flags: needinfo?(bent.mozilla)
Assignee | ||
Comment 40•10 years ago
|
||
(In reply to Alexandre LISSY :gerard-majax from comment #37) > Try: https://treeherder.mozilla.org/ui/#/jobs?repo=try&revision=839f49403404 > This one blacklist the added test for gonk debug, to avoid red on b2g > emulator debug builds > > Try: https://treeherder.mozilla.org/ui/#/jobs?repo=try&revision=c12c7577df9c > This one just removes debug, in case it could explain the failures Both are green, so it would have been fixed by removing debug \o/ I retriggered 10 jobs for each xpcshell tests in each build, just to make sure. But if we can land this without the disabling for b2g emulator debug, it's a good news :)
Assignee | ||
Comment 41•10 years ago
|
||
Try with skip: https://treeherder.mozilla.org/ui/#/jobs?repo=try&revision=a92a8ee50547 Try without skip: https://treeherder.mozilla.org/ui/#/jobs?repo=try&revision=5c80e298220a From the second link, it seems that removing the SettingsRequestManager debug improved the situation but it is still failing often enough for this not to be reliable for sherrifs. So I think the safe way is to disable this test for b2g emulator debug builds.
Assignee | ||
Updated•10 years ago
|
Attachment #8501037 -
Attachment is obsolete: true
Assignee | ||
Updated•10 years ago
|
Attachment #8501038 -
Attachment is obsolete: true
Assignee | ||
Comment 42•10 years ago
|
||
When child message manager dies, it sends a child-process-shutdown message to the SettingsRequestManager. This would just close the locks and tasks of this message manager. The race happens when some applications can shutdown quickly: settings requests will never be committed to the database. One example is the callscreen. The fix, provided by qDot, simply put those tasks in a finalize state to make sure they are properly executed and committed.
Assignee | ||
Comment 43•10 years ago
|
||
Comment on attachment 8502305 [details] [diff] [review] Fix Settings API shutdown race condition r=bent Carrying r+ from :bent
Attachment #8502305 -
Flags: review+
Assignee | ||
Comment 44•10 years ago
|
||
Try is green for b2g emulator opt, after retriggers: https://treeherder.mozilla.org/ui/#/jobs?repo=try&revision=a92a8ee50547 The only failure are known intermittents.
Assignee | ||
Updated•10 years ago
|
Keywords: checkin-needed
Comment 45•10 years ago
|
||
https://hg.mozilla.org/integration/mozilla-inbound/rev/318da8a0ab90
Keywords: checkin-needed
Assignee | ||
Comment 46•10 years ago
|
||
When child message manager dies, it sends a child-process-shutdown message to the SettingsRequestManager. This would just close the locks and tasks of this message manager. The race happens when some applications can shutdown quickly: settings requests will never be committed to the database. One example is the callscreen. The fix, provided by qDot, simply put those tasks in a finalize state to make sure they are properly executed and committed.
Assignee | ||
Comment 47•10 years ago
|
||
Comment on attachment 8502305 [details] [diff] [review] Fix Settings API shutdown race condition r=bent obsolete by backout
Attachment #8502305 -
Attachment is obsolete: true
Assignee | ||
Updated•10 years ago
|
Attachment #8502421 -
Attachment is obsolete: true
Assignee | ||
Comment 48•10 years ago
|
||
When child message manager dies, it sends a child-process-shutdown message to the SettingsRequestManager. This would just close the locks and tasks of this message manager. The race happens when some applications can shutdown quickly: settings requests will never be committed to the database. One example is the callscreen. The fix, provided by qDot, simply put those tasks in a finalize state to make sure they are properly executed and committed.
Assignee | ||
Comment 49•10 years ago
|
||
Comment on attachment 8502423 [details] [diff] [review] Fix Settings API shutdown race condition r=bent Carrying r+ from :bent Disabling the test on non b2g builds and on emulator/debug.
Attachment #8502423 -
Flags: review+
Assignee | ||
Comment 50•10 years ago
|
||
Finger crossed, but this one should be green ...
Keywords: checkin-needed
Comment 51•10 years ago
|
||
had to backout again for https://treeherder.mozilla.org/ui/logviewer.html#?job_id=2886050&repo=mozilla-inbound maybe we need to run try again ?
Keywords: checkin-needed
Assignee | ||
Comment 52•10 years ago
|
||
When child message manager dies, it sends a child-process-shutdown message to the SettingsRequestManager. This would just close the locks and tasks of this message manager. The race happens when some applications can shutdown quickly: settings requests will never be committed to the database. One example is the callscreen. The fix, provided by qDot, simply put those tasks in a finalize state to make sure they are properly executed and committed.
Assignee | ||
Updated•10 years ago
|
Attachment #8502423 -
Attachment is obsolete: true
Assignee | ||
Comment 53•10 years ago
|
||
Comment on attachment 8502447 [details] [diff] [review] Fix Settings API shutdown race condition r=bent Carrying r+, fixing typo in skip-if
Attachment #8502447 -
Flags: review+
Assignee | ||
Updated•10 years ago
|
Keywords: checkin-needed
Comment 54•10 years ago
|
||
https://hg.mozilla.org/integration/b2g-inbound/rev/69866036d793
Keywords: checkin-needed
https://hg.mozilla.org/mozilla-central/rev/69866036d793
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Comment 57•10 years ago
|
||
Please request b2g34 approval on this patch when you get a chance.
status-firefox33:
--- → wontfix
status-firefox34:
--- → wontfix
status-firefox35:
--- → fixed
Flags: needinfo?(lissyx+mozillians)
Assignee | ||
Comment 59•10 years ago
|
||
Comment on attachment 8502447 [details] [diff] [review] Fix Settings API shutdown race condition r=bent NOTE: Please see https://wiki.mozilla.org/Release_Management/B2G_Landing to better understand the B2G approval process and landings. [Approval Request Comment] Bug caused by (feature/regressing bug #): Settings OOP User impact if declined: improper api settings behavior on quickly shutting down applications Testing completed: adding xpcshell tests to ensure the race is covered, merged in master since a couple of days, multiple QA (:jlorenzo) checking of the patch to assert it was fixing the depends Risk to taking this patch (and alternatives if risky): low String or UUID changes made by this patch: none
Attachment #8502447 -
Flags: approval-mozilla-b2g34?
Updated•10 years ago
|
Attachment #8502447 -
Flags: approval-mozilla-b2g34? → approval-mozilla-b2g34+
Comment 61•10 years ago
|
||
OK, so this land on b-i and around the same time, Mnw goes basically permafail (with errors that I swear I remember qDot hitting at one point too). Now it gets uplifted to b2g34 and also permafail. Coincidence? :) https://treeherder.mozilla.org/ui/logviewer.html#?job_id=3660&repo=mozilla-b2g34_v2_1 I'm backing this out from b2g34 for now to see if Mnw indeed greens up. If so, I'm going to recommend doing the same on trunk as well. https://hg.mozilla.org/releases/mozilla-b2g34_v2_1/rev/76b4a13e4fc7
Comment 62•10 years ago
|
||
Yep, backing it out got us back from permafail to our more usual just fails most of the time state. I'd recommend backing it out from m-c as well, but seeing as how Mnw isn't a sheriffed test suite at this point, I'll defer to the bug owners here to decide.
Comment 63•10 years ago
|
||
Well, it sort of fixed the perma-fail. I did indeed make the cascade of failures from comment 61 go away. However, test_dom_BluetoothManager_enabled.js is still perma-failing afterwards.
Assignee | ||
Comment 64•10 years ago
|
||
(In reply to Ryan VanderMeulen [:RyanVM UTC-4] from comment #63) > Well, it sort of fixed the perma-fail. I did indeed make the cascade of > failures from comment 61 go away. However, > test_dom_BluetoothManager_enabled.js is still perma-failing afterwards. So test_dom_BluetoothManager_enabled.js is perma-fail without my patch, how can we be sure that it's my patch that is causing this mess then?
Assignee | ||
Comment 65•10 years ago
|
||
Mach does not work. This will be hard to work on: alex@portable-alex:~/codaz/Mozilla/b2g/devices/Emulator/B2G$ ./mach help marionette-webapi usage: mach [global arguments] marionette-webapi [command arguments] Run a Marionette webapi test (test WebAPIs using marionette). Global Arguments: -v, --verbose Print verbose output. -l FILENAME, --log-file FILENAME Filename to write log data to. --log-interval Prefix log line with interval from last message rather than relative time. Note that this is NOT execution time if there are parallel operations. --log-no-times Do not prefix log lines with times. By default, mach will prefix each output line with the time since command start. -h, --help Show this help message. Command Arguments: --type TESTTYPE Test type, usually one of: browser, b2g, b2g-qemu. TESTS Path to test(s) to run. alex@portable-alex:~/codaz/Mozilla/b2g/devices/Emulator/B2G$ ./mach --log-file bluetooth.log --log-interval marionette-webapi dom/bluetooth/tests/marionette/test_dom_BluetoothManager_enabled.js Usage: mach [options] mach: error: no such option: --log-file alex@portable-alex:~/codaz/Mozilla/b2g/devices/Emulator/B2G$ git pull Already up-to-date. alex@portable-alex:~/codaz/Mozilla/b2g/devices/Emulator/B2G$ ./mach -l bluetooth.log --log-interval marionette-webapi dom/bluetooth/tests/marionette/test_dom_BluetoothManager_enabled.js Usage: mach [options] mach: error: no such option: -l alex@portable-alex:~/codaz/Mozilla/b2g/devices/Emulator/B2G$ ./mach -l bluetooth.log marionette-webapi dom/bluetooth/tests/marionette/test_dom_BluetoothManager_enabled.js Usage: mach [options] mach: error: no such option: -l
Assignee | ||
Comment 66•10 years ago
|
||
My patch landed in: - https://tbpl.mozilla.org/?rev=c0a98d9b9a33 with green Mnw - https://tbpl.mozilla.org/?rev=95d1486223f7 with green Mnw - https://tbpl.mozilla.org/?rev=b91b22431613 is the first failure. - https://tbpl.mozilla.org/?rev=50b689feab5f is then back to green Mnw - https://tbpl.mozilla.org/?rev=097821fd89ed green Mnw - https://tbpl.mozilla.org/?rev=708b45d9b1b1 orange Mnw but not similar failure Starting with https://tbpl.mozilla.org/?rev=999ad2f1e108 I do see a perma red Mnw matching pattern.
Assignee | ||
Comment 67•10 years ago
|
||
I ran the marionette-webapi test locally a dozen of times against a B2G Emulator Debug build, could not reproduce any problem ...
Assignee | ||
Comment 68•10 years ago
|
||
(In reply to Alexandre LISSY :gerard-majax from comment #67) > I ran the marionette-webapi test locally a dozen of times against a B2G > Emulator Debug build, could not reproduce any problem ... No luck with opt build locally when running: |./mach marionette-webapi gecko/dom/bluetooth/tests/marionette/|
Assignee | ||
Comment 69•10 years ago
|
||
Retriggered a set of try jobs for master (where it has not been reverted): https://tbpl.mozilla.org/?tree=Try&rev=66345e60b0e2
Assignee | ||
Comment 70•10 years ago
|
||
(In reply to Alexandre LISSY :gerard-majax from comment #69) > Retriggered a set of try jobs for master (where it has not been reverted): > https://tbpl.mozilla.org/?tree=Try&rev=66345e60b0e2 And there are errors on master, but intermittent.
Flags: needinfo?(ryanvm)
Comment 71•10 years ago
|
||
Yes, Mnw is known to be incredibly flaky. Always has been. Not sure what specifically you're expecting from me about it.
Flags: needinfo?(ryanvm)
Assignee | ||
Comment 72•10 years ago
|
||
Failed run: https://tbpl.mozilla.org/php/getParsedLog.php?id=50360599&tree=Try&full=1 Successful run: https://tbpl.mozilla.org/php/getParsedLog.php?id=50378525&tree=Try&full=1 In the later case, we can see that we have the logcat of the emulator, and all the debug that has been switched. We don't have any logcat at all in the first case, so I cannot even know if emulator properly started or not ...
Comment 73•10 years ago
|
||
We're seeing increasing failures on Mnw (Bug 1081529), and that busted build problem is becoming more and more frequent. The problem is we can't get the logcat or any other build artifact b/c they don't get uploaded if buildbot kills the mozharness script before it can upload the artifacts, so we have no visibility into the core problem. I'm requesting this feature Bug 1069641, but in the mean time, I'll make time next week to try to isolate the cause of the busted builds.
Comment 74•10 years ago
|
||
Bug for busted builds: https://bugzilla.mozilla.org/show_bug.cgi?id=1084415
Comment 75•10 years ago
|
||
Alexandre just pointed out to me that the patch that landed on m-c was the wrong one. I'm not sure how that happened (qimportbz should have Just Worked here AFAICT), but I've gone ahead and backed out and re-landed at least. https://hg.mozilla.org/integration/b2g-inbound/rev/0197ff8217ec https://hg.mozilla.org/integration/b2g-inbound/rev/cf4d42ad0275
Comment 76•10 years ago
|
||
https://hg.mozilla.org/mozilla-central/rev/cf4d42ad0275
Flags: in-testsuite+
Comment 78•10 years ago
|
||
By running the test with some additional parameters in mozharness, I have the logcat, but I don't see anything suspicious, especially not with marionette (we would have seen marionette error logs or something suspicious with the xpcom interfaces it uses). You can see line 280 corresponds with https://mxr.mozilla.org/mozilla-central/source/dom/bluetooth/tests/marionette/head.js#551, so it got that far in execution, but https://mxr.mozilla.org/mozilla-central/source/dom/bluetooth/tests/marionette/test_dom_BluetoothManager_enabled.js#30 is never logged, so I suspect that a script you've been using is now timing out in your test's head.js or something before it can get there.
Comment 79•10 years ago
|
||
fyi, if you want to see the full test run: https://tbpl.mozilla.org/php/getParsedLog.php?id=51269174&tree=Ash&full=1#error1
Comment 80•10 years ago
|
||
interestingly, the subsequent tests are failing because a setup script in marionette's test harness is timing out... I wonder if the emulator is just incredibly slow.
Assignee | ||
Comment 81•10 years ago
|
||
Thanks! I see something suspicious: https://tbpl.mozilla.org/php/getParsedLog.php?id=51269174&tree=Ash&full=1#error0
Assignee | ||
Comment 82•10 years ago
|
||
Also:
> 17:13:46 INFO - 10-29 00:12:42.145 W/(Preallocated app)( 271): [JavaScript Error: "TypeError: infos is null" {file: "chrome://global/content/BrowserElementChild.js" line: 57}]
> 17:13:46 INFO - 10-29 00:12:45.085 W/GeckoConsole( 44): [JavaScript Error: "TypeError: actor.target is undefined" {file: "resource://gre/modules/InterAppCommService.jsm" line: 810}]
Assignee | ||
Comment 83•10 years ago
|
||
Do you think you could capture a logcat after enabling all debug in dom/settings ? Just switching the |DEBUG| to true in all js/jsm files in this directory, maybe it could help :)
Flags: needinfo?(mdas)
Comment 84•10 years ago
|
||
Unfortunately, I can't yet. That logcat output was a test run of a mozharness script. You can't modify any other code when testing out mozharness changes, so I can't change the test/gecko/anything. In the meantime, I filed https://bugzilla.mozilla.org/show_bug.cgi?id=1090978 to get logcat output interleaved in the stdout. When that lands, you'll be able to see logcat output in MnW, so you can try anything out and see what happens.
Flags: needinfo?(mdas)
Assignee | ||
Comment 85•10 years ago
|
||
Ok. Well the log already show us that bluetooth test starts and that after this, we properly get the permissions. The very next step here is to read bluetooth.enabled setting, so we cannot go further until we can trigger debugging of this code :)
Assignee | ||
Comment 86•10 years ago
|
||
Landing of bug 1092941 fixed Mnw.
Comment 87•10 years ago
|
||
*fingers crossed* https://hg.mozilla.org/releases/mozilla-b2g34_v2_1/rev/90f687bae552
You need to log in
before you can comment on or make changes to this bug.
Description
•