Closed
Bug 1082001
Opened 11 years ago
Closed 11 years ago
Mulet doesn't start correctly
Categories
(Firefox OS Graveyard :: Runtime, defect)
Tracking
(firefox34 wontfix, firefox35 wontfix, firefox36 fixed, b2g-v2.1 fixed, b2g-v2.2 fixed)
RESOLVED
FIXED
2.1 S8 (7Nov)
People
(Reporter: ochameau, Assigned: gerard-majax)
References
Details
(Whiteboard: [systemsfe])
Attachments
(4 files, 8 obsolete files)
3.15 KB,
text/plain
|
Details | |
931.90 KB,
text/x-log
|
Details | |
217.92 KB,
text/x-log
|
Details | |
16.39 KB,
patch
|
gerard-majax
:
review+
fabrice
:
approval-mozilla-b2g34+
|
Details | Diff | Splinter Review |
When opening it, shell.html is loaded, but gaia doesn't start and the page stays black.
I narrowed it down to an issue in Settings API, where a dom-window-destroyed event is dispatched during shell.html loading...
We end up destroying the mozSettings API over here:
http://mxr.mozilla.org/mozilla-central/source/dom/settings/SettingsManager.js#386
That explains why gaia doesn't start.
But I can't get why we receive such dom-window-destroyed. It is being dispatched from these two callsites:
http://mxr.mozilla.org/mozilla-central/source/dom/base/nsGlobalWindow.cpp#1562
http://mxr.mozilla.org/mozilla-central/source/dom/base/nsGlobalWindow.cpp#2868
Both seems to be very late during window destruction.
This is really weird as the window works correctly, whereas the window object should be mostly non-working by the time we receive this event.
Assignee | ||
Comment 1•11 years ago
|
||
Looks like you have been able to find more info than me on this issue ...
Reporter | ||
Comment 2•11 years ago
|
||
After some more debugging, it is being called from
http://mxr.mozilla.org/mozilla-central/source/dom/base/nsGlobalWindow.cpp#2177
WindowStateHolder::~WindowStateHolder()
nsGlobalWindow::FreeInnerObjects()
--> notify dom-window-destroyed
--> break mozSettings (and many other APIs!)
Given that most of this codepath is very old and that it used to work... it must be something at a higher level.
I would say it can be something in session restore or in the way we load the homepage.
Reporter | ||
Comment 3•11 years ago
|
||
So the way we load the homepage is just about calling loadURI on <browser> during startup,
I haven't seen anything really special.
Reporter | ||
Comment 4•11 years ago
|
||
Ok so it regressed since bug 1065128 attachment 8488141 [details] [diff] [review] landed.
I'm not sure listening to dom-window-destroyed was the right fix.
Listening to inner-window-destroyed from an nsIDOMGlobalPropertyInitializer looks like a common pattern...
Depends on: 1065128
Assignee | ||
Updated•11 years ago
|
Assignee: nobody → lissyx+mozillians
Assignee | ||
Comment 5•11 years ago
|
||
There's a lot of settings requests that fails.
Assignee | ||
Comment 6•11 years ago
|
||
Even if I revert bug 1065128, it goes further but I'm still stuck on the boot logo :(
Flags: needinfo?(poirot.alex)
Assignee | ||
Comment 7•11 years ago
|
||
(In reply to Alexandre LISSY :gerard-majax from comment #6)
> Even if I revert bug 1065128, it goes further but I'm still stuck on the
> boot logo :(
Nevermind, I failed something, it's okay.
Flags: needinfo?(poirot.alex)
Comment 8•11 years ago
|
||
Listening to inner-window-destroyed is fine. However, bug 1065128 moved certain calls to dom-window-destroyed because both settings and geolocation were calling sendAsyncMessage from inner-window-destroyed, which doesn't work because at that point there's no window.
However, when I fixed this for settings in bug 1065128, I made SettingsManager::cleanup() be called in dom-window-destroyed, which also involves things like unregistering listeners and settings the cached window object to null. What we could do is only make the requires sendAsyncMessage calls in dom-window-destroyed, then do everything else in inner-window-destroyed? Don't know if that'd fix this or not.
Comment 9•11 years ago
|
||
Er, meant comment 8 to be on bug 1065128, so reposted there, and I fixed settings and IDENTITY, not geolocation.
Ok, back to vacation.
Assignee | ||
Comment 11•11 years ago
|
||
log when mulet does not boot with the patch applied
Attachment #8506963 -
Attachment is obsolete: true
Assignee | ||
Comment 12•11 years ago
|
||
> $ grep -n 'SettingsManager init' settings-mulet-*
> settings-mulet-NOK.log:29:-*- SettingsManager: SettingsManager init
> settings-mulet-OK.log:29:-*- SettingsManager: SettingsManager init
> settings-mulet-OK.log:1448:-*- SettingsManager: SettingsManager init
> settings-mulet-OK.log:3676:-*- SettingsManager: SettingsManager init
> settings-mulet-OK.log:11180:-*- SettingsManager: SettingsManager init
> settings-mulet-OK.log:11502:-*- SettingsManager: SettingsManager init
Flags: needinfo?(kyle)
Assignee | ||
Comment 13•11 years ago
|
||
Attachment #8507893 -
Attachment is obsolete: true
Assignee | ||
Comment 14•11 years ago
|
||
Assignee | ||
Comment 15•11 years ago
|
||
After a lot of trying, investigations and discussions, it seems we should just not use dom-window-destroyed as it seems not to be a good idea.
See also but 1016952 comment 11.
Assignee | ||
Comment 16•11 years ago
|
||
With some changes, I have been able to get errors popping when this is failing!
Assignee | ||
Comment 17•11 years ago
|
||
Attachment #8507892 -
Attachment is obsolete: true
Assignee | ||
Comment 18•11 years ago
|
||
Attachment #8507951 -
Attachment is obsolete: true
Assignee | ||
Comment 19•11 years ago
|
||
Attachment #8507952 -
Attachment is obsolete: true
Updated•11 years ago
|
Whiteboard: [systemsfe]
Assignee | ||
Comment 20•11 years ago
|
||
(In reply to Alexandre LISSY :gerard-majax from comment #15)
> After a lot of trying, investigations and discussions, it seems we should
> just not use dom-window-destroyed as it seems not to be a good idea.
>
> See also bug 1016952 comment 11.
Ok, things are getting clearer now.
So we load the system app, and it makes a request to navigator.mozSettings. This triggers a SettingManager.init() call, which gets the window matching the system app.
There's a trick using "about:blank" for resources cleanup, so when we unload this page this is triggering a "dom-window-destroyed" event. Since we just have a reference to the window, and not the inner window, we cannot distinguish whether we have an event for the about:blank inner window or for a legit app getting killed.
So when we cleanup resources upon this dom-window-destroyed event, we indeed put SettingsManager in an unexpected state where it has no more reference to the window (it was destroyed) but it's still partly active.
Assignee | ||
Comment 21•11 years ago
|
||
Kyle, in light of comment 20, I gave a try to reverting the dom-window-destroyed changes only, keeping the fix for the principals. As far as I could test, I could not reproduce using STRs provided in bug 1065128 comment 0.
Flags: needinfo?(kyle)
Comment 22•11 years ago
|
||
Ok, comment 20 makes sense, but I'm not sure what you mean by "keeping the fix for the principals". Can you post a patch of what you've got now?
Also, I was thinking about it yesterday... There's a chance the way I'm doing shutdown in SettingsManager is wrong. The reason we need the cleanup in dom-window-destroyed is so that parent process SettingsManagers would clean themselves up, since SettingsRequestManager won't get a child-process-shutdown message from them. However, since SettingsManager and SettingsRequestManager are in the same process already, SettingsRequestManager should be able to track the window object for SettingsManagers in the parent process, and do the same thing on inner-window-destroyed for these SettingsManager objects as it would on receiving child-process-shutdown for apps in B2G. This would mean we would no longer have to rely on SettingsManager running sendAsyncMessage in shutdown at all, since that seems to be causing a lot of problems.
I don't think it'd be a ton of work to fix this, but I also don't know what, if anything, it would fix. It just seems like our last ditch effort to kick off transaction finalizing from the child process or SettingsManager objects seems like it's causing more pain than it should.
Flags: needinfo?(lissyx+mozillians)
Flags: needinfo?(kyle)
Flags: needinfo?(anygregor)
Assignee | ||
Comment 23•11 years ago
|
||
As I was saying yesterday on IRC, once reverted the dom-window-destroyed changes, I would still run into issues with PRODUCTION=1 Gaia builds.
As far as I can say, it looks like checking out a revision just prior landing of bug 1049439 fixes the issue.
Depends on: 1049439
Flags: needinfo?(lissyx+mozillians)
Assignee | ||
Comment 24•11 years ago
|
||
(In reply to Alexandre LISSY :gerard-majax from comment #23)
> As I was saying yesterday on IRC, once reverted the dom-window-destroyed
> changes, I would still run into issues with PRODUCTION=1 Gaia builds.
>
> As far as I can say, it looks like checking out a revision just prior
> landing of bug 1049439 fixes the issue.
Ok, maybe it's not related to bug 1049439 in fact.
No longer depends on: 1049439
Assignee | ||
Comment 25•11 years ago
|
||
(In reply to Alexandre LISSY :gerard-majax from comment #24)
> (In reply to Alexandre LISSY :gerard-majax from comment #23)
> > As I was saying yesterday on IRC, once reverted the dom-window-destroyed
> > changes, I would still run into issues with PRODUCTION=1 Gaia builds.
> >
> > As far as I can say, it looks like checking out a revision just prior
> > landing of bug 1049439 fixes the issue.
>
> Ok, maybe it's not related to bug 1049439 in fact.
That's indeed because of bug 941969.
Depends on: 941969
Assignee | ||
Comment 26•11 years ago
|
||
(In reply to Kyle Machulis (OUT OCT 1-NOV 30) [:kmachulis] [:qdot] (USE NEEDINFO?) from comment #22)
> Ok, comment 20 makes sense, but I'm not sure what you mean by "keeping the
> fix for the principals". Can you post a patch of what you've got now?
>
> Also, I was thinking about it yesterday... There's a chance the way I'm
> doing shutdown in SettingsManager is wrong. The reason we need the cleanup
> in dom-window-destroyed is so that parent process SettingsManagers would
> clean themselves up, since SettingsRequestManager won't get a
> child-process-shutdown message from them. However, since SettingsManager and
> SettingsRequestManager are in the same process already,
> SettingsRequestManager should be able to track the window object for
> SettingsManagers in the parent process, and do the same thing on
> inner-window-destroyed for these SettingsManager objects as it would on
> receiving child-process-shutdown for apps in B2G. This would mean we would
> no longer have to rely on SettingsManager running sendAsyncMessage in
> shutdown at all, since that seems to be causing a lot of problems.
>
> I don't think it'd be a ton of work to fix this, but I also don't know what,
> if anything, it would fix. It just seems like our last ditch effort to kick
> off transaction finalizing from the child process or SettingsManager objects
> seems like it's causing more pain than it should.
This seems to be doing the trick on my mulet builds, whether with OOP enabled or not.
Here is a first tentative try: https://tbpl.mozilla.org/?tree=Try&rev=d4b2410b0212
I'm checking the behavior on a real device, too, but I have been struggling with bug 1087967 and bug 969215.
Assignee | ||
Comment 27•11 years ago
|
||
So far my testing on Mulet shows that I cannot reproduce the bug documented in bug 1065128 comment 0.
Assignee | ||
Comment 28•11 years ago
|
||
(In reply to Alexandre LISSY :gerard-majax from comment #27)
> So far my testing on Mulet shows that I cannot reproduce the bug documented
> in bug 1065128 comment 0.
Nor on my device.
Assignee | ||
Comment 29•11 years ago
|
||
From bug 1065128 SettingsManager has been changed to listen the
dom-window-destroyed event for its cleanup. However, when running Gaia
in Mulet, a race condition is exposed. For B2G, when loading a page,
about:blank is first used. This means that window destroyed events will
be triggered. However, from the dom-window-destroyed event we cannot
distringuish whether this is about:blank or a legit application being
closed. SettingsManager gets initialized (i.e., init() called) when the
application makes use of navigator.mozSettings. So the chain of event is
that we have a SettingsManager living because System app did some
request. At this time, about:blank is being unloaded and triggers a
dom-window-destroyed event. This makes SettingsManager doing its
cleanup, especially freeing the window reference. Then in the meantime,
we have the navigator.mozSettings use that is progressing. At some
point, SettingsManager has no more window to send messages to, and Gaia
is not able to even start.
SettingsRequestManager lives on the parent process and SettingsManager
lives on the child side. Part of the cleanup performed by
SettingsManager was to ensure pending locks on the parent process would
but forced to finalize to make sure those are being properly committed.
We move this cleanup to SettingsRequestManager and we augment the lock
informations with the proper inner window id. This way we can track
which lock is attached to which inner window when the lock gets created.
And thus we can listen on inner-window-destroyed from
SettingsRequestManager to be able to force finalize on any pending lock.
Assignee | ||
Updated•11 years ago
|
Attachment #8508198 -
Attachment is obsolete: true
Assignee | ||
Comment 30•11 years ago
|
||
Comment on attachment 8510417 [details] [diff] [review]
Cleanup settings lock from parent itself r=bent
Ben since you reviewed the first patch, please have a look at this one.
There's a first green try: https://tbpl.mozilla.org/?tree=Try&rev=d4b2410b0212
And a second one is en-route to check on all other platforms, for the sake of making sure, at https://tbpl.mozilla.org/?tree=Try&rev=35e5d0443b83.
I have been testing this since this morning on my mulet build, not being able to expose any issue. This is also on one of my devices that I'm using and it behaves properly for now.
I'm planning for some more stress testing also.
Attachment #8510417 -
Flags: review?(bent.mozilla)
Assignee | ||
Comment 31•11 years ago
|
||
I have been trying to document and ensure any perf issue, but ran into bug 1088683. Now that I have a workaround for this one, it's Cleopatra that is broken and that fails with:
> Error in worker: Exception: out of memory (undefined:undefined)
Comment on attachment 8510417 [details] [diff] [review]
Cleanup settings lock from parent itself r=bent
Hrm, what guarantees you that the innerWindowID you get in the child (and then send to the parent) matches the innerWindowID you get in the parent? I don't think we synchronize that between processes, do we?
Attachment #8510417 -
Flags: review?(bent.mozilla)
Assignee | ||
Comment 33•11 years ago
|
||
From bug 1065128 SettingsManager has been changed to listen the
dom-window-destroyed event for its cleanup. However, when running Gaia
in Mulet, a race condition is exposed. For B2G, when loading a page,
about:blank is first used. This means that window destroyed events will
be triggered. However, from the dom-window-destroyed event we cannot
distinguish whether this is about:blank or a legit application being
closed. SettingsManager gets initialized (i.e., init() called) when the
application makes use of navigator.mozSettings. So the chain of event is
that we have a SettingsManager living because System app did some
request. At this time, about:blank is being unloaded and triggers a
dom-window-destroyed event. This makes SettingsManager doing its
cleanup, especially freeing the window reference. Then in the meantime,
we have the navigator.mozSettings use that is progressing. At some
point, SettingsManager has no more window to send messages to, and Gaia
is not able to even start.
SettingsRequestManager lives on the parent process and SettingsManager
lives on the child side. Part of the cleanup performed by
SettingsManager was to ensure pending locks on the parent process would
be forced to finalize to make sure those are being properly committed.
We move this cleanup to SettingsRequestManager and we augment the lock
informations with the proper inner window id. This way we can track
which lock is attached to which inner window when the lock gets created.
And thus we can listen on inner-window-destroyed from
SettingsRequestManager to be able to force finalize on any pending lock.
Impacted code path are those were we are not running out of process.
When we are running out of process, SettingsRequestManager already
listens on the child-process-shutdown event to perform the lock
finalization.
Assignee | ||
Comment 34•11 years ago
|
||
(In reply to Alexandre LISSY :gerard-majax from comment #33)
> Created attachment 8512602 [details] [diff] [review]
> Cleanup settings lock from parent itself r=bent
>
> From bug 1065128 SettingsManager has been changed to listen the
> dom-window-destroyed event for its cleanup. However, when running Gaia
> in Mulet, a race condition is exposed. For B2G, when loading a page,
> about:blank is first used. This means that window destroyed events will
> be triggered. However, from the dom-window-destroyed event we cannot
> distinguish whether this is about:blank or a legit application being
> closed. SettingsManager gets initialized (i.e., init() called) when the
> application makes use of navigator.mozSettings. So the chain of event is
> that we have a SettingsManager living because System app did some
> request. At this time, about:blank is being unloaded and triggers a
> dom-window-destroyed event. This makes SettingsManager doing its
> cleanup, especially freeing the window reference. Then in the meantime,
> we have the navigator.mozSettings use that is progressing. At some
> point, SettingsManager has no more window to send messages to, and Gaia
> is not able to even start.
>
> SettingsRequestManager lives on the parent process and SettingsManager
> lives on the child side. Part of the cleanup performed by
> SettingsManager was to ensure pending locks on the parent process would
> be forced to finalize to make sure those are being properly committed.
> We move this cleanup to SettingsRequestManager and we augment the lock
> informations with the proper inner window id. This way we can track
> which lock is attached to which inner window when the lock gets created.
> And thus we can listen on inner-window-destroyed from
> SettingsRequestManager to be able to force finalize on any pending lock.
> Impacted code path are those were we are not running out of process.
> When we are running out of process, SettingsRequestManager already
> listens on the child-process-shutdown event to perform the lock
> finalization.
Refactored a little bit the OOP and non OOP code paths, and renaming method to make it more obvious.
New try: https://tbpl.mozilla.org/?tree=Try&rev=ef3fdb68102e
Assignee | ||
Updated•11 years ago
|
Attachment #8510417 -
Attachment is obsolete: true
Assignee | ||
Comment 35•11 years ago
|
||
(In reply to ben turner [:bent] (use the needinfo? flag!) from comment #32)
> Comment on attachment 8510417 [details] [diff] [review]
> Cleanup settings lock from parent itself r=bent
>
> Hrm, what guarantees you that the innerWindowID you get in the child (and
> then send to the parent) matches the innerWindowID you get in the parent? I
> don't think we synchronize that between processes, do we?
Yes but that's not a problem: the issue we fix here is related to running in process. The case for oop is already handled by the child-process-shutdown code path.
Given your feedback, I think it's worthwhile to make it more obvious in the patch, so I:
- factorized between non OOP and OOP code paths
- renamed methods to make it more clear.
Tested this on Mulet and device (Flame/KK), no issue spotted so far.
Assignee | ||
Updated•11 years ago
|
Attachment #8512602 -
Flags: review?(bent.mozilla)
Comment on attachment 8512602 [details] [diff] [review]
Cleanup settings lock from parent itself r=bent
Review of attachment 8512602 [details] [diff] [review]:
-----------------------------------------------------------------
Ok this makes much more sense. Thanks!
::: dom/settings/SettingsManager.js
@@ +389,5 @@
> observe: function(aSubject, aTopic, aData) {
> + if (DEBUG) debug("Topic: " + aTopic);
> + if (aTopic == "inner-window-destroyed") {
> + let wId = aSubject.QueryInterface(Ci.nsISupportsPRUint64).data;
> + if (wId == this.innerWindowID) {
Nit: ===, then we won't have to worry about null equaling 0.
::: dom/settings/SettingsRequestManager.jsm
@@ +755,5 @@
> }
> },
>
> + hasLockFinalizeTask: function(lock) {
> + let task_index;
Nit: This can go in the for() loop initializer right?
Attachment #8512602 -
Flags: review?(bent.mozilla) → review+
Assignee | ||
Comment 37•11 years ago
|
||
From bug 1065128 SettingsManager has been changed to listen the
dom-window-destroyed event for its cleanup. However, when running Gaia
in Mulet, a race condition is exposed. For B2G, when loading a page,
about:blank is first used. This means that window destroyed events will
be triggered. However, from the dom-window-destroyed event we cannot
distinguish whether this is about:blank or a legit application being
closed. SettingsManager gets initialized (i.e., init() called) when the
application makes use of navigator.mozSettings. So the chain of event is
that we have a SettingsManager living because System app did some
request. At this time, about:blank is being unloaded and triggers a
dom-window-destroyed event. This makes SettingsManager doing its
cleanup, especially freeing the window reference. Then in the meantime,
we have the navigator.mozSettings use that is progressing. At some
point, SettingsManager has no more window to send messages to, and Gaia
is not able to even start.
SettingsRequestManager lives on the parent process and SettingsManager
lives on the child side. Part of the cleanup performed by
SettingsManager was to ensure pending locks on the parent process would
be forced to finalize to make sure those are being properly committed.
We move this cleanup to SettingsRequestManager and we augment the lock
informations with the proper inner window id. This way we can track
which lock is attached to which inner window when the lock gets created.
And thus we can listen on inner-window-destroyed from
SettingsRequestManager to be able to force finalize on any pending lock.
Impacted code path are those were we are not running out of process.
When we are running out of process, SettingsRequestManager already
listens on the child-process-shutdown event to perform the lock
finalization.
Assignee | ||
Updated•11 years ago
|
Attachment #8512602 -
Attachment is obsolete: true
Assignee | ||
Comment 38•11 years ago
|
||
Comment on attachment 8513326 [details] [diff] [review]
Cleanup settings lock from parent itself r=bent
Carrying r+ from :bent.
Addressed nits.
Attachment #8513326 -
Flags: review+
Assignee | ||
Comment 39•11 years ago
|
||
Keywords: checkin-needed
Assignee | ||
Updated•11 years ago
|
Flags: needinfo?(anygregor)
Comment 40•11 years ago
|
||
Keywords: checkin-needed
Comment 41•11 years ago
|
||
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Target Milestone: --- → 2.1 S8 (7Nov)
Assignee | ||
Comment 42•11 years ago
|
||
Comment on attachment 8513326 [details] [diff] [review]
Cleanup settings lock from parent itself r=bent
NOTE: Please see https://wiki.mozilla.org/Release_Management/B2G_Landing to better understand the B2G approval process and landings.
[Approval Request Comment]
Bug caused by (feature/regressing bug #): bug 1065128
User impact if declined: gaia not booting in some cases: mulet, might affect also some devices
Testing completed: living on master since october 29th
Risk to taking this patch (and alternatives if risky): low
String or UUID changes made by this patch: none
Bug 1092941 needs to be uplifted/landed just after this one.
Attachment #8513326 -
Flags: approval-mozilla-b2g34?
Updated•11 years ago
|
Attachment #8513326 -
Flags: approval-mozilla-b2g34? → approval-mozilla-b2g34+
Comment 43•11 years ago
|
||
status-b2g-v2.1:
--- → fixed
status-b2g-v2.2:
--- → fixed
status-firefox34:
--- → wontfix
status-firefox35:
--- → wontfix
status-firefox36:
--- → fixed
You need to log in
before you can comment on or make changes to this bug.
Description
•