950266 - Re-enable the Nuwa process on B2G by default

Reporter

Description

•

12 years ago

Nuwa was enabled in bug 930282, but then disabled in bug 948829 due to problems. This bug is about re-enabling it. Please add any blockers I have missed.

Nicholas Nethercote [inactive]

Reporter

Comment 1

•

12 years ago

Should bug 948021 block this rather than depend on this?

Nicholas Nethercote [inactive]

Reporter

Updated

•

12 years ago

Summary: Re-nable the Nuwa process on B2G by default → Re-enable the Nuwa process on B2G by default

Michael Vines [:m1] [:evilmachines]

Updated

•

12 years ago

blocking-b2g: --- → 1.3?

Preeti Raghunath(:Preeti)

Comment 2

•

12 years ago

Andrew, Please reassign appropriately.

Flags: needinfo?(overholt)

Kyle Huey (Exited; not receiving bugmail, old account, do not use)

Comment 3

•

12 years ago

I'll assign to to Patrick, but it's really just waiting on the blocking bug (which I own).

Assignee: nobody → pwang

Flags: needinfo?(overholt)

Ben Kelly [:bkelly, not reviewing]

Updated

•

12 years ago

Depends on: 948977

Ben Kelly [:bkelly, not reviewing]

Comment 4

•

12 years ago

After bisecting a datazilla regression, it appears NUWA slows down the cold launch time for some of our apps. In particular, clock seem effected. This is all in bug 948977. It would be nice to understand what is going on there before re-enabling NUWA.

Joe Cheng [:jcheng] (please needinfo)

Comment 5

•

12 years ago

triage: nuwa is intended for 1.3. 1.3+ to benefit tarako

blocking-b2g: 1.3? → 1.3+

Kyle Huey (Exited; not receiving bugmail, old account, do not use)

Updated

•

12 years ago

Blocks: 952692

Kyle Huey (Exited; not receiving bugmail, old account, do not use)

Updated

•

12 years ago

Blocks: 952693

Joe Cheng [:jcheng] (please needinfo)

Comment 6

•

12 years ago

hi Patrick, do you mind giving an update on this bug after you come back from the holidays? Thanks

Flags: needinfo?(pwang)

Patrick Wang (Chih-Kai Wang) (:kk1fff)

Comment 7

•

12 years ago

Hi Joe, the regression bugs that are filed after last time we enable Nuwa are almost fixed. I think we can reenable nuwa after bug 948774 has been fixed.

Flags: needinfo?(pwang)

Preeti Raghunath(:Preeti)

Comment 8

•

12 years ago

Hi Patrick, What is the ETA for fixing bug 948774?

Flags: needinfo?(pwang)

Patrick Wang (Chih-Kai Wang) (:kk1fff)

Comment 9

•

12 years ago

I am not pretty sure about the ETA of the bug. But you might ask Kyle, who works on the bug.

Flags: needinfo?(pwang)

Kyle Huey (Exited; not receiving bugmail, old account, do not use)

Comment 11

•

12 years ago

See the bug.

Flags: needinfo?(khuey)

[:fabrice] Fabrice Desré

Assignee

Comment 12

•

12 years ago

https://hg.mozilla.org/integration/b2g-inbound/rev/eb5f24ea495b

Michael Vines [:m1] [:evilmachines]

Updated

•

12 years ago

status-b2g-v1.3: --- → affected

Michael Vines [:m1] [:evilmachines]

Updated

•

12 years ago

Whiteboard: [MemShrink:P1][tarako] → [MemShrink:P1][tarako][CR 599688]

Patrick Wang (Chih-Kai Wang) (:kk1fff)

Updated

•

12 years ago

Assignee: pwang → fabrice

Phil Ringnalda (:philor)

Comment 13

•

12 years ago

Not bad, other than marionette-webapi bustage like https://tbpl.mozilla.org/php/getParsedLog.php?id=33007882&tree=B2g-Inbound and a mochitest-6 assertion failure like https://tbpl.mozilla.org/php/getParsedLog.php?id=33001635&tree=B2g-Inbound Backed out in https://hg.mozilla.org/integration/b2g-inbound/rev/a5f0a97f9097

[:fabrice] Fabrice Desré

Assignee

Comment 14

•

12 years ago

arg... Patrik, do you have time to take a look?

Flags: needinfo?(pwang)

Kyle Huey (Exited; not receiving bugmail, old account, do not use)

Comment 15

•

12 years ago

The assertion is just bogus. I'll fix that.

Kyle Huey (Exited; not receiving bugmail, old account, do not use)

Comment 16

•

12 years ago

I pushed https://hg.mozilla.org/integration/b2g-inbound/rev/d1d1f50c63f9 to fix the assertion. Patrick, can you investigate the webapi test failure?

Patrick Wang (Chih-Kai Wang) (:kk1fff)

Comment 17

•

12 years ago

sure.

Flags: needinfo?(pwang)

Patrick Wang (Chih-Kai Wang) (:kk1fff)

Comment 18

•

12 years ago

The error occurs when app_window.js try to use browser API on FTU's iframe. If Nuwa preallocate process is on, loading of remote frame will end at [1], and browser parent script won't be loaded, then calling to browser API will result in a javascript exception. Since loading a remote frame is now an asynchronous task, browser API won't be available right after attaching an iframe, embedder should test if browser API is usable before using it. Another option is that we provide a callback to let embedder know that we are ready. [1] http://mxr.mozilla.org/mozilla-central/source/content/base/src/nsFrameLoader.cpp?rev=4f7f249f6d18#463

Patrick Wang (Chih-Kai Wang) (:kk1fff)

Comment 19

•

12 years ago

Attached patch WIP: Notify embedder when BEP is ready. (obsolete) — Details — Splinter Review

Gecko change to fire an event when browser element parent is ready.

Attachment #8360552 - Flags: feedback?(fabrice)

[:fabrice] Fabrice Desré

Assignee

Comment 20

•

12 years ago

Comment on attachment 8360552 [details] [diff] [review] WIP: Notify embedder when BEP is ready. Review of attachment 8360552 [details] [diff] [review]: ----------------------------------------------------------------- Patrick, I had a simpler patch in my queue that just does: let evt = this._createEvent('ready', null, /* cancelable = */ false); this._frameElement.dispatchEvent(evt); at the end of the BrowserElementParent() constructor and it worked just fine with the right gaia patch. Do you feel we really need your pending frame mechanism? ::: dom/browser-element/BrowserElementParent.js @@ +81,5 @@ > _observeInProcessBrowserFrameShown: function(frameLoader) { > debug("In-process browser frame shown " + frameLoader); > + this._createBrowserElementParent(frameLoader, > + /* hasRemoteFrame = */ false, > + /* pending frame */ false); nit: align comments ::: dom/browser-element/BrowserElementParent.jsm @@ +707,5 @@ > observe: function(subject, topic, data) { > switch(topic) { > + case 'remote-browser-frame-shown': > + if (!this._initialized && (this._frameLoader == subject)) { > + Services.obs.removeObserver(this, 'remote-Browser-frame-shown'); that should be remote-browser-frame-shown

Attachment #8360552 - Flags: feedback?(fabrice) → feedback+

[:fabrice] Fabrice Desré

Assignee

Comment 21

•

12 years ago

Attached patch mozbrowser-ready.patch (obsolete) — Details — Splinter Review

Gaia patch that waits for the new mozbrowserready event if the paint listener methods are not accessible.

Attachment #8360641 - Flags: review?(alive)

(inactive after 6/18) Alive Kuo [:alive]

Comment 22

•

12 years ago

According to comment 18, I wonder this doesn't only affect addNextPaintListener but also all browser API: getScreenshot/goBack/goForward/canGoBack/canGoForward/setVisible/getVisible/purgeHistory. Is that true?

Patrick Wang (Chih-Kai Wang) (:kk1fff)

Comment 23

•

12 years ago

(In reply to Alive Kuo [:alive][NEEDINFO!] from comment #22) > According to comment 18, I wonder this doesn't only affect > addNextPaintListener but also all browser API: > getScreenshot/goBack/goForward/canGoBack/canGoForward/setVisible/getVisible/ > purgeHistory. Is that true? Yes, all browser APIs are not available before the new process launch.

(inactive after 6/18) Alive Kuo [:alive]

Comment 24

•

12 years ago

(In reply to Patrick Wang (ChihKai) (:kk1fff) from comment #23) > (In reply to Alive Kuo [:alive][NEEDINFO!] from comment #22) > > According to comment 18, I wonder this doesn't only affect > > addNextPaintListener but also all browser API: > > getScreenshot/goBack/goForward/canGoBack/canGoForward/setVisible/getVisible/ > > purgeHistory. Is that true? > > Yes, all browser APIs are not available before the new process launch. How about events? I wonder we may lose mozbrowserloadstart event before mozbrowserready event?

Patrick Wang (Chih-Kai Wang) (:kk1fff)

Comment 25

•

12 years ago

(In reply to Fabrice Desré [:fabrice] from comment #20) > Comment on attachment 8360552 [details] [diff] [review] > WIP: Notify embedder when BEP is ready. > > Review of attachment 8360552 [details] [diff] [review]: > ----------------------------------------------------------------- > > Patrick, I had a simpler patch in my queue that just does: > > let evt = this._createEvent('ready', null, /* cancelable = */ false); > this._frameElement.dispatchEvent(evt); > > at the end of the BrowserElementParent() constructor and it worked just fine > with the right gaia patch. Do you feel we really need your pending frame > mechanism? I don't think we really need pending, simpler mechanism would be better. :)

Patrick Wang (Chih-Kai Wang) (:kk1fff)

Comment 26

•

12 years ago

(In reply to Alive Kuo [:alive][NEEDINFO!] from comment #24) > (In reply to Patrick Wang (ChihKai) (:kk1fff) from comment #23) > > Yes, all browser APIs are not available before the new process launch. > > How about events? I wonder we may lose mozbrowserloadstart event before > mozbrowserready event? No, there's no event before mozbrowserready, since browser element is not working then.

(inactive after 6/18) Alive Kuo [:alive]

Comment 27

•

12 years ago

Comment on attachment 8360641 [details] [diff] [review] mozbrowser-ready.patch Review of attachment 8360641 [details] [diff] [review]: ----------------------------------------------------------------- According to comment 18, comment 23, we need a new mechanism for all browser API but not only addNextPaintListener. Fabrice, if you don't mind I would take this.

Attachment #8360641 - Flags: review?(alive)

(inactive after 6/18) Alive Kuo [:alive]

Comment 28

•

12 years ago

Attached file https://github.com/alivedise/gaia/compare/bugzilla;950266;nuwa?expand=1 (obsolete) — Details

Nuwa Gaia ready adapter

Attachment #8361019 - Flags: review?(timdream)

Patrick Wang (Chih-Kai Wang) (:kk1fff)

Comment 29

•

12 years ago

Attached patch WIP: Notify embedder when BEP is ready. v2 (obsolete) — Details — Splinter Review

I kept pending state since this rather understandable to me. I ran webapi test locally with Alive's patch on Gaia, I still got 3 failures, but it's better than before, when webapi test can't even start.

Attachment #8360552 - Attachment is obsolete: true

[:fabrice] Fabrice Desré

Assignee

Updated

•

12 years ago

Depends on: 960641

[:fabrice] Fabrice Desré

Assignee

Comment 30

•

12 years ago

Let's move the 'mozbrowserready' patches to bug 960641 and keep this one just to turn on nuwa.

[:fabrice] Fabrice Desré

Assignee

Comment 31

•

12 years ago

Attached patch enable-nuwa.patch — Details — Splinter Review

slight update to get things building.

Attachment #8360641 - Attachment is obsolete: true

Attachment #8361019 - Attachment is obsolete: true

Attachment #8361026 - Attachment is obsolete: true

Attachment #8361019 - Flags: review?(timdream)

Patrick Wang (Chih-Kai Wang) (:kk1fff)

Comment 32

•

12 years ago

I am a little bit worried about the regression that could happen if we change the behavior of mozbrowser. Seeking if we can solve this without touching the API..

Patrick Wang (Chih-Kai Wang) (:kk1fff)

Comment 35

•

12 years ago

(In reply to Preeti Raghunath(:Preeti) from comment #33) > Patrick, > > What is the solution moving forward? The patch for bug 960641 was landed.

Flags: needinfo?(pwang)

[:fabrice] Fabrice Desré

Assignee

Comment 36

•

12 years ago

We need to fix bug 961959 first to not break all profiling.

Depends on: 961959

Flags: needinfo?(fabrice)

Faramarz Rashed

Comment 37

•

12 years ago

Fabrice, if that's the case can you add the 1.3 CS blocker status for the dependent bug so it gets tagged please?

Jason Smith [:jsmith]

Updated

•

12 years ago

Blocks: 961591

Preeti Raghunath(:Preeti)

Comment 38

•

12 years ago

Blocks: 942267

Kyle Huey (Exited; not receiving bugmail, old account, do not use)

Comment 40

•

12 years ago

Patrick and Cervantes are working through the current tryserver failures.

Flags: needinfo?(fabrice)

Preeti Raghunath(:Preeti)

Comment 41

•

12 years ago

Thanks Kyle

Jonas Sicking (:sicking) No longer reading bugmail consistently

Comment 42

•

12 years ago

Can we really enable Nuwa at this point for the v1.3 release? It seems hugely risky. Nuwa is a new feature and feature freeze was a long time ago. The fact that the feature hasn't been stabilized yet means that it's not ready for v1.3.

Mandyam Vikram

Comment 43

•

12 years ago

We really need this to be enabled for 1.3 as we are targeting devices with 256 MB configurations where the memory savings provided by Nuwa could play a signficant role.

Cervantes Yu [:cyu] [:cervantes]

Comment 44

•

12 years ago

We bisect locally and found that bug 961047 breaks the test case when Nuwa is enabled.

Patrick Wang (Chih-Kai Wang) (:kk1fff)

Comment 45

•

12 years ago

Try result: https://tbpl.mozilla.org/?tree=Try&rev=4f902b351695 https://tbpl.mozilla.org/?tree=Try&rev=50c61ddd206d It looks the failure occurs after default enabling APZC. Vivien, Shih-Chiang, would you help take a look?

Mike Lee [:mlee]

Updated

•

12 years ago

Status: NEW → ASSIGNED

Keywords: perf

Whiteboard: [MemShrink:P1][tarako][CR 599688] → [c=memory p= s= u=tarako] [MemShrink:P1][tarako][CR 599688]

[:fabrice] Fabrice Desré

Assignee

Comment 46

•

12 years ago

I could not reproduce the failures on a local emulator build on current b2g-inbound. Try run to confirm: https://tbpl.mozilla.org/?tree=Try&rev=8919c94aaa4d

Patrick Wang (Chih-Kai Wang) (:kk1fff)

Comment 47

•

12 years ago

(In reply to Fabrice Desré [:fabrice] from comment #46) > I could not reproduce the failures on a local emulator build on current > b2g-inbound. Try run to confirm: > > https://tbpl.mozilla.org/?tree=Try&rev=8919c94aaa4d The failure case on my local build is not the same as on try server. There's probably a timing issue.

[:fabrice] Fabrice Desré

Assignee

Comment 48

•

12 years ago

Yep, my local (green) build was a debug build - we fail only on opt builds. Sounds like a race :(

Cervantes Yu [:cyu] [:cervantes]

Comment 49

•

12 years ago

We think that turning on Nuwa in the emulator tests might have revealed some potential timing issues with APZ: - (group 3) ERROR TEST-UNEXPECTED-FAIL | /tests/docshell/test/navigation/test_bug430723.html | Test timed out. We found that the onscroll callback in the test case can only be fired when ScrollFrameHelper::AsyncScroll::WillRefresh() is called, that is, it is triggered by the refresh driver. If in ScrollFrameHelper::ScrollToWithOrigin(), mAsyncScroll haven't received callback from nsRefreshDriver, then the onscroll callback will never be fired. The test case will never finish until marionette times out. We have a workaround of this problem that we call ScrollFrameHelper::AsyncScroll::WillRefresh() when it still waits for the refresh driver. Then this failure is gone and we will encounter the unexpected scroll offset like - ERROR TEST-UNEXPECTED-FAIL | /tests/dom/events/test/test_bug946632.html | <input> horizontal line scroll container.scrollTop - got 20, expected 0 There are other unexpected failures like: - ERROR TEST-UNEXPECTED-FAIL | /tests/dom/events/test/test_bug946632.html | <input> horizontal line scroll targets <input> - got [object HTMLDivElement], expected [object HTMLInputElement] Fabrice, we need help from someone familiar with APZ to dive deeper into the failures. Would you help coordinate this?

Cervantes Yu [:cyu] [:cervantes]

Comment 50

•

12 years ago

Attached patch [test-only] Workaround a potential APZ timing issue. — Details — Splinter Review

Cervantes Yu [:cyu] [:cervantes]

Comment 51

•

12 years ago

And if we debug b2g running in the emulator, which makes it even slower, the unexpected scroll offset failures are gone. This indicates that the failure is likely another racing condition.

Cervantes Yu [:cyu] [:cervantes]

Comment 52

•

12 years ago

Myth busted: APZC is not enabled on try runs even with the gaia part in bug 961047!! We try to log AsyncPanZoomController ctor but didn't see it created. We print the preference "dom.browser_frames.useAsyncPanZoom" and it is false when Nuwa is not enabled and tests passed. I think Nuwa makes the preference in gaia take effect in time and the bugs with APZ are revealed on the try server.

[:fabrice] Fabrice Desré

Assignee

Updated

•

12 years ago

Depends on: 965351

Kartikaya Gupta (email:kats@mozilla.staktrace.com)

Comment 53

•

12 years ago

(In reply to Cervantes Yu from comment #52) > We try to log AsyncPanZoomController ctor but didn't see it created. We > print the preference "dom.browser_frames.useAsyncPanZoom" and it is false > when Nuwa is not enabled and tests passed. I think Nuwa makes the preference > in gaia take effect in time and the bugs with APZ are revealed on the try > server. Why would Nuwa change the behaviour of the preference? Note that we explicitly leave dom.browser_frames.useAsyncPanZoom as false in the b2g.js prefs file, and then there is a separate gaia pref that sets it to true when gaia starts up. Since the try server doesn't run gaia, that code doesn't run and dom.browser_frames.useAsyncPanZoom remains false for try testing. From my limited understanding of Nuwa I don't see why this would change if that were enabled.

Mason Chang [Inactive] [:mchang]

Updated

•

12 years ago

Priority: -- → P1

[:fabrice] Fabrice Desré

Assignee

Comment 54

•

12 years ago

Nuwa doesn't change the behavior of the preference, but the timing between the frame loader and turning on the pref to true in b2g/chrome/content/settings.js : With nuwa enabled: I/GeckoDump( 289): XXX Settings: dom.browser_frames.useAsyncPanZoom set to true E/GeckoConsole( 327): Could not read chrome manifest 'file:///system/b2g/chrome.manifest'. I/Gecko ( 289): XXX frame loader async=true With nuwa Disabled: I/Gecko ( 294): XXX frame loader async=false I/GeckoDump( 294): XXX Settings: dom.browser_frames.useAsyncPanZoom set to true So with nuwa enabled, we run the tests with apzc enabled, and fail because of the race.

Kartikaya Gupta (email:kats@mozilla.staktrace.com)

Comment 55

•

12 years ago

Huh. So you're saying that even without the Nuwa patches, tryserver run is turning on the pref at some point? That was not what I expected based on https://bugzilla.mozilla.org/show_bug.cgi?id=961047#c36 (specifically the middle paragraph that says the test runner doesn't use the Gaia settings).

[:fabrice] Fabrice Desré

Assignee

Comment 56

•

12 years ago

Yes, but it turns it on after the frame creation, so this has no effect on the tests. That's really bad I think.

Kartikaya Gupta (email:kats@mozilla.staktrace.com)

Comment 57

•

12 years ago

If I can successfully repro the test failures locally I can dig into it. I would prefer to fix the test failures properly than back out bug 961047 or land more hacks. Right now I have a b2g-emulator build that appears to work but am running into errors when trying to run the mochitests on it; I'll continue on that tomorrow.

Jed Davis [:jld] ⟨⏰|UTC-7⟩ ⟦he/him⟧

Comment 58

•

12 years ago

(In reply to Kartikaya Gupta (email:kats@mozilla.com) from comment #57) > If I can successfully repro the test failures locally I can dig into it. I > would prefer to fix the test failures properly than back out bug 961047 or > land more hacks. Right now I have a b2g-emulator build that appears to work > but am running into errors when trying to run the mochitests on it; I'll > continue on that tomorrow. FYI, the gecko tree I use with the emulator has a workaround for bug 933336 (attachment 826741 [details] [diff] [review]) and the not-yet-landed patch for bug 936312.

[:fabrice] Fabrice Desré

Assignee

Comment 59

•

12 years ago

(In reply to Jed Davis [:jld] from comment #58) > FYI, the gecko tree I use with the emulator has a workaround for bug 933336 > (attachment 826741 [details] [diff] [review]) and the not-yet-landed patch > for bug 936312. I don't think these bugs/patches are relevant for these test failures.

Andreas Gal :gal

Comment 60

•

12 years ago

Are we confident that nuwa actually works and merely our tests fail? In that case disable test, land the patch, and lets keep digging. There is no need to hold up CS as long we think this is actually working.

[:fabrice] Fabrice Desré

Assignee

Comment 61

•

12 years ago

Hopefully last try build with the 3 tests disabled: https://tbpl.mozilla.org/?tree=Try&rev=01dc40bbda45

Thinker Li [:sinker]

Comment 62

•

12 years ago

Cervantes had ignored the preference value and always enable APZ with a hard code. The result is really terrible. The testcases even never start, they are blocked at somewhere before marionette starting the testcases. What I had been learnt last night is that APZ would not work properly with AsyncScroll in concurrent. The scrollframes post scrolling events only for the changes of the scroll position. AsyncScroll would change scroll position periodically and post scrolling events. But, existing AsyncScroll would be stopped by another ScrollBy()/ScrollTo() with INSTANT mode. APZ would cause a ScrollBy(), and it is related to mDestination; the destination of the AsyncScroll. So, if a layer transaction for AsyncScroll's scrolling is sent after APZC's RequestContentRepaint() that cause a ScrollBy() at the content process in turn, and ScrollBy() is dispatched at the content process after the transaction, then it could scroll to current position; no position change. In another word, existing AsyncScroll would be stopped without any scrolling event in advance. Consider following scenario at a content process, 1. Current is at A position 2. call ScrollTo(B position); mDestination == B 3. post a scrolling event @ C position (between A, B); but AsyncScroll is still running. 4. call ScrollTo(D position) and expect to receive one or more scrolling events; AsyncScroll is reused and keep running. 5. ScrollBy(delta, INSTANT) from APZC at the parent process. C == mDestination + delta. Current AsyncScroll is destroyed. 6. The expectation of scrolling is failed now.

[:fabrice] Fabrice Desré

Assignee

Comment 63

•

12 years ago

Thinker, none of this is nuwa specific, right?

Milan Sreckovic [:milan] (needinfo for best results)

Comment 64

•

12 years ago

(In reply to Fabrice Desré [:fabrice] from comment #63) > Thinker, none of this is nuwa specific, right? If we're talking about our scrolling tests needing to be redone for APZC, then I believe you're right - that was the case before nuwa. We're tracking similar issue in bug 958036, I guess we need something for these tests as well.

Kartikaya Gupta (email:kats@mozilla.staktrace.com)

Comment 65

•

12 years ago

(In reply to Thinker Li [:sinker] from comment #62) > What I had been learnt last night is that APZ would not work properly with > AsyncScroll in concurrent. The scrollframes post scrolling events only for > the changes of the scroll position. AsyncScroll would change scroll > position periodically and post scrolling events. But, existing AsyncScroll > would be stopped by another ScrollBy()/ScrollTo() with INSTANT mode. APZ > would cause a ScrollBy(), and it is related to mDestination; the destination APZ does a ScrollTo, but otherwise you're exactly right. Also this should have been fixed very recently in bug 961280. Do you know if the problem you describe persists after that fix is applied?

Kartikaya Gupta (email:kats@mozilla.staktrace.com)

Comment 66

•

12 years ago

(In reply to Milan Sreckovic [:milan] from comment #64) > We're tracking > similar issue in bug 958036, I guess we need something for these tests as > well. The "figuring out what's wrong in these tests" is mostly covered by bug 963278.

[:fabrice] Fabrice Desré

Assignee

Comment 67

•

12 years ago

(In reply to Kartikaya Gupta (email:kats@mozilla.com) from comment #65) > (In reply to Thinker Li [:sinker] from comment #62) > > What I had been learnt last night is that APZ would not work properly with > > AsyncScroll in concurrent. The scrollframes post scrolling events only for > > the changes of the scroll position. AsyncScroll would change scroll > > position periodically and post scrolling events. But, existing AsyncScroll > > would be stopped by another ScrollBy()/ScrollTo() with INSTANT mode. APZ > > would cause a ScrollBy(), and it is related to mDestination; the destination > > APZ does a ScrollTo, but otherwise you're exactly right. Also this should > have been fixed very recently in bug 961280. Do you know if the problem you > describe persists after that fix is applied? bug 961280 was merged to b2g-inbound and didn't fix the test failures for me.

Patrick Wang (Chih-Kai Wang) (:kk1fff)

Comment 68

•

12 years ago

(In reply to Fabrice Desré [:fabrice] from comment #63) > Thinker, none of this is nuwa specific, right? Cervantes' test is with Nuwa turned off and APZ turned on. Also we did a test that blocks all RecvUpdateFrame messages in TabChild with Nuwa on, in this case, and we can pass the test case.

Thinker Li [:sinker]

Comment 69

•

12 years ago

(In reply to Fabrice Desré [:fabrice] from comment #63) > Thinker, none of this is nuwa specific, right? Right, we don't find any nuwa specific issue for APZ yet.

Kartikaya Gupta (email:kats@mozilla.staktrace.com)

Comment 70

•

12 years ago

(In reply to Fabrice Desré [:fabrice] from comment #67) > bug 961280 was merged to b2g-inbound and didn't fix the test failures for me. Ok, but can you tell me why it didn't? i.e. why didn't it hit the codepath I added and skip the ScrollTo call? Is my patch wrong, or is it failing for some other reason now?

Kartikaya Gupta (email:kats@mozilla.staktrace.com)

Comment 71

•

12 years ago

I'm able to reproduce the failures locally and will debug them. You can follow along in bug 963278 if interested.

Andreas Gal :gal

Comment 72

•

12 years ago

Is this landed yet? We are past our CS deadline.

[:fabrice] Fabrice Desré

Assignee

Comment 73

•

12 years ago

remote: https://hg.mozilla.org/integration/b2g-inbound/rev/2963f15842fc remote: https://hg.mozilla.org/integration/b2g-inbound/rev/e3e75f44864b

Kartikaya Gupta (email:kats@mozilla.staktrace.com)

Updated

•

12 years ago

Depends on: 963278

Wes Kocher (:KWierso) (Not reading bugmail; email directly if needed)

Comment 74

•

12 years ago

Backed out in https://hg.mozilla.org/integration/b2g-inbound/rev/666164e66db2 for making test_webapps_actor.html permaorange and mochitest-9's log permaoverflowing

Thinker Li [:sinker]

Comment 75

•

12 years ago

(In reply to Wes Kocher (:KWierso) from comment #74) > Backed out in > https://hg.mozilla.org/integration/b2g-inbound/rev/666164e66db2 for making > test_webapps_actor.html permaorange and mochitest-9's log permaoverflowing Is test_webapps_actor.html right? If I am right, mm.sendAsyncMessage("install", ...) is before all installing of message listeners, and app installing is executed async in another process. It don't promise the order of installing apps and registrations of messages.

[:fabrice] Fabrice Desré

Assignee

Comment 76

•

12 years ago

I pushed to try: - disabling the test: https://tbpl.mozilla.org/?tree=Try&rev=c14db8951f73 - fixing the test: https://tbpl.mozilla.org/?tree=Try&rev=3aeb7177a3f1

Thinker Li [:sinker]

Comment 77

•

12 years ago

Hi Fabrice, I suggest to move mm.sendAsyncMessage("install", ...) to here https://hg.mozilla.org/try/rev/96e3b4c77097#l5.31 since navigator.mozApps.mgmt is created lazily, and it registers for a few messages.

[:fabrice] Fabrice Desré

Assignee

Comment 78

•

12 years ago

daily push: https://hg.mozilla.org/integration/b2g-inbound/rev/ca2ab9f4ac9c

Inder

Comment 79

•

12 years ago

Fabrice, can you please provide v1.3 patch also.

Flags: needinfo?(fabrice)

[:fabrice] Fabrice Desré

Assignee

Comment 80

•

12 years ago

Inder, I'll land on 1.3 once we're on m-c

Flags: needinfo?(fabrice)

Michael Vines [:m1] [:evilmachines]

Comment 81

•

12 years ago

Hey Fabrice, we'd like to enable internally on v1.3 in parallel. Waiting for it to land on m-c and then for the uplift will take quite a while.

[:fabrice] Fabrice Desré

Assignee

Comment 82

•

12 years ago

Attached patch nuwa-1_3.patch — Details — Splinter Review

Inder, here's a rebased version for 1.3/aurora.

Ryan VanderMeulen [:RyanVM]

Comment 83

•

12 years ago

https://hg.mozilla.org/mozilla-central/rev/ca2ab9f4ac9c

Status: ASSIGNED → RESOLVED

Closed: 12 years ago

Resolution: --- → FIXED

Target Milestone: --- → mozilla29

Mike Lee [:mlee]

Updated

•

12 years ago

Whiteboard: [c=memory p= s= u=tarako] [MemShrink:P1][tarako][CR 599688] → [c=memory p= s=2014.01.31 u=tarako] [MemShrink:P1][tarako][CR 599688]

Ryan VanderMeulen [:RyanVM]

Comment 84

•

12 years ago

https://hg.mozilla.org/releases/mozilla-aurora/rev/bb8075885195

status-b2g-v1.3: affected → fixed

status-b2g-v1.4: --- → fixed

status-firefox27: --- → wontfix

status-firefox28: --- → fixed

status-firefox29: --- → fixed

Andrew McCreight [:mccr8]

Updated

•

12 years ago

Depends on: 966582

Ryan VanderMeulen [:RyanVM]

Comment 85

•

12 years ago

Backed out from both branches for frequent B2G mochitest-4 crashes as seen in bug 966582. On Aurora, it was basically perma-crash. On trunk, it wasn't that bad, but still frequent. https://hg.mozilla.org/releases/mozilla-aurora/rev/3a4d6b21c0ea https://hg.mozilla.org/mozilla-central/rev/c5b7d3d1d5ee

Status: RESOLVED → REOPENED

status-b2g-v1.3: fixed → affected

status-b2g-v1.4: fixed → affected

status-firefox28: fixed → affected

status-firefox29: fixed → affected

Resolution: FIXED → ---

Target Milestone: mozilla29 → ---

Patrick Wang (Chih-Kai Wang) (:kk1fff)

Comment 86

•

12 years ago

I can reproduce this crash locally, and I got backtrace on content process: Program received signal SIGSEGV, Segmentation fault. _pr_poll_with_poll (pds=0x402c6020, npds=1, timeout=<value optimized out>) at /home/patrick/w/hgpool/emu1/mcgit/nsprpub/pr/src/pthreads/ptio.c:3814 3814 in_flags_read = (pds[index].fd->methods->poll)( #0 _pr_poll_with_poll (pds=0x402c6020, npds=1, timeout=<value optimized out>) at /home/patrick/w/hgpool/emu1/mcgit/nsprpub/pr/src/pthreads/ptio.c:3814 #1 0x41f4d62e in PR_Poll (pds=0x1f, npds=1076706396, timeout=65535) at /home/patrick/w/hgpool/emu1/mcgit/nsprpub/pr/src/pthreads/ptio.c:4326 #2 0x405f829e in nsSocketTransportService::Poll (this=<value optimized out>, wait=<value optimized out>, interval=0x402d3cb4) at /home/patrick/w/hgpool/emu1/mcgit/netwerk/base/src/nsSocketTransportService2.cpp:393 #3 0x405f85da in nsSocketTransportService::DoPollIteration (this=0x42de1a00, wait=true) at /home/patrick/w/hgpool/emu1/mcgit/netwerk/base/src/nsSocketTransportService2.cpp:820 #4 0x405f87d6 in nsSocketTransportService::Run (this=0x42de1a00) at /home/patrick/w/hgpool/emu1/mcgit/netwerk/base/src/nsSocketTransportService2.cpp:684 #5 0x405d34fe in nsThread::ProcessNextEvent (this=0x40202a20, mayWait=<value optimized out>, result=0x402d3d47) at /home/patrick/w/hgpool/emu1/mcgit/xpcom/threads/nsThread.cpp:637 #6 0x405a3be8 in NS_ProcessNextEvent (thread=0x40202a20, mayWait=false) at /home/patrick/w/hgpool/emu1/mcgit/xpcom/glue/nsThreadUtils.cpp:263 #7 0x40709f84 in mozilla::ipc::MessagePumpForNonMainThreads::Run (this=0x42e20f10, aDelegate=0x402eb1a0) at /home/patrick/w/hgpool/emu1/mcgit/ipc/glue/MessagePump.cpp:303 #8 0x406ffd3c in MessageLoop::RunInternal (this=0x1000000) at /home/patrick/w/hgpool/emu1/mcgit/ipc/chromium/src/base/message_loop.cc:226 #9 0x406ffdba in MessageLoop::RunHandler (this=0x402eb1a0) at /home/patrick/w/hgpool/emu1/mcgit/ipc/chromium/src/base/message_loop.cc:219 #10 MessageLoop::Run (this=0x402eb1a0) at /home/patrick/w/hgpool/emu1/mcgit/ipc/chromium/src/base/message_loop.cc:193 #11 0x405d3a9c in nsThread::ThreadFunc (arg=<value optimized out>) at /home/patrick/w/hgpool/emu1/mcgit/xpcom/threads/nsThread.cpp:258 #12 0x41f50348 in _pt_root (arg=<value optimized out>) at /home/patrick/w/hgpool/emu1/mcgit/nsprpub/pr/src/pthreads/ptthread.c:205 #13 0x40017358 in _thread_create_startup (arg=0x402abc00) at /home/patrick/w/hgpool/emu1/mcgit/mozglue/build/Nuwa.cpp:543 #14 thread_create_startup (arg=0x402abc00) at /home/patrick/w/hgpool/emu1/mcgit/mozglue/build/Nuwa.cpp:574 #15 0x40060e4c in __thread_entry (func=0x402d3f00, arg=0x402abc00, tls=<value optimized out>) at bionic/libc/bionic/pthread.c:217 #16 0x4006099c in pthread_create (thread_out=<value optimized out>, attr=0x402abc14, start_routine=0x40017319 <thread_create_startup>, arg=0x402abc00) at bionic/libc/bionic/pthread.c:357 #17 0x00000000 in ?? () The pds[index].fd has type (PRFileDesc *) and its value is 0x1f, which is not accessible.

[:fabrice] Fabrice Desré

Assignee

Updated

•

12 years ago

Depends on: 966802

Mike Lee [:mlee]

Updated

•

12 years ago

Whiteboard: [c=memory p= s=2014.01.31 u=tarako] [MemShrink:P1][tarako][CR 599688] → [c=memory p= s= u=1.3+tarako] [MemShrink:P1][tarako][CR 599688]

Preeti Raghunath(:Preeti)

Comment 87

•

12 years ago

Fabrice, Is this on your radar?

Flags: needinfo?(fabrice)

[:fabrice] Fabrice Desré

Assignee

Comment 88

•

12 years ago

(In reply to Preeti Raghunath(:Preeti) from comment #87) > Fabrice, > > Is this on your radar? I feel it's my whole life these last couple of weeks. I think I'm gonna make a fridge magnet saying "Re-enable Nuwa!"

Flags: needinfo?(fabrice)

Ryan VanderMeulen [:RyanVM]

Comment 89

•

12 years ago

https://hg.mozilla.org/mozilla-central/rev/322238bbb29b Fabrice and I agree that we should way a day or two before uplifting this so we can better see any issues that pop up from the landing.

Status: REOPENED → RESOLVED

Closed: 12 years ago → 12 years ago

Resolution: --- → FIXED

Target Milestone: --- → mozilla30

Jason Smith [:jsmith]

Updated

•

12 years ago

Depends on: 968250

Jason Smith [:jsmith]

Comment 90

•

12 years ago

I'm adding NO_UPLIFT here to indicate to not uplift this until we figure out what happened in bug 968250. If nuwa caused bug 968250, then we'll need to back out.

Whiteboard: [c=memory p= s= u=1.3+tarako] [MemShrink:P1][tarako][CR 599688] → [c=memory p= s= u=1.3+tarako] [MemShrink:P1][tarako][CR 599688][NO_UPLIFT]

Gabriele Svelto [:gsvelto]

Comment 91

•

12 years ago

I've just run into another issue that seems to have been caused by this patch and which is a *significant* regression: the process priorities are all over the place because the process priority manager seems to be convinced that there's always a high-priority process running (like a process holding the CPU wakelock). I have not yet managed to establish what is the root cause but it seems that there's a leftover entry in the |mHighPriorityChildIDs| child array ( http://hg.mozilla.org/mozilla-central/file/579cf46bc21e/dom/ipc/ProcessPriorityManager.cpp#l176 ) that confuses the manager and causes it to miscompute the priorities of the other processes. Disabling Nuwa by unsetting MOZ_NUWA_PROCESS during the build process makes the issue go away.

[:fabrice] Fabrice Desré

Assignee

Comment 92

•

12 years ago

Gabriele, file a follow up please.

Gabriele Svelto [:gsvelto]

Updated

•

12 years ago

Blocks: 968297

Jason Smith [:jsmith]

Comment 93

•

12 years ago

Removing NO_UPLIFT as the root cause of the problem has been found for why the keon was failing to boot. See bug 967967. We'll want to uplift bug 967967 with this patch if we are to ensure that the keon device remains functional on 1.3.

Whiteboard: [c=memory p= s= u=1.3+tarako] [MemShrink:P1][tarako][CR 599688][NO_UPLIFT] → [c=memory p= s= u=1.3+tarako] [MemShrink:P1][tarako][CR 599688]

Kyle Huey (Exited; not receiving bugmail, old account, do not use)

Comment 94

•

12 years ago

We should still sit on this for a couple days.

Jason Smith [:jsmith]

Comment 95

•

12 years ago

(In reply to Kyle Huey [:khuey] (khuey@mozilla.com) from comment #94) > We should still sit on this for a couple days. Agreed - the bake time is definitely important here to ensure that we're not hitting basic bustage with the pref on. Can we plan for the uplift on Monday? Or do we want to earlier? I want to make sure that we do get this in our next full functional test run, so that we can find nuwa crashes during the testing.

Kyle Huey (Exited; not receiving bugmail, old account, do not use)

Comment 96

•

12 years ago

That sounds reasonable. I can uplift it Monday morning Taipei time.

Faramarz Rashed

Comment 97

•

12 years ago

We made a commitment to partners that we would uplift this by Friday. We have to do it before Monday. Thanks.

Jason Smith [:jsmith]

Comment 98

•

12 years ago

(In reply to Faramarz (:faramarz) from comment #97) > We made a commitment to partners that we would uplift this by Friday. We > have to do it before Monday. Thanks. Alright, I guess we're doing this Friday then. Also - there's a smoketest run that has been executed now with nuwa enabled. Only new issue found with nuwa enabled was bug 968501. I'm currently looking someone to test to see if that happens with nuwa disabled.

Jason Smith [:jsmith]

Updated

•

12 years ago

Depends on: 968501

Preeti Raghunath(:Preeti)

Comment 99

•

12 years ago

Yes, 2/7 was the committed date.

Jason Smith [:jsmith]

Comment 100

•

12 years ago

The UI automation report came out clean this morning with no regressions caused by nuwa, so were confirmed to be safe here. We're also planning on kicking off a full functional run next Monday. Fabrice - Can you land this on b2g28?

Flags: needinfo?(fabrice)

Ryan VanderMeulen [:RyanVM]

Comment 101

•

12 years ago

https://hg.mozilla.org/releases/mozilla-b2g28_v1_3/rev/164c55b989eb

status-b2g-v1.3: affected → fixed

status-b2g-v1.4: affected → fixed

status-firefox28: affected → wontfix

status-firefox29: affected → wontfix

status-firefox30: --- → fixed

Flags: needinfo?(fabrice)

Ryan VanderMeulen [:RyanVM]

Comment 102

•

12 years ago

Follow-up to disable the flaky WebRTC tests on v1.3 as well since the patch here wasn't ever updated. https://hg.mozilla.org/releases/mozilla-b2g28_v1_3/rev/c0ea6c9350d6 And re-enable the WebRTC tests on !B2G on trunk because I'm somehow doubting that was the intended goal. https://hg.mozilla.org/integration/mozilla-inbound/rev/9add9f9ed1fd

[:fabrice] Fabrice Desré

Assignee

Comment 103

•

12 years ago

Thanks Ryan, we owe you one!

Wes Kocher (:KWierso) (Not reading bugmail; email directly if needed)

Comment 104

•

12 years ago

(In reply to Ryan VanderMeulen [:RyanVM UTC-5] from comment #102) > And re-enable the WebRTC tests on !B2G on trunk because I'm somehow doubting > that was the intended goal. > https://hg.mozilla.org/integration/mozilla-inbound/rev/9add9f9ed1fd Well, I had to back that out in http://hg.mozilla.org/integration/mozilla-inbound/rev/d91a9bc9e84c for various test failures like https://tbpl.mozilla.org/php/getParsedLog.php?id=34317544&tree=Mozilla-Inbound and https://tbpl.mozilla.org/php/getParsedLog.php?id=34316799&tree=Mozilla-Inbound

Jason Smith [:jsmith]

Comment 105

•

12 years ago

(In reply to Wes Kocher (:KWierso) from comment #104) > (In reply to Ryan VanderMeulen [:RyanVM UTC-5] from comment #102) > > And re-enable the WebRTC tests on !B2G on trunk because I'm somehow doubting > > that was the intended goal. > > https://hg.mozilla.org/integration/mozilla-inbound/rev/9add9f9ed1fd > > Well, I had to back that out in > http://hg.mozilla.org/integration/mozilla-inbound/rev/d91a9bc9e84c for > various test failures like > https://tbpl.mozilla.org/php/getParsedLog.php?id=34317544&tree=Mozilla- > Inbound and > https://tbpl.mozilla.org/php/getParsedLog.php?id=34316799&tree=Mozilla- > Inbound ekr - is this the failure you were talking about over email that occurs with the nuwa webrtc crash fixed?

Flags: needinfo?(ekr)

Ryan VanderMeulen [:RyanVM]

Comment 106

•

12 years ago

Impressive that we managed to break them that fast. Spun off to bug 969718.

Ryan VanderMeulen [:RyanVM]

Comment 107

•

12 years ago

Merged to v1.3t. https://hg.mozilla.org/releases/mozilla-b2g28_v1_3t/pushloghtml?changeset=86e56630cb7e

status-b2g-v1.3T: --- → fixed

Eric Rescorla (:ekr)

Comment 108

•

12 years ago

Yes, this looks like a similar problem.

Flags: needinfo?(ekr)

Joe Cheng [:jcheng] (please needinfo)

Comment 109

•

12 years ago

status-b2g-v1.3T: fixed in tarako branch, remove [tarako] in whiteboard

Whiteboard: [c=memory p= s= u=1.3+tarako] [MemShrink:P1][tarako][CR 599688] → [c=memory p= s= u=1.3+tarako] [MemShrink:P1][CR 599688]

Jason Smith [:jsmith]

Updated

•

12 years ago

Depends on: 965912

Jason Smith [:jsmith]

Updated

•

12 years ago

Depends on: 974807

Randell Jesup [:jesup] (needinfo me)

Comment 110

•

12 years ago

re-enabled tests after a green try (and b2g 1.3 has them enabled) https://hg.mozilla.org/integration/mozilla-inbound/rev/752a28fd6b73

Wes Kocher (:KWierso) (Not reading bugmail; email directly if needed)

Comment 112

•

12 years ago

https://hg.mozilla.org/mozilla-central/rev/752a28fd6b73

Jason Smith [:jsmith]

Updated

•

12 years ago

No longer depends on: 974807

Jason Smith [:jsmith]

Updated

•

11 years ago

Depends on: 1024928

WIP: Notify embedder when BEP is ready. 12 years ago Patrick Wang (Chih-Kai Wang) (:kk1fff) 18.27 KB, patch	fabrice : feedback+	Details \| Diff \| Splinter Review
mozbrowser-ready.patch 12 years ago [:fabrice] Fabrice Desré 1.56 KB, patch		Details \| Diff \| Splinter Review
https://github.com/alivedise/gaia/compare/bugzilla;950266;nuwa?expand=1 12 years ago (inactive after 6/18) Alive Kuo [:alive] 71 bytes, text/plain		Details
WIP: Notify embedder when BEP is ready. v2 12 years ago Patrick Wang (Chih-Kai Wang) (:kk1fff) 20.07 KB, patch		Details \| Diff \| Splinter Review
enable-nuwa.patch 12 years ago [:fabrice] Fabrice Desré 2.38 KB, patch		Details \| Diff \| Splinter Review
[test-only] Workaround a potential APZ timing issue. 12 years ago Cervantes Yu [:cyu] [:cervantes] 1.17 KB, patch		Details \| Diff \| Splinter Review
nuwa-1_3.patch 12 years ago [:fabrice] Fabrice Desré 8.05 KB, patch		Details \| Diff \| Splinter Review