Closed Bug 918978 Opened 11 years ago Closed 11 years ago

Poppit abnormally terminates quickly after trying to play the game for a few seconds

Categories

(Firefox OS Graveyard :: General, defect)

ARM
Gonk (Firefox OS)
defect
Not set
normal

Tracking

(blocking-b2g:koi+, firefox27 verified, b2g-v1.2 fixed)

RESOLVED FIXED
blocking-b2g koi+
Tracking Status
firefox27 --- verified
b2g-v1.2 --- fixed

People

(Reporter: jsmith, Assigned: snorp)

References

Details

(Keywords: regression, Whiteboard: [MemShrink:P2])

Attachments

(4 files)

Attached file Logcat
Build: 1.2 9/20/2013 Device: Buri STR 1. Install Poppit from marketplace 2. Launch it 3. Try playing the game for a few seconds Expected The game stays up and running. Actual After playing the game for a few seconds, the app abnormally gets killed.
blocking-b2g: --- → koi?
Attached file dmesg log
I can also re-generate the crash on master hamachi. From dmesg log, poppit process is killed by low memory killer.
This bug seems very similar to Bug 899907.
Whiteboard: [MemShrink]
moving to koi+ as it is a regression issue.
blocking-b2g: koi? → koi+
QA Contact: dwatson
The window for the bug appears to be between 09/06 and 09/09: Occurs on - Buri Build ID: 20130909114657 Gecko: http://hg.mozilla.org/mozilla-central/rev/218d4334d29e Gaia: aa4180e9286d385fa6b62d236f30fb24cd8b93e9 Platform Version: 26.0a1 Does not occur - Build ID: 20130906040204 Gecko: http://hg.mozilla.org/mozilla-central/rev/ab5f29823236 Gaia: 94e5f269874b02ac0ea796b64ab995fce9efa4b3 Platform Version: 26.0a1 In build 09/06 the Poppit application does not experience the crash while playing, but in 09/09 the application crashes while playing.
I revert the following, by this change I can play poppit by using ROM built from same source in Comment 1. - Bug 907745 - Enable the new textures on B2G https://hg.mozilla.org/integration/b2g-inbound/rev/5174b38b4153
(In reply to Sotaro Ikeda [:sotaro] from comment #5) > I revert the following, by this change I can play poppit by using ROM built > from same source in Comment 1. > > - Bug 907745 - Enable the new textures on B2G > https://hg.mozilla.org/integration/b2g-inbound/rev/5174b38b4153 This revert seems to take the lifetime of poppit longer. But the app is killed during play at last.
(In reply to Sotaro Ikeda [:sotaro] from comment #6) > (In reply to Sotaro Ikeda [:sotaro] from comment #5) > This revert seems to take the lifetime of poppit longer. But the app is > killed during play at last. When reverted, I can play poppit. But ui touch and home button often became not work at all.
Blocks: GFXB2G1.2
Make sure we test on 1.2 going forward, not on master which has different graphics API in use (should not be related here though, if the problem was introduced on 9/6.)
Assignee: nobody → nical.bugzilla
Nothing obvious while running this on B2G emulator: - about:memory says for the Poppit process 140 M resident, 121 M resident-unique, which is enough to explain OOM on device, but offers very little clue into how that memory is used. 101 M explicit, of which 70M unclassified. Nothing special gfx-wise, with 4 M gralloc, 0.6 M canvas-2d-pixels, 7.57 images... - logcat keep spewing errors like this: E/OMXCodec( 192): [OMX.google.mp3.decoder] Timed out waiting for output buffers: 4/0 - in refgraph, nothing special graphics-wise either. Few layers, few Gralloc buffers... The only thing that seemed to be growing was the number of MediaDecoderStateMachine objects on the heap. It was 2 at the beginning of the game, and continuously grew to 6. These objects are part of a cycle of strong references that is not declared to the CC, reported as follows ([1]) by refgraph. However, these cycles remained referenced by outside, they weren't leaked. - the next step should be to run this in DMD to figure what the heap usage is. [1] : Cycle 45 has 2 vertices. Is NOT traversed by the CC! Types involved: mozilla::MediaDecoder / mozilla::MediaDecoderStateMachine Vertex 0: type name: mozilla::MediaDecoder address: 0x4435b160 size: 296 bytes belongs to Cycle 45 Strong references: strong reference 0 address: 0x402695e0 reference type name: nsCOMPtr<mozilla::MediaDecoderStateMachine> NON-traversed by the CC strong reference 1 address: 0x435db020 reference type name: nsRefPtr<mozilla::MediaResource> NON-traversed by the CC Weak references: weak reference 0 <unknown> at 0x4022f040 weak reference 1 <unknown> at 0x445c7ca0 weak reference 2 <unknown> at 0x445c8520 weak reference 3 FragmentOrElement (xhtml) audio app://{588c9b49-9c87-4985-b938-a5ff51cd78a9}/index.html at 0x44cbac20 Back strong references: back strong reference 0 mozilla::MediaDecoderStateMachine at 0x402695e0 back strong reference 1 FragmentOrElement (xhtml) audio app://{588c9b49-9c87-4985-b938-a5ff51cd78a9}/index.html at 0x44cbac20 back strong reference 2 nsTArrayHeader at 0x4592c020 Back weak references: back weak reference 0 mozilla::MediaDecoderStateMachine at 0x402695e0 back weak reference 1 <unknown> at 0x43392e20 back weak reference 2 <unknown> at 0x433f8a20 back weak reference 3 mozilla::MediaResource at 0x435db020 back weak reference 4 nsTArrayHeader at 0x44b99020 back weak reference 5 <unknown> at 0x4546f8e0 Vertex 1: type name: mozilla::MediaDecoderStateMachine address: 0x402695e0 size: 376 bytes belongs to Cycle 45 Strong references: strong reference 0 address: 0x4435b160 reference type name: nsRefPtr<mozilla::MediaDecoder> NON-traversed by the CC strong reference 1 address: 0x459a7810 reference type name: nsCOMPtr<nsITimer> NON-traversed by the CC Weak references: weak reference 0 <unknown> at 0x4022f040 weak reference 1 mozilla::MediaDecoderStateMachine at 0x402695e0 weak reference 2 <unknown> at 0x43392e20 weak reference 3 mozilla::MediaDecoder at 0x4435b160 weak reference 4 <unknown> at 0x44391cf0 weak reference 5 <unknown> at 0x44bddf20 Back strong references: back strong reference 0 mozilla::MediaDecoder at 0x4435b160 Back weak references: back weak reference 0 mozilla::MediaDecoderStateMachine at 0x402695e0 back weak reference 1 nsTimerImpl at 0x459a7810 back weak reference 2 <unknown> at 0x55466020 back weak reference 3 <unknown> at 0x554d3020 back weak reference 4 <unknown> at 0x5a007020 back weak reference 5 <unknown> at 0x5a070020
Sotaro / Chris D / Paul might know about whether the MediaDecoder situation here is normal.
(In reply to Benoit Jacob [:bjacob] from comment #10) > Sotaro / Chris D / Paul might know about whether the MediaDecoder situation > here is normal. Sounds like they use a lot of audio objects, each of which would result in a number of threads being created. Do we have the source to Poppit?
Attached file poppit-1.0.5.zip
(In reply to Chris Double (:doublec) from comment #11) > (In reply to Benoit Jacob [:bjacob] from comment #10) > > Sotaro / Chris D / Paul might know about whether the MediaDecoder situation > > here is normal. > > Sounds like they use a lot of audio objects, each of which would result in a > number of threads being created. Do we have the source to Poppit? I just attached the zip file from marketplace with the source.
(In reply to Sotaro Ikeda [:sotaro] from comment #5) > I revert the following, by this change I can play poppit by using ROM built > from same source in Comment 1. > > - Bug 907745 - Enable the new textures on B2G > https://hg.mozilla.org/integration/b2g-inbound/rev/5174b38b4153 I am going to re-check if the above is correct.
Attached file hg change log
"hg log" got from the following command. Tags are based on comment 4. - hg log -r ab5f29823236:218d4334d29e
From attachment 810747 [details], I picked up the suspicious changes. ---------------------------------------------------------- changeset: 145870:949ea52cb8ed user: Nicolas Silva <nical.bugzilla@gmail.com> date: Fri Sep 06 11:03:12 2013 +0200 summary: Bug 911941 - Prevent TextureClients to be shared by several ImageLayers. r=Bas changeset: 145871:af7a69606aa8 user: Nicolas Silva <nical.bugzilla@gmail.com> date: Fri Sep 06 11:04:50 2013 +0200 summary: Bug 912040 - Make sure deallocating shared texture data doesn't race with TextureClient access. r=sotaro changeset: 145872:5174b38b4153 user: Nicolas Silva <nical.bugzilla@gmail.com> date: Fri Sep 06 11:08:55 2013 +0200 summary: Bug 907745 - Enable the new textures on B2G ------------------------------------------------------- changeset: 145961:194d051724a8 user: Stephen Pohl <spohl.mozilla.bugs@gmail.com> date: Fri Sep 06 12:30:17 2013 -0400 summary: Bug 817700 - Make <canvas>.toBlob run asynchronously. r=seth,roc changeset: 145962:db43e949245c user: Stephen Pohl <spohl.mozilla.bugs@gmail.com> date: Fri Sep 06 12:30:21 2013 -0400 summary: Bug 817700 - Make <canvas>.toBlob run asynchronously - canvas changes. r=roc changeset: 145961:194d051724a8 user: Stephen Pohl <spohl.mozilla.bugs@gmail.com> date: Fri Sep 06 12:30:17 2013 -0400 summary: Bug 817700 - Make <canvas>.toBlob run asynchronously. r=seth,roc ------------------------------------------------------- changeset: 145971:2f0686edebb5 user: Benoit Jacob <bjacob@mozilla.com> date: Fri Sep 06 08:02:04 2013 -0400 summary: Bug 905227 - Introduce 128 px threshold for Skia/GL, overridable by gfx.canvas.min-size-for-skia-gl preference - r=nical changeset: 145972:d9e41df7445c user: Benoit Jacob <bjacob@mozilla.com> date: Fri Sep 06 08:02:12 2013 -0400 summary: Bug 905227 - Turn on Skia/GL on B2G - r=nical changeset: 145973:8e91e51a5c19 user: Benoit Jacob <bjacob@mozilla.com> date: Fri Sep 06 08:03:08 2013 -0400 summary: Bug 905227 - On Android Emulator, assume that OES_rgb8_rgba8 is available even though it's not in the GL_EXTENSIONS string - r=nical changeset: 145974:79c8b5e214a9 user: Benoit Jacob <bjacob@mozilla.com> date: Fri Sep 06 08:03:53 2013 -0400 summary: Bug 905227 - Adjust WebGL reftest.list for UNEXPECTED-PASS - r=nical -------------------------------------------------------------- changeset: 146103:a20a9a879347 user: Chris Pearce <cpearce@mozilla.com> date: Mon Sep 09 09:47:29 2013 +1200 summary: Bug 910996 - Move MP3FrameParser out of B2G specific code, so it can be used on other platforms. r=padenot ------------------------------------------------------------ changeset: 146106:353aa004fed8 user: Matt Woodrow <mwoodrow@mozilla.com> date: Mon Sep 09 14:35:20 2013 +1200 summary: Bug 912173 - Always AddRef and Release the GLContext when sharing a SurfaceStream to avoid mismatches. r=nrc
I feel the enabling the SkiaGL change seems most suspicious. Disabled it by commenting out the following. Since then the poppit did not get killed by low memory killer. // pref("gfx.canvas.azure.backends", "skia"); // pref("gfx.canvas.azure.accelerated", true); ------------------------------------------ changeset: 145972:d9e41df7445c user: Benoit Jacob <bjacob@mozilla.com> date: Fri Sep 06 08:02:12 2013 -0400 summary: Bug 905227 - Turn on Skia/GL on B2G - r=nical
That makes sense - when we use SkiaGL, we use their cache, and the memory usage would go up. I don't think the fix is to disable SkiaGL, and there are some thoughts for getting the cache picture more optimal, though I doubt that is a 1.2 timeframe.
Flags: needinfo?(bjacob)
During debugging, I saw another problem. When SkiaGL was enabled, poppit UI rendering sometimes drawn upside down. When I disabled the SkiaGL, I did not see this.
(In reply to Milan Sreckovic [:milan] from comment #18) > That makes sense - when we use SkiaGL, we use their cache, and the memory > usage would go up. I don't think the fix is to disable SkiaGL, and there > are some thoughts for getting the cache picture more optimal, though I doubt > that is a 1.2 timeframe. The Skia/GL cache is really an aspect of Skia/GL that I have never worked on, so you'll get a better answer from :snorp or :gw280. Forwarding the needinfo.
Flags: needinfo?(snorp)
Flags: needinfo?(gwright)
Flags: needinfo?(bjacob)
Note: we really need to look into this carefully with DMD to make sure that we understand exactly what, around Skia, is using much memory. Or are we already certain that it is the cache?
(In reply to Benoit Jacob [:bjacob] from comment #21) > Note: we really need to look into this carefully with DMD to make sure that > we understand exactly what, around Skia, is using much memory. Or are we > already certain that it is the cache? Given that things don't go bad when we use software Skia, I'd say the cache seems like the culprit. The current Skia texture cache behavior is not great for mobile. Bug 920239 will help. In the mean time we could try reducing the Skia max cache size for B2G. This value can be changed here: http://mxr.mozilla.org/mozilla-central/source/gfx/2d/DrawTargetSkia.cpp#91 I guess I would suggest 32 or even 16 here, given how resource-constrained some B2G devices can be. Obviously we need to get to a better solution, though.
Flags: needinfo?(snorp)
Another thing we need to do is purge the Skia texture cache entirely when there is a memory-pressure event. I had a patch for this a while back but I have no idea what I did with it.
(In reply to James Willcox (:snorp) (jwillcox@mozilla.com) from comment #22) > I guess I would suggest 32 or even 16 here, given how resource-constrained > some B2G devices can be. Obviously we need to get to a better solution, > though. I locally changed to 16 on v1.2 hamachi. By this change, the poppit did not killed by low memory killer until now, when it is shown in forground. When the app is set to background, the app is kill by low memory killer. 16MiB is still huge memory for hamachi device. When SkiaGL is disabled, the app was not get killed by low memory killer. But the change to 16 made big improvement.
(In reply to Sotaro Ikeda [:sotaro] from comment #24) > (In reply to James Willcox (:snorp) (jwillcox@mozilla.com) from comment #22) > > I guess I would suggest 32 or even 16 here, given how resource-constrained > > some B2G devices can be. Obviously we need to get to a better solution, > > though. > > I locally changed to 16 on v1.2 hamachi. By this change, the poppit did not > killed by low memory killer until now, when it is shown in forground. When > the app is set to background, the app is kill by low memory killer. 16MiB is > still huge memory for hamachi device. > > When SkiaGL is disabled, the app was not get killed by low memory killer. > But the change to 16 made big improvement. Alright, thanks for giving that a try. How much memory does the Hamachi have total? 256? We should probably just disable SkiaGL there.
Hamachi have 256 in total. But actual system memory is 180MB in total.
The following is memory info after stating the poppit with SkiaGL enabled. ---------------------------------------------- NAME PID NICE USS PSS RSS VSIZE OOM_ADJ USER b2g 144 0 48.7 51.8 55.3 152.2 0 root Poppit!" 521 1 87.2 90.5 94.0 494.4 2 app_521 System memory info: Total 180.4 MB Used - cache 156.9 MB B2G procs (PSS) 142.3 MB Non-B2G procs 14.6 MB Free + cache 23.5 MB Free 7.6 MB Cache 16.0 MB
It is memory info after starting the poppit with SkiaGL disabled. ---------------------------------------------------------- | megabytes | NAME PID NICE USS PSS RSS VSIZE OOM_ADJ USER b2g 146 0 52.2 56.1 60.3 153.8 0 root Poppit!" 554 1 70.6 74.5 78.7 460.8 2 app_554 System memory info: Total 180.4 MB Used - cache 138.4 MB B2G procs (PSS) 130.6 MB Non-B2G procs 7.8 MB Free + cache 42.0 MB Free 21.5 MB Cache 20.5 MB
Looks like the 16MB texture cache is the exact difference. The cache is what gives SkiaGL most of its speed (with drawImage at least), so if we can't have a decently sized one we should just fallback to software.
Assignee: nical.bugzilla → snorp
We can just build the check for memory and stuff into the patch for bug 920160, I think.
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → DUPLICATE
(In reply to James Willcox (:snorp) (jwillcox@mozilla.com) from comment #30) > We can just build the check for memory and stuff into the patch for bug > 920160, I think. > > *** This bug has been marked as a duplicate of bug 920160 *** This bug has already been triaged, so the dupe needs to go the other direction.
Status: RESOLVED → REOPENED
Resolution: DUPLICATE → ---
Depends on: 920160
Or make a dependency and nominate the dependency.
Let me clarify my wording in the above comments: This bug is deemed as a blocker, but I don't know about the other. Until the other bug becomes a blocker itself, this bug needs to be fixed independently of bug 920160.
I can run Poppit with a 4MB cache on hamachi, but it's still super slow. Profiling indicates that it's doing software blits for some reason. The same setup (4MB cache) works fine on Android, so I'm looking into that now.
Whiteboard: [MemShrink] → [MemShrink:P2]
Have seen this issue today on the latest Buri aurora build. Repros exactly as in comment 0, Poppit works fine after rebooting. BuildID: 20131016004005 Gaia: 5ef3535021286ccab7af639897feaaf5955720a0 Gecko: 75b0b968f3ed Version: 26.0a2 *have the logcat saved if needed
snorp, this is a fairly hard blocker for release. What's the latest? What do you need to expedite your investigation and fix?
Flags: needinfo?(snorp)
(In reply to Alex Keybl [:akeybl] from comment #36) > snorp, this is a fairly hard blocker for release. What's the latest? What do > you need to expedite your investigation and fix? The fix for bug 920160 fixes this, just waiting on an additional review to land that.
Flags: needinfo?(snorp)
Closing as fixed by bug 920160.
Status: REOPENED → RESOLVED
Closed: 11 years ago11 years ago
Keywords: verifyme
Resolution: --- → FIXED
This issue still reproduces but far less often on Buri 1.2 Build ID: 20131107004003. The game closes intermittently when starting a new level, changing the skill level, or just by tapping on balloons during a stage. Gaia 590eb598aacf1e2136b2b6aca5c3124557a365ca SourceStamp 26f1e160e696 BuildID 20131107004003 Version 26.0 This issue does not reproduce on Buri 1.3 Build ID: 20131107040200 Gaia 42bbe26a72e030faf07a6fc297f61a3a8ccda25b SourceStamp 70de5e24d79b BuildID 20131107040200 Version 28.0a1 Game no longer terminates after trying to play Poppit on 1.3.
Sarah - Can you file a followup bug for the issue you found on 1.2?
A new bug has been created. See Bug 936285 (In reply to Jason Smith [:jsmith] from comment #40) > Sarah - Can you file a followup bug for the issue you found on 1.2?
Keywords: verifyme
Flags: needinfo?(gwright)
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: