Bug 795985: Optimize gecko to run unmodified peacekeeper benchmark in 256MB RAM
Status: RESOLVED WORKSFORME (opened 12 years ago, closed 12 years ago)
Categories: Core :: General, defect
Tracking: blocking-basecamp: -
People: Reporter: jsmith, Assignee: Unassigned
Whiteboard: [MemShrink:P3]
Attachments: 2 files
Steps:
1. Go to the browser
2. Go to http://peacekeeper.futuremark.com/run.action

Expected: The peacekeeper benchmark should finish while running on wifi with no errors.

Actual: When you get to the first WebGL test (it shows a bubble with rocks going back and forth), it fails to load saying that the page is not responding.
Comment 1 • 12 years ago (Reporter)
Noming to block mainly because this is a web-based phone and we're likely to be measured on how well we do on these common tests - it doesn't look good if we can't finish a particular test.
blocking-basecamp: --- → ?
Anything in any logs?
Trying to run this on an Otoro causes the browser to crash at some point during the tests. I can't really identify where, because the tests don't change the URL or otherwise display anything that's visible on the display (other than the gfx results) while they're running.
Comment 4 • 12 years ago
APITrace? Else we'll probably need to rip it apart manually.
Comment 5 • 12 years ago (Reporter)
Attached a logcat with this reproducing. Nothing really stands out as obvious in the logcat though.
Comment 6 • 12 years ago (Reporter)
(In reply to Jeff Gilbert [:jgilbert] from comment #4)
> APITrace? Else we'll probably need to rip it apart manually.

Can you explain how I could get an APITrace?
Comment 7 • 12 years ago
(In reply to Jason Smith [:jsmith] from comment #6)
> (In reply to Jeff Gilbert [:jgilbert] from comment #4)
> > APITrace? Else we'll probably need to rip it apart manually.
>
> Can you explain how I could get an APITrace?

I am not that familiar with running APITrace on mobile devices. Maybe BenWa or Vlad knows more?
Comment 8 • 12 years ago
Let's block on figuring out why this fails and if it indicates a larger problem.
blocking-basecamp: ? → +
Summary: Cannot complete the peacekeeper benchmark test - fails to load the webgl test on FF OS → Determine why peacekeeper benchmark fails
I'll grab this.
Assignee: nobody → vladimir
So, WebGL is failing in general in many cases and causing the browser to crash. The cube on get.webgl.com works; WebGL Aquarium causes a crash. My current guess is that an OOM condition is caused by WebGL usage, which ends up taking down the browser.

Demos such as https://www.khronos.org/registry/webgl/sdk/demos/google/nvidia-vertex-buffer-object/index.html also cause a crash. (Ignore the "bound vertex attribute buffers do not have sufficient size for given indices from the bound element array" errors in that demo; that's a separate issue.)
Comment 11 • 12 years ago
Chris, is this benchmark a priority from a product standpoint?
blocking-basecamp: + → ?
Flags: needinfo?(clee)
Comment 12 • 12 years ago (Reporter)
(In reply to Andrew Overholt [:overholt] from comment #11)
> Chris, is this benchmark a priority from a product standpoint?

I think I'm less concerned right now about the benchmark issue, but more concerned about comment 10. I'd be inclined to finish off the analysis in comment 10, as Vlad's comments imply that there's quite a problem in the webgl world right now that we should break off into bugs upon completing the analysis here. If we finish off the analysis in comment 10, then I think that's sufficient to close this bug. But I would caution against skipping the analysis, based on what I'm hearing in comment 10.
Yep, I'm still on the case here, but got detoured by some other work. It's going to be a little tricky to figure out if this is indeed the case though, without doing some extensive memory logging.

Do we have anyone who knows how to do fairly low level debugging on these devices (otoro or unagi)? Something is causing a reboot -- I'd like to be able to set a breakpoint in the kernel at that point and just figure out how we got there.
Comment 14 • 12 years ago
(In reply to Vladimir Vukicevic [:vlad] [:vladv] from comment #13)
> Yep, I'm still on the case here, but got detoured by some other work. It's
> going to be a little tricky to figure out if this is indeed the case though,
> without doing some extensive memory logging.
>
> Do we have anyone who knows how to do fairly low level debugging on these
> devices (otoro or unagi). Something is causing a reboot -- I'd like to be
> able to set a breakpoint in the kernel at that point and just figure out how
> we got there.

cjones, dhylands, likely mwu, probably jlebar
vlad, reboot sounds like a new symptom. Can you STR it?
I think reboot may have been something else. Peacekeeper runs fine on Unagi, so I think I'm going to close this as WORKSFORME. I suspect Otoro is just running out of memory.
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → WORKSFORME
How much USS does "Browser" end up taking on unagi? Btw, this doesn't fail on unagi because we purposely shipped a misconfigured kernel. When the first FOTA update goes out, we'll ship the properly-configured kernel.
To be clearer, the otoro actually has 256MB of physical RAM. The unagi has 512MB of physical RAM. However, the configuration we care about is 256MB. There's a kernel update we didn't ship for the dogfooders which configures the unagi to pretend to have 256MB. We'll be installing that as soon as we can roll out FOTA updates. I'd like to know how much memory the "Browser" uses, because we have a substantial amount of memory win about to land. We may be able to load this benchmark after those. If it uses more than 110MB USS though, we have pretty much no hope of loading it.
So, watching Browser on Unagi while it's running, I see jumps like this (only showing lines where either vsize or rss changes drastically):

VSIZE   RSS
156540  72644
773760  83240
153084  66860
211424  79732
242160  111664
230064  98524
138684  50180
259884  64624
332848  69976
296760  73328
197732  75956

I think either that 700mb vsize or the 110mb rss is around when webgl was running; unfortunately the test prints no progress so it's hard to tell. However, Browser sits at around 80-90MB RSS for most of the run, so it's very close to the limit you mention.
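To put the samples above next to the 110MB figure, here is a minimal sketch; it was not run as part of this bug. It assumes the VSIZE and RSS columns are in KB, and it compares RSS against a budget that was stated in terms of USS. Since USS can be no larger than RSS, the comparison is only an upper bound. The names `SAMPLES_KB`, `BUDGET_MB`, and `summarize` are made up for illustration.

```python
# (VSIZE, RSS) pairs copied from the comment above; assumed to be in KB.
SAMPLES_KB = [
    (156540, 72644), (773760, 83240), (153084, 66860), (211424, 79732),
    (242160, 111664), (230064, 98524), (138684, 50180), (259884, 64624),
    (332848, 69976), (296760, 73328), (197732, 75956),
]

# The ceiling quoted in the thread is a USS figure; RSS only bounds USS from above.
BUDGET_MB = 110


def summarize(samples_kb, budget_mb):
    """Return (peak VSIZE in KB, peak RSS in KB, headroom in KB under the budget)."""
    peak_vsize = max(vsize for vsize, _ in samples_kb)
    peak_rss = max(rss for _, rss in samples_kb)
    headroom_kb = budget_mb * 1024 - peak_rss
    return peak_vsize, peak_rss, headroom_kb


if __name__ == "__main__":
    peak_vsize, peak_rss, headroom_kb = summarize(SAMPLES_KB, BUDGET_MB)
    print("peak VSIZE: %.1f MB" % (peak_vsize / 1024.0))
    print("peak RSS:   %.1f MB" % (peak_rss / 1024.0))
    print("headroom under %d MB: %d KB" % (BUDGET_MB, headroom_kb))
```

Under those assumptions the largest RSS sample lands within about 1MB of the 110MB mark, which is consistent with the "very close to the limit" reading.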
Spikes up to 110MB RSS are going to be a challenge, but this makes for a good test case :).
Blocks: slim-fast
Status: RESOLVED → REOPENED
Flags: needinfo?(clee)
Resolution: WORKSFORME → ---
Summary: Determine why peacekeeper benchmark fails → Get peacekeeper benchmark running in 256MB RAM
Ha, nice try, buddy! ;)
Assignee: vladimir → nobody
Component: Canvas: WebGL → General
Updated • 12 years ago (Reporter)
Whiteboard: [MemShrink]
Comment 22 • 12 years ago
Vlad, do you think we should block the release on this?
Flags: needinfo?(vladimir)
Comment 23 • 12 years ago
Unless we have some reason to believe that it's even /possible/ to run this benchmark within however much ram we have available, I don't think we can sanely block on this.
Tough to say if we can or can't until we figure out what's actually using that memory. I don't know that we have good tools to do that quickly; about:memory might help, if we could get a dump of it every few seconds to see what's going on. One useful thing to do would be to figure out why it's exiting, that is, which allocation is failing with 256mb.
Flags: needinfo?(vladimir)
Comment 25 • 12 years ago
> about:memory might help, if we could get a dump of it every few seconds to see what's
> going on.
Sure, run $B2G_ROOT/tools/get_about_memory.py in a loop.
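A minimal sketch of such a loop in Python, assuming `B2G_ROOT` is set in the environment and that `get_about_memory.py` is executable and can be run with no arguments (the thread does not say which options it takes). `dump_loop`, the 5 second interval, and the count of 10 are hypothetical choices, not part of the tool.

```python
import os
import subprocess
import time


def dump_loop(script, count, interval_s, run=subprocess.call, sleep=time.sleep):
    """Run `script` `count` times, pausing `interval_s` seconds between runs.

    Returns the list of exit codes. `run` and `sleep` are injectable so the
    loop can be exercised without a device attached.
    """
    codes = []
    for i in range(count):
        codes.append(run([script]))
        if i + 1 < count:
            sleep(interval_s)  # no pause after the final dump
    return codes


# Only do real work when B2G_ROOT is set; the interval and count are guesses.
if __name__ == "__main__" and os.environ.get("B2G_ROOT"):
    script = os.path.join(os.environ["B2G_ROOT"], "tools", "get_about_memory.py")
    print(dump_loop(script, count=10, interval_s=5))
```

Starting the loop just before the benchmark is launched would give a series of dumps to compare around the point where the WebGL tests begin.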
Just realized the bug title is ambiguous, fixed. (I don't care about peacekeeper itself at all, it's just "fat code".)
Summary: Get peacekeeper benchmark running in 256MB RAM → Optimize gecko to run unmodified peacekeeper benchmark in 256MB RAM
Comment 27 • 12 years ago
If you don't care about Peacekeeper (I certainly don't), can we pick something we do care about as an example of "fat code" to optimize for? Like Cut the Rope, or a WebGL game, or something? I'd rather optimize for something we care about and be pleasantly surprised if we change something so Peacekeeper fits in 256mb of RAM than optimize for this benchmark and hope that it will translate to a page/app we care about.
All I want to do is see if there's something dumb that peacekeeper triggers that we can optimize. Totally agreed it's way down on the list.
It doesn't sound like this is a terribly high priority for anyone, so I'm moving this to blocking-. Justin offered to help anyone who is interested in doing measurement with the various tools that we have. But beyond that his plan was to run tests on actual apps instead which I agree is a higher priority.
blocking-basecamp: ? → -
Updated • 12 years ago
Whiteboard: [MemShrink] → [MemShrink:P3]
With the fix from bug 798491, we make it into the WebGL tests. I see us dying from a null pointer deref in

F/libc ( 1174): Fatal signal 11 (SIGSEGV) at 0x00000000 (code=1)
E/OMXCodec( 1174): Attempting to allocate OMX node 'OMX.google.avc.decoder'
E/GeckoConsole( 1174): [JavaScript Warning: "Media resource http://peacekeeper.futuremark.com/resources/videos/riverfly01/riverfly01.mp4 could not be decoded." {file: "http://peacekeeper.futuremark.com/runTest.action" line: 0}]
E/GeckoConsole( 106): [JavaScript Warning: "Media resource http://peacekeeper.futuremark.com/resources/videos/riverfly01/riverfly01.mp4 could not be decoded." {file: "http://peacekeeper.futuremark.com/runTest.action" line: 0}]
D/memalloc( 106): /dev/pmem: Allocated buffer base:0x4a500000 size:348160 offset:2461696 fd:92
D/memalloc( 1174): /dev/pmem: Mapped buffer base:0x46000000 size:2809856 offset:2461696 fd:36
I/DEBUG ( 109): *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
I/DEBUG ( 109): Build fingerprint: 'toro/full_otoro/otoro:4.0.4.0.4.0.4/OPENMASTER/eng.cjones.20121026.170606:user/test-keys'
I/DEBUG ( 109): pid: 1174, tid: 5258 >>> /system/b2g/plugin-container <<<
I/DEBUG ( 109): signal 11 (SIGSEGV), code 1 (SEGV_MAPERR), fault addr 00000000
I/DEBUG ( 109): r0 fffffffc r1 425e4c29 r2 425e5040 r3 00000000
I/DEBUG ( 109): r4 43ae98a0 r5 43c2bb84 r6 43c578e0 r7 00000000
I/DEBUG ( 109): r8 00000000 r9 4382fd30 10 4106ff71 fp 4106ff6a
I/DEBUG ( 109): ip 400317b4 sp 4382fc98 lr 425e4c11 pc 00000000 cpsr 00000010

It's certainly possible this is a failed allocation, but the last two 4Hz samples of memory usage at the crash were

Browser app_0 1174 106 120860 68688 ffffffff 400e6594 R /system/b2g/plugin-container
Browser app_0 1174 106 121008 64468 ffffffff 400e8330 S /system/b2g/plugin-container

so we're certainly not in memory pressure. This looks more like one of the decoder bugs.
With bug 810719 flipped on on beta, I finish the test and have these processes still running afterwards

$ adb shell b2g-ps
APPLICATION      USER   PID PPID  VSIZE   RSS WCHAN    PC         NAME
b2g              root   105    1 176672 57740 ffffffff 400c9330 S /system/b2g/b2g
FM Radio         app_0  484  105  59732 11744 ffffffff 400db330 S /system/b2g/plugin-container
Clock            app_0  557  105  62812 13136 ffffffff 4004f330 S /system/b2g/plugin-container
Calculator       app_0  582  105  59672 10792 ffffffff 400ec330 S /system/b2g/plugin-container
Feedback         app_0  599  105  58712 11168 ffffffff 400ad330 S /system/b2g/plugin-container
Cost Control     app_0  625  105  60760 14584 ffffffff 40062330 S /system/b2g/plugin-container
(Preallocated a  app_0  652  105  55508  8180 ffffffff 40097330 S /system/b2g/plugin-container
Browser          app_0  667  105  92816 38416 ffffffff 4002f330 S /system/b2g/plugin-container

I don't know whether 128 is a good score on this HW. (Although, the screen turned off several times during the test.) But now we can go find out :).
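Since the benchmark prints no progress, it can help to pull the Browser row out of repeated `b2g-ps` snapshots automatically. The sketch below is one hedged way to parse output shaped like the listing above. `Proc` and `parse_b2g_ps` are hypothetical names, and the column layout (an application name that may contain spaces, followed by nine whitespace-separated fields) is inferred from this one listing rather than from any documentation.

```python
from collections import namedtuple

Proc = namedtuple("Proc", "app user pid ppid vsize_kb rss_kb state")

# Listing copied from the comment above, single-spaced.
SAMPLE = """\
APPLICATION USER PID PPID VSIZE RSS WCHAN PC NAME
b2g root 105 1 176672 57740 ffffffff 400c9330 S /system/b2g/b2g
FM Radio app_0 484 105 59732 11744 ffffffff 400db330 S /system/b2g/plugin-container
Clock app_0 557 105 62812 13136 ffffffff 4004f330 S /system/b2g/plugin-container
Calculator app_0 582 105 59672 10792 ffffffff 400ec330 S /system/b2g/plugin-container
Feedback app_0 599 105 58712 11168 ffffffff 400ad330 S /system/b2g/plugin-container
Cost Control app_0 625 105 60760 14584 ffffffff 40062330 S /system/b2g/plugin-container
(Preallocated a app_0 652 105 55508 8180 ffffffff 40097330 S /system/b2g/plugin-container
Browser app_0 667 105 92816 38416 ffffffff 4002f330 S /system/b2g/plugin-container
"""


def parse_b2g_ps(text):
    """Parse b2g-ps style rows, splitting from the right so names may contain spaces."""
    procs = []
    for line in text.splitlines():
        fields = line.rsplit(None, 9)
        if len(fields) != 10 or not fields[2].isdigit():
            continue  # header, blank line, or anything that is not a process row
        app, user, pid, ppid, vsize, rss, _wchan, _pc, state, _name = fields
        procs.append(Proc(app.strip(), user, int(pid), int(ppid),
                          int(vsize), int(rss), state))
    return procs


if __name__ == "__main__":
    for proc in parse_b2g_ps(SAMPLE):
        if proc.app == "Browser":
            print("Browser pid=%d rss=%d KB vsize=%d KB"
                  % (proc.pid, proc.rss_kb, proc.vsize_kb))
```

Splitting from the right is a deliberate choice here: the application column is the only one that contains spaces in this listing, so the trailing nine fields can be peeled off safely.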
Updated • 12 years ago
Status: REOPENED → RESOLVED
Closed: 12 years ago → 12 years ago
Resolution: --- → WORKSFORME