Steps:
1. Go to the browser
2. Go to http://peacekeeper.futuremark.com/run.action

Expected: The Peacekeeper benchmark should finish while running on wifi with no errors.

Actual: When you get to the first WebGL test (it shows a bubble with rocks going back and forth), it fails to load, saying that the page is not responding.
Noming to block mainly because this is a web-based phone and we're likely to be measured on how well we do on these common tests - it doesn't look good if we can't finish a particular test.
blocking-basecamp: --- → ?
Anything in any logs?
Trying to run this on an Otoro causes the browser to crash at some point during the tests. I can't really identify where, because the tests don't change the URL or otherwise display anything that's visible on the display (other than the gfx results) while they're running.
APITrace? Else we'll probably need to rip it apart manually.
Created attachment 667328
Logcat

Attached a logcat with this reproducing. Nothing really stands out as obvious in the logcat though.
(In reply to Jeff Gilbert [:jgilbert] from comment #4)
> APITrace? Else we'll probably need to rip it apart manually.

Can you explain how I could get an APITrace?
(In reply to Jason Smith [:jsmith] from comment #6)
> (In reply to Jeff Gilbert [:jgilbert] from comment #4)
> > APITrace? Else we'll probably need to rip it apart manually.
>
> Can you explain how I could get an APITrace?

I am not that familiar with running APITrace on mobile devices. Maybe BenWa or Vlad knows more?
Let's block on figuring out why this fails and if it indicates a larger problem.
blocking-basecamp: ? → +
Summary: Cannot complete the peacekeeper benchmark test - fails to load the webgl test on FF OS → Determine why peacekeeper benchmark fails
I'll grab this.
Assignee: nobody → vladimir
So, WebGL is failing in general in many cases and causing the browser to crash. The cube on get.webgl.com works; WebGL Aquarium causes a crash. My current guess is that an OOM condition is caused by WebGL usage, which ends up taking down the browser. Demos such as: https://www.khronos.org/registry/webgl/sdk/demos/google/nvidia-vertex-buffer-object/index.html also cause a crash. (ignore the "bound vertex attribute buffers do not have sufficient size for given indices from the bound element array" errors in that, that's a separate issue)
Chris, is this benchmark a priority from a product standpoint?
blocking-basecamp: + → ?
(In reply to Andrew Overholt [:overholt] from comment #11)
> Chris, is this benchmark a priority from a product standpoint?

I'm less concerned right now about the benchmark issue and more concerned about comment 10. I'd be inclined to finish off the analysis in comment 10, as Vlad's comments imply that there's quite a problem in the WebGL world right now that we should break off into bugs once the analysis here is complete. If we finish off the analysis in comment 10, then I think that's sufficient to close this bug. But I would caution against skipping the analysis, based on what I'm hearing in comment 10.
Yep, I'm still on the case here, but got detoured by some other work. It's going to be a little tricky to figure out if this is indeed the case though, without doing some extensive memory logging.

Do we have anyone who knows how to do fairly low level debugging on these devices (otoro or unagi)? Something is causing a reboot -- I'd like to be able to set a breakpoint in the kernel at that point and just figure out how we got there.
(In reply to Vladimir Vukicevic [:vlad] [:vladv] from comment #13)
> Yep, I'm still on the case here, but got detoured by some other work. It's
> going to be a little tricky to figure out if this is indeed the case though,
> without doing some extensive memory logging.
>
> Do we have anyone who knows how to do fairly low level debugging on these
> devices (otoro or unagi). Something is causing a reboot -- I'd like to be
> able to set a breakpoint in the kernel at that point and just figure out how
> we got there.

cjones, dhylands, likely mwu, probably jlebar
vlad, reboot sounds like a new symptom. Can you STR it?
I think reboot may have been something else. Peacekeeper runs fine on Unagi, so I think I'm going to close this as WORKSFORME. I suspect Otoro is just running out of memory.
Status: NEW → RESOLVED
Last Resolved: 6 years ago
Resolution: --- → WORKSFORME
How much USS does "Browser" end up taking on unagi? Btw, this doesn't fail on unagi because we purposely shipped a misconfigured kernel. When the first FOTA update goes out, we'll ship the properly-configured kernel.
To be clearer, the otoro actually has 256MB of physical RAM. The unagi has 512MB of physical RAM. However, the configuration we care about is 256MB. There's a kernel update we didn't ship for the dogfooders which configures the unagi to pretend to have 256MB. We'll be installing that as soon as we can roll out FOTA updates. I'd like to know how much memory the "Browser" uses, because we have a substantial amount of memory win about to land. We may be able to load this benchmark after those. If it uses more than 110MB USS though, we have pretty much no hope of loading it.
So, watching Browser on Unagi while it's running, I see jumps like this (only showing lines where either vsize or rss changes drastically):

    VSIZE    RSS
    156540   72644
    773760   83240
    153084   66860
    211424   79732
    242160   111664
    230064   98524
    138684   50180
    259884   64624
    332848   69976
    296760   73328
    197732   75956

I think either that 700mb vsize or the 110mb rss is around when webgl was running; unfortunately the test prints no progress so it's hard to tell. However, Browser sits at around 80-90MB RSS for most of the run, so it's very close to the limit you mention.
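For anyone who wants to compare these samples against the 110MB figure, here is a minimal sketch that picks out the peaks from the numbers in the comment above. It assumes the values are in KiB (as ps-style tools usually report them, which the comment does not state), and the helper names are made up for illustration. Note that RSS is not the same as USS, so this only gives an upper bound on what the limit actually measures.

```python
# Samples copied from the comment above as (vsize, rss) pairs.
# Assumption: values are in KiB; the comment does not state the unit.
SAMPLES_KIB = [
    (156540, 72644),
    (773760, 83240),
    (153084, 66860),
    (211424, 79732),
    (242160, 111664),
    (230064, 98524),
    (138684, 50180),
    (259884, 64624),
    (332848, 69976),
    (296760, 73328),
    (197732, 75956),
]


def kib_to_mib(kib):
    """Convert KiB to MiB."""
    return kib / 1024.0


def peak(samples, column):
    """Return (index, value) of the largest entry in the given column.

    column 0 is vsize, column 1 is rss.
    """
    index = max(range(len(samples)), key=lambda i: samples[i][column])
    return index, samples[index][column]


def summarize(samples):
    """Return the peak vsize and rss, each as (sample index, MiB)."""
    vsize_idx, vsize_peak = peak(samples, 0)
    rss_idx, rss_peak = peak(samples, 1)
    return {
        "vsize": (vsize_idx, kib_to_mib(vsize_peak)),
        "rss": (rss_idx, kib_to_mib(rss_peak)),
    }
```

Calling `summarize(SAMPLES_KIB)` shows that the vsize and rss peaks fall on different samples, which fits the uncertainty about which of the two corresponds to the WebGL test.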
Spikes up to 110MB RSS are going to be a challenge, but this makes for a good test case :).
Status: RESOLVED → REOPENED
Resolution: WORKSFORME → ---
Summary: Determine why peacekeeper benchmark fails → Get peacekeeper benchmark running in 256MB RAM
Ha, nice try, buddy! ;)
Assignee: vladimir → nobody
Component: Canvas: WebGL → General
Vlad, do you think we should block the release on this?
Unless we have some reason to believe that it's even /possible/ to run this benchmark within however much ram we have available, I don't think we can sanely block on this.
Tough to say if we can or can't until we figure out what's actually using that memory. I don't know that we have good tools to do that quickly; about:memory might help, if we could get a dump of it every few seconds to see what's going on. One useful thing to do would be to figure out why it's exiting, that is, which allocation is failing with 256MB.
> about:memory might help, if we could get a dump of it every few seconds to see what's
> going on.

Sure, run $B2G_ROOT/tools/get_about_memory.py in a loop.
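A rough sketch of what "in a loop" could look like, for reference. The `dump_in_loop` helper, the sample count, and the interval are all hypothetical; the only thing taken from this bug is the script path. It also assumes the script can be invoked directly with no arguments, which has not been checked here.

```python
import subprocess
import time


def dump_in_loop(command, count, interval_s,
                 run=subprocess.call, sleep=time.sleep):
    """Run `command` `count` times, pausing `interval_s` seconds between runs.

    `run` and `sleep` are injectable so the loop can be exercised without
    a device attached. Returns the list of exit codes, one per run.
    """
    exit_codes = []
    for i in range(count):
        exit_codes.append(run(command))
        if i < count - 1:
            # No need to wait after the final dump.
            sleep(interval_s)
    return exit_codes


# Hypothetical usage while the benchmark is running (values are guesses):
#
#   import os
#   script = os.path.join(os.environ["B2G_ROOT"], "tools", "get_about_memory.py")
#   dump_in_loop([script], count=20, interval_s=5)
```

Each dump takes some time and memory of its own, so a very short interval may perturb the measurement.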
Just realized the bug title is ambiguous, fixed. (I don't care about peacekeeper itself at all, it's just "fat code".)
Summary: Get peacekeeper benchmark running in 256MB RAM → Optimize gecko to run unmodified peacekeeper benchmark in 256MB RAM
If you don't care about Peacekeeper (I certainly don't), can we pick something we do care about as an example of "fat code" to optimize for? Like Cut the Rope, or a WebGL game, or something? I'd rather optimize for something we care about and be pleasantly surprised if we change something so Peacekeeper fits in 256mb of RAM than optimize for this benchmark and hope that it will translate to a page/app we care about.
All I want to do is see if there's something dumb that peacekeeper triggers that we can optimize. Totally agreed it's way down on the list.
It doesn't sound like this is a terribly high priority for anyone, so I'm moving this to blocking-. Justin offered to help anyone who is interested in doing measurement with the various tools that we have. But beyond that his plan was to run tests on actual apps instead which I agree is a higher priority.
blocking-basecamp: ? → -
Created attachment 686383
Victoire

With bug 810719 flipped on on beta, I finish the test and have these processes still running afterwards

$ adb shell b2g-ps
APPLICATION      USER   PID  PPID  VSIZE   RSS    WCHAN     PC        NAME
b2g              root   105  1     176672  57740  ffffffff  400c9330  S /system/b2g/b2g
FM Radio         app_0  484  105   59732   11744  ffffffff  400db330  S /system/b2g/plugin-container
Clock            app_0  557  105   62812   13136  ffffffff  4004f330  S /system/b2g/plugin-container
Calculator       app_0  582  105   59672   10792  ffffffff  400ec330  S /system/b2g/plugin-container
Feedback         app_0  599  105   58712   11168  ffffffff  400ad330  S /system/b2g/plugin-container
Cost Control     app_0  625  105   60760   14584  ffffffff  40062330  S /system/b2g/plugin-container
(Preallocated a  app_0  652  105   55508   8180   ffffffff  40097330  S /system/b2g/plugin-container
Browser          app_0  667  105   92816   38416  ffffffff  4002f330  S /system/b2g/plugin-container

I don't know whether 128 is a good score on this HW. (Although, the screen turned off several times during the test.) But now we can go find out :).
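In case it helps with follow-up measurement, here is a small sketch that parses the b2g-ps listing above into per-process numbers. The parser and its field layout are assumptions inferred from this one listing, not from the b2g-ps source: it treats the last nine whitespace-separated fields as USER through NAME and everything before them as the application name, which is how names with spaces such as "FM Radio" are handled.

```python
# Listing copied from the comment above. Assumption: VSIZE and RSS are in KiB.
B2G_PS_OUTPUT = """\
APPLICATION USER PID PPID VSIZE RSS WCHAN PC NAME
b2g root 105 1 176672 57740 ffffffff 400c9330 S /system/b2g/b2g
FM Radio app_0 484 105 59732 11744 ffffffff 400db330 S /system/b2g/plugin-container
Clock app_0 557 105 62812 13136 ffffffff 4004f330 S /system/b2g/plugin-container
Calculator app_0 582 105 59672 10792 ffffffff 400ec330 S /system/b2g/plugin-container
Feedback app_0 599 105 58712 11168 ffffffff 400ad330 S /system/b2g/plugin-container
Cost Control app_0 625 105 60760 14584 ffffffff 40062330 S /system/b2g/plugin-container
(Preallocated a app_0 652 105 55508 8180 ffffffff 40097330 S /system/b2g/plugin-container
Browser app_0 667 105 92816 38416 ffffffff 4002f330 S /system/b2g/plugin-container
"""


def parse_b2g_ps(text):
    """Parse b2g-ps style output into a list of dicts, skipping the header."""
    rows = []
    for line in text.splitlines()[1:]:
        fields = line.split()
        if len(fields) < 10:
            continue  # not a process row
        # Trailing nine fields: USER PID PPID VSIZE RSS WCHAN PC STATE NAME.
        app = " ".join(fields[:-9])
        user, pid, ppid, vsize, rss = fields[-9:-4]
        rows.append({
            "app": app,
            "user": user,
            "pid": int(pid),
            "ppid": int(ppid),
            "vsize": int(vsize),
            "rss": int(rss),
        })
    return rows


def total_rss(rows):
    """Sum RSS across rows. Shared pages are counted once per process."""
    return sum(row["rss"] for row in rows)
```

The summed RSS overstates real usage because pages shared between processes are counted more than once, so it is only useful for rough before/after comparisons.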
Status: REOPENED → RESOLVED
Resolution: --- → WORKSFORME