Closed Bug 795985 Opened 12 years ago Closed 12 years ago

Optimize gecko to run unmodified peacekeeper benchmark in 256MB RAM

Categories: Core :: General, defect
Version: 18 Branch
Hardware: ARM
OS: Gonk (Firefox OS)
Type: defect
Priority: Not set
Severity: normal

Tracking:
Status: RESOLVED WORKSFORME
blocking-basecamp: -

People: Reporter: jsmith; Assignee: Unassigned

Whiteboard: [MemShrink:P3]

Attachments: 2 files

Steps:

1. Go to the browser
2. Go to http://peacekeeper.futuremark.com/run.action

Expected:

The Peacekeeper benchmark should finish with no errors while running on WiFi.

Actual:

When you get to the first WebGL test (it shows a bubble with rocks going back and forth), it fails to load saying that the page is not responding.
Nominating to block mainly because this is a web-based phone and we're likely to be measured on how well we do on these common tests - it doesn't look good if we can't finish a particular test.
blocking-basecamp: --- → ?
Trying to run this on an Otoro causes the browser to crash at some point during the tests.  I can't really identify where, because the tests don't change the URL or otherwise display anything that's visible on the display (other than the gfx results) while they're running.
APITrace? Else we'll probably need to rip it apart manually.
Attached file Logcat
Attached a logcat with this reproducing. Nothing really stands out as obvious in the logcat though.
(In reply to Jeff Gilbert [:jgilbert] from comment #4)
> APITrace? Else we'll probably need to rip it apart manually.

Can you explain how I could get an APITrace?
(In reply to Jason Smith [:jsmith] from comment #6)
> (In reply to Jeff Gilbert [:jgilbert] from comment #4)
> > APITrace? Else we'll probably need to rip it apart manually.
> 
> Can you explain how I could get an APITrace?

I am not that familiar with running APITrace on mobile devices. Maybe BenWa or Vlad knows more?
Let's block on figuring out why this fails and if it indicates a larger problem.
blocking-basecamp: ? → +
Summary: Cannot complete the peacekeeper benchmark test - fails to load the webgl test on FF OS → Determine why peacekeeper benchmark fails
I'll grab this.
Assignee: nobody → vladimir
So, WebGL is failing in general in many cases and causing the browser to crash.  The cube on get.webgl.com works; WebGL Aquarium causes a crash.

My current guess is that an OOM condition is caused by WebGL usage, which ends up taking down the browser.  Demos such as:

https://www.khronos.org/registry/webgl/sdk/demos/google/nvidia-vertex-buffer-object/index.html

also cause a crash.  (Ignore the "bound vertex attribute buffers do not have sufficient size for given indices from the bound element array" errors there; that's a separate issue.)
Chris, is this benchmark a priority from a product standpoint?
blocking-basecamp: + → ?
Flags: needinfo?(clee)
(In reply to Andrew Overholt [:overholt] from comment #11)
> Chris, is this benchmark a priority from a product standpoint?

I think I'm less concerned right now about the benchmark issue itself and more concerned about comment 10. I'd be inclined to finish off the analysis in comment 10, as Vlad's comments imply there's quite a problem in the WebGL world right now that we should break out into separate bugs once that analysis is complete. If we finish the analysis in comment 10, I think that's sufficient to close this bug. But based on what I'm hearing in comment 10, I would caution against skipping that analysis.
Yep, I'm still on the case here, but got detoured by some other work.  It's going to be a little tricky to figure out if this is indeed the case though, without doing some extensive memory logging.

Do we have anyone who knows how to do fairly low-level debugging on these devices (otoro or unagi)?  Something is causing a reboot -- I'd like to be able to set a breakpoint in the kernel at that point and just figure out how we got there.
(In reply to Vladimir Vukicevic [:vlad] [:vladv] from comment #13)
> Yep, I'm still on the case here, but got detoured by some other work.  It's
> going to be a little tricky to figure out if this is indeed the case though,
> without doing some extensive memory logging.
> 
> Do we have anyone who knows how to do fairly low-level debugging on these
> devices (otoro or unagi)?  Something is causing a reboot -- I'd like to be
> able to set a breakpoint in the kernel at that point and just figure out how
> we got there.

cjones, dhylands, likely mwu, probably jlebar
vlad, reboot sounds like a new symptom.  Can you STR it?
I think reboot may have been something else.

Peacekeeper runs fine on Unagi, so I think I'm going to close this as WORKSFORME.  I suspect Otoro is just running out of memory.
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → WORKSFORME
How much USS does "Browser" end up taking on unagi?

Btw, the only reason this doesn't fail on unagi is that we purposely shipped a misconfigured kernel.  When the first FOTA update goes out, we'll ship the properly-configured kernel.
To be clearer, the otoro actually has 256MB of physical RAM.  The unagi has 512MB of physical RAM.

However, the configuration we care about is 256MB.  There's a kernel update we didn't ship for the dogfooders which configures the unagi to pretend to have 256MB.  We'll be installing that as soon as we can roll out FOTA updates.

I'd like to know how much memory the "Browser" uses, because we have a substantial amount of memory win about to land.  We may be able to load this benchmark after those.  If it uses more than 110MB USS though, we have pretty much no hope of loading it.
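
(Not a definitive recipe, but as a minimal sketch: assuming the standard Android procrank tool is present in the Gonk image, the Browser's USS can be read off procrank's last column:

$ adb shell procrank | grep plugin-container

This lists every plugin-container process, so match against the Browser's PID.)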
So, watching Browser on Unagi while it's running, I see jumps like this (only showing lines where either vsize or rss changes drastically):

   VSIZE (KB)  RSS (KB)
   156540 72644 
   773760 83240 
   153084 66860 
   211424 79732 
   242160 111664 
   230064 98524 
   138684 50180 
   259884 64624
   332848 69976
   296760 73328
   197732 75956

I think either that 700MB VSIZE or the 110MB RSS is around when WebGL was running; unfortunately the test prints no progress so it's hard to tell.  However, Browser sits at around 80-90MB RSS for most of the run, so it's very close to the limit you mention.
Spikes up to 110MB RSS are going to be a challenge, but this makes for a good test case :).
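
(For reference, a rough sketch of how samples like the ones above can be gathered; the on-device ps column layout varies between Android builds, so the grep target here is an assumption:

$ while true; do adb shell ps | grep plugin-container; sleep 0.5; done

The VSIZE/RSS columns in that output correspond to the numbers in the table above.)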
Blocks: slim-fast
Status: RESOLVED → REOPENED
Flags: needinfo?(clee)
Resolution: WORKSFORME → ---
Summary: Determine why peacekeeper benchmark fails → Get peacekeeper benchmark running in 256MB RAM
Ha, nice try, buddy! ;)
Assignee: vladimir → nobody
Component: Canvas: WebGL → General
Whiteboard: [MemShrink]
Vlad, do you think we should block the release on this?
Flags: needinfo?(vladimir)
Unless we have some reason to believe that it's even /possible/ to run this benchmark within however much ram we have available, I don't think we can sanely block on this.
Tough to say if we can or can't until we figure out what's actually using that memory.  I don't know that we have good tools to do that quickly; about:memory might help, if we could get a dump of it every few seconds to see what's going on.  One useful thing to do would be to figure out why it's exiting -- that is, which allocation is failing with 256MB.
Flags: needinfo?(vladimir)
> about:memory might help, if we could get a dump of it every few seconds to see what's 
> going on.

Sure, run $B2G_ROOT/tools/get_about_memory.py in a loop.
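
A minimal sketch of such a loop, assuming $B2G_ROOT points at a B2G checkout and the device is connected over adb; the iteration count and interval are arbitrary:

$ cd "$B2G_ROOT"
$ for i in $(seq 1 60); do ./tools/get_about_memory.py; sleep 5; done

Each iteration should leave a separate set of memory reports on the host that can be compared afterwards to see which allocations grow during the benchmark.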
Just realized the bug title is ambiguous, fixed.  (I don't care about peacekeeper itself at all, it's just "fat code".)
Summary: Get peacekeeper benchmark running in 256MB RAM → Optimize gecko to run unmodified peacekeeper benchmark in 256MB RAM
If you don't care about Peacekeeper (I certainly don't), can we pick something we do care about as an example of "fat code" to optimize for?  Like Cut the Rope, or a WebGL game, or something?

I'd rather optimize for something we care about and be pleasantly surprised if we change something so Peacekeeper fits in 256mb of RAM than optimize for this benchmark and hope that it will translate to a page/app we care about.
All I want to do is see if there's something dumb that peacekeeper triggers that we can optimize.

Totally agreed it's way down on the list.
It doesn't sound like this is a terribly high priority for anyone, so I'm moving this to blocking-.

Justin offered to help anyone who is interested in doing measurements with the various tools that we have. But beyond that, his plan was to run tests on actual apps instead, which I agree is a higher priority.
blocking-basecamp: ? → -
Whiteboard: [MemShrink] → [MemShrink:P3]
With the fix from bug 798491, we make it into the WebGL tests.  I see us dying from a null pointer deref in

F/libc    ( 1174): Fatal signal 11 (SIGSEGV) at 0x00000000 (code=1)
E/OMXCodec( 1174): Attempting to allocate OMX node 'OMX.google.avc.decoder'
E/GeckoConsole( 1174): [JavaScript Warning: "Media resource http://peacekeeper.futuremark.com/resources/videos/riverfly01/riverfly01.mp4 could not be decoded." {file: "http://peacekeeper.futuremark.com/runTest.action" line: 0}]
E/GeckoConsole(  106): [JavaScript Warning: "Media resource http://peacekeeper.futuremark.com/resources/videos/riverfly01/riverfly01.mp4 could not be decoded." {file: "http://peacekeeper.futuremark.com/runTest.action" line: 0}]
D/memalloc(  106): /dev/pmem: Allocated buffer base:0x4a500000 size:348160 offset:2461696 fd:92
D/memalloc( 1174): /dev/pmem: Mapped buffer base:0x46000000 size:2809856 offset:2461696 fd:36
I/DEBUG   (  109): *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
I/DEBUG   (  109): Build fingerprint: 'toro/full_otoro/otoro:4.0.4.0.4.0.4/OPENMASTER/eng.cjones.20121026.170606:user/test-keys'
I/DEBUG   (  109): pid: 1174, tid: 5258  >>> /system/b2g/plugin-container <<<
I/DEBUG   (  109): signal 11 (SIGSEGV), code 1 (SEGV_MAPERR), fault addr 00000000
I/DEBUG   (  109):  r0 fffffffc  r1 425e4c29  r2 425e5040  r3 00000000
I/DEBUG   (  109):  r4 43ae98a0  r5 43c2bb84  r6 43c578e0  r7 00000000
I/DEBUG   (  109):  r8 00000000  r9 4382fd30  10 4106ff71  fp 4106ff6a
I/DEBUG   (  109):  ip 400317b4  sp 4382fc98  lr 425e4c11  pc 00000000  cpsr 00000010

It's certainly possible this is a failed allocation, but the last two 4Hz samples of memory usage at the crash were

Browser          app_0     1174  106   120860 68688 ffffffff 400e6594 R /system/b2g/plugin-container
Browser          app_0     1174  106   121008 64468 ffffffff 400e8330 S /system/b2g/plugin-container

so we're certainly not under memory pressure.  This looks more like one of the decoder bugs.
Attached image Victoire
With bug 810719 flipped on on beta, I finish the test and have these processes still running afterwards

$ adb shell b2g-ps
APPLICATION      USER     PID   PPID  VSIZE  RSS     WCHAN    PC         NAME
b2g              root      105   1     176672 57740 ffffffff 400c9330 S /system/b2g/b2g
FM Radio         app_0     484   105   59732  11744 ffffffff 400db330 S /system/b2g/plugin-container
Clock            app_0     557   105   62812  13136 ffffffff 4004f330 S /system/b2g/plugin-container
Calculator       app_0     582   105   59672  10792 ffffffff 400ec330 S /system/b2g/plugin-container
Feedback         app_0     599   105   58712  11168 ffffffff 400ad330 S /system/b2g/plugin-container
Cost Control     app_0     625   105   60760  14584 ffffffff 40062330 S /system/b2g/plugin-container
(Preallocated a  app_0     652   105   55508  8180  ffffffff 40097330 S /system/b2g/plugin-container
Browser          app_0     667   105   92816  38416 ffffffff 4002f330 S /system/b2g/plugin-container

I don't know whether 128 is a good score on this HW.  (Although, the screen turned off several times during the test.)  But now we can go find out :).
Status: REOPENED → RESOLVED
Closed: 12 years ago12 years ago
Resolution: --- → WORKSFORME