806641 - Lots of overflow:scroll divs cause Otoro device to reboot or cause the "Well, this is embarrassing :( We tried to display this webpage, but it's not responding." message, due to running out of pmem

Reporter

Description

•

13 years ago

See url testcase. I'm also seeing this with: http://people.mozilla.org/~mwargers/tests/performance/overflowscrollscrolling_rotate.htm I guess those pages cause the browser to take up too much memory or something.

Justin Lebar (not reading bugmail)

Comment 1

•

13 years ago

We really need some way for you to differentiate between OOM crashes and "real" crashes. We've been discussing this, but I don't think any bug has been filed. I'll file one.

Justin Lebar (not reading bugmail)

Comment 2

•

13 years ago

I filed bug 807007.

Justin Lebar (not reading bugmail)

Comment 3

•

13 years ago

Probably an OOM. We should kill the browser, but not the main process. That's bug 780437.

Depends on: 780437

Chris Jones [:cjones] inactive; ni?/f?/r? if you need me

Comment 4

•

13 years ago

The bug title says this sometimes causes b2g process crashes. Those aren't OOM.

Justin Lebar (not reading bugmail)

Comment 5

•

13 years ago

(In reply to Chris Jones [:cjones] [:warhammer] from comment #4)
> The bug title says this sometimes causes b2g process crashes. Those aren't
> OOM.

How do you know that we're not OOMing in the browser and shooting the main process?

Vivien Nicolas (:vingtetun) (:21) - (NOT reading bugmails, needinfo? please)

Comment 6

•

13 years ago

(In reply to Martijn Wargers [:mw22] (QA - IRC nick: mw22) from comment #0)
> See url testcase.
> I'm also seeing this with:
> http://people.mozilla.org/~mwargers/tests/performance/overflowscrollscrolling_rotate.htm
>
> I guess those pages cause the browser to take up too much memory or something.

Can you try adding a file named custom-prefs.js in the root of the Gaia folder with a line containing user_pref("ui.showHideScrollbars", false) and see if you can still reproduce? I would like to see whether the root cause of the bug is some code I added a while ago.

Chris Jones [:cjones] inactive; ni?/f?/r? if you need me

Comment 7

•

13 years ago

(In reply to Justin Lebar [:jlebar] from comment #5)
> (In reply to Chris Jones [:cjones] [:warhammer] from comment #4)
> > The bug title says this sometimes causes b2g process crashes. Those aren't
> > OOM.
>
> How do you know that we're not OOMing in the browser and shooting the main
> process?

If the oomkiller is being invoked then that's possible.

Justin Lebar (not reading bugmail)

Comment 8

•

13 years ago

We seem to be exhausting pmem with this testcase. logcat says:

> D/memalloc( 1406): /dev/pmem: Allocated buffer base:0x49b5a000 size:471040 offset:3694592 fd:162
> D/memalloc( 1560): /dev/pmem: Mapped buffer base:0x45e00000 size:4165632 offset:3694592 fd:51
> D/memalloc( 1560): /dev/pmem: Unmapping buffer base:0x44712000 size:3080192 offset:2834432
> D/memalloc( 1406): /dev/pmem: Freeing buffer base:0x49e0e000 size:245760 offset:2834432 fd:111
> D/memalloc( 1406): /dev/pmem: Freeing buffer base:0x49c6f000 size:245760 offset:1134592 fd:171
> D/memalloc( 1406): /dev/pmem: Allocated buffer base:0x49b5a000 size:143360 offset:2834432 fd:111
[many, many similar lines]
> D/memalloc( 1406): Out of PMEM. Dumping PMEM stats for debugging
> D/memalloc( 1406): ------------- PRINT PMEM STATS --------------
> D/memalloc( 1406): Node 0 -> Start Address : 0 Size 9600 Free info 0
> D/memalloc( 1406): Node 1 -> Start Address : 9600 Size 9600 Free info 0
> D/memalloc( 1406): Node 2 -> Start Address : 19200 Size 2048 Free info 0
> D/memalloc( 1406): Node 3 -> Start Address : 21248 Size 2048 Free info 0
[snip]
> D/memalloc( 1406): Total Allocated: Total Free:
> D/memalloc( 1406): ----------------------------------------------
> E/memalloc( 1406): /dev/pmem: No more pmem available
> W/memalloc( 1406): Falling back to ashmem
> D/memalloc( 1406): ashmem: Allocated buffer base:0x42d38000 size:32768 fd:937
> D/memalloc( 1560): ashmem: Mapped buffer base:0x42520000 size:32768 fd:183
> D/memalloc( 1406): Out of PMEM. Dumping PMEM stats for debugging
[snip]

At this point, every ashmem alloc prints 256+ lines of pmem debugging info to logcat, which is probably not desirable.

Then eventually (in this case) the child segfaults:

> E/memalloc( 1406): /dev/pmem: No more pmem available
> W/memalloc( 1406): Falling back to ashmem
> D/memalloc( 1406): ashmem: Allocated buffer base:0x434b8000 size:16384 fd:1013
> I/Gecko ( 1560): [Child 1560] WARNING: SCM_RIGHTS message was truncated cmsg_len:16 fd:15: file ../../../../ff-git/src/ipc/chromium/src/chrome/common/ipc_channel_posix.cc, line 479
> F/libc ( 1560): Fatal signal 11 (SIGSEGV) at 0x00000030 (code=1)

I'll see if I can capture a trace where the parent dies.
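For anyone wanting to tally these memalloc lines rather than read them by eye, here is a minimal Python sketch. It is not part of any B2G tooling; the function name `summarize` and the regular expression are assumptions based only on the line format quoted above. Mapped/Unmapping lines are ignored on the assumption that they describe another process's view of buffers that were already counted when allocated.

```python
import re
from collections import defaultdict

# Assumed line shape, taken from the logcat excerpt above, e.g.:
# D/memalloc( 1406): /dev/pmem: Allocated buffer base:0x49b5a000 size:471040 offset:3694592 fd:162
LINE_RE = re.compile(
    r"D/memalloc\(\s*(?P<pid>\d+)\):\s+(?P<pool>/dev/pmem|ashmem):\s+"
    r"(?P<op>Allocated|Freeing|Mapped|Unmapping) buffer\b.*?\bsize:(?P<size>\d+)"
)


def summarize(lines):
    """Return {(pid, pool): net bytes} from Allocated and Freeing lines.

    `summarize` is a hypothetical helper name. Lines that do not match
    the assumed format are skipped silently.
    """
    net = defaultdict(int)
    for line in lines:
        m = LINE_RE.search(line)
        if not m:
            continue
        key = (int(m["pid"]), m["pool"])
        if m["op"] == "Allocated":
            net[key] += int(m["size"])
        elif m["op"] == "Freeing":
            net[key] -= int(m["size"])
        # Mapped/Unmapping are deliberately not counted (see note above).
    return dict(net)
```

Feeding it a saved logcat (for example `summarize(open("logcat.txt"))`, where the file name is a placeholder) gives a per-process, per-pool net figure. On a partial log the absolute numbers mean little, but a steadily growing ashmem total after the "No more pmem available" line would match the fallback behaviour described above.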

Justin Lebar (not reading bugmail)

Comment 9

•

13 years ago

Attached file logcat of child process segfaulting — Details

Justin Lebar (not reading bugmail)

Comment 10

•

13 years ago

And here's us dying in the main process.

> E/memalloc( 1406): /dev/pmem: No more pmem available
> W/memalloc( 1406): Falling back to ashmem
> D/memalloc( 1406): ashmem: Allocated buffer base:0x4da92000 size:8192 fd:1023
> E/libgenlock( 1406): genlock_create_lock: open genlock device failed (err=Too many open files)
> E/msm7627a.gralloc( 1406): alloc_impl: genlock_create_lock failed
> E/memalloc( 1406): getAllocator: Invalid flags passed: 0x0
> F/libc ( 1920): Fatal signal 11 (SIGSEGV) at 0x00000030 (code=1)

Notice that we're running out of fd's in the main process.
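The last ashmem buffer in that trace landed on fd:1023, and the next open fails with "Too many open files". A quick way to compare a process's descriptor count against its limit on Linux is sketched below in Python. This is an illustration only: the helper names `fd_usage` and `fd_soft_limit` are made up here, and on a device one would more likely inspect /proc/<pid>/fd from a shell.

```python
import os
import resource


def fd_usage(pid="self"):
    """Count open file descriptors by listing /proc/<pid>/fd (Linux only).

    For pid="self" the count includes the descriptor that os.listdir
    opens temporarily to read the directory, so it is one higher than
    the steady-state value.
    """
    return len(os.listdir(f"/proc/{pid}/fd"))


def fd_soft_limit():
    """Return the soft RLIMIT_NOFILE of the current process."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    return soft


if __name__ == "__main__":
    print(f"{fd_usage()} of {fd_soft_limit()} file descriptors in use")
```

Passing another process ID, as in `fd_usage(1406)`, reads that process's table instead, subject to the usual permission checks.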

Justin Lebar (not reading bugmail)

Updated

•

13 years ago

Summary: Lots of overflow:scroll divs cause Otoro device to reboot or cause the "Well, this is embarrassing :( We tried to display this webpage, but it's not responding." message → Lots of overflow:scroll divs cause Otoro device to reboot or cause the "Well, this is embarrassing :( We tried to display this webpage, but it's not responding." message, due to running out of pmem

Justin Lebar (not reading bugmail)

Comment 11

•

13 years ago

This isn't an OOM, so removing the dependency on bug 780437.

No longer depends on: 780437

Justin Lebar (not reading bugmail)

Comment 12

•

13 years ago

Now that we understand the cause here, I'd like this to be re-triaged. I'm afraid that other, less-contrived sites might also trigger a main-process out-of-fd's crash.

blocking-basecamp: --- → ?

David Scravaglieri [:scravag]

Updated

•

13 years ago

blocking-basecamp: ? → +

Priority: -- → P2

David Scravaglieri [:scravag]

Updated

•

13 years ago

Assignee: nobody → ben

Alex Keybl [:akeybl]

Updated

•

13 years ago

Target Milestone: --- → B2G C3 (12dec-1jan)

Justin Lebar (not reading bugmail)

Comment 13

•

13 years ago

> Assignee: nobody@mozilla.org → ben@krellian.com

I don't think Ben Francis is the right assignee here. This appears to be an issue in our graphics pipeline, not a Gaia issue.

Assignee: ben → nobody

David Scravaglieri [:scravag]

Updated

•

13 years ago

blocking-basecamp: + → ?

Component: Gaia::Browser → General

Justin Lebar (not reading bugmail)

Comment 14

•

13 years ago

Did you mean to reset the blocking flag?

Milan Sreckovic [:milan] (needinfo for best results)

Updated

•

13 years ago

blocking-basecamp: ? → +

Whiteboard: [b2g-gfx]

Milan Sreckovic [:milan] (needinfo for best results)

Comment 15

•

13 years ago

Let's find out if we're leaking, or we're really running out of file descriptors, in which case we can apparently increase the limit from 1024 to something larger.

Assignee: nobody → bgirard
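As a rough illustration of the "increase the limit" option, the sketch below raises the soft descriptor limit of the current process toward a requested value, never past the hard limit. It is written in Python for brevity and says nothing about how Gecko or the B2G init scripts would actually do this; the function name `raise_fd_limit` is hypothetical. Only the standard `resource.getrlimit` and `resource.setrlimit` calls are used.

```python
import resource


def raise_fd_limit(target):
    """Raise the soft RLIMIT_NOFILE toward `target`, capped at the hard limit.

    Never lowers the limit. Returns the soft limit in effect afterwards.
    `raise_fd_limit` is a hypothetical helper name for this sketch.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if hard != resource.RLIM_INFINITY:
        target = min(target, hard)
    if soft != resource.RLIM_INFINITY and target > soft:
        # An unprivileged process may raise its soft limit up to the hard limit.
        resource.setrlimit(resource.RLIMIT_NOFILE, (target, hard))
    return resource.getrlimit(resource.RLIMIT_NOFILE)[0]
```

A larger limit only postpones the failure if descriptors are genuinely being leaked, which is why the leak question in the comment above comes first.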

Benoit Girard (:BenWa)

Comment 16

•

13 years ago

(In reply to Justin Lebar [:jlebar] from comment #10)
> And here's us dying in the main process.
>
> > E/memalloc( 1406): /dev/pmem: No more pmem available
> > W/memalloc( 1406): Falling back to ashmem
> > D/memalloc( 1406): ashmem: Allocated buffer base:0x4da92000 size:8192 fd:1023
> > E/libgenlock( 1406): genlock_create_lock: open genlock device failed (err=Too many open files)
> > E/msm7627a.gralloc( 1406): alloc_impl: genlock_create_lock failed
> > E/memalloc( 1406): getAllocator: Invalid flags passed: 0x0
> > F/libc ( 1920): Fatal signal 11 (SIGSEGV) at 0x00000030 (code=1)
>
> Notice that we're running out of fd's in the main process.

I know shmems on mac allocated a fd and I suppose that's also the case on b2g. We're liking allocating too many shmem which cause us to run out of fd.

Benoit Girard (:BenWa)

Comment 17

•

13 years ago

I'm having a bit of trouble setting up an up-to-date b2g environment. While that's waiting, I think what needs to happen here is that we put a bound on the number of active layers, so a script can't cause something crazy like the 500 active layers in this test case. With 500 active layers, chances are we are better off just painting everything manually as a single layer. I think this is an important bug but it shouldn't block b2g.
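The bounding idea can be sketched in a few lines of Python. Everything here is hypothetical: the function name `choose_active_layers`, the `(name, area)` tuples, and in particular the choice to keep the largest candidates active are assumptions made for illustration, since the comment only asks for some bound. It does not describe Gecko's actual layer-building code.

```python
def choose_active_layers(candidates, max_active):
    """Split layer candidates into (active, flattened) under a hard cap.

    candidates: list of (name, area) tuples for elements that would
        otherwise each get their own active layer.
    max_active: maximum number of active layers allowed.

    Ranking by area is an assumption of this sketch. Candidates beyond
    the cap would be painted into a shared layer instead.
    """
    if max_active < 0:
        raise ValueError("max_active must be non-negative")
    ranked = sorted(candidates, key=lambda c: c[1], reverse=True)
    return ranked[:max_active], ranked[max_active:]
```

With a cap in place, a page like this testcase would hold a bounded number of layer buffers, and therefore a bounded number of descriptors, regardless of how many scrollable divs it creates.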

Benoit Girard (:BenWa)

Comment 18

•

13 years ago

(In reply to Benoit Girard (:BenWa) from comment #16)
> I know shmems on mac allocated a fd and I suppose that's also the case on
> b2g. We're liking allocating too many shmem which cause us to run out of fd.

s/liking/likely

Milan Sreckovic [:milan] (needinfo for best results)

Comment 19

•

13 years ago

Thanks Benoit. From this description, this is not a "leak" as such, just a fairly unrealistic scenario. Re-noming so that we can discuss it, but it doesn't sound like a blocker to me either.

Assignee: bgirard → nobody

blocking-basecamp: + → ?

Lawrence Mandel [:lmandel] (use needinfo)

Updated

•

13 years ago

blocking-basecamp: ? → -

tracking-b2g18: --- → +

Milan Sreckovic [:milan] (needinfo for best results)

Comment 20

•

12 years ago

Is this still a problem? Adding qawanted.

Keywords: qawanted

Whiteboard: [b2g-gfx]

gbennett

Updated

•

12 years ago

QA Contact: gbennett

gbennett

Comment 21

•

12 years ago

- Does not repro - 9/27

Environmental Variables
Device: Buri 1.3 mozRIL
Build ID: 20130927040201
Gecko: http://hg.mozilla.org/mozilla-central/rev/e4cd2242cc7d
Gaia: 64ba02f7bbf70a1877ba9dad6889a17cd25b1d35
Platform Version: 27.0a1

Tried for ~10min with no issues.

Keywords: qawanted

Jason Smith [:jsmith]

Updated

•

12 years ago

Status: NEW → RESOLVED

Closed: 12 years ago

Resolution: --- → WORKSFORME