20743 - Need low-memory handling code to avoid crash

Assignee

Description

•

25 years ago

We need to have some code that deals with low-memory situations. This code should detect low-memory situations, and respond by flushing various caches (imglib etc). Cc:ing pierre to get input on the embedding situation, since this is something that an embedder might already have code for, but needs to call into Gecko to tell us to flush caches.

Pierre Saslawsky

Comment 1

•

25 years ago

It should work both ways. If we detect a low memory condition, we could ask the application to free some caches too. It makes me think that we should write some sample code to use Gecko as a separate component instead of linking it as static libraries into the app's project so that the memory allocators don't interfere.

Simon Fraser [no longer active]

Assignee

Updated

•

25 years ago

Status: NEW → ASSIGNED

Target Milestone: M13

Simon Fraser [no longer active]

Assignee

Comment 2

•

25 years ago

M13. Not sure where this code should live.

Pierre Saslawsky

Comment 3

•

25 years ago

The embedding code is now in lib/mac/embedding.

Simon Fraser [no longer active]

Assignee

Updated

•

25 years ago

Blocks: 19187

rubydoo123

Updated

•

25 years ago

Summary: Need low-memory handling code → [DOGFOOD]Need low-memory handling code

Target Milestone: M13 → M12

rubydoo123

Comment 4

•

25 years ago

another PDT+ is dependent on this one, either this needs to be PDT+, or we need to downgrade 19187

leger

Updated

•

25 years ago

Whiteboard: [PDT-]

leger

Comment 5

•

25 years ago

Putting this on the PDT- radar. We agreed to not fix, but workaround 19187 by increasing partition size.

rubydoo123

Updated

•

25 years ago

Target Milestone: M12 → M13

rubydoo123

Comment 6

•

25 years ago

moving to m13

Simon Fraser [no longer active]

Assignee

Updated

•

25 years ago

Target Milestone: M13 → M14

Simon Fraser [no longer active]

Assignee

Comment 7

•

25 years ago

I'm on vacation until Jan 10; moving this to M14

rubydoo123

Updated

•

25 years ago

Target Milestone: M14 → M16

rubydoo123

Comment 8

•

25 years ago

moving to m16 - load balance

leger

Comment 9

•

25 years ago

Putting dogfood in the keyword field.

Keywords: dogfood

Michael Lowe

Updated

•

25 years ago

Summary: [DOGFOOD]Need low-memory handling code → Need low-memory handling code

Simon Fraser [no longer active]

Assignee

Updated

•

25 years ago

Blocks: 4201

rubydoo123

Comment 10

•

25 years ago

m17

Target Milestone: M16 → M17

Simon Fraser [no longer active]

Assignee

Comment 11

•

25 years ago

Make jrgm QA contact, since he sees crashes resulting from low memory sitations frequently. Brendan: this bug calls for some memory pressure API. Nominating for nsbeta2, since jrgm sees low memory crashes frequently.

Keywords: nsbeta2

QA Contact: leger → jrgm

Summary: Need low-memory handling code → Need low-memory handling code to avoid crash

rubydoo123

Updated

•

24 years ago

Whiteboard: [PDT-]

leger

Comment 12

•

24 years ago

Putting on [nsbeta2-][dogfood-] radar. Puttig on nsbeta3 keyword for work on that release.

Keywords: nsbeta3

Whiteboard: [nsbeta2-][dogfood-]

Brendan Eich [:brendan]

Updated

•

24 years ago

Keywords: arch

Warren Harris

Comment 13

•

24 years ago

I'd like to add something to nsIAllocator to allow memory flushers to be registered. Is this what you had in mind Simon?

Simon Fraser [no longer active]

Assignee

Comment 14

•

24 years ago

Yes. And I need to do some mac code to detect and respond to low-memory situations.

Warren Harris

Comment 15

•

24 years ago

Simon: I reworked nsIAllocator to be nsIMemory, which now includes the additional methods: void registerObserver(in nsIMemoryPressureObserver obs); void unregisterObserver(in nsIMemoryPressureObserver obs); where: interface nsIMemoryPressureObserver : nsISupports { const unsigned long REASON_ALLOC_FAILURE = 0; const unsigned long REASON_HEAP_MINIMIZE = 1; /** * Called in response to a low-memory condition. * @param reason - either REASON_ALLOC_FAILURE when alloc or realloc * fails, or REASON_HEAP_MINIMIZE when heapMinimize is * called * @param requestedAmount - either the size requested by alloc or realloc, * or 0 if heapMinimize was called */ void flushMemory(in unsigned long reason, in size_t requestedAmount); }; so you be able to hook mac-specific flushers into this.

leger

Comment 16

•

24 years ago

Adding crash to keyword field.

Keywords: crash

Chris Petersen

Comment 17

•

24 years ago

*** Bug 41607 has been marked as a duplicate of this bug. ***

John Morrison

Updated

•

24 years ago

Keywords: relnote

Peter Trudelle

Comment 18

•

24 years ago

Removing whiteboard entries for reconsideration. I don't understand how a dogfood+ bug could be closed as a dup of this, yet this isn't even nsbeta2+. Bumping up the partition does not fix this, at most it delays the problem while raising the bar on target machines. Isn't it risky to wait until b3 to add code to handle such fundamentals? Also, what would the relnote say - "Will crash frequently at random"?

Whiteboard: [nsbeta2-][dogfood-]

Simon Fraser [no longer active]

Assignee

Comment 19

•

24 years ago

I agree with trudelle. And to clarify the effects of increasing the partition size: giving the app more memory will not always avoid out of memory crashes. This is because there are complex interactions between our memory allocators (which grab a chunk of the heap on startup), and memory allocation used for GWorlds and other toolbox structures (used for images and double buffering). We really need some new code that does the right thing under low memory.

Pierre Saslawsky

Comment 20

•

24 years ago

Simon: waiting for a better solution, we still need to increase the memory partition. These days, the daily optimized builds don't even allow me to have a browser window and check my mail. They have a 12Mb partition, we could increase it to 16Mb.

Phil Peterson

Comment 21

•

24 years ago

[dogfood-][nsbeta2-] would like to have for beta3

Whiteboard: [dogfood-][nsbeta2-]

Pierre Saslawsky

Comment 22

•

24 years ago

Fine. I opened bug 41760 to have a solution for beta2.

Simon Fraser [no longer active]

Assignee

Comment 23

•

24 years ago

Pierre: the app uses temp memory when it needs to, for malloc/new as well as the backing store GWorlds, so theoretically there should be no need to increase the partition size. However, there is code that fails to be smart about GWorld allocations in temp mem, so a partition size increase may be in order. We should set the partion size so that with a single browser window, we don't have to alloate any temp mem blocks. I think, now, we allocate 4 1Mb blocks, so this might call for increasing the partition size by about 4Mb.

Simon Fraser [no longer active]

Assignee

Updated

•

24 years ago

Blocks: 43060

Kathleen :Brade

Comment 24

•

24 years ago

I set my new Mac to millions of colors and crashed today opening a new window (one time I saw an assert for failing to allocate a gworld right before I crashed). My work-around is to set my monitor to thousands of colors. Also, I saw my memory partition nearly full (only about 400K spare).

Pierre Saslawsky

Updated

•

24 years ago

Blocks: 3412

Warren Harris

Updated

•

24 years ago

Keywords: footprint

rubydoo123

Comment 25

•

24 years ago

removing status whiteboard text and setting to m18

Keywords: nsbeta2

Whiteboard: [dogfood-][nsbeta2-]

Target Milestone: M17 → M18

Jay Patel [:jay]

Comment 26

•

24 years ago

Putting on [dogfood-] radar. Not critical to everyday use. Putting on [NEED INFO] radar. PDT needs to know impact to user and risk of fix to make a call on this bug. How often does this happen? Has anything changed since the last two times it's been given a -? Is this worth pulling pr2 off the wire.

Whiteboard: [dogfood-][NEED INFO]

Simon Fraser [no longer active]

Assignee

Comment 27

•

24 years ago

Sooner or later, the user is going to crash hard if this bug is not fixed. When/ if they crash will depend on their partition size, how many pages they have loaded, their screen depth, the number of windows open, and how big those windows are. And simply increasing the partion size will not make the crashes go away, necessarily. What more info do you need?

Blocks: 46000

Whiteboard: [dogfood-][NEED INFO]

Jay Patel [:jay]

Comment 28

•

24 years ago

Putting on [nsbeta2+][dogfood-] radar. Does not need a fix ASAP for daily work, but we should fix this for beta2.

Whiteboard: [nsbeta2+][dogfood-]

Simon Fraser [no longer active]

Assignee

Comment 29

•

24 years ago

You really meant nsbeta2+? Not nsbeta3+?

Simon Fraser [no longer active]

Assignee

Comment 30

•

24 years ago

Fix checked in. I now implement the traditional Mac application low memory handling, with a 32k memory reserve handle, and a GrowZoneProc.

Status: ASSIGNED → RESOLVED

Closed: 24 years ago

Resolution: --- → FIXED

John Morrison

Comment 31

•

24 years ago

Well, I've been beating up my Mac, and I cannot trigger the sort of memory problems that I had in the past (where my windows would, er, evaporate at certain times [evaporate being a highly technical term]). I played with my shrinking my preferred size, loading large pages while resizing, resizing with cmd held down, multiple windows, etc. ... Marking verified fixed.

Status: RESOLVED → VERIFIED

Simon Fraser [no longer active]

Assignee

Comment 32

•

24 years ago

Hrmm. I've seen a commercial build go belly-up with a dsMemFullErr a few times now. I think there is still work to do here.

Status: VERIFIED → REOPENED

Resolution: FIXED → ---

John Morrison

Comment 33

•

24 years ago

Removing nsbeta2+; the additional work is not landing for PR2.

Whiteboard: [nsbeta2+][dogfood-] → [dogfood-]

rubydoo123

Comment 34

•

24 years ago

setting to nsbeta3+, setting priority to p:2 -- Simon, need your input

Priority: P3 → P2

Whiteboard: [dogfood-] → [nsbeta3+][p:2][dogfood-]

rubydoo123

Comment 35

•

24 years ago

update whiteboard

Hardware: All → Macintosh

Whiteboard: [nsbeta3+][p:2][dogfood-] → [nsbeta3+][p:2][dogfood-][Mac issue]

Phil Peterson

Comment 36

•

24 years ago

PDT agrees P2 if it still is reproducible

Whiteboard: [nsbeta3+][p:2][dogfood-][Mac issue] → [nsbeta3+][p:2][dogfood-][Mac issue][PDTP2]

Simon Fraser [no longer active]

Assignee

Updated

•

24 years ago

Depends on: 46337

Simon Fraser [no longer active]

Assignee

Comment 37

•

24 years ago

Mac fixes are checked in. This is still waiting on 52402.

Depends on: 52402

Simon Fraser [no longer active]

Assignee

Comment 38

•

24 years ago

Marking this fixed now that imagelib flushing works. To test: open lots of browser windows, mail, composer, and load some heavy web pages (http://www.voodooextreme.com, cnn.com, news.bbc.co.uk etc). However much you load, it should not crash.

Status: REOPENED → RESOLVED

Closed: 24 years ago → 24 years ago

Resolution: --- → FIXED

Simon Fraser [no longer active]

Assignee

Comment 39

•

24 years ago

Hrmm, things are still bad here. Reopening.

Status: RESOLVED → REOPENED

Resolution: FIXED → ---

Simon Fraser [no longer active]

Assignee

Comment 40

•

24 years ago

*** Bug 53633 has been marked as a duplicate of this bug. ***

Simon Fraser [no longer active]

Assignee

Comment 41

•

24 years ago

Fixes checked in; there was a bug where I was not making the reallocated memory buffer handle purgeable, and I tweaked the numbers so that GFX leaves a bit more free heap space

Status: REOPENED → RESOLVED

Closed: 24 years ago → 24 years ago

Resolution: --- → FIXED

John Morrison

Comment 42

•

24 years ago

> However much you load, it should not crash. Okay, reopening on that basis, but Simon can use his own judgement on how to deal with this. I must say, though, that the performance is really quite good and the first time I managed to crash the app, I had ~25 browser windows open to home.netscape.com, a 3700 message IMAP folder open, AddressBook, and 9 Composer windows open (that's a lot!!, and more than an average user would use). However, after the crash, I devised a random page disher and set it as my home page (http://jrgm.mcom.com/cgi-bin/random-page.pl) where the pages are the ones above plus slashdot, home.netscape.com, an lxr page, and tinderbox. With that, I usually hit a crash at about page 9 or 10, with this kind of info from 'how', 'hc all', 'ht', and 'sc'. Bus Error at FFC66426 _InitProcMenu+00246 while reading word from 1A23ABFF in User data space Checking all heaps The System heap at 00002800 is ok The ROM read-only heap at 00028F20 is ok The Process Manager heap at 01153AB0 is ok The ÒNetscape 6Ó heap at 0E176040 is bad This block's back pointer doesn't point to the previous block. Block header 0EE3C9D0 0E01 2020 AB00 0004 0000 0090 00A5 79FC ¥¥ «¥¥¥¥¥¥�¥¥yü The ÒShareWay IP Personal BgndÓ heap at 0F26A670 is ok The ÒFile Sharing ExtensionÓ heap at 0F339280 is ok The heap at 0F376DB0 is ok The ÒFinderÓ heap at 0F3936D0 is ok The ÒTime SynchronizerÓ heap at 0F496B30 is ok The ÒOpen Transport SNMPÓ heap at 0F4BA390 is ok The ÒNorton SchedulerÓ heap at 0F52B3F0 is ok The ÒFolder ActionsÓ heap at 0F5BC450 is ok The ÒFBC Indexing SchedulerÓ heap at 0F62D4B0 is ok The ÒControl Strip ExtensionÓ heap at 0F667970 is ok System heap high free space + TempMem low free space = #197049584 (#187M) Totaling the ÒNetscape 6Ó heap at 0E176040 The ÒNetscape 6Ó heap at 0E176040 is bad This block's back pointer doesn't point to the previous block. Block header 0EE3C9D0 0E01 2020 AB00 0004 0000 0090 00A5 79FC ¥¥ «¥¥¥¥¥¥�¥¥yü Calling chain using A6/R1 links Back chain ISA Caller (for comparison, I can do the same test on win32 and linux for >20 pages, by which time the Mem size of netscape is > 96MB, and I have put the OS into paging thrash.). [Sidenote: I think the Mac new window performance is better than Windows and Linux when showing the new chrome window! Way to go.].

Status: RESOLVED → REOPENED

Resolution: FIXED → ---

Simon Fraser [no longer active]

Assignee

Comment 43

•

24 years ago

What killed you there is heap corruption, not simple out of memory problems. I filed a new bug, bug 54023, on the heap corruption problem, and am going to close this bug again.

Status: REOPENED → RESOLVED

Closed: 24 years ago → 24 years ago

Resolution: --- → FIXED

Simon Fraser [no longer active]

Assignee

Comment 44

•

24 years ago

Brade and other still see low memory crashers. I see that it's possible to get into this state when nothing can be flushed from imglib caches. More tuning required here.

Status: RESOLVED → REOPENED

Resolution: FIXED → ---

rubydoo123

Comment 45

•

24 years ago

setting to nsbeta3-, and rtm+ it's a crash, which can and should be resolved for rtm

Keywords: rtm

Whiteboard: [nsbeta3+][p:2][dogfood-][Mac issue][PDTP2] → [nsbeta3-][rtm+][p:2][dogfood-][Mac issue][PDTP2]

Target Milestone: M18 → M19

rubydoo123

Comment 46

•

24 years ago

Simon, please include the required information per the rtm checkin rules

Whiteboard: [nsbeta3-][rtm+][p:2][dogfood-][Mac issue][PDTP2] → [nsbeta3-][rtm+ NEED INFO][p:2][dogfood-][Mac issue][PDTP2]

Simon Fraser [no longer active]

Assignee

Comment 47

•

24 years ago

What's killing us here are the additional handles created for each NewGWorld call. Each one creates 29 handles. When allocating a 1-bit GWorld (e.g. for a GIF mask), one of those handles is 4.6K in size. These go into the heap even if the image bits go into temp mem. So, possible solutions to fixing this: 1. Leave lots of heap space free, and hope that someone does not create enough images that they run out. 2. Use Heap zones to ensure that GWorld bits are controllable, and don't fill the heap. 3. Don't use GWorlds at all, reverting to the old hand-rolled, new[] buffer allocations for image bits. 4. Don't use GWorlds, but create CGrafPorts, PixMapHandles etc by hand. 5. Don't use GWorlds for 1-bit masks, using hand-rolled BitMaps instead (there is a technote on how to do this). Others?

Simon Fraser [no longer active]

Assignee

Comment 48

•

24 years ago

Relevant technotes: http://developer.apple.com/technotes/qd/qd_13.html http://developer.apple.com/technotes/qd/qd_21.html http://developer.apple.com/technotes/tn/tn1021.html

Status: REOPENED → ASSIGNED

rubydoo123

Comment 49

•

24 years ago

removing + per pdt sw rules

Whiteboard: [nsbeta3-][rtm+ NEED INFO][p:2][dogfood-][Mac issue][PDTP2] → [nsbeta3-][rtm NEED INFO][p:2][dogfood-][Mac issue][PDTP2]

Pierre Saslawsky

Comment 50

•

24 years ago

Are we crashing inside NewGWorld()? If not, there is no reason to remove it. If we do, let's make sure we have enough free mem before calling it.

Simon Fraser [no longer active]

Assignee

Comment 51

•

24 years ago

There is plenty of reason to not use GWorlds for images, and hand-roll PixMaps instead (as dcone's old code used to do) -- it would save 2-4K of heap data per image. Where the crash happens can vary; it can be any Toolbox call that allocates memory in the heap. In debug builds it's often in TextEdit while outputting to the console, but it could be in NewRgn or NewGWorld. Right now, if we detect insufficient memory in the heap, we call NewGWorld with the useTempMem flag. But this still puts 29 small handles in the heap -- only the pixels go into temp mem. Maybe a cheap solution to this bug would be to simply return an out of memory error in this case, and not attempt to allocate the GWorld at all.

Simon Fraser [no longer active]

Assignee

Comment 52

•

24 years ago

Attached patch GFX diff to tweak GWorld allocations — Details — Splinter Review

Simon Fraser [no longer active]

Assignee

Comment 53

•

24 years ago

I attach some diffs that I think are the safest way to fix this now. mozilla/gfx/src/mac/nsDrawingSurfaceMac.cpp: Change some constants so that we go to temp mem for drawing surface allocations (of which there are few, and which are normally big) earlier, leaving more heap mem free. Go to temp mem if < 1024k (max) or 512k (contiguous) is free. mozilla/gfx/src/mac/nsImageMac.cpp Similarly to the above patch, I changed some constants so that we go to temp mem earlier (1024k max, 768k contig). However, I also changed the logic so that we'll acutally refuse to allocate at all if memory is below a lowet threshold (512k max, 384k contig). This ensure that we never totally fill the heap with the heap- based handles that GWorlds create (even when allocating the pixels of the GWorld in temp mem). It is safe to allocate GWorlds; these failure paths have been tested, and imglib recovers safely. If the images that fail to allocate are chrome images, you may see missing images in the UI. However, this is preferable to crashing. Longer term, I think we need to go back to hand-rolling PixMaps for image data.

Simon Fraser [no longer active]

Assignee

Comment 54

•

24 years ago

s/lowet/lower s/is safe to allocate/is safe to fail to allocate

Pierre Saslawsky

Comment 55

•

24 years ago

r=pierre

dcone (gone)

Comment 56

•

24 years ago

sr=dcone

Kathleen :Brade

Comment 57

•

24 years ago

change rtm NEED INFO to rtm+

Keywords: nsbeta3

Whiteboard: [nsbeta3-][rtm NEED INFO][p:2][dogfood-][Mac issue][PDTP2] → [rtm+][p:2][dogfood-][Mac issue][PDTP2]

Peter Trudelle

Comment 58

•

24 years ago

PDT says rtm++, please land on branch ASAP.

Whiteboard: [rtm+][p:2][dogfood-][Mac issue][PDTP2] → [rtm++][p:2][dogfood-][Mac issue][PDTP2]

Kurt Weinschenker

Comment 59

•

24 years ago

Probably too late in the game, but Mac Mozilla Installer build 2000101808 blew up in the manner John Morrison mentioned. When I crashed, I had perhaps 10 navigator windows open, one of them to http://www.apple.com/ which AFAIK has a QuickTime movie on it. I crashed so hard, MacsBug wouldn't even give me a stdlog -- all I got was this: MacsBug 6.6.3, Copyright Apple Computer, Inc. 1981-2000 Bus Error at FFC6640A _DeleteMenu+0008A while reading long word from 4F430000 in User data space 18-Oct-2000 9:48:44 PM (since boot = 18 minutes) Current application is ?Mozilla? Machine = #406 (NewWorldMac), System $0904, sysu = $01008000 Unable to access that address PC: FFC66424 Frame Type: B008 Log file already open Dunno if this helps, but thought I might as well post anyway.

Simon Fraser [no longer active]

Assignee

Comment 60

•

24 years ago

Fix checked into trunk and branch. I spun off bug 57327 for going back to using PixMaps here.

Status: ASSIGNED → RESOLVED

Closed: 24 years ago → 24 years ago

Resolution: --- → FIXED

Patrick C. Beard

Comment 61

•

24 years ago

I can't find any setting of the Mac application heap that will let my debug build of Mozilla load its chrome. I am very tempted to back out this change to nsImageMac.cpp. If I revert to v1.30, things work.

Status: RESOLVED → REOPENED

Resolution: FIXED → ---

Pierre Saslawsky

Comment 62

•

24 years ago

beard: Could you give more details? I have a fresh debug build from tonight that runs perfectly well with the different chromes and within the default 16597 Kb partition (VM off).

Simon Fraser [no longer active]

Assignee

Comment 63

•

24 years ago

beard: use ZoneRanger to look at your heap. Why is yours different?

Simon Fraser [no longer active]

Assignee

Comment 64

•

24 years ago

beard: are you running the Boehm GC, or MJR stuff, or something else that eats heap space?

Simon Fraser [no longer active]

Assignee

Comment 65

•

24 years ago

Beard was running with the GC stuff, which never called MaxApplZone(). He's gonna fix that.

Status: REOPENED → RESOLVED

Closed: 24 years ago → 24 years ago

Resolution: --- → FIXED

John Morrison

Comment 66

•

24 years ago

Say, Simon, et al. Given that this bug has been reopened 5 times, and with that other memory bug (low pref size) floating around, I don't want to confuse matters by doing another reopen. Performance in this regard is mightily improved. But by the arbitrary test of loading from a pallette of heavy web pages (dished from my startup URL), while also having mail, composer, etc. open -- at the point where I went to get my 45th browser window open, I hit an unmapped memory exception with the stack in a call to a factory (in CSS Style somewhere). Should I reopen this bug or should I file a new bug for sometime after RTM (or would you prefer to tell me "Don't do that ...").

Pierre Saslawsky

Comment 67

•

24 years ago

I certainly did not test as thoroughly as you did but I don't crash anymore. However, and this is a memory-related problem too, I hit constantly bug 57835 / bug 58490 where the chrome falls apart after opening a few windows of different types. If we still have a little bit of time to spent on low-mem problems, I think it would be better to focus on the problems with the chrome (apparently fixed today, in the builds tomorrow) than crashes that require dozens of browser windows.

John Morrison

Comment 68

•

24 years ago

Oh, absolutely -- if we reopened this bug, it would be for post-RTM work. There's no "flip one switch" at hand, I don't think, and the "missing chrome" is a far more tangible bug, than some idiot (me) opening 45 browser windows :-] At any rate, this is fine for the branch, so I'm marking it vtrunk, and we can pick it up later, or open a brand-new shiny bug for later.

Keywords: vtrunk

Simon Fraser [no longer active]

Assignee

Comment 69

•

24 years ago

jrgp: please open a separate bug with your CSS-related crash, with MacsBug log is availabule. Thanks.

Myk Melez [:myk] [@mykmelez]

Updated

•

20 years ago

Product: Browser → Seamonkey