Closed Bug 413405 Opened 18 years ago Closed 14 years ago

Limited memory causes Firefox 3.0 to fault and/or hang in many ways

Categories

(Core :: General, defect)

x86
Linux
defect
Not set
major

Tracking

()

RESOLVED INVALID

People

(Reporter: robert.bradbury, Unassigned)

Details

Attachments

(2 files)

User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9b3pre) Gecko/2008011915 Firefox/3.0b3pre
Build Identifier: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9b3pre) Gecko/2008011915 Firefox/3.0b3pre

Limited memory, either at startup or while running, causes Firefox to SegFault, Abort, or Hang in a variety of ways. This was reported in Firefox 2.0 but has never been dealt with.

Reproducible: Always

Steps to Reproduce:
1. Make any system-dependent adjustments to the shell script ffdbgmem (to be attached).
2. Run ffdbgmem (under Linux).
3. Watch various types of faults occur and core dumps accumulate.

Actual Results:
Firefox core dumps, aborts or hangs.

Expected Results:
Firefox should run cleanly under all circumstances. If it runs out of memory, it should give a message that there is insufficient memory available and exit cleanly. It should not core dump. On memory-limited systems (e.g. cell phones) there may be no place to dump a core file. If it runs out of memory with many tabs open, it should not core dump -- it should refuse to open a new window, tab, image, etc. and provide an error that opening such an object would exceed the available memory.

The Firefox developers really need to get out of the mindset that memory is infinite. It is not infinite on a machine where one has a process running amok (something I tend to do every few months). It is not infinite on a cell phone which doesn't have a swap disk. It is not infinite if I arbitrarily decide to use ulimit to limit the amount of memory available to Firefox. In all of these cases (when a memory allocation fails) the response by Firefox should be as graceful as possible, i.e. the program doesn't fault and lose its current state -- it informs the user that a specific action cannot be performed and continues to function normally.
ffdbgmem attempts to run a test version of firefox-3.0b3pre under various memory limits (which partly approximate the failures that would occur when the system runs out of swap space, the swap disk fails, etc.). The errors range from SegFaults to Aborts to Firefox hanging, depending on which memory allocation fails. While the traces from the core dumps could be provided, they are of questionable use, as the failures will vary with the system (32-bit vs. 64-bit) and the swap space size (try browsing on a system with no swap and only 256MB of main memory!!!). The trace of the output of ffdbgmem will be in the next attachment.
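The approach described here, running a program repeatedly under different memory limits and recording how each run ends, can be sketched roughly as follows. This is not the attached ffdbgmem script: the function names `run_with_limit` and `sweep_limits` are hypothetical, and the sketch assumes a shell whose `ulimit` builtin supports `-v` (bash and dash do, with the limit given in kilobytes).

```shell
#!/bin/sh
# Hypothetical sketch; NOT the reporter's ffdbgmem script.

# run_with_limit LIMIT_KB COMMAND [ARGS...]
# Runs COMMAND in a subshell whose virtual memory is capped at LIMIT_KB
# kilobytes and returns the command's exit status. A status above 128
# conventionally means the command was killed by signal (status - 128).
run_with_limit() {
    limit_kb=$1
    shift
    (
        # The limit is set inside the subshell only, so the calling
        # shell keeps its own limits. 125 signals "could not set limit".
        ulimit -v "$limit_kb" || exit 125
        "$@"
    )
}

# sweep_limits COMMAND [ARGS...]
# Reads one limit (in KB) per line from stdin, runs COMMAND under each
# limit, and prints one "LIMIT STATUS" pair per line.
sweep_limits() {
    while read -r ml; do
        status=0
        # stdin is redirected so COMMAND cannot consume the limit list.
        run_with_limit "$ml" "$@" </dev/null >/dev/null 2>&1 || status=$?
        echo "$ml $status"
    done
}
```

For example, `printf '43000\n73000\n' | sweep_limits ./firefox` would print one status per limit; the path `./firefox` is only a placeholder for whatever binary is under test.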
Copy of the output of a run of ffdbgmem and the variety of errors it can generate. It is worth noting that prlink.c has the particular mapping of various shared libraries enabled so one can approximate what part of the system is failing. If the allocs() are failing in one of the gdk/gtk/glib portions of the system, the Firefox code should catch the error and handle it cleanly. A Segfault or Abort should not be the "normal" course of action. A std::bad_alloc error is an instance of a failure to catch errors in the C++ code (which is a Firefox responsibility -- not a gdk/gtk/glib responsibility).
Status: UNCONFIRMED → NEW
Ever confirmed: true
Patches to gracefully handle out-of-memory situations are accepted (see https://bugzilla.mozilla.org/buglist.cgi?short_desc_type=casesubstring&short_desc=OOM&resolution=FIXED for some examples, if you don't believe me). The problem with this bug report is that it is too broad to be useful - it's hard to determine whether it will ever be "fixed", and the scope of work is too large for one bug to be useful in coordinating work. Suggesting that we need to fix all cases of bad OOM handling is one thing; actually doing the work is another.

This bug being open doesn't really help get work done, so I suggest marking it INVALID, and encourage you to file specific bug reports for each of the general areas of code where you've found a problem (one bug per function or per file, perhaps, depending on the frequency). And since you're evidently very passionate about the issue, perhaps you could also look into attaching patches for some of them? I'd be glad to help you find reviewers and such.
Product: Firefox → Core
QA Contact: general → general
(In reply to comment #1)
> (try browsing on a system with no swap and only 256MB of main memory!!!).

That would be nice, my Nokia N800 only has 128MB! :-)

+1 to what Gavin said, and I'd further point out that Mozilla is serious about Mobile efforts, which don't have the luxury of gigabytes of RAM even if cutting-edge desktops do. [See also http://wiki.mozilla.org/Mobile]
Gavin, I don't think it should be labeled "INVALID" as I've got dozens of core dumps that I intend to create traces for and add as attachments. I've got to put the current versions of gtk/gdk/glib/libstdc together with debugging again before I do that, however. That is perhaps several days of work and this isn't a full-time project of mine.

The similarity between many of the errors suggests that a few "master" level fixes might take care of much of the problem, e.g. handling JS allocation errors, handling DLL mapping problems, handling graphics library structure or image allocation errors, etc. The "hangs" are another, deeper problem as they suggest some fundamental architecture issues.

I'd suggest that after I have the traces and get a better feel for what is going wrong, we may parcel the problems out into separate bugs but keep this as a common reference point. As I generally don't do C++ I will not be able to propose fixes, and someone (or someones) generally familiar with the specific code sections involved will probably be required to make them.

With respect to Justin's comment... You might be able to get 2.0 to run in 128MB (depends how big the OS is) [I have run 2.0 on a Windows 98 machine with 192MB] but 3.0 is significantly (~25-30%) larger and I doubt you could run it in 128MB unless you do major surgery (strip out Javascript & printing???).
(In reply to comment #5)
> Gavin, I don't think it should be labeled "INVALID" as I've got dozens of core
> dumps that I intend create traces for and add as attachments.

I'm not saying that the issues you're raising are INVALID, I'm saying that I don't think you should be raising them all in a single bug full of core dumps that are only related because they represent the same type of problem in different parts of the code. Bugzilla is typically used to track changes per component, because different parts of the code are maintained by different people and have different review requirements. The more specific your bug is, the easier it will be to fix.

If you want to keep this bug to track changes to the shell script you're using, and otherwise keep track of the other bugs you raise, you can turn it into a meta bug and put the more specific bugs in the dependency chain.
(In reply to comment #5)
> With respect to Justin's comment... You might be able to get 2.0 to run in
> 128MB

Actually, the N800's browser is based on the same code Firefox 3 is (for the most part; different UI and some parts not needed for mobile use removed). My point was that there is active, ongoing development to ensure Mozilla runs well on low-memory devices.
printing isn't that big, but the os dolske is talking about doesn't support printing, so that it's enabled is an uninteresting bug.

gavin's point is very simple, if you find a bug in a module (nspr, xpcom, gfx, cairo, javascript), then file a bug against that module for that stack trace. don't hang dozens of unrelated stacks into a single bug.

matt@nightrealms.com: in the future, please don't confirm bugs like this.

> In all of these cases
> (when a memory allocation fails)
> the response by Firefox should be as graceful as possible.

If a consumer like gtk (or dbus) decides to abort(), then how graceful do you want us to be?

attachment 298360 [details]
    Test of memory limit of 43000 failed SIGABRT (134)
being a convenient example.

    Test of memory limit of 73000 faild with status: 1

attachment 298358 [details]
    139) echo "Test of memory limit of $ML failed SIGFAULT ($STATUS)" ;;
    134) echo "Test of memory limit of $ML failed SIGABRT ($STATUS)" ;;
    *) echo "Test of memory limit of $ML faild with status: $STATUS" ;;
that's odd, are you intentionally misspelling failed here?

note: we do have plans to handle c++ allocation failure. but at the present they're probably stuck behind replacing the crt <http://benjamin.smedbergs.us/blog/2008-01-10/patching-the-windows-crt/>
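The statuses 139 and 134 quoted above follow the usual shell convention that a process killed by signal N is reported as 128 + N (11 for SIGSEGV, 6 for SIGABRT on Linux). As a minimal sketch, and not a replacement for the attached script, the hard-coded cases could be generalized like this; `describe_status` is a hypothetical helper, and a status above 128 could in principle also be an ordinary exit code chosen by the program.

```shell
#!/bin/sh
# Hypothetical helper; not part of the attached ffdbgmem script.

# describe_status STATUS
# Prints a human-readable description of a shell exit status.
describe_status() {
    status=$1
    if [ "$status" -eq 0 ]; then
        echo "exited normally"
    elif [ "$status" -gt 128 ]; then
        # By convention, 128 + N means "killed by signal N".
        sig=$((status - 128))
        # kill -l N prints the signal name without the SIG prefix;
        # fall back to "?" if N is not a valid signal number.
        name=$(kill -l "$sig" 2>/dev/null) || name="?"
        echo "killed by SIG$name (signal $sig)"
    else
        echo "exited with status $status"
    fi
}
```

On Linux, `describe_status 134` reports SIGABRT and `describe_status 139` reports SIGSEGV, which matches the two signal cases in the quoted script.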
URL: N/A
Status: NEW → RESOLVED
Closed: 14 years ago
Resolution: --- → INVALID