Closed Bug 678940 Opened 13 years ago Closed 13 years ago

[Linux] Firefox 6 consumes a lot of system memory just after start when using layers.acceleration.force-enabled=true

Categories

(Core :: Graphics, defect)

6 Branch
All
Linux
defect
Not set
normal

Tracking

()

VERIFIED FIXED
mozilla12

People

(Reporter: robko, Assigned: k.kotlenga)

References

(Depends on 1 open bug)

Details

(Keywords: memory-footprint, Whiteboard: [MemShrink:P2])

Attachments

(2 files, 1 obsolete file)

User Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/535.1 (KHTML, like Gecko) Chrome/13.0.782.112 Safari/535.1

Steps to reproduce:

I started firefox normally. My system is Arch Linux - kernel Linux 3.0.1, firefox from testing repository.


Actual results:

Firefox consumed all of system memory. Showing of dialogs is really slow. Firefox tend to freeze.
Summary: Firefox 6 consumes all of memory just after start → Firefox 6 consumes all system memory just after start
Thanks, it works perfect in safe mode. But where could be the problem? I have all add-ons and plugins disabled in normal mode and it's absolutely unusable.
I created new Firefox profile and it's working properly in normal mode. So this issue is happening only on my profile migrated from Firefox 5.
This two errors are showing just after loading the old profile:

Can't find symbol 'glXBindTexImageEXT'
Can't find symbol 'glXReleaseTexImageEXT'
That makes no sense if it works in the safemode and a new profile but not in the normal mode without extensions.
Do you really have all extensions disabled ?

In that case i would try to to disable the hardware acceleration under tools/options/advanced/general but that doesn't explain why it works with a new profile.
Yes, I am sure I have all extension disabled, but you were right with hardware acceleration. I disabled it and Firefox works properly with my old profile. But it doesn't explain why it works OK with new profile and hardware acceleration enabled.

Additional info: I have Intel GMA HD Integrated Graphics and xf86-video-intel 2.15.0 graphic driver from Arch Linux repository.
Menus and bookmark folders are showing slow too. When they are loading, there is a lot I/O operation and it freezes my computer for a few seconds. iotop is showing 100% UI. This is the second symptom. With disabled hardware acceleration, I/O operations are normal.
Maybe you have some features of HW acceleration forcefully enabled in the old profile, that are not enabled automatically (on linux) in the new profile even when HW accel is enabled. Something like 'layers.acceleration.force-enabled' ?
Nice catch, it was my problem - layers.acceleration.force-enabled. When I set it to false, everything works perfectly.
I think layers still has problems on Linux therefore it is not enabled by default (see https://wiki.mozilla.org/Blocklisting/Blocked_Graphics_Drivers). I have also seen this (FF using insane amounts of memory with one page loaded) when experimenting with HW accel and layers (ATI card). However I don't know if this is a known problem (I haven't found a bug for it). So let's leave this bug open and I'll do some more tests myself.
Summary: Firefox 6 consumes all system memory just after start → Firefox 6 consumes all system memory just after start when using layers.acceleration.force-enabled
OK, I can see this too. When forcing layers enabled, on FF 9a1, ATI Radeon HD 4350. Firefox after start with the about:home page open takes 1.2GB resident memory (RES in top).

I don't know if that is a real problem, as it is a non-default configuration and maybe still unsupported.
Status: UNCONFIRMED → NEW
Component: General → Graphics
Ever confirmed: true
Product: Firefox → Core
QA Contact: general → thebes
Hardware: x86_64 → All
I can confirm this on an Arch Linux x64 system with Intel HD graphics, kernel 3.04 and firefox package from [extra] repository (the standard, stable release). 

In my case FF didn't use *all* the system memory, but the --heap-unclassified portion was 2.2 GB compared to about 200 MB after disabling the layers.acceleration.force-enabled option.

Attaching the output of glxinfo in case it helps
Attached file output of glxinfo
Yes, it does not explicitly take all memory available on the system, it probably just needs to allocate some big number (like 1-2GB, maybe dependent on driver), which incidentally may be all the user has. Even the reporter didn't say it crashed due to OOM or anything, it was just slow (maybe swapping).
Summary: Firefox 6 consumes all system memory just after start when using layers.acceleration.force-enabled → [Linux] Firefox 6 consumes a lot of system memory just after start when using layers.acceleration.force-enabled=true
I think resolving this bug could be helped by reducing heap-unclassified numbers in about:memory.
Depends on: DarkMatter
Good idea, I'll see what about:memory shows when this happens.
about:memory just after Firefox start (Firefox 10).

Main Process

Explicit Allocations
454.65 MB (100.0%) -- explicit
├──424.98 MB (93.47%) -- heap-unclassified
├───23.79 MB (05.23%) -- js
│   ├──15.22 MB (03.35%) -- compartment([System Principal], 0xffffffffb3052000)
│   │  ├───8.55 MB (01.88%) -- gc-heap
│   │  │   ├──4.39 MB (00.97%) -- objects
│   │  │   │  ├──2.50 MB (00.55%) -- non-function
│   │  │   │  └──1.89 MB (00.42%) -- (1 omitted)
│   │  │   └──4.16 MB (00.92%) -- (6 omitted)
│   │  └───6.67 MB (01.47%) -- (8 omitted)
│   ├───4.57 MB (01.01%) -- (7 omitted)
│   └───4.00 MB (00.88%) -- stack
├────3.78 MB (00.83%) -- storage
│    └──3.78 MB (00.83%) -- sqlite
│       └──3.78 MB (00.83%) -- (10 omitted)
└────2.10 MB (00.46%) -- (9 omitted)

Other Measurements
  0.05 MB -- gfx-surface-image
  6.60 MB -- gfx-surface-xlib
437.11 MB -- heap-allocated
438.29 MB -- heap-committed
    0.26% -- heap-committed-unallocated-fraction
  0.08 MB -- heap-dirty
 22.88 MB -- heap-unallocated
        2 -- js-compartments-system
        1 -- js-compartments-user
 11.00 MB -- js-gc-heap
  0.17 MB -- js-gc-heap-arena-unused
  1.00 MB -- js-gc-heap-chunk-clean-unused
  1.77 MB -- js-gc-heap-chunk-dirty-unused
  0.00 MB -- js-gc-heap-decommitted
   24.50% -- js-gc-heap-unused-fraction
  1.14 MB -- js-total-analysis-temporary
  1.55 MB -- js-total-mjit
  5.26 MB -- js-total-objects
  1.94 MB -- js-total-scripts
  2.23 MB -- js-total-shapes
  3.93 MB -- js-total-strings
  0.08 MB -- js-total-type-inference
       46 -- page-faults-hard
  167,463 -- page-faults-soft
471.43 MB -- resident
673.85 MB -- vsize
Keywords: footprint
Depends on: 598875
Something that was not supposed "to become a significant memory issue" (see the comment in the code) unfortunately has become one. The problem was introduced in "Bug 567626 - fix up opengl layers backend". The offending line is

mUniformValues.SetLength(maxloc+1);

The reason is that maxloc, as returned by fGetUniformLocation() happens to be pretty big number (usually between 300000 and 500000 on my machine). This can easily sum up to over a gigabyte of memory just by running over menus (because a new GL context is being created for every widget and this uniform "cache" is not being shared between them).

The attached patch is probably a hack quality (this is my first patch and I'm not really a C++ programmer).
This is worrying indeed. cc'ing people. If maxloc is the max value returned by GetUniformLocation then 

    mUniformValues.SetLength(maxloc+1);

is indeed a recipe of uncontrollable large memory usage.
Whiteboard: MemShrink?
Does this only happen when force-enabled is true?  Presumably this option is only rarely turned on?  

I'm just asking because the answers to these questions will help with the MemShrink prioritization.  Thanks.
Nicholas, force-enabled is the only way to use layers acceleration under Linux for now. So this problem will probably happen when bug 594876 will be resolved (if the video card and the driver in question won't be blacklisted).
(In reply to Krzysztof Kotlenga from comment #18)
> Created attachment 585261 [details] [diff] [review]

You need to set a flag on the attachment to request review of it from some Core developer (see https://wiki.mozilla.org/Modules/Core (Graphics)).
Set the flag as 'review ?' and his email address.

Thanks for looking into this.
(In reply to Marco Castelluccio from comment #21)
> Nicholas, force-enabled is the only way to use layers acceleration under
> Linux for now. So this problem will probably happen when bug 594876 will be
> resolved (if the video card and the driver in question won't be blacklisted).

Except that it seems that this bug can also affect all GL layers, not only on Linux. GL layers are default on Mac and soon on Android. The large memory usage only happens when uniform location ids grow large, which is very implementation-dependent (depends on drivers).
Attachment #585261 - Flags: review?(joe)
Assignee: nobody → k.kotlenga
Status: NEW → ASSIGNED
The present bug is in code that (IIUC) tries to be an optimization by caching the values of uniforms, so that we don't have to call glGetUniform.

What data (bug number?) do we have to support the need for this optimization?
Whiteboard: MemShrink? → [MemShrink:P2]
Yeah, I'd much rather just remove the cache altogether. Vlad added this as part of bug 567626, saying it was to "avoid unnecessary state changes", but it was my understanding that the correct way to use OpenGL was to always set stuff, so it would surprise me if this measurably improved performance.

(Note that I'm ready to be surprised!)
Well, the correct way to use GL is to do everything in your power to avoid state changes being seen by GL.  Many drivers are good at optimizing no-op changes, but they also often assume that you're going to be doing that yourself.  Calling glGet *anything* is performance death.  Uniforms may be cached client-side by some drivers, but really no code that cares about performance should ever call glGet anything.  Changing the value of a uniform may require the entire uniform block to be sent to the GPU with the next draw call, for example; will that be avoided by the driver for a no-op change?  Dunno, but do you really want to find out? :)
Still, I'd like to see numbers. Obviously we want to avoid glGet; I also would like to avoid adding a hash table, since even though that's O(1) there's a big difference between that and just using a pointer.

So, I'd like to see numbers with what happens without a uniform cache on a variety of drivers before I r+ this patch.
(In reply to Joe Drew (:JOEDREW!) from comment #27)
> So, I'd like to see numbers with what happens without a uniform cache on a
> variety of drivers before I r+ this patch.

I would suggest that since this patch fixes a potentially large memory consumption bug, it's worth taking as a stopgap solution; but do require filing a followup bug to continue investigating the best solution.

Maybe the really good solution, if LayerManagerOGL wants to store and quickly read the values of uniforms, is to store each uniform as a separate data structure and hold them individually, rather than store them all in a single (array or map) data structure and have to look them up by id? Akin to what WebGL does, with the WebGLUniformLocation class.
Erm, except that WebGLUniformLocation doesn't actually store a copy of its value, so that was a bad example. I rather meant something like:

struct GLUniformLocation {
  GLint location;
  GLenum type;
  GLsizei size;
  void *datal
};
Except that'd be even worse, wouldn't it? An array to binary search?
No, because each program that's in use will have a known finite number of uniforms and types.  (In fact, I thought I implemented it like that originally?  With the value being directly on the uniform?  It's fuzzy...)

One thing to try would be to just unconditionally do a glUniform call.  The number of layers and layer renders is unlikely to be large, so this should probably not have a huge impact on perf.
Oh, I did, I see what the issue is (the uniform location vs. index issue that I mentioned in the comment).  Hmmm.

Using an actual object for each uniform like WebGL would work, but it would complicate the API -- callers would need to set uniforms by object and not by location, and keep a pointer around (with associated memory management complications) to reference it.  I would try getting rid of the caching entirely and see what happens.  Can be tested with GL layers on on some heavy layer-producing web content -- any difference is likely to be most easily seen on mobile (assuming the baseline perf is ok to begin with), otherwise I'd try under Windows.
Comment on attachment 585261 [details] [diff] [review]
Avoid allocating a huge array by using a hashtable instead

Review of attachment 585261 [details] [diff] [review]:
-----------------------------------------------------------------

Yeah, ok. Krzysztof, there's nothing wrong with this patch—it looks correct—but I think what we'd rather do to fix this bug is to remove the cache altogether. (We can add different caches, if it's needed, later.) Do you think you'd be able to put that sort of patch together?
Attachment #585261 - Flags: review?(joe) → review-
Just code removal and minor comment fixes. I don't know how to get some meaningful numbers to compare, so no numbers for now.
Attachment #586005 - Flags: review?(joe)
Attachment #585261 - Attachment is obsolete: true
Comment on attachment 586005 [details] [diff] [review]
Remove caching of uniform values

Review of attachment 586005 [details] [diff] [review]:
-----------------------------------------------------------------

Lovely!
Attachment #586005 - Flags: review?(joe) → review+
Try run for 4a721abc57b1 is complete.
Detailed breakdown of the results available here:
    https://tbpl.mozilla.org/?tree=Try&rev=4a721abc57b1
Results (out of 271 total builds):
    exception: 2
    success: 243
    warnings: 24
    failure: 2
Builds (or logs if builds failed) available at:
http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/jdrew@mozilla.com-4a721abc57b1
Try run for 4a721abc57b1 is complete.
Detailed breakdown of the results available here:
    https://tbpl.mozilla.org/?tree=Try&rev=4a721abc57b1
Results (out of 271 total builds):
    exception: 2
    success: 243
    warnings: 24
    failure: 2
Builds (or logs if builds failed) available at:
http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/jdrew@mozilla.com-4a721abc57b1
Try run for 4a721abc57b1 is complete.
Detailed breakdown of the results available here:
    https://tbpl.mozilla.org/?tree=Try&rev=4a721abc57b1
Results (out of 271 total builds):
    exception: 2
    success: 243
    warnings: 24
    failure: 2
Builds (or logs if builds failed) available at:
http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/jdrew@mozilla.com-4a721abc57b1
Try run for 4a721abc57b1 is complete.
Detailed breakdown of the results available here:
    https://tbpl.mozilla.org/?tree=Try&rev=4a721abc57b1
Results (out of 271 total builds):
    exception: 2
    success: 243
    warnings: 24
    failure: 2
Builds (or logs if builds failed) available at:
http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/jdrew@mozilla.com-4a721abc57b1
Try run for 4a721abc57b1 is complete.
Detailed breakdown of the results available here:
    https://tbpl.mozilla.org/?tree=Try&rev=4a721abc57b1
Results (out of 271 total builds):
    exception: 2
    success: 243
    warnings: 24
    failure: 2
Builds (or logs if builds failed) available at:
http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/jdrew@mozilla.com-4a721abc57b1
Those failures all look intermittent/fine. This is ready for checkin.
Keywords: checkin-needed
https://hg.mozilla.org/mozilla-central/rev/a3a0382b5de8

Thanks for the patch Krzysztof! Hope to see you on IRC in #developers soon, where we can find you other things to work on if you are interested? :-)
Status: ASSIGNED → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
In case you hadn't spotted it, you got a mention here:
http://blog.mozilla.com/nnethercote/2012/01/11/memshrink-progress-week-30/

:-)
Status: RESOLVED → VERIFIED
Duplicate of this bug: 711171
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: