Closed Bug 643651 Opened 13 years ago Closed 12 years ago

Huge memory usage on mapcrunch.com with webgl leads to crash

Categories

(Core :: Graphics: CanvasWebGL, defect)

x86
Windows XP
defect
Not set
critical

Tracking

()

RESOLVED WORKSFORME

People

(Reporter: nick, Assigned: jgilbert)

References

(Blocks 1 open bug)

Details

(Keywords: crash, memory-leak, qawanted, Whiteboard: [see comment 17][MemShrink:P2])

Attachments

(3 files)

User-Agent:       Mozilla/5.0 (Windows NT 5.1; rv:2.0) Gecko/20100101 Firefox/4.0
Build Identifier: Mozilla/5.0 (Windows NT 5.1; rv:2.0) Gecko/20100101 Firefox/4.0

The page contains a number of Google Maps street view panoramas. The first few seem to load and display, but the browser crashes moments later.

Reproducible: Always

Steps to Reproduce:
1. Go to site
2. Crash
3. ?
Actual Results:  
-
The crash is not immediate, it seems to occur about 10 ~ 15 seconds after requesting the page.
Crash report IDs needed, please see:
https://developer.mozilla.org/En/How_to_get_a_stacktrace_for_a_bug_report

Works for me using Mozilla/5.0 (Windows NT 6.1; WOW64; rv:2.0b13pre) Gecko/20110321 Firefox/4.0b13pre ID:20110321030419 ; D2D enabled.

However the page does get really unresponsive at times due to the sheer number/size of images.

Does the crash still occur using safe mode?
http://support.mozilla.com/kb/Safe+Mode

How about with a new, empty testing profile? (Don't install any addons into it)
http://support.mozilla.com/kb/Basic+Troubleshooting#w_make-a-new-profile
Keywords: crash
Version: unspecified → Trunk
Crashes in safe mode as well. I will try a new profile shortly.

Report IDs:

bp-ee018df9-b958-4430-984c-915212110321 (safe mode)
bp-458352cb-bd57-4aaa-970a-9c1732110321 (regular mode)

I experience the same unresponsiveness when the page is initially loading. This quickly worsens until the UI is frozen, then the crash.

Although there are a huge number of images (each large one is a composite of dozens of smaller tiles), the page works fine (and is responsive) in Safari, IE 6+, Chrome and Firefox 3.x
Crashes with an empty profile:

bp-813c139b-51fd-4020-8118-6366f2110321
Just noticed that the task manager indicated firefox.exe memory usage increases very rapidly when opening the page. Gets to about 1.5GB then crashes.
None of the crash reports will open, not sure why.

Yeah mem usage gets to about the same for me - how much RAM do you have out of interest?

Does this occur for you using 3.6.15?

Adding qawanted to get some eyes on this.
Keywords: qawanted
I have 4GB, but in 32 bit Windows XP. 

It did not occur in 3.6.15. 

In fact, when administering the website with 3.6.15, I can load over 300 panoramas on one page without issue. (public facing site has only 20 per page).
Whilst it may take a few iterations, perhaps the easiest way is just to bisect the regression range using this tool:
http://harthur.github.com/mozregression/
The Mozilla FTP servers seem extremely slow at the moment (at least for me), making the tool unusable. 

The high memory usage is a problem in and of itself. Is there some other action I can take to help diagnose this issue?
That page makes FF4 pretty much unusable for me. No problems on Chrome canary build nor with IE9.

Peak memory > 2GB (in 32bit FF4 on Win7 64bit with 4GB RAM). Chrome's tab uses about 140mb, IE9 ~ 60mb.
Should this block one of the memshrink bugs? e.g. mslim-fx5?

https://bugzilla.mozilla.org/show_bug.cgi?id=640457
In safe mode (the only other tab open is Google Reader as an app tab) I go from 60mb to 1.4GB after opening it. Peak mem is 1.93GB. Page zoom brings Windows itself to a halt for a while. IE9 and Chrome have no problems with page zoom - another bug to file? Especially as IE9 generally has much faster page zoom than FF4.
It seems it is only the private bytes and working set that are huge:

Memory mapped:
393,216,000
          
Memory in use:
258,532,354

	malloc/allocated 258,535,594
	malloc/mapped 393,216,000
	malloc/committed 298,782,720
	malloc/dirty 2,805,760
	win32/privatebytes 1,781,628,928
	win32/workingset 1,810,579,456
	js/gc-heap 75,497,472
	js/string-data 5,815,838
	js/mjit-code 3,684,801
	storage/sqlite/pagecache 28,336,984
	storage/sqlite/other 1,470,272
	gfx/d2d/surfacecache 15,071,516
	gfx/d2d/surfacevram 18,616,868
	images/chrome/used/raw 0
	images/chrome/used/uncompressed 7,560,164
	images/chrome/unused/raw 0
	images/chrome/unused/uncompressed 1,072
	images/content/used/raw 2,849,721
	images/content/used/uncompressed 12,437,868
	images/content/unused/raw 2,108
	images/content/unused/uncompressed 2,144
	layout/all 3,314,618
	layout/bidi 0
	gfx/surface/image 19,991,496
	gfx/surface/win32 0
	content/canvas/2d_pixel_bytes 0
	shmem/allocated 585,728
	shmem/mapped 585,728
Minefield 4.2a1pre + Windows 7 Pro x64 SP1:

Memory mapped:
363,855,872

Memory in use:
347,966,166

malloc/allocated 347,970,110
malloc/mapped 363,855,872
malloc/committed 358,592,512
malloc/dirty 364,544 
win32/privatebytes 1,788,342,272
win32/workingset 1,831,051,264
js/gc-heap 61,865,984
js/string-data 11,301,112
js/mjit-code 19,931,993
storage/sqlite/pagecache 26,154,464
storage/sqlite/other 1,426,856
gfx/d2d/surfacecache 458,332
gfx/d2d/surfacevram 13,218,104
images/chrome/used/raw 0
images/chrome/used/uncompressed 380,752
images/chrome/unused/raw 0
images/chrome/unused/uncompressed 1,072
images/content/used/raw 3,628,980
images/content/used/uncompressed 5,562,868
images/content/unused/raw 0
images/content/unused/uncompressed 0
layout/all 8,163,178
layout/bidi 1,450
gfx/surface/image 5,962,472
gfx/surface/win32 0
content/canvas/2d_pixel_bytes 0

Other info: 6GB DDR3, 2xHD5770s in CrossfireX. And in the time it took to type this, the working set crept up another 90MB (but I think this leak is caused by one of my extensions, and is a problem for another day).
(In reply to comment #14)
> Minefield 4.2a1pre + Windows 7 Pro x64 SP1:
> 
> Memory mapped:
> 363,855,872
>
> win32/privatebytes 1,788,342,272
> win32/workingset 1,831,051,264

Huh.  I thought "memory mapped" was meant to include everything, clearly it's not, on Windows at least!
(In reply to comment #15)
> 
> Huh.  I thought "memory mapped" was meant to included everything, clearly it's
> not, on Windows at least!

Oh wait:  "memory mapped" is only the heap, ie. allocated via malloc/new.  It doesn't include stuff allocated via mmap (on Linux/Mac) or VirtualAlloc (on Windows).  Clearly there's a huge amount of non-heap memory being used here.
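To put a number on that gap, here is a minimal Python sketch using two figures from the first about:memory dump above (win32/privatebytes and malloc/mapped). The function name `non_heap_bytes` is hypothetical, and treating the difference as "non-heap" memory is only an approximation, since the two counters are not measured in exactly the same way.

```python
def non_heap_bytes(private_bytes: int, heap_mapped: int) -> int:
    """Rough estimate of private memory not obtained through malloc/new."""
    return private_bytes - heap_mapped


# Figures copied from the first about:memory dump in this bug.
PRIVATE_BYTES = 1_781_628_928  # win32/privatebytes
HEAP_MAPPED = 393_216_000      # malloc/mapped

gap = non_heap_bytes(PRIVATE_BYTES, HEAP_MAPPED)
print(f"non-heap estimate: {gap:,} bytes ({gap / 2**20:.0f} MiB)")
```

On these numbers, the memory not accounted for by the heap is several times larger than the heap itself.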
Mozilla/5.0 (Windows NT 5.1; rv:2.0) Gecko/20100101 Firefox/4.0

This appears to be a WebGL bug.

Disable ANGLE and Firefox will no longer crash plus win32/privatebytes & win32/workingset will settle to 100-300MB after loading completes (2GB of pagefile is still abused and GPU memory still gets maxed out and flushed repeatedly while the page is loading):
webgl.prefer-native-gl TRUE


Disable WebGL completely, and all problems go away with <500MB system memory use, <500MB pagefile use, basically no GPU memory use, no crash, and the page finishes loading orders of magnitude quicker:
webgl.disabled TRUE

malloc/allocated 136,920,742
malloc/mapped 164,626,432
malloc/committed 153,059,328
malloc/dirty 2,514,944
win32/privatebytes 476,151,808
win32/workingset 493,727,744
js/gc-heap 27,262,976
js/string-data 7,037,112
js/mjit-code 4,154,287
storage/sqlite/pagecache 15,593,504
storage/sqlite/other 1,050,768
gfx/d2d/surfacecache 0
gfx/d2d/surfacevram 0
gfx/surface/win32 334,146,966
images/chrome/used/raw 0
images/chrome/used/uncompressed 239,676
images/chrome/unused/raw 0
images/chrome/unused/uncompressed 0
images/content/used/raw 8,205,399
images/content/used/uncompressed 333,890,590
images/content/unused/raw 10,335
images/content/unused/uncompressed 14,508
layout/all 719,538
layout/bidi 0
gfx/surface/image 4,416
content/canvas/2d_pixel_bytes 0
Here is what I'm seeing when I disable WEBGL with the 3/24 nightly build, Windows 7 Pro x64 SP1:

Memory mapped:
769,654,784
Memory in use:
728,573,596

malloc/allocated 728,579,588
malloc/mapped 769,654,784
malloc/committed 742,289,408
malloc/dirty 1,212,416
win32/privatebytes 870,113,280
win32/workingset 862,244,864
js/gc-heap 56,623,104
js/string-data 12,777,210
js/mjit-code 22,236,352
storage/sqlite/pagecache 24,509,264
storage/sqlite/other 1,457,864
gfx/d2d/surfacecache 254,176,156
gfx/d2d/surfacevram 13,902,424
images/chrome/used/raw 0
images/chrome/used/uncompressed 201,228
images/chrome/unused/raw 0
images/chrome/unused/uncompressed 0
images/content/used/raw 10,940,690
images/content/used/uncompressed 377,148,272
images/content/unused/raw 10,491
images/content/unused/uncompressed 13,520
layout/all 6,615,040
layout/bidi 896 
gfx/surface/image 377,380,316
gfx/surface/win32 0
content/canvas/2d_pixel_bytes 0
With just this tab open (from a fresh start) and WebGL disabled, mem use is still a lot higher than Chrome and IE9.



  Memory Usage


    Overview

Memory mapped: 	417,333,248
Memory in use: 	402,690,868


    Other Information

Description 	Value
malloc/allocated	402,695,596
malloc/mapped	417,333,248
malloc/committed	413,609,984
malloc/dirty	2,801,664
win32/privatebytes	467,984,384
win32/workingset	488,022,016
js/gc-heap	19,922,944
js/string-data	3,933,904
js/mjit-code	2,470,316
gfx/d2d/surfacecache	214,560,560
gfx/d2d/surfacevram	5,920,096
images/chrome/used/raw	0
images/chrome/used/uncompressed	163,468
images/chrome/unused/raw	0
images/chrome/unused/uncompressed	0
images/content/used/raw	7,752,838
images/content/used/uncompressed	319,879,824
images/content/unused/raw	6,211
images/content/unused/uncompressed	10,752
storage/sqlite/pagecache	6,781,784
storage/sqlite/other	1,070,632
layout/all	1,874,952
layout/bidi	0
gfx/surface/image	320,067,444
gfx/surface/win32	0
content/canvas/2d_pixel_bytes	0
Here's a comparison of the three major browsers' memory usage (on win7 64bit) with just that tab open (all browsers opened fresh for test), from chrome's about:memory.


                              RAM                                Virtual
Browser                       Private    Shared    Total         Private    Mapped
Google Chrome 12.0.713.0      403,496k   13,908k   417,404k      411,604k   113,408k
IE 9.00.8112.16421            219,928k   16,374k   236,302k      114,740k   425,452k
Firefox 4.2a1pre              461,104k   33,724k   494,828k      458,428k   104,856k
If I disable WEBGL and ALL hardware acceleration including d2d and directwrite, I see the following memory usage on this page:

Memory mapped:
372,244,480
 
Memory in use:
341,752,450

malloc/allocated 341,756,586
malloc/mapped 372,244,480
malloc/committed 358,539,264
malloc/dirty 1,183,744
win32/privatebytes 831,008,768
win32/workingset 804,876,288
js/gc-heap 58,720,256
js/string-data 12,813,966
js/mjit-code 24,686,224
gfx/d2d/surfacecache 0
gfx/d2d/surfacevram 0
gfx/surface/win32 342,416,339
images/chrome/used/raw 0
images/chrome/used/uncompressed 193,136
images/chrome/unused/raw 0
images/chrome/unused/uncompressed 27,404
images/content/used/raw 9,766,094
images/content/used/uncompressed 342,103,152
images/content/unused/raw 124,585
images/content/unused/uncompressed 60,040
storage/sqlite/pagecache 29,905,384
storage/sqlite/other 1,422,384
layout/all 7,710,176
layout/bidi 896
gfx/surface/image 4,880
content/canvas/2d_pixel_bytes 0

And then if I disable methodjit, webgl, and hardware acceleration:

Memory mapped:
336,592,896

Memory in use:
323,015,444

malloc/allocated 323,019,532
malloc/mapped 336,592,896
malloc/committed 331,681,79
malloc/dirty 2,584,576
win32/privatebytes 749,002,752
win32/workingset 737,599,488
js/gc-heap 52,428,800
js/string-data 17,210,376
js/mjit-code 0
gfx/d2d/surfacecache 0
gfx/d2d/surfacevram 0
gfx/surface/win32 342,502,320
images/chrome/used/raw 0
images/chrome/used/uncompressed 183,196
images/chrome/unused/raw 0
images/chrome/unused/uncompressed 4,240
images/content/used/raw 9,485,661
images/content/used/uncompressed 338,779,036
images/content/unused/raw 923,224
images/content/unused/uncompressed 3,547,284
storage/sqlite/pagecache 24,871,344
storage/sqlite/other 1,448,512
layout/all 6,805,205
layout/bidi 1,124
gfx/surface/image 3,904
content/canvas/2d_pixel_bytes 0
This bug is about a _crash_, which appears to be caused by insane memory usage by WebGL. I suspect there is already another (non-WebGL) bug for high memory usage on image-heavy sites, as it's been talked about plenty prior to the Fx4 release. If not, someone should file a new bug.

Firefox Safe Mode (no tabs, WebGL disabled)
Memory mapped:
30,408,704

Memory in use:
26,054,382

malloc/allocated 26,057,606
malloc/mapped 30,408,704
malloc/committed 28,966,912
malloc/dirty 1,679,360
win32/privatebytes 38,735,872
win32/workingset 48,754,688
js/gc-heap 3,145,728
js/string-data 513,890
js/mjit-code 0
gfx/d2d/surfacecache 0
gfx/d2d/surfacevram 0
gfx/surface/win32 155,368
images/chrome/used/raw 0
images/chrome/used/uncompressed 154,268
images/chrome/unused/raw 0
images/chrome/unused/uncompressed 0
images/content/used/raw 894
images/content/used/uncompressed 1,060
images/content/unused/raw 0
images/content/unused/uncompressed 0
storage/sqlite/pagecache 5,196,928
storage/sqlite/other 614,456
layout/all 270,869
layout/bidi 0
gfx/surface/image 2,208
content/canvas/2d_pixel_bytes 0



Firefox Safe Mode (http://www.mapcrunch.com/gallery, WebGL disabled)
Memory mapped:
63,963,136

Memory in use:
56,651,722
          
malloc/allocated 56,654,946
malloc/mapped 63,963,136
malloc/committed 62,316,544
malloc/dirty 3,358,720
win32/privatebytes 367,988,736
win32/workingset 380,887,040
js/gc-heap 9,437,184
js/string-data 1,230,348
js/mjit-code 0
gfx/d2d/surfacecache 0
gfx/d2d/surfacevram 0
gfx/surface/win32 333,384,680
images/chrome/used/raw 0
images/chrome/used/uncompressed 187,208
images/chrome/unused/raw 0
images/chrome/unused/uncompressed 0
images/content/used/raw 8,047,172
images/content/used/uncompressed 333,196,084
images/content/unused/raw 130
images/content/unused/uncompressed 64
storage/sqlite/pagecache 7,466,760
storage/sqlite/other 768,752
layout/all 1,811,813
layout/bidi 0
gfx/surface/image 7,728
content/canvas/2d_pixel_bytes 0

As you can see, the majority of Firefox's memory use is coming from storing ~318MB of Uncompressed Images (images/content/used/uncompressed) in memory. Subtract that from the Working Set and you get only ~45MB for the Browser itself, not much at all. This is on Windows XP x86.
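The subtraction above can be checked with a short Python sketch. The two inputs are copied from the Safe Mode gallery dump just above; the variable names are illustrative only, and the "browser itself" figure is simply the remainder, not a measured value.

```python
MIB = 2**20

# Figures copied from the Safe Mode (gallery, WebGL disabled) dump above.
WORKING_SET = 380_887_040          # win32/workingset
UNCOMPRESSED_IMAGES = 333_196_084  # images/content/used/uncompressed

# Whatever is left after removing decoded image data from the working set.
remainder = WORKING_SET - UNCOMPRESSED_IMAGES

print(f"uncompressed images: {UNCOMPRESSED_IMAGES / MIB:.0f} MiB")
print(f"everything else:     {remainder / MIB:.0f} MiB")
```

Note that the ~318MB and ~45MB figures only come out this way when a megabyte is taken as 2^20 bytes.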
Alex Firestone, thanks for the diagnosis in comment 17, that's extremely helpful! I've CC'd some WebGL experts.

Let's keep this bug about the specific WebGL problem.  If you see another website where you think Firefox is using an unreasonable amount of memory, please file a new bug with full steps to reproduce and mark it as blocking bug 640457, which is the tracking bug for our ongoing efforts to reduce Firefox's memory consumption.  Thanks!
Whiteboard: see comment 17
Component: General → Canvas: WebGL
Product: Firefox → Core
QA Contact: general → canvas.webgl
Summary: crash everytime when visiting this page → Huge memory usage by WebGL leads to crash
Disabling WebGL only mitigates the problem (at least on my system, Win XP 32 bit, FF 4.0).

Try the following with WebGL disabled:

1. Visit http://www.mapcrunch.com/gallery
2. Scroll to bottom and click through to page 2.
3. Observe cumulative memory usage increase in task manager (or about:memory)

I can reach page 5 before memory use exceeds 1.8GB, and the crash occurs:

bp-d6ef8a17-42af-4d9d-842d-bebee2110324

With WebGL enabled, the crash (and extreme memory use) occurred on the first page.
(In reply to comment #24)
> Disabling WebGL only mitigates the problem (at least on my system, Win XP 32
> bit, FF 4.0).
> 
> Try the following with WebGL disabled:
> 
> 1. Visit http://www.mapcrunch.com/gallery
> 2. Scroll to bottom and click through to page 2.
> 3. Observe cumulative memory usage increase in task manager (or about:memory)
> 
> I can reach page 5 before memory use exceeds 1.8GB, and the crash occurs:
> 
> bp-d6ef8a17-42af-4d9d-842d-bebee2110324
> 
> With WebGL enabled, the crash (and extreme memory use) occurred on the first
> page.

I can't reproduce this on my WinXP PC. Every time a page finishes loading, Firefox 4 garbage collection reduces memory usage to around that of a single page on mapcrunch nearly instantly. Below is my memory use after 20 pages using Firefox 4 Safe-mode:

malloc/allocated 83,951,880
malloc/mapped 101,711,872
malloc/committed 95,399,936
malloc/dirty 3,862,528
win32/privatebytes 368,918,528
win32/workingset 383,176,704
js/gc-heap 16,777,216
js/string-data 1,356,998
js/mjit-code 0
gfx/d2d/surfacecache 0
gfx/d2d/surfacevram 0
gfx/surface/win32 308,201,680
images/chrome/used/raw 0
images/chrome/used/uncompressed 171,156
images/chrome/unused/raw 0
images/chrome/unused/uncompressed 0
images/content/used/raw 7,923,474
images/content/used/uncompressed 308,008,360
images/content/unused/raw 12,472
images/content/unused/uncompressed 20,836
storage/sqlite/pagecache 23,293,448
storage/sqlite/other 891,616
layout/all 429,416
layout/bidi 0
gfx/surface/image 2,208
content/canvas/2d_pixel_bytes 0

Overall memory usage was lower than page 1 of mapcrunch.com only:
malloc increased slightly, privatebytes stayed the same, workingset decreased, uncompressed images decreased

With WebGL disabled, everything seems to be working correctly on my end as far as garbage collection and memory use go. Nick, after changing a page, how long does it take Firefox garbage collection to start reducing memory?
Can't reproduce on win7 64bit (latest 32bit nightly) with webgl disabled on a lean profile (D2D/DW/D3D10 all active). 

After 20 pages memory use is:


Memory mapped: 	409,993,216
Memory in use: 	387,684,506


    Other Information

Description 	Value
malloc/allocated	387,689,058
malloc/mapped	409,993,216
malloc/committed	399,360,000
malloc/dirty	3,338,240
win32/privatebytes	463,241,216
win32/workingset	487,948,288
js/gc-heap	17,825,792
js/string-data	1,300,066
js/mjit-code	2,518,041
gfx/d2d/surfacecache	210,945,524
gfx/d2d/surfacevram	4,886,180
images/chrome/used/raw	0
images/chrome/used/uncompressed	159,228
images/chrome/unused/raw	0
images/chrome/unused/uncompressed	0
images/content/used/raw	7,415,494
images/content/used/uncompressed	304,426,544
images/content/unused/raw	19,811
images/content/unused/uncompressed	4,195,632
storage/sqlite/pagecache	13,618,856
storage/sqlite/other	1,096,752
layout/all	1,790,166
layout/bidi	0
gfx/surface/image	308,794,772
gfx/surface/win32	0
content/canvas/2d_pixel_bytes	0

Peak memory according to win task manager was 767mb.
I followed these steps:

1. Visited http://www.mapcrunch.com/gallery
2. After page had finished loading, task manager indicated ~400MB 
3. Clicked through to page 2, waited for that to load, indicated ~800MB

I then waited for around 4 minutes, but the memory usage did not change.

Perhaps I should try with the latest nightly?
With webGL off are you using a fresh profile, or one with addons etc? 

I can't reproduce your memory increase/crash on my normal profile either (web GL off, ABP and some small extensions enabled, decode on draw on, all HWA on).
(In reply to comment #25)
> Nick, after changing a page, how long
> does it take Firefox garbage collection to start reducing memory?

Depends entirely on what the page is doing.  But garbage collection is only used for the JavaScript heap, and the js/gc-heap entries above are all quite small, so I don't think it's relevant here.

W.r.t. comment 24:  since that problem is not as bad as the first reported problem, and it's not reproducing for everyone, focussing on the first reported problem seems like the best idea here.
Okay, after re-testing with a fresh profile, I no longer experienced the cumulative memory increase.

I re-enabled addons to find the culprit; the issue occurs when ABP is enabled.

To summarise:

WebGL enabled, no addons: Extreme memory use and crash on first page

WebGL disabled, ABP enabled: Cumulative memory increase (400MB per page) leading to crash after 5 pages

WebGL disabled, ABP disabled: Normal behaviour, memory 'clears' after each subsequent page load, hence no significant cumulative increase, and no crash.
What filters are you using with ABP? I use fanboy's list and his tracking list without your memory problem. Are you also using noscript?
> I renabled addons to find the culprit; the issue occurs when ABP is enabled.

Thanks for the info!  Sounds like it might be a leak that occurs when ABP is enabled.  Nick, can you spin off a new bug, include the steps to reproduce (including disabling WebGL) and mark the new bug as blocking bug 640452 (which tracks leaks)?  Thanks!
(In reply to comment #31)
> What filters are you using with ABP? I use fanboy's list and his tracking list
> without your memory problem. Are you also using noscript?

I'm not using noscript, and tried ABP with both fanboy's list and easylist. The behaviour is consistent - leak with ABP every time, and no leak with ABP disabled, every time.

I have reported a new bug, 644876, as this is now a different issue to the webGL memory leak. We should probably continue trying to reproduce the bug there.
Regarding the WebGL leak:

This seems to be a leak in ANGLE: I can reproduce the huge memory usage in the default config, but when I set webgl.prefer-native-gl=true, I can't reproduce anymore. The effect of this preference is to use your OpenGL driver instead of ANGLE for rendering.

Can you guys try with these preferences set:
  webgl.prefer-native-gl=true
  webgl.force-enabled=true

Note: force-enabled is needed on Intel and ATI cards on Windows where we have blacklisted their OpenGL drivers. Warning: depending on how bad your OpenGL driver is, this could be crashy.
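For anyone who prefers not to flip these by hand in about:config, the same two preferences can be placed in a `user.js` file in the profile directory. The Python sketch below only prints the lines to paste; the preference names are the ones given above, while the helper name `user_pref_line` and the `PREFS` table are hypothetical. The same warning applies: forcing native OpenGL on a blacklisted driver could be crashy.

```python
def user_pref_line(name: str, value: bool) -> str:
    """Format one boolean preference in user.js syntax."""
    literal = "true" if value else "false"
    return f'user_pref("{name}", {literal});'


# Preference names as given in this comment; both are booleans.
PREFS = {
    "webgl.prefer-native-gl": True,
    "webgl.force-enabled": True,
}

lines = [user_pref_line(name, value) for name, value in PREFS.items()]
print("\n".join(lines))
```

Remove the lines from `user.js` afterwards to return to the default (ANGLE) configuration.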
(In reply to comment #17)
> This appears to be a WebGL bug.
> 
> Disable ANGLE and Firefox will no longer crash plus win32/privatebytes &
> win32/workingset will settle to 100-300MB after loading completes (2GB of
> pagefile is still abused and GPU memory still gets maxed out and flushed
> repeatedly while the page is loading):
> webgl.prefer-native-gl TRUE

There are still some problems without ANGLE, which I described in comment 17.

__________

Bad:

With or without ANGLE = ~2GB pagefile (virtual memory) use until Firefox is closed.

With or without ANGLE = GPU memory filled to capacity and flushed repeatedly during page load (causes background to turn black and flicker)

With ANGLE = Extremely high win32/privatebytes & win32/workingset

With or without ANGLE = Extremely slow loading of http://www.mapcrunch.com/gallery caused by the above
__________

Good:

Without ANGLE = Very low win32/privatebytes & win32/workingset
__________


Benoit, can you reproduce the large GPU memory fluctuations, slow page loading, and high pagefile use? This is on WinXP x86 with a 7800GTX 512MB.
I can reproduce the effect: disabling WebGL makes the mapcrunch site load much faster and stay much more responsive. Memory usage is also much lower.
Sorry to ask, but any word on a fix for this? Still exists in 4.0.1. 

Status is still unconfirmed, but it's easily demonstrable and reproducible.
Marking new, in case unco is what is stopping the relevant people getting bugmail.
Status: UNCONFIRMED → NEW
Ever confirmed: true
bjacob: have you had a chance to look at this more?
Yes, sorry, been busy. Will add about:memory support for WebGL resources, so we get a chance to learn more about this problem.
(In reply to comment #37)
> Sorry to ask, but any word on a fix for this? Still exists in 4.0.1. 
> 
> Status is still unconfirmed, but it's easily demonstratable and reproducable.

...depends for whom. I've still not been able to reproduce the issue here. That's why adding about:memory will help by allowing people who can reproduce to get more info on their own machine.
bjacob, when you file the bug for the about:memory reporters can you CC me?  Thanks.
Depends on: 638549
Keywords: mlk
Whiteboard: see comment 17 → [see comment 17][MemShrink:P1]
Assignee: nobody → jmuizelaar
Jeff, any updates on this bug?
I don't see any evidence that this site uses webgl at all. This looks like the same bug as our other image memory bugs.
Summary: Huge memory usage by WebGL leads to crash → Huge memory usage on mapcrunch.com leads to crash
Indeed I'm pretty sure it doesn't use WebGL at all, since I set a breakpoint on WebGLContext::SetDimensions and it wasn't hit ;-) The worrying thing is that I didn't realize that before.
Whiteboard: [see comment 17][MemShrink:P1] → [see comment 45][MemShrink:P1]
He changed the site to not use WebGL because of this bug. You now need to use the following link:
http://www.mapcrunch.com/gallery?webgl=1
The WebGL version is using cross-domain textures so it doesn't work since bug 656277 landed. It will resume working once both these things happen: bug 662599 lands, and the image servers get updated to use CORS (if these are Google servers, you can count on this happening quickly as they are very interested in this problem).
Aurora is still on r653. Between 653 and 686, there is 678, http://code.google.com/p/angleproject/source/detail?r=678 , which looks serious to me from a security perspective. This sounds like the kind of thing that can give access to uninitialized video memory, though you'd have to ask an ANGLE developer to be sure.

I support taking ANGLE r686 on Aurora, it's been well tested on Nightly, it has useful improvements (removes crashes in debug builds on the conformance test suite, removes wrong-rendering bugs).
oops, ignore comment 49, wrong bug.
Someone with access should update the URL in the bug with the WebGL link and change the title to once again reflect WebGL. The major concern with this bug was related to extremely high WebGL memory usage and out-of-memory crashes. 

Without WebGL, mapcrunch works normally, if a bit high on memory use per comment 45. Though as Benoit stated in comment 48, it will be a little while until this bug can be reproduced again in anything other than Fx4.
Summary: Huge memory usage on mapcrunch.com leads to crash → Huge memory usage on mapcrunch.com with webgl leads to crash
Whiteboard: [see comment 45][MemShrink:P1] → [see comment 17][MemShrink:P1]
I still get a very high memory usage using the current nightly when I use the webgl version. So maybe although the site is not visually correct, we can still try to fix the bug?
In bug 638549 I've made patches that add WebGL accounting to about:memory.

Try builds are ready at:
https://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/bjacob@mozilla.com-c2df6455c534/

Can you please try this demo in these builds, and go to about:memory and check the webgl entries there, at the bottom under the 'Other Measurements' category. They only appear if webgl objects are in memory. If you don't see webgl there, that means that no webgl objects are currently in memory.
It seems there are some webgl objects, but those are not taking up much memory.

I'm just not sure whether this is still the original problem, since after a GC, the memory returns to normal.

This is using WebGL:

Main Process

Explicit Allocations
542.01 MB (100.0%) -- explicit
├──426.94 MB (78.77%) -- images
│  ├──426.75 MB (78.73%) -- content
│  │  ├──426.75 MB (78.73%) -- used
│  │  │  ├──419.33 MB (77.36%) -- uncompressed
│  │  │  └────7.42 MB (01.37%) -- raw
│  │  └────0.00 MB (00.00%) -- (1 omitted)
│  └────0.19 MB (00.04%) -- (1 omitted)
├───48.06 MB (08.87%) -- js
│   ├──22.26 MB (04.11%) -- compartment([System Principal])
│   │  ├───8.56 MB (01.58%) -- gc-heap
│   │  │   ├──4.87 MB (00.90%) -- objects
│   │  │   ├──3.08 MB (00.57%) -- shapes
│   │  │   └──0.61 MB (00.11%) -- (5 omitted)
│   │  ├───7.06 MB (01.30%) -- (6 omitted)
│   │  └───6.64 MB (01.23%) -- mjit-code
│   ├──18.18 MB (03.35%) -- compartment(http://www.mapcrunch.com/gallery?webgl=1...)
│   │  ├───7.68 MB (01.42%) -- gc-heap
│   │  │   ├──6.32 MB (01.17%) -- objects
│   │  │   └──1.36 MB (00.25%) -- (6 omitted)
│   │  ├───4.93 MB (00.91%) -- object-slots
│   │  ├───3.36 MB (00.62%) -- mjit-code
│   │  └───2.21 MB (00.41%) -- (5 omitted)
│   ├───4.52 MB (00.83%) -- (6 omitted)
│   └───3.10 MB (00.57%) -- gc-heap-chunk-unused
├───48.04 MB (08.86%) -- heap-unclassified
├───16.19 MB (02.99%) -- storage
│   └──16.19 MB (02.99%) -- sqlite
│      ├──13.12 MB (02.42%) -- places.sqlite
│      │  ├──12.87 MB (02.37%) -- cache-used
│      │  └───0.25 MB (00.05%) -- (2 omitted)
│      └───3.07 MB (00.57%) -- (13 omitted)
└────2.78 MB (00.51%) -- (2 omitted)

Other Measurements
618.10 MB -- resident
613.43 MB -- private
536.94 MB -- heap-committed
530.75 MB -- heap-used
419.55 MB -- gfx-surface-image
 30.24 MB -- heap-unused
 21.00 MB -- js-gc-heap
 11.37 MB -- gfx-d2d-surfacevram
  2.18 MB -- heap-dirty
  1.10 MB -- webgl-buffer-memory
  0.17 MB -- gfx-d2d-surfacecache
  0.00 MB -- gfx-surface-win32
  0.00 MB -- webgl-texture-memory
      418 -- webgl-texture-count
       45 -- webgl-buffer-count
       15 -- webgl-context-count
New tryserver builds, with expanded WebGL about:memory coverage, will soon be available at:
https://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/bjacob@mozilla.com-087d304b9757
(In reply to comment #54)
> Explicit Allocations
> 542.01 MB (100.0%) -- explicit
> ├──426.94 MB (78.77%) -- images
> │  ├──426.75 MB (78.73%) -- content
> │  │  ├──426.75 MB (78.73%) -- used
> │  │  │  ├──419.33 MB (77.36%) -- uncompressed

So this demo is already loading 400 M worth of images, and the following puts this in perspective:

> Other Measurements
> 618.10 MB -- resident
> 613.43 MB -- private
> 536.94 MB -- heap-committed
> 530.75 MB -- heap-used
> 419.55 MB -- gfx-surface-image

Finally, this:

> 418 -- webgl-texture-count

almost certainly means that the demo is storing 418 images simultaneously, which explains how it ended up using 400 M of memory for image data.

So in your case, the large number of images explains the high memory usage. Now the fact that the gfx-surface-image measurement is high despite WebGL being used and 400 textures being created strongly suggests that this demo is keeping the images in memory while also having them as WebGL textures, so that each image exists twice in memory (as a standard image and as a WebGL texture). Depending on the graphics driver, WebGL textures may well end up being stored in main memory, especially if it runs out of video memory, so it seems that this demo will use twice as much memory in the WebGL version as in the non-WebGL version.
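As a back-of-the-envelope check of that reasoning, the Python sketch below uses the two figures quoted from comment 54 (419.33 MB of uncompressed images and 418 textures). It assumes one texture per decoded image and that a texture copy is the same size as the decoded image; neither assumption is confirmed anywhere in this bug, so the result is an estimate only.

```python
# Figures quoted from the about:memory output in comment 54.
UNCOMPRESSED_MB = 419.33  # images/content/used/uncompressed
TEXTURE_COUNT = 418       # webgl-texture-count

# Assumption: one texture per image, so this is the average decoded image size.
avg_image_mb = UNCOMPRESSED_MB / TEXTURE_COUNT

# Assumption: each texture copy is as large as the decoded image it came from.
doubled_mb = 2 * UNCOMPRESSED_MB

print(f"average decoded image: {avg_image_mb:.2f} MB")
print(f"images plus texture copies: {doubled_mb:.2f} MB")
```

Even the doubled figure stays well below the ~1.8GB private bytes reported earlier in this bug, which is consistent with the remark that duplication alone does not explain the original reports.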

That still doesn't explain, though, the huge memory usage difference that was noted in e.g. comment 10.
Here is what I get with a MSVC10 PGO jemalloc build of 8.0a1 changeset 97012a02db93 on WinXP SP3:

With WebGL
http://www.mapcrunch.com/gallery?webgl=1
(no images ever render)

Explicit Allocations
126,035,752 B (100.0%) -- explicit
├──1,315,228,763 B (1043.54%) -- images
│  ├──1,315,042,495 B (1043.39%) -- content
│  │  ├──1,315,042,495 B (1043.39%) -- used
│  │  │  ├──1,293,552,032 B (1026.34%) -- uncompressed
│  │  │  └─────21,490,463 B (17.05%) -- raw
│  │  └──────────────0 B (00.00%) -- unused
│  │                 ├──0 B (00.00%) -- raw
│  │                 └──0 B (00.00%) -- uncompressed
│  └────────186,268 B (00.15%) -- chrome
│           ├──186,268 B (00.15%) -- used
│           │  ├──186,268 B (00.15%) -- uncompressed
│           │  └────────0 B (00.00%) -- raw
│           └────────0 B (00.00%) -- unused
│                    ├──0 B (00.00%) -- raw
│                    └──0 B (00.00%) -- uncompressed
├───38,186,995 B (30.30%) -- js
├───32,773,320 B (26.00%) -- storage
└──-1,262,609,312 B (-1001.79%) -- heap-unclassified

Other Measurements
1,545,977,856 B -- vsize
1,426,059,264 B -- private
1,356,402,688 B -- resident
1,293,739,628 B -- gfx-surface-win32
  125,804,544 B -- heap-committed
  120,117,032 B -- heap-used
   25,165,824 B -- js-gc-heap
    7,808,278 B -- heap-unused
    2,043,904 B -- heap-dirty
    1,155,960 B -- webgl-buffer-memory
      243,360 B -- webgl-buffer-cache-memory
        6,765 B -- webgl-shader-sources-size
        3,904 B -- gfx-surface-image
            0 B -- gfx-d2d-surfacecache
            0 B -- gfx-d2d-surfacevram
            0 B -- webgl-texture-memory
            0 B -- webgl-renderbuffer-memory
            0 B -- webgl-shader-translationlogs-size
             45 -- webgl-buffer-count
             15 -- webgl-context-count
              0 -- webgl-renderbuffer-count
             30 -- webgl-shader-count
          1,536 -- webgl-texture-count

____________

Without WebGL
http://www.mapcrunch.com/gallery
(renders quickly)

Explicit Allocations
59,599,630 B (100.0%) -- explicit
├──210,614,678 B (353.38%) -- images
│  ├──210,252,422 B (352.77%) -- content
│  │  ├──210,250,252 B (352.77%) -- used
│  │  │  ├──205,155,960 B (344.22%) -- uncompressed
│  │  │  └────5,094,292 B (08.55%) -- raw
│  │  └────────2,170 B (00.00%) -- unused
│  │           ├──1,124 B (00.00%) -- uncompressed
│  │           └──1,046 B (00.00%) -- raw
│  └──────362,256 B (00.61%) -- chrome
│         ├──361,196 B (00.61%) -- used
│         │  ├──361,196 B (00.61%) -- uncompressed
│         │  └────────0 B (00.00%) -- raw
│         └────1,060 B (00.00%) -- unused
│              ├──1,060 B (00.00%) -- uncompressed
│              └──────0 B (00.00%) -- raw
├──27,924,566 B (46.85%) -- js
├───5,088,096 B (08.54%) -- storage
└──-186,859,549 B (-313.52%) -- heap-unclassified

Other Measurements
400,543,744 B -- vsize
309,841,920 B -- resident
302,911,488 B -- private
205,520,536 B -- gfx-surface-win32
 60,264,448 B -- heap-committed
 55,597,838 B -- heap-used
 17,825,792 B -- js-gc-heap
  7,315,760 B -- heap-unused
  2,498,560 B -- heap-dirty
      7,808 B -- gfx-surface-image
          0 B -- gfx-d2d-surfacecache
          0 B -- gfx-d2d-surfacevram
____________
(In reply to comment #57)
> With WebGL
> 1,293,739,628 B -- gfx-surface-win32
>
> Without WebGL
> 205,520,536 B -- gfx-surface-win32

This is interesting. The WebGL version of this site has 1.29 G of images loaded at once. The non-WebGL version has only 200M of images loaded. It looks like these are two really different sites, differing not just in the fact that one uses WebGL as its back-end (that alone doesn't explain why the WebGL version uses more gfx-surface-win32 memory).
I tried it again and it got even worse:

Explicit Allocations

106,864,642 B (100.0%) -- explicit
├──1,715,519,865 B (1605.32%) -- images
│  ├──1,715,157,161 B (1604.98%) -- content
│  │  ├──1,715,157,161 B (1604.98%) -- used
│  │  │  ├──1,685,736,352 B (1577.45%) -- uncompressed
│  │  │  └─────29,420,809 B (27.53%) -- raw
│  │  └──────────────0 B (00.00%) -- unused
│  │                 ├──0 B (00.00%) -- raw
│  │                 └──0 B (00.00%) -- uncompressed
│  └────────362,704 B (00.34%) -- chrome
│           ├──362,704 B (00.34%) -- used
│           │  ├──362,704 B (00.34%) -- uncompressed
│           │  └────────0 B (00.00%) -- raw
│           └────────0 B (00.00%) -- unused
│                    ├──0 B (00.00%) -- raw
│                    └──0 B (00.00%) -- uncompressed
├───32,044,114 B (29.99%) -- js
├───11,008,312 B (10.30%) -- storage
└──-1,654,377,829 B (-1548.11%) -- heap-unclassified

Other Measurements
1,915,424,768 B -- vsize
1,800,155,136 B -- private
1,686,100,156 B -- gfx-surface-win32
1,349,218,304 B -- resident
  106,258,432 B -- heap-committed
  101,863,426 B -- heap-used
   19,922,944 B -- js-gc-heap
    6,138,940 B -- heap-unused
    1,458,176 B -- heap-dirty
    1,155,960 B -- webgl-buffer-memory
      243,360 B -- webgl-buffer-cache-memory
        6,765 B -- webgl-shader-sources-size
        4,880 B -- gfx-surface-image
            0 B -- gfx-d2d-surfacecache
            0 B -- gfx-d2d-surfacevram
            0 B -- webgl-texture-memory
            0 B -- webgl-renderbuffer-memory
            0 B -- webgl-shader-translationlogs-size
             45 -- webgl-buffer-count
             15 -- webgl-context-count
              0 -- webgl-renderbuffer-count
             30 -- webgl-shader-count
          2,034 -- webgl-texture-count
________

It does do garbage collection, but as soon as you move back to the tab or interact with a map, memory usage jumps right back up. Overall the site is completely non-functional (browser is semi-hung & nothing renders, just black boxes...) when using OpenGL. Seems to be the same issue as originally reported in any case.
(In reply to comment #59)
> 106,864,642 B (100.0%) -- explicit
> ├──1,715,519,865 B (1605.32%) -- images
> ├───32,044,114 B (29.99%) -- js
> ├───11,008,312 B (10.30%) -- storage
> └──-1,654,377,829 B (-1548.11%) -- heap-unclassified

There is a bit of a problem with the memory reporters here.
(In reply to comment #60)
> >
> > └──-1,654,377,829 B (-1548.11%) -- heap-unclassified
> 
> There is a bit of a problem with the memory reporters here.

Bug 664659, probably.
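
For anyone checking the numbers, here is a rough sketch that reproduces the percentages in the quoted tree from the byte counts. It only redoes the arithmetic (each child divided by the "explicit" total); the function names are made up for illustration, and it says nothing about why the reporters disagree.

```python
# Byte counts copied from the about:memory tree quoted above.
EXPLICIT = 106_864_642
CHILDREN = {
    "images": 1_715_519_865,
    "js": 32_044_114,
    "storage": 11_008_312,
    "heap-unclassified": -1_654_377_829,
}


def share_of_explicit(child_bytes, explicit_bytes=EXPLICIT):
    """Return a child's size as a percentage of the 'explicit' total."""
    return round(100.0 * child_bytes / explicit_bytes, 2)


def suspicious_entries(children, explicit_bytes=EXPLICIT):
    """Names whose size is negative or larger than the parent total."""
    return [name for name, size in children.items()
            if not 0 <= size <= explicit_bytes]


for name, size in CHILDREN.items():
    print(f"{name:>18}: {share_of_explicit(size):9.2f}%")
print("inconsistent:", suspicious_entries(CHILDREN))
```

A child larger than its parent, or a negative child, cannot occur in a consistent tree, so both flagged entries point at the reporters rather than at real heap usage.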
Jeff, any updates here?
I don't think there's a reason to believe it's WebGL related at this point. The WebGL version of mapcrunch just has a huge amount of image data loaded at the same time (see the about:memory captures), that's the first problem to understand.
Since the WebGL version is all Google Maps API stuff, does Firefox loading 1500-2000 WebGL textures all at once for just 15 maps, make this a Google API problem or a Firefox problem? That so many textures are being created, suggests Firefox is trying to render all possible views, for all maps, at the same time, not just the currently visible view like the non-WebGL version.

Can Firefox be coded to render image-heavy WebGL smarter, so out-of-memory problems don't happen? Do something like limit the currently loaded/rendered WebGL textures to what fits in GPU memory, dynamically discarding and swapping out textures as needed?

Is this something Google could fix in their WebGL API?
(In reply to comment #64)
> Since the WebGL version is all Google Maps API stuff, does Firefox loading
> 1500-2000 WebGL textures all at once for just 15 maps, make this a Google
> API problem or a Firefox problem? That so many textures are being created,
> suggests Firefox is trying to render all possible views, for all maps, at
> the same time, not just the currently visible view like the non-WebGL
> version.

It's not Firefox that is trying to do that, it's the JS code at Mapcrunch. WebGL is just a dumb low-level API to talk to the graphics hardware. The decision to use 2000 textures at once comes from Mapcrunch, not from Firefox. Also, it's not just WebGL textures, it's also plain images. If Mapcrunch doesn't have this problem in other browsers, that's interesting, we should then debug this at the JS level to understand why in Firefox it loads so many images and WebGL textures.

> 
> Can Firefox be coded to render image heavy WebGL smarter, so out-of-memory
> problems don't happen?

Again, this is key: none of that is up to Firefox. If the JS script asks to load a WebGL texture, it gets a WebGL texture. It's exactly the same problem as with a JS script that would create a very large array of zeros.

> Do something like limit currently load/rendered WebGL
> textures to what fits in GPU memory, dynamically discarding and swapping out
> textures as needed?

Textures that are paged out of GPU memory go into main memory, so that wouldn't help at all with the problem that we use too much main memory.

> 
> Is this something Google could fix in their WebGL API?

I don't understand what is meant by "Google's WebGL API" ? Did you mean Google Maps API? I don't know anything about it.
Chrome on WinXP with 7800GTX 512MB
(only last 3 maps active when loading finishes):
http://img834.imageshack.us/img834/9135/chromewinxp7800gtx512mb.png

Firefox Nightly on WinXP 7800GTX 512MB:
~2GB memory usage, only black boxes rendered, browser semi-hung


Chrome on Win7 with HD5750 1GB
(all maps active when loading finishes):
http://img803.imageshack.us/img803/8398/chromewin7x6457501gb.png

Firefox Nightly on Win7 5750 1GB:
~1GB memory usage, only black boxes rendered, browser semi-hung
> only black boxes rendered

That is actually the expected behavior since these are cross-domain textures. If you try in Chrome 13 it should be behaving the same way.
If Bug 662599 is fixed, and it currently works in Chrome 15 snapshots, when will the black boxes be fixed on 8.0a1 Nightly builds? Is there another bug tracking that?

(In reply to comment #66)
> Firefox Nightly on WinXP 7800GTX 512MB:
> ~2GB memory usage, only black boxes rendered, browser semi-hung

I take that back. I tested the latest August 2nd nightly, the same one I tested on the Win7 computer, and things were better. It now seems to be using ~1GB like I saw on the Win7 computer (which is good). Though it's only rendering white boxes, not black, so something else may have broken, causing WebGL to not get loaded at all (which may be bad). We'll need to wait for whatever change is needed to make remote (CORS?) images show again for WebGL Google Maps in Firefox.

If both Chrome 15 and Firefox Nightly agree that 15 WebGL Google Maps need ~1GB of RAM, the question is still how does the browser deal with GPUs with less than 1GB of memory on-board? Even Chrome seems to have trouble with this. The first couple of maps briefly loaded, with Chrome ultimately deciding to unload everything except the last 3 maps when it discovers it's unable to allocate any more GPU memory. Firefox on the other hand seems like it was previously loading everything multiple times (constantly maxing out and flushing GPU memory), producing a webgl-texture-count (~1500-2000) on the WinXP computer twice as high as the number of textures actually on the page (~800-1000?). Hopefully the latest Nightly is a good sign.
(In reply to comment #68)
> If Bug 662599 is fixed, and it currently works in Chrome 15 snapshots, when
> will the black boxes be fixed on 8.0a1 Nightly builds? Is there another bug
> tracking that?

Strange: I confirm that this site works in Chrome 14. This means that either Chrome or Firefox has a bug in the implementation of the rules on cross-domain images. Writing to the WebGL mailing list.

> 
> (In reply to comment #66)
> > Firefox Nightly on WinXP 7800GTX 512MB:
> > ~2GB memory usage, only black boxes rendered, browser semi-hung
> 
> I take that back. I tested the latest August 2nd nightly, the same I tested
> on the Win7 computer, and things were better. It now seems to be using ~1GB
> like I saw on the Win7 computer (which is good). Though it's only rendering
> white boxes not black, so something else may have broke causing WebGL to not
> get loaded at all (which may be bad). We'll need to wait for whatever change
> to be done to make remote (CORS?) images shown again for WebGL Google Maps
> in Firefox.

CORS is already implemented, as you noted (bug 662599). So there's a bug in either Firefox's or Chrome's implementation. 

> 
> If both Chrome 15 and Firefox Nightly agree that 15 WebGL Google Maps need
> ~1GB of RAM, the question is still how does the browser deal with GPUs with
> less than 1GB of memory on-board?

It just doesn't. Think of it this way: it's just like a JS script that allocates a huge array of data, occupying 1 G of RAM. On machines with less memory than that, that just won't run.

Note that there is little or no difference between video RAM and main RAM here, as video RAM is nowadays virtualized just like regular RAM.

> Even Chrome seems to have trouble with
> this. The first couple maps briefly loaded, with Chrome ultimately deciding
> to unload everything except the last 3 maps when it discovers it's unable to
> allocate any more GPU memory.

How do you know that Chrome is following that logic?

> Firefox on the other hand seems like it was
> previously loading everything multiple times (constantly maxing out and
> flushing GPU memory), producing twice as many webgl-texture-count
> (~1500-2000) on the WinXP computer then there actually are on page
> (~800-1000?). Hopefully the latest Nightly is a good sign.

Interesting, but again that is probably the behavior of the JS script, not of Firefox per se. There's nothing in Firefox that would cause it to create a 2nd texture when the first fails, as far as I can see.
(In reply to comment #69)
> Strange: I confirm that this site works in Chrome 14. This means that either
> Chrome or Firefox has a bug in the implementation of the rules on
> cross-domain images. Writing to the WebGL mailing list.
> 
> CORS is already implemented, as you noted (bug 662599). So there's a bug in
> either Firefox's or Chrome's implementation. 

Hopefully you have good luck getting it sorted out quickly.

> > If both Chrome 15 and Firefox Nightly agree that 15 WebGL Google Maps need
> > ~1GB of RAM, the question is still how does the browser deal with GPUs with
> > less than 1GB of memory on-board?
> 
> It just doesn't. Think of it this way: it's just like a JS script that
> allocates a huge array of data, occupying 1 G of RAM. On machines with less
> memory than that, that just won't run.
> 
> Note that there is little or no difference between video RAM and main RAM
> here, as video RAM is nowadays virtualized just like regular RAM.
>
> > Even Chrome seems to have trouble with
> > this. The first couple maps briefly loaded, with Chrome ultimately deciding
> > to unload everything except the last 3 maps when it discovers it's unable to
> > allocate any more GPU memory.
> 
> How do you know that Chrome is following that logic?

I don't, but that is what appears to be happening when watching GPU memory usage when the page loads. As soon as GPU memory gets maxed out the first time, Chrome flushes GPU memory and unloads the first maps. It then continues to load the rest of the page without attempting to render any more until it settles on displaying only the last 3-4 maps which can be safely rendered in GPU memory (it fills my GPU memory to 100% a total of 3 times when loading the page). I doubt even Chrome is working perfectly though, since it ended up using the full 512MB of GPU memory with 4 maps, when it was able to render all 15 with only 1GB on Win7 with the ATI card. Still, making an attempt to show something which it is able to display with the GPU memory available seems better than outright failing to show anything.

> Interesting, but again that is probably the behavior of the JS script, not
> of Firefox per se. There's nothing in Firefox that would cause it to create
> a 2nd texture when the first fails, as far as I can see.

Could poor memory management by the GPU driver be causing this odd behavior? How much of rendering WebGL is under control by the browser, and how much by the GPU driver? How much are both aware of what the other is doing? Even Chrome seems to only report the GPU process using half (~256MB) of what is actually (~512MB) being used by the GPU. Does Firefox even attempt to detect available/total GPU memory? NVIDIA Inspector & MSI Afterburner seem like two popular applications which can report GPU memory usage in real-time. Would adding such GPU memory monitoring functionality to Firefox be useful at all?

The optimal way to handle that page seems to be loading maps in order from top to bottom, taking things a step further than Chrome does. As soon as you hit the GPU memory limit (something fails to allocate?), stop loading the rest of the WebGL maps and show the first few which are done. As you scroll down the page, dynamically unload the oldest map to make room for loading up the next map. If you hit the GPU memory limit again, unload the next oldest map. That would make the page usable with limited GPU memory.
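
To make the idea concrete, here is a minimal sketch of that oldest-first policy. It is only an illustration of the bookkeeping, not anything Firefox, Chrome or the Maps API actually does: the class name, the 512 MiB budget and the 70 MiB per map (roughly 1GB spread over 15 maps) are all assumptions.

```python
from collections import OrderedDict

MIB = 1024 * 1024


class MapTextureBudget:
    """Hypothetical helper: keeps loaded maps within a fixed byte budget,
    unloading the oldest map first when a new one does not fit."""

    def __init__(self, budget_bytes):
        self.budget_bytes = budget_bytes
        self.loaded = OrderedDict()  # map id -> bytes, oldest first

    def used_bytes(self):
        return sum(self.loaded.values())

    def load(self, map_id, size_bytes):
        """Load a map, evicting the oldest maps as needed.

        Returns the list of map ids that were unloaded to make room.
        """
        if size_bytes > self.budget_bytes:
            raise ValueError("map is larger than the whole budget")
        evicted = []
        if map_id in self.loaded:
            return evicted  # already resident, nothing to do
        while self.used_bytes() + size_bytes > self.budget_bytes:
            old_id, _ = self.loaded.popitem(last=False)  # drop the oldest
            evicted.append(old_id)
        self.loaded[map_id] = size_bytes
        return evicted


# Assumed numbers: 15 maps of 70 MiB each on a card with 512 MiB.
budget = MapTextureBudget(512 * MIB)
for map_id in range(1, 16):
    dropped = budget.load(map_id, 70 * MIB)
    if dropped:
        print(f"loading map {map_id} unloaded {dropped}")
print("resident maps:", list(budget.loaded))
```

With these assumed sizes seven maps stay resident at any time, and scrolling to a new map costs exactly one unload.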

Still, if you keep on saying everything is probably the fault of JS (Firefox having no control over anything), and MapCrunch is just using the standard Google Maps JavaScript API (which supports WebGL), wouldn't that make it a Google problem? If getting Google to fix their WebGL JavaScript for Google Maps is the ultimate solution (if you don't think this is a Firefox WebGL problem), someone would still need to at least identify the badly behaving WebGL JS and report it to Google for fixing, before this bug could be closed.
(In reply to comment #70)
> (In reply to comment #69)
> > Strange: I confirm that this site works in Chrome 14. This means that either
> > Chrome or Firefox has a bug in the implementation of the rules on
> > cross-domain images. Writing to the WebGL mailing list.
> > 
> > CORS is already implemented, as you noted (bug 662599). So there's a bug in
> > either Firefox's or Chrome's implementation. 
> 
> Hopefully you have good luck getting it sorted out quickly.

It turns out that when run in Firefox, this page NEVER sets crossOrigin attributes. Neither 'anonymous' nor 'use-credentials'. So Firefox has the correct behavior of blocking these textures.

Most likely this page does browser detection and does not use CORS with Firefox.

I'm scared to think what else in this page could be different between Firefox and Chrome. So far we don't have a definite reason to believe that Firefox is using too much memory for the code that it's receiving, since we don't know that any other browser receives the same code.
(In reply to comment #70)
> Still, if you keep on saying everything is probably the fault of JS (Firefox
> having no control over anything), and MapCrunch is just using the standard
> Google Maps Javascipt API (which supports WebGL), wouldn't that make it a
> Google problem? If getting Google to fix their WebGL javascript for Google
> Maps is the ultimate solution (if you don't think this is a Firefox WebGL
> problem), someone would still need to at least identify the badly behaving
> WebGL JS and report it to Google for fixing, before this bug could be closed.

Do you know if the Google Maps API is what should take care of setting the CORS attribute? I don't know how this API works.

If yes, then we need to report a bug to them, about crossOrigin not being set on Firefox. I believe that crossOrigin should just be set everywhere for simplicity.
Many thanks to Jeff for helping me investigate this. It turns out that the page always sets image.crossOrigin="".

The spec says that an empty string means 'anonymous CORS', but our implementation wrongly interprets it as 'no CORS'. Patch coming.
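
As a sketch of the expected mapping (my reading of the HTML spec's crossorigin attribute, not the actual Firefox code): a missing attribute means no CORS, "use-credentials" means credentialed CORS, and the empty string, "anonymous" or any unrecognized value means anonymous CORS. The function name and the returned strings are made up for illustration.

```python
def cors_mode(cross_origin):
    """Map a crossorigin attribute value to a CORS mode.

    None models an element with no crossorigin attribute at all.
    """
    if cross_origin is None:
        return "no-cors"  # attribute missing: plain cross-domain load
    if cross_origin.lower() == "use-credentials":
        return "use-credentials"
    # "anonymous", the empty string and unrecognized values all end up here.
    return "anonymous"


for value in (None, "", "anonymous", "use-credentials"):
    print(repr(value), "->", cors_mode(value))
```

The case that matters for this page is image.crossOrigin="", which has to come out as anonymous CORS rather than no CORS.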
Filed bug 643651 about the CORS bug.
Benoit: should this still be a MemShrink:P1?
(In reply to comment #75)
> Benoit: should this still be a MemShrink:P1?

Today's findings do not pertain to memory usage, only to CORS correctness (bug 643651). So there is no new reason to make that a MemShrink:P1 as of today.
(In reply to comment #75)
> Benoit: should this still be a MemShrink:P1?

Oh I misread the 'still'. This page is creating a thousand WebGL textures, which pretty much has to be its own fault. I can't say for sure yet that nothing is our fault here, but nothing has been found yet MemShrink-wise. (On the other hand a serious bug has been found wrt CORS correctness). So I would agree with removing the MemShrink:P1.
Ok, I've downgraded to MemShrink:P2.  Thanks for the info!
Whiteboard: [see comment 17][MemShrink:P1] → [see comment 17][MemShrink:P2]
I wonder if this could have something to do with this WebGL list discussion:
https://www.khronos.org/webgl/public-mailing-list/archives/1106/msg00099.html

This could explain why a script has a large number of textures around without noticing it. Such a script would be a poor WebGL script, but as I explain there, there may be something we could do to help such scripts not use too much memory.
(In reply to comment #74)
> Filed bug 643651 about the CORS bug.

Bug 676413 is the correct bug number for the CORS bug, for anybody having trouble finding it.
(In reply to comment #80)
> Bug 676413 is the correct bug number for the CORS bug, for anybody having
> trouble finding it.

Oops, yes.
The CORS bug 676413 is now fixed, and in today's Nightly the pictures now show like in Chrome.

But the huge memory usage is still present, both on the heap and in GL texture memory:

1,909.65 MB -- heap-allocated
1,951.00 MB -- heap-committed
1,146.52 MB -- resident
2,577.50 MB -- vsize
  760.25 MB -- webgl-texture-memory

On the other hand, I can no longer reproduce the problem with huge amounts of memory staying in use for general images (as opposed to GL textures):

    0.06 MB -- gfx-surface-image
    5.77 MB -- gfx-surface-xlib

Using 760 M of texture memory is never a good idea, and is the fault of the web page. Is Chrome really using less? That's not really possible if the code is the same. WebGL texture memory usage is controlled by the script and fully specified. So I'd just assume that Chrome uses just as much texture memory on this page. Note that texture memory is not accounted for in normal memory usage measurements.

What I'd rather like to understand is why we're using 1.9 G on the heap and Chrome isn't.
Generated as explained by Nick on this blog post:
http://blog.mozilla.com/nnethercote/2010/12/09/memory-profiling-firefox-with-massif/

Below is the peak snapshot. If I understand that correctly, it turns out that:
 * most of the malloc'd memory is allocated by the GL driver (libnvidia-glcore).
 * most of the mmap'd memory is mmap'd from GL context initialization, from WebGL context initialization.

All in all, it seems that the only reason why this page causes so many big mallocs/mmaps is that it has many WebGL contexts, each of which has its own OpenGL context. Indeed this page has 15 WebGL contexts, which is not really what WebGL was designed for. I suppose that useful next steps would include:
 * run Chromium in massif to compare.
 * make a simple test case just creating 15 WebGL contexts to compare.

--------------------------------------------------------------------------------
  n        time(i)         total(B)   useful-heap(B) extra-heap(B)    stacks(B)
--------------------------------------------------------------------------------
 44 40,414,513,321    4,187,449,746    4,187,449,746             0            0
100.00% (4,187,449,746B) (page allocation syscalls) mmap/mremap/brk, --alloc-fns, etc.
->42.92% (1,797,287,936B) 0x58B6BE8: syscall (syscall.S:38)
| ->42.92% (1,797,287,936B) 0x403C44: pages_map (jemalloc.c:399)
|   ->41.13% (1,722,499,072B) 0x405C9C: chunk_alloc.isra.7 (jemalloc.c:2451)
|   | ->39.14% (1,638,924,288B) 0x406D16: huge_malloc (jemalloc.c:4654)
|   | | ->38.64% (1,617,952,768B) 0x406F0D: malloc (jemalloc.c:5925)
|   | | | ->38.61% (1,616,904,192B) 0x27ACBFBE: ??? (in /usr/lib/libnvidia-glcore.so.270.41.19)
|   | | | | ->36.36% (1,522,532,352B) 0xFFFFE: ???
|   | | | | | ->36.36% (1,522,532,352B) in 751 places, all below massif's threshold (01.00%)
|   | | | | |   
|   | | | | ->02.25% (94,371,840B) 0x2C07CA: ???
|   | | | |   ->02.25% (94,371,840B) in 30 places, all below massif's threshold (01.00%)
|   | | | |     
|   | | | ->00.03% (1,048,576B) in 1+ places, all below ms_print's threshold (01.00%)
|   | | | 
|   | | ->00.50% (20,971,520B) in 1+ places, all below ms_print's threshold (01.00%)
|   | | 
|   | ->02.00% (83,574,784B) 0x405EB3: arena_run_alloc.isra.8 (jemalloc.c:3240)
|   |   ->01.90% (79,712,256B) 0x4062B9: arena_malloc (jemalloc.c:3811)
|   |   | ->01.77% (74,039,296B) 0x406F0D: malloc (jemalloc.c:5925)
|   |   | | ->01.77% (74,039,296B) in 28 places, all below massif's threshold (01.00%)
|   |   | |   
|   |   | ->00.14% (5,672,960B) in 1+ places, all below ms_print's threshold (01.00%)
|   |   | 
|   |   ->00.09% (3,862,528B) in 1+ places, all below ms_print's threshold (01.00%)
|   |   
|   ->01.76% (73,744,384B) 0x405D25: chunk_alloc.isra.7 (jemalloc.c:2478)
|   | ->01.30% (54,525,952B) 0x406D16: huge_malloc (jemalloc.c:4654)
|   | | ->01.25% (52,428,800B) 0x406F0D: malloc (jemalloc.c:5925)
|   | | | ->01.25% (52,428,800B) 0x27ACBFBE: ??? (in /usr/lib/libnvidia-glcore.so.270.41.19)
|   | | | | ->01.25% (52,428,800B) 0xFFFFE: ???
|   | | | |   ->01.25% (52,428,800B) in 25 places, all below massif's threshold (01.00%)
|   | | | |     
|   | | | ->00.00% (0B) in 1+ places, all below ms_print's threshold (01.00%)
|   | | | 
|   | | ->00.05% (2,097,152B) in 1+ places, all below ms_print's threshold (01.00%)
|   | | 
|   | ->00.46% (19,218,432B) in 1+ places, all below ms_print's threshold (01.00%)
|   | 
|   ->00.02% (1,044,480B) in 1+ places, all below ms_print's threshold (01.00%)
|   
->40.44% (1,693,450,240B) 0x406DA5: huge_malloc (jemalloc.c:4689)
| ->39.89% (1,670,381,568B) 0x406F0D: malloc (jemalloc.c:5925)
| | ->39.87% (1,669,332,992B) 0x27ACBFBE: ??? (in /usr/lib/libnvidia-glcore.so.270.41.19)
| | | ->37.61% (1,574,961,152B) 0xFFFFE: ???
| | | | ->37.61% (1,574,961,152B) in 751 places, all below massif's threshold (01.00%)
| | | |   
| | | ->02.25% (94,371,840B) 0x2C07CA: ???
| | |   ->02.25% (94,371,840B) in 30 places, all below massif's threshold (01.00%)
| | |     
| | ->00.03% (1,048,576B) in 1+ places, all below ms_print's threshold (01.00%)
| | 
| ->00.55% (23,068,672B) in 1+ places, all below ms_print's threshold (01.00%)
| 
->07.42% (310,542,336B) 0x4015F39: mmap (syscall-template.S:82)
| ->07.15% (299,548,672B) 0x400636B: _dl_map_object_from_fd (dl-load.c:1189)
| | ->07.15% (299,548,672B) 0x40076B6: _dl_map_object (dl-load.c:2250)
| |   ->05.62% (235,450,368B) 0x400CF40: openaux (dl-deps.c:65)
| |   | ->05.62% (235,450,368B) 0x400D904: _dl_catch_error (dl-error.c:178)
| |   |   ->05.62% (235,450,368B) 0x400C02A: _dl_map_object_deps (dl-deps.c:247)
| |   |     ->05.24% (219,435,008B) 0x4011E94: dl_open_worker (dl-open.c:263)
| |   |     | ->05.24% (219,435,008B) 0x400D904: _dl_catch_error (dl-error.c:178)
| |   |     |   ->05.24% (219,435,008B) 0x4011878: _dl_open (dl-open.c:555)
| |   |     |     ->05.14% (215,093,248B) 0x4E3FF64: dlopen_doit (dlopen.c:67)
| |   |     |     | ->05.14% (215,093,248B) 0x400D904: _dl_catch_error (dl-error.c:178)
| |   |     |     |   ->05.14% (215,093,248B) 0x4E402EA: _dlerror_run (dlerror.c:164)
| |   |     |     |     ->05.14% (215,093,248B) 0x4E3FEDF: dlopen@@GLIBC_2.2.5 (dlopen.c:88)
| |   |     |     |       ->02.57% (107,577,344B) 0x408419: ReadDependentCB(char const*, int) (nsGlueLinkingDlopen.cpp:213)
| |   |     |     |       | ->02.57% (107,577,344B) 0x407D8C: XPCOMGlueLoadDependentLibs(char const*, void (*)(char const*, int)) (nsXPCOMGlue.cpp:138)
| |   |     |     |       |   ->02.57% (107,577,344B) 0x4084CF: XPCOMGlueLoad(char const*, unsigned int (**)(XPCOMFunctions*, char const*)) (nsGlueLinkingDlopen.cpp:229)
| |   |     |     |       |     ->02.57% (107,577,344B) 0x407CB4: XPCOMGlueStartup (nsXPCOMGlue.cpp:77)
| |   |     |     |       |       ->02.57% (107,577,344B) 0x401AF7: main (nsBrowserApp.cpp:238)
| |   |     |     |       |         
| |   |     |     |       ->02.10% (88,047,616B) 0x415B91B: pr_LoadLibraryByPathname (prlink.c:836)
| |   |     |     |       | ->02.10% (88,047,616B) 0x415BEBE: PR_LoadLibrary (prlink.c:475)
| |   |     |     |       |   ->01.29% (54,132,736B) 0x6382468: nsNativeAppSupportUnix::Start(int*) (nsNativeAppSupportUnix.cpp:498)
| |   |     |     |       |   | ->01.29% (54,132,736B) 0x637B23A: XRE_main (nsAppRunner.cpp:3127)
| |   |     |     |       |   |   ->01.29% (54,132,736B) 0x401D5E: main (nsBrowserApp.cpp:198)
| |   |     |     |       |   |     
| |   |     |     |       |   ->00.81% (33,914,880B) in 1+ places, all below ms_print's threshold (01.00%)
| |   |     |     |       |   
| |   |     |     |       ->00.46% (19,468,288B) in 1+ places, all below ms_print's threshold (01.00%)
| |   |     |     |       
| |   |     |     ->00.10% (4,341,760B) in 1+ places, all below ms_print's threshold (01.00%)
| |   |     |     
| |   |     ->00.38% (16,015,360B) in 1+ places, all below ms_print's threshold (01.00%)
| |   |     
| |   ->01.48% (61,997,056B) 0x4011E37: dl_open_worker (dl-open.c:226)
| |   | ->01.48% (61,997,056B) 0x400D904: _dl_catch_error (dl-error.c:178)
| |   |   ->01.48% (61,997,056B) 0x4011878: _dl_open (dl-open.c:555)
| |   |     ->01.23% (51,384,320B) 0x4E3FF64: dlopen_doit (dlopen.c:67)
| |   |     | ->01.23% (51,384,320B) 0x400D904: _dl_catch_error (dl-error.c:178)
| |   |     |   ->01.23% (51,384,320B) 0x4E402EA: _dlerror_run (dlerror.c:164)
| |   |     |     ->01.23% (51,384,320B) 0x4E3FEDF: dlopen@@GLIBC_2.2.5 (dlopen.c:88)
| |   |     |       ->01.23% (51,384,320B) in 5 places, all below massif's threshold (01.00%)
| |   |     |         
| |   |     ->00.25% (10,612,736B) in 1+ places, all below ms_print's threshold (01.00%)
| |   |     
| |   ->00.05% (2,101,248B) in 1+ places, all below ms_print's threshold (01.00%)
| |   
| ->00.26% (10,993,664B) in 1+ places, all below ms_print's threshold (01.00%)
| 
->06.43% (269,312,000B) 0x58B6D79: mmap (syscall-template.S:82)
| ->04.81% (201,424,896B) 0x4C2A3E1: pthread_create@@GLIBC_2.2.5 (allocatestack.c:498)
| | ->04.41% (184,639,488B) 0x416FE24: _PR_CreateThread (ptthread.c:424)
| | | ->04.41% (184,639,488B) 0x4170096: PR_CreateThread (ptthread.c:507)
| | |   ->03.61% (151,068,672B) 0x6CD391E: nsThread::Init() (nsThread.cpp:353)
| | |   | ->03.61% (151,068,672B) 0x6CD408B: nsThreadManager::NewThread(unsigned int, unsigned int, nsIThread**) (nsThreadManager.cpp:247)
| | |   |   ->02.00% (83,927,040B) 0x6CA7D54: NS_NewThread_P(nsIThread**, nsIRunnable*, unsigned int) (nsThreadUtils.cpp:74)
| | |   |   | ->02.00% (83,927,040B) in 7 places, all below massif's threshold (01.00%)
| | |   |   |   
| | |   |   ->01.60% (67,141,632B) 0x6CD4C0C: nsThreadPool::PutEvent(nsIRunnable*) (nsThreadPool.cpp:115)
| | |   |     ->01.60% (67,141,632B) 0x6CD4D8C: nsThreadPool::Dispatch(nsIRunnable*, unsigned int) (nsThreadPool.cpp:260)
| | |   |       ->01.60% (67,141,632B) 0x6CC4CC3: nsAStreamCopier::PostContinuationEvent() (nsStreamUtils.cpp:467)
| | |   |         ->01.60% (67,141,632B) 0x6CC4E12: NS_AsyncCopy(nsIInputStream*, nsIOutputStream*, nsIEventTarget*, nsAsyncCopyMode, unsigned int, void (*)(void*, unsigned int), void*, int, int, nsISupports**) (nsStreamUtils.cpp:290)
| | |   |           ->01.60% (67,141,632B) 0x639ABA3: nsInputStreamTransport::OpenInputStream(unsigned int, unsigned int, unsigned int, nsIInputStream**) (nsStreamTransportService.cpp:145)
| | |   |             ->01.60% (67,141,632B) 0x6399E79: nsInputStreamPump::AsyncRead(nsIStreamListener*, nsISupports*) (nsInputStreamPump.cpp:344)
| | |   |               ->01.20% (50,356,224B) 0x6393710: nsBaseChannel::BeginPumpingData() (nsBaseChannel.cpp:262)
| | |   |               | ->01.20% (50,356,224B) 0x6393B7C: nsBaseChannel::AsyncOpen(nsIStreamListener*, nsISupports*) (nsBaseChannel.cpp:591)
| | |   |               |   ->01.20% (50,356,224B) in 3 places, all below massif's threshold (01.00%)
| | |   |               |     
| | |   |               ->00.40% (16,785,408B) in 1+ places, all below ms_print's threshold (01.00%)
| | |   |               
| | |   ->00.80% (33,570,816B) in 1+ places, all below ms_print's threshold (01.00%)
| | |   
| | ->00.40% (16,785,408B) in 1+ places, all below ms_print's threshold (01.00%)
| | 
| ->01.62% (67,887,104B) in 19 places, all below massif's threshold (01.00%)
|   
->01.99% (83,439,616B) 0x40631F: arena_malloc (jemalloc.c:3822)
| ->01.74% (72,908,800B) 0x406F0D: malloc (jemalloc.c:5925)
| | ->01.74% (72,908,800B) in 165 places, all below massif's threshold (01.00%)
| |   
| ->00.25% (10,530,816B) in 1+ places, all below ms_print's threshold (01.00%)
| 
->00.80% (33,417,618B) in 1+ places, all below ms_print's threshold (01.00%)
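
A note on reading the snapshot above, with a small sketch to check the arithmetic. The two libnvidia-glcore subtrees under pages_map add up to exactly the libnvidia-glcore subtree under huge_malloc (jemalloc.c:4689). My reading, which is an assumption and not something massif states, is that the same driver allocations are recorded once at the page level and once at the malloc level, so the driver's real share is about 1592 MiB rather than twice that.

```python
MIB = 1024 * 1024

# Numbers copied from the peak snapshot above.
PEAK_TOTAL = 4_187_449_746
# libnvidia-glcore subtrees reached through pages_map (syscall level).
VIA_PAGES_MAP = [1_616_904_192, 52_428_800]
# libnvidia-glcore subtree reached through huge_malloc (jemalloc.c:4689).
VIA_HUGE_MALLOC = 1_669_332_992


def same_allocations_seen_twice():
    """True if the page-level subtrees sum to the malloc-level subtree."""
    return sum(VIA_PAGES_MAP) == VIA_HUGE_MALLOC


def driver_mib():
    """Driver-attributed memory in MiB, counted once."""
    return VIA_HUGE_MALLOC / MIB


def driver_share_of_listed_total():
    """Percentage of the snapshot total when both views are summed."""
    both = sum(VIA_PAGES_MAP) + VIA_HUGE_MALLOC
    return round(100.0 * both / PEAK_TOTAL, 2)


print("two views match:", same_allocations_seen_twice())
print("driver MiB (counted once):", driver_mib())
print("share of listed total (both views):", driver_share_of_listed_total())
```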
Attached file about:memory result
I do not see any huge memory usage now, but one abnormal thing is:
 930.00 MB -- webgl-texture-memory

Otherwise site opens perfectly, no responsiveness issues, no crashes.
I'd be interested to know if this try-build fixes it:
   https://bugzilla.mozilla.org/show_bug.cgi?id=704839
as it completely refactors how we decide when to delete WebGL objects such as textures.
I meant: https://tbpl.mozilla.org/?tree=Try&rev=655743deed5d
the link in comment 85 is to the bug that this comes from.
Hey, please try again Nightly now! Important changes have landed and are in the latest Nightly: bug 705904, bug 704839, bug 707033.
http://www.mapcrunch.com/gallery?webgl=1 still crashes with the latest Nightly build. Loading that URL by itself with no other tabs open takes me up to ~1,240,908K (Private Working Set), ~1,434,524K (peak working set), ~1,250,021K (commit size). I tried to load up about:memory and the Firefox window disappears and Windows pops up saying that Nightly has stopped working. No crash reporter. 

Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:11.0a1) Gecko/20111207 Firefox/11.0a1
Built from http://hg.mozilla.org/mozilla-central/rev/489f2d51b011
Whoa, you're using the Win64 build - can you get this to reproduce with a Win32 build?
QA Contact: canvas.webgl → bjacob
On WinXP SP3 x86 I still get a black screen and an unresponsive browser when loading that page. System RAM use seems a bit better than when I tested last year, capping at 1GB, so the remaining problem must be primarily related to running out of GPU RAM (512MB on this PC) in combination with composited layers. about:memory shows 670MB of webgl-texture-memory allocated.
Reproducing locally on my trunk build on Win7. Texture memory usage is sitting at about 780MB. I'll look into this.
Assignee: jmuizelaar → jgilbert
Status: NEW → ASSIGNED
Immediately after the page finishes loading, it had 670MB of textures. This is not immediately freed if you reload the page. Usage instead climbs to at least 1.2GB (thankfully I have a 2GB card) before it is apparently GC'd.
Just after load it appears to have 565 512x512 textures (~1MB each). The first two contexts have 77 textures each. Triggering GC and CC do not clean up any textures, so the page must be holding on to them. Actually, reading some comments above, it sounds like we're not allowing GC to pick up orphaned textures. If this is the case, this is Bad.
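For reference, the "~1MB each" figure above can be checked with a little arithmetic. This is only a rough sketch: it assumes uncompressed textures at 4 bytes per pixel (RGBA8), which the comment does not state, and texture_bytes is a hypothetical helper, not anything in the Gecko tree.

```python
MIB = 1024 * 1024


def texture_bytes(width, height, bytes_per_pixel=4, mipmapped=False):
    """Estimate the bytes used by one uncompressed 2D texture.

    Assumption: 4 bytes per pixel (RGBA8) unless told otherwise.
    With mipmapped=True, every level down to 1x1 is added as well.
    """
    total = 0
    w, h = width, height
    while True:
        total += w * h * bytes_per_pixel
        if not mipmapped or (w == 1 and h == 1):
            break
        # Each mip level halves both dimensions, clamped at 1.
        w, h = max(1, w // 2), max(1, h // 2)
    return total


per_texture = texture_bytes(512, 512)
total_mib = 565 * per_texture / MIB
print(f"one 512x512 texture: {per_texture} bytes")
print(f"565 textures: {total_mib:.0f} MiB")
```

Under those assumptions one 512x512 texture is exactly 1 MiB, so 565 of them account for roughly 565 MiB. Mipmaps, if the page uses them, would add about a third on top of that.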
(In reply to Jeff Gilbert [:jgilbert] from comment #93)
> Actually, reading
> some comments above, it sounds like we're not allowing GC to pick up
> orphaned textures. If this is the case, this is Bad.

It's not the case anymore, since Bug 705904 was fixed. Unreferenced WebGL objects are GC'able now, or else it's a new bug we don't know about.
Yep, we correctly can GC things when they fall out of scope.
The page in question must simply be holding on to all those textures.
Testing in Chrome shows they get similar GPU memory usage.

Here's a radical idea: Should we limit WebGL buffer/texture allocation if the allocation would take them above, say, 90% of GPU RAM? This is hard to know for sure, but we can make fairly good guesses about WebGL's usage, and we know the reported adapter RAM size.
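A minimal sketch of what such a limit could look like, purely for discussion. GpuBudget, try_allocate and the 0.9 fraction are hypothetical names and values chosen for illustration, not existing Gecko code, and the "used" figure is only our own estimate of WebGL's allocations, as noted above.

```python
MIB = 1024 * 1024


class GpuBudget:
    """Tracks estimated WebGL allocations against a fraction of adapter RAM.

    Hypothetical illustration only: adapter_ram_bytes is the size reported
    by the adapter, and 'used' is a guess based on what WebGL itself has
    allocated, not a true measure of GPU memory in use.
    """

    def __init__(self, adapter_ram_bytes, limit_fraction=0.9):
        self.limit = int(adapter_ram_bytes * limit_fraction)
        self.used = 0

    def try_allocate(self, size):
        """Return True and record the allocation if it fits under the limit."""
        if size < 0:
            raise ValueError("size must be non-negative")
        if self.used + size > self.limit:
            return False  # caller would report an out-of-memory style failure
        self.used += size
        return True

    def free(self, size):
        """Release a previously recorded allocation."""
        self.used = max(0, self.used - size)


# Example: a 512MB card and a page asking for 565 textures of 1 MiB each.
budget = GpuBudget(512 * MIB)
granted = sum(1 for _ in range(565) if budget.try_allocate(1 * MIB))
print(f"granted {granted} of 565 texture allocations")
```

With a 512MB adapter and a 90% cap, this grants 460 of the 565 one-MiB allocations and refuses the rest, rather than letting the page run the card out of memory.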
(In reply to Jeff Gilbert [:jgilbert] from comment #96)
> Testing in Chrome shows they get similar GPU memory usage.

Then this means that there is no bug anymore here, just a page using lots of GPU resources?

> 
> Here's a radical idea: Should we limit WebGL buffer/texture allocation if
> the allocation would take them above, say, 90% of GPU RAM? This is hard to
> know for sure, but we can make fairly good guesses about WebGL's usage, and
> we know the reported adapter RAM size.

This would have to be discussed in greater generality to be useful:
 - with hw-accelerated layers it is possible to cause the browser to allocate arbitrary amounts of texture memory without using WebGL. Just create large 2d canvases, or large image elements.
 - a similar problem exists with general memory, especially on systems with either low virtual memory (e.g. phones/tablets) or large but slow on-disk virtual memory (e.g. most desktops). Allocating just large JS arrays can then either cause OOM's (if small virtual memory) or whole-system sluggishness (if large slow virtual memory). So, the problem should be treated in enough generality to encompass JS arrays as well, unless it is shown that texture memory OOM's are really worse than general memory OOMs.
Blocks: 743009
(In reply to Benoit Jacob [:bjacob] from comment #97)
> (In reply to Jeff Gilbert [:jgilbert] from comment #96)
> > Testing in Chrome shows they get similar GPU memory usage.
> 
> Then this means that there is no bug anymore here, just a page using lots of
> GPU resources?

Indeed, marking WFM, since everything is being done 'properly'.

> > 
> > Here's a radical idea: Should we limit WebGL buffer/texture allocation if
> > the allocation would take them above, say, 90% of GPU RAM? This is hard to
> > know for sure, but we can make fairly good guesses about WebGL's usage, and
> > we know the reported adapter RAM size.
> 
> This would have to be discussed in greater generality to be useful:
>  - with hw-accelerated layers it is possible to cause the browser to
> allocate arbitrary amounts of texture memory without using WebGL. Just
> create large 2d canvases, or large image elements.
>  - a similar problem exists with general memory, especially on systems with
> either low virtual memory (e.g. phones/tablets) or large but slow on-disk
> virtual memory (e.g. most desktops). Allocating just large JS arrays can
> then either cause OOM's (if small virtual memory) or whole-system
> sluggishness (if large slow virtual memory). So, the problem should be
> treated in enough generality to encompass JS arrays as well, unless it is
> shown that texture memory OOM's are really worse than general memory OOMs.

Created bug 743009 for this discussion.
Status: ASSIGNED → RESOLVED
Closed: 12 years ago
Resolution: --- → WORKSFORME