Closed Bug 1081926 Opened 5 years ago Closed 5 years ago
memory leak on freeciv HTML5 game causing computer to be unusable
User Agent: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:32.0) Gecko/20100101 Firefox/32.0 Build ID: 20140924083558 Steps to reproduce: 1. Start Firefox (tested with Ubuntu repository's 32.0.3 and nightly 35.0a1) 2. Go to http://play.freeciv.org/webclient/?action=load&load=tutorial&scenario=true 3. Computer becomes unusable(see results) Actual results: My 4 GiB RAM is immediately filled up so my computer becomes unusable. The game never stops loading. See collected data: http://pastebin.com/sXCd8qiJ Shutting down Firefox by closing it or using killall becomes impossible because of the lack of memory. Video and audio stutters heavily and it's impossible to start a new program. The only way to regain control is by restarting the computer. See reddit thread for original report: https://www.reddit.com/r/gamedev/comments/2j13cj/gamedev_please_review_freecivweb/cl7giu2 Expected results: The game starts after a few seconds of loading and uses up ~138 MiB. This is what happens if I visit the URL with chromium-browser. According to `freecivnet' in the reddit thread,this is also what happens with both Chrome and Firefox on Windows 8.
Hard to save a memory log, the memory use is really high and make the machine hang. Regression range: good=2014-10-01 bad=2014-10-02 http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=14665b1de5ee&tochange=2399d1ae89e9
I assume that Loic tested on Windows like me. Please correct me if I'm wrong ! According to the reporter this happens with Fifrefox32 on Linux. The windows regression is a recent issue on trunk. windows regression range: good: http://hg.mozilla.org/mozilla-central/rev/165c3fd176ec Bad http://hg.mozilla.org/mozilla-central/rev/7f08e93cc2a5 so either bug 1075621 or bug 902952
Status: UNCONFIRMED → NEW
Ever confirmed: true
Component: Untriaged → Graphics
Product: Firefox → Core
I just tried Firefox trunk (f547cf19d104 and a couple of slightly older revisions) on Ubuntu 14.04 and failed to reproduce. I played for several turns, built a city, and RSS didn't go above ~400 MiB. Getting memory reports (visit about:memory and click on "Measure and save...") would be really useful, though Loic says performance is so bad this is hard to do.
georgezhu, can you visit about:support in Firefox, click on "Copy text to clipboard" and then paste the results into a comment here? That will give us information about your graphics card, which may be relevant. Thank you.
I can reproduce on windows7. 100% physical memory usage and gigantic virtual memory usage. And finally, my pc is completely unresponsive. Reset PC.... Regression window(m-i) Good: https://hg.mozilla.org/integration/mozilla-inbound/rev/165c3fd176ec Mozilla/5.0 (Windows NT 6.1; WOW64; rv:35.0) Gecko/20100101 Firefox/35.0 ID:20141001112720 Bad: https://hg.mozilla.org/integration/mozilla-inbound/rev/d954ed24e795 Mozilla/5.0 (Windows NT 6.1; WOW64; rv:35.0) Gecko/20100101 Firefox/35.0 ID:20141001113823 Pushlog: https://hg.mozilla.org/integration/mozilla-inbound/pushloghtml?fromchange=165c3fd176ec&tochange=d954ed24e795 Regressed by: Bug 902952
about:memory (I was able to barely acquire it)
Graphics -------- Adapter Description: ATI Radeon HD 4300/4500 Series Adapter Drivers: aticfx64 aticfx64 aticfx32 aticfx32 atiumd64 atidxx64 atiumdag atidxx32 atiumdva atiumd6a atitmm64 Adapter RAM: 512 ClearType Parameters: Gamma: 2200 Pixel Structure: R ClearType Level: 50 Enhanced Contrast: 200 Device ID: 0x954f Direct2D Enabled: true DirectWrite Enabled: true (6.2.9200.16492) Driver Date: 4-29-2013 Driver Version: 8.970.100.1100 GPU #2 Active: false GPU Accelerated Windows: 1/1 Direct3D 11 (OMTC) Subsys ID: 00000000 Vendor ID: 0x1002 WebGL Renderer: Google Inc. -- ANGLE (ATI Radeon HD 4300/4500 Series Direct3D9Ex vs_3_0 ps_3_0) windowLayerManagerRemote: true AzureCanvasBackend: direct2d AzureContentBackend: direct2d 1.1 AzureFallbackCanvasBackend: cairo AzureSkiaAccelerated: 0
Thanks Alice! All the non-graphics stuff in the memory report looks reasonable. There are a few graphics measurements that are very high: 3,893.28 MB ── gfx-d2d-vram-source-surface 4,114.96 MB ── gpu-committed 505.65 MB ── gpu-dedicated 1,315.55 MB ── gpu-shared 1,805.88 MB ── vsize-max-contiguous
[Tracking Requested - why for this release]: Hang/crash umm, I can reproduce on Firefox32 ubuntu12.04 32bit, but not Firefox31. (PC crashes before killing browser. so, it is difficult to get about:memory and regression range) So, Bug 902952 seems trigger something existing bug which is introduced in Firefox32.
Regression window(m-i) Good: https://hg.mozilla.org/integration/mozilla-inbound/rev/684d2cee3f2d Mozilla/5.0 (X11; Linux i686; rv:32.0) Gecko/20100101 Firefox/32.0 ID:20140606102635 Bad: https://hg.mozilla.org/integration/mozilla-inbound/rev/79624417d247 Mozilla/5.0 (X11; Linux i686; rv:32.0) Gecko/20100101 Firefox/32.0 ID:20140606104333 Pushlog: https://hg.mozilla.org/integration/mozilla-inbound/pushloghtml?fromchange=684d2cee3f2d&tochange=79624417d247 Regressed by: Bug 994081
VMWare Player Graphics -------- Adapter Description: VMware, Inc. -- Gallium 0.4 on SVGA3D; build: RELEASE; Device ID: Gallium 0.4 on SVGA3D; build: RELEASE; Driver Version: 2.1 Mesa 8.0.4 GPU Accelerated Windows: 0/1 Basic Vendor ID: VMware, Inc. WebGL Renderer: VMware, Inc. -- Gallium 0.4 on SVGA3D; build: RELEASE; windowLayerManagerRemote: false AzureCanvasBackend: cairo AzureContentBackend: cairo AzureFallbackCanvasBackend: none AzureSkiaAccelerated: 0
[Tracking Requested - why for this release]:
VM crashes. I cannot get crash sig.
Severity: normal → critical
Thanks for the interest in fixing this bug! I'm the developer of the Freeciv HTML5 client, please let me know if you discover anything that I can do in the code to improve the memory usage or performance of this game.
There's a few issues here, but the big one is the CanvasImageCache doesn't really work well. It seems to generate different entries for the same image. Previously, that was less of an issue since the gfxASurface version of the image would store the optimized SourceSurface version of the image, but now we might generate new SourceSurfaces for every image that doesn't hit the CanvasImageCache. And in Freeciv's case, it misses CanvasImageCache a lot. This is a particularly big problem on X11, where OptimizeSourceSurface doesn't expect the caller to pass in an already optimized image, so it ends up generating a new one. I have a patch for that that will go in a new bug.
It seems like there are two distinct problems triggered by FreeCiv: the Windows problem and the Linux problem. Should we split this bug in two to track the two problems separately?
The windows problem is being investigated in https://bugzilla.mozilla.org/show_bug.cgi?id=1082827 . However, there's a core platform independent issue which I'm leaving this bug for.
Depends on: 1082827
Whiteboard: [MemShrink] → [MemShrink:P1]
mwu: with bug 1082745 fixed, I think the problem should be fixed for georgezhu, right? Assuming that's right, if georgezhu is willing to try a Nightly build (see nightly.mozilla.org) then hopefully the problem will be fixed.
It looks like we're missing the cache because the canvas is different every time.
>Assuming that's right, if georgezhu is willing to try a Nightly build (see nightly.mozilla.org) then hopefully the problem will be fixed. I've just tried the Nightly build and the problem was gone. I guess it's fixed now. Thanks all for looking into the problem!
This adds a second hashtable in the cache which allows us to do another lookup that ignores the canvas. Still need to clean it up a bit but I'm going to send it to try and see how that works out.
Assignee: nobody → mwu
Now with less crashing, hopefully.
Attachment #8508229 - Attachment is obsolete: true
I think this also affects b2g since it uses skia as the canvas backend.
B2G is affected but since it has a limit on the canvas image cache memory usage, it doesn't get taken down. Fixing this can still improve performance if images fit within the image cache, however.
Comment on attachment 8510525 [details] [diff] [review] Fallback on a simple lookup when the normal lookup fails, v3 Review of attachment 8510525 [details] [diff] [review]: ----------------------------------------------------------------- Sorry about the slow response on this. How does this help? Wasn't the issue that a single image was being used on multiple Elements? This seems like it would only help the case where a single image element is being drawn to multiple canvases.
Yup - I misdiagnosed the issue the first time. What freeciv is doing is drawing a large image containing many tiles to many different canvases which each hold a tile. The canvas are then used to draw to the main canvas.
Comment on attachment 8510525 [details] [diff] [review] Fallback on a simple lookup when the normal lookup fails, v3 Review of attachment 8510525 [details] [diff] [review]: ----------------------------------------------------------------- Ok, sounds good then. Maybe add a comment/todo about using something specific to the image (rather than the element) as a key so we can solve that use case too, if it ever matters.
Attachment #8510525 - Flags: review?(matt.woodrow) → review+
Got a bit bitrotted, so I'm putting it up on try one more time - https://treeherder.mozilla.org/ui/#/jobs?repo=try&revision=2f6ec7f95a3f
Attachment #8510525 - Attachment is obsolete: true
mwu - Assuming this is good on m-c, what do you think about uplifting to 34 and 35, which are both marked as affected?
Comment on attachment 8513888 [details] [diff] [review] Fallback on a simple lookup when the normal lookup fails, v4 Approval Request Comment [Feature/regressing bug #]: Regressed by bug 994081. [User impact if declined]: Sites which render large images to multiple canvas run slowly and may cause the browser to run out of memory. This would only impact systems where the canvas and content backends are different. At the moment, that includes Windows 7+, starting with Gecko 35, and Android since Gecko 32. B2G is also technically affected but it has a hard limit on canvas image caching which prevents any out of memory issues. [Describe test coverage new/current, TBPL]: No new tests. Possible to test, though the test would only work on platforms where content and canvas backends are different. [Risks and why]: I think the risk is low, but the patch isn't trivial either. It's the most conservative fix I can think of, however. [String/UUID change made/needed]: None.
Comment on attachment 8513888 [details] [diff] [review] Fallback on a simple lookup when the normal lookup fails, v4 I understand from comment 43, that this fix should have no impact on desktop but will impact mobile. Let's get this into beta6 to test. Beta+ Aurora+
Bas, please address your needinfos. Just clear them with a brief explanation if you don't have anything to add.
(In reply to Nicholas Nethercote [:njn] from comment #47) > Bas, please address your needinfos. Just clear them with a brief explanation > if you don't have anything to add. I get a lot of them. I usually have something to add but I address them by priority rather than date (if I some day find the time to clear my queue, it would probably be better ;-)). I'd rather leave them open than say I have nothing to add when I might. So when I do answer a ni? I like the answer to be true rather than just an excuse to close it :-). Anyway, didn't mean to spam this bug with it, but I happened to see this when I found a regression this caused.
You need to log in before you can comment on or make changes to this bug.