TM: GameBoy emulator runs slowly

RESOLVED WORKSFORME

Status

defect
10 years ago
6 years ago

People

(Reporter: dmandelin, Unassigned)

Tracking

(Blocks 1 bug)

Trunk

Attachments

(1 attachment)

Reporter

Description

10 years ago
From a comment at http://blog.mozilla.com/dmandelin/2009/07/20/tracemonkey-hacks/#comments. It runs at around 17 fps in Firefox 3.5 versus 55 fps in Safari 4. I am building a new opt TM build to see how that does. Some top aborts:

   1 abort: 10345: object used as index
   1 abort: 11409: primitive lhs
   1 abort: 6993: fp->scopeChain is not global or active call object
   1 abort: 9950: can't trace set of property with setter and slot
   2 Abort recording of tree file:///Users/dmandelin/sources/tracemonkey/obj-d/dist/MinefieldDebug.app/Contents/MacOS/components/nsBlocklistService.js:803@59 at file:///Users/dmandelin/sources/tracemonkey/obj-d/dist/MinefieldDebug.app/Contents/MacOS/components/nsBlocklistService.js:805@78: No compatible inner tree.
   2 Abort recording of tree file:///Users/dmandelin/sources/tracemonkey/obj-d/dist/MinefieldDebug.app/Contents/MacOS/modules/XPCOMUtils.jsm:163@37 at file:///Users/dmandelin/sources/tracemonkey/obj-d/dist/MinefieldDebug.app/Contents/MacOS/modules/XPCOMUtils.jsm:165@64: setrval.
   2 abort: 12672: callee is not an object
   3 Abort recording of tree chrome://browser/content/browser.js:2554@54 at chrome://browser/content/browser.js:2555@60: name.
   3 Abort recording of tree file:///Users/dmandelin/sources/tracemonkey/obj-d/dist/MinefieldDebug.app/Contents/MacOS/components/nsSearchService.js:1579@23 at file:///Users/dmandelin/sources/tracemonkey/obj-d/dist/MinefieldDebug.app/Contents/MacOS/components/nsSearchService.js:1581@42: lookupswitch.
   3 Abort recording of tree file:///Users/dmandelin/sources/tracemonkey/obj-d/dist/MinefieldDebug.app/Contents/MacOS/modules/XPCOMUtils.jsm:263@31 at file:///Users/dmandelin/sources/tracemonkey/obj-d/dist/MinefieldDebug.app/Contents/MacOS/modules/XPCOMUtils.jsm:265@55: setrval.
   3 Abort recording of tree http://www.codebase.es/jsgb/:188@13 at http://www.codebase.es/jsgb/jsgb.memory.js:143@205: Loop edge does not return to header.
   3 Abort recording of tree http://www.codebase.es/jsgb/:188@13 at http://www.codebase.es/jsgb/jsgb.memory.js:143@205: No compatible inner tree.
   3 abort: 7481: switch on object or null
   3 abort: 9521: new Function
   4 Abort recording of tree http://www.mozilla.org/script/1.0/jquery-1.3.2.min.js:12@201 at http://www.mozilla.org/script/1.0/jquery-1.3.2.min.js:12@238: apply.
   5 Abort recording of tree chrome://browser/content/tabbrowser.xml:513@33 at chrome://browser/content/browser.js:3799@47: SetPropHit.
   5 abort: 9495: trying to call native apply or call
   6 Abort recording of tree http://www.mozilla.org/script/1.0/jquery-1.3.2.min.js:12@126 at http://www.mozilla.org/script/1.0/jquery-1.3.2.min.js:12@148: No compatible inner tree.
   6 abort: 9952: can't trace JavaScript function setter
   7 abort: 11142: name() not accessing a valid slot
   7 abort: 2936: non-stub getter
  11 abort: 11242: script getter
Would be good to attach the code here...
If someone were to take this, the first step would be to confirm that:

1. There is only one hot loop in this program.

(By the way, do we have a debug mode that really measures exactly this? I don't know of anything. If not, I'll file a bug and make that happen. Should be easy to do: just turn off the JIT and count JSOP_LOOPs.)

2. The main problem tracing the loop is that we never join traces. The loop is very branchy, and each branch is occasionally taken, just like the "game of life". To summarize the loop structure, from memory:

    while (1) {
        optable[MEM[pc++]]();   // call through function table
        if (++c1 > N1) { ... }
        if (++c2 > N2) { ... }
    }

This much alone would make the whole tree 1024 traces wide, certainly 100 at least in actual practice; and some of the functions in the table contain branches too.
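
One way a figure like 1024 could arise, as a rough sketch: assuming the opcode table has 256 entries (one per byte value) and each of the two counter checks is an independent two-way branch, a single loop iteration has up to 256 × 2 × 2 = 1024 distinct paths. The code below is not taken from jsgb; `maxLoopPaths`, `countLoopPaths`, and the no-op handlers are hypothetical names used only to count the paths a given instruction stream actually exercises.

```javascript
// Illustrative sketch only; not jsgb code. Assumes a 256-entry opcode table
// and the two counter checks from the loop outline above.

// Upper bound on distinct paths through one loop iteration.
function maxLoopPaths(opcodeCount, twoWayBranches) {
  // Each two-way branch doubles the number of possible paths per opcode.
  return opcodeCount * Math.pow(2, twoWayBranches);
}

// Runs the loop over `mem` and counts the distinct (opcode, branch1, branch2)
// combinations that were actually taken.
function countLoopPaths(mem, n1, n2) {
  // Placeholder handlers: every opcode is a no-op in this sketch.
  const optable = Array.from({ length: 256 }, () => function () {});
  const seen = new Set();
  let pc = 0, c1 = 0, c2 = 0;
  while (pc < mem.length) {
    const op = mem[pc++];
    optable[op]();                              // call through function table
    let took1 = false, took2 = false;
    if (++c1 > n1) { c1 = 0; took1 = true; }    // first counter check
    if (++c2 > n2) { c2 = 0; took2 = true; }    // second counter check
    seen.add(op + ":" + took1 + ":" + took2);
  }
  return seen.size;
}
```

For example, `maxLoopPaths(256, 2)` gives 1024, while a short stream that only ever uses one opcode and never trips the second counter exercises just two of those paths. Real handlers that branch internally would push the bound higher still.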

(I'm sure bz would not mind teaching any interested person how to confirm #2. Again, if we need better debug output, we should invest--I think everyone is getting tired of having to dig for this information.)


Even if we were able to trace this loop, we might be slow, but let's just confirm that this much is accurate first. I would not be surprised at all to find otherwise.

If the diagnosis is right, I don't think there's a quick fix. Tinkering with the "too branchy; didn't trace" heuristic, for example, is not going to do it. Next would be some brainstorming and hard cost-benefit analysis. :-|

Updated

10 years ago
Depends on: 516264
I tried this page again today and started a Shark session because it's still very slow.
Shark says:
	51.2%	51.2%	libmozjs.dylib	JS_HashTableRawRemove
	9.5%	9.5%	libmozjs.dylib	js_fgets
	7.4%	7.4%	CoreGraphics	argb32_sample_argb32
	5.1%	5.1%	libmozjs.dylib	js_LookupProperty
	4.3%	4.3%	libmozjs.dylib	js_GetterOnlyPropertyStub
	3.7%	3.7%	libmozjs.dylib	js_CoerceArrayToCanvasImageData
	2.3%	2.3%	libmozjs.dylib	js_DeepBail(JSContext*)

or top down:
	0.0%	85.6%	libmozjs.dylib	           js_Invoke
	0.0%	85.6%	libmozjs.dylib	             js_Invoke
	51.2%	85.2%	libmozjs.dylib	              JS_HashTableRawRemove
	0.2%	21.4%	libmozjs.dylib	               js_DeepBail(JSContext*)
	0.4%	20.9%	libmozjs.dylib	                js_DeepBail(JSContext*)
	0.0%	1.6%	Unknown Library	                 0x15eb59fa [unreadable]

Comment 5

9 years ago
FWIW, on my x64 Linux machine over here, the latest Minefield nightlies were getting about 14 fps, but the latest JägerMonkey builds were getting 62 fps.
(during gameplay)

Comment 6

9 years ago
Oh, for comparison purposes, Chrome/5.0.375.99 was averaging 60.7 fps (after leaving it running for a couple of minutes on a game screen). At the same point, JägerMonkey was averaging 61.8 fps.

Comment 7

9 years ago
(In reply to comment #5)
> FWIW, on my x64 Linux machine over here, the latest Minefield nightlies were
> getting like 14fps, but the latest JägerMonkey builds were getting 62fps.
> (during gameplay)

Yeah, JaegerMonkey is expected to do much better on the kind of branchy code that you see in emulators. The aim is for it to be merged into the tracemonkey repository on September 1st, I believe.

Comment 8

9 years ago
On my machine (Intel Core 2 CPU at 3 GHz, 2 GB RAM), the latest Minefield still performs poorly on this GameBoy emulator with JaegerMonkey enabled. The emulator runs at about 22 fps.

Chromium, on the same machine, runs the emulator at a constant 60 fps.

Comment 9

9 years ago
Alright, if tracejit.content is set to false, Minefield beats Chromium at the GameBoy emulator, running at 63 fps.

So at some point, the JS engine makes the wrong decision about whether to use the trace JIT or the method JIT.

Comment 10

9 years ago
I've noticed some recent problems with that in the past couple of weeks.  Some tuning they've been doing that doesn't seem to work quite optimally in all cases. :)
Depends on: 604029
Depends on: 604031
Depends on: 604035
Depends on: 604045
Reporter

Updated

9 years ago
Blocks: WebJSPerf

Comment 11

9 years ago
FYI, don't confuse this with my GameBoy Color emulator.
That one has its own bug report @ https://bugzilla.mozilla.org/show_bug.cgi?id=598655

Mathieu: Turning the trace JIT off does seem to help a lot in many online emulators, so it "appears" the trace JIT heuristics aren't stopping it from repeatedly attempting to trace. On my own emu, performance is definitely worse with the trace JIT and better with JaegerMonkey. My emu derived some pieces of code from the jsgb code, but I added a lot, replaced the video code, and overhauled and recoded almost everything. The opcodes are implemented as functions of the this variable (instead of a global array like in jsgb), and memory reading/writing is done in the same fashion. JSGB, by contrast, can just if/else its way through memory access, since it lacks MBC2, MBC3, and MBC5 type RAM banking and so doesn't work with many games. So my emu inherently has to deal with much branchier memory handling than JSGB in order to work with more GameBoy and GameBoy Color games. Please don't confuse the two emus!
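
To make the two dispatch styles concrete, here is a minimal sketch of one possible reading of that description. It is not code from either emulator: `OPTABLE`, `runGlobalTable`, and `Core` are hypothetical names, and only two opcodes (0x00 NOP and 0x3C INC A) are modelled.

```javascript
// Illustrative sketch only; names and structure are assumptions, not code
// from jsgb or the GameBoy Color emulator.

// Style A: a global opcode array whose handlers touch global state.
var regA = 0, pcA = 0;
var OPTABLE = [];
OPTABLE[0x00] = function () {};                            // NOP
OPTABLE[0x3c] = function () { regA = (regA + 1) & 0xff; }; // INC A

function runGlobalTable(mem) {
  regA = 0; pcA = 0;                        // reset the global machine state
  while (pcA < mem.length) OPTABLE[mem[pcA++]]();
  return regA;
}

// Style B: handlers stored on the emulator object and invoked with `this`.
function Core(mem) {
  this.mem = mem;
  this.a = 0;
  this.pc = 0;
  this.ops = [];
  this.ops[0x00] = function () {};                               // NOP
  this.ops[0x3c] = function () { this.a = (this.a + 1) & 0xff; }; // INC A
}

Core.prototype.run = function () {
  while (this.pc < this.mem.length) {
    // .call(this) so the handler sees the emulator object as `this`.
    this.ops[this.mem[this.pc++]].call(this);
  }
  return this.a;
};
```

Both styles execute the same instruction stream to the same result; the difference is only in where the handlers and the machine state live, which may matter to how a JIT sees the property accesses and calls.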
Reporter

Comment 12

8 years ago
The page at this bug's URL now runs at 60 fps in Firefox.
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → WORKSFORME