Closed Bug 443438 Opened 17 years ago Closed 12 years ago

Analyze/benchmark characteristics of real-world JavaScript code

Categories

(Core :: JavaScript Engine, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED

People

(Reporter: gal, Assigned: mohammad.r.haghighat)

Details

(Keywords: meta)

Attachments

(5 files, 4 obsolete files)

For efficient compilation of JS we need a better understanding of large real-world JavaScript codes.
Proposed benchmark applications include sunspider, gmail, gmap, zimbra. The measurements can be done with an off-the-shelf version of spidermonkey+firefox. As Moh pointed out the test set is key for this, and for most applications we have none, so we might have to setup an infrastructure, test it, and then ask for vendor support (send them the modified firefox and get collected data from them). Of particular interest to me are: - Value usage, especially 31-bit integers vs 32-bit integers vs 33-bit (unsigned) integers. - Type stability of local variables (args+vars+stack), especially along loop edges. - Type stability of properties in scope objects (global scope i.e.) and data objects. - Stability of the prototype chain of objects. Please post additional ideas in this bug. I can help with finding the right points in spidermonkey to track the above events. Brendan has code already in spidermonkey to profile properties. I am sure he can help using those for collection as well.
Status: NEW → ASSIGNED
Assignee: general → mohammad.r.haghighat
Status: ASSIGNED → NEW
Branch stability seems key as well. or, identifying unpredictable branches either way you want to think of it.
I will post a small value profiling API that should be helpful. It allows you insert probes like "_vprof (x);" at source level to do profiling of arbitrary values, something that is not possible with DTrace now. For instance, by adding the following two lines, you can profile the outcome of all "if" statements: boolean cv; #define if(c) cv = (c); _vprof (cv); if (cv) There's however a problem with this particular redefinition of "if" and that is the case where the "if" statement terminates a previous statement for instance a loop or another if statement: e.g., "for (...) if (c)". If instead we had for (...) {if (c) ...}, there would be no problems. _vprof(v) profiles the value of expression "v", finds its min, max, average, the number of times the probe is called. You can also call your own probe on the value: #define if(c) cv = (c); _vprof (cv, cProbe); if (cv) where you provide the function cProbe (void* vprofID). In the probe, there are several "registers" available to you to store and use as you wish. There's also a thread-safe version of the API. There's also _hprof(v, ...) that finds a histogram of the value as requested. It is simple and rudimentary, but can be extended. I will clean it and post it. Where should it land? - moh
The vprof framework seems extremely useful. The macro magic for if is especially neat. That would definitively be very useful to determine branchyness and type stability (since all the type checks are cascades of if statements in the interpreter). If you make _vprof return its argument the macro might become less fragile: #define if(c) if (_vprof(cv)) Is the profiling API open source? Or should we keep that somewhere under wraps?
For efficiency purpose, I've made this first implementation of _vprof as a macro and not a function. This is how it is currently defined: #define _vprof(v,...) \ { \ static void* id = 0; \ (id != 0) ? \ _profileEntryValue (id, (__int64) (v)) \ : \ profileValue (&id, __FILE__, __LINE__, (__int64) (v), __VA_ARGS__, NULL) \ ;\ } I create a unique id for each probe and at the first dynamic execution instance of the probe it creates and initializes the probe structure. More specifically, the initialization involves capturing the source file name and the line number where probe resides. At the subsequent calls of the probe, it just does the minimal work for the probe, e.g., incrementing the probe counter, etc. I didn't find a way of "statement expressions", like the comma operator to return the initial value of the probe. There's also an API for the named events, i.e., _nvprof ("name", v);, where all probes with the same "name" are accumulated together. After I post the patch, I'd welcome all such suggestions, even a major rewrite would be possible to make it more useful. BTW, as you can see above, the current implementation relies on the variable-argument macros that is supported by VS. Should this patch land in TT?
Ed is on the CC list. I think we should have him comment on whether he wants it for Tamarin Tracing. I definitively want it for tracemonkey. Both use C++, so I think we should switch to inline functions returning the boolean parameter. Those are equally fast but without any of the nastiness of macros. That would also solve the varargs issue. What do you think?
A motivation for using macros was ensuring that unused parameters are not pushed: e.g., in the case of the histogram _hprof (v, 5, 1000, 10000, 20000, 50000, 100000), we just need the bin boundaries at the first invocation. If inline functions guarantee that property, so much the better. If we want to make it C++, it seems that a template function that returns the same type as its first argument would be a better choice. Right?
It seems to me that its ok to hard cast/return bool or int at all times, since if will treat the result as a boolean anyway. But if you want to be more precise, a template would work too. At least for gcc, static inline forces the method to be inlined and it behaves like a macro.
(In reply to comment #6) > Ed is on the CC list. I think we should have him comment on whether he wants it > for Tamarin Tracing. I definitively want it for tracemonkey. how about we add it to nanojit in TT, as a separate include file? (say, nanojit/vprof.h) Ed
I think nanojit in TT is a good landing place. The package includes two small files: vprof.h and vprof.cpp. I have tested the current implementation on Win32. The main reason for using macros was to have a place holder for a unique ID for each probe. As seen in Comment #5, every call to _vprof() creates a context that has an associated static ID (actually a pointer) that will be set to point to the data structure associated with the probe. This is done only once in the first dynamic use. When built with the right options, the code also works correctly in multithreaded apps (i.e., the ID will be initialized only once). In subsequent calls of the probe, that ID is used to access the probe fields. Using functions (even inlined ones) and static pointers in them does not create per-probe IDs. It is, however, possible to use the probe's caller address (the probe function return address) as a key into a hashtable to find the associated data-structure or object. I can post the patch to be added to TT/nanojit as is (after some cleanup) so you guys can provide some feedback. Later, it can be re-implemented using functions so it is more robust and can be used without restrictions. If you can think of a better solution for per-probe unique ID, please let me know.
(In reply to comment #10) > If you can think of a better solution for per-probe unique ID, please let me > know. how about using a macro to scoop up the value expr, file, and line, and then just pass that all to a helper function. then syntactically it looks like a function call expression, but the extra macro-supplied data can be used for unique probe id's... something like #define vprof(val) _vprof_helper(val,__FILE__,__LINE__)
by the way, +1 to push first and tune later
If you post a patch I can test on mac and linux.
In reply to comment #11) > how about using a macro to scoop up the value expr, file, and line, and then > just pass that all to a helper function. Then every time _vprof() and thus _vprof_helper() is called, file and line info have to be hashed to find the probe unique ID, i.e., the pointer to the probe data structure. The current solution does not need to do any hashing and is very fast.
If you want to keep it a macro we can use GCC statement-expressions (if available, using an #ifdef). if ({ int cv = x; do stuff; cv; }) { } As ugly as C language extensions come but pretty useful. Maybe VS has something similar.
Right. I wish VS had statement expressions, but it doesn't. I'll post the patch with a readme.txt and a simple driver in a state that can be added to TT without any changes, and we'll take it from there. It should be possible to get "if"'s profiled on gcc easily.
Attached file value-profiling header file (obsolete) —
Attached file value profiling mplementation (obsolete) —
Attached file a test driver for vprof (obsolete) —
Attached file vprof readme (obsolete) —
i submitted 4 files: vprof.h, vprfof.cpp, a test driver, and a readme.txt. I suggest to build with VS2008 "cl testVprofMT.c vprfof.cpp" and run a few experiments by modifying the driver using the info in readme.txt.
Needs a little de-windows-ification, but otherwise looks great. Ed, do you want to create a new folder for this? vprof/*? I don't mind importing/tracking a 2nd folder. This is not strictly nanojit.
Moh, Ed asked me to help you land/push the patch... I just looked it over and looks good so far, a few things that probably should happen first: (1) as Andreas said, a little de-Windowsification (mostly using C99 types instead of VC types, eg __int64 -> int64_t) (2) might be useful to make some of the macro names a bit more unique (at least the ones publicly visible in vprof.h) to avoid collisions with host environment macros, since (sadly) we can't namespace them... eg we have tried to prepend AVMPLUS_ to all the macros in TC/TT to help with this. (3) We need to work out license/copyright issues -- will this be under the same license as the rest of TT? If not, is it kosher to land it? In any case, licensing/copyright issues need to be explictly spelled out in each source file. (4) I don't care where it goes -- putting it in a new folder (vprof) is fine with me.
According to Brendan as long Moh attaches the appropriate license text at the top of each file, either of us can commit the patch onces its ready to go. We don't need a contributor agreement from him.
I agree with all the points. (1) As mentioned before, I developed and have used this little package on Win32. Some of the things like types can be ported easily. However, I'm not sure if the var-arg macro capability is supported by gcc. I've used that feature in two places, for _vprof(), with and without the user-provided probes, and also for the histograms. The first case is easy to avoid, just use two different names. The second one needs more thought. It is nice to have the ability of using the same call for any number of bins, i.e., the arguments. It seems we can start adding it on Win32 and broaden it later. (2) The macro names start with "_". I thought that would be enough, but appraently it is not. It could be even shorter, like _v() instead of _vprof() ;) (3) Let me check the licensing/copyright issues. (4) I think putting it in a separate folder is a better option. It would make it easier to use by other clients, e.g., TracingMonkey as Andreas mentioned. - moh
(In reply to comment #24) > According to Brendan as long Moh attaches the appropriate license text at the > top of each file, either of us can commit the patch onces its ready to go. We > don't need a contributor agreement from him. Would you please point me to a sample license text? Short & sweet! thanks.
AFAIK all recent versions of GCC support variadic macros, but I have not personally tried it. It may not be available in all compilers, but for our purposes here, VC and GCC are probably good enough.
Moh -- see http://www.mozilla.org/MPL/boilerplate-1.1/ and ask if you have any questions. Thanks, /be
Moh: let's assume for now that varargs-in-macros is fine. can you go ahead and switch to C99 types (and add the licensing bits) and repost? I'll do any gcc tweaking necessary. I'm happy leaving with the macro names as is and waiting to see if there are conflicts.
(In reply to comment #29) > Moh: > let's assume for now that varargs-in-macros is fine. can you go ahead and > switch to C99 types (and add the licensing bits) and repost? I'll do any gcc > tweaking necessary. > I'm happy leaving with the macro names as is and waiting to see if there are > conflicts. Will do. But getting an answer from the legal folks might take some time. Meanwhile, I'd welcome any other comments/suggestions you might have.
(In reply to comment #28) > Moh -- see http://www.mozilla.org/MPL/boilerplate-1.1/ and ask if you have any > questions. Thanks, > /be I talked to our legal folks, and we are clear to go. So, is the following acceptable as part of the license block? Do you put individual contributor's name? * The Original Code is [Value Profiling Utility.]. * * The Initial Developer of the Original Code is * Intel Corporation. * Portions created by the Initial Developer are Copyright (C) 2004-2007 * the Initial Developer. All Rights Reserved. * * Contributor(s): * Mohammad R. Haghighat [mohammad.r.haghighat@intel.com]
That boilerplate looks great, other than the copyright dates (which I think are just 2008 for this stuff?). You can lose the [] around the Original Code indication, but some people do leave it in.
Attachment #328160 - Attachment is obsolete: true
Attachment #328161 - Attachment is obsolete: true
Attachment #328162 - Attachment is obsolete: true
Attached file vprof readme
Just re-posted the files with the proper license block. Only vprof.h and vprof.cpp need to be checked in. I'd welcome any comments and suggestions you might have. - moh
Attachment #328163 - Attachment is obsolete: true
Just used a one liner _vprof ("patch", 1) in TT Assembler::patch() to see how many times the branches are dynamically patched. Too large a number would have meant too much of self modifing code which would have been bad. But the numbers are pretty small. Cool! TT on Sunspider: (counts are 32, 51, 26, ...) patch 1 [1 : 1] 32 32 patch 1 [1 : 1] 51 51 patch 1 [1 : 1] 26 26 patch 1 [1 : 1] 11 11 patch 1 [1 : 1] 4 4 patch 1 [1 : 1] 16 16 patch 1 [1 : 1] 1 1 patch 1 [1 : 1] 8 8 patch 1 [1 : 1] 32 32 patch 1 [1 : 1] 168 168 patch 1 [1 : 1] 2 2 patch 1 [1 : 1] 31 31 patch 1 [1 : 1] 2 2 patch 1 [1 : 1] 4 4 patch 1 [1 : 1] 7 7 patch 1 [1 : 1] 9 9 patch 1 [1 : 1] 41 41 patch 1 [1 : 1] 8 8 patch 1 [1 : 1] 291 291 patch 1 [1 : 1] 11 11 patch 1 [1 : 1] 292 292 patch 1 [1 : 1] 2 2 patch 1 [1 : 1] 18 18 patch 1 [1 : 1] 4 4 patch 1 [1 : 1] 111 111 patch 1 [1 : 1] 22 22 Results with -Dtrees: patch 1 [1 : 1] 150 150 patch 1 [1 : 1] 223 223 patch 1 [1 : 1] 73 73 patch 1 [1 : 1] 23 23 patch 1 [1 : 1] 3 3 patch 1 [1 : 1] 32 32 patch 1 [1 : 1] 9 9 patch 1 [1 : 1] 71 71 patch 1 [1 : 1] 521 521 patch 1 [1 : 1] 63 63 patch 1 [1 : 1] 1 1 patch 1 [1 : 1] 6 6 patch 1 [1 : 1] 3 3 patch 1 [1 : 1] 115 115 patch 1 [1 : 1] 2 2 patch 1 [1 : 1] 2932 2932 patch 1 [1 : 1] 23 23 patch 1 [1 : 1] 2253 2253 patch 1 [1 : 1] 2 2 patch 1 [1 : 1] 36 36 patch 1 [1 : 1] 1 1 patch 1 [1 : 1] 648 648 patch 1 [1 : 1] 44 44
(In reply to comment #37) > Just used a one liner _vprof ("patch", 1) in TT Assembler::patch() ... _nvprof ("patch", 1);
Attachment #329118 - Attachment is obsolete: true
Attachment #329120 - Attachment is obsolete: true
Attachment #329121 - Attachment is obsolete: true
Attachment #329123 - Attachment is obsolete: true
Attachment #329524 - Flags: review?(stejohns)
Attachment #329118 - Attachment is obsolete: false
Attachment #329123 - Attachment is obsolete: false
Attachment #329121 - Attachment is obsolete: false
Attachment #329120 - Attachment is obsolete: false
Attachment #329524 - Flags: review?(stejohns) → review+
pushed to tt, changeset: 501:a06c4856c4ff
Super! Thanks Ed. I'll be OOO for the rest of the week, but will be happy to fix problems, and incorporate suggestions when I'm back. - moh
Is this fixed? If so, which release caught the fix?
Version: unspecified → Trunk
I don't think there were any particular problems associated with this bug that would require fixes. The purpose of it was studying characteristics of real-world JavaScript code and the tools that would assist us in that direction. This obviously covers a broad range of ongoing activities. Andreas pulled in the value-profiling package, vprof, from Tamarin. Currently, it is under js/src/vprof directory. Meanwhile, vprof was extended for timers (tprof) in Tamarin that is not pulled in TM. There has also been more recent development by Rick, Ed, and team in that front. This is covered in: https://bugzilla.mozilla.org/show_bug.cgi?id=454082 A related tracker that covers study of Zimbra Collaboration Suite Client in Firefox is: https://bugzilla.mozilla.org/show_bug.cgi?id=493651 - moh
Ok, saw a patch with a referenced checkin, thought that was the tool this bug was introducing. Adding meta keyword as it's a tracker bug.
Keywords: meta
These bugs are all part of a search I made for js bugs that are getting lost in transit: http://tinyurl.com/jsDeadEndBugs They all have a review+'ed, non-obsoleted patch and are not marked fixed-in-tracemonkey or checkin-needed but have not seen any activity in 300 days. Some of these got lost simply because the assignee/patch provider never requested a checkin, or just because they were forgotten about.
Any reason to keep this open anymore?
Let's call this fixed.
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: