Open Bug 1365361 Opened 8 years ago Updated 2 years ago

[meta] Baseline: Optimize intrinsics used in self-hosted functions

Categories

(Core :: JavaScript Engine: JIT, enhancement, P3)

People

(Reporter: djvj, Unassigned)

References

(Depends on 3 open bugs, Blocks 2 open bugs)

Details

(Keywords: meta, perf)

Attachments

(2 files)

After using rdtsc to measure time spent in intrinsics, and running the results against speedometer, I came up with a total measurement for time-spent in various intrinsics across the benchmark. I ran it on speedometer, and once on a general browsing session that involved visiting gmail, gdocs, twitter frontpage, twitter search, cnn frontpage and article, amazon.com frontpage and "deals" page, reddit front page and articles.

I'll just paste the top 20 intrinsics, in terms of time-spent, for each case:

Speedometer:

std_Math_min => 28518554 ticks (count=413138)
RegExpInstanceOptimizable => 31105700 ticks (count=418403)
std_Math_max => 33540817 ticks (count=395139)
IsArray => 34312855 ticks (count=330131)
ToString => 36136290 ticks (count=912097)
StringReplaceString => 38208127 ticks (count=28218)
ToInteger => 44851095 ticks (count=535956)
IsPackedArray => 45654528 ticks (count=1414262)
GetElemBaseForLambda => 46164058 ticks (count=192145)
SubstringKernel => 61161206 ticks (count=270412)
RegExpGetSubstitution => 102270993 ticks (count=60778)
RegExpSearcher => 205811871 ticks (count=85808)
RegExpTester => 306156740 ticks (count=131025)
StringSplitStringLimit => 494954420 ticks (count=169360)
std_Array => 539239036 ticks (count=202579)
regexp_clone => 566546133 ticks (count=343155)
_FinishBoundFunctionInit => 1155262166 ticks (count=2322935)
_IsConstructing => 1271157593 ticks (count=2453737)
StringSplitString => 1538477812 ticks (count=788446)
RegExpMatcher => 2036433481 ticks (count=414165)

General browsing:

TypedArrayBuffer => 32183411 ticks (count=71061)
std_Set_iterator => 32347514 ticks (count=4579)
ToInteger => 37726105 ticks (count=439636)
GetElemBaseForLambda => 39114952 ticks (count=47771)
RegExpGetSubstitution => 40522762 ticks (count=18382)
intl_availableCalendars => 47658905 ticks (count=2)
std_Array_unshift => 55967512 ticks (count=1704)
NewArrayIterator => 60243392 ticks (count=44271)
intl_availableCollations => 61644663 ticks (count=8)
SubstringKernel => 76173240 ticks (count=199479)
_IsConstructing => 271923201 ticks (count=175464)
regexp_clone => 346393543 ticks (count=211026)
RegExpSearcher => 376314821 ticks (count=85419)
std_Array => 479490605 ticks (count=121695)
StringSplitString => 719726742 ticks (count=140479)
RegExpTester => 817476769 ticks (count=98545)
_FinishBoundFunctionInit => 874447924 ticks (count=90540)
RegExpMatcher => 2157427068 ticks (count=342273)
std_Math_max => 78983211736 ticks (count=410267)
std_Math_min => 139792616267 ticks (count=366073)
Results for speedometer run.
Attached file RESULTS_BROWSING.TXT
Results for general browsing run.
The general browsing result has a bad output for ToString, due to bad data cleanup in my script. Actual line should be: ToString => 57046500 ticks (count=650918)
Depends on: 1365387
As mentioned on IRC, we could probably use |if (new.target)| now instead of |if (_IsConstructing())|. Worth measuring. The call IC is the most complicated Baseline IC at this point. It would be nice to convert it to CacheIR - after that optimizing specific natives/intrinsics could be done with very little extra work. I can work on that in a few weeks if we think it's useful.
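For anyone unfamiliar with the suggestion above, here is a minimal sketch of the |new.target| check in plain JavaScript. It is illustration only: |_IsConstructing()| is a self-hosting intrinsic and is not available to ordinary scripts, and the `Point` function is a hypothetical example, not code from the self-hosted builtins.

```javascript
// Hypothetical constructor-like function (not from the self-hosted sources).
// In self-hosted code the equivalent test would have been `if (_IsConstructing())`.
function Point(x, y) {
  if (new.target === undefined) {
    // Called as a plain function: forward to a constructor call.
    return new Point(x, y);
  }
  // Called with `new`: initialize the freshly allocated object.
  this.x = x;
  this.y = y;
}

const constructed = new Point(1, 2); // new.target is Point
const called = Point(3, 4);          // new.target is undefined on the outer call
```

Whether this is actually faster than the intrinsic call in Baseline is exactly what would need measuring.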
Till noted that the numbers for min/max seem really off. There might be outliers in the data caused by context switches in the middle of the function. Reminder to modify analysis script to throw out outliers before proceeding.
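One possible way to throw out such outliers is sketched below. The rule used here (drop any sample more than a fixed multiple of the median) and the names `dropOutliers` and `factor` are assumptions for illustration; the thread does not say which rule the analysis script actually uses.

```javascript
// Median of an already-sorted numeric array.
function median(sorted) {
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1
    ? sorted[mid]
    : (sorted[mid - 1] + sorted[mid]) / 2;
}

// Assumed rule: discard samples larger than `factor` times the median.
// A context switch in the middle of a timed call shows up as a sample that is
// orders of magnitude above the typical cost, so it lands above the threshold.
function dropOutliers(samples, factor = 10) {
  if (samples.length === 0) return [];
  const sorted = [...samples].sort((a, b) => a - b);
  const threshold = median(sorted) * factor;
  return samples.filter((ticks) => ticks <= threshold);
}

// Made-up per-call tick samples; the last one simulates a context switch.
const cleaned = dropOutliers([100, 110, 90, 105, 5000000]);
```

A median-based threshold is used here because a single huge sample barely moves the median, whereas it would drag a mean-based threshold upward.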
Depends on: 1365650
After updating the measurement patch to also measure timings for Array and String builtins (both C++-native and self-hosted), and updating the analysis script to throw out outliers (bug 1357180), here are new numbers for speedometer (I've not included functions with a total tick count of <100MM).

array_shift => ticks=102220982 count=77806
std_Array_slice => ticks=106317754 count=4254
StringReplaceString => ticks=110791304 count=82614
std_Function_apply => ticks=117073692 count=4707
GetElemBaseForLambda => ticks=124137568 count=1075279
array_pop => ticks=160722276 count=140350
String_slice => ticks=176774870 count=84595
String_substring => ticks=193600968 count=75892
String_substr => ticks=229778575 count=76086
RegExpSearcher => ticks=230758406 count=90718
array_isArray => ticks=272407559 count=8885200
ArrayValues => ticks=281930820 count=74746
str_indexOf => ticks=345024737 count=1488149
RegExpGetSubstitution => ticks=347971536 count=193384
str_toLowerCase => ticks=357953923 count=1277959
RegExpTester => ticks=365266413 count=207379
array_slice => ticks=582474382 count=165662
std_Array => ticks=742882733 count=279016
StringSplitString => ticks=758041193 count=153176
array_sort => ticks=791836175 count=38533
_IsConstructing => ticks=806360192 count=1108003
_FinishBoundFunctionInit => ticks=911459879 count=2068292
array_join => ticks=1003951216 count=133046
std_Object_propertyIsEnumerable => ticks=1168704086 count=10379632
array_indexOf => ticks=1179345302 count=532043
regexp_clone => ticks=1191878032 count=1272803
String_split => ticks=1427541939 count=157076
OwnPropertyKeys => ticks=1978027088 count=2205082
RegExpMatcher => ticks=2802172819 count=1207792
array_push => ticks=3461368102 count=3132916
Numbers from general browsing. This browsing session lasted a while, across a number of top sites (gmail, google, youtube, linkedin, facebook, twitter, cnn, reddit). Only including the numbers for ticks >100MM.

std_Set_iterator => ticks=110725296 count=17862
std_Object_propertyIsEnumerable => ticks=112682723 count=263010
array_shift => ticks=125786592 count=70598
std_Function_apply => ticks=145469240 count=8750
String_slice => ticks=146129924 count=103430
str_toLowerCase => ticks=161173426 count=285030
array_splice => ticks=169108842 count=36109
String_substring => ticks=183796233 count=89195
String_substr => ticks=183904593 count=147892
str_indexOf => ticks=198504070 count=587945
std_Array_slice => ticks=222747012 count=12678
RegExpGetSubstitution => ticks=242745744 count=132726
OwnPropertyKeys => ticks=314028362 count=74700
ArrayValues => ticks=333697458 count=93216
array_join => ticks=344194601 count=154238
_IsConstructing => ticks=378989490 count=229550
array_indexOf => ticks=410826994 count=496983
RegExpSearcher => ticks=444617019 count=234626
array_sort => ticks=606447899 count=39295
array_unshift => ticks=611621356 count=161320
regexp_clone => ticks=612977614 count=370586
RegExpTester => ticks=732315111 count=117129
std_Array => ticks=884810658 count=234193
_FinishBoundFunctionInit => ticks=962056303 count=195800
array_slice => ticks=1120530487 count=461899
StringSplitString => ticks=1139177620 count=247124
array_push => ticks=1563841031 count=1405280
RegExpMatcher => ticks=2153891265 count=480151
String_split => ticks=2357659278 count=272309
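For readers who want to reproduce the ">100MM ticks" filtering on their own report files, a rough sketch follows. It assumes report lines of the form `name => ticks=N count=M` as pasted above; the helper names `parseLine` and `topByTicks` are hypothetical and are not part of the actual analysis script.

```javascript
// Parse one report line of the assumed form "name => ticks=N count=M".
// Returns null for lines that do not match.
function parseLine(line) {
  const m = /^(\S+) => ticks=(\d+) count=(\d+)$/.exec(line.trim());
  if (!m) return null;
  return { name: m[1], ticks: Number(m[2]), count: Number(m[3]) };
}

// Keep entries at or above the cutoff, sorted with most time spent first.
function topByTicks(lines, cutoff) {
  return lines
    .map(parseLine)
    .filter((entry) => entry !== null && entry.ticks >= cutoff)
    .sort((a, b) => b.ticks - a.ticks);
}

const sample = [
  "std_Set_iterator => ticks=110725296 count=17862", // from the list above
  "String_split => ticks=2357659278 count=272309",   // from the list above
  "hypothetical_small => ticks=5000 count=3",        // made-up entry below the cutoff
];
const report = topByTicks(sample, 100e6);
```

The tick totals in this bug are all well below 2^53, so plain `Number` values are precise enough here.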
Depends on: 1366263
Whiteboard: [qf]
Depends on: 1366375
Depends on: 1366377
Depends on: 1366696
Depends on: 1367779
Depends on: 1368076
Whiteboard: [qf] → [qf:p1]
New numbers on builtins after StringSplit and ArrayPush optimizations have been run:

Once again, please remember not to compare numbers between different runs. The right way to look at these numbers is to compare them to a "benchmark" function's tick count in the same run. We compare those ratios between different runs to look at relative speedups.

Under that approach, in Speedometer we are gaining about 80% on time spent in StringSplitString, and 34% on time spent in ArrayPush.

Here are the new report summaries for most time spent in builtins for Speedometer and General Browsing. I reversed the ordering from last time so most-time-spent items are at the top. Using the same cutoff of 10M ticks.

Speedometer:

array_join => ticks=19352580768 count=569279
RegExpMatcher => ticks=3205670410 count=484666
array_push => ticks=2681080704 count=3832802
std_Array => ticks=2575203136 count=1352807
array_slice => ticks=2523553088 count=1340680
array_indexOf => ticks=3358988002 count=551564
String_split => ticks=5032030848 count=147866
_IsConstructing => ticks=1018052960 count=1172548
array_sort => ticks=826795328 count=36777
str_indexOf => ticks=582414622 count=1785576
RegExpGetSubstitution => ticks=616876800 count=205655
str_toLowerCase => ticks=455484710 count=1870669
RegExpTester => ticks=480780768 count=426712
RegExpSearcher => ticks=1341955072 count=125226
ArrayValues => ticks=4194592338 count=40117
StringSplitString => ticks=306969696 count=67727
String_slice => ticks=11685939682 count=233225
array_pop => ticks=208414240 count=149748
String_substring => ticks=215179841 count=81851
GetElemBaseForLambda => ticks=199046364 count=1185811
std_Function_apply => ticks=184088960 count=5196
std_Array_slice => ticks=192658272 count=4589
_FinishBoundFunctionInit => ticks=176931424 count=118368
std_Array_unshift => ticks=140123424 count=3196
String_substr => ticks=136596640 count=52571
IsPackedArray => ticks=118330662 count=2508664
UnsafeGetInt32FromReservedSlot => ticks=99324537 count=1531467
ToObject => ticks=95117093 count=2510389
str_trim => ticks=95450108 count=842208
array_splice => ticks=87784192 count=27187
SubstringKernel => ticks=88307449 count=266676
StringReplaceString => ticks=86056992 count=62759
array_shift => ticks=80963552 count=54789
std_Math_min => ticks=80229890 count=803188
std_Math_max => ticks=78993198 count=774565
std_Reflect_getPrototypeOf => ticks=72150963 count=1519598
NewArrayIterator => ticks=76265942 count=40419
ToString => ticks=69363327 count=1347470
IsCallable => ticks=67435449 count=1150638
array_unshift => ticks=59193056 count=18150
array_isArray => ticks=57246768 count=994075
IsObject => ticks=49239181 count=1253637
std_Set_iterator => ticks=48248480 count=3102
ToInteger => ticks=44689052 count=758013
str_toString => ticks=44261369 count=213097
RegExpInstanceOptimizable => ticks=43805627 count=465875
IsRegExpObject => ticks=39202109 count=1030030
str_startsWith => ticks=34570080 count=73920
IsArray => ticks=30569062 count=192068
str_toUpperCase => ticks=30462912 count=65158
RegExpPrototypeOptimizable => ticks=20968322 count=159469
str_charAt => ticks=20551068 count=86571
std_Map_iterator => ticks=20766112 count=13392
String_localeCompare => ticks=17968608 count=1
UnsafeSetReservedSlot => ticks=17876455 count=147121
intl_CompareStrings => ticks=17483232 count=1
GetFirstDollarIndex => ticks=16500614 count=115185
IsConstructor => ticks=15801899 count=104103
array_includes => ticks=27502752 count=839
IsWrappedArrayConstructor => ticks=12548376 count=97388
str_endsWith => ticks=11584704 count=26405
UnsafeGetStringFromReservedSlot => ticks=10775487 count=269314
IsPossiblyWrappedTypedArray => ticks=10614453 count=93969

General Browsing:

String_split => ticks=663516512 count=69659
RegExpMatcher => ticks=1064411262 count=155731
array_push => ticks=404477376 count=532077
RegExpSearcher => ticks=383880025 count=110920
_FinishBoundFunctionInit => ticks=358290528 count=31550
array_slice => ticks=344820256 count=118802
RegExpTester => ticks=270172864 count=65736
array_sort => ticks=381719936 count=10163
std_Array => ticks=221480544 count=65479
ArrayValues => ticks=195148171 count=30119
_IsConstructing => ticks=175242432 count=125953
StringSplitString => ticks=163423456 count=44663
array_indexOf => ticks=157964612 count=182115
str_indexOf => ticks=124445984 count=254837
array_unshift => ticks=111931264 count=41741
array_join => ticks=122935552 count=36264
String_slice => ticks=113649667 count=172508
std_Function_apply => ticks=750153536 count=3517
array_splice => ticks=72989600 count=13028
String_substr => ticks=66333440 count=24983
str_toLowerCase => ticks=62616640 count=111437
array_shift => ticks=61394624 count=34002
array_pop => ticks=58621216 count=33596
String_substring => ticks=70482048 count=19195
std_Set_iterator => ticks=53232096 count=4738
NewArrayIterator => ticks=55698846 count=30408
str_startsWith => ticks=40516864 count=83127
RegExpGetSubstitution => ticks=48959232 count=22120
ToString => ticks=43755419 count=361958
TypedArrayBuffer => ticks=38244721 count=101789
UnsafeGetInt32FromReservedSlot => ticks=32083248 count=476532
SubstringKernel => ticks=42634824 count=84356
IsCallable => ticks=26072953 count=365345
std_Math_max => ticks=24845374 count=178627
str_trim => ticks=30686177 count=35442
std_Math_min => ticks=22148391 count=186798
std_Array_slice => ticks=33065952 count=1249
std_Array_unshift => ticks=22054976 count=496
std_Map_iterator => ticks=22780640 count=13982
str_endsWith => ticks=17673440 count=39705
ToInteger => ticks=17197309 count=229402
IsObject => ticks=16766629 count=359039
RegExpInstanceOptimizable => ticks=15256584 count=171238
IsRegExpObject => ticks=13813555 count=274547
GetElemBaseForLambda => ticks=18855884 count=25385
RegExpPrototypeOptimizable => ticks=11587099 count=57723
array_includes => ticks=43144032 count=779
UnsafeSetReservedSlot => ticks=10363473 count=95875
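The ratio-based comparison described in this comment could be sketched roughly as follows. The definition of "gain" used here (the fractional reduction of the target-to-reference ratio between two runs) is an assumption, as are the names `relativeGain`, `target_fn` and `reference_fn`; the tick totals are made up and do not come from the reports above.

```javascript
// Compare a function's share of time across two runs by normalizing against a
// reference ("benchmark") function measured in the same run.
function relativeGain(before, after, target, reference) {
  const ratioBefore = before[target] / before[reference];
  const ratioAfter = after[target] / after[reference];
  // Assumed definition: fraction by which the normalized cost shrank.
  return 1 - ratioAfter / ratioBefore;
}

// Hypothetical tick totals for two separate runs (not real measurements).
const runBefore = { target_fn: 2000, reference_fn: 500 };
const runAfter = { target_fn: 750, reference_fn: 750 };

// Normalized cost goes from 4.0 to 1.0, i.e. a gain of 0.75 under this definition.
const gain = relativeGain(runBefore, runAfter, "target_fn", "reference_fn");
```

Normalizing this way keeps differences in run length or workload between sessions from being mistaken for a speedup or slowdown, which is why raw tick counts from different runs should not be compared directly.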
Just needinfo-ing you as a ping to note that the new measurements you asked for in SF are above.
Flags: needinfo?(andrebargull)
Depends on: 1382837
Changing this to [qf:meta] because this is not a perf bug in itself. It is mostly a tracking umbrella bug for identifying specific built-ins to optimize, and those optimizations will be tracked as dependents of this bug.
Whiteboard: [qf:p1] → [qf:meta]
Depends on: 1383643
Depends on: 1383644
Depends on: 1383645
Depends on: 1383646
Depends on: 1383647
Depends on: 1383648
(In reply to Kannan Vijayan [:djvj] from comment #9)
> Just needinfo-ing you as a ping to note that the new measurements you asked
> for in SF are above.

Great, thanks! I've already found a few things for further optimizations. :-D
Flags: needinfo?(andrebargull)
Ugh, so the previous numbers I posted were not "cleaned up" to remove far outliers (chose the wrong dataset to put up). Here's the cleaned up data. array_join is still at the top.

Speedometer:

array_join ticks=17611970528 count=567655
array_push ticks=3014251328 count=4677882
std_Array ticks=2549463008 count=1346281
array_slice ticks=2405925600 count=1343465
RegExpMatcher ticks=1520377047 count=466415
array_indexOf ticks=1091100002 count=551296
String_split ticks=992710144 count=142713
array_sort ticks=943137376 count=119083
_IsConstructing ticks=866099072 count=1087945
str_indexOf ticks=517247230 count=1578903
RegExpGetSubstitution ticks=506158368 count=206485
str_toLowerCase ticks=466028385 count=1974392
RegExpSearcher ticks=342306400 count=127900
RegExpTester ticks=288144544 count=175792
StringSplitString ticks=275228544 count=66369
String_substring ticks=257182945 count=266371
String_substr ticks=214265665 count=871272
GetElemBaseForLambda ticks=179544873 count=1187459
_FinishBoundFunctionInit ticks=175390656 count=115359
array_reverse ticks=174943744 count=102111
String_slice ticks=144205830 count=84601
IsPackedArray ticks=129565917 count=2497151
array_pop ticks=128714656 count=144792
ToObject ticks=100937144 count=2481294
StringReplaceString ticks=84933632 count=62062
array_splice ticks=81222752 count=28218
SubstringKernel ticks=80571792 count=268125
UnsafeGetInt32FromReservedSlot ticks=76111783 count=1198404
std_Reflect_getPrototypeOf ticks=74895548 count=1520341
IsCallable ticks=73455793 count=1129656
array_shift ticks=73016128 count=62501
ToString ticks=67189105 count=1103441
std_Math_max ticks=61439981 count=486319
array_isArray ticks=59274974 count=1062882
std_Math_min ticks=56162285 count=504294
str_trim ticks=51512097 count=77739
array_unshift ticks=44711936 count=16298
str_toString ticks=43479848 count=208809
IsObject ticks=41917650 count=951519
RegExpInstanceOptimizable ticks=38818492 count=451367
ToInteger ticks=35845481 count=465666
IsRegExpObject ticks=31781036 count=761082
ArrayValues ticks=29935552 count=6072
str_toUpperCase ticks=29459392 count=65134
IsArray ticks=23635834 count=184707
std_Function_apply ticks=21491936 count=1944
RegExpPrototypeOptimizable ticks=20985388 count=164479
str_charAt ticks=16285578 count=84045
GetFirstDollarIndex ticks=16039089 count=120141
IsConstructor ticks=14874254 count=98135
UnsafeGetStringFromReservedSlot ticks=10804405 count=247027
(In reply to Kannan Vijayan [:djvj] from comment #12)
> Here's the cleaned up data. array_join is still at the top.
>
> Speedometer:
>
> array_join ticks=17611970528 count=567655

FWIW I think most of this is the Ember case where we spend time under stringifying objects. Bug 1384562 should help there.
Depends on: 1385802
Depends on: 1386001
Depends on: 851769
Depends on: 1387400
Depends on: 1387968
Depends on: 1388034
Depends on: 1391304
Depends on: 1392766
Depends on: 1395927
Depends on: 1395954
Keywords: perf
Priority: -- → P3
Summary: Baseline: Optimize intrinsics used in self-hosted functions → [meta] Baseline: Optimize intrinsics used in self-hosted functions
Performance Impact: --- → ?
Whiteboard: [qf:meta]
Performance Impact: ? → ---
Severity: normal → S3