Open Bug 825268 Opened 12 years ago Updated 2 years ago

IonMonkey & Heuristics: Define « often? ».

Categories: Core :: JavaScript Engine: JIT, defect, P5
Platform: x86_64 Linux
People: (Reporter: nbp, Unassigned)
References: (Depends on 1 open bug, Blocks 1 open bug)
Attachments: (1 file)

In IonMonkey, we are looking for a way to trigger optimizations / de-optimizations based on the frequency of events.  The problem is that we have no good way to define this frequency deterministically without a good reference.

The wall-clock is a bad metric as it is not deterministic and thus cannot be used to reproduce bugs.  In addition, it leads to different engine behavior depending on each computer's performance and scheduler.

Marty suggested using GCs as a scale of time, but GCs amount to a global-time approach.  GCs are not a good global time metric because they are triggered either by a timer (if I remember correctly in FF) or by the variety / number of allocations.  Even if GCs are mostly deterministic, a global time is not a good approach when we are looking for a local time approach (we cannot compare something which runs 1/8 of the time the same way as something which runs 7/8 of the time, even if all are hot).

A local-time approach would be needed to approximate something which is based on the runtime of a script.

The use-count is somewhat a local-time approach, but it has the defect that it does not correspond to any number of instructions executed[3] and that it is reset for various reasons, such as some Ion invalidations.  The use-count also has the defect that it is not equivalent to a run-time, as it only counts the number of times a function has been entered.  In addition, it is sometimes reset[0] / not maintained[1] by IonMonkey, which makes it useless for any Ion metrics.

The max-loop-count has been added as a way to estimate the runtime for OSR.  Combined with the use-count, it can be used to estimate the hotness of a function.  Still, it cannot be used once the script is compiled with IonMonkey, as it is not maintained[1] either.

Even if we were maintaining the use-count, we could not use it as a reference for the number of bailouts, because a function can, in the worst-case scenario, bail out at the end, just before the return, and still be a valuable compilation result.  To estimate this we would need the average loop-counter[2] to decide whether a bailout is worth keeping or worth re-compiling / de-optimizing.  Most of the cost of a bailout is the time spent in the interpreter afterwards, not the conversion from the Ion stack to the interpreter stack, even if that is a slow process.

== Potential ideas ==

[0] The reset is the biggest issue for metrics related to invalidation, because we cannot compute any metric across a reset.  It would be better to copy a base value instead of resetting the counter, and then measure the counter minus that base instead of resetting the counter to zero.

[1] We want to minimize IonMonkey's instrumentation, which means that we currently sacrifice this basic metric because we value performance.  We need to find another way to update these time-like counters.
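
A minimal sketch of the base-value idea in [0] (field and method names are hypothetical, not the actual SpiderMonkey members):

```cpp
#include <cassert>
#include <cstdint>

// Sketch of [0]: instead of resetting useCount on invalidation, record a
// base value and measure against it.  The global history is never lost, so
// metrics can still be computed across an invalidation.
struct ScriptCounters {
    uint64_t useCount = 0;       // monotonically increasing, never reset
    uint64_t useCountBase = 0;   // snapshot taken at the last "reset" point

    void incUseCount() { useCount++; }

    // Where the old code did `useCount = 0`, we snapshot instead.
    void markInvalidation() { useCountBase = useCount; }

    // Uses since the last invalidation: what the old reset-based counter
    // would have reported.
    uint64_t usesSinceInvalidation() const { return useCount - useCountBase; }
};
```

With this, any code that previously read the raw counter reads `usesSinceInvalidation()` instead, while global metrics keep using `useCount` unchanged.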

One way of doing so would be to make the (knowingly *wrong*) assumption that a script will keep running at the same pace during the rest of the runtime.  The idea is then to remember when the script has been compiled and compute start-time + estimated-time.  Any time spent outside of IonMonkey, where the counters are maintained, will add to the dark-time (= current-time - start-time).  The ratio of the dark-time over the estimated-time gives us the answer to the « often? » question in IonMonkey.

To have an estimated time of a script, we need a global reference of time.  Fortunately, we can make one easily by aggregating the counters on the runtime when a function returns, or each time they are updated.  This implies that we should not reset the counters.

[2] To provide a good estimate, we need average values instead of max values.  This can be done by always counting loop iterations instead of keeping a max value, and computing the average by dividing by the use-count.

[3] *If* we need a better approximation (which is still to be determined), we can easily replace the loop-counter by a counter of the number of "byte"-codes executed, by adding the number of bytes read every time a goto/return/throw is taken in the interpreter or in JM.  This can provide a better average cost of a script, which might be valuable for bailouts.
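
A sketch of that dark-time ratio, with hypothetical names ("time" being any deterministic local counter such as loop iterations or bytecodes, not wall-clock):

```cpp
#include <cassert>
#include <cstdint>

// Sketch: assume a script keeps running at the pace observed at compilation
// time, then compare the time it actually spends outside Ion (the
// "dark-time") against that estimate.
struct DarkTimeEstimate {
    uint64_t startTime;       // counter value when the script was Ion-compiled
    uint64_t estimatedTime;   // how long we expect it to keep running in Ion

    // Time spent outside Ion code since compilation.
    uint64_t darkTime(uint64_t currentTime) const {
        return currentTime - startTime;
    }

    // Answer to « often? »: ratio of dark-time over estimated-time.  Close
    // to 0 means the script mostly stays in Ion code; close to 1 or above
    // means it mostly runs outside it (bailouts, interpreter, ...).
    double oftenRatio(uint64_t currentTime) const {
        return double(darkTime(currentTime)) / double(estimatedTime);
    }
};
```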
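
The [2] change can be sketched as follows (hypothetical names):

```cpp
#include <cassert>
#include <cstdint>

// Sketch of [2]: count every loop backedge instead of keeping a per-call
// maximum, then derive the average trip count per invocation from the
// use-count.  Neither counter is ever reset (see [0]).
struct LoopStats {
    uint64_t totalLoopCount = 0;  // all backedges ever taken
    uint64_t useCount = 0;        // times the function was entered

    void enter() { useCount++; }
    void backedge() { totalLoopCount++; }

    // Average number of backedges per invocation.
    double averageLoopCount() const {
        return useCount == 0 ? 0.0 : double(totalLoopCount) / double(useCount);
    }
};
```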
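
The [3] trick, counting only at taken jumps, could look like this sketch (hypothetical names, not the real interpreter loop):

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Sketch of [3]: approximate "bytecode bytes executed" by adding, at every
// taken jump (goto/return/throw), the straight-line distance covered since
// the previous jump target.  This avoids touching a counter on every opcode.
struct BytecodeClock {
    uint64_t executed = 0;     // approximate bytecode bytes executed
    size_t lastJumpTarget = 0; // pc offset where straight-line execution began

    // Called when a jump at `fromOffset` is taken to `toOffset`: everything
    // between the last jump target and `fromOffset` ran straight-line.
    void onJump(size_t fromOffset, size_t toOffset) {
        executed += fromOffset - lastJumpTarget;
        lastJumpTarget = toOffset;
    }
};
```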

Any thoughts?
Blocks: 803710
With the attached patch I was able to speed up some SunSpider and Kraken tests by ~3% (30 executions). It also improved the overall Octane score by ~2% (10 executions) for the current version of the code in the Ion repository. This code basically tries to guess the range of the first loop in the script during execution in the interpreter. It supports only one nesting level, and the technique I used to do that does not catch all cases and may lead to some false positives. If the loop is guessed to iterate many times, Ion compilation is triggered as soon as possible, without waiting for the 1000 uses as we currently do.

3% may be too little, but the patch is still very simple, and the numbers vary a lot depending on the value of usesBeforeCompile. It also cannot live across bailouts and invalidations, as happens with useCount. My idea is to take as much advantage as possible of the 10 uses we run in the interpreter to gather information for things like early compilation/inlining, since we do not want to instrument the Baseline and Ion generated assembly. What are the current plans for baselineUsesBeforeCompile and usesBeforeCompile? Is the idea to completely replace the interpreter with the Baseline compiler in the future?
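
A rough sketch of the heuristic described above (hypothetical names and thresholds, not the actual patch code): when the interpreter can read a counted loop's init/limit/step, it guesses the trip count and, if the guess is large, compiles with Ion immediately instead of waiting for the usual threshold.

```cpp
#include <cassert>
#include <cstdint>

static const uint32_t kDefaultUsesBeforeCompile = 1000;  // current threshold
static const uint32_t kHotLoopGuess = 10000;             // assumed cutoff

// Guess how many times `for (i = init; i < limit; i += step)` will iterate.
uint32_t guessedTripCount(int64_t init, int64_t limit, int64_t step) {
    if (step <= 0 || limit <= init)
        return 0;  // not a forward counted loop; no guess
    return uint32_t((limit - init) / step);
}

// Lower the Ion use-count threshold when the first loop is guessed hot.
uint32_t usesBeforeCompile(int64_t init, int64_t limit, int64_t step) {
    return guessedTripCount(init, limit, step) >= kHotLoopGuess
               ? 1  // compile as soon as possible
               : kDefaultUsesBeforeCompile;
}
```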
Comment on attachment 728768 [details] [diff] [review]
Infer loop range during the first execution in the interpreter.

Review of attachment 728768 [details] [diff] [review]:
-----------------------------------------------------------------

When you generate a diff, use -U8 so that there is enough context to understand the patch in Bugzilla.

Then I think you should open another bug for this modification, as it adds more heuristics whereas this bug is dedicated to cleaning them up.

Thanks.

::: js/src/jsinterp.cpp
@@ +1709,5 @@
>          }                                                                     \
>      JS_END_MACRO
>  
> +#define GUESS_HOTNESS(offset)                                                 \
> +    JS_BEGIN_MACRO                                                            \

2 things:
- This macro is kind of complex and lacks comments.
- It is defined after its usage, which is usually not a good practice with macros.

We should simplify our current counters.

::: js/src/jsscript.h
@@ +778,5 @@
> +    void resetHotnessInfo() {
> +        hasGuessedHotness = false;
> +        guessedHot = false;
> +        firstReachedBranch = NULL;
> +        useCount = 0;

This is exactly what this bug suggests avoiding: it should be replaced by a counter which is constantly incremented and never reset.
(In reply to Pericles Alves from comment #1)
> What are the current plans for
> baselineUsesBeforeCompile and usesBeforeCompile? The idea is to completely
> replace the interpreter by the Baseline compiler in the future?

No, the idea is to give the first shot to the Baseline compiler and to use it as our source of monitored types.  We want to keep the interpreter, as it is still a fast mode of execution for scripts which run only once.
(In reply to Nicolas B. Pierron [:nbp] from comment #0)
> The max-loop-count has been added as a way to estimate the runtime for OSR. 
> Combine with the use-count is can be used to estimate the hotness of a
> function.  Still, it cannot be used once the script is compiled with
> IonMonkey, as it is not maintained[1] either.

Actually, when I originally added the max loop count, its purpose was to mediate places where a JM-compiled script A would compile a callee script B with either JM or Ion.  For "short-running" functions, we wanted the callee to be compiled with JM because the call overhead was high for JM => Ion (and Ion compilation time was higher too).  For "long-running" functions, the additional overhead was considered to be worth it because Ion generated better code.

The way we were measuring "long-running" functions was by checking for the existence of a loop.  However, this led to a situation where lots of functions with tiny loops were getting Ion-compiled.  The "max-loop-count" was an interpreter-only mechanism that kept track of how many loop backedges were taken per invocation.  The logic was then changed to only compile "high-loop-count" functions with Ion.


As for the rest of the post, I haven't fully processed all of what you are suggesting yet.  I am happy that you started the bug, though.  It's good to have this discussion.  Lemme read and figure out what you are suggesting and I'll post my thoughts afterward.
Blocks: 891172
Assignee: general → nobody
Flags: needinfo?(lazyparser)
Flags: needinfo?(lazyparser)
Component: JavaScript Engine → JavaScript Engine: JIT
Priority: -- → P5
See Also: → 1514284
Depends on: 1514284
Severity: normal → S3