TM: non-deterministic LIR generation

RESOLVED WORKSFORME

Status

()

Core
JavaScript Engine
RESOLVED WORKSFORME
8 years ago
7 years ago

People

(Reporter: njn, Unassigned)

Tracking

Trunk
x86
Linux
Points:
---

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(4 attachments)

(Reporter)

Description

8 years ago
The LIR generated by Tracemonkey for crypto-aes.js, v8-crypto.js and v8-raytrace.js changes non-deterministically.

Eg. here's the result of piping TMFLAGS=readlir output through 'wc' 10 times:

 [ocean:~/moz/ws0/js/src] for i in `seq 1 10` ; do TMFLAGS=readlir js0d32 -j /home/njn/moz/SunSpider/tests/crypto-aes.js | strip_addrs > y$i ; wc y$i ; done
 21897 125196 744143 y1
 21897 125196 744143 y2
 19718 113227 675588 y3
 21897 125196 744143 y4
 21897 125196 744143 y5
 21897 125196 744143 y6
 21897 125196 744143 y7
 17102  99017 594237 y8
 21897 125196 744143 y9
 21897 125196 744143 y10

Most of the time I get 21897 bytes, but sometimes I get 19718 or 17102.

I discovered this when testing a patch that cleaned up the i386 backend but wasn't supposed to change the generated code.  AFAICT crypto-aes is the only SunSpider benchmark that features this non-determinism.

v8-crypto.js exhibits even more non-determinism:

 /home/njn/moz/SunSpider/tests/v8-crypto.js
 102636  616415 3975756
 112656  676163 4363617
 108532  650392 4194133
 110888  665181 4290891
 112386  674402 4351075
 112882  677413 4370792
 158539  953766 6173152
 108489  650195 4192971
 114667  688319 4441484
 111804  671200 4329003

And v8-raytrace.js has a small amount:

/home/njn/moz/SunSpider/tests/v8-raytrace.js
  17706  112018  765336
  15255   96055  656340
  15255   96055  656340
  17706  112018  765348
  17706  112018  765336
  15255   96055  656329
  15255   96055  656339
  15255   96055  656339
  15255   96055  656329
  15255   96055  656340

This behaviour seems undesirable to me.  For one, it makes the diff-based testing mentioned earlier more difficult.  More generally it just doesn't feel right.

I tested with Valgrind, but didn't find any problems.

I looked through crypto-aes.js but couldn't find any cause.

Is this a TM problem, or just a characteristic of these programs?  I'd be happy if anyone could explain.
(Reporter)

Comment 1

8 years ago
Created attachment 417394 [details]
crypto-aes.js output (744143 bytes)
(Reporter)

Comment 2

8 years ago
Created attachment 417395 [details]
crypto-aes.js output (675588 bytes)
(Reporter)

Comment 3

8 years ago
Created attachment 417397 [details]
crypto-aes.js output (594237 bytes)

Nb: this non-determinism also isn't going to help the stability of our SunSpider timings.
(Reporter)

Updated

8 years ago
Attachment #417394 - Attachment description: crypto-aes.js output (21897 bytes) → crypto-aes.js output (744143 bytes)
Attachment #417394 - Attachment mime type: application/octet-stream → text/plain
(Reporter)

Updated

8 years ago
Attachment #417395 - Attachment mime type: application/octet-stream → text/plain
(Reporter)

Updated

8 years ago
Attachment #417397 - Attachment mime type: application/octet-stream → text/plain

Comment 4

8 years ago
I think this non-determinism is due, at least in part, to the non-determinism of the Oracle (bug 517535).  If you need a stable execution sequence for debug/timing, you could disable it by have Oracle::is*Undemotable return true.

Comment 5

8 years ago
We should really rewrite the oracle. Its a gross hack. It should properly map script hashes and pc numbers to results and cache those for a while (maybe a day or so? something like that).
(Reporter)

Comment 6

8 years ago
(In reply to comment #4)
> I think this non-determinism is due, at least in part, to the non-determinism
> of the Oracle (bug 517535).

I tried changing isGlobalSlotUndemotable() to always return true, as that bug suggests.  I still see the same non-determinism.  So either I haven't disabled the Oracle properly or it's not the cause (or not the only cause).

Comment 7

8 years ago
which harness are you using? what is the command line?
isInstructionUndemotable and isStackSlotUndemotable need to return true as well.
(Reporter)

Comment 9

8 years ago
(In reply to comment #7)
> which harness are you using? what is the command line?

I'm using the Webkit SunSpider.  Command line is in comment 0 above.

(In reply to comment #8)
> isInstructionUndemotable and isStackSlotUndemotable need to return true as
> well.

I tried that as well, I'm still seeing the non-determinism.
(In reply to comment #4)
> I think this non-determinism is due, at least in part, to the non-determinism
> of the Oracle (bug 517535).  If you need a stable execution sequence for

Luke, can you say precisely what form of nondeterminism the Oracle
introduces?  For a non-threaded app, how can it be nondeterministic?
I can think of two other ways: either it relies on details of object
(C++ ones as well as JS ones) addresses, and hence is affected by 
address space randomisation.  Or, it looks at cycle counts as obtained
by rdtsc et al, and so is deliberately timing dependent.

But I had the impression it did neither of these things.  I thought
the nondeterminism in these benchmarks is due to the use of 
Math.random, which AIUI is unlike libc random et al in that it 
isn't initialised to a fixed starting point at process start.
The Oracle does rely on somewhat random numbers, it hashes |fp->pc|, |fp->script|, the global object shape, maybe more.

But if the Oracle's disabled and we're still getting non-determinism, without Math.random or Date.something, that's a little sketchy.
(In reply to comment #11)
> The Oracle does rely on somewhat random numbers, it hashes |fp->pc|,
> |fp->script|, the global object shape, maybe more.

Why would any of those be non-repeatable from run to run, though?
crypto-aes has a call to Date.now() and uses the bits for data. When I fix that the results are consistent.
Don't you just love SunSpider?

There may be a bug on file already, checking bugs.webkit.org... didn't find one, feel free to file.

/be
(Reporter)

Comment 15

8 years ago
Created attachment 417563 [details] [diff] [review]
patch partially fixing SS

(In reply to comment #13)
> crypto-aes has a call to Date.now() and uses the bits for data. When I fix that
> the results are consistent.

I didn't see Date.now() but I did see Date.getTime().  The attached patch makes crypto-aes.js deterministic for me even with the Oracle enabled, which is good.

I got v8-crypto much less non-deterministic by futzing with a Date.getTime() call and a Math.random() call -- it now bounces between two very similar results.  (And disabling the Oracle didn't help.)

v8-raytrace.js is still a mystery to me.  It doesn't have any date or random values that I can find.
> (In reply to comment #13)
> v8-raytrace.js is still a mystery to me.  It doesn't have any date or random
> values that I can find.

This looks kinda dodgy to me:

  registerCallback: function() {
    this.timer = setInterval(this.onTimerEvent.bind(this), this.frequency * 1000);
  },

Looks like it might use some kind of timer-based mechanism delivered
by signals?  (waves hands wildly)
(Reporter)

Comment 17

8 years ago
(In reply to comment #14)
> Don't you just love SunSpider?
> 
> There may be a bug on file already, checking bugs.webkit.org... didn't find
> one, feel free to file.

https://bugs.webkit.org/show_bug.cgi?id=32804
(Reporter)

Comment 18

7 years ago
Not much point having this bug open any more.  Bug 580532 has a patch that improves SunSpider and V8 a lot.
Status: UNCONFIRMED → RESOLVED
Last Resolved: 7 years ago
Resolution: --- → WORKSFORME
You need to log in before you can comment on or make changes to this bug.