Open Bug 1056589 Opened 10 years ago Updated 2 years ago

Raytrace takes more time in MinorGC on windows

Categories

(Core :: JavaScript: GC, defect, P5)

x86
Windows 7
defect

People

(Reporter: h4writer, Unassigned)

References

(Blocks 1 open bug)

Details

The score difference between Windows and Linux is now much lower (since the --enable-diagnostic-... flag was discovered), but there is still a difference. On Linux we get 73300 on this computer (non-PGO, custom shell compiled with --enable-optimize --disable-debug), while on Windows (same configure flags) we get 60965.
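As a quick sanity check on the size of that gap, the two scores quoted above can be compared directly. This is only arithmetic on the numbers in this comment; the variable names are illustrative and not taken from any tool.

```python
# Scores quoted in the comment above (higher is better).
linux_score = 73300
windows_score = 60965

# Absolute gap in benchmark points.
gap = linux_score - windows_score

# How far Windows falls below Linux, relative to the Linux score.
windows_deficit_pct = 100.0 * gap / linux_score

# How far Linux sits above Windows, relative to the Windows score.
linux_advantage_pct = 100.0 * gap / windows_score

print(f"gap: {gap} points")
print(f"Windows is {windows_deficit_pct:.1f}% below Linux")
print(f"Linux is {linux_advantage_pct:.1f}% above Windows")
```

The two percentages differ because they use different baselines, so it is worth stating which one is meant when comparing against other figures in this bug.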

We already have bug 1056529, which marginally improves the score, but not nearly enough.

It's hard to see a difference between the two builds (i.e. both run almost entirely in IonMonkey: no VM calls, no GC, no invalidation, everything gets compiled immediately ...).

Except I noticed that in the tracelogger graphs MinorGC is a bit higher: 2% on Windows and only 1% on Linux.
Terrence can you help me out here? Is it possible MinorGC takes longer on windows? What would be the best way to confirm this is indeed an issue. How can I debug this? Are there easy workarounds that I can try to see if it makes a difference on the score? Are there issues you know about which could cause this discrepancy windows/linux?

Sorry about all these questions, but it is still a big difference in score and I find nothing else to blame (atm). Feel free to investigate yourself, or give me directions on where I should look.
Blocks: 1028242
Flags: needinfo?(terrence)
(In reply to Hannes Verschore [:h4writer] from comment #1)
> Terrence can you help me out here?

Absolutely!

> Is it possible MinorGC takes longer on
> windows?

Yes, it certainly is! Some bits of the MinorGC are very sensitive to the quality of the generated code and the particular inlining decisions, so MSVC could very well be giving us sub-par results. This should be pretty easy to fix (if it's the issue) just by sprinkling some more MOZ_ALWAYS_INLINE in the GGC object marking path.

> What would be the best way to confirm this is indeed an issue. How
> can I debug this?

At the top of js/src/gc/Nursery.cpp, uncomment |#define PROFILE_NURSERY|, rebuild, then run with JS_MINORGC_TIME=0 in the environment (or just hack it always-on if setting env vars on Windows is still a hassle). This will print a bunch of timing information (in microseconds) at every minor GC. This should give you a good idea of whether MinorGC is taking way longer on Windows and, if so, where it's falling over.

It might also be worth taking a look at the full GCs to see if anything sticks out there. No rebuild necessary, just set MOZ_GCTIMER to a file (note, you can use - to get stdout, but this is a stripped-down format that isn't really useful). I think these times are in ms, but it's been a while since I looked.
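If setting environment variables by hand on Windows is awkward, the two recipes above could be scripted. The following is a minimal sketch, assuming Python is available on the test machine; `run_with_env`, `profile_gc`, and the `gctimer.log` file name are hypothetical helpers and names made up for illustration. Only the JS_MINORGC_TIME and MOZ_GCTIMER variables come from this comment.

```python
import os
import subprocess
import sys


def run_with_env(command, extra_env):
    """Run `command` with `extra_env` layered on a copy of the current environment."""
    env = os.environ.copy()  # copy, so the parent environment is left untouched
    env.update(extra_env)
    return subprocess.run(command, env=env, capture_output=True, text=True)


def profile_gc(shell_path, script_path, gctimer_file="gctimer.log"):
    """Run the JS shell on one script with both GC timing variables set.

    `shell_path` and `script_path` are supplied by the caller; nothing
    about their location is assumed here.
    """
    return run_with_env(
        [shell_path, script_path],
        {
            # Per-minor-GC timings; only effective in a PROFILE_NURSERY build.
            "JS_MINORGC_TIME": "0",
            # Full-GC timings are written to this file.
            "MOZ_GCTIMER": gctimer_file,
        },
    )


# Hypothetical usage (paths are placeholders):
# result = profile_gc(r"path\to\js.exe", "raytrace.js")
# sys.stdout.write(result.stdout)
# sys.stderr.write(result.stderr)
```

Both output streams are captured because this sketch makes no assumption about which one the timing information is printed to.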

> Are there easy workarounds that I can try to see if it
> makes a difference on the score? Are there issues you know about which could
> cause this discrepancy windows/linux?

Not that I know of off-hand. Codegen would be the simplest. The other significant differences are threading and memory model. Windows' thread scheduling and locks are different enough that it may be getting into some weird state we didn't run into on other platforms. Also, its memory management has very different tradeoffs from other platforms, so maybe our chunk management is an issue there.

Jon made some charts a while ago that give us a tremendous amount of detail on the allocation behaviors in Octane, so maybe that would also be helpful when looking at our GC performance here:
http://people.mozilla.org/~jcoppeard/gcstats/index.html?page=benchmark&arg=raytrace&dataDir=data

> Sorry about all these questions, but it still is a big difference in score
> and I find nothing else to blame (atm). Feel free to investigate yourself.
> Or give me directions on where I should look.

No worries! If you find anything, let me know!
Flags: needinfo?(terrence)
I think my reasoning above must be wrong: after the marking rewrite, everything should be much more inlinable, so we should have seen things even out. That didn't happen, so there must be something else going on.

Note that v8 is about 10% slower, which is smaller than the 25% gap we're experiencing, but still notable. Maybe we're just getting hit harder because we spend more time doing the slower thing?
It would be interesting to remeasure this now, with VS 2015 Update 3.
Component: JavaScript Engine: JIT → JavaScript: GC
Priority: -- → P5
Severity: normal → S3