Bug 748146
Opened 13 years ago
Closed 6 months ago
Compiled dlmalloc benchmark 22X slower in IonMonkey
Categories
(Core :: JavaScript Engine, defect)
RESOLVED
WORKSFORME
People
(Reporter: azakai, Unassigned)
References
(Blocks 1 open bug)
Attachments
(1 file: 89.39 KB, application/javascript)
Comparing
js -m -n src.js 20 20
vs.
ionjs src.js 20 20
IonMonkey is 22X slower. Since almost all emscripten-compiled projects use malloc and free, this affects a lot of them.
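For anyone wanting to reproduce the ratio, here is a minimal timing sketch. It assumes that `js` and `ionjs` shell builds are on PATH and that the attachment is saved as src.js; the helper names (`best_wall_time`, `slowdown`) are made up for illustration. It measures wall time for the whole process, so startup and compilation are included.

```python
import subprocess
import time


def best_wall_time(argv, runs=5):
    """Run argv several times and return the best wall-clock time in seconds."""
    best = float("inf")
    for _ in range(runs):
        start = time.perf_counter()
        # check=True raises if the shell exits with a non-zero status.
        subprocess.run(argv, check=True, stdout=subprocess.DEVNULL)
        best = min(best, time.perf_counter() - start)
    return best


def slowdown(baseline_argv, candidate_argv, runs=5):
    """Return how many times slower the candidate is than the baseline."""
    return best_wall_time(candidate_argv, runs) / best_wall_time(baseline_argv, runs)


if __name__ == "__main__":
    # Assumption: both shells are on PATH and src.js is in the current directory.
    ratio = slowdown(
        ["js", "-m", "-n", "src.js", "20", "20"],
        ["ionjs", "src.js", "20", "20"],
    )
    print(f"ionjs is {ratio:.1f}x slower than js -m -n")
```

Taking the best of several runs reduces noise from the rest of the system, but it does not separate compilation time from execution time.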
Comment 1•13 years ago
Well, this test is impressive! We manage to spend almost no time running code. Looking at where perf says we spend time, up to the first kernel function listed, gives:
13.08% js.ion.opt js [.] js::ion::MNode::replaceOperand(unsigned long, js::ion::MDefinition*)
6.48% js.ion.opt js [.] js::ion::EliminatePhis(js::ion::MIRGraph&)
5.53% js.ion.opt js [.] js::ion::MDefinition::replaceAllUsesWith(js::ion::MDefinition*)
4.98% js.ion.opt js [.] js::ion::LinearScanAllocator::allocateRegisters()
4.97% js.ion.opt js [.] js::ion::LinearScanAllocator::buildLivenessInfo()
3.96% js.ion.opt js [.] js::ion::CodeGeneratorShared::encodeSlots(js::ion::LSnapshot*, js::ion::MResumePoint*, unsigned int*)
3.67% js.ion.opt js [.] js::ion::MResumePoint::inherit(js::ion::MBasicBlock*)
3.37% js.ion.opt js [.] js::ion::LIRGeneratorShared::buildSnapshot(js::ion::LInstruction*, js::ion::MResumePoint*, js::ion::BailoutKind)
3.34% js.ion.opt js [.] js::ion::MPhi::op() const
2.94% js.ion.opt js [.] js::ion::MBasicBlock::inherit(js::ion::MBasicBlock*)
2.38% js.ion.opt js [.] js::ion::LinearScanAllocator::setIntervalRequirement(js::ion::LiveInterval*)
1.91% js.ion.opt js [.] js::ion::ValueNumberer::lookupValue(js::ion::MDefinition*)
1.85% js.ion.opt js [.] js::ion::MResumePoint::getOperand(unsigned long) const
1.79% js.ion.opt js [.] js::ion::LiveInterval::covers(js::ion::CodePosition)
1.59% js.ion.opt js [.] TypeAnalyzer::propagateSpecialization(js::ion::MPhi*)
1.52% js.ion.opt js [.] js::ion::LinearScanAllocator::reifyAllocations()
1.45% js.ion.opt js [.] js::ion::MResumePoint::setOperand(unsigned long, js::ion::MDefinition*)
1.35% js.ion.opt js [.] js::ion::SnapshotWriter::addUndefinedSlot()
1.12% js.ion.opt js [.] js::ion::Loop::insertInWorklist(js::ion::MInstruction*)
1.11% js.ion.opt js [.] js::ion::LiveInterval::firstIncompatibleUse(js::ion::LAllocation)
1.08% js.ion.opt js [.] js::ion::LinearScanAllocator::resolveControlFlow()
1.06% js.ion.opt js [.] js::LifoAlloc::getOrCreateChunk(unsigned long)
0.93% js.ion.opt js [.] js::ion::LinearScanAllocator::populateSafepoints()
0.79% js.ion.opt js [.] js::ion::ValueNumberer::computeValueNumbers()
0.77% js.ion.opt js [.] js::ion::MPhi::getOperand(unsigned long) const
0.72% js.ion.opt [kernel.kallsyms] [k] __percpu_counter_add
This sums to 73.74%, and none of these routines has anything to do with the interpreter; they are all part of compiling in IonMonkey (and type analysis).
I suspect that chunked compilation will help with this. Nevertheless, I'll continue looking into this to see if there is anything horribly silly that we do that would cause compilation to go so painfully slowly.
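To see which parts of the compiler dominate, the perf samples above can be grouped by owning class. This is only a rough sketch: the grouping rule (take the innermost scope that is not the `js` or `ion` namespace, else the function name) is my own choice, and the helper names are hypothetical.

```python
from collections import defaultdict

# Percentages and symbols copied from the perf listing in this comment.
PERF_LINES = """\
13.08 js::ion::MNode::replaceOperand(unsigned long, js::ion::MDefinition*)
6.48 js::ion::EliminatePhis(js::ion::MIRGraph&)
5.53 js::ion::MDefinition::replaceAllUsesWith(js::ion::MDefinition*)
4.98 js::ion::LinearScanAllocator::allocateRegisters()
4.97 js::ion::LinearScanAllocator::buildLivenessInfo()
3.96 js::ion::CodeGeneratorShared::encodeSlots(js::ion::LSnapshot*, js::ion::MResumePoint*, unsigned int*)
3.67 js::ion::MResumePoint::inherit(js::ion::MBasicBlock*)
3.37 js::ion::LIRGeneratorShared::buildSnapshot(js::ion::LInstruction*, js::ion::MResumePoint*, js::ion::BailoutKind)
3.34 js::ion::MPhi::op() const
2.94 js::ion::MBasicBlock::inherit(js::ion::MBasicBlock*)
2.38 js::ion::LinearScanAllocator::setIntervalRequirement(js::ion::LiveInterval*)
1.91 js::ion::ValueNumberer::lookupValue(js::ion::MDefinition*)
1.85 js::ion::MResumePoint::getOperand(unsigned long) const
1.79 js::ion::LiveInterval::covers(js::ion::CodePosition)
1.59 TypeAnalyzer::propagateSpecialization(js::ion::MPhi*)
1.52 js::ion::LinearScanAllocator::reifyAllocations()
1.45 js::ion::MResumePoint::setOperand(unsigned long, js::ion::MDefinition*)
1.35 js::ion::SnapshotWriter::addUndefinedSlot()
1.12 js::ion::Loop::insertInWorklist(js::ion::MInstruction*)
1.11 js::ion::LiveInterval::firstIncompatibleUse(js::ion::LAllocation)
1.08 js::ion::LinearScanAllocator::resolveControlFlow()
1.06 js::LifoAlloc::getOrCreateChunk(unsigned long)
0.93 js::ion::LinearScanAllocator::populateSafepoints()
0.79 js::ion::ValueNumberer::computeValueNumbers()
0.77 js::ion::MPhi::getOperand(unsigned long) const
0.72 __percpu_counter_add
"""


def owner(symbol):
    """Return the owning class of a symbol, or the function name if it has none."""
    qualified = symbol.split("(", 1)[0]  # drop the argument list
    parts = qualified.split("::")
    scopes = [p for p in parts[:-1] if p not in ("js", "ion")]
    return scopes[-1] if scopes else parts[-1]


def totals_by_owner(text):
    """Sum the sample percentages per owning class."""
    totals = defaultdict(float)
    for line in text.strip().splitlines():
        percent, symbol = line.split(" ", 1)
        totals[owner(symbol)] += float(percent)
    return dict(totals)


if __name__ == "__main__":
    totals = totals_by_owner(PERF_LINES)
    for name, pct in sorted(totals.items(), key=lambda kv: -kv[1]):
        print(f"{pct:6.2f}%  {name}")
    print(f"{sum(totals.values()):6.2f}%  total")
```

Grouped this way, LinearScanAllocator is the largest single owner, at about 15.86% of samples across its six entries.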
Comment 2•13 years ago
(In reply to Marty Rosenberg [:mjrosenb] from comment #1)
> I suspect that chunked compilation will help with this.
With my chunked compilation (WIP) patch, Ion is about 2x slower than JM+TI (60 ms vs 30 ms). About 30 ms of that is compilation time (5 ms with JM+TI). I hope we can bring this down to about 15-20 ms by optimizing snapshots a bit. There's a large number of locals, and with chunked compilation these have a fixed location, so we don't have to encode them.
Note that the interpreter (20 ms) is still faster than both Ion and JM+TI, so I guess there's not much to optimize/win here for the JITs. Alon, is it okay to use "200 200" instead of "20 20"? It won't read out-of-bounds array values or something?
Depends on: 746225
Comment 3 (Reporter)•13 years ago
Yes, any value >0 for those two arguments is fine. The first parameter is how many malloc()/free() calls to do in each repetition, the second is how many repetitions. Here is the original source:
https://github.com/kripken/emscripten/blob/c7bed7ab29a5e351166bf570825edc2a94c43aef/tests/dlmalloc_test.c
With high enough parameters, I would hope that JITs would help here...
Comment 4•12 years ago
On AWFY, misc-dlmalloc has regressed a lot since last September.
Updated•11 years ago
Assignee: general → nobody
Updated•2 years ago
Severity: normal → S3
Comment 5•6 months ago
Both Chrome and Nightly give "0,0" as the output.
Status: NEW → RESOLVED
Closed: 6 months ago
Resolution: --- → WORKSFORME