Closed Bug 748146 Opened 13 years ago Closed 6 months ago

Compiled dlmalloc benchmark 22X slower in IonMonkey

Tracking

()

Status:

RESOLVED WORKSFORME

People

(Reporter: azakai, Unassigned)

References

(Blocks 1 open bug)

Details

Attachments

(1 file)

dlmalloc 13 years ago Alon Zakai (:azakai) 89.39 KB, application/javascript		Details

Alon Zakai (:azakai)

Reporter

Description

•

13 years ago

Attached file dlmalloc — Details

Comparing "js -m -n src.js 20 20" with "ionjs src.js 20 20", IonMonkey is 22X slower. Since almost all emscripten-compiled projects use malloc and free, this affects a lot of them.

Marty Rosenberg [:mjrosenb]

Comment 1

•

13 years ago

Well, this test is impressive! We manage to spend almost no time running code. Looking at where perf says we spend time, up to the first kernel function listed, gives:

13.08% js.ion.opt js [.] js::ion::MNode::replaceOperand(unsigned long, js::ion::MDefinition*)
6.48% js.ion.opt js [.] js::ion::EliminatePhis(js::ion::MIRGraph&)
5.53% js.ion.opt js [.] js::ion::MDefinition::replaceAllUsesWith(js::ion::MDefinition*)
4.98% js.ion.opt js [.] js::ion::LinearScanAllocator::allocateRegisters()
4.97% js.ion.opt js [.] js::ion::LinearScanAllocator::buildLivenessInfo()
3.96% js.ion.opt js [.] js::ion::CodeGeneratorShared::encodeSlots(js::ion::LSnapshot*, js::ion::MResumePoint*, unsigned int*)
3.67% js.ion.opt js [.] js::ion::MResumePoint::inherit(js::ion::MBasicBlock*)
3.37% js.ion.opt js [.] js::ion::LIRGeneratorShared::buildSnapshot(js::ion::LInstruction*, js::ion::MResumePoint*, js::ion::BailoutKind)
3.34% js.ion.opt js [.] js::ion::MPhi::op() const
2.94% js.ion.opt js [.] js::ion::MBasicBlock::inherit(js::ion::MBasicBlock*)
2.38% js.ion.opt js [.] js::ion::LinearScanAllocator::setIntervalRequirement(js::ion::LiveInterval*)
1.91% js.ion.opt js [.] js::ion::ValueNumberer::lookupValue(js::ion::MDefinition*)
1.85% js.ion.opt js [.] js::ion::MResumePoint::getOperand(unsigned long) const
1.79% js.ion.opt js [.] js::ion::LiveInterval::covers(js::ion::CodePosition)
1.59% js.ion.opt js [.] TypeAnalyzer::propagateSpecialization(js::ion::MPhi*)
1.52% js.ion.opt js [.] js::ion::LinearScanAllocator::reifyAllocations()
1.45% js.ion.opt js [.] js::ion::MResumePoint::setOperand(unsigned long, js::ion::MDefinition*)
1.35% js.ion.opt js [.] js::ion::SnapshotWriter::addUndefinedSlot()
1.12% js.ion.opt js [.] js::ion::Loop::insertInWorklist(js::ion::MInstruction*)
1.11% js.ion.opt js [.] js::ion::LiveInterval::firstIncompatibleUse(js::ion::LAllocation)
1.08% js.ion.opt js [.] js::ion::LinearScanAllocator::resolveControlFlow()
1.06% js.ion.opt js [.] js::LifoAlloc::getOrCreateChunk(unsigned long)
0.93% js.ion.opt js [.] js::ion::LinearScanAllocator::populateSafepoints()
0.79% js.ion.opt js [.] js::ion::ValueNumberer::computeValueNumbers()
0.77% js.ion.opt js [.] js::ion::MPhi::getOperand(unsigned long) const
0.72% js.ion.opt [kernel.kallsyms] [k] __percpu_counter_add

which sums to 73.74%! None of these routines has anything to do with the interpreter, only with compiling in IM (and type analysis). I suspect that chunked compilation will help with this. Nevertheless, I'll continue looking into this to see if there is anything horribly silly that we do that would cause compilation to go so painfully slowly.
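The 73.74% figure in comment 1 can be re-derived from the listing. The following is a minimal sketch, not part of the original report: it assumes the perf lines are pasted into a string (argument lists are trimmed from the symbol names for brevity), and the names PERF_REPORT, LINE_RE and parse are hypothetical helpers introduced only for this illustration.

```python
import re
from decimal import Decimal

# Percentages and symbols copied from the perf listing in comment 1.
# Argument lists are dropped from the symbol names to keep lines short.
PERF_REPORT = """
13.08% js.ion.opt js [.] js::ion::MNode::replaceOperand
6.48% js.ion.opt js [.] js::ion::EliminatePhis
5.53% js.ion.opt js [.] js::ion::MDefinition::replaceAllUsesWith
4.98% js.ion.opt js [.] js::ion::LinearScanAllocator::allocateRegisters
4.97% js.ion.opt js [.] js::ion::LinearScanAllocator::buildLivenessInfo
3.96% js.ion.opt js [.] js::ion::CodeGeneratorShared::encodeSlots
3.67% js.ion.opt js [.] js::ion::MResumePoint::inherit
3.37% js.ion.opt js [.] js::ion::LIRGeneratorShared::buildSnapshot
3.34% js.ion.opt js [.] js::ion::MPhi::op
2.94% js.ion.opt js [.] js::ion::MBasicBlock::inherit
2.38% js.ion.opt js [.] js::ion::LinearScanAllocator::setIntervalRequirement
1.91% js.ion.opt js [.] js::ion::ValueNumberer::lookupValue
1.85% js.ion.opt js [.] js::ion::MResumePoint::getOperand
1.79% js.ion.opt js [.] js::ion::LiveInterval::covers
1.59% js.ion.opt js [.] TypeAnalyzer::propagateSpecialization
1.52% js.ion.opt js [.] js::ion::LinearScanAllocator::reifyAllocations
1.45% js.ion.opt js [.] js::ion::MResumePoint::setOperand
1.35% js.ion.opt js [.] js::ion::SnapshotWriter::addUndefinedSlot
1.12% js.ion.opt js [.] js::ion::Loop::insertInWorklist
1.11% js.ion.opt js [.] js::ion::LiveInterval::firstIncompatibleUse
1.08% js.ion.opt js [.] js::ion::LinearScanAllocator::resolveControlFlow
1.06% js.ion.opt js [.] js::LifoAlloc::getOrCreateChunk
0.93% js.ion.opt js [.] js::ion::LinearScanAllocator::populateSafepoints
0.79% js.ion.opt js [.] js::ion::ValueNumberer::computeValueNumbers
0.77% js.ion.opt js [.] js::ion::MPhi::getOperand
0.72% js.ion.opt [kernel.kallsyms] [k] __percpu_counter_add
"""

# overhead%, command, shared object, [.] (user) or [k] (kernel), symbol
LINE_RE = re.compile(r"^\s*(\d+\.\d+)%\s+\S+\s+\S+\s+\[(.)\]\s+(.+)$")


def parse(report):
    """Return (overhead, kind, symbol) tuples for each matching line."""
    rows = []
    for line in report.splitlines():
        m = LINE_RE.match(line)
        if m:
            rows.append((Decimal(m.group(1)), m.group(2), m.group(3)))
    return rows


rows = parse(PERF_REPORT)
# Decimal keeps the two-digit percentages exact, avoiding float rounding.
total = sum(pct for pct, _, _ in rows)
user_only = sum(pct for pct, kind, _ in rows if kind == ".")

print(f"entries: {len(rows)}, total: {total}%, user-space only: {user_only}%")
```

The total includes the one kernel entry at the end of the listing; the user-space entries alone, all of them compiler or type-analysis routines, account for slightly less.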

Jan de Mooij [:jandem]

Comment 2

•

13 years ago

(In reply to Marty Rosenberg [:mjrosenb] from comment #1)
> I suspect that chunked compilation will help with this.

With my chunked compilation (WIP) patch, Ion is about 2x slower than JM+TI (60 ms vs 30 ms). About 30 ms is compilation time (5 ms with JM+TI). I hope we can bring this down to about 15-20 ms by optimizing snapshots a bit. There's a large number of locals, and with chunked compilation these have a fixed location, so we don't have to encode them.

Note that the interpreter is still faster than both Ion and JM+TI (20 ms), so I guess there's not much to optimize/win here for the JITs.

Alon, is it okay to use "200 200" instead of "20 20"? It won't read out-of-bounds array values or something?
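The timings quoted in comment 2 can be laid out as a small worked calculation. This is only a sketch under stated assumptions: the figures are the approximate ones from the comment, total time is treated as compilation plus everything else, and reducing compilation time is assumed to leave the remainder unchanged. The variable names are hypothetical.

```python
# Approximate figures from comment 2, in milliseconds.
ion_total, ion_compile = 60, 30   # Ion with the chunked compilation patch
jm_total, jm_compile = 30, 5      # JM+TI
interp_total = 20                 # interpreter

# Assumption: total = compilation + everything else.
ion_rest = ion_total - ion_compile
jm_rest = jm_total - jm_compile

# Hoped-for Ion compilation time after optimizing snapshots: 15-20 ms.
# Assumption: the non-compilation part stays the same.
projected_ion = [ion_rest + compile_ms for compile_ms in (15, 20)]

print("Ion, non-compilation part:  ", ion_rest, "ms")
print("JM+TI, non-compilation part:", jm_rest, "ms")
print("Projected Ion totals:       ", projected_ion, "ms")
print("Still slower than JM+TI:    ", min(projected_ion) > jm_total)
print("Still slower than interp:   ", min(projected_ion) > interp_total)
```

Under these assumptions the projected Ion total stays above both the JM+TI and interpreter figures at "20 20", which is consistent with the request to test with larger arguments.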

Depends on: 746225

Alon Zakai (:azakai)

Reporter

Comment 3

•

13 years ago

Yes, any value >0 for those two arguments is fine. The first parameter is how many malloc()/free() calls to do each repetition, the second is how many repetitions.

Here is the original source: https://github.com/kripken/emscripten/blob/c7bed7ab29a5e351166bf570825edc2a94c43aef/tests/dlmalloc_test.c

With high enough parameters, I would hope that JITs would help here...

Marco Castelluccio [:marco]

Updated

•

12 years ago

Blocks: IonSpeed

Guilherme Lima

Comment 4

•

12 years ago

On AWFY, misc-dlmalloc has regressed a lot since last September.

Nobody; OK to take it and work on it

Assignee

Updated

•

11 years ago

Assignee: general → nobody

BMO Automation

Updated

•

2 years ago

Severity: normal → S3

Mayank Bansal

Comment 5

•

6 months ago

Both Chrome and Nightly give "0,0" as the output.

Status: NEW → RESOLVED

Closed: 6 months ago

Resolution: --- → WORKSFORME


Categories

(Core :: JavaScript Engine, defect)

Security

(public)
