Closed
Bug 494639
Opened 15 years ago
Closed 15 years ago
nanojit: fix numerous LIR memory management problems
Categories
(Core :: JavaScript Engine, defect)
Tracking
()
RESOLVED
FIXED
People
(Reporter: n.nethercote, Assigned: n.nethercote)
References
Details
(Whiteboard: fixed-in-tracemonkey)
Attachments
(1 file, 2 obsolete files)
27.58 KB,
patch
|
Details | Diff | Splinter Review |
Nanojit has numerous problems with its management of LIR memory. I have 7 or 8 improvements in mind to make it better, and I started bugs for several of these (bug 494343, bug 494355, bug 494357, bug 494358). But after giving it more thought, it's clear to me that all of these improvements are interdependent, and trying to do them one at a time is a bad idea -- it would require writing and reviewing various bits of code that would only live temporarily. So I'm going to do them all in one larger patch instead.
Assignee | ||
Comment 1•15 years ago
|
||
- insSkipWithoutBuffer() was different to all the other ins*() functions,
because it didn't call ensureRoom(). I inlined it in the two places it
was called, which made them easier and facilitated merging ensureRoom()
with makeRoom(). (This replicates the patch from bug 494355.)
- I changed all address computations dealing with LIR buffers to be in terms
of bytes, and used uintptr_t to hold addresses. No pointer arithmetic!
This fixes some unit mismatches, and prevents others in the future. (This
addresses bug 494357.)
As part of this, the incorrect size of the 'lir' array in Page was fixed
-- it looked nice and generic but assumed that LIns was the same size as
PageHeader. It didn't actually manifest as a bug because the 'code' array
in the same union was the right size. Erk.
Another consequence is that skip payloads and call arglists are only
rounded up to the nearest word size, not the nearest sizeof(LIns), saving
a little bit of space.
- I combined ensureRoom() and commit() into a single function makeRoom().
This is less typing and avoids the possibility of passing different sizes
to an ensureRoom/commit pair. I also moved the instruction counting into
makeRoom(), as there's no point doing it repeatedly outside makeRoom().
And next() wasn't needed after this change. And it made sense to move
makeRoom() from LirBufWriter into LirBuffer, which meant that LirBufWriter
didn't need to be a friend of LirBuffer any more.
- Once that merging was done, makeRoom() could be changed so that it's
possible to fill up a Page right to the very last byte. Previously this
wasn't possible, because the _unused pointer would be left pointing past
the end of the page. This requirement that the last word (or instruction
slot?) be left free was easy to forget and/or misunderstand.
- That made knowing the maximum skip payload size easier. Making it even
easier, I renamed MAX_SKIP_BYTES and introduced a couple of related
constants that make things clearer; this addresses bug 494358. It also
makes it easy to see that the assertion in makeRoom() is correct, fixing
bug 494343.
- In general, I used expressive (eg. "szB" prefixes for sizes measured in
bytes) and sometimes long names, particularly in code (eg. makeRoom())
that have proven problematic in the past, with the aim of making their
operation more obvious.
Attachment #379644 -
Flags: review?(edwsmith)
Assignee | ||
Comment 2•15 years ago
|
||
Comment on attachment 379644 [details] [diff] [review]
patch fixing memory problems
Suggesting Graydon as an additional reviewer because he knows how many different ways things can go wrong with this code.
Attachment #379644 -
Flags: review?(graydon)
Comment 3•15 years ago
|
||
Comment on attachment 379644 [details] [diff] [review]
patch fixing memory problems
- I think you can probably remove all the "* sizeof(uint8)"
expressions. If it really worries you, throw a static assert in
LIR.cpp that sizeof(uint8) == 1.
- LirBuffer::insCount() and LirBuffer::byteCount() should be changed
to return uintptr_t, not int32_t as they are now.
- LirBuffer::makeRoom() is too complex and still uses the error-prone
samepage() judgments. I'd greatly prefer to see something based on
inequalities. More like:
if (_unused + szB >= pageBottom(_unused))
moveToNewPage(lastWritten());
- Several more uses of samepage() inside asserts. I feel this
is too weak a check to rely on. It doesn't tell us that we're on the
page we think we ought to be on, and it doesn't tell us we're even
in the data area of that page (rather than its header). A more
thorough double-checking would be IMO better: keep a "Page
*_currPage;" field in LirBufWriter and have helper method like:
void validate(uintptr_t addr) {
uintptr_t low(&_currPage->lir[0]);
uintptr_t hi(low + NJ_PAGE_CODE_AREA_SZB);
NanoAssert(low <= addr && addr < hi);
}
- Perhaps this is fodder for a followup bug, but even though I have a
font that emphasizes the distinction between 1 and l, I still think
having variables named l (or o) is visually irritating. Can we
switch them out for 'lins' or 'ins' or something?
Attachment #379644 -
Flags: review?(graydon) → review-
Assignee | ||
Comment 4•15 years ago
|
||
Attachment #379644 -
Attachment is obsolete: true
Attachment #383208 -
Flags: review?(graydon)
Attachment #383208 -
Flags: review?(edwsmith)
Attachment #379644 -
Flags: review?(edwsmith)
Assignee | ||
Comment 5•15 years ago
|
||
(In reply to comment #3)
> (From update of attachment 379644 [details] [diff] [review])
> - I think you can probably remove all the "* sizeof(uint8)"
> expressions. If it really worries you, throw a static assert in
> LIR.cpp that sizeof(uint8) == 1.
That code was already like that, and some of the instances are on lines that I
haven't changed for this bug, so I left them as-is. Could be a follow-up.
> - LirBuffer::insCount() and LirBuffer::byteCount() should be changed
> to return uintptr_t, not int32_t as they are now.
uintptr_t isn't appropriate, because the results aren't addresses. I changed
byteCount() to return a size_t, which is appropriate for a byte count. I left
insCount() as-is, because int32_t seems fine for a non-byte count.
> - LirBuffer::makeRoom() is too complex and still uses the error-prone
> samepage() judgments. I'd greatly prefer to see something based on
> inequalities. More like:
>
> if (_unused + szB >= pageBottom(_unused))
> moveToNewPage(lastWritten());
I changed the first occurrence to this:
+ if (_unused + szB - 1 > pageBottom(_unused))
+ moveToNewPage((uintptr_t)lastWritten());
Your version had an off-by-two error. I changed the other moveToNewPage() test
in a different way. To facilitate these changes, I fixed pageBottom() as per
bug 494817.
> - Several more uses of samepage() inside asserts. I feel this
> is too weak a check to rely on. It doesn't tell us that we're on the
> page we think we ought to be on, and it doesn't tell us we're even
> in the data area of that page (rather than its header). A more
> thorough double-checking would be IMO better: keep a "Page
> *_currPage;" field in LirBufWriter and have helper method like:
>
> void validate(uintptr_t addr) {
> uintptr_t low(&_currPage->lir[0]);
> uintptr_t hi(low + NJ_PAGE_CODE_AREA_SZB);
> NanoAssert(low <= addr && addr < hi);
> }
With the changes to makeRoom() above, only two samePage() calls are now within
asserts. The one in LirReader::read() isn't suitable for testing in the above
way. The one in insSkip() I instead added another assertion to check that
prevLInsAddr is in the data section of its page -- I feel that we have enough
state in these objects, and adding more is not a good idea.
> - Perhaps this is fodder for a followup bug, but even though I have a
> font that emphasizes the distinction between 1 and l, I still think
> having variables named l (or o) is visually irritating. Can we
> switch them out for 'lins' or 'ins' or something?
Follow-up bug, I think.
Assignee | ||
Comment 6•15 years ago
|
||
Ed, have you had a chance to look at this one yet?
Updated•15 years ago
|
Attachment #383208 -
Flags: review?(graydon) → review+
Comment 7•15 years ago
|
||
Comment on attachment 383208 [details] [diff] [review]
update patch
>--- a/js/src/nanojit/LIR.cpp
>+++ b/js/src/nanojit/LIR.cpp
...
>+ // Check we only spilled over by one byte.
>+ NanoAssert(_unused == pageTop(_unused));
>+ NanoAssert(_unused == pageBottom(startOfRoom) + 1);
>+ moveToNewPage(_unused - sizeof(LIns));
>+ uintptr_t addrOfLastLInsOnPage = _unused - sizeof(LIns);
>+ moveToNewPage(addrOfLastLInsOnPage);
Delete the last 2 lines here, move to new page only once. Presumably a typo from the last round?
>--- a/js/src/nanojit/NativeARM.cpp
>+++ b/js/src/nanojit/NativeARM.cpp
>@@ -851,17 +851,17 @@ Assembler::nativePageSetup()
> {
> // This needs to be done or the samepage macro gets confused; pageAlloc
> // gives us a pointer to just past the end of the page.
> _nIns--;
> _nExitIns--;
>
> // constpool starts at top of page and goes down,
> // code starts at bottom of page and moves up
>- _nSlot = pageDataStart(_nIns); //(int*)(&((Page*)pageTop(_nIns))->lir[0]);
>+ _nSlot = (int*)pageDataStart(_nIns); //(int*)(&((Page*)pageTop(_nIns))->lir[0]);
Remove the trailing comment here. Readers can look up pageDataStart.
r+ with these fixed and a sufficient amount of testing to catch whatever I missed. I'll be happy to help with the fuzzer of course.
Thanks for correcting my arithmetic.
Assignee | ||
Comment 8•15 years ago
|
||
(In reply to comment #7)
> (From update of attachment 383208 [details] [diff] [review])
> >--- a/js/src/nanojit/LIR.cpp
> >+++ b/js/src/nanojit/LIR.cpp
> ...
> >+ // Check we only spilled over by one byte.
> >+ NanoAssert(_unused == pageTop(_unused));
> >+ NanoAssert(_unused == pageBottom(startOfRoom) + 1);
> >+ moveToNewPage(_unused - sizeof(LIns));
> >+ uintptr_t addrOfLastLInsOnPage = _unused - sizeof(LIns);
> >+ moveToNewPage(addrOfLastLInsOnPage);
>
> Delete the last 2 lines here, move to new page only once. Presumably a typo
> from the last round?
Urk. I will fix.
> >--- a/js/src/nanojit/NativeARM.cpp
> >+++ b/js/src/nanojit/NativeARM.cpp
> >@@ -851,17 +851,17 @@ Assembler::nativePageSetup()
> > {
> > // This needs to be done or the samepage macro gets confused; pageAlloc
> > // gives us a pointer to just past the end of the page.
> > _nIns--;
> > _nExitIns--;
> >
> > // constpool starts at top of page and goes down,
> > // code starts at bottom of page and moves up
> >- _nSlot = pageDataStart(_nIns); //(int*)(&((Page*)pageTop(_nIns))->lir[0]);
> >+ _nSlot = (int*)pageDataStart(_nIns); //(int*)(&((Page*)pageTop(_nIns))->lir[0]);
>
> Remove the trailing comment here. Readers can look up pageDataStart.
It was like that before I got here, but I will remove.
> r+ with these fixed and a sufficient amount of testing to catch whatever I
> missed. I'll be happy to help with the fuzzer of course.
I'll run some TryServer tests, and check it on an ARM scratchbox. Which fuzzer are you talking about -- Jesse's JS one or your lirasm one? Anything else on your mind w.r.t. "sufficient amount of testing"?
Assignee | ||
Comment 9•15 years ago
|
||
Attachment #383208 -
Attachment is obsolete: true
Attachment #383208 -
Flags: review?(edwsmith)
Comment 10•15 years ago
|
||
Passed most testing I can throw at it casually (linux-x86, mac-x86, debug and not, arm). I'll land after lunch, once the tinderboxes cycle green.
Comment 11•15 years ago
|
||
Whiteboard: fixed-in-tracemonkey
Comment 17•15 years ago
|
||
This patch causes an 8ms performance regression on my machine on SS (across the board). I know the SS isn't really super reliable for such small differences, but this patch produces a clearly reproducible slowdown.
Comment 18•15 years ago
|
||
That is kind of remarkable. Sorry I did not detect. Give me a few days to profile and diagnose before backing out? It's something I'd really like to stick.
Comment 19•15 years ago
|
||
Yeah, no way we are going to back it out on a whim. It fixes a bunch of scary stuff.
Assignee | ||
Comment 20•15 years ago
|
||
The patch doesn't contain anything that I would expect to affect performance. The only thing I could imagine having any effect at all is the change from ensureRoom() to makeRoom()/moveToNewPage(). The rest of the changes really are tiny, eg. changing various pointer types to integer types.
How are you running SunSpider to get your 8ms slowdown figure? I find --runs=10 (or any other number of multiple runs) very difficult to get consistent figures out of. The best methodology I can come up with is to run with --runs=1 lots of times, and look at the best results, and also "typical best" results, for some vague intuitive definition of "typical best" (eg. ignore any freak low numbers, go for the "best" number that is repeatable a few times).
On Linux, I couldn't see any difference between the two.
On Mac, I can maybe see a slowdown something like 10ms. (I wouldn't are state as precisely as "8ms", IMO SunSpider has way too much variation to be that confident.)
I did some analysis with Cachegrind. On Linux, I couldn't see any differences of any note. makeRoom() accounts for 3 million instructions out of 3.4 billion total, and moveToNewPage() accounts for less than 18,000, and their cache misses were negligible. On Mac, Cachegrind didn't seem to be working so well (problems with debug info that I'll have to look into) so I can't say as much there, but the total numbers (instructions executed, L1 misses, L2 misses) didn't change much.
I also tried v8. On Mac, "typical best" went from 407 to 406. The 10ms SunSpider slowdown is ~1%, this is more like a 0.25% change so I don't consider it significant. On Linux, it went from 496 to about 500, so an improvement. But again, there's enough variation (esp. on Linux) to not trust that too much.
As for what's next... Graydon, maybe you could do a comparison on Windows to see what's happening there?
We really need a better performance suite, trying to make judgments with SunSpider and V8 is hopeless.
Comment 21•15 years ago
|
||
Can't reproduce on any platform. Spent the morning trying, honest! The sunspider comparison scripts, even on my hyper-sensitive mac, show mostly "not conclusive", with a handful of speedups and a couple slowdowns, all in the range 1.01x or so. And they move back and forth / disappear / reappear as I re-run it.
IOW, noise. I can't make it any more sensitive and I'm not seeing a signal.
Gal: can you confirm the revision numbers you're using, check the builds, and post your SS results? I feel like I'm chasing ghosts here.
Assignee | ||
Comment 22•15 years ago
|
||
Some more data, using sunspider-compare-results and lots of runs:
- Linux, 200 runs: might be 1.008x slower
- Linux, 500 runs: definitely 1.070x faster
- Mac, 500 runs: definitely 1.006x slower
I need lots of runs particular for Linux, with --runs=10 I get huge variation like +/- 10%.
Notice that the Linux-500 speed-up is much bigger than the other slow-downs. I'm really not convinced the changes are anything more than noise.
Comment 23•15 years ago
|
||
Status: ASSIGNED → RESOLVED
Closed: 15 years ago
Resolution: --- → FIXED
You need to log in
before you can comment on or make changes to this bug.
Description
•