494639 - nanojit: fix numerous LIR memory management problems

Assignee

Description

•

15 years ago

Nanojit has numerous problems with its management of LIR memory. I have 7 or 8 improvements in mind to make it better, and I started bugs for several of these (bug 494343, bug 494355, bug 494357, bug 494358). But after giving it more thought, it's clear to me that all of these improvements are interdependent, and trying to do them one at a time is a bad idea -- it would require writing and reviewing various bits of code that would only live temporarily. So I'm going to do them all in one larger patch instead.

Nicholas Nethercote [inactive]

Assignee

Comment 1

•

15 years ago

Attached patch patch fixing memory problems (obsolete) — Details — Splinter Review

- insSkipWithoutBuffer() was different to all the other ins*() functions, because it didn't call ensureRoom(). I inlined it in the two places it was called, which made them easier and facilitated merging ensureRoom() with makeRoom(). (This replicates the patch from bug 494355.) - I changed all address computations dealing with LIR buffers to be in terms of bytes, and used uintptr_t to hold addresses. No pointer arithmetic! This fixes some unit mismatches, and prevents others in the future. (This addresses bug 494357.) As part of this, the incorrect size of the 'lir' array in Page was fixed -- it looked nice and generic but assumed that LIns was the same size as PageHeader. It didn't actually manifest as a bug because the 'code' array in the same union was the right size. Erk. Another consequence is that skip payloads and call arglists are only rounded up to the nearest word size, not the nearest sizeof(LIns), saving a little bit of space. - I combined ensureRoom() and commit() into a single function makeRoom(). This is less typing and avoids the possibility of passing different sizes to an ensureRoom/commit pair. I also moved the instruction counting into makeRoom(), as there's no point doing it repeatedly outside makeRoom(). And next() wasn't needed after this change. And it made sense to move makeRoom() from LirBufWriter into LirBuffer, which meant that LirBufWriter didn't need to be a friend of LirBuffer any more. - Once that merging was done, makeRoom() could be changed so that it's possible to fill up a Page right to the very last byte. Previously this wasn't possible, because the _unused pointer would be left pointing past the end of the page. This requirement that the last word (or instruction slot?) be left free was easy to forget and/or misunderstand. - That made knowing the maximum skip payload size easier. Making it even easier, I renamed MAX_SKIP_BYTES and introduced a couple of related constants that make things clearer; this addresses bug 494358. It also makes it easy to see that the assertion in makeRoom() is correct, fixing bug 494343. - In general, I used expressive (eg. "szB" prefixes for sizes measured in bytes) and sometimes long names, particularly in code (eg. makeRoom()) that have proven problematic in the past, with the aim of making their operation more obvious.

Attachment #379644 - Flags: review?(edwsmith)

Nicholas Nethercote [inactive]

Assignee

Comment 2

•

15 years ago

Comment on attachment 379644 [details] [diff] [review] patch fixing memory problems Suggesting Graydon as an additional reviewer because he knows how many different ways things can go wrong with this code.

Attachment #379644 - Flags: review?(graydon)

Nicholas Nethercote [inactive]

Assignee

Updated

•

15 years ago

Blocks: 492866

Nicholas Nethercote [inactive]

Assignee

Updated

•

15 years ago

Blocks: 495734

Graydon Hoare :graydon

Comment 3

•

15 years ago

Comment on attachment 379644 [details] [diff] [review] patch fixing memory problems - I think you can probably remove all the "* sizeof(uint8)" expressions. If it really worries you, throw a static assert in LIR.cpp that sizeof(uint8) == 1. - LirBuffer::insCount() and LirBuffer::byteCount() should be changed to return uintptr_t, not int32_t as they are now. - LirBuffer::makeRoom() is too complex and still uses the error-prone samepage() judgments. I'd greatly prefer to see something based on inequalities. More like: if (_unused + szB >= pageBottom(_unused)) moveToNewPage(lastWritten()); - Several more uses of samepage() inside asserts. I feel this is too weak a check to rely on. It doesn't tell us that we're on the page we think we ought to be on, and it doesn't tell us we're even in the data area of that page (rather than its header). A more thorough double-checking would be IMO better: keep a "Page *_currPage;" field in LirBufWriter and have helper method like: void validate(uintptr_t addr) { uintptr_t low(&_currPage->lir[0]); uintptr_t hi(low + NJ_PAGE_CODE_AREA_SZB); NanoAssert(low <= addr && addr < hi); } - Perhaps this is fodder for a followup bug, but even though I have a font that emphasizes the distinction between 1 and l, I still think having variables named l (or o) is visually irritating. Can we switch them out for 'lins' or 'ins' or something?

Attachment #379644 - Flags: review?(graydon) → review-

Nicholas Nethercote [inactive]

Assignee

Comment 4

•

15 years ago

Attached patch update patch (obsolete) — Details — Splinter Review

Attachment #379644 - Attachment is obsolete: true

Attachment #383208 - Flags: review?(graydon)

Attachment #383208 - Flags: review?(edwsmith)

Attachment #379644 - Flags: review?(edwsmith)

Nicholas Nethercote [inactive]

Assignee

Comment 5

•

15 years ago

(In reply to comment #3) > (From update of attachment 379644 [details] [diff] [review]) > - I think you can probably remove all the "* sizeof(uint8)" > expressions. If it really worries you, throw a static assert in > LIR.cpp that sizeof(uint8) == 1. That code was already like that, and some of the instances are on lines that I haven't changed for this bug, so I left them as-is. Could be a follow-up. > - LirBuffer::insCount() and LirBuffer::byteCount() should be changed > to return uintptr_t, not int32_t as they are now. uintptr_t isn't appropriate, because the results aren't addresses. I changed byteCount() to return a size_t, which is appropriate for a byte count. I left insCount() as-is, because int32_t seems fine for a non-byte count. > - LirBuffer::makeRoom() is too complex and still uses the error-prone > samepage() judgments. I'd greatly prefer to see something based on > inequalities. More like: > > if (_unused + szB >= pageBottom(_unused)) > moveToNewPage(lastWritten()); I changed the first occurrence to this: + if (_unused + szB - 1 > pageBottom(_unused)) + moveToNewPage((uintptr_t)lastWritten()); Your version had an off-by-two error. I changed the other moveToNewPage() test in a different way. To facilitate these changes, I fixed pageBottom() as per bug 494817. > - Several more uses of samepage() inside asserts. I feel this > is too weak a check to rely on. It doesn't tell us that we're on the > page we think we ought to be on, and it doesn't tell us we're even > in the data area of that page (rather than its header). A more > thorough double-checking would be IMO better: keep a "Page > *_currPage;" field in LirBufWriter and have helper method like: > > void validate(uintptr_t addr) { > uintptr_t low(&_currPage->lir[0]); > uintptr_t hi(low + NJ_PAGE_CODE_AREA_SZB); > NanoAssert(low <= addr && addr < hi); > } With the changes to makeRoom() above, only two samePage() calls are now within asserts. The one in LirReader::read() isn't suitable for testing in the above way. The one in insSkip() I instead added another assertion to check that prevLInsAddr is in the data section of its page -- I feel that we have enough state in these objects, and adding more is not a good idea. > - Perhaps this is fodder for a followup bug, but even though I have a > font that emphasizes the distinction between 1 and l, I still think > having variables named l (or o) is visually irritating. Can we > switch them out for 'lins' or 'ins' or something? Follow-up bug, I think.

Nicholas Nethercote [inactive]

Assignee

Comment 6

•

15 years ago

Ed, have you had a chance to look at this one yet?

Graydon Hoare :graydon

Updated

•

15 years ago

Attachment #383208 - Flags: review?(graydon) → review+

Graydon Hoare :graydon

Comment 7

•

15 years ago

Comment on attachment 383208 [details] [diff] [review] update patch >--- a/js/src/nanojit/LIR.cpp >+++ b/js/src/nanojit/LIR.cpp ... >+ // Check we only spilled over by one byte. >+ NanoAssert(_unused == pageTop(_unused)); >+ NanoAssert(_unused == pageBottom(startOfRoom) + 1); >+ moveToNewPage(_unused - sizeof(LIns)); >+ uintptr_t addrOfLastLInsOnPage = _unused - sizeof(LIns); >+ moveToNewPage(addrOfLastLInsOnPage); Delete the last 2 lines here, move to new page only once. Presumably a typo from the last round? >--- a/js/src/nanojit/NativeARM.cpp >+++ b/js/src/nanojit/NativeARM.cpp >@@ -851,17 +851,17 @@ Assembler::nativePageSetup() > { > // This needs to be done or the samepage macro gets confused; pageAlloc > // gives us a pointer to just past the end of the page. > _nIns--; > _nExitIns--; > > // constpool starts at top of page and goes down, > // code starts at bottom of page and moves up >- _nSlot = pageDataStart(_nIns); //(int*)(&((Page*)pageTop(_nIns))->lir[0]); >+ _nSlot = (int*)pageDataStart(_nIns); //(int*)(&((Page*)pageTop(_nIns))->lir[0]); Remove the trailing comment here. Readers can look up pageDataStart. r+ with these fixed and a sufficient amount of testing to catch whatever I missed. I'll be happy to help with the fuzzer of course. Thanks for correcting my arithmetic.

Nicholas Nethercote [inactive]

Assignee

Comment 8

•

15 years ago

(In reply to comment #7) > (From update of attachment 383208 [details] [diff] [review]) > >--- a/js/src/nanojit/LIR.cpp > >+++ b/js/src/nanojit/LIR.cpp > ... > >+ // Check we only spilled over by one byte. > >+ NanoAssert(_unused == pageTop(_unused)); > >+ NanoAssert(_unused == pageBottom(startOfRoom) + 1); > >+ moveToNewPage(_unused - sizeof(LIns)); > >+ uintptr_t addrOfLastLInsOnPage = _unused - sizeof(LIns); > >+ moveToNewPage(addrOfLastLInsOnPage); > > Delete the last 2 lines here, move to new page only once. Presumably a typo > from the last round? Urk. I will fix. > >--- a/js/src/nanojit/NativeARM.cpp > >+++ b/js/src/nanojit/NativeARM.cpp > >@@ -851,17 +851,17 @@ Assembler::nativePageSetup() > > { > > // This needs to be done or the samepage macro gets confused; pageAlloc > > // gives us a pointer to just past the end of the page. > > _nIns--; > > _nExitIns--; > > > > // constpool starts at top of page and goes down, > > // code starts at bottom of page and moves up > >- _nSlot = pageDataStart(_nIns); //(int*)(&((Page*)pageTop(_nIns))->lir[0]); > >+ _nSlot = (int*)pageDataStart(_nIns); //(int*)(&((Page*)pageTop(_nIns))->lir[0]); > > Remove the trailing comment here. Readers can look up pageDataStart. It was like that before I got here, but I will remove. > r+ with these fixed and a sufficient amount of testing to catch whatever I > missed. I'll be happy to help with the fuzzer of course. I'll run some TryServer tests, and check it on an ARM scratchbox. Which fuzzer are you talking about -- Jesse's JS one or your lirasm one? Anything else on your mind w.r.t. "sufficient amount of testing"?

Nicholas Nethercote [inactive]

Assignee

Comment 9

•

15 years ago

Attached patch updated patch, fixes problems from comment #7 — Details — Splinter Review

Attachment #383208 - Attachment is obsolete: true

Attachment #383208 - Flags: review?(edwsmith)

Graydon Hoare :graydon

Comment 10

•

15 years ago

Passed most testing I can throw at it casually (linux-x86, mac-x86, debug and not, arm). I'll land after lunch, once the tinderboxes cycle green.

Graydon Hoare :graydon

Comment 11

•

15 years ago

http://hg.mozilla.org/tracemonkey/rev/20a92feefc99

Whiteboard: fixed-in-tracemonkey

Andreas Gal :gal

Comment 17

•

15 years ago

This patch causes an 8ms performance regression on my machine on SS (across the board). I know the SS isn't really super reliable for such small differences, but this patch produces a clearly reproducible slowdown.

Graydon Hoare :graydon

Comment 18

•

15 years ago

That is kind of remarkable. Sorry I did not detect. Give me a few days to profile and diagnose before backing out? It's something I'd really like to stick.

Andreas Gal :gal

Comment 19

•

15 years ago

Yeah, no way we are going to back it out on a whim. It fixes a bunch of scary stuff.

Nicholas Nethercote [inactive]

Assignee

Comment 20

•

15 years ago

The patch doesn't contain anything that I would expect to affect performance. The only thing I could imagine having any effect at all is the change from ensureRoom() to makeRoom()/moveToNewPage(). The rest of the changes really are tiny, eg. changing various pointer types to integer types. How are you running SunSpider to get your 8ms slowdown figure? I find --runs=10 (or any other number of multiple runs) very difficult to get consistent figures out of. The best methodology I can come up with is to run with --runs=1 lots of times, and look at the best results, and also "typical best" results, for some vague intuitive definition of "typical best" (eg. ignore any freak low numbers, go for the "best" number that is repeatable a few times). On Linux, I couldn't see any difference between the two. On Mac, I can maybe see a slowdown something like 10ms. (I wouldn't are state as precisely as "8ms", IMO SunSpider has way too much variation to be that confident.) I did some analysis with Cachegrind. On Linux, I couldn't see any differences of any note. makeRoom() accounts for 3 million instructions out of 3.4 billion total, and moveToNewPage() accounts for less than 18,000, and their cache misses were negligible. On Mac, Cachegrind didn't seem to be working so well (problems with debug info that I'll have to look into) so I can't say as much there, but the total numbers (instructions executed, L1 misses, L2 misses) didn't change much. I also tried v8. On Mac, "typical best" went from 407 to 406. The 10ms SunSpider slowdown is ~1%, this is more like a 0.25% change so I don't consider it significant. On Linux, it went from 496 to about 500, so an improvement. But again, there's enough variation (esp. on Linux) to not trust that too much. As for what's next... Graydon, maybe you could do a comparison on Windows to see what's happening there? We really need a better performance suite, trying to make judgments with SunSpider and V8 is hopeless.

Graydon Hoare :graydon

Comment 21

•

15 years ago

Can't reproduce on any platform. Spent the morning trying, honest! The sunspider comparison scripts, even on my hyper-sensitive mac, show mostly "not conclusive", with a handful of speedups and a couple slowdowns, all in the range 1.01x or so. And they move back and forth / disappear / reappear as I re-run it. IOW, noise. I can't make it any more sensitive and I'm not seeing a signal. Gal: can you confirm the revision numbers you're using, check the builds, and post your SS results? I feel like I'm chasing ghosts here.

Nicholas Nethercote [inactive]

Assignee

Comment 22

•

15 years ago

Some more data, using sunspider-compare-results and lots of runs: - Linux, 200 runs: might be 1.008x slower - Linux, 500 runs: definitely 1.070x faster - Mac, 500 runs: definitely 1.006x slower I need lots of runs particular for Linux, with --runs=10 I get huge variation like +/- 10%. Notice that the Linux-500 speed-up is much bigger than the other slow-downs. I'm really not convinced the changes are anything more than noise.

Robert Sayre

Comment 23

•

15 years ago

http://hg.mozilla.org/mozilla-central/rev/20a92feefc99

Status: ASSIGNED → RESOLVED

Closed: 15 years ago

Resolution: --- → FIXED

patch fixing memory problems 15 years ago Nicholas Nethercote [inactive] 23.57 KB, patch	graydon : review-	Details \| Diff \| Splinter Review
update patch 15 years ago Nicholas Nethercote [inactive] 27.67 KB, patch	graydon : review+	Details \| Diff \| Splinter Review
updated patch, fixes problems from comment #7 15 years ago Nicholas Nethercote [inactive] 27.58 KB, patch		Details \| Diff \| Splinter Review