Closed Bug 427959 Opened 18 years ago Closed 17 years ago

doubling TT performance on string concatenation

Categories

(Tamarin Graveyard :: Tracing Virtual Machine, defect, P2)

x86
Linux
defect

Tracking

(Not tracked)

VERIFIED WONTFIX

People

(Reporter: m.rokicki, Unassigned)

Details

Attachments

(1 file, 1 obsolete file)

User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.0.13pre) Gecko/20070505 (Debian-1.8.0.15~pre080131b-0etch1) Epiphany/2.14 Build Identifier: We have observed that String::concatStrings (AvmCore* core, Stringp leftStr, Stringp rightStr) can be seedup by reducing the number of alloc and copy operations on the StringBuf*. More specifically, the rightStr->m_buf can be appended to the leftStr->m_buf buffer provided the leftStr is the longest string using the buffer leftStr->m_buf. This way string is extended without affecting the shorter strings which use the same buffer. Prveiously the rightStr->m_buf could be appended to the leftStr->m_buf buffer provided the leftBufData->RefCount()==1. This condition seems to be too strong. The performance results on sunspider tests are very promising platform: Linux 2.6.18-6-686 #1 SMP Sun Feb 10 22:11:31 UTC 2008 i686 GNU/Linux: 1) string-validate-input.abc TT with modification: 654 TT originally: 4 591 Thus our modification improves performance 7 times. Tamarin Central : 160 (4 times faster than TT) 2) string-unpack-code.abc: TT with modification: 19 721 TT originally: 26 052 Thus our modification improves performance by 25%. Tamarin Central: 10 500 (twice faster than TT) Reproducible: Always Steps to Reproduce: 1. 2. 3.
Attachment #314554 - Flags: review?(treilly)
We want to get the max len from GC::Size() and not add a field to RCObject.
Yes I agree that adding the new member maxStrLen to the RCObject is not very elegant. Logically maxStrLen belongs to the to the String::StringBuf class. However this class relies on its bit pattern very much and adding the new member causes segmentation fault. When maxStrLen is moved the RCObject the super class of StringBuf everything works fine. I could try to move maxStrLen to the String::StringBuf class but I would need some help. It seems to me that GC::Size() returns the whole memory allocated for the object e.g. it is used to compute capacity of the buffer size_t StringBuf::capacity() const { return GC::Size(this) - sizeof(RCObject) - EXTRA_AT_END; } While we need to know what is the max string length which points to the buffer. Each String object stores its length and the pointer to the buffer. class String : public AvmPlusScriptableObject { .... uint32_t m_lenAndFlags; // string length uintptr_t m_buf; // pointer to the buffer .... } Thus during concatenation of two stings (leftStr and rightStr) the leftStr->m_buf can be extended by appending rightStr->m_buf when leftStr is the longest of all strings using m_buf. Otherwise we could change the value of longer strings which points to the buffer.
Okay I haven't dug into the optimization to fully understand it, can we get the same optimization w/o the extra storage penalty? Small strings are very common.
We have also improved the memory usage. This is because we extend the string when it is possible rather than allocate new buffer. I have done memory usage tests with valgrind on sunspider test string-validate-input (platform: Linux 2.6.18-6-686 #1 SMP Sun Feb 10 22:11:31 UTC 2008 i686). The memory usage with our modification is up to 260k while original TT shell uses up to 350k.
The RCObject::maxStrLen was moved to the String::StringBuf::maxStrLen
Attachment #316199 - Flags: review?(treilly)
Attachment #314554 - Attachment is obsolete: true
Attachment #314554 - Flags: review?(treilly)
That's better. So the idea is to reuse strbuf's that have slop space but you don't know what that slop is just by looking at the String that owns it (when refcount > 1). I can see how this saves space for tests that concat alot but it might require more space in different use cases. Could we detect that there was space left to use in StringBuf by writing sentinel bytes to the unused space? Could a contiguous sequence of 0xff be treated as unused space? I think so for utf8 anyways. Why check ref count > 1? Seems to me we know longer care what the ref count is with this optimization. RefCount was never meant to be an exposed MMgc API anyways...
> That's better. So the idea is to reuse strbuf's that have slop space but you > don't know what that slop is just by looking at the String that owns it (when > refcount > 1). Yes we just need to know the maxStrLen which uses the StringBuf. > I can see how this saves space for tests that concat alot but > it might require more space in different use cases. It seems to me that we always save on space. Generally appending rightStr->m_buf to the leftStr->m_buf is the most efficient. This idea was already implemented. However previously this approach was limited by checking how many references (m_buf->refCount())the leftStr->m_buf. The StringBuf was extended only when there was enough space and there was only one reference to the StringBuf. This is because otherwise we could modify the contents of the buffer for other references. We relax this constraint by observing that leftStr->m_buf can be extended provided leftStr is the longest string (leftStr->getLen()) of all strings using the leftStr->m_buf. This way the shorter strings using the same buffer are not modified. Thus we have to maintain StringBuf::maxStrLen. > Why check ref count > 1? Seems to me we know longer care what the ref count > is with this optimization. RefCount was never meant to be an exposed MMgc API > anyways... When number of references to the StringBuf drops the maxStrLen is not updated but when leftStr->RefCount()==1 we can safely append the rightStr and we can also update maxStrLen (this was not implemented in the patch yet).
I would assert there is no need for String to be independent of MMgc. If tight coupling of the string implementation to the allocator/gc is a performance win, lets go for it. In fact lets go all the way and make strings be an MMgc feature, and if a different allocator comes along that favors a different set of string optimizations, then it also should provide a tightly coupled string class.
Priority: -- → P2
This is made obsolete by the new String stuff I think.
Status: NEW → RESOLVED
Closed: 17 years ago
Resolution: --- → WONTFIX
Attachment #316199 - Flags: review?(treilly) → review-
Status: RESOLVED → VERIFIED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: