Closed Bug 427959 Opened 12 years ago Closed 12 years ago

doubling TT performance on string concatenation

Categories

(Tamarin Graveyard :: Tracing Virtual Machine, defect, P2)

x86
Linux
defect

Tracking

(Not tracked)

VERIFIED WONTFIX

People

(Reporter: m.rokicki, Unassigned)

Details

Attachments

(1 file, 1 obsolete file)

User-Agent:       Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.0.13pre) Gecko/20070505 (Debian-1.8.0.15~pre080131b-0etch1) Epiphany/2.14
Build Identifier: 


We have observed that String::concatStrings (AvmCore* core, Stringp leftStr, Stringp rightStr) can be seedup by reducing the number of alloc
and copy operations on the StringBuf*.
More specifically, the rightStr->m_buf can be appended to the leftStr->m_buf buffer provided the leftStr is the longest string using the
buffer leftStr->m_buf. This way string is extended without affecting the shorter
strings which use the same buffer. 

Prveiously the rightStr->m_buf could be appended to the leftStr->m_buf buffer provided the leftBufData->RefCount()==1. This condition seems to be too strong.

The performance results on sunspider tests are very promising
platform: Linux 2.6.18-6-686 #1 SMP Sun Feb 10 22:11:31 UTC 2008 i686 GNU/Linux:

1) string-validate-input.abc 
   TT with modification: 654
   TT originally: 4 591
   Thus our modification improves performance 7 times.

   Tamarin Central : 160 (4 times faster than TT)

2) string-unpack-code.abc: 
   TT with modification: 19 721
   TT originally: 26 052
   Thus our modification improves performance by 25%.
   
   Tamarin Central: 10 500 (twice faster than TT)



Reproducible: Always

Steps to Reproduce:
1.
2.
3.
This bug is also linked to the https://bugzilla.mozilla.org/show_bug.cgi?id=423519
Attachment #314554 - Flags: review?(treilly)
We want to get the max len from GC::Size() and not add a field to RCObject. 
Yes I agree that adding the new member maxStrLen to the RCObject is not
very elegant. Logically maxStrLen belongs to the to the String::StringBuf class.
However this class relies on its bit pattern very much and adding
the new member causes segmentation fault. When maxStrLen is moved the RCObject 
the super class of StringBuf everything works fine.
I could try to move maxStrLen to the String::StringBuf class but I would need some help.

It seems to me that GC::Size() returns the whole memory allocated for the object e.g. it is used to compute capacity of the buffer

size_t StringBuf::capacity() const { return GC::Size(this) - sizeof(RCObject) - EXTRA_AT_END; }

While we need to know what is the max string length which points to the buffer.
Each String object stores its length and the pointer to the buffer.
	class String : public AvmPlusScriptableObject
	{
          ....
		uint32_t		m_lenAndFlags; // string length
		uintptr_t		m_buf;	// pointer to the buffer
          ....
            }

Thus during concatenation of two stings (leftStr and rightStr) the 
leftStr->m_buf can be extended by appending rightStr->m_buf when leftStr is the longest of all strings using m_buf.
Otherwise we could change the value of longer strings which points to the buffer.
Okay I haven't dug into the optimization to fully understand it, can we get the same optimization w/o the extra storage penalty?   Small strings are very common.
We have also improved the memory usage.
This is because we extend the string 
when it is possible rather than allocate new buffer.
I have done memory usage tests with valgrind on sunspider test
string-validate-input (platform: Linux 2.6.18-6-686 #1 SMP Sun Feb 10 22:11:31 UTC 2008 i686). The memory usage with our modification is up to 260k while
original TT shell uses up to 350k.
The RCObject::maxStrLen was moved to the String::StringBuf::maxStrLen
Attachment #316199 - Flags: review?(treilly)
Attachment #314554 - Attachment is obsolete: true
Attachment #314554 - Flags: review?(treilly)
That's better.   So the idea is to reuse strbuf's that have slop space but you don't know what that slop is just by looking at the String that owns it (when refcount > 1).   I can see how this saves space for tests that concat alot but it might require more space in different use cases.   

Could we detect that there was space left to use in StringBuf by writing sentinel bytes to the unused space?   Could a contiguous sequence of 0xff be treated as unused space?   I think so for utf8 anyways.   

Why check ref count > 1?   Seems to me we know longer care what the ref count is with this optimization.  RefCount was never meant to be an exposed MMgc API anyways...
> That's better.   So the idea is to reuse strbuf's that have slop space but you
> don't know what that slop is just by looking at the String that owns it (when
> refcount > 1).

Yes we just need to know the maxStrLen which uses the StringBuf.

> I can see how this saves space for tests that concat alot but
> it might require more space in different use cases.   

It seems to me that we always save on space.
Generally appending rightStr->m_buf to the leftStr->m_buf is the most efficient.
This idea was already implemented.
However previously this approach was limited by checking 
how many references (m_buf->refCount())the leftStr->m_buf.
The StringBuf was extended only when there was enough space and there was only
one reference to the StringBuf. This is because otherwise we could modify the
contents of the buffer for other references.
We relax this constraint by observing that leftStr->m_buf can be extended 
provided leftStr is the longest string (leftStr->getLen()) of all strings using the leftStr->m_buf.
This way the shorter strings using the same buffer are not modified. 
Thus we have to maintain StringBuf::maxStrLen.

> Why check ref count > 1?   Seems to me we know longer care what the ref count
> is with this optimization.  RefCount was never meant to be an exposed MMgc API
> anyways...

When number of references to the StringBuf drops the maxStrLen is not updated but when leftStr->RefCount()==1 we can safely append the rightStr and we can also update maxStrLen (this was not implemented in the patch yet).
I would assert there is no need for String to be independent of MMgc.  If tight coupling of the string implementation to the allocator/gc is a performance win, lets go for it.  In fact lets go all the way and make strings be an MMgc feature, and if a different allocator comes along that favors a different set of string optimizations, then it also should provide a tightly coupled string class.
Priority: -- → P2
This is made obsolete by the new String stuff I think.
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → WONTFIX
Attachment #316199 - Flags: review?(treilly) → review-
Status: RESOLVED → VERIFIED
You need to log in before you can comment on or make changes to this bug.