425097 - SIMD optimization in UnicodeUtils::Utf16ToUtf8 improves Sunspider string-unpack performance by ~4.2%

Reporter

Description

•

16 years ago

User-Agent:       Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322; .NET CLR 2.0.50727; InfoPath.1; MS-RTC LM 8)
Build Identifier: 

I looked into UTf16toUTF8  function and managed to get some optimizations. Basically, there are two main branches in the code and the second one that does "count output characters w.o actually encoding" could be short-circuited by doing a special case when most of the character values are less than 0x80. The branch computation ch<0x80 is the hottest (i.e this branch is taken most of the times) because the rest of the branches seems to be dealing with corner cases (such as checking for Invalid strings etc). 
 
In my SIMD implementation, I did 16-byte branch evaluation and using the mask to figure out if all of the 16-bytes are less than 0x80. This helps the performance and improves the string-unpack-code.abc workflow by ~4.2%. The Utf16ToUtf8 function with changes is included in the "Addition Information" tab as I couldn't figure out how to attach files. The SSE2 code path needs to be branched with a sse2 flag accordingly, but there should be a flag already existing in the Tamarin source that could be used. 

Reproducible: Always

Steps to Reproduce:
1.
2.
3.




	int UnicodeUtils::Utf16ToUtf8(const wchar *in,
								  int inLen,
								  uint8 *out,
								  int outMax)
	{
		int outLen = 0;
		if (out)
		{
			// Output buffer passed in; actually encode data.
			while (inLen > 0)
			{
				wchar ch = *in;
				inLen--;
				if (ch < 0x80) {
					if (--outMax < 0) {
						return -1;
					}
					*out++ = (uint8)ch;
					outLen++;
				}
				else if (ch < 0x800) {
					if ((outMax -= 2) < 0) {
						return -1;
					}
					*out++ = (uint8)(0xC0 | (ch>>6)&0x1F);
					*out++ = (uint8)(0x80 | (ch&0x3F));
					outLen += 2;
				}
				else if (ch >= 0xD800 && ch <= 0xDBFF) {
					if (--inLen < 0) {
						return -1;
					}
					wchar ch2 = *++in;
					if (ch2 < 0xDC00 || ch2 > 0xDFFF) {
						// This is an invalid UTF-16 surrogate pair sequence
						// Encode the replacement character instead
						ch = 0xFFFD;
						goto Encode3;
					}
					
					uint32 ucs4 = ((ch-0xD800)<<10) + (ch2-0xDC00) + 0x10000;
					if ((outMax -= 4) < 0) {
						return -1;
					}
					*out++ = (uint8)(0xF0 | (ucs4>>18)&0x07);
					*out++ = (uint8)(0x80 | (ucs4>>12)&0x3F);
					*out++ = (uint8)(0x80 | (ucs4>>6)&0x3F);
					*out++ = (uint8)(0x80 | (ucs4&0x3F));
					outLen += 4;
				}
				else {
					if (ch >= 0xDC00 && ch <= 0xDFFF) {
						// This is an invalid UTF-16 surrogate pair sequence
						// Encode the replacement character instead
						ch = 0xFFFD;
					}
				  Encode3:
					if ((outMax -= 3) < 0) {
						return -1;
					}
					*out++ = (uint8)(0xE0 | (ch>>12)&0x0F);
					*out++ = (uint8)(0x80 | (ch>>6)&0x3F);
					*out++ = (uint8)(0x80 | (ch&0x3F));
					outLen += 3;
				}
				in++;
			}
		}
		else
		{

#if INTEL //optimized
			// Count output characters without actually encoding.
			int nChunks = inLen/8;
			__declspec(align(16)) unsigned short val_0x80[8] = {0x80,0x80,0x80,0x80,
				0x80,0x80,0x80,0x80,};
			int val;

			if (1) //need sse2 flag here to choose code path
			{
				for (int nc = 0; nc <nChunks; ++nc)
				{				
					const wchar *src = in;
					__asm
					{
							mov eax, src
							movdqu xmm0, [eax] ;a
							movdqa xmm1, [val_0x80] ;;b
							psubusb xmm1,xmm0 ;;b-a
							pxor xmm2,xmm2
							pcmpeqw xmm1,xmm2
							pmovmskb eax, xmm1
							mov val,eax ;;;val is zero => all a < b
					}

					if (val == 0)
					{
						inLen -= 8;
						outLen += 8*1;
						in += 8;
						continue;
					}

					for (int i=0; i<16; ++i)
					{
						wchar ch = *in;
						inLen--;
						if (ch < 0x80)
						{
							outLen++;
						}
						else if (ch < 0x800) 
						{
							outLen += 2;
						}
						else if (ch >= 0xD800 && ch <= 0xDBFF) 
						{
							if (--inLen < 0) 
							{
								return -1;
							}
							wchar ch2 = *++in;
							if (ch2 < 0xDC00 || ch2 > 0xDFFF) 
							{
								// Invalid...
								// We'll encode 0xFFFD for this
								outLen += 3;
							}
							else 
							{
								outLen += 4;
							}
						}
						else 
						{
							outLen += 3;
						}
						in++;
					}
				}
			}

			//left over remainder

			while (inLen > 0)
			{
				wchar ch = *in;
				inLen--;
				if (ch < 0x80)
				{
					outLen++;
				}
				else if (ch < 0x800) 
				{
					outLen += 2;
				}
				else if (ch >= 0xD800 && ch <= 0xDBFF) 
				{
					if (--inLen < 0) 
					{
						return -1;
					}
					wchar ch2 = *++in;
					if (ch2 < 0xDC00 || ch2 > 0xDFFF) 
					{
						// Invalid...
						// We'll encode 0xFFFD for this
						outLen += 3;
					}
					else 
					{
						outLen += 4;
					}
				}
				else 
				{
					outLen += 3;
				}
				in++;
			}
#else	//ORIGINAL CODE
			// Count output characters without actually encoding.
			while (inLen > 0)
			{
				wchar ch = *in;
				inLen--;
				if (ch < 0x80)
				{
					outLen++;
				}
				else if (ch < 0x800) 
				{
					outLen += 2;
				}
				else if (ch >= 0xD800 && ch <= 0xDBFF) 
				{
					if (--inLen < 0) 
					{
						return -1;
					}
					wchar ch2 = *++in;
					if (ch2 < 0xDC00 || ch2 > 0xDFFF) 
					{
						// Invalid...
						// We'll encode 0xFFFD for this
						outLen += 3;
					}
					else 
					{
						outLen += 4;
					}
				}
				else 
				{
					outLen += 3;
				}
				in++;
			}
#endif
		}

		return outLen;		
	}

Lars T Hansen

Updated

•

15 years ago

Target Milestone: --- → Future

Lars T Hansen

Updated

•

14 years ago

Whiteboard: PACMAN

Werner Sharp

Comment 1

•

14 years ago

FWIW, the string-unpack-code test case never hits into UTF16toUTF8 with the latest tamarin-redux because of the multi-format string support.  This might not be an important fix to make.

Steven Johnson

Comment 2

•

14 years ago

(In reply to comment #1)
> FWIW, the string-unpack-code test case never hits into UTF16toUTF8 with the

It can, just much less likely, as the source string would have to have at least one non-Latin1 character (which much of our acceptance suite doesn't bang on enough)

Andre Kruetzfeldt

Updated

•

13 years ago

Blocks: 645018

Dan Smith

Updated

•

13 years ago

Severity: normal → minor

Flags: flashplayer-qrb+

Sylvestre Ledru [:Sylvestre]

Comment 3

•

6 years ago

Tamarin is a dead project now. Mass WONTFIX.

Status: NEW → RESOLVED

Closed: 6 years ago

Resolution: --- → WONTFIX

Sylvestre Ledru [:Sylvestre]

Comment 4

•

6 years ago

Tamarin isn't maintained anymore. WONTFIX remaining bugs.

Bugzilla

Quick Search

SIMD optimization in UnicodeUtils::Utf16ToUtf8 improves Sunspider string-unpack performance by ~4.2%

Categories

(Tamarin Graveyard :: Virtual Machine, defect)

Tracking

(Not tracked)

People

(Reporter: karthikeyan.krishnan, Unassigned)

References

Details

(Whiteboard: PACMAN)

Crash Data

Security

(public)

User Story

Description

Updated

Updated

Comment 1

Comment 2

Updated

Updated

Comment 3

Comment 4