Closed Bug 425097 Opened 16 years ago Closed 6 years ago

SIMD optimization in UnicodeUtils::Utf16ToUtf8 improves Sunspider string-unpack performance by ~4.2%

Categories

(Tamarin Graveyard :: Virtual Machine, defect)

x86
Windows XP
defect
Not set
minor

Tracking

(Not tracked)

RESOLVED WONTFIX
Future

People

(Reporter: karthikeyan.krishnan, Unassigned)

References

Details

(Whiteboard: PACMAN)

User-Agent:       Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322; .NET CLR 2.0.50727; InfoPath.1; MS-RTC LM 8)
Build Identifier: 

I looked into UTf16toUTF8  function and managed to get some optimizations. Basically, there are two main branches in the code and the second one that does "count output characters w.o actually encoding" could be short-circuited by doing a special case when most of the character values are less than 0x80. The branch computation ch<0x80 is the hottest (i.e this branch is taken most of the times) because the rest of the branches seems to be dealing with corner cases (such as checking for Invalid strings etc). 
 
In my SIMD implementation, I did 16-byte branch evaluation and using the mask to figure out if all of the 16-bytes are less than 0x80. This helps the performance and improves the string-unpack-code.abc workflow by ~4.2%. The Utf16ToUtf8 function with changes is included in the "Addition Information" tab as I couldn't figure out how to attach files. The SSE2 code path needs to be branched with a sse2 flag accordingly, but there should be a flag already existing in the Tamarin source that could be used. 

Reproducible: Always

Steps to Reproduce:
1.
2.
3.




	int UnicodeUtils::Utf16ToUtf8(const wchar *in,
								  int inLen,
								  uint8 *out,
								  int outMax)
	{
		int outLen = 0;
		if (out)
		{
			// Output buffer passed in; actually encode data.
			while (inLen > 0)
			{
				wchar ch = *in;
				inLen--;
				if (ch < 0x80) {
					if (--outMax < 0) {
						return -1;
					}
					*out++ = (uint8)ch;
					outLen++;
				}
				else if (ch < 0x800) {
					if ((outMax -= 2) < 0) {
						return -1;
					}
					*out++ = (uint8)(0xC0 | (ch>>6)&0x1F);
					*out++ = (uint8)(0x80 | (ch&0x3F));
					outLen += 2;
				}
				else if (ch >= 0xD800 && ch <= 0xDBFF) {
					if (--inLen < 0) {
						return -1;
					}
					wchar ch2 = *++in;
					if (ch2 < 0xDC00 || ch2 > 0xDFFF) {
						// This is an invalid UTF-16 surrogate pair sequence
						// Encode the replacement character instead
						ch = 0xFFFD;
						goto Encode3;
					}
					
					uint32 ucs4 = ((ch-0xD800)<<10) + (ch2-0xDC00) + 0x10000;
					if ((outMax -= 4) < 0) {
						return -1;
					}
					*out++ = (uint8)(0xF0 | (ucs4>>18)&0x07);
					*out++ = (uint8)(0x80 | (ucs4>>12)&0x3F);
					*out++ = (uint8)(0x80 | (ucs4>>6)&0x3F);
					*out++ = (uint8)(0x80 | (ucs4&0x3F));
					outLen += 4;
				}
				else {
					if (ch >= 0xDC00 && ch <= 0xDFFF) {
						// This is an invalid UTF-16 surrogate pair sequence
						// Encode the replacement character instead
						ch = 0xFFFD;
					}
				  Encode3:
					if ((outMax -= 3) < 0) {
						return -1;
					}
					*out++ = (uint8)(0xE0 | (ch>>12)&0x0F);
					*out++ = (uint8)(0x80 | (ch>>6)&0x3F);
					*out++ = (uint8)(0x80 | (ch&0x3F));
					outLen += 3;
				}
				in++;
			}
		}
		else
		{

#if INTEL //optimized
			// Count output characters without actually encoding.
			int nChunks = inLen/8;
			__declspec(align(16)) unsigned short val_0x80[8] = {0x80,0x80,0x80,0x80,
				0x80,0x80,0x80,0x80,};
			int val;

			if (1) //need sse2 flag here to choose code path
			{
				for (int nc = 0; nc <nChunks; ++nc)
				{				
					const wchar *src = in;
					__asm
					{
							mov eax, src
							movdqu xmm0, [eax] ;a
							movdqa xmm1, [val_0x80] ;;b
							psubusb xmm1,xmm0 ;;b-a
							pxor xmm2,xmm2
							pcmpeqw xmm1,xmm2
							pmovmskb eax, xmm1
							mov val,eax ;;;val is zero => all a < b
					}

					if (val == 0)
					{
						inLen -= 8;
						outLen += 8*1;
						in += 8;
						continue;
					}

					for (int i=0; i<16; ++i)
					{
						wchar ch = *in;
						inLen--;
						if (ch < 0x80)
						{
							outLen++;
						}
						else if (ch < 0x800) 
						{
							outLen += 2;
						}
						else if (ch >= 0xD800 && ch <= 0xDBFF) 
						{
							if (--inLen < 0) 
							{
								return -1;
							}
							wchar ch2 = *++in;
							if (ch2 < 0xDC00 || ch2 > 0xDFFF) 
							{
								// Invalid...
								// We'll encode 0xFFFD for this
								outLen += 3;
							}
							else 
							{
								outLen += 4;
							}
						}
						else 
						{
							outLen += 3;
						}
						in++;
					}
				}
			}

			//left over remainder

			while (inLen > 0)
			{
				wchar ch = *in;
				inLen--;
				if (ch < 0x80)
				{
					outLen++;
				}
				else if (ch < 0x800) 
				{
					outLen += 2;
				}
				else if (ch >= 0xD800 && ch <= 0xDBFF) 
				{
					if (--inLen < 0) 
					{
						return -1;
					}
					wchar ch2 = *++in;
					if (ch2 < 0xDC00 || ch2 > 0xDFFF) 
					{
						// Invalid...
						// We'll encode 0xFFFD for this
						outLen += 3;
					}
					else 
					{
						outLen += 4;
					}
				}
				else 
				{
					outLen += 3;
				}
				in++;
			}
#else	//ORIGINAL CODE
			// Count output characters without actually encoding.
			while (inLen > 0)
			{
				wchar ch = *in;
				inLen--;
				if (ch < 0x80)
				{
					outLen++;
				}
				else if (ch < 0x800) 
				{
					outLen += 2;
				}
				else if (ch >= 0xD800 && ch <= 0xDBFF) 
				{
					if (--inLen < 0) 
					{
						return -1;
					}
					wchar ch2 = *++in;
					if (ch2 < 0xDC00 || ch2 > 0xDFFF) 
					{
						// Invalid...
						// We'll encode 0xFFFD for this
						outLen += 3;
					}
					else 
					{
						outLen += 4;
					}
				}
				else 
				{
					outLen += 3;
				}
				in++;
			}
#endif
		}

		return outLen;		
	}
Target Milestone: --- → Future
Whiteboard: PACMAN
FWIW, the string-unpack-code test case never hits into UTF16toUTF8 with the latest tamarin-redux because of the multi-format string support.  This might not be an important fix to make.
(In reply to comment #1)
> FWIW, the string-unpack-code test case never hits into UTF16toUTF8 with the

It can, just much less likely, as the source string would have to have at least one non-Latin1 character (which much of our acceptance suite doesn't bang on enough)
Blocks: 645018
Severity: normal → minor
Flags: flashplayer-qrb+
Tamarin is a dead project now. Mass WONTFIX.
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → WONTFIX
Tamarin isn't maintained anymore. WONTFIX remaining bugs.
You need to log in before you can comment on or make changes to this bug.