Closed Bug 366643 Opened 18 years ago Closed 18 years ago

Crash after loading page with html entities and <img> tag

Categories

(Core :: Graphics, defect)

x86
Windows XP
defect
Not set
critical

Tracking

()

VERIFIED FIXED
mozilla1.9alpha4

People

(Reporter: wildmyron, Assigned: philip)

References

Details

(Keywords: crash, regression, testcase)

Attachments

(4 files, 3 obsolete files)

User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9a2pre) Gecko/20070110 Minefield/3.0a2pre Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9a2pre) Gecko/20070110 Minefield/3.0a2pre ID:2007011004 [cairo] Trunk builds of Firefox crash after loading attached html file. The crash often occurs when selecting the content of the page. The crash seems to occur in different places with the current nightly, for example: TB28240944E 0xf6c0e908 ... XPCJSStackFrame::GetCaller TB28240987Z nsThebesFontMetrics::GetMetrics but in mcsmurfs debug build from 20061217 crash always happens while loading the page (nothing displayed) with this error: HEAP[firefox.exe]: Heap block at 02AE80B8 modified at 02AE80E8 past requested size of 28 Partial stack: ntdll!DbgBreakPoint ntdll!RtlpBreakPointHeap+0x24 ntdll!RtlpCheckBusyBlockTail+0x6f ntdll!RtlpValidateHeapEntry+0xa2 ntdll!RtlDebugReAllocateHeap+0xbf ntdll!RtlReAllocateHeap+0x1b7 MSVCR80!realloc+0x35f [f:\rtm\vctools\crt_bld\self_x86\crt\src\realloc.c @ 323] firefox!Uniscribe::Itemize+0x44 [h:\mozilla\tree-main\mozilla\gfx\thebes\src\gfxwindowsfonts.cpp @ 1469] firefox!gfxWindowsTextRun::MeasureOrDrawUniscribe+0xac [h:\mozilla\tree-main\mozilla\gfx\thebes\src\gfxwindowsfonts.cpp @ 1543] firefox!gfxWindowsTextRun::Draw+0x47 [h:\mozilla\tree-main\mozilla\gfx\thebes\src\gfxwindowsfonts.cpp @ 507] firefox!gfxContext::DrawTextRun+0x22 [h:\mozilla\tree-main\mozilla\gfx\thebes\src\gfxcontext.cpp @ 639] firefox!nsThebesFontMetrics::DrawString+0x13c [h:\mozilla\tree-main\mozilla\gfx\src\thebes\nsthebesfontmetrics.cpp @ 441] firefox!nsThebesRenderingContext::DrawStringInternal+0x24 [h:\mozilla\tree-main\mozilla\gfx\src\thebes\nsthebesrenderingcontext.cpp @ 1280] Reproducible: Sometimes Steps to Reproduce: 1. Load attached testcase 2. Select the content on the page 3. Repeat step 2. Actual Results: Crash Expected Results: Not Crash Does not crash the 1.8.1 branch. I haven't found a regression range yet. The testcase was reduced from this link: http://forums.mozillazine.org/viewtopic.php?t=507089 found by trolly
Attached file testcase (obsolete) —
Almost minimal testcase
cant reproduce this Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.9a2pre) Gecko/20070110 Minefield/3.0a2pre ID:2007011018 [cairo] but marking it NEW since others can (on moz forum) Win XP only?
Status: UNCONFIRMED → NEW
Ever confirmed: true
fullmetaljacket.xp@gmail.com: do not mark a bug as new without triaging it into an appropriate component. I don't care what gerv says, your account is for the benefit of active bugzilla users, and confirming bugs that gerv will kill 3 years from now because they got lost is *not* for the benefit of active bugzilla users.
Component: General → GFX: Thebes
QA Contact: general → thebes
To comment 3: crashes on W2K for me. Two additional TBs: nsAttrAndChildArray::Compact [mozilla\content\base\src\nsattrandchildarray.cpp, line 625] - TB28251790K After playing with the test case 0x05d03686 imgRequest::FrameChanged [mozilla\modules\libpr0n\src\imgrequest.cpp, line 408] - TB28251765Q After restoring the session Seems to be some memory corruption from my point of view. BTW: I'm trolly on mozillazine. ;-)
Attached file Minimal testcase (obsolete) —
Removing any of the 7 html entities or changing the order stops the crash from happening. Regression window: this crash is present in all cairo enabled builds I tested starting with 2006-02-24-15. Does not crash in 2006-02-22-09. The intermediate builds don't launch on this machine. This is on Windows XP SP1. Doesn't crash on my other machine which is Windows XP SP2.
Attachment #251144 - Attachment is obsolete: true
I can't reproduce this, but I have windowsXP SP2. The regression window was when cairo was enabled on windows.
Flags: blocking1.9?
Sounds like a potential uniscribe bug on pre-SP2 uniscribe... Stuart, any ideas?
Flags: blocking1.9? → blocking1.9+
dunno. It would probably jump out at someone running it through purify on a a pre-sp2 box, but i don't have one handy.
I'm going to have a go at this, but I will need help interpreting results, and making "it jump out at me". I've installed a trial version of Purify and I managed to run an older debug build through it. I've downloaded mcsmurfs 20070228 debug build and I'm playing with it now. Any suggestions much appreciated.
Arie, try to catch someone at #developers on irc.
Unfortunately that didn't work out. I'll attach the Purify logs I did get in case they are useful to anyone. All logs are from a clean install of the build I mentioned in comment 9 with a fresh profile, home page set to about:blank. There are three logs: 1) reference - open Firefox, close Firefox 2) testcase - open Firefox, open testcase, close Firefox 3) testcase-select - open Firefox, open testcase, select page content, crash I also included text files with a copy of the actual leaks recorded, excluding potential leaks.
Looks like the &#8205; character (U+200D zero width joiner) is the problem in this test case - bug 376188 has the same issue with &#8203; (U+200B zero width space), and the "zero width" bit looks like a pattern. If I paste the string x.‍x.‍x (which has some U+200Ds in it) into the address bar a few times, it crashes. The same happens with x.​x.​x (using U+200B, as in 376188). The same also happens with x.x.x (using U+FEFF zero width no-break space). I haven't seen problems with any non-zero-width Unicode characters.
The root cause seems to be a bug in Win32's ScriptItemize - in some cases it writes past the end of the buffer that it's given. I think I know what it's doing, but I'll just check exactly what triggers that behaviour. By the way, would I be wrong in thinking Uniscribe::Itemize leaks mItems?
(In reply to comment #14) > By the way, would I be wrong in thinking Uniscribe::Itemize leaks mItems? > mm, nope. somehow that got lost. can you file a bug?
(Memory leak filed as bug 376265.) For certain input strings and certain buffer sizes, ScriptItemize writes cMaxItems+1 items into pItems. When the buffer is too small (so ScriptItemize is going to return E_OUTOFMEMORY), it writes past the end of the buffer and breaks the heap. With strings of the form "x.<zero width>x.<zero width>x", that happens to occur when the buffer size is either 5 or 2. The Mozilla code defaults to 5, so it breaks with that string. In my attempts at testing, I've never seen it write more than one item too many - so I believe the fix would be to replace the malloc(maxItems * sizeof(SCRIPT_ITEM)); in gfx/thebes/src/gfxWindowsFonts.cpp Uniscribe::Itemize, with malloc((maxItems + 1) * sizeof(SCRIPT_ITEM)); and the same for the realloc a few lines down. MSDN online says of pItems: "The buffer should be cMaxItems*sizeof(SCRIPT_ITEM) + 1 bytes in length." MSDN offline (from VS2005) said: "The buffer pointed to by pItems should be cMaxItems * sizeof(SCRIPT_ITEM) bytes in length." I don't know if that change in the documentation was to account for that pre-SP2 bug, but if so then it's still wrong because the function writes to sizeof(SCRIPT_ITEM) bytes after the buffer.
On Windows 2000, this outputs: maxItems = 2; numItems = 0; after buffer: 02000000cccc4a00cccccccccccccccc maxItems = 3; numItems = 0; after buffer: cccccccccccccccccccccccccccccccc maxItems = 4; numItems = 0; after buffer: cccccccccccccccccccccccccccccccc maxItems = 5; numItems = 0; after buffer: 05000000cccc4a00cccccccccccccccc maxItems = 6; numItems = 0; after buffer: cccccccccccccccccccccccccccccccc maxItems = 7; numItems = 0; after buffer: cccccccccccccccccccccccccccccccc maxItems = 8; numItems = 7; after buffer: cccccccccccccccccccccccccccccccc ... On Windows Server 2003, it outputs: maxItems = 2; numItems = 0; after buffer: cccccccccccccccccccccccccccccccc maxItems = 3; numItems = 0; after buffer: cccccccccccccccccccccccccccccccc maxItems = 4; numItems = 0; after buffer: cccccccccccccccccccccccccccccccc maxItems = 5; numItems = 0; after buffer: cccccccccccccccccccccccccccccccc maxItems = 6; numItems = 0; after buffer: cccccccccccccccccccccccccccccccc maxItems = 7; numItems = 0; after buffer: cccccccccccccccccccccccccccccccc maxItems = 8; numItems = 5; after buffer: cccccccccccccccccccccccccccccccc ... I guess XP SP1 is the same as 2000, and XP SP2 is the same as 2003, though I can't test them. I don't know if it matters that they give different answers once the buffer is big enough.
Not sure if this is still useful but somehow I neglected to actually attach it. Phillip: Glad you're working on this. Would it be useful to you to know the output of your test program on XP SP1 and/or SP2? I can't compile it here at uni (the SP1 machine) but I can compile it at home on my SP2 box and run it here.
Attached patch patch (obsolete) — Splinter Review
Before applying the patch, when running a debug build of Firefox trunk inside a debugger, on Windows 2000, the test case from this bug says: Heap block at 0470A0A0 modified at 0470A0F6 past requested size of 4c and the debugger shows ScriptItemize is writing 8 bytes past the end of mItems. After applying the patch, it runs correctly. The test cases in bug 376188 all show the same behaviour. So I think this works as a fix.
Arie: I ran the test program on XP SP2 and it gave the same answer as Server 2003. It's probably not worth trying it on XP SP1, unless it's trivial to do so - it'll almost certainly be the same as Win2K, given that comment #1 showed it was suffering exactly the same kind of buffer overflow after calling ScriptItemize.
Thanks for tracking this down. You'll want to add a comment explaining why you're adding 1 to maxItems, probably mention this bug number. Then request review from pavlov@pavlov.net.
Attached patch updated patchSplinter Review
Attachment #260474 - Attachment is obsolete: true
Attachment #260578 - Flags: review?(pavlov)
Comment on attachment 260578 [details] [diff] [review] updated patch looks good. thanks for tracking this down.
Attachment #260578 - Flags: review?(pavlov) → review+
a testcase for this would be great
Flags: in-testsuite?
I experimented with Phillip's ideas from comment 13 and found that this testcase reliably crashes on-load with current trunk. The body simply contains the string "x.&amp;#8205;x.&amp;#8205;x". I don't understand why I don't crash when reading this bug but I did crash when selecting the text in comment 13. Phillip: does this reliably crash for you?
Attachment #254921 - Attachment is obsolete: true
Not totally reliably, with the normal release builds - if I load a random page before loading the test case, then it tends to not crash. If it's the first page I load, it usually (but not always) crashes immediately. I think that's unavoidable, because it's relying on memory corruption and there's no reliable way to test for that. But it should be fine if the test is run in a debug build, since it is reliably corrupting memory (just not reliably crashing afterwards) and the first call to realloc will complain (as in comment 1).
Whiteboard: [needs checkin]
Whiteboard: [needs checkin] → [checkin needed]
Assignee: nobody → philip
Target Milestone: --- → mozilla1.9alpha4
Checked in, sorry about the delay. mozilla/gfx/thebes/src/gfxWindowsFonts.cpp 1.110
Status: NEW → RESOLVED
Closed: 18 years ago
Resolution: --- → FIXED
Whiteboard: [checkin needed]
Verified fixed in Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9a4pre) Gecko/20070426 Minefield/3.0a4pre ID:2007042604 [cairo] Tested with both testcases, comments in this bug report and original forum link. Thanks everyone.
Status: RESOLVED → VERIFIED
Flags: in-testsuite? → in-testsuite+
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: