Closed Bug 701637 Opened 13 years ago Closed 13 years ago

DOM related GC | Cycle Collector Crash triggered by harfbuzz buffer overrun

Categories

(Core :: Layout: Text and Fonts, defect)

x86
Windows XP
defect
Not set
critical

Tracking

()

VERIFIED FIXED
mozilla9
Tracking Status
firefox8 - wontfix
firefox9 + fixed
firefox10 + fixed
firefox11 + fixed
status1.9.2 --- unaffected

People

(Reporter: bc, Assigned: jfkthame)

References

()

Details

(Keywords: crash, verified-aurora, verified-beta, Whiteboard: [sg:critical][qa!])

Crash Data

Attachments

(1 file, 3 obsolete files)

1. http://www.haveeru.com.mv/dhivehi/south_asia/112652 Windows XP only I think. May need to reload several times. 2. Crash [@ nsAttrValue::EnsureEmptyMiscContainer() ] Beta bp-dac1f532-e438-450c-b5d3-5e3c12111110 Aurora bp-67ee699e-b484-4106-bb16-0f7c72111111 Nightly bp-3267b454-6e6c-4a49-9da5-d4bfe2111111 Other urls: http://www.haveeru.com.mv/dhivehi http://www.haveeru.com.mv/dhivehi/atolls/112618 See also bug 681118 which is mobile ARM/Android only. I originally saw this in the crash automation where debug Beta builds on Windows XP were firing crashdumps with _hb_ot_layout_get_glyph_property _hb_ot_layout_skip_mark PairPosFormat2::apply(hb_apply_context_t*) PairPos::apply(hb_apply_context_t*) PosLookupSubTable::apply(hb_apply_context_t*, unsigned int) stacks, but still exiting normally though Ted's exploitable tool rated these as low. Ted, I recently updated all my minidump_stackwalk's and your exploitable tool to tip. Any idea why it would generate dumps for the browser process while the browser would still exit normally? Making sensitive since Ted's flagged it as low and it is GC/CC related.
These look like two different stacks. I'm not sure why we'd exit with a normal status and still write a dump, that doesn't ring a bell.
Please capture the testcases in the bug in case they go away or change.
Keywords: testcase-wanted
Kyle, can you dig in here?
Assignee: nobody → khuey
Whiteboard: [sg:critical]
I haven't been able to get a saved test case that crashes, but if you go to the site and start opening links into new tabs you will crash in pretty short order with a variety of signatures. bp-88b4b9ca-05a9-42e9-bf12-ceb882111117 [@ nsHttpChannel::GetLoadGroup(nsILoadGroup**) ] bp-5617f544-f9f9-420e-a14a-da88c2111117 [@ imgRequestProxy::~imgRequestProxy() ] bp-32b45b5d-4b7a-431c-bf46-076c72111117 [@ msvcr80.dll@0x173f2 ] bp-8fd9ee75-ecf9-4931-a790-57a502111117 [@ gfxTextRun::`vector deleting destructor''(unsigned int) ]
Sure. I was able to capture the crash in the record and replay setup and am investigating.
This appears to be an off by one error in HarfBuzz. At http://hg.mozilla.org/mozilla-central/annotate/b62e6ee5ba9b/gfx/harfbuzz/src/hb-ot-layout-gpos-private.hh#l621 we have the following: 624 unsigned int end = MIN (c->buffer->len, c->buffer->i + c->context_length); <snip> 632 unsigned int j = c->buffer->i + 1; 633 while (_hb_ot_layout_skip_mark (c->layout->face, &c->buffer->info[j], c->lookup_props, NULL)) 634 { 635 if (unlikely (j == end)) 636 return false; 637 j++; 638 } The problem is that c->buffer->len starts counting at 1, while array indices start counting at 0. It's thus possible to run off the end of the array and into somebody else's memory. _hb_ot_layout_skip_mark calls _hb_ot_layout_get_glyph_property which writes beyond the end of the loop. This causes the memory corruption which later brings down the browser.
Assignee: khuey → nobody
Component: DOM: Core & HTML → Layout: Text
QA Contact: general → layout.fonts-and-text
I have this in a replay machine if that's needed to proceed. Otherwise, somebody who knows HarfBuzz should finish this off.
It looks to me like the loop-exit test "j == end" needs to happen after j has been incremented in the loop, not before. This idiom occurs several times in this file, and each occurrence requires the same fix. Behdad, I realize this file has been renamed in the current HB trunk, but it looks like the same error is still present - please check and verify whether the fix is correct, and apply upstream as necessary. Thanks!
Attachment #575421 - Flags: review?(mozilla)
Summary: DOM related GC | Cycle Collector Crash → DOM related GC | Cycle Collector Crash triggered by harfbuzz buffer overrun
FWIW, that 'end' still points past the end of the array feels pretty nasty to me, even though the patch should fix the bug.
Lemme check. Kyle, end pointing to to after the array is a very common idiom, and welldefined by the C spec.
Right, it's equivalent to using the begin() and end() iterators on a container, for example; "end" points immediately after the last item. No problem with that; but the "end" iterator (or index) must not be dereferenced. Behdad, please confirm the fix here is OK so that we can land it for the code as present in our tree, as well as fixing it upstream in harfbuzz for future versions.
(09:39:12 AM) behdad: jfkthame: so (09:39:27 AM) behdad: jfkthame: would it make more sense to rewrite the loops as do{}while? (09:39:48 AM) behdad: we're assuming that at the beginning of the loop j can't be equal to end (09:40:02 AM) behdad: and that's correct only because the if at the beginning of the functions takes care of it (09:40:15 AM) behdad: that's too fragile / unclear (09:42:07 AM) behdad: jfkthame: might even be worth putting that loop into a function
Here's a more extensive patch - I've tried to check each of the loops involving to _hb_ot_layout_skip_mark and adjusted exit criteria to avoid potential out-of-bounds access. In a number of cases I've changed while() loops into do...while(), where this seemed to fit well with the control flow and conditions needed. This seems to work OK with a few fonts I've tried, but has not been extensively tested yet.
Attachment #576197 - Flags: review?(mozilla)
Comment on attachment 576197 [details] [diff] [review] patch v2 - rewrite all the loops involving _hb_ot_layout_skip_mark Review of attachment 576197 [details] [diff] [review]: ----------------------------------------------------------------- Generally looks good. I'll do them all as do{}while() upstream, but what you have is fine. Except for one chunk which looks like optimization, but is wrong. ::: gfx/harfbuzz/src/hb-ot-layout-gpos-private.hh @@ +622,5 @@ > { > TRACE_APPLY (); > + unsigned int j = c->buffer->i; > + unsigned int end = MIN (c->buffer->len, j + c->context_length); > + if (unlikely (j >= end)) This is harmless, but not equivalent to what was there before. The old code was accounting for the fact that we need at least two glyphs. But, this version is simpler. I like. ::: gfx/harfbuzz/src/hb-ot-layout-gsub-private.hh @@ +347,3 @@ > unsigned int count = component.len; > + unsigned int end = MIN (c->buffer->len, j + c->context_length); > + if (unlikely (j + count >= end)) This however, keeps the count in the new version. I suggest we drop it and do a simple j >= end here too. I'm thinking about adding a struct to replace this logic. @@ +351,5 @@ > > bool first_was_mark = (c->property & HB_OT_LAYOUT_GLYPH_CLASS_MARK); > bool found_non_mark = false; > > + for (i = 1; i < count; i++) Humm, took me a while to confirm that the new loop correctly skips over the matched glyphs. But it does, and I like how it reads. Nice. @@ +361,1 @@ > if (unlikely (j + count - i == end)) Same here. We can just do j == end. @@ +398,3 @@ > { > + c->buffer->info[k].lig_comp() = i; > + c->buffer->info[k].lig_id() = lig_id; This is wrong. 'k' is static, whereas in the previous code, buffer->i was advancing. Suggest backout the k thing. That's irrelevant to the issue at hand, and wrong. ::: gfx/harfbuzz/src/hb-ot-layout-gsubgpos-private.hh @@ +140,1 @@ > return false; The LigatureSubst logic is exactly the same as this one and your do{}while version makes much more sense IMO. In fact, maybe we should make LigatureSubst call this function. @@ +164,2 @@ > { > + if (unlikely (j < count - i)) Like my other comments, we can simplify this and just id "if (!j)". What you have is fine though. @@ +191,5 @@ > { > while (_hb_ot_layout_skip_mark (c->layout->face, &c->buffer->info[j], c->lookup_props, NULL)) > { > + j++; > + if (unlikely (j + count - i > end)) Same here. I like the do{}while() version more.
Attachment #576197 - Flags: review?(mozilla) → review-
Updated to take account of your comments above - thanks.
Attachment #576197 - Attachment is obsolete: true
Attachment #576223 - Flags: review?(mozilla)
Comment on attachment 576223 [details] [diff] [review] patch v2.1 - updated for review comments Review of attachment 576223 [details] [diff] [review]: ----------------------------------------------------------------- LGTM, except for one minor issue. ::: gfx/harfbuzz/src/hb-ot-layout-gsubgpos-private.hh @@ +133,3 @@ > return false; > > + j = j + 1; Do we need this? Given that j is incremented in the do{}while loop?
Yes, because in the case where count==1, the for-loop body (and hence the do-while loop) is not executed at all, and we'd then end up returning context_length_out as zero, which is wrong.
(In reply to Jonathan Kew (:jfkthame) from comment #17) > Yes, because in the case where count==1, the for-loop body (and hence the > do-while loop) is not executed at all, and we'd then end up returning > context_length_out as zero, which is wrong. I'm not convinced. You have a point, but I think the correct fix is to return j - buffer->i + 1 instead. Two j++'s in a row, without any check on the glyph being skipped is wrong.
(In reply to Behdad Esfahbod from comment #18) > (In reply to Jonathan Kew (:jfkthame) from comment #17) > > Yes, because in the case where count==1, the for-loop body (and hence the > > do-while loop) is not executed at all, and we'd then end up returning > > context_length_out as zero, which is wrong. > > I'm not convinced. You have a point, but I think the correct fix is to > return j - buffer->i + 1 instead. You're right. I'd almost finished a detailed reply explaining why it _was_ correct as I wrote it, but I think that was just my jet-lag talking. Must be time to make coffee....
Attachment #575421 - Attachment is obsolete: true
Attachment #576223 - Attachment is obsolete: true
Attachment #575421 - Flags: review?(mozilla)
Attachment #576223 - Flags: review?(mozilla)
Attachment #576429 - Flags: review?(mozilla)
Comment on attachment 576429 [details] [diff] [review] patch v2.2 - fixed indexing in match_input Review of attachment 576429 [details] [diff] [review]: ----------------------------------------------------------------- LGTM.
Attachment #576429 - Flags: review?(mozilla) → review+
Please ping me when it's ok to push this upstream.
Status: NEW → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
Comment on attachment 576429 [details] [diff] [review] patch v2.2 - fixed indexing in match_input Nominating for aurora and beta as this is a potential vulnerability in our currently-shipping code. I'm not sure how feasible it would be for someone to exploit this for arbitrary code execution rather than "just" a crash/DoS (perhaps using a specially-crafted font, along with specific text strings that hit the erroneous code paths), but anything that can manifest as heap corruption seems dangerous.
Attachment #576429 - Flags: approval-mozilla-beta?
Attachment #576429 - Flags: approval-mozilla-aurora?
Assignee: nobody → jfkthame
Target Milestone: --- → mozilla11
Is the failure mode here understood well enough to write an automated test case for it?
Comment on attachment 576429 [details] [diff] [review] patch v2.2 - fixed indexing in match_input [triage comment] This looks...scary. How well tested is this area of code? We'll approve it but we definitely have reservations.
Attachment #576429 - Flags: approval-mozilla-beta?
Attachment #576429 - Flags: approval-mozilla-beta+
Attachment #576429 - Flags: approval-mozilla-aurora?
Attachment #576429 - Flags: approval-mozilla-aurora+
Whiteboard: [sg:critical] → [sg:critical][qa+]
(In reply to Christian Legnitto [:LegNeato] from comment #25) > [triage comment] > This looks...scary. How well tested is this area of code? We'll approve it > but we definitely have reservations. Hmm, do you mean the crash looks scary, or the patch looks scary? Or both? :) IMO, although the crash is bad, it would probably be pretty challenging to exploit in any predictable way - making use of the heap corruption caused by overshooting the glyph buffer by one position to achieve a controlled, malicious result seems like it'd be really hard. As for the patch, I don't think it's quite as scary as it looks, really. The specific error that caused this crash was pinpointed and clearly understood (thanks, Kyle - comment 6), and could have been fixed with a very small and obviously correct patch (comment 8). However, on examination of similar code fragments we found that the equivalent problem - and hence the potential for corruption - was present in multiple places, and so we're fixing them all pre-emptively; and in discussion with Behdad, we decided to rewrite the loops in a form that makes the indexing and loop-termination logic easier to follow, as it was a bit confusing as originally written. All this makes for a larger and more complex-looking patch, but a better result. I strongly suspect bug 707095 (newly filed from crash-stats) is a manifestation of this bug, and so those reports should die away once the fix here gets out to users.
I agree with Jonathan. We're both much more confident in the code being added here than just trying to fix the old code with minimal changes.
Mozilla/5.0 (Windows NT 6.1; WOW64; rv:11.0a1) Gecko/20111203 Firefox/11.0a1 Mozilla/5.0 (Windows NT 6.1; WOW64; rv:10.0a2) Gecko/20111203 Firefox/10.0a2 Mozilla/5.0 (Windows NT 6.1; WOW64; rv:9.0) Gecko/20100101 Firefox/9.0 Loaded websites listed in the Desciption many times. Crash was easily reproducible with FF8. Validated on Win7 across all channels.
Status: RESOLVED → VERIFIED
Whiteboard: [sg:critical][qa+] → [sg:critical][qa!]
Harfbuzz didn't exist on the 1.9.2 branch so this bug should not affect it.
Depends on: 714067
Group: core-security
This has been upstreamed already btw.
Flags: in-testsuite? → in-testsuite-
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: