Closed Bug 548495 Opened 14 years ago Closed 14 years ago

Eliminate sources of O(N^2) backtracking in the linebreaker.

Categories

(Core :: Layout: Text and Fonts, defect)

x86_64
Windows 7
defect
Not set
critical

Tracking

()

RESOLVED FIXED
Tracking Status
status1.9.2 --- .2-fixed
status1.9.1 --- .9-fixed

People

(Reporter: kurt, Assigned: bzbarsky)

References

()

Details

(4 keywords)

Attachments

(5 files)

User-Agent:       Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.2) Gecko/20100115 Firefox/3.6
Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.2) Gecko/20100115 Firefox/3.6

This was a supposed webkit dos, the file is about a half meg, no actual < or > characters, crashes Firefox 3.6 and 3.5.7 on Windows 7/XP 64 and 32 bit reliably. Turning off CSS causes it to crash slightly slower.

Reproducible: Always

Steps to Reproduce:
1. Open that file, from web server or local file, rename to .txt/,html still crashes Firefox. 
2. 
3.
Actual Results:  
Firefox crashes.

Expected Results:  
Firefox displays file.
Keywords: crash
Version: unspecified → 3.6 Branch
Well, no crash for me with Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.3a2pre) Gecko/20100224 Minefield/3.7a2pre ID:20100224060607. I do get a hang, and I have to kill the process though. Please give a stacktrace. https://developer.mozilla.org/En/How_to_get_a_stacktrace_for_a_bug_report
Severity: major → critical
Apologies, I didn't realize using the term crash was incorrect within this context. I should have said it hangs (in my case I've let it sit for a while a few times but it never comes back, I have to kill it/etc.).
I can confirm the hang with Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.9.3a2pre) Gecko/20100224 Minefield/3.7a2pre

The hang is present with Firefox 3.0, 3.5. 3.6 and latest trunk, but not with Firefox 2.0. 
With Firefox 2 it breaks the lines and loads instantly.
Component: General → Layout
Keywords: crashhang
Product: Firefox → Core
QA Contact: general → layout
Summary: this file is not HTML but it crashes firefox badly (3.5.x also). → this file is not HTML but it hangs firefox badly (3.5.x also).
Version: 3.6 Branch → Trunk
Regression range is between 2007091804 and 2007091904. http://bonsai.mozilla.org/cvsquery.cgi?treeid=default&module=all&branch=HEAD&branchtype=match&dir=&file=&filetype=match&who=&whotype=match&sortby=Date&hours=2&date=explicit&mindate=2007-09-18+04%3A00&maxdate=2007-09-19+04%3A00&cvsroot=%2Fcvsroot

Masayuki's line-breaking patch looks a possible candidate
Status: UNCONFIRMED → NEW
Component: Layout → Layout: Text
Ever confirmed: true
QA Contact: layout → layout.fonts-and-text
So just to make sure I'm reproducing this correctly....

1) I use wget to download http://www.securityfocus.com/data/vulnerabilities/exploits/38398.html to ~/test.html

2) I open ~/test.html in my web browser.

These steps work for me.  No hang.  Firefox 3.6 build on Mac.

This is not the same as the webkit steps to reproduce, which involve copying and pasting the text one sees if one loads that HTML file (which is mostly escapes) and then parsing the resulting file as HTML.  That also works for me.

So which steps are people using, and is this possibly Windows-only?
blocking2.0: --- → ?
Keywords: qawanted
I don't see the hang either with bz's STR, only when loading directly from securityfocus.com (which serves the file as text/plain), or if I download the file to a .txt file
Aha!  That was the missing bit, ok.  Sounds like they're over-escaping (serving as text/plain _and_ html-escaping), but whatever.

I can reproduce when loading the file as text-plain.  Profile shows 100% of the time spent in ContextualAnalysis(unsigned short, unsigned short, unsigned short, ContextState&).
14.3%	14.3%	0x11103438		dec      dword ptr [ebp-80]		Loop start[8]	
And the specific part of that function that takes up the time (sorry about the assembly; Shark's refusing to show me the source):

14.3% dec  dword ptr [ebp-80] Loop start[8]	
14.7% cmp  byte ptr [eax-2], byte 61			
0.4%  jz   _ZL18ContextualAnalysistttR12ContextState+753>			
0.0%  sub  eax, byte 2			
0.0%  mov  edx, dword ptr [ebp-80]			
56.2% test edx, edx			
14.3% jnz  <_ZL18ContextualAnalysistttR12ContextState+2008> Loop end[8]	

That jz is jumping to somewhere way before this code.
Aha!  The relevant loop is inlined from HasCharacterAlready, with '=' as aCh (see the cmp instruction):

628   PRBool HasCharacterAlready(PRUnichar aCh) {
629     // Be careful for the index being unsigned.
630     for (PRUint32 i = mIndex; i > 0; --i) {
631       if (GetCharAt(i - 1) == aCh)
632         return PR_TRUE;
633     }
634     return PR_FALSE;
635   }

The jz is the |return PR_TRUE|.

The caller here is in ContextualAnalysis and looks like this:

723   } else if (cur == U_AMPERSAND || cur == U_SEMICOLON) {
724     // If this may be a separator of params of URL, we should break after.
725     if (!aState.UseConservativeBreaking(1) &&
726         aState.HasCharacterAlready(U_EQUAL))
727       return CLASS_CLOSE;

So every time we see an '&' or ';' we start scanning backwards from that point until we get to an '=' or to the beginning of the buffer.  In this case, there are lots of '&' and ';' but not '=', so we keep scanning back to buffer start all the time.  That makes the whole operation effectively O(N^2) in number of characters (and in particular makes the number of iterations of that loop O(N^2) in number of characters), and said number is 500,000 or so.  500,000^2 == 250,000,000,000, we and I would surprised if we can execute that loop faster than about 1ns per iteration, which gives us 250s as a lower bound for the entire operation.
It looks like HasCharacterAlready is only called for U_EQUAL, U_SLASH, U_BACKSLASH.
Attached patch Proposed fixSplinter Review
Assignee: nobody → bzbarsky
Status: NEW → ASSIGNED
Attachment #428954 - Flags: review?
Attachment #428954 - Attachment description: dif -w for review → diff -w for review
Attachment #428954 - Flags: review? → review?(masayuki)
Comment on attachment 428954 [details] [diff] [review]
diff -w for review

Looks great, thank you for your work!
Attachment #428954 - Flags: review?(masayuki) → review+
Pushed http://hg.mozilla.org/mozilla-central/rev/c1cfc1bf7e52

Not sure how to write a test for this that will pass reliably...

Masayuki Nakano, do you think this patch is safe enough for the branches?
Status: ASSIGNED → RESOLVED
blocking2.0: ? → ---
Closed: 14 years ago
Flags: in-testsuite?
Keywords: qawanted
Resolution: --- → FIXED
(In reply to comment #15)
> Pushed http://hg.mozilla.org/mozilla-central/rev/c1cfc1bf7e52
> 
> Not sure how to write a test for this that will pass reliably...
> 
> Masayuki Nakano, do you think this patch is safe enough for the branches?

I think so, the line-breading is tested in http://mxr.mozilla.org/mozilla-central/source/layout/reftests/line-breaking/ . If all tests passed, it doesn't break most features (https://wiki.mozilla.org/Gecko:Line_Breaking)
Er, in fact I was sure I had tested those... but I clearly hadn't.  Patch coming up to fix the issue this patch introduced due to notifying on slashes before checking for them.
I pushed http://hg.mozilla.org/mozilla-central/rev/4e1b68ecf126 to fix that test failure.  Will attach that for review in a second.
Attached patch Said diffSplinter Review
Attachment #429285 - Flags: review?
Attachment #429285 - Flags: review? → review?(masayuki)
Comment on attachment 429285 [details] [diff] [review]
Said diff

thanks, and sorry for miscatch the mistake.
Attachment #429285 - Flags: review?(masayuki) → review+
This applies to both the 1.9.1 and 1.9.2 branches.  Should be safe (or at least passes tests!) and eliminates a possible hang.
Attachment #429289 - Flags: approval1.9.2.2?
Attachment #429289 - Flags: approval1.9.1.9?
Summary: this file is not HTML but it hangs firefox badly (3.5.x also). → Eliminate sources of O(N^2) backtracking in the linebreaker.
Can Mozilla please assign a CVE to this for tracking (congrats on you guys getting CNA status!).
I'm not sure why this needs a CVE id, but ccing dveditz...
Because it's a pretty effective DoS (you typically have to forcibly kill firefox, then restart it, kill it again, restart it, get the tab that shows which screens you want to load or not, uncheck the bad one, and then you're back to normal).
That technically isn't a DoS, it's a hang. Not an exact DoS from firefox.
I don't believe we typically assign CVEs to performance issues, though.  Anyway, dveditz is the one who'd know about that.
CVE-2009-0773 crash
CVE-2009-1392 crash
CVE-2009-1827 hang
CVE-2009-1828 hang

You have in the last year (and before that as well).
Both of those CVEs are about infinite loops.  There's no infinite loop in this bug; the browser becomes responsive after a minute or two of pegging the CPU.
Well, more like 5-6 minutes, rather.  But definitely finite time.
Comment on attachment 429289 [details] [diff] [review]
roll-up fix for branches

Approved for 1.9.1.9 and 1.9.2.2, a=dveditz for release-drivers
Attachment #429289 - Flags: approval1.9.2.2?
Attachment #429289 - Flags: approval1.9.2.2+
Attachment #429289 - Flags: approval1.9.1.9?
Attachment #429289 - Flags: approval1.9.1.9+
verified for 1.9.2-2 with Windows 7 Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.2.2) Gecko/20100316 Firefox/3.6.2 - no hang on testcase file --> verfied 1.9.2
Keywords: perf
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: