Closed Bug 18480 Opened 25 years ago Closed 24 years ago

bad html crashes mozilla

Categories

(Core :: DOM: HTML Parser, defect, P3)

Tracking

VERIFIED FIXED

People

(Reporter: jdaly, Assigned: harishd)

Details

(Keywords: crash, testcase, Whiteboard: [nsbeta2+] Hack in hand; Omitting ending tags should be avoided)

Attachments

(2 files)

I'm running Mozilla pulled from CVS on 9 Nov 1999 on Linux. Loading the page
above causes Mozilla to crash. Granted, it's bad HTML (no font closing tags),
but that shouldn't crash the browser, right? The page loads fine in Netscape
4.7.
Status: NEW → RESOLVED
Closed: 25 years ago
Resolution: --- → WORKSFORME
My pull from this morning isn't crashing. I'm on NT, though. Marking WORKSFORME,
but if you still see it in today's build then a stack trace would be helpful.
Status: RESOLVED → REOPENED
Assignee: troy → harishd
Status: REOPENED → NEW
I do see a stack overflow when I try to load a different URL from that window.

Reassigning to Parser, to see if they can do something about making the content
model more sensible. There are no closing <FONT> tags, so it looks like
everything just ends up nested and we have a ridiculously nested content model.
Assignee: harishd → troy
Per spec, start and end tags are mandatory for FONT, but FONT nesting is not
illegal. The Gecko parser supports this behavior, so the problem seems to be
in layout land... Troy!! :)

Back to troy@netscape.com
Resolution: WORKSFORME → ---
Clearing WORKSFORME resolution due to reopen.
Assignee: troy → rickg
The problem is not really in layout. It's in the content model...

Rick, Kipp was the content model guy. Who owns it now?
Assignee: rickg → vidur
Here's one for you, since you've cheerfully accepted ownership of content model
land.
Severity: normal → major
I monitored the memory usage while loading documents with "open" tags - it
seems Mozilla (and viewer) are not releasing memory properly after loading. So
even if bad HTML does not crash Mozilla immediately, it will definitely crash
after the HTML gets loaded a few times.
*** Bug 20089 has been marked as a duplicate of this bug. ***
Severity: major → critical
Component: Layout → Parser
OS: Linux → All
Hardware: PC → All
Updating severity to critical since a lot of sites are crashing Mozilla just
because a page does not close a tag. Also, updating platform and OS to All, and
component to Parser.
In bug 20087 I reported Mozilla crashing on ODP edit screens. ( Example:
http://www.math.berkeley.edu/~zach/edit.html ) That page has a missing </small>
tag; however, inserting the </small> tag does not help (
http://www.math.berkeley.edu/~zach/editd.html ). A <td></td> pair is also
missing (the subject of bug 20087), which may or may not be relevant here.
Another pointer: Go to http://linuxtoday.com/story.php3?sn=12842&showcomments=flat
; boom - crash. This page has a missing </TABLE> tag. In fact, so do all pages
generated by story.php3 with showcomments=flat, so effectively you cannot read
any user comments on LinuxToday.
In an attempt to get my bug list in order again, marking all the bugs I have
currently as ASSIGNED.
Whiteboard: [MAKINGTEST] cheezycrust@atmosphere.be
This bug shows up when enough open <font> tags are used. On my PC
(PII/400 MHz/64 MB RAM), this happens at roughly 2500 open tags. Whether the
closing </BODY> and </HTML> tags are present does not matter for this bug.
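For reference, a page like this testcase is trivial to generate. Here is a
minimal C++ sketch (the filename, markup, and tag count are illustrative,
not the actual attachment):

  #include <fstream>

  // Writes an HTML page with `count` unclosed <font> tags, similar to
  // the testcase described above. Roughly 2500 triggered the crash on
  // the PII/400 mentioned; other machines may need more.
  int main() {
      const int count = 2500;           // illustrative; tune per machine
      std::ofstream out("fonttest.html");
      out << "<html><body>\n";
      for (int i = 0; i < count; ++i)
          out << "<font size=\"2\">x";  // opened, never closed
      out << "\n";                      // deliberately no closing tags
      return 0;
  }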
Whiteboard: [MAKINGTEST] cheezycrust@atmosphere.be → [TESTCASE] Omitting ending tags should be avoided
The crash is a result of us blowing the stack during frame construction (Troy
has seen a similar crash on page unloading in nsGenericElement::SetDocument)
because the tree is too deep. It seems like we have the following options:
1) Identify all places where recursive processes need to track the depth of
recursion and either bail out or switch to an iterative equivalent if we go too
deep. Considering the many places we use recursion, this may not be practical.
2) Restrict the depth of the tree in the content sink. This may result in
incorrect layout in pathological cases. It turns out that we may actually have
reasonably correct layout in the penguin case if we just flatten in the failure
case.
3) Special-case certain tags (such as FONT) that we believe may be used
incorrectly (as FONT is in this case) and only allow a few levels of nesting.

For simplicity, I favor the second option. Troy, Harish - comments?
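For concreteness, here is a minimal sketch of option 2, using hypothetical
names rather than the real content sink interface: once the number of open
containers reaches a cap, further opens are refused and the matching closes
are swallowed, so overflow content is flattened into the deepest allowed
container.

  // Hypothetical, simplified sink; names and structure are
  // illustrative only, not the actual Gecko content sink.
  class DepthCappedSink {
    static const int kMaxDepth = 1024;  // hardcoded cap; the right value
                                        // is debated later in this bug
    int mDepth = 0;                     // containers currently open
    int mSkipped = 0;                   // opens refused at the cap

  public:
    void OpenContainer() {
      if (mDepth >= kMaxDepth) {
        ++mSkipped;   // refuse to deepen the tree; children land in
        return;       // the current container instead
      }
      ++mDepth;
      // ... append a new child container and descend into it ...
    }

    void CloseContainer() {
      if (mSkipped > 0) {
        --mSkipped;   // balance a refused open
        return;
      }
      if (mDepth > 0) {
        --mDepth;
        // ... pop back up to the parent container ...
      }
    }
  };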
Bulk moving [testcase] code to new testcase keyword. Sorry for the spam!
Keywords: testcase
Adding "crash" keyword to all known open crasher bugs.
Keywords: crash
*** Bug 26034 has been marked as a duplicate of this bug. ***
*** Bug 1070 has been marked as a duplicate of this bug. ***
Target Milestone: M16
Here's another real-world test link:
http://www.iso.port.ac.uk/docs/downloaded/pike-tutorial/tutorial_onepage.html
In this one, the problem seems to be the nested A NAME and A TYPE tags.
*** Bug 31008 has been marked as a duplicate of this bug. ***
Another example: http://gostanford.com/ crashes, but the same page with </FONT>
tags added does not: http://deathstar.stanford.edu/~adastra/gostanford.html
*** Bug 31550 has been marked as a duplicate of this bug. ***
*** Bug 18168 has been marked as a duplicate of this bug. ***
*** Bug 31008 has been marked as a duplicate of this bug. ***
comments carried over from bug 31008

------- Additional Comments From Roland.Mainz@informatik.med.uni-giessen.de 
2000-03-25 08:27 -------


Loading the URL http://www.autsch.de/bitsteller/linux_penguin.html is very slow
on my Solaris 2.7/MU4 SPARC system, and trying to select some text at the
bottom takes 20 seconds or more (Ultra 5/333 MHz/2 MB cache).

Verified it on my Linux system - compared to Netscape 4.x it is VERY slow ;-(
Bug 37618 shows a side effect of one unclosed font tag in combination with
<select> and <option>.
Nominating nsbeta2 because we've seen this bug on linuxtoday.com, and penguin
graphics all over the place such as http://rio.dhs.org/penguin.html (which seems
to be a popular page as it's a frequent DUP), *especially* because it crashes on
http://gostanford.com/ (my alma mater--Berkeley site bugs can be marked WONTFIX
;-> ), and above all because missing end tags are common on the web.
Keywords: nsbeta2
*** Bug 39163 has been marked as a duplicate of this bug. ***
Adding mostfreq as keyword: As of today there are 11 "nested duplicates" of this
bug, and it's a crasher, originally reported in December 1998 (bug 1070).
I think it deserves all the attention it can get.
Keywords: mostfreq
Putting on [nsbeta2+] radar.
Whiteboard: [TESTCASE] Omitting ending tags should be avoided → [nsbeta2+][TESTCASE] Omitting ending tags should be avoided
Taking this bug off Vidurs list.
Assignee: vidur → jst
Status: ASSIGNED → NEW
I have a fix for this in my tree: don't let the content model get too deep;
if it does, we stop opening containers. The structure of the resulting page
will be incorrect, but if we get into a situation like this, the input data is
invalid anyway. Does this sound like a reasonable safeguard against crappy HTML?
Status: NEW → ASSIGNED
That sounds fine... Do you decide if it is "too deep" based on a hardcoded 
depth or based on free memory? I would very very strongly argue that we should
go for a memory-based threshold rather than a fixed number. If we limit it to
a fixed depth like 1024 then we just _know_ that someone will legitimately need
us to support 1025 nested containers and have plenty of free memory for it...
Whiteboard: [nsbeta2+][TESTCASE] Omitting ending tags should be avoided → [nsbeta2+] Omitting ending tags should be avoided
The amount of free memory is actually irrelevant here (on some platforms at
least); we don't crash because we run out of memory, we crash due to a stack
overrun.

IMO setting a hardcoded limit is ok here since we can set the limit quite high,
like the 1024 you suggested; if someone writes valid HTML that nests that deep,
they have bigger problems than Mozilla not displaying it correctly ;-)

My guess is that an average HTML document doesn't nest deeper than 10-20
levels, so setting the limit at something like 1024 would leave plenty of room
for end tags that occasionally don't get closed by the parser.

I know one commercial SGML parser that has a hardcoded nesting depth limit as
low as 48, and that's plenty deep in most cases. I'm not saying we should set
our limit that low, but normally that should be enough.
I would vote for a limit which is calculated from the file size. This would
catch both cases: valid and deeply nested HTML in large files, and crappy HTML
in small files...
Johnny: Ok, point taken. Is there no way we can detect when we are hitting the
top of the stack? I'm just a little worried that whatever limit we pick may be
too high for some architectures to which we are ported (or equally, much 
smaller than a particular architecture could easily cope with).

Roland: We don't (always) know the size of the page before we start parsing it.
Think of a page generated by a script which just outputs "<FONT>" continuously.
We need to be able to cope with this worst-case scenario without crashing.
Ian, no, there's no reliable XP way to detect when we're about to run out of
stack space AFAIK. There are hacks that do a pretty good job at this, but
there's no way to tell for sure.

Besides, the stack overflow may occur in different places, as you see in
Vidur's comment above, so we'd have to add this stack overflow detection hack
in many places. It's not in the content construction code, it's later on, and
it can IMO happen in quite a few (possibly still unidentified) places within
layout.

IMO we should simply hardcode the limit as high as we can without causing a
stack overrun; that's the highest we can go, period.

It doesn't matter how big or small the file is; just because the file is
big/small doesn't mean we can nest deeper, we would still fill the stack.

We might want to make the limit platform-specific at some point since the stack
size varies from platform to platform.
> I have a fix for this in my tree, the fix is to not let the content model get
> too deep, if it does we stop opening containers. The structure of resulting
> page will be incorrect, but if we get into a situation like this the input
> data is invalid anyway.

I think if you hit the limit, then instead of not opening the new container,
you could maybe close the last container and open the new one.
This would give a slightly better result when the limit is hit.

Also, from the outside it looks like the bottleneck is not laying out a deep
tree; the "residual style" handler fails long before that.
I think the residual style handling should be stopped at a lower limit.
This way, opening block-level elements above this limit could kill unclosed
inline elements, so you get a shallower tree and possibly won't hit the
"hard" limit on most of the faulty pages.

---

I do think that the limits should be hardcoded XP,
because layout should behave XP no matter what...
Having some (crappy, I admit) pages breaking on Windows but not on Linux
because of a different (or worse, autodetected) limit will just drive QA
people nuts.
Why not try to avoid stack problems by using CPS (continuation-passing style)
instead of recursive calls?
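For illustration (using a toy tree type, not Gecko's actual classes): the
practical benefit CPS would bring here is moving pending work off the native
stack, which an explicit heap-allocated work stack achieves more directly.

  #include <stack>
  #include <vector>

  struct Node { std::vector<Node*> children; };

  // Recursive walk: each level consumes a native stack frame, so a
  // deep enough tree overruns the thread's stack.
  void VisitRecursive(Node* n) {
      // ... process n ...
      for (Node* c : n->children)
          VisitRecursive(c);
  }

  // Iterative equivalent: pending nodes live on the heap, so depth is
  // bounded by available memory rather than by stack size.
  void VisitIterative(Node* root) {
      std::stack<Node*> work;
      work.push(root);
      while (!work.empty()) {
          Node* n = work.top();
          work.pop();
          // ... process n ...
          // push children in reverse to keep left-to-right order
          for (auto it = n->children.rbegin(); it != n->children.rend(); ++it)
              work.push(*it);
      }
  }

Retrofitting this shape of change across the many recursive paths in layout is
the hard part, as the next comment notes; the sketch only shows the idea.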
Bajusz, what you're suggesting would probably result in a better content model,
and it might be worth doing. However, this fix is really only a last effort to
avoid a crash, and thus I would like to keep it as simple and maintainable as
possible. Making sense of whatever HTML we're fed is really up to the parser
(i.e. normalizing the HTML); this fix is here for one reason only: to prevent
the crash.

Aleksey, that's one possibility and I think it's been discussed, but it's not
really an option at this stage of the development cycle; the change is not
trivial, to say the least. The point here is that we shouldn't be fed a content
model deep enough to overflow the stack in any case, and if we are, we simply
stop opening containers until we start closing them again...
Whiteboard: [nsbeta2+] Omitting ending tags should be avoided → [nsbeta2+][jst] Omitting ending tags should be avoided
Has this been fixed?  I cannot make Linux build 2000060908 (M17) crash with any
of the testcases listed here.  Some of them are slow, but no crashy crashy.  If
this is fixed, it would make jst a little less doomed for M16.
Nope, this is not fixed; you probably need a lot more than 2500 unclosed
font tags today, but try 50000 and you'll most likely crash.

I'm handing this over to harishd (per agreement). I'll attach my fix; the
nesting depth limit is still just hardcoded in the patch...
Assignee: jst → harishd
Status: ASSIGNED → NEW
Whiteboard: [nsbeta2+][jst] Omitting ending tags should be avoided → [nsbeta2+] Omitting ending tags should be avoided
Whiteboard: [nsbeta2+] Omitting ending tags should be avoided → [nsbeta2+] Hack in hand; Omitting ending tags should be avoided
If we do add such a hack, then make sure it stays on the content sink side, and
not in the parser. The parser is agnostic about these things (you can imagine
using the parser outside of Gecko -- where stack depth is not an issue).
Yes, the hack is completely in the content sink, completely independent of the
parser.
M16 has been out for a while now; this bug's target milestone needs to be
updated.
Ya..ya. I just checked the *hack* in. The milestone doesn't matter anymore.
Thanx for bringing it up though :)
Status: NEW → ASSIGNED
FIXED.
Status: ASSIGNED → RESOLVED
Closed: 25 years ago → 24 years ago
Resolution: --- → FIXED
With the June 26th build (2000062608), I'm not able to reproduce the crash as
described. Marking verified fixed.
Status: RESOLVED → VERIFIED
*** Bug 15918 has been marked as a duplicate of this bug. ***