Mozilla jumping between standards compliance and quirks mode on same page.

RESOLVED FIXED

Status

()

Core
HTML: Parser
P3
normal
RESOLVED FIXED
16 years ago
12 years ago

People

(Reporter: Janne Jalkanen, Assigned: jst)

Tracking

({fixed-aviary1.0, fixed1.7, helpwanted})

Trunk
x86
All
fixed-aviary1.0, fixed1.7, helpwanted
Points:
---
Bug Flags:
blocking-aviary1.0PR +

Firefox Tracking Flags

(Not tracked)

Details

(URL)

Attachments

(3 attachments)

(Reporter)

Description

16 years ago
In the above URL, Mozilla can't seem to be able to decide whether the page
should be rendered in standards or quirks mode.  When loaded, Mozilla decides
that the page should be rendered in quirks mode.  However, if you follow any
link, and then click "Back" on the browser, Mozilla loads the page from cache,
and will show the same page in "standards compliance" mode.  This suggests some
possible interaction with the web server (Apache+Tomcat 3.3) - but this should
not affect the rendering mode, yes?

I've seen this occur on Mozilla 1.0, 1.1, and Phoenix 0.4; both on Linux 2.4.18
 (XFree 4.1 and 4.2; KDE 2 and 3) and Windows 98.
Could this be because the first data chunk is all spaces when we're loading this
from the network (whereas from cache it comes in a bigger chunk that actually
includes the doctype)?
Yep.  The server is sending all those newlines as the first packet and the
doctype in the second packet....  So our first OnDataAvailable contains nothing
useful for determining the mode....
Assignee: other → harishd
Status: UNCONFIRMED → NEW
Component: Layout → Parser
Ever confirmed: true
QA Contact: gerardok → moied

Updated

16 years ago
Priority: -- → P3
(Reporter)

Comment 3

15 years ago
Hm.  Isn't this in direct violation of the HTML spec?

Quote from the HTML 4.01 specification, section 7:

An HTML 4 document is composed of three parts:

   1. a line containing HTML version information,
   2. a declarative header section (delimited by the HEAD element),
   3. a body, which contains the document's actual content. The body may be
implemented by the BODY element or the FRAMESET element.

White space (spaces, newlines, tabs, and comments) **may appear before or after
each section**. Sections 2 and 3 should be delimited by the HTML element.
> Isn't this in direct violation of the HTML spec?

No, since that spec has nothing to say on quirks modes.  Which makes it no less 
a bug.
(Reporter)

Comment 5

15 years ago
Hm.  It would seem that mozilla 1.3 fixes this bug - I will test more just to be
sure.  But it is looking good so far: Mozilla goes nicely into standards
compliance mode every time.
(Reporter)

Comment 6

15 years ago
Nope.  It's just that it is occurring now at more irregular intervals.  It could
be that changing the web server could've changed the results or something.

This is still a live bug :-(.

Comment 7

14 years ago
For various arcane reasons, J2EE/JSP actively encourages the creation of pages
with lots of whitespace before the content starts - In my daily working
environment, I'm seeing this bug very frequently indeed.
Keywords: helpwanted
Created attachment 156634 [details]
testcase

this page is in quirks mode due to leading whitespace before the doctype.
remove 1 whitespace byte and it will be in standards .mode
(Assignee)

Comment 9

14 years ago
Created attachment 156829 [details] [diff] [review]
Ignore leading whitespace in the parser when looking for the doctype.
(Assignee)

Comment 10

14 years ago
Comment on attachment 156829 [details] [diff] [review]
Ignore leading whitespace in the parser when looking for the doctype.

This patch fixes this problem by adding a new member to the scanner and
recording the position of the first non-whitespace character in the new member.
The parser then uses this information to know whether or not there's any
non-whitespace data before resuming parsing from OnDataAvailable(), and the
parser also uses the position of the first non-whitespace character as an
offset into the scanner when it copies out a 1024-character long string to look
for the doctype in.

Oddly enough, this change seems safe, which I haven't said about many parser
changes in a long long time...
Attachment #156829 - Flags: superreview?(dbaron)
Attachment #156829 - Flags: review?(darin)
(Assignee)

Comment 11

14 years ago
Taking bug.
Assignee: harishd → jst
Attachment #156829 - Flags: superreview?(dbaron) → superreview+

Comment 12

14 years ago
Comment on attachment 156829 [details] [diff] [review]
Ignore leading whitespace in the parser when looking for the doctype.

>Index: parser/htmlparser/src/nsParser.cpp

>     nsAutoString theBuffer;
>+    // Grab 1024 characters, starting at the first non-whitespace
>+    // character, to look for the doctype in.
>+    mParserContext->mScanner->Peek(theBuffer, 1024, mParserContext->mScanner->FirstNonWhitespacePosition());    

You could replace |nsAutoString theBuffer;| with:

   PRUnichar buf[1024];
   nsFixedString theBuffer(buf, 1024, 0);

Then, you'd be avoiding a heap allocation for this buffer.  Not sure if
there's much benefit, but the amount of generated code should be equivalent.


>Index: parser/htmlparser/src/nsScanner.cpp

>+    while(iter != end) {

nit: whitespace between "while" and opening paren.


r=darin


what happens if a document only contains whitespace?  will we avoid
leaving things in a suspended state?
Attachment #156829 - Flags: review?(darin) → review+
(Assignee)

Comment 13

14 years ago
(In reply to comment #12)
> what happens if a document only contains whitespace?  will we avoid
> leaving things in a suspended state?

In such a case we wouldn't be parsing any of the content until we get the call
to ResumeParse() from OnStopRequest(), i.e. we'd collect all the whitespace in
the scanner w/o parsing any of it while it's coming in, and not call
ResumeParse() until we've seen the whole file.

Not ideal, but how often do we parse files that contain only whitespace? :-)
Status: NEW → ASSIGNED
(Assignee)

Comment 14

14 years ago
Created attachment 156905 [details] [diff] [review]
Patch for checkin (nits fixed, and use nsFixedString)
(Assignee)

Comment 15

14 years ago
Comment on attachment 156905 [details] [diff] [review]
Patch for checkin (nits fixed, and use nsFixedString)

This apparently breaks some pages on ESPN.com, and should be fixed for firefox
1.0, and 1.7 too while we're at it.
Attachment #156905 - Flags: approval1.7.3?
Attachment #156905 - Flags: approval-aviary?
(Assignee)

Comment 16

14 years ago
Comment on attachment 156905 [details] [diff] [review]
Patch for checkin (nits fixed, and use nsFixedString)

Carrying reviews forward...
Attachment #156905 - Flags: superreview+
Attachment #156905 - Flags: review+

Comment 17

14 years ago
Comment on attachment 156905 [details] [diff] [review]
Patch for checkin (nits fixed, and use nsFixedString)

a=mkaply for both
Attachment #156905 - Flags: approval1.7.3?
Attachment #156905 - Flags: approval1.7.3+
Attachment #156905 - Flags: approval-aviary?
Attachment #156905 - Flags: approval-aviary+
(Assignee)

Comment 18

14 years ago
Fixed on trunk, and aviary and 1.7 branches.
Status: ASSIGNED → RESOLVED
Last Resolved: 14 years ago
Keywords: fixed-aviary1.0, fixed1.7
Resolution: --- → FIXED

Updated

14 years ago
Flags: blocking-aviary1.0PR+

Comment 19

14 years ago
this fix causes regression - bug 256938.
(Assignee)

Comment 20

14 years ago
Should'a known. No parser change goes in w/o causing a regression.
(Reporter)

Comment 21

14 years ago
Confirmed fixed in Firefox 1.0PR by the original reporter.  Thanks guys!

Updated

14 years ago
Blocks: 260318

Comment 22

14 years ago
*** Bug 261792 has been marked as a duplicate of this bug. ***
You need to log in before you can comment on or make changes to this bug.