Open Bug 1180623 Opened 10 years ago Updated 2 years ago

Stop choking at badly formatted application/xhtml+xml documents

Status:

NEW

People

(Reporter: julienw, Unassigned)

Attachments

(3 files)

Error on gmail, 10 years ago, Julien Wajsberg [:julienw], 81.10 KB, image/png
error_gmail.xhtml, 10 years ago, Julien Wajsberg [:julienw], 6.05 KB, application/xhtml+xml
Error on gmail with Chromium, 10 years ago, Julien Wajsberg [:julienw], 63.18 KB, image/png

Julien Wajsberg [:julienw]

Reporter

Description

•

10 years ago

We're now in a world where HTML documents are parsed in a permissive way. The XML-for-the-web era never took off and is now basically dead. Now is a good time to stop displaying an error when encountering a badly formatted application/xhtml+xml document, and instead parse it with the HTML5 parser.

This is especially encountered on GMail's basic mobile interface, which is served to Firefox OS, and as a result it significantly impairs the system. I've seen this when trying to display an attachment, and when loading the search interface.

How to reproduce:
1. Set your user agent to something mobile (e.g. the Firefox OS user agent: Mozilla/5.0 (Mobile; rv:41.0) Gecko/41.0 Firefox/41.0).
2. Load gmail.com and log in.
3. Tap the 'search' (magnifying glass) button.

Julien Wajsberg [:julienw]

Reporter

Updated

•

10 years ago

OS: Unspecified → All

Hardware: Unspecified → All

Version: unspecified → 41 Branch

Julien Wajsberg [:julienw]

Reporter

Comment 1

•

10 years ago

Attached image Error on gmail — Details

Of course there is an issue at GMail's end too. But I think this is a good example of why we should relax our behavior here.

Julien Wajsberg [:julienw]

Reporter

Comment 2

•

10 years ago

Attached file error_gmail.xhtml — Details

Copy of the XHTML file triggering the issue.

Julien Wajsberg [:julienw]

Reporter

Comment 3

•

10 years ago

I reported the issue for GMail at https://webcompat.com/issues/1347.

:Ms2ger (he/him; ⌚ UTC+1/+2)

Comment 4

•

10 years ago

Bug 1180625 is probably a more workable approach.

Anthony Ricaud (:rik)

Comment 5

•

10 years ago

Opera, back in the Presto days, decided to re-parse as HTML when encountering this kind of error. https://dev.opera.com/blog/no-more-xml-parsing-failed-errors/
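
For illustration only, here is a minimal Python sketch of the re-parse-on-error idea discussed in this bug. It is not Gecko or Presto code. The names `parse_xhtml_leniently` and `TagCollector` are hypothetical, and Python's `html.parser` is used as a stand-in for a permissive parser; it is not an implementation of the HTML5 parsing algorithm.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser


class TagCollector(HTMLParser):
    """Permissive stand-in parser: records every start tag it sees."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def parse_xhtml_leniently(source):
    """Try strict XML first; on a well-formedness error, re-parse permissively.

    Returns a (mode, tags) pair, where mode is "xml" or "html".
    """
    try:
        root = ET.fromstring(source)
        return "xml", [element.tag for element in root.iter()]
    except ET.ParseError:
        # Strict parsing failed: fall back instead of showing an error page.
        collector = TagCollector()
        collector.feed(source)
        collector.close()
        return "html", collector.tags


well_formed = "<html><body><p>ok</p></body></html>"
# A bare "<" inside <script> is not well-formed XML.
broken = "<html><body><script>if (a < b) { run(); }</script><p>ok</p></body></html>"

print(parse_xhtml_leniently(well_formed))
print(parse_xhtml_leniently(broken))
```

The sketch only shows the control flow. It does not address the concern raised later in this bug that scripts may not expect their document to be parsed as HTML.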

Julien Wajsberg [:julienw]

Reporter

Comment 6

•

10 years ago

(In reply to :Ms2ger from comment #4)
> Bug 1180625 is probably a more workable approach.

But not really a short-term approach.

Simon Sapin (:SimonSapin)

Comment 7

•

10 years ago

> > Bug 1180625 is probably a more workable approach.
>
> But not really a short-term approach.

There is an implementation (in Rust) at https://github.com/Ygg01/xml5ever

Karl Dubost💡 :karlcow

Comment 8

•

10 years ago

That looks like Bug 1036987

Karl Dubost💡 :karlcow

Updated

•

10 years ago

See Also: → https://webcompat.com/issues/1347

Yuhong Bao

Comment 9

•

10 years ago

On the other hand, content designed for iOS and Android doesn't typically use XHTML either; only one site is encountering this issue.

Yuhong Bao

Comment 10

•

10 years ago

I wonder if it is possible to reparse as HTML but disable all JavaScript and other active content.

Julien Wajsberg [:julienw]

Reporter

Comment 11

•

10 years ago

(In reply to Yuhong Bao from comment #10)
> I wonder if it is possible to reparse as HTML but disable all JavaScript and
> other active content.

Why would this be useful?

(In reply to Yuhong Bao from comment #9)
> On the other hand, content designed for iOS and Android don't typically use
> XHTML either, it is only one site encountering this issue.

Albeit a very widely used website.

Yuhong Bao

Comment 12

•

10 years ago

(In reply to Julien Wajsberg [:julienw] from comment #11)
> (In reply to Yuhong Bao from comment #10)
> > I wonder if it is possible to reparse as HTML but disable all JavaScript and
> > other active content.
>
> Why would this be useful ?

XSS, and scripts may not expect that the content is being parsed as HTML.

Julien Wajsberg [:julienw]

Reporter

Comment 13

•

10 years ago

In the failing website (basic mobile gmail, see bug 1036987), the issue comes from the fact that the page uses a script element without CDATA blocks, and that script element uses the "<" character to do a comparison. Wondering if we could simply infer CDATA blocks for scripts? Is there a usage for not using CDATA for such cases?
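
To make the failure mode concrete, here is a small Python sketch using the standard library's strict XML parser. The three documents are invented for illustration and are not taken from the attached error_gmail.xhtml; the helper names `is_well_formed` and `script_text` are hypothetical.

```python
import xml.etree.ElementTree as ET

# Script with a bare "<" comparison: not well-formed XML.
BARE = "<html><body><script>if (i < 10) { go(); }</script></body></html>"

# Same script wrapped in a CDATA section: well-formed.
WRAPPED = (
    "<html><body><script><![CDATA[if (i < 10) { go(); }]]></script></body></html>"
)

# Same script with the "<" escaped as an entity reference: also well-formed.
ESCAPED = "<html><body><script>if (i &lt; 10) { go(); }</script></body></html>"


def is_well_formed(source):
    """Return True if a strict XML parser accepts the document."""
    try:
        ET.fromstring(source)
        return True
    except ET.ParseError:
        return False


def script_text(source):
    """Return the text content of the first <script> element."""
    return ET.fromstring(source).find(".//script").text


for name, doc in (("bare", BARE), ("wrapped", WRAPPED), ("escaped", ESCAPED)):
    print(name, is_well_formed(doc))
```

In both well-formed variants the parser hands the same script text to the consumer, so the page author can fix the problem either with CDATA or by escaping the character.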

Simon Sapin (:SimonSapin)

Comment 14

•

10 years ago

(In reply to Julien Wajsberg [:julienw] from comment #13)
> Wondering if we could simply infer CDATA blocks for scripts? Is there a
> usage for not using CDATA for such cases?

How would that work, exactly?

If you start tweaking the parser’s behavior, you should specify it to give other browsers a chance to interoperate without reverse-engineering. This is what XML5 does.

Julien Wajsberg [:julienw]

Reporter

Comment 15

•

10 years ago

Attached image Error on gmail with Chromium — Details

Chromium tries to display something but the page is also non-functional. (I forced the Firefox OS UA)

Julien Wajsberg [:julienw]

Reporter

Comment 16

•

10 years ago

(In reply to Simon Sapin (:SimonSapin) from comment #14)
> (In reply to Julien Wajsberg [:julienw] from comment #13)
> > Wondering if we could simply infer CDATA blocks for scripts? Is there a
> > usage for not using CDATA for such cases?
>
> How would that work, exactly?

I was more thinking out loud. In the current case, the issue is with the script part; that's why I thought we could do something here. But I can see the site here is really broken. It likely works only on very permissive UAs. I don't think it is really our role to fix it, at least not by tweaking the parser.

> If you start tweaking the parser’s behavior, you should specify it to give
> other browsers a chance to interoperate without reverse-engineering. This is
> what XML5 does.

Agreed.

Mike Taylor [:miketaylr]

Updated

•

6 years ago

URL: https://gmail.com

Mike Taylor [:miketaylr]

Updated

•

6 years ago

URL: https://gmail.com → https://mail.google.com

Karl Dubost💡 :karlcow

Updated

•

3 years ago

Webcompat Priority: --- → ?

Dennis Schubert [:denschub]

Updated

•

3 years ago

Webcompat Priority: ? → ---

BMO Automation

Updated

•

2 years ago

Severity: normal → S3
