Open
Bug 1180623
Opened 9 years ago
Updated 2 years ago
Stop choking at badly formatted application/xhtml+xml documents
Categories
(Core :: DOM: HTML Parser, defect)
NEW
People
(Reporter: julienw, Unassigned)
Attachments
(3 files)
We're now in a world where HTML documents are parsed permissively. The XML-for-the-web era never took off and is now basically dead. Now is a good time to stop displaying an error when we encounter a badly formatted application/xhtml+xml document, and to parse it with the HTML5 parser instead.

This shows up in particular on GMail's basic mobile interface, which is served to Firefox OS, and as a result it badly impairs the system. I've seen it when trying to display an attachment and when loading the search interface.

How to reproduce:
1. Set your user agent to something mobile (e.g. the Firefox OS user agent: Mozilla/5.0 (Mobile; rv:41.0) Gecko/41.0 Firefox/41.0).
2. Load gmail.com and log in.
3. Tap the 'search' button (the magnifying glass).
Reporter
Updated•9 years ago
OS: Unspecified → All
Hardware: Unspecified → All
Version: unspecified → 41 Branch
Reporter
Comment 1•9 years ago
Of course there is an issue at GMail's end too. But I think this is a good example of why we should relax our behavior here.
Reporter
Comment 2•9 years ago
Copy of the XHTML file triggering the issue.
Reporter
Comment 3•9 years ago
I reported the issue for GMail at https://webcompat.com/issues/1347.
Comment 4•9 years ago
Bug 1180625 is probably a more workable approach.
Comment 5•9 years ago
Opera, back in the Presto days, decided to re-parse as HTML when encountering this kind of error. https://dev.opera.com/blog/no-more-xml-parsing-failed-errors/
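For illustration only, the "re-parse as HTML on XML error" idea can be sketched in a few lines of Python. This is not Opera's or Gecko's implementation: `parse_with_fallback` and `TagCollector` are hypothetical names, and the standard-library `html.parser` is only a stand-in for a real HTML5 parser (it does not implement the HTML5 tree-building algorithm).

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser


class TagCollector(HTMLParser):
    """Hypothetical helper: records start tags seen by the lenient parser."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def parse_with_fallback(source: str):
    """Try strict XML first; on a well-formedness error, re-parse leniently.

    Returns a (mode, tags) pair, where mode is "xml" or "html".
    """
    try:
        root = ET.fromstring(source)
        return "xml", [element.tag for element in root.iter()]
    except ET.ParseError:
        # Assumption: restart from the beginning with a permissive parser
        # instead of showing an error page.
        collector = TagCollector()
        collector.feed(source)
        collector.close()
        return "html", collector.tags


well_formed = "<html><body><p>ok</p></body></html>"
broken = "<html><body><p>unclosed<br></body></html>"

print(parse_with_fallback(well_formed))
print(parse_with_fallback(broken))
```

The sketch re-parses the whole document from the start; it says nothing about what to do with content that was already rendered or scripts that already ran, which is where the real difficulty lies.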
Reporter
Comment 6•9 years ago
(In reply to :Ms2ger from comment #4)
> Bug 1180625 is probably a more workable approach.

But not really a short-term approach.
Comment 7•9 years ago
>> Bug 1180625 is probably a more workable approach.
>
> But not really a short-term approach.

There is an implementation (in Rust) at https://github.com/Ygg01/xml5ever
Comment 8•9 years ago
That looks like Bug 1036987
Updated•9 years ago
See Also: → https://webcompat.com/issues/1347
Comment 9•9 years ago
On the other hand, content designed for iOS and Android doesn't typically use XHTML either; only one site is encountering this issue.
Comment 10•9 years ago
I wonder if it is possible to reparse as HTML but disable all JavaScript and other active content.
Reporter
Comment 11•9 years ago
(In reply to Yuhong Bao from comment #10)
> I wonder if it is possible to reparse as HTML but disable all JavaScript and
> other active content.

Why would this be useful?

(In reply to Yuhong Bao from comment #9)
> On the other hand, content designed for iOS and Android don't typically use
> XHTML either, it is only one site encountering this issue.

Albeit a very widely used website.
Comment 12•9 years ago
(In reply to Julien Wajsberg [:julienw] from comment #11)
> (In reply to Yuhong Bao from comment #10)
> > I wonder if it is possible to reparse as HTML but disable all JavaScript and
> > other active content.
>
> Why would this be useful ?

XSS, and scripts may not expect that the content is being parsed as HTML.
Reporter
Comment 13•9 years ago
In the failing website (basic mobile gmail, see bug 1036987), the issue comes from the fact that the page uses a script element without CDATA blocks, and that script element uses the "<" character to do a comparison. I wonder if we could simply infer CDATA blocks for scripts? Is there a use case for not using CDATA in such cases?
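A minimal sketch of the failure described above, using Python's standard XML parser as a stand-in for a strict XHTML parser. The script bodies are invented for illustration and are not taken from the GMail page; `wrap` and `is_well_formed` are hypothetical helper names.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"


def wrap(script_body: str) -> str:
    """Embed a script body in a tiny XHTML document (illustrative only)."""
    return (
        f'<html xmlns="{XHTML_NS}"><head><script>'
        f"{script_body}"
        "</script></head><body/></html>"
    )


def is_well_formed(document: str) -> bool:
    """Return True if a strict XML parser accepts the document."""
    try:
        ET.fromstring(document)
        return True
    except ET.ParseError:
        return False


# A bare "<" inside the script is read as the start of a tag.
bare = wrap("if (a < b) { run(); }")
# Wrapping the script in a CDATA section hides "<" from the XML parser.
cdata = wrap("//<![CDATA[\nif (a < b) { run(); }\n//]]>")
# Escaping the character as an entity also keeps the document well-formed.
escaped = wrap("if (a &lt; b) { run(); }")

for label, document in [("bare", bare), ("cdata", cdata), ("escaped", escaped)]:
    print(label, is_well_formed(document))
```

Only the first variant is rejected. Inferring CDATA for scripts would amount to having the parser treat the first form as if it were the second, which is a change to XML parsing rules rather than a fix to the page.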
Comment 14•9 years ago
(In reply to Julien Wajsberg [:julienw] from comment #13)
> Wondering if we could simply infer CDATA blocks for scripts? Is there a
> usage for not using CDATA for such cases?

How would that work, exactly?

If you start tweaking the parser’s behavior, you should specify it to give other browsers a chance to interoperate without reverse-engineering. This is what XML5 does.
Reporter
Comment 15•9 years ago
Chromium tries to display something but the page is also non-functional. (I forced the Firefox OS UA)
Reporter
Comment 16•9 years ago
(In reply to Simon Sapin (:SimonSapin) from comment #14)
> (In reply to Julien Wajsberg [:julienw] from comment #13)
> > Wondering if we could simply infer CDATA blocks for scripts? Is there a
> > usage for not using CDATA for such cases?
>
> How would that work, exactly?

I was mostly thinking out loud. In the current case the issue is with the script part, which is why I thought we could do something here. But I can see the site is really broken; it likely works only on very permissive UAs. I don't think it is really our role to fix it, at least not by tweaking the parser.

> If you start tweaking the parser’s behavior, you should specify it to give
> other browsers a chance to interoperate without reverse-engineering. This is
> what XML5 does.

Agreed.
Updated•5 years ago
URL: https://gmail.com
Updated•5 years ago
Updated•3 years ago
Webcompat Priority: --- → ?
Updated•2 years ago
Webcompat Priority: ? → ---
Updated•2 years ago
Severity: normal → S3