Open
Bug 1180623
Opened 10 years ago
Updated 2 years ago
Stop choking at badly formatted application/xhtml+xml documents
Categories
(Core :: DOM: HTML Parser, defect)
NEW
People
(Reporter: julienw, Unassigned)
Attachments
(3 files)
We're now in a world where HTML documents are parsed in a permissive way. The XML-for-web era never took off and is now basically dead.
Now is a good time to stop displaying an error when encountering a badly formatted application/xhtml+xml document, and instead parse it with the HTML5 parser.
This shows up in particular on GMail's basic mobile interface, which is served to Firefox OS, and it severely impairs the system as a result. I've seen it when trying to display an attachment and when loading the search interface.
How to reproduce:
1. set your user agent to something mobile (e.g. the Firefox OS user agent: Mozilla/5.0 (Mobile; rv:41.0) Gecko/41.0 Firefox/41.0)
2. load gmail.com, log in.
3. tap the 'search' (the magnifying glass) button
Reporter
Updated•10 years ago
OS: Unspecified → All
Hardware: Unspecified → All
Version: unspecified → 41 Branch
Reporter
Comment 1•10 years ago
Of course there is an issue at GMail's end too. But I think this is a good example of why we should relax our behavior here.
Reporter
Comment 2•10 years ago
Copy of the XHTML file triggering the issue.
Reporter
Comment 3•10 years ago
I reported the issue for GMail at https://webcompat.com/issues/1347.
Comment 4•10 years ago
Bug 1180625 is probably a more workable approach.
Comment 5•10 years ago
Opera, back in the Presto days, decided to re-parse as HTML when encountering this kind of error. https://dev.opera.com/blog/no-more-xml-parsing-failed-errors/
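The re-parse-on-error idea can be sketched roughly as follows. This is only an illustration in Python, not Opera's or Gecko's implementation: `parse_with_fallback` and `TagCollector` are hypothetical names, and the standard library's lenient `html.parser` stands in for a real HTML5 tree builder.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser


class TagCollector(HTMLParser):
    """Lenient stand-in for an HTML parser: records start tags, never raises on bad markup."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def parse_with_fallback(markup):
    """Try strict XML first; on a well-formedness error, re-parse leniently as HTML."""
    try:
        root = ET.fromstring(markup)
        # Strip the "{namespace}" prefix ElementTree adds to tag names.
        return "xml", [el.tag.rsplit("}", 1)[-1] for el in root.iter()]
    except ET.ParseError:
        collector = TagCollector()
        collector.feed(markup)
        collector.close()
        return "html", collector.tags


good = '<html xmlns="http://www.w3.org/1999/xhtml"><body><p>ok</p></body></html>'
bad = "<html><body><p>unclosed<br></body></html>"
print(parse_with_fallback(good))
print(parse_with_fallback(bad))
```

The second document is not well-formed XML (the `br` element is never closed), so the sketch falls back to the lenient parser instead of surfacing an error page.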
Reporter
Comment 6•10 years ago
(In reply to :Ms2ger from comment #4)
> Bug 1180625 is probably a more workable approach.
But not really a short-term approach.
Comment 7•10 years ago
>> Bug 1180625 is probably a more workable approach.
>
> But not really a short-term approach.
There is an implementation (in Rust) at https://github.com/Ygg01/xml5ever
Comment 8•10 years ago
That looks like Bug 1036987
Updated•10 years ago
See Also: → https://webcompat.com/issues/1347
Comment 9•10 years ago
On the other hand, content designed for iOS and Android doesn't typically use XHTML either; only one site is encountering this issue.
Comment 10•10 years ago
I wonder if it is possible to reparse as HTML but disable all JavaScript and other active content.
Reporter
Comment 11•10 years ago
(In reply to Yuhong Bao from comment #10)
> I wonder if it is possible to reparse as HTML but disable all JavaScript and
> other active content.
Why would this be useful?
(In reply to Yuhong Bao from comment #9)
> On the other hand, content designed for iOS and Android don't typically use
> XHTML either, it is only one site encountering this issue.
Albeit a very widely used website.
Comment 12•10 years ago
(In reply to Julien Wajsberg [:julienw] from comment #11)
> (In reply to Yuhong Bao from comment #10)
> > I wonder if it is possible to reparse as HTML but disable all JavaScript and
> > other active content.
>
> Why would this be useful ?
XSS, and scripts may not expect that the content is being parsed as HTML.
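A rough sketch of what "reparse as HTML with scripts disabled" could look like, assuming a lenient re-parse that keeps text content and drops script elements. `InertTextExtractor` is a hypothetical name, Python's `html.parser` is only a stand-in for a browser parser, and the sketch does not cover other active content such as inline event handler attributes.

```python
from html.parser import HTMLParser


class InertTextExtractor(HTMLParser):
    """Collects visible text and ignores the contents of script elements."""

    def __init__(self):
        super().__init__()
        self.in_script = False
        self.text = []
        self.dropped_scripts = 0

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self.in_script = True
            self.dropped_scripts += 1

    def handle_endtag(self, tag):
        if tag == "script":
            self.in_script = False

    def handle_data(self, data):
        # Script bodies arrive here as raw text; skip them while inside <script>.
        if not self.in_script and data.strip():
            self.text.append(data.strip())


extractor = InertTextExtractor()
extractor.feed("<p>Hello</p><script>if (a < b) alert(1)</script><p>World</p>")
extractor.close()
print(extractor.text, extractor.dropped_scripts)
```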
Reporter
Comment 13•10 years ago
In the failing website (basic mobile gmail, see bug 1036987), the issue comes from the fact that the page uses a script element without CDATA blocks, and that script element uses the "<" character to do a comparison.
Wondering if we could simply infer CDATA blocks for scripts? Is there a use case for not using CDATA in such cases?
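To illustrate the failure mode, here is a minimal sketch using Python's strict XML parser. The two documents are made-up reductions, not the actual GMail markup: one has a bare "<" inside a script element, the other wraps the script body in a CDATA section.

```python
import xml.etree.ElementTree as ET

# Hypothetical reduced page: the script compares with "<" and is not wrapped in CDATA.
BROKEN = (
    '<html xmlns="http://www.w3.org/1999/xhtml"><head><script>'
    "if (a < b) { run(); }"
    "</script></head><body/></html>"
)

# Same page with the script body wrapped in a CDATA section.
FIXED = (
    '<html xmlns="http://www.w3.org/1999/xhtml"><head><script>'
    "//<![CDATA[\nif (a < b) { run(); }\n//]]>"
    "</script></head><body/></html>"
)


def is_well_formed(markup):
    """Return True if a strict XML parser accepts the markup."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


print(is_well_formed(BROKEN))
print(is_well_formed(FIXED))
```

In XML, a "<" in element content always starts markup, so the unwrapped comparison is a well-formedness error and a strict parser rejects the whole document.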
Comment 14•10 years ago
(In reply to Julien Wajsberg [:julienw] from comment #13)
> Wondering if we could simply infer CDATA blocks for scripts? Is there a
> usage for not using CDATA for such cases?
How would that work, exactly?
If you start tweaking the parser’s behavior, you should specify it to give other browsers a chance to interoperate without reverse-engineering. This is what XML5 does.
Reporter
Comment 15•10 years ago
Chromium tries to display something but the page is also non-functional.
(I forced the Firefox OS UA)
Reporter
Comment 16•10 years ago
(In reply to Simon Sapin (:SimonSapin) from comment #14)
> (In reply to Julien Wajsberg [:julienw] from comment #13)
> > Wondering if we could simply infer CDATA blocks for scripts? Is there a
> > usage for not using CDATA for such cases?
>
> How would that work, exactly?
I was more thinking out loud.
In the current case, the issue is with the script part; that's why I thought we could do something here.
But I can see the site here is really broken. It likely works only on very permissive UAs. I don't think it is really our role to fix it, at least not by tweaking the parser.
>
> If you start tweaking the parser’s behavior, you should specify it to give
> other browsers a chance to interoperate without reverse-engineering. This is
> what XML5 does.
Agreed.
Updated•6 years ago
URL: https://gmail.com
Updated•3 years ago
Webcompat Priority: --- → ?
Updated•3 years ago
Webcompat Priority: ? → ---
Updated•2 years ago
Severity: normal → S3