Closed Bug 1173823 Opened 10 years ago Closed 8 years ago

Reader View ignores <base href="...">

Categories

(Toolkit :: Reader Mode, defect, P2)

38 Branch
defect

Tracking

()

RESOLVED FIXED
mozilla52
Tracking Status
firefox52 --- fixed

People

(Reporter: eitzen, Assigned: evanxd)

References

(Blocks 1 open bug)

Details

(Whiteboard: [reader-mode-readability-algorithm])

Attachments

(1 file)

User Agent: Mozilla/5.0 (Windows NT 5.1; rv:38.0) Gecko/20100101 Firefox/38.0 Build ID: 20150525141253 Steps to reproduce: 1. Visit a page with a <base> tag in its <head> and a relative link in its body (example: https://www.redeker.de/main-V2.php/de/news/pm20150611.html ) 2. switch to "Reader View" 3. Follow the link Actual results: The link leads to the wrong or a nonexisting page because it is interperted relative to the page url instead of relative to the <base> url. Apparently, the attempt to remove everything "superfluous" from the DOM tree removed the <base href="..."/> from within the <head>, but as this influences all relative links of the document it is essential rather than superfluous Expected results: The correct page should be displayed
Component: Untriaged → Reader Mode
Product: Firefox → Toolkit
I can confirm that this happens with another example like wikipedia https://en.wikipedia.org/wiki/Cley_Marshes#cite_note-NAE9-1 Click on a footnote url and the page will not scroll. < a href = "#cite_note-NAE9-1"> Support thread reference: https://support.mozilla.org/en-US/questions/1075085
Status: UNCONFIRMED → NEW
Ever confirmed: true
Priority: -- → P2
Whiteboard: [reader-mode-readability-algorithm]
Blocks: 1286221
Assignee: nobody → evan
I think this needs to be fixed in the consumer (ie ReaderMode.jsm which sets the URL to pass to Readability) by using document.baseURI instead of document.documentURI here: https://dxr.mozilla.org/mozilla-central/source/toolkit/components/reader/ReaderMode.jsm#199 . We should be quite careful to check if document.baseURI is somehow security checked so that we don't open ourselves up to pages abusing <base> to get treated like chrome:// or google.com (when they're not) or whatever.
(In reply to :Gijs Kruitbosch from comment #4) > I think this needs to be fixed in the consumer (ie ReaderMode.jsm which sets > the URL to pass to Readability) by using document.baseURI instead of > document.documentURI here: > https://dxr.mozilla.org/mozilla-central/source/toolkit/components/reader/ > ReaderMode.jsm#199 . Good idea. Let's do it. > We should be quite careful to check if document.baseURI is somehow security > checked so that we don't open ourselves up to pages abusing <base> to get > treated like chrome:// or google.com (when they're not) or whatever. We already do some check in `_shouldCheckUri` method[1]. But do we really need to forbid that someone use different domain in `<base>` tag in reader mode?(e.g. a webpage[2] use `https://www.google.com.tw/images/` as its base URI, I think reader mode should support it) I think people could use any valid URI as their base URI if they want. What do you think? [1]: https://dxr.mozilla.org/mozilla-central/source/toolkit/components/reader/ReaderMode.jsm#370-394 [2]: http://evanxd.io/readability/test/test-pages/blogger/source.html
Comment on attachment 8808095 [details] Bug 1173823 - Always use document.baseURI as the uri parameter of _readerParse method, https://reviewboard.mozilla.org/r/91030/#review90780 ::: toolkit/components/reader/ReaderMode.jsm:201 (Diff revision 1) > * @resolves JS object representing the article, or null if no article is found. > */ > parseDocument: Task.async(function* (doc) { > - let uri = Services.io.newURI(doc.documentURI, null, null); > - if (!this._shouldCheckUri(uri)) { > + let documentURI = Services.io.newURI(doc.documentURI, null, null); > + let baseURI = Services.io.newURI(doc.baseURI, null, null); > + if (!this._shouldCheckUri(documentURI) || !this._shouldCheckBaseUri(baseURI)) { I think we can just reuse `_shouldCheckUri` instead of duplicating it?
Attachment #8808095 - Flags: review?(gijskruitbosch+bugs) → review-
Comment on attachment 8808095 [details] Bug 1173823 - Always use document.baseURI as the uri parameter of _readerParse method, https://reviewboard.mozilla.org/r/91030/#review91208
Attachment #8808095 - Flags: review?(gijskruitbosch+bugs) → review+
Pushed by gijskruitbosch@gmail.com: https://hg.mozilla.org/integration/autoland/rev/e32d64d47c52 Always use document.baseURI as the uri parameter of _readerParse method, r=Gijs
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla52
QA Whiteboard: [good first verify]
Depends on: CVE-2017-7762
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: