Get rid of DOMParser::parseFromStream

NEW
Unassigned

Status

()

defect
P3
normal
7 years ago
a month ago

People

(Reporter: hsivonen, Unassigned)

Tracking

(Depends on 1 bug, Blocks 1 bug, {addon-compat})

Trunk
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(Not tracked)

Details

(Whiteboard: [Snappy:P2])

Attachments

(1 attachment)

nsIDOMParser::parseFromStream consumes a stream synchronously. First of all, it's bad to do sync IO. Secondly, XMLHttpRequest is supported for both Web content and chrome, so it doesn't make sense to put effort into maintaining a second chrome-only API.

We should get callers migrated to XHR and remove nsIDOMParser::parseFromStream.
Marking [Snappy], because this API may lure people into writing unsnappy code.
Whiteboard: [Snappy]
XHR needs a URI.  There are plenty of use cases in which one has a data stream but not a URI (e.g. pretty much anything involving nsITraceableChannel or storage streams)...

I agree the sync thing is bad, but having an async API for parsing from streams is desirable.
We could introduce a stram-argument entry point to XHR. Or a way to wrap a stream in a blob URL. Or a way to point to storage by URL. But there are uses in the code base that could use URLs just fine.
Sure.  My point is that we should have the replacement available _before_ we remove the existing API.
(In reply to Boris Zbarsky (:bz) from comment #4)
> Sure.  My point is that we should have the replacement available _before_ we
> remove the existing API.

I don't see code on m-c that wouldn't work with XHR. That is, unless I'm missing something, the parseFromStream-using code on m-c could already use URLs.

Am I missing something? Do you mean we need a replacement for extensions?
Yes, for extensions.  A quick look at the out-of-date addons mxr shows around 70 extensions using parseFromStream.  Some can do XHR pretty easily, but some (like the ones directly reading cache input streams) not so much...
Jorge, what would be good ways to
 1) ask extension developers to switch from nsIDOMParser::parseFromStream to *asynchronous* XMLHttpRequest so that their extensions don't make Firefox unsnappy
 2) to gain understanding about what extensions that can't migrate to XMLHttpRequest today are trying to accomplish
?

Boris, why was an extension parsing XML directly from a cache input stream?
> Boris, why was an extension parsing XML directly from a cache input stream?

https://mxr.mozilla.org/addons/source/9584/chrome/content/overlay.js#34 if you have access to it.  It's a function called getFromCache.  There's a corresponding storeInCache.  The extension uses a custom cache session name, so it's just using our cache infrastructure the way it's designed to be used.

This appears to be a Thunderbird/SeaMonkey extension to manage bugmail, in fact.
(In reply to Boris Zbarsky (:bz) from comment #8)
> The extension uses a custom cache session name,
> so it's just using our cache infrastructure the way it's designed to be used.

Is that design something we'd like extensions to continue using now that IndexedDB exists?
I have no idea, not least because my familiarity with IndexedDB is nil....
(In reply to Henri Sivonen (:hsivonen) from comment #7)
> Jorge, what would be good ways to
>  1) ask extension developers to switch from nsIDOMParser::parseFromStream to
> *asynchronous* XMLHttpRequest so that their extensions don't make Firefox
> unsnappy
>  2) to gain understanding about what extensions that can't migrate to
> XMLHttpRequest today are trying to accomplish
> ?

We can get the contact information of add-on developers that do this and send a message asking them these questions. I think in most cases they just chose something that worked for them and stuck with it.

Andrew can take care of this communication and come back with the results.
I've just send the following to all developers that had matching add-ons on mxr:
---
Hi, 

Mozilla developers are planning to remove the parseFromStream method of the nsIDOMParser interface in a future version of Firefox/Thunderbird.  We've identified your add-on as containing code which uses parseFromStream so we'd like to get some early feedback from you about this change.

Its being removed because the functionality is largely duplicated in XMLHttpRequest and parseFromStream is synchronous only, which can cause slowdown in Firefox/Thunderbird.   We believe most add-ons' code can be easily be adjusted to use XMLHttpRequest instead [https://developer.mozilla.org/En/XMLHttpRequest/Using_XMLHttpRequest] but if you have a particular use-case that would be a problem we'd like to hear about it.

The change is being tracked in this bug [https://bugzilla.mozilla.org/show_bug.cgi?id=717561] and you are welcome to comment on it, or reply to this email.

thanks.
Sure, parseFromStream is sync and so bad, but surely most developers who can will just end up switching to sync XHR which is I think is potentially worse as it silently spins a nested event loop. Or I guess they could just synchronously read the steam into a string and then use nsIDOMParser.parseFromString or do you plan to get rid of that too?
How do I know which of my add-ons you are referring to? Can I do the search that Andrew did?
(In reply to Michael Kaply (mkaply) from comment #14)
> How do I know which of my add-ons you are referring to? Can I do the search
> that Andrew did?

https://mxr.mozilla.org/addons/search?string=parseFromStream - if you have access to it (via Mozilla LDAP)
I do not.

I just grepped through my source code and couldn't find a reference. Can someone please do the search and let me know?
(In reply to Michael Kaply (mkaply) from comment #16)
> I do not.
> 
> I just grepped through my source code and couldn't find a reference. Can
> someone please do the search and let me know?

I guess the use in Operator (that's one of your add-ons, right?):
https://mxr.mozilla.org/addons/source/4106/chrome/content/operator.js#500
Posted file MXR results
See attached file for an output from MXR of the matching add-ons.  I've modified it remove the actual source lines and changed the links to point to the AMO pages.  It uses old style addon IDs rather than the newer slugs I'm afraid.
(In reply to Andrew Williamson [:eviljeff] from comment #12)
> I've just send the following to all developers that had matching add-ons on

Thank you.

(In reply to Dave Townsend (:Mossop) from comment #13)
> Sure, parseFromStream is sync and so bad, but surely most developers who can
> will just end up switching to sync XHR which is I think is potentially worse
> as it silently spins a nested event loop. 

Yes, migrating to synchronous XHR is worse. We should probably ban synchronous XHR with chrome privileges in due course.

The right fix is to migrate to *asynchronous* XHR (or when that's impossible e.g. because the stream isn't addressable by URL, say what the use case is so that we can figure out what to do about it).

> Or I guess they could just
> synchronously read the steam into a string

To prevent that, we'd need to remove the capability to open channels in the sync mode. :-(

> and then use
> nsIDOMParser.parseFromString or do you plan to get rid of that too?

No, that's part of the Web platform. Not planning to get rid of that one.

(In reply to Michael Kaply (mkaply) from comment #17)
> FYI, part of mine is a copy of this code in the tree:
> 
> http://mxr.mozilla.org/mozilla-central/source/toolkit/components/search/
> nsSearchService.js#1120

Bug 717562.

(In reply to Nils Maier [:nmaier] from comment #18)
> I guess the use in Operator (that's one of your add-ons, right?):
> https://mxr.mozilla.org/addons/source/4106/chrome/content/operator.js#500

It looks like in this case, the stream comes from a URL-addressable source and, therefore, could use XHR (in async mode, I hope).
Just an update on the mailing I sent on the 19th.  

In addition to the developers who added their comments to this bug I had a few responses direct but none that indicated they couldn't work around the removal (or the addon/code was obsolete anyway).  

If anything the problem will be developers switching from one bad sync-only method to a different bad sync method, as mentioned in other posts. 

I'll followup on the bug with any further emails I received but atm there doesn't seem to be an issue with the change being made in 13.
(In reply to Andrew Williamson [:eviljeff] from comment #21)
> Just an update on the mailing I sent on the 19th.  
> 
> In addition to the developers who added their comments to this bug I had a
> few responses direct but none that indicated they couldn't work around the
> removal (or the addon/code was obsolete anyway).  
> 
> If anything the problem will be developers switching from one bad sync-only
> method to a different bad sync method, as mentioned in other posts. 

From one sync-only method to a worse, potentially security bug inducing, sync-only method. I don't know why we're spending time on this.
(In reply to Dave Townsend (:Mossop) from comment #22)
> From one sync-only method to a worse, potentially security bug inducing,
> sync-only method.

In retrospect, it would have made sense to keep quiet about this until bug 721336 has landed.

> I don't know why we're spending time on this.

Synchronous IO is bad for UI responsiveness and plans for a major rewrite of XML parsing code (https://wiki.mozilla.org/Platform/XML_Rewrite) assume that parsing from a stream happens asynchronously. nsIDOMParser::parseFromStream is an exception.
(In reply to Henri Sivonen (:hsivonen) from comment #23)
> (In reply to Dave Townsend (:Mossop) from comment #22)
> > From one sync-only method to a worse, potentially security bug inducing,
> > sync-only method.
> 
> In retrospect, it would have made sense to keep quiet about this until bug
> 721336 has landed.

If we're actually planning on landing that I think we should do them at the same time (or at least in the same cycle) otherwise we run a great risk of users of one switching to the other only to have to switch again which isn't a great experience.

> > I don't know why we're spending time on this.
> 
> Synchronous IO is bad for UI responsiveness and plans for a major rewrite of
> XML parsing code (https://wiki.mozilla.org/Platform/XML_Rewrite) assume that
> parsing from a stream happens asynchronously. nsIDOMParser::parseFromStream
> is an exception.

Ok that makes more sense, thanks.

Updated

7 years ago
Whiteboard: [Snappy] → [Snappy:P2]
Component: DOM: Mozilla Extensions → DOM
Product: Core → Core
Summary: Get rid of nsIDOMParser::parseFromStream → Get rid of DOMParser::parseFromStream

Updated

10 months ago
Priority: -- → P3
Component: DOM → DOM: Core & HTML
Product: Core → Core
You need to log in before you can comment on or make changes to this bug.