nsMediaChannelStream::OnStopRequest nulls out mChannel too early

VERIFIED FIXED

Status

()

defect
VERIFIED FIXED
11 years ago
10 years ago

People

(Reporter: kinetik, Assigned: mayhemer)

Tracking

({verified1.9.1})

unspecified
x86
macOS
Points:
---
Dependency tree / graph
Bug Flags:
blocking1.9.1 +
in-testsuite +

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(1 attachment, 6 obsolete attachments)

This causes GetPrincipal() to return null after the load completes.  This causes the two video canvas reftests to fail with a security error.  The tests have been marked random for now, so they need to be reenabled when this bug is fixed.
Flags: blocking1.9.1?
Flags: blocking1.9.1? → blocking1.9.1+
Blocks: 487671
Duplicate of this bug: 487671
Are STR from that bug reliable way to reproduce?
The steps from that bug can be simplified:

1. Visit http://tapper-ware.net/devel/js/JS.StereoVideo/index2.xhtml
2. Wait for the video to completely download
3. Click play

You should see the canvas below the video update as the video plays.  Instead, due to this bug, the canvas won't update, and you will see security errors logged in the error console.
Note that nulling out the channel is probably correct (though not really necessary to avoid leaks).  The real issue is not saving off the principal before doing that.
Does the channel start returning null from GetPrincipal?
No, if you have an nsIChannel, even one that's done getting data, you can get its principal via the security manager API.

So if this code did hold on to mChannel that would certainly fix the principal issue.
I have a patch for this. Soon will submit after I check there are no leaks/crashes.
Assignee: nobody → honzab.moz
Status: NEW → ASSIGNED
Hmm... when I run mochitest and open the URL in the mochitest profile (i.e. clear one) video is not streaming (there is no pre-load). In a completely new profile it works. Having trunk build few hours old from now. Is it intention, some setting for mochitest environment? user_pref("media.cache_size", 100)?
That setting is to provide better testing of the media cache. Video should still play correctly though, what exactly is the problem you're seeing?
(In reply to comment #10)
> That setting is to provide better testing of the media cache. Video should
> still play correctly though, what exactly is the problem you're seeing?

OK, just checked it's not caused by my patch. When I open the URL from comment 3, video first frame appears. Then, when I do this in a clean profile I can see the buffer indicator is growing as the whole video file is being downloaded, immediately after I open the URL. But, when I open the URL in mochitest profile, the video file is not being automatically downloaded; only after I press play button it starts to download and play with it.
In the mochitest profile, a small amount of data will be downloaded and then we'll pause because the media cache is full. Then when you press play we'll be able to discard some data and continue the download. This is intended behaviour when the cache size is small.
Posted patch v1 (obsolete) — Splinter Review
I'm saving mPrincipal in OnStartRequest(). I am not sure if you expect this method be called only on the main thread.
Attachment #373115 - Flags: review?(roc)
Whiteboard: [has patch][needs review roc]
So what happens with the principal in the cached-data case, then?
(In reply to comment #14)
> So what happens with the principal in the cached-data case, then?

cache-data case means to read data from mCacheStream or when an underlying channel is loading from a cache, e.g. an http cache?
The former: the case where you're clearing mPrincipal and mChannel in this patch.
I'm confused ... where is he clearing mPrincipal and mChannel in this patch? The only place I see is in nsMediaFileStream::Close, which seems OK ... nsMediaFileStream is for file: URIs and has nothing to do with caching...
Comment on attachment 373115 [details] [diff] [review]
v1

This is fine as far as I'm concerned but I'd like Boris to sign off too
Attachment #373115 - Flags: review?(roc)
Attachment #373115 - Flags: review?(bzbarsky)
Attachment #373115 - Flags: review+
Ah, ok.  I was fooled by the NS_ERROR_PARSED_DATA_CACHED nsresult used there.

I'll look in more detail tomorrow, assuming I get through all of Blake's wrapper stuff...
(In reply to comment #16)
> The former: the case where you're clearing mPrincipal and mChannel in this
> patch.

Sorry for later answer. As I understand how the media channel works it opens an underlying channel (a request) and caches the content in mCacheStream, that is actually a buffer. When user wants to play the media a decoder is using data in the cache stream. When there are not any data in the buffer for a segment user wants to play media channel opens a new request and fills the buffer - the cache stream. During all this time there is still mPrincipal filled (from the moment we get OnStartRequest) and if not it is always attempted to get it from a channel if there is a channel (as before).

I didn't find another place where the channel is cleared (and where mPrincipal should be discarded) then in nsMediaFileStream::Close. I am not sure it's the best place but as I see the code it seems appropriate to clear it on that place.
So the point is that whenever we're doing Close() there that means we have a new channel we're using to load data?

Note that for this to work correctly for local files, the media code probably needs to be setting some owners on those channels; that should likely be a separate bug.
I guess a better question is... where is nsMediaFileStream::Close called from?  And wouldn't it make more sense to clear mPrincipal when assigning a new non-null mChannel if that's why it's ok to clear it?
We are assigning a new channel in nsMediaChannelStream::CacheClientSeek. :Close is called when decoder is being shut down what happens during removal of the media element or page close, not sure when src of video element is changed by a script.

The patch is wrong. I'd say we should null mPrincipal in all Close method implementations. When element is somehow replaced or URL is replaced channel may for some time keep a wrong principal.

bz, what do think?
I have no idea.  How does the Close() method relate to the ability to get data out of the video element.

Whenever one can get data out of the element, we need to be able to get the principal of that data.  When the element has no data, it's fine to be returning null.
Close() is only called when we shut down the decoder. At that point we should not be painting video to the screen or the canvas anymore. So it's OK to null things out there and not return a principal. nsMediaChannelStream::CloseChannel() happens during CacheClientSeek, but we create a new channel there right away so setting the channel to null in CloseChannel shouldn't be a problem.

I think the main issue for this bug was just nulling out mChannel in OnStopRequest. If we just stop doing that, this bug should be fixed.

A deeper issue is that data in the cache might not all be associated with the same principal. We might read some data with one principal, seek and get redirected, then read data for some other principal. That's really a different bug; I'm not sure what the best way to handle it is. I suppose we should track all the principals of all the channels we've used and return their intersection or something like that.
Posted patch v2 (obsolete) — Splinter Review
A bit better patch. I clear mPrincipal in all Close() methods and when replacing a channel in CacheClientSeek().

Just for curiosity, leaving the channel in OnStopRequest fixes the bug too, and there are no leaks. Maybe do it that way?
Attachment #373115 - Attachment is obsolete: true
Attachment #373709 - Flags: review?(bzbarsky)
Attachment #373115 - Flags: review?(bzbarsky)
Whiteboard: [has patch][needs review roc] → [has patch][needs review bz]
That sounds like what roc is proposing to me.
Posted patch v3 [Checkin comment 33] (obsolete) — Splinter Review
OK, then this is the second variant. All content/media tests are passing with it and there are no leaks.
Attachment #373716 - Flags: review?(bzbarsky)
Attachment #373716 - Attachment is patch: true
Attachment #373716 - Attachment mime type: application/octet-stream → text/plain
That last patch looks fine to me on its own, but someone very familiar with the video networking stuff should review it to make sure that nothing is depeding on mChannel being null.
Comment on attachment 373709 [details] [diff] [review]
v2

Looks like this approach is dead now.
Attachment #373709 - Attachment is obsolete: true
Attachment #373709 - Flags: review?(bzbarsky)
Comment on attachment 373716 [details] [diff] [review]
v3 [Checkin comment 33]

actually, Boris said this was OK, so let's not delay getting this landed.
Attachment #373716 - Flags: review?(bzbarsky) → review+
Whiteboard: [has patch][needs review bz] → [has patch][needs landing]
Comment on attachment 373716 [details] [diff] [review]
v3 [Checkin comment 33]

http://hg.mozilla.org/mozilla-central/rev/524233971e2c
Attachment #373716 - Attachment description: v3 → v3 [Checkin comment 33]
Status: ASSIGNED → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Whiteboard: [has patch][needs landing] → [has patch][needs 191 landing]
To be clear, that patch is OK from the necko end and from the "getting a principal" end.  I can't speak to its interaction with the other video stuff.

Also, this should have a test, no?
Flags: in-testsuite?
Posted patch test v1 (obsolete) — Splinter Review
Test based on http://tapper-ware.net/devel/js/JS.StereoVideo/index2.xhtml
Attachment #374868 - Flags: review?(roc)
Posted patch v3+test for 1.9.1 (obsolete) — Splinter Review
May land on 1.9.1 after the test gets reviewed.
Whiteboard: [has patch][needs 191 landing] → [has patch][needs review roc][needs 191 landing]
Comment on attachment 374868 [details] [diff] [review]
test v1

+var currentMediaCache = prefs.getIntPref("media.cache_size");
+prefs.setIntPref("media.cache_size", 50*1024);

Just do <script src="use_large_cache.js"> like the other tests.

I think it would be best to do the drawing off the "load" event, not canplaythrough or timeupdate. It's quite likely that this test will pass on trunk if drawWindow fires while we're still downloading and holding the mChannel reference.
Whiteboard: [has patch][needs review roc][needs 191 landing] → [has patch][needs 191 landing]
Robert, rorry for such a long time to react, but I missed bug mail with your last comment... 

Is there something different on 1.9.1 with the events? I'm using canplaythrough that is called when load of the data is complete (we are pass OnStopRequest, where the code change is). It perfectly works on trunk, is the behavior different on 1.9.1? Or you mean that canplaythrough can pass before we load all data?

Also, it seems the patch has already been landed on 1.9.1:
http://hg.mozilla.org/releases/mozilla-1.9.1/rev/cb68da3b406a

So, we are just about to push the test.
Keywords: fixed1.9.1
Whiteboard: [has patch][needs 191 landing] → [missing test]
canplaythrough can be called before the load of the data is complete. We'll call it as soon as we think we can play all the way through without stalling --- i.e., when we think we'll finish loading the data at about the same time as we finish playing.
I cannot reproduce the problem when I rollback the patch on trunk. Something has changed?
Yes, principals are now tracked through nsMediaCache. Your patch here is probably not needed anymore.
So, only thing left to do here is to have a test for 1.9.1.
Posted patch test v0 (obsolete) — Splinter Review
A mochitest that paints the video to a canvas after the video has completely loaded.

According to the spec, it's only legal to paint the video to the canvas when readyState >= HAVE_CURRENT_DATA (i.e. when the current frame is available), and it's possible for the load event to fire before that predicate holds, so the test will run the paint off of the loadeddata event if that occurs.

I don't think it's possible for load to fire before loadeddata in our current code, because we deliberately suppress the load event until we fire loadedmetadata, and we always fire loadedmetadata and loadeddata at the same time.  I can't see anything in the spec that requires the former, and I'm sure that the latter is not required either.  Am I right?  If so, should we be proposing this change/simplification to the HTML 5 WG?
Attachment #373716 - Attachment is obsolete: true
Attachment #374868 - Attachment is obsolete: true
Attachment #374879 - Attachment is obsolete: true
Attachment #374868 - Flags: review?(roc)
You mean, should we propose that the spec require "load" to fire after "loadeddata"?

I don't feel very strongly about it, but I suppose it would be a good idea.
That, or at least after loadedmetadata.  That'd make a media load event consistent with an img load event (which fires when the image is "available", meaning the metadata is known), but that's the only argument I can come up with and it seems fairly week.  Looking at the other places a load event can be fired in HTML5:

link, style: when resource is fetched
script: when script created (including parsing)
object: once completely loaded (not really clear on this one)
iframe: when contented loaded and in frame loads have fired
img, input button: when metadata available
document: once no longer delayed

...it seems like there's scope but no strong argument to delay it for media elements.

I've emailed the WHATWG list.  I guess we won't need to take any action whatever the outcome is, but if we do I'll file a new bug.
Whiteboard: [missing test] → [missing test][needs landing]
Posted patch test v1Splinter Review
Anne van Kesteren pointed out on WHATWG that the spec already requires the events to be ordered, so you will always see loadedmetadata, loadeddata, then load.
Attachment #379434 - Attachment is obsolete: true
Temporarily marking reopened so this shows up on the radar for [needs landing] or checkin-needed searches, since the tree is restricted and I might forget about this before I can check it in.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Can we move the tests to a new bug and close this one? It's blocking1.9.1+ so reopening it will confuse drivers.
Good point.  Sorry, that was dumb.  Bug 495165.
Status: REOPENED → RESOLVED
Closed: 11 years ago10 years ago
Resolution: --- → FIXED
verified FIXED using STR from comment #3

Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.5; en-US; rv:1.9.2a1pre) Gecko/20090528 Minefield/3.6a1pre ID:20090528112613

and

Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.5; en-US; rv:1.9.1pre) Gecko/20090602 Shiretoko/3.5pre ID:20090602031310
Status: RESOLVED → VERIFIED
Bug 495165 added a mochitest for this, pushed in rev bd1d2c9c5799.
http://hg.mozilla.org/mozilla-central/rev/bd1d2c9c5799
Flags: in-testsuite? → in-testsuite+
Whiteboard: [missing test][needs landing]
You need to log in before you can comment on or make changes to this bug.