Poorly constructed files can result in nsOggReader scanning most of the file looking for a valid granulepos. This results in playback hanging for a significant time on long files, with Firefox using 100% cpu.
Created attachment 634609 [details]
problem file generator
Problem file generator by Greg Maxwell. To make an example, comment out the ogg_page_checksum call on line 92 and run the output for a few cycles. This will generate a file with valid ogg data at the beginning, but append copies with invalid crc values which libogg with reject.
Example problem files produced this way by Greg Maxwell:
The same issue occurs with both Vorbis and Opus; likely all Ogg-container files are affected. The long-running scan happens inside RangeEndTime. Likely we should limit how far that function is willing to hunt for a seek point.
The problem here is that we rely on crc values from valid pages to know when we're hit data we've seen before. Since these files end with no value pages, we scan them for a sync point, back off 5k, and scan them all again, using quadratic time to find the end of the valid data.
Switching to exponential rather than linear backup allows playback to start instantly, for example.
(In reply to Ralph Giles (:rillian) from comment #2)
> Switching to exponential rather than linear backup allows playback to start
> instantly, for example.
You're talking about changing the mustBackOff loop in EndRangeTime? If you backed off this way you'd want to do a linear scan after the sync to ensure you get the last page with a granulepos in the range, else you won't be getting the last granulepos in the range.
I don't think this is worth spending much (perhaps any) time fixing this, given that Ogg isn't our focus moving forward, and this is only an issue for invalid files.
Created attachment 634672 [details] [diff] [review]
Proposed fix. This restores O(N) when scanning backward from the end. There's still a pause, but it's much better. Does this look like a useful direction?
Comment on attachment 634672 [details] [diff] [review]
Review of attachment 634672 [details] [diff] [review]:
I think that approach is fine.
Created attachment 635516 [details] [diff] [review]
cleaned up patch
Updated patch. This is less aggressive, but won't fail to read long pages at the end of the file.
There are still some problems with the invalid files after applying this patch, but it addresses the quadratic time issue, reducing the time after seek from minutes to seconds.
With the opus example there's a several second delay after any seek while it searches the file again, and playback can become completely stuck with the vorbis example. Nevertheless I'm happy with the mitigation it provides. There are no great solutions here; we can't bisection-search to find the last valid page in a sea of garbage data, so we have to either scan or give up.
Media tests are green. Requesting review for landing.
Created attachment 636353 [details] [diff] [review]
cleaned up patch
Sorry, forgot to update with the latest version of the patch. This is like the previous one, but only reads on maxOggPageLength, not two. We would need two if we were doing random access, in case we land in the middle of the first of two long pages, but since we're scanning backward, one page length is sufficient.
This one is https://tbpl.mozilla.org/?tree=Try&rev=3b02234174a3
Thanks, Chris. Ready to land.
I've not added a test for this; the issue is with corrupt files and must be several MB to make the hang obvious.
Media tests on the try push are green.
Pushed to mozilla-inbound: