Closed Bug 1123040 Opened 10 years ago Closed 4 years ago

Videos are saved with a wrong file name

Categories

(Firefox for Android Graveyard :: Download Manager, defect)

All
Android
defect
Not set
normal

Tracking

(fennec+)

RESOLVED INCOMPLETE
Tracking Status
fennec + ---

People

(Reporter: marco, Unassigned, Mentored)

References

Details

(Whiteboard: [good next bug])

Attachments

(2 files)

That link is no good.
Attached video 2015-11-23-161706.webm
Steps to reproduce: 1) Open the video in Firefox 2) Long press on the video 3) Click "Save Video" The video is saved as "attachment.cgi". Aaron, can you reproduce?
Flags: needinfo?(aaron.train)
Flags: needinfo?(aaron.train) → needinfo?(teodora.vermesan)
Tested using: Device: LG Nexus 4 (Android 5.1) On Firefox for Android 43.0b4, the video is saved as "attachment.cgi". On Firefox for Android 45.0a1 (2015-11-23), the video is saved as "DwGsYL.htm".
Flags: needinfo?(teodora.vermesan)
Version: unspecified → Trunk
tracking-fennec: --- → 46+
Assignee: nobody → margaret.leibovic
(In reply to Teodora Vermesan (:TeoVermesan) from comment #5) > Tested using: > Device: LG Nexus 4 (Android 5.1) > > On Firefox for Android 43.0b4, the video is saved as "attachment.cgi". > On Firefox for Android 45.0a1 (2015-11-23), the video is saved as > "DwGsYL.htm". In my local build (today's fx-team) I see this saved as "attachment.cgi". However, on desktop Nightly, I see Firefox offer to save the video as "2015-11-23-161706.webm". Debugging the code that saves this video, it seems like it bases the file name on the URI that's passed in, and this logic hasn't changed in a long time: http://hg.mozilla.org/mozilla-central/annotate/0711218a018d/toolkit/content/contentAreaUtils.js#l363 Our logic to call this function also hasn't chnaged in a long time: http://hg.mozilla.org/mozilla-central/annotate/0711218a018d/mobile/android/chrome/content/browser.js#l996 Maybe the problem has to do with the target src value we're getting from the video? However, I tried debugging this by saving a Facebook video from my own account, and it saves an .mp4 file and works as expected, even though the URL we pass in to internalSave isn't an .mp4 URL. Paolo, I see you've touched this contentAreaUtils.js code. Do you have any idea what could be going on here?
Flags: needinfo?(paolo.mozmail)
On Desktop, for "Save Link As" there is a "saveHelper" function used specifically for the context menu: http://mxr.mozilla.org/mozilla-central/source/browser/base/content/nsContextMenu.js#1219 This retrieves not only the MIME type but also the Content-Disposition header. If the network operation times out initially, we offer to save the link with a name based on the URL, this is why the results may be inconsistent when testing on Desktop. I think the comment here refers to the fact that this wasn't implemented for Android: http://hg.mozilla.org/mozilla-central/annotate/0711218a018d/mobile/android/chrome/content/browser.js#l999
Flags: needinfo?(paolo.mozmail)
Debugging through the save process, this is the function which ends up generating actual filename: https://dxr.mozilla.org/mozilla-central/source/toolkit/content/contentAreaUtils.js#1018 We explicitly pass in content-disposition object as null, so it ends up falling through into its other options. Here's the decision path: 1) "filename" or (if not present) "name" properties from content-disposition header - sidenote, "name" property isn't defined anywhere in the w3c specs, as far as I can tell... are there really buggy servers out there spitting out "name" property in this header? or is this something to clean up? 2) document title 3) actual filename 4) docTitle(?) 5) caller-provided default 6) dir name if dir 7) host name, and if extension is missing "htm" gets appended as default by the caller, in this case 8) localized default save file name 9) "index" as the last resort. As one would expect, this wouldn't work too well once you get down a few levels. In my tests videos off of Facebook (w/o a filename in the uri) are saved as "m.facebook.com.htm" (option #7, host name), and if you open them up display as plaintext (once they load, it took forever on a few 2mb videos). A path forward seems to be to obtain the Content-Disposition header prior to calling internalSave, as Paolo mentioned. Margaret, out of curiosity, can you find a link to that FB video that got saved with a correct extension? It'll be interesting to see why that worked. Looking through commit logs for files involved, it seems that it has always been like this on Fennec. Here's the commit that added this functionality (and also last time this code was touched): https://hg.mozilla.org/mozilla-central/rev/ce79b39c6e0e
Or, get whatever we might need from the cache.
(In reply to :Grisha Kruglov from comment #8) > Debugging through the save process, this is the function which ends up > generating actual filename: > > https://dxr.mozilla.org/mozilla-central/source/toolkit/content/ > contentAreaUtils.js#1018 > > We explicitly pass in content-disposition object as null, so it ends up > falling through into its other options. Here's the decision path: > 1) "filename" or (if not present) "name" properties from > content-disposition header > - sidenote, "name" property isn't defined anywhere in the w3c specs, as > far as I can tell... are there really buggy servers out there spitting out > "name" property in this header? or is this something to clean up? > 2) document title > 3) actual filename > 4) docTitle(?) > 5) caller-provided default > 6) dir name if dir > 7) host name, and if extension is missing "htm" gets appended as default by > the caller, in this case > 8) localized default save file name > 9) "index" as the last resort. > > As one would expect, this wouldn't work too well once you get down a few > levels. In my tests videos off of Facebook (w/o a filename in the uri) are > saved as "m.facebook.com.htm" (option #7, host name), and if you open them > up display as plaintext (once they load, it took forever on a few 2mb > videos). > > A path forward seems to be to obtain the Content-Disposition header prior to > calling internalSave, as Paolo mentioned. > > Margaret, out of curiosity, can you find a link to that FB video that got > saved with a correct extension? It'll be interesting to see why that worked. I think I remember it being this video: https://www.facebook.com/macleans/videos/10153692342753950/ But I can't get it to work now. It even offered two different file names, depending on the flow to get there. First, I had to log in, and it saved it as "login.php", then when I reloaded it saved it as "m.facebook.com.htm". Although the download isn't even completing properly, so I can't test the result :/ Given that this has always been the case, I don't think this needs to track 46, but this would be good to fix in the 48-49 timeline, to go along with our improved video and saving efforts.
Assignee: margaret.leibovic → nobody
tracking-fennec: 46+ → ?
This changeset attempts to obtain cache information for the resource, and if cache is available (as it should be, at this point in the flow), try to obtain `content-disposition` and `content-type` headers which we can pass along - they will be used to construct a proper filename. Solution seems to work well for URLs which won't redirect (e.g. [1]). Nick suggested that browser toolkit might be a good home for this code? While testing this solution on a broader set of urls, I stumbled upon what I think is a separate problem. Example: 1) [2] - produces a 302 redirect to [1]. Video element's `currentSrc` is [2], while `baseURI` is [1]. You'd expect them both to be [1] 2) if [1] is opened directly, both `currentSrc` and `baseURI` are [1]. This discrepancy at the end of the 302 redirect chain seems like a bug to me, and it's preventing solution above from working correctly on redirecting URIs. If I adjust the solution to use `baseURI` in offending use cases, things work correctly. Margaret, thoughts? Do you know whom can I ping to clarify redirect behaviour described above? [1] https://bug1123040.bmoattachments.org/attachment.cgi?id=8690861 [2] https://bugzilla.mozilla.org/attachment.cgi?id=8690861
Flags: needinfo?(nalexander)
Flags: needinfo?(margaret.leibovic)
(In reply to :Grisha Kruglov from comment #12) > This changeset attempts to obtain cache information for the resource, and if > cache is available (as it should be, at this point in the flow), try to > obtain `content-disposition` and `content-type` headers which we can pass > along - they will be used to construct a proper filename. > > Solution seems to work well for URLs which won't redirect (e.g. [1]). Nick > suggested that browser toolkit might be a good home for this code? Paolo would be a good person to ask about this. > While testing this solution on a broader set of urls, I stumbled upon what I > think is a separate problem. > Example: > 1) [2] - produces a 302 redirect to [1]. Video element's `currentSrc` is > [2], while `baseURI` is [1]. You'd expect them both to be [1] > 2) if [1] is opened directly, both `currentSrc` and `baseURI` are [1]. > > This discrepancy at the end of the 302 redirect chain seems like a bug to > me, and it's preventing solution above from working correctly on redirecting > URIs. If I adjust the solution to use `baseURI` in offending use cases, > things work correctly. > > Margaret, thoughts? Do you know whom can I ping to clarify redirect > behaviour described above? Honza should be able to help, or at least redirect this to someone else who could.
Flags: needinfo?(paolo.mozmail)
Flags: needinfo?(margaret.leibovic)
Flags: needinfo?(honzab.moz)
> (In reply to :Grisha Kruglov from comment #12) > > Solution seems to work well for URLs which won't redirect (e.g. [1]). Nick > > suggested that browser toolkit might be a good home for this code? Not sure. Why are you doing something different than what the Desktop context menu code does, that is opening a channel to the provided URL and using it if it doesn't time out?
Flags: needinfo?(paolo.mozmail)
(In reply to :Paolo Amadini from comment #14) > > (In reply to :Grisha Kruglov from comment #12) > > > Solution seems to work well for URLs which won't redirect (e.g. [1]). Nick > > > suggested that browser toolkit might be a good home for this code? > > Not sure. Why are you doing something different than what the Desktop > context menu code does, that is opening a channel to the provided URL and > using it if it doesn't time out? Grisha and I discussed this, and there's an equally valid question: why is this trying to open a channel rather than reading the cache, like the equivalent image functions do? Does opening the channel hit the cache if it's there? I.e., so that the Desktop code is strictly more functional than just hitting the cache? Do we think that, on mobile devices with slow connections, this will *ever* actually open the connection before timing out?
Flags: needinfo?(nalexander)
(In reply to Nick Alexander :nalexander from comment #15) > Why is this trying to open a channel rather than reading the cache, like the > equivalent image functions do? I'll try to answer best as I can, without having taken the time to look in detail or search Bugzilla about why these choices were made at the time. Sometimes it's just that we didn't have the right APIs when the code was written. I guess that the image functions you're talking about are optimized for getting subresources, for which we prefer what's displayed on the screen, while the context menu code for links is optimized for getting responses that are not currently displayed or even loaded, and may be re-requested even if the URL is the same, depending on what previous headers said to do. Maybe the Android case discussed in this bug is closer to the first Desktop case than the second one. > Does opening the channel hit the cache if it's there? It should, if using the cache is allowed by the load. > Do we think that, on mobile devices with slow > connections, this will *ever* actually open the connection before timing out? The difference in connection is a good reason to either use a different timeout or a different approach on Android. Maybe we should err on the side of getting information about the name from a stale cache entry than doing a load. However, with the current patch we'd still trigger a load after potentially getting the file name anyways, so I don't see why we couldn't just do a load that's allowed (or prefers) to use the cache and get the information from there.
Thanks, Paolo! Regarding opening a channel, the downsides/upsides as I see them are: - most of the time it'll just end up hitting the cache, and a cache miss will likely happen because of a spotty connection, in which case the channel will be unlikely to succeed on a short timeout - upside of the above is that in case of a cache miss, it still has a chance of succeeding..? - it's conceptually more complicated, device is doing more work, possibly less power-efficient - particularly in case of a cache-miss Cache-directly: - simple, less code/work involved, device ends up doing less stuff - will work most of the time, and on a cache miss, network request is also unlikely to succeed in time - similar to how other resources are processed (e.g. images) The whole "power efficient" thing is a bit of a moot point (user is about to download a video over, say, cell data...), but I thought was still worth mentioning. internalSave, which we invoke to do the saving with the information we gather here, goes on to initialize a transfer, and starts up a "persist" object which does the save. I don't imagine that if we open a channel to obtain header info it will somehow get re-used down the road for the transfer - without rewriting bunch of that code? As you said, we're more inline here with the image use case anyway, so I would just go with that approach. Which leaves us with a seemingly malfunctioning 302-redirect chain (second half of comment 12).
(In reply to :Grisha Kruglov from comment #17) > internalSave, which we invoke to do the saving with the information we > gather here, goes on to initialize a transfer, and starts up a "persist" > object which does the save. I don't imagine that if we open a channel to > obtain header info it will somehow get re-used down the road for the > transfer - without rewriting bunch of that code? Heh, internalSave wouldn't, but Desktop passes the channel it opened to nsIExternalHelperAppService and the channel provides the data that gets saved, similarly to a navigation to a page with "Content-Disposition: attachment". http://mxr.mozilla.org/mozilla-central/source/browser/base/content/nsContextMenu.js#1293 Only if it times out, it goes through internalSave: http://mxr.mozilla.org/mozilla-central/source/browser/base/content/nsContextMenu.js#1305 Really, in an ideal state all of this should be doable with one "Downloads.fetch()" call with the right options. We might get there in the next few months, currently Andrew Swan is looking into supporting the "chrome.downloads" API for WebExtensions and having Downloads.jsm support this use case may be one of the requirements.
See Also: → 1245652
(In reply to :Margaret Leibovic from comment #13) > (In reply to :Grisha Kruglov from comment #12) > > 1) [2] - produces a 302 redirect to [1]. Video element's `currentSrc` is > > [2], while `baseURI` is [1]. You'd expect them both to be [1] > > 2) if [1] is opened directly, both `currentSrc` and `baseURI` are [1]. > > > > Honza should be able to help, or at least redirect this to someone else who > could. I have no idea where from you are taking currentSrc or baseURI. That is something docshell or browser code creates, not necko. You have to turn to boris or somebody who knows the code that gives you these properties. Looking at the patch: it's not ensured at all you get a cache entry. you should copy the desktop code. Instead of opening a cache entry, you should open a channel for the url with LOAD_FROM_CACHE flag that will use anything that is currently in the HTTP cache (no revalidation) or go to the network for the resource. I think that desktop is actually starting the download before you select the name, it can be clearly visible that after you confirm the save-as dialog, the download progress is long after 0%.
Flags: needinfo?(honzab.moz)
Assignee: nobody → gkruglov
tracking-fennec: ? → +
Assignee: gkruglov → nobody
Mentor: gkruglov
Whiteboard: [good next bug]
I am experiencing the same issue with audio files. You can reproduce the issue with this podcast: http://www.hookedmagazin.de/wp-content/podcasts/Hooked_FM_Ep092.mp3?ptm_source=download&ptm_context=select-button&ptm_file=Hooked_FM_Ep092.mp3 When I try to download the audio file by long-pressing on the audio player controls (that are displayed when opening the link above) and select to save the audio file, Firefox downloads the file as "www.hookedmagazin.de.htm". It does actually download the audio file, though, and I can play it by giving it an .mp3 suffix, so only the filename it has been given is wrong.
We have completed our launch of our new Firefox on Android. The development of the new versions use GitHub for issue tracking. If the bug report still reproduces in a current version of [Firefox on Android nightly](https://play.google.com/store/apps/details?id=org.mozilla.fenix) an issue can be reported at the [Fenix GitHub project](https://github.com/mozilla-mobile/fenix/). If you want to discuss your report please use [Mozilla's chat](https://wiki.mozilla.org/Matrix#Connect_to_Matrix) server https://chat.mozilla.org and join the [#fenix](https://chat.mozilla.org/#/room/#fenix:mozilla.org) channel.
Status: NEW → RESOLVED
Closed: 4 years ago
Resolution: --- → INCOMPLETE
Product: Firefox for Android → Firefox for Android Graveyard
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: