Open Bug 1633191 Opened 5 years ago Updated 7 months ago

browser.downloads.download download failing with error "CRASH"

Categories

(Toolkit :: Downloads API, defect, P3)

x86_64
Linux
defect

Tracking

Tracking Status
firefox76 --- wontfix
firefox77 --- wontfix
firefox78 --- fix-optional

People

(Reporter: brunoaiss, Unassigned)

Attachments

(2 files)

Attached file downloadCrasher.zip

To reproduce:

  1. Install this addon.
  2. Navigate to https://www.youtube.com/
    Some downloads will fail with the error "CRASH".

Expected:
All downloads happen and succeed.

Why a bug report:
I'm filing this bug report because I was advised to at https://matrix.to/#/!CuzZVoCbeoDHsxMCVJ:mozilla.org/$lbIVduc2l1m1uqFMLGz7Zsh4PedogUoqIZUYoQnuhaM?via=mozilla.org&via=humanoids.be

Flags: needinfo?(andrew.swan)
Product: Core → WebExtensions

The way the error property on download objects is derived is pretty simplistic: https://searchfox.org/mozilla-central/rev/7fd1c1c34923ece7ad8c822bee062dd0491d64dc/toolkit/components/extensions/parent/ext-downloads.js#175-183

Even after fixing the attached extension (renaming manifest.js to manifest.json) I can't reproduce this reliably -- I did see it happen once when I first tried but now, trying to examine the download error more closely, I can't reproduce it. If you can provide consistent STR, perhaps we can improve the presentation of whatever error is being raised in this case.

Flags: needinfo?(andrew.swan) → needinfo?(brunoaiss)
Attached file downloadCrasher.zip

Try disabling browser cache.
I'm getting the problem when the browser has to download new files in parallel.

Try this extension version. It is giving me better results.

Flags: needinfo?(brunoaiss) → needinfo?(andrew.swan)

I tried the new extension and cleared the cache, and I still can't reproduce it.
If you can reproduce this consistently, you could open the Browser Toolbox and set a breakpoint here: https://searchfox.org/mozilla-central/source/toolkit/components/extensions/parent/ext-downloads.js#182
Then you could examine the error.

Flags: needinfo?(andrew.swan)

OK. Thank you. I'll report back when I can.

We should probably improve the error message here.

Also waiting on reliable STR from reporter.

Severity: -- → S2
Flags: needinfo?(brunoaiss)
Priority: -- → P5

For reference, here are the values that the equivalent API in Chrome may produce: https://developer.chrome.com/extensions/downloads#type-InterruptReason

Just an update:
I did not try breakpoints yet. Only tried for a while to replicate.
In my findings, it appears that the more powerful or lean the PC running Firefox is (at least on Linux), the harder it is to force the error to happen.

I'll try getting time this weekend to check this again.
I'll leave the needinfo in place because I haven't done the debugging yet.

This is the information I have:

this.download.error:

becauseSourceFailed: false becauseTargetFailed: false message: "[Exception... \"Failure\" nsresult: \"0x80004005 (NS_ERROR_FAILURE)\" location: \"JS frame :: resource://gre/modules/DownloadCore.jsm :: DownloadError :: line 1629\" data: no]" name: "DownloadError" result: 2147500037 stack: "DownloadError@resource://gre/modules/DownloadCore.jsm:1664:16\nonSaveComplete@resource://gre/modules/DownloadCore.jsm:2037:38\n" ​

I went to resource://gre/modules/DownloadCore.jsm:2037 but it was a dead end. I only have the same information I already had. Same for resource://gre/modules/DownloadCore.jsm:1629
All I can say is that it is some sort of generic undefined error with the number 2147500037

Flags: needinfo?(brunoaiss) → needinfo?(andrew.swan)

Wow... The data got completely joined into a single line!
The preview showed it fine. I don't get it.
Here it is again.

becauseSourceFailed: false
becauseTargetFailed: false
message: "[Exception... "Failure" nsresult: "0x80004005 (NS_ERROR_FAILURE)" location: "JS frame :: resource://gre/modules/DownloadCore.jsm :: DownloadError :: line 1629" data: no]"
name: "DownloadError"
result: 2147500037
stack: "DownloadError@resource://gre/modules/DownloadCore.jsm:1664:16\nonSaveComplete@resource://gre/modules/DownloadCore.jsm:2037:38\n"
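
The decimal `result` above and the hexadecimal nsresult in the message are the same value. A minimal check, where `formatNsresult` is a hypothetical helper name used only for illustration:

```javascript
// Hypothetical helper: format a numeric result code as an 8-digit hex string.
function formatNsresult(result) {
  // ">>> 0" coerces to an unsigned 32-bit integer before formatting.
  return "0x" + (result >>> 0).toString(16).padStart(8, "0");
}

console.log(formatNsresult(2147500037)); // prints "0x80004005"
```

So 2147500037 is just NS_ERROR_FAILURE, the generic failure code, and carries no extra detail on its own.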

Looks like the error is coming from here:
https://searchfox.org/mozilla-central/rev/ea7f70dac1c5fd18400f6d2a92679777d4b21492/toolkit/components/downloads/DownloadCore.jsm#2043

I'm not sure how to get any more information out of this error, redirecting the needinfo to :mak, perhaps he can help.

Flags: needinfo?(andrew.swan) → needinfo?(mak)

Unfortunately the information at hand doesn't allow us to get more details about that error. If comment 10 points to the right location, and both becauseSourceFailed and becauseTargetFailed are false, it means we just got a generic error, so we can't tell much about where or what happened.

Looking at netwerk/base/BackgroundFileSaver.cpp, there are many code points where it just does an NS_ENSURE_SUCCESS (ProcessStateChange for example), and it doesn't LOG in many of them... Adding logging in the many places where the background saver can fail would help figure out cases like this in the future. One could just enable logging on the fly and reproduce the bug.
Using what we have, the only way to figure out something here would be to reproduce the bug in a debug build and set a breakpoint in BackgroundFileSaver.cpp. That would require reproducing the bug under rr or a similar reverse debugger, so one can walk back to when mStatus was set.
I didn't try to reproduce this yet, I'll try and let you know if I can...
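
As an illustration only, the sketch below shows how a simple derivation based on those two flags could end up reporting "CRASH" for any generic failure. `deriveErrorString` is a hypothetical name, and the strings other than "CRASH" are assumptions borrowed from the Chrome InterruptReason list; the actual logic is in the ext-downloads.js link earlier in this bug.

```javascript
// Hypothetical sketch, not the actual Firefox implementation.
// `error` stands for an object shaped like the DownloadError dumped above.
function deriveErrorString(error) {
  if (!error) {
    return null; // no error recorded
  }
  if (error.becauseSourceFailed) {
    return "NETWORK_FAILED"; // assumed string for source-side failures
  }
  if (error.becauseTargetFailed) {
    return "FILE_FAILED"; // assumed string for target-side failures
  }
  // Neither flag is set: nothing more specific is known.
  return "CRASH";
}

console.log(
  deriveErrorString({ becauseSourceFailed: false, becauseTargetFailed: false })
); // prints "CRASH"
```

Under these assumptions, any failure in the background saver that sets neither flag collapses into the same string, which matches what the reporter sees.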

I hope this helps:

I just tried making a version that restricts to 3 simultaneous downloads and waits 200ms between each download. That version did not fail to download, even when I throttled the CPU to 800MHz.
If I remove that safeguard, throttle the CPU to 800MHz, and restrict all Firefox processes to the same CPU core, the test extension I sent keeps failing on almost every website. The more files there are to download, the more often it fails.
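
A minimal sketch of that kind of safeguard, not the code from the attached extension. `downloadThrottled` and `downloadOne` are hypothetical names; `downloadOne(url)` is assumed to return a promise that settles when a single download has finished.

```javascript
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Run downloadOne(url) for every URL with at most `maxConcurrent` in flight
// and at least `delayMs` between the start of consecutive downloads.
async function downloadThrottled(urls, downloadOne, { maxConcurrent = 3, delayMs = 200 } = {}) {
  const queue = [...urls];
  const results = [];
  let nextStart = 0; // earliest timestamp at which the next download may start

  async function worker() {
    while (queue.length > 0) {
      const url = queue.shift();
      // Reserve the next start slot, then wait until it arrives.
      const now = Date.now();
      const wait = Math.max(0, nextStart - now);
      nextStart = Math.max(now, nextStart) + delayMs;
      if (wait > 0) {
        await sleep(wait);
      }
      try {
        results.push({ url, ok: true, value: await downloadOne(url) });
      } catch (error) {
        results.push({ url, ok: false, error });
      }
    }
  }

  const workerCount = Math.min(maxConcurrent, queue.length);
  await Promise.all(Array.from({ length: workerCount }, () => worker()));
  return results;
}
```

Note that browser.downloads.download() resolves once the download has started, not when it has finished, so `downloadOne` would also need to wait for the download to end if the goal is to cap truly simultaneous downloads.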

In your tests, did you try throttling the CPU in the OS's CPU controls?
For example, if you are using intel_pstate, you can run (as root):
echo 1 > /sys/devices/system/cpu/intel_pstate/max_perf_pct
To restore back to normal, just use:
echo 100 > /sys/devices/system/cpu/intel_pstate/max_perf_pct

S1 or S2 bugs need an assignee - could you find someone for this bug?

Flags: needinfo?(mixedpuppy)

The good news is that I can reproduce the failure. I'm using a VM with Ubuntu 18.04 and a debug build I just compiled on it. After opening Firefox I go to about:debugging, load the add-on, and then visit youtube.com. Some downloads fail.
The bad news is that I have a Ryzen, and I forgot rr doesn't work on it, so I can't reverse debug it. If someone would like to try, it's not particularly complex or time consuming to reproduce the failure.

In the meantime, I can try some ugly methods like asserts. I'll let you know if I find something.

Looks like a lower level issue, moving there.

Severity: S2 → --
Component: General → Downloads API
Flags: needinfo?(mixedpuppy)
Priority: P5 → --
Product: WebExtensions → Toolkit

I didn't have any luck finding a good stack, so this is still open for someone to pick up. At least there's a workaround: adding some delay.

The best paths forward are still:

  1. catch the failure with rr and reverse-debug to see where we fail
  2. add more runtime-enabled logging
Severity: -- → S3
Status: UNCONFIRMED → NEW
Ever confirmed: true
Flags: needinfo?(mak)
Priority: -- → P3
See Also: → 1649463
  • Info for my situation, related to #1649463:
    browser.privatebrowsing.autostart is set to false in my Linux version.

I hope this helps too:

  • Nightly, 2021-06-12
  • Object URLs created with objectUrl = URL.createObjectURL on the same problematic URLs appear to similarly crash when downloading via browser.downloads.download({ url: objectUrl }).
  • Using the browser's right-click context menu to save images worked.
  • I tried the delay workaround suggested above, but it didn't work. It might have worked once, but I could not reliably reproduce the fix.
  • Using fetch to retrieve the URLs worked.
  • Instagram image URLs were consistently triggering the bug.

I worked around the issue by downloading using fetch, passing in the blob from that to createObjectURL, and then calling downloads.download on the resulting object URL.
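
A rough sketch of that workaround, under stated assumptions: `downloadViaFetch` is a hypothetical helper name, and the `downloadsApi` and `fetchFn` parameters exist only so the sketch can be exercised outside an extension (inside one they default to `browser.downloads` and the global `fetch`).

```javascript
// Hypothetical helper: fetch the resource, wrap the blob in an object URL,
// and hand that URL to downloads.download() instead of the original URL.
async function downloadViaFetch(
  url,
  downloadsApi = globalThis.browser?.downloads,
  fetchFn = globalThis.fetch
) {
  const response = await fetchFn(url);
  if (!response.ok) {
    throw new Error(`Fetch failed with HTTP ${response.status} for ${url}`);
  }
  const blob = await response.blob();
  const objectUrl = URL.createObjectURL(blob);
  // Resolves with the download id once the download has started.
  const downloadId = await downloadsApi.download({ url: objectUrl });
  // The caller should keep objectUrl alive until the download has finished.
  return { downloadId, objectUrl };
}
```

The object URL is returned rather than revoked here, because the download may still be writing to disk when `download()` resolves.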

Observed this bug or one very similar in FF 124.0

The 'CRASH' happens when downloading from an objectURL but it only fails sporadically.

The trigger for the bug appeared to be a race condition in which an object URL is revoked before the download has finished saving to disk:

const url = URL.createObjectURL(blob);
await browser.downloads.download({ url });
URL.revokeObjectURL(url);

Strangely, this only appeared on Windows, and I was unable to reproduce it on the same hardware running Linux. Other Windows users also reported this issue.

Sorry, I noticed that the API docs warn against calling revokeObjectURL before the download is complete. Presumably the resulting behaviour is not considered a bug.
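
For anyone hitting the same race, one way to avoid it is to revoke the object URL only after the download has ended. This is a sketch under stated assumptions: `waitForDownloadEnd` and `downloadBlobThenRevoke` are hypothetical helper names, and `downloadsApi` defaults to `browser.downloads` (the parameter is there so the code can be tried outside an extension).

```javascript
// Resolve with "complete" or "interrupted" once the given download has ended.
function waitForDownloadEnd(downloadId, downloadsApi) {
  const ended = (state) => state === "complete" || state === "interrupted";
  return new Promise((resolve, reject) => {
    const finish = (state) => {
      downloadsApi.onChanged.removeListener(listener);
      resolve(state);
    };
    const listener = (delta) => {
      if (delta.id === downloadId && delta.state && ended(delta.state.current)) {
        finish(delta.state.current);
      }
    };
    downloadsApi.onChanged.addListener(listener);
    // Cover the case where the download ended before the listener was attached.
    downloadsApi.search({ id: downloadId }).then(([item]) => {
      if (item && ended(item.state)) {
        finish(item.state);
      }
    }, reject);
  });
}

async function downloadBlobThenRevoke(blob, downloadsApi = globalThis.browser?.downloads) {
  const url = URL.createObjectURL(blob);
  try {
    const id = await downloadsApi.download({ url });
    return await waitForDownloadEnd(id, downloadsApi);
  } finally {
    // Revoke only after the download has ended (or failed to start).
    URL.revokeObjectURL(url);
  }
}
```

The difference from the snippet above is only the extra wait between `download()` resolving and `revokeObjectURL()` being called.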
