Closed Bug 1583735 Opened 6 years ago Closed 6 years ago

Find out why accessing mmaped JAR file generates SIGBUS

Categories

(Core :: Networking: JAR, enhancement, P1)

enhancement

Tracking

()

RESOLVED FIXED
mozilla72
Tracking Status
firefox-esr68 --- fixed
firefox72 --- fixed

People

(Reporter: michal, Assigned: michal)

References

Details

(Whiteboard: [necko-triaged])

Attachments

(2 files)

Bug 1550815 handles SIGBUS but we need to understand why the page faults happen. Let's gather some diagnostic information in the signal handler and get it using MOZ_CRASH.

When we have nsZipHandle available (which is the case of crashes from bugs 1550815, 1564444 and 1564921), we send following information in the crash report:

  • file name
  • current file size
  • buffer address
  • buffer length
  • address that generated SIGBUS

Depends on D42010

Attached file request.md
Attachment #9100336 - Flags: data-review?(chutten)
Comment on attachment 9100336 [details] request.md PRELIMINARY NOTES: This is just the file name, correct, not the full path? (I will proceed with the review assuming this is the case). DATA COLLECTION REVIEW RESPONSE: Is there or will there be documentation that describes the schema for the ultimate data set available publicly, complete and accurate? Yes. This collection is unfortunately only documented as being part of the general MOZ_CRASH reason. But additionally it is documented as part of this review and in the source tree. Is there a control mechanism that allows the user to turn the data collection on and off? Yes. This collection is opt-in for crash reports, and is also sent via Telemetry ping so can be controlled through Firefox's Preferences. If the request is for permanent data collection, is there someone who will monitor the data over time? No. :michal is responsible for removing it once sufficient information about these crashes has been obtained. Using the category system of data types on the Mozilla wiki, what collection type of data do the requested measurements fall under? Category 1, Technical. Is the data collection request for default-on or default-off? Default on for all channels (for the Telemetry-sent part. For the crash report-sent parts it is default off) Does the instrumentation include the addition of any new identifiers? No. Is the data collection covered by the existing Firefox privacy notice? Yes. Does there need to be a check-in in the future to determine whether to renew the data? Yes. :michal is responsible for renewing or removing the collection once the necessary information has been collected. --- Result: datareview+
Flags: needinfo?(michal.novotny)
Attachment #9100336 - Flags: data-review?(chutten) → data-review+

(In reply to Chris H-C :chutten from comment #3)

This is just the file name, correct, not the full path? (I will proceed with
the review assuming this is the case).

No, it's full URI, but I'll update the patch to send only file name. I think that's enough for our purpose.

Flags: needinfo?(michal.novotny)

(In reply to Michal Novotny [:michal] from comment #4)

(In reply to Chris H-C :chutten from comment #3)

This is just the file name, correct, not the full path? (I will proceed with
the review assuming this is the case).

No, it's full URI, but I'll update the patch to send only file name. I think that's enough for our purpose.

Oh good. For the record, we've had problems in the past where full paths leak personal information (like the username of the person whose home Firefox was installed under), so that's why we aim for filenames only wherever possible.

Pushed by mnovotny@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/8d30710883b7 Find out why accessing mmaped JAR file generates SIGBUS, r=gcp,aklotz
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla72

So far, I see only one crash report https://crash-stats.mozilla.com/report/index/2e821805-f115-420d-b1ea-002bb0191203

The crashing address is within the mmapped area and the file size hasn't changed.

Comment on attachment 9100184 [details]
Bug 1583735 - Find out why accessing mmaped JAR file generates SIGBUS, r=gcp

ESR Uplift Approval Request

  • If this is not a sec:{high,crit} bug, please state case for ESR consideration: If we're going to uplift bug 1550815, we should uplift this too. This bug adds some information to the crash report to find out the cause of the SIGBUS signal. But it happens almost exclusively on Android, so we don't have any useful information yet to move this bug forward.
  • User impact if declined: No change in user experience, but we won't receive information needed for fixing the problem with SIGBUS.
  • Fix Landed on Version: 72
  • Risk to taking this patch: Low
  • Why is the change risky/not risky? (and alternatives if risky): It's in the tree for some time now and no problems are observed.
  • String or UUID changes made by this patch: none
Attachment #9100184 - Flags: approval-mozilla-esr68?

Comment on attachment 9100184 [details]
Bug 1583735 - Find out why accessing mmaped JAR file generates SIGBUS, r=gcp

Needed for bug 1550815, approved 68.4b2.

Attachment #9100184 - Flags: approval-mozilla-esr68? → approval-mozilla-esr68+

After landing this patch on ESR we have couple of new crash reports from Android, https://crash-stats.mozilla.com/signature/?signature=MmapAccessScope%3A%3ACrashWithInfo&date=%3E%3D2019-12-08T11%3A46%3A00.000Z&date=%3C2020-01-08T11%3A46%3A00.000Z&_columns=date&_columns=product&_columns=version&_columns=build_id&_columns=platform&_columns=reason&_columns=address&_columns=install_time&_columns=startup_crash&_sort=-date&page=1#reports.

In one case we crash on webcompat-reporter@mozilla.org.xpi and in all other cases we crash on webcompat@mozilla.org.xpi. It seems something is rewriting the file while it's in use by JAR channel. The file size is always multiple of 1K with one exception, https://crash-stats.mozilla.com/report/index/4d83d987-e6d9-4c9e-9508-37e630200108. I guess the file was also shorter than expected but writing has finished and the file flushed between SIGBUS failure and file size check.

I don't know the addon code but could it be that the addon manager is rewriting the existing file instead of deleting the old file and creating a new one?

Flags: needinfo?(kmaglione+bmo)

As far as I can tell, we never try to write over an existing add-on file. If we ever did, that would fail rather badly on Windows due to its mandatory file locking, so we'd find out pretty quick.

In general, when we install an add-on, we copy the new XPI to a staging directory under the final install location, move any existing XPI to a trash directory under the final install location, move the staged XPI into its final location, and then remove the trash directory.

For system add-ons (such as webcompat and webcompat-reporter), we update all add-ons at once, creating a new directory each time, move each new add-on into a staging directory under it, then move it into its final location. Then we eventually remove the old directory.

There are also copies of those add-ons in the app directory, but we obviously never modify them directly.

Flags: needinfo?(kmaglione+bmo)
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: