Closed Bug 770605 Opened 13 years ago Closed 12 years ago

poison write during shutdown in release builds too

Tracking

()

Status:

RESOLVED FIXED

People

(Reporter: espindola, Assigned: espindola)

References

Details

(Whiteboard: [snappy:p1])

Attachments

(1 obsolete file)

poison write during shutdown in release builds 13 years ago Rafael Ávila de Espíndola (:espindola) (not reading bugmail) 2.89 KB, patch		Details \| Diff \| Splinter Review

Rafael Ávila de Espíndola (:espindola) (not reading bugmail)

Assignee

Description

•

13 years ago

In bug 732173 we got write poisoning enabled in debug builds. We now need to make that available on release builds too. The final objective is to have: * Release builds default to calling _exit(0), but can optionally poison write and do a full shutdown. * Debug builds default to poisoning write and doing a full shutdown, but can optionally call _exit(0) instead. Before we get there we also want to have write poisoning enabled on a nightly to try to find writes not covered by our tests, so the objective of this bug is to get write poisoning enabled on both release and debug builds.

Rafael Ávila de Espíndola (:espindola) (not reading bugmail)

Assignee

Updated

•

13 years ago

Depends on: 774390

Rafael Ávila de Espíndola (:espindola) (not reading bugmail)

Assignee

Updated

•

13 years ago

Depends on: 773903

Rafael Ávila de Espíndola (:espindola) (not reading bugmail)

Assignee

Comment 1

•

13 years ago

Attached patch poison write during shutdown in release builds (obsolete) — Details — Splinter Review

I have tested it locally by moving the poisoning earlier and checking that our bug reporter shows up during shutdown of a release build. There is a try run at https://tbpl.mozilla.org/?tree=Try&pusher=respindola@mozilla.com

Assignee: nobody → respindola

Status: NEW → ASSIGNED

Attachment #644992 - Flags: review?(ted.mielczarek)

(dormant account)

Comment 2

•

13 years ago

I'm all for write poisoning so we can get data on what writes are being lost. However I think a) we should call exit(0) on nightlies b) poisoning should be done non-destructively and cause a stack to be recorded, which would get submitted via telemetry. I think this means that we should do poisoning randomly, so we can test both exit(0) and poisoned path

Rafael Ávila de Espíndola (:espindola) (not reading bugmail)

Assignee

Comment 3

•

13 years ago

(In reply to Taras Glek (:taras) from comment #2) > I'm all for write poisoning so we can get data on what writes are being > lost. However I think > a) we should call exit(0) on nightlies > b) poisoning should be done non-destructively and cause a stack to be > recorded, which would get submitted via telemetry. > > I think this means that we should do poisoning randomly, so we can test both > exit(0) and poisoned path Calling exit(0) and poisoning (crash or telemetery) are exclusive. I am not sure I understand when you want us to do one or the other.

(dormant account)

Comment 4

•

13 years ago

To clarify: * debug should poison with crash always * release should exit(0) or poison 50% of the time depending on rand().

(dormant account)

Comment 5

•

13 years ago

(In reply to Taras Glek (:taras) from comment #4) > * release should exit(0) or poison 50% of the time depending on rand(). I meant * nightly should exit(0) or poison 50% of the time depending on rand(). Non-nightly branches should exit(0) exclusively.

Rafael Ávila de Espíndola (:espindola) (not reading bugmail)

Assignee

Comment 6

•

13 years ago

A quick summary from IRC. We will now have 3 modes of operation which can be selected at runtime (via an env var most likely): MOZ_POISON=NO ./firefox : will call _exit(0) MOZ_POISON=Crash ./firefox : will crash on a late write MOZ_POISON=Report ./firefox : will report a late write via telemetry with a stack trace. The defaults will be * Debug builds: Crash * Nightly builds: Report * Aurora, Beta and Releases: No

Rafael Ávila de Espíndola (:espindola) (not reading bugmail)

Assignee

Comment 7

•

13 years ago

Comment on attachment 644992 [details] [diff] [review] poison write during shutdown in release builds Per discussion with Taras this patch is obsolete.

Attachment #644992 - Attachment is obsolete: true

Attachment #644992 - Flags: review?(ted.mielczarek)

Rafael Ávila de Espíndola (:espindola) (not reading bugmail)

Assignee

Updated

•

13 years ago

Depends on: 777122

(dormant account)

Comment 8

•

13 years ago

Also, note this would be 50% of telemetry users. It doesn't really help to poison writes for non-telemetry users since we wouldn't report those.

Lawrence Mandel [:lmandel] (use needinfo)

Updated

•

13 years ago

Whiteboard: [snappy]

(no longer active)

Comment 9

•

13 years ago

(In reply to Rafael Ávila de Espíndola (:espindola) from comment #6) > A quick summary from IRC. > > We will now have 3 modes of operation which can be selected at runtime (via > an env var most likely): > > MOZ_POISON=NO ./firefox : will call _exit(0) > MOZ_POISON=Crash ./firefox : will crash on a late write > MOZ_POISON=Report ./firefox : will report a late write via telemetry with a > stack trace. > > The defaults will be > * Debug builds: Crash > * Nightly builds: Report > * Aurora, Beta and Releases: No I think this mostly makes sense except for the Release channel. Currently calling _exit(0) on release builds is data-loss by default because of late writes coming from add-ons etc. We should focus on fixing that problem first I think.

(dormant account)

Comment 10

•

13 years ago

(In reply to Ehsan Akhgari [:ehsan] from comment #9) > I think this mostly makes sense except for the Release channel. Currently > calling _exit(0) on release builds is data-loss by default because of late > writes coming from add-ons etc. We should focus on fixing that problem > first I think. Lets see how it goes on nightly before we worry about release policy too much.

(no longer active)

Comment 11

•

13 years ago

(In reply to comment #10) > (In reply to Ehsan Akhgari [:ehsan] from comment #9) > > > I think this mostly makes sense except for the Release channel. Currently > > calling _exit(0) on release builds is data-loss by default because of late > > writes coming from add-ons etc. We should focus on fixing that problem > > first I think. > > Lets see how it goes on nightly before we worry about release policy too much. For sure. But in this particular case we already know that we'll have this problem. We just don't know which add-ons. :-)

Terrell Kelley

Comment 12

•

13 years ago

I don't like the idea of making it random. Random would mean that, 25% of the time, the same mode will be chosen for three different tests. (1:8 chance of being exit(0) plus 1:8 chance of write poisoning) It should just alternate.

Terrell Kelley

Comment 13

•

13 years ago

Ugh, submitted too early. Alternation means a 100% chance after two tests (or even three) that it worked. This is more in line with how I believe most addon developers do testing. It would take around 10 tests to make the possibility of all tests being in the same mode small enough to be statistically insignificant, if we stick with random.

(no longer active)

Comment 14

•

13 years ago

(In reply to comment #12) > I don't like the idea of making it random. Random would mean that, 25% of the > time, the same mode will be chosen for three different tests. (1:8 chance of > being exit(0) plus 1:8 chance of write poisoning) It should just alternate. We could compute an initial random number and store the result in a pref or something, so that people would face consistent behavior across multiple runs, but we'd still end up with two buckets of results.

Rafael Ávila de Espíndola (:espindola) (not reading bugmail)

Assignee

Comment 15

•

13 years ago

> > The defaults will be > > * Debug builds: Crash > > * Nightly builds: Report > > * Aurora, Beta and Releases: No > > I think this mostly makes sense except for the Release channel. Currently > calling _exit(0) on release builds is data-loss by default because of late > writes coming from add-ons etc. We should focus on fixing that problem > first I think. Correct. That is just where we want to get. Currently we have debug builds crash and other builds do nothing. The next step I think is implementing support for reporting for the nightly builds.

Rafael Ávila de Espíndola (:espindola) (not reading bugmail)

Assignee

Updated

•

13 years ago

No longer depends on: 774390

(dormant account)

Updated

•

13 years ago

Whiteboard: [snappy] → [snappy:p1]

Rafael Ávila de Espíndola (:espindola) (not reading bugmail)

Assignee

Comment 16

•

12 years ago

This has been fixed some time ago. We are getting late writes reports in telemetry.

Status: ASSIGNED → RESOLVED

Closed: 12 years ago

Resolution: --- → FIXED

You need to log in before you can comment on or make changes to this bug.