Closed Bug 771083 Opened 12 years ago Closed 12 years ago

Shutdown telemetry causes "Assertion failure: r == count, at xpcom/build/mozPoisonWriteMac.cpp:194"

Categories

(Core :: XPCOM, defect)

x86
macOS
defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla16

People

(Reporter: past, Assigned: espindola)

References

Details

(Keywords: assertion, regression)

Attachments

(1 file)

My latest fx-team tip build (changeset ea890a6eed56) crashes on shutdown with this message. Stack:

Thread 0 Crashed:: Main Thread  Dispatch queue: com.apple.main-thread
0   XUL                             0x0000000102ebc076 (anonymous namespace)::AbortOnBadWrite(int, void const*, unsigned long) + 774 (mozPoisonWriteMac.cpp:183)
1   XUL                             0x0000000102ebc292 (anonymous namespace)::wrap_write(int, void const*, unsigned long) + 34 (mozPoisonWriteMac.cpp:40)
2   libnspr4.dylib                  0x0000000100080da2 pt_Write + 98 (ptio.c:1315)
3   libnspr4.dylib                  0x0000000100069f05 PR_vfprintf + 69 (prstdio.c:32)
4   libnspr4.dylib                  0x0000000100069fcb PR_fprintf + 155 (prstdio.c:22)
5   XUL                             0x00000001027b744b __tcf_0 + 315 (nsAppStartup.cpp:308)
6   libsystem_c.dylib               0x00007fff86f9f7c8 __cxa_finalize + 274
7   libsystem_c.dylib               0x00007fff86f9f652 exit + 18
8   firefox-bin                     0x000000010000111b start + 59

Full info at:

http://past.pastebin.mozilla.org/1691309
Blocks: 732173
I hit this on mozilla-central as well, if I set
  user_pref("toolkit.telemetry.enabled", true);

The stack trace in comment 0 is somehow missing a few function names, despite having correct line numbers.  Here's a better stack:

> Thread 0 Crashed:: Main Thread  Dispatch queue: com.apple.main-thread
> 0   XUL                           	0x00000001031454f8 (anonymous namespace)::AbortOnBadWrite(int, void const*, unsigned long) + 664 (mozPoisonWriteMac.cpp:194)
> 1   XUL                           	0x00000001031457d3 _ZN12_GLOBAL__N_115wrap_write_tempILZNS_10write_dataEEEEliPKvm + 35 (mozPoisonWriteMac.cpp:39)
> 2   XUL                           	0x00000001031457a3 (anonymous namespace)::wrap_write(int, void const*, unsigned long) + 35 (mozPoisonWriteMac.cpp:85)
> 3   libnspr4.dylib                	0x000000010009f4e4 pt_Write + 84 (ptio.c:1315)
> 4   libnspr4.dylib                	0x000000010006e906 PR_Write + 54 (priometh.c:114)
> 5   libnspr4.dylib                	0x0000000100079943 PR_vfprintf + 99 (prstdio.c:70)
> 6   libnspr4.dylib                	0x000000010007989c PR_fprintf + 364 (prstdio.c:19)
> 7   XUL                           	0x000000010294b37c _ZL26RecordShutdownEndTimeStampv + 316 (nsAppStartup.cpp:308)
> 8   XUL                           	0x000000010294bd11 RecordShutdownEndTimeStampHelper::~RecordShutdownEndTimeStampHelper() + 17 (nsAppStartup.cpp:324)
> 9   XUL                           	0x000000010294b535 RecordShutdownEndTimeStampHelper::~RecordShutdownEndTimeStampHelper() + 21 (nsAppStartup.cpp:324)
> 10  libsystem_c.dylib             	0x00007fff87ab37c8 __cxa_finalize + 274
> 11  libsystem_c.dylib             	0x00007fff87ab3652 exit + 18
> 12  firefox-bin                   	0x0000000100000e4b start + 59
Blocks: 753461
Summary: Assertion failure: r == count, at xpcom/build/mozPoisonWriteMac.cpp:194 → Shutdown telemetry causes "Assertion failure: r == count, at xpcom/build/mozPoisonWriteMac.cpp:194"
Interesting, I do wonder how why this doesn't fail on m-c.
Status: NEW → ASSIGNED
(In reply to comment #2)
> Interesting, I do wonder how why this doesn't fail on m-c.

I think telemetry is disabled by default in debug builds.
Assignee: nobody → respindola
The patch also switches to using good old FILEs as I could not find how to say fileno in mozillian.

There are some alternatives on how to fix this. We could disable it on debug builds as the collected time is not relevant for debug builds, but we should also consider what release builds will look like in the end.

On a release build, we will _exit(0) early or poison write just like in a debug build. In the _exit(0) case the correct answer is clear, we should record the time just before the _exit(0) call and don't have to care about poisoning. In the case where we do a full shutdown, what should we write to disk? The time at which _exit(0) would have been called or the actual time it took us to shutdown. I guess the second option is probably better, in which case this code has to handle poisoning anyway.
Attachment #639465 - Flags: review?(benjamin)
Attachment #639465 - Flags: review?(benjamin) → review+
https://hg.mozilla.org/mozilla-central/rev/17305771f594
Status: ASSIGNED → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla16
(In reply to Ehsan Akhgari [:ehsan] from comment #3)
> (In reply to comment #2)
> > Interesting, I do wonder how why this doesn't fail on m-c.
> 
> I think telemetry is disabled by default in debug builds.

JFTR, telemetry is disabled by default in non-official builds, period.
Uh... my builds that were hitting this bug are very much non-official last I checked!
(In reply to Boris Zbarsky (:bz) from comment #9)
> Uh... my builds that were hitting this bug are very much non-official last I
> checked!

Hm, maybe we are missing some telemetry-enabled checks somewhere, then...
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: