crash in OOM | large | NS_ABORT_OOM(unsigned int) | nsACString_internal::Replace(unsigned int, unsigned int, char const*, unsigned int)

VERIFIED FIXED in Firefox 35

Status

()

Core
General
--
critical
VERIFIED FIXED
4 years ago
3 years ago

People

(Reporter: dmajor (away and/or busy), Assigned: milan)

Tracking

({crash, topcrash-win})

36 Branch
mozilla36
x86
Windows NT
crash, topcrash-win
Points:
---
Bug Flags:
qe-verify +

Firefox Tracking Flags

(firefox35+ verified, firefox36+ verified)

Details

(crash signature)

Attachments

(1 attachment, 1 obsolete attachment)

(Reporter)

Description

4 years ago
This bug was filed from the Socorro interface and is 
report bp-b5f7e031-0b02-406d-8d9a-026b32141031.
=============================================================

Oh the irony. gfxCriticalError spammed AnnotateCrashReport until we OOM crashed. 

Not sure if this should be addressed on the gfx side or crash reporter. Ted/Nical?
Flags: needinfo?(ted)
Flags: needinfo?(nical.bugzilla)
(Reporter)

Comment 1

4 years ago
[Tracking Requested - why for this release]: #13 crash on aurora

There are some hits on beta but not with this annotation pattern. It might be other issues.
status-firefox34: --- → ?
status-firefox35: --- → affected
status-firefox36: --- → affected
tracking-firefox35: --- → ?
tracking-firefox36: --- → ?
We should probably put a cap on the string for AppendAppNotesToCrashReport as a backstop here, but we should also fix that gfx code. We'd just cap the string somewhere around here:
http://hg.mozilla.org/mozilla-central/annotate/17e190839086/toolkit/crashreporter/nsExceptionHandler.cpp#l1856
Flags: needinfo?(ted)
I think it should be addressed by gfx since we wrote the little thing that is spamming the reporter.

We talked about limiting the gfxCriticalError at some point with something like keeping the first one or two errors, and the last N ones (and perhaps count the number of errors that we dropped).

This is a good incentive to put that in place.
Flags: needinfo?(nical.bugzilla)
(Assignee)

Comment 4

4 years ago
Bug 1074952 limits gfxCriticalError to keeping the very first one, and the last N-1 where N is determined by a preference gfx.logging.crash.length.
Depends on: 1074952
(Assignee)

Updated

4 years ago
Assignee: nobody → milan
(Assignee)

Comment 5

4 years ago
However... if we are considering uplifting the fix to this, we would not want the bug 1074952 - it is too large for that.  So, it'd be good to know.
tracking-firefox35: ? → +
tracking-firefox36: ? → +
(Assignee)

Comment 6

4 years ago
We should stop seeing these as of 2014/11/19 nightly, at least with this signature.
(Reporter)

Comment 7

4 years ago
Here's one from 20141119030200, or did you mean starting from the nightly after that?
https://crash-stats.mozilla.com/report/index/a8dd6d71-bbcd-4a37-a792-ea6262141120
Flags: needinfo?(milan)
(Assignee)

Comment 8

4 years ago
The one after, this landed on central on 2014/11/19.
Flags: needinfo?(milan)
(Reporter)

Updated

3 years ago
Duplicate of this bug: 1094978
(Reporter)

Comment 10

3 years ago
Looks like it worked.

Is it possible carve bug 1074952 into something safe for uplift?
Status: NEW → RESOLVED
Last Resolved: 3 years ago
status-firefox36: affected → fixed
Flags: needinfo?(milan)
Resolution: --- → FIXED
(Assignee)

Comment 11

3 years ago
Only thing I can think of is either removing gfxCriticalError in http://dxr.mozilla.org/mozilla-central/source/gfx/layers/d3d11/TextureD3D11.cpp#291 or replacing it with gfxWarning, but, I'd like to hear if :nical thinks that would reduce the useful information we get from (other) crash reports.
Flags: needinfo?(milan) → needinfo?(nical.bugzilla)
If it's specifically this line that spams the appnotes, maybe a good compromise would be to replace 

if (FAILED(hr)) {
  gfxCriticalError() << "[D3D11] CreateTexture2D failure " << mSize << " Code: " << gfx::hexa(hr);
  return;
}

by

if (FAILED(hr)) {
  static int sCount = 0;
  if (sCount <= 5) {
    gfxCriticalError() << "[D3D11] CreateTexture2D failure " << mSize << " Code: " << gfx::hexa(hr);
  }
  if (sCount = 5) {
    gfxCriticalError() << "The error above was reported 5 times, dropping next occurrences."
  }
  sCount++;
  return;
}
Flags: needinfo?(nical.bugzilla)
Taking note that interestingly, the logging involved a lot more allocations than what one would think is required to a allocate a 16*16 rgba texture. Looks like small texture allocations reserve significantly more memory than the amount they use to store actual pixels.
(Assignee)

Comment 14

3 years ago
Created attachment 8529844 [details] [diff] [review]
Aurora patch.  Kludge to stop the annotation overflow.

:nical, something like this?
Attachment #8529844 - Flags: review?(nical.bugzilla)
Comment on attachment 8529844 [details] [diff] [review]
Aurora patch.  Kludge to stop the annotation overflow.

Review of attachment 8529844 [details] [diff] [review]:
-----------------------------------------------------------------

Sure
Attachment #8529844 - Flags: review?(nical.bugzilla) → review+
Comment on attachment 8529844 [details] [diff] [review]
Aurora patch.  Kludge to stop the annotation overflow.

Review of attachment 8529844 [details] [diff] [review]:
-----------------------------------------------------------------

::: gfx/layers/d3d11/TextureD3D11.cpp
@@ +293,5 @@
>      if (FAILED(hr)) {
> +      // Temporary patch; see bug 1095289 before bug 1074952 is fixed
> +      static int limitDueTo1095289 = 0;
> +      if (limitDueTo1095289 < 5) {
> +        gfx::gfxCriticalError() << "[D3D11] CreateTexture2D failure (1) " << aSize << " Code: " << gfx::hexa(hr);

mSize

@@ +295,5 @@
> +      static int limitDueTo1095289 = 0;
> +      if (limitDueTo1095289 < 5) {
> +        gfx::gfxCriticalError() << "[D3D11] CreateTexture2D failure (1) " << aSize << " Code: " << gfx::hexa(hr);
> +      } else if (limitDueTo1095289 == 5) {
> +        gfx::gfxCriticalError() << "[D3D11] CreateTexture2D failure (1-stopping further reports after 5+ occurences) " << aSize << " Code: " << gfx::hexa(hr);

mSize
(Assignee)

Comment 17

3 years ago
Created attachment 8529922 [details] [diff] [review]
Aurora patch.  Kludge to stop the annotation overflow.  Carry r=nical
Attachment #8529844 - Attachment is obsolete: true
Attachment #8529922 - Flags: review+
(Assignee)

Comment 18

3 years ago
Comment on attachment 8529922 [details] [diff] [review]
Aurora patch.  Kludge to stop the annotation overflow.  Carry r=nical

Approval Request Comment
[Feature/regressing bug #]:
[User impact if declined]: topcrash
[Describe test coverage new/current, TBPL]:
[Risks and why]: Low
[String/UUID change made/needed]:
Attachment #8529922 - Flags: approval-mozilla-aurora?
status-firefox34: ? → ---
Comment on attachment 8529922 [details] [diff] [review]
Aurora patch.  Kludge to stop the annotation overflow.  Carry r=nical

We merged in between. cf comment #18 for uplift the request.
Attachment #8529922 - Flags: approval-mozilla-aurora? → approval-mozilla-beta?
Attachment #8529922 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
https://hg.mozilla.org/releases/mozilla-beta/rev/f7063d1b1d95
status-firefox35: affected → fixed
Target Milestone: --- → mozilla36
Flags: qe-verify+
A Socorro [1] report for the past 4 weeks shows:
- 35 Beta = 127 crashes
   * 35.0b2 = 65 crashes
   * 35.0b3 = 39 crashes
   * 35.0b4 = 17 crashes
   * 35.0b5 = 6 crashes
- 36 Aurora = 4 crashes

I'm not sure whether the expectation here was for this to not show anymore, or just show at a lower level. Milan, can you provide your insight on this?

[1] - https://crash-stats.mozilla.com/report/list?product=Firefox&range_unit=days&range_value=28&signature=OOM+%7C+large+%7C+NS_ABORT_OOM%28unsigned+int%29+%7C+nsACString_internal%3A%3AReplace%28unsigned+int%2C+unsigned+int%2C+char+const%2A%2C+unsigned+int%29
Flags: needinfo?(milan)
The patch here only fixes one instance of this crash signature, the other crash reports in 36 look to be different issues. We should probably add nsACString_Internal::Replace to the skiplist to differentiate.
(Assignee)

Comment 23

3 years ago
Right - this was just the crash that included CrashReporter::AnnotateCrashReport in it, or more to the point, mozilla::gfx::CriticalLogger::OutputMessage, so I think these are the crashes we weren't going after in this bug.  Probably need a new one?

Speaking of which, is there a way to see a raw .extra file from a crash report?
Flags: needinfo?(milan)
Not precisely, since that data isn't stored untouched, but you can get pretty close with:
https://crash-stats.mozilla.com/api/ProcessedCrash/?datatype=processed&crash_id=xxx
Thank you for clarifying this! Going through the crashes I see there are still a few crashes with CrashReporter::AnnotateCrashReport (and mozilla::gfx::CriticalLogger::OutputMessage):
- 35 Beta = 12 crashes
   * 35.0b2 = 4 crashes
      ** https://crash-stats.mozilla.com/report/index/4c6d6bf6-3a2b-4c32-9da4-5e0652141212
      ** https://crash-stats.mozilla.com/report/index/b66ded76-2102-4395-840f-6a7e22141211
      ** https://crash-stats.mozilla.com/report/index/581c31a2-a5de-4a34-a054-424a22141212
      ** https://crash-stats.mozilla.com/report/index/1d9509cf-87a9-43c6-ba83-1809f2141211
   * 35.0b3 = 5 crashes
      ** https://crash-stats.mozilla.com/report/index/2c8e57f1-3f32-4c0c-868e-fc1b22141218
      ** https://crash-stats.mozilla.com/report/index/24cb9973-2748-482e-87f5-612902141215
      ** https://crash-stats.mozilla.com/report/index/e734942a-ae34-40d1-bfbc-a20c42141214
      ** https://crash-stats.mozilla.com/report/index/01762f71-e2a7-44b2-9493-e1f722141217
      ** https://crash-stats.mozilla.com/report/index/9e0dffc5-f762-495b-9992-55abf2141216
   * 35.0b4 = 2 crashes 
      ** https://crash-stats.mozilla.com/report/index/3f4147bb-9553-4463-b129-404982141220
      ** https://crash-stats.mozilla.com/report/index/bc5f245f-2300-44db-83fb-0d3722141221
   * 35.0b5 = 1 crash 
      ** https://crash-stats.mozilla.com/report/index/a5baf082-b14d-4543-9eb4-3b2a72141220
- 36 Aurora = 0 crashes

These remaining crashes look similar to the original one, but the low rate of occurrence I think is enough to close this.

Also I'm not sure we want to pursue the other crashes as their number is not very high.
Status: RESOLVED → VERIFIED
status-firefox35: fixed → verified
status-firefox36: fixed → verified
You need to log in before you can comment on or make changes to this bug.