Closed Bug 1410276 Opened 2 years ago Closed 2 years ago

Add a canary field to nsStringBuffer

Categories

(Core :: JavaScript: GC, defect, P4)

55 Branch
defect

Tracking

()

RESOLVED FIXED
Tracking Status
firefox58 --- wontfix
firefox59 --- fixed
firefox60 --- fixed

People

(Reporter: pbone, Assigned: pbone)

References

(Blocks 1 open bug)

Details

Crash Data

Attachments

(1 file, 3 obsolete files)

Add a temporary conary field to nsStringBuffer to help root out suspected heap crashes, bug 1405525.
Blocks: 1405525
Priority: -- → P4
Ping. See bug 1406996 comment 37 for why we should get moving with this.
Flags: needinfo?(pbone)
(In reply to Bobby Holley (parental leave - send mail for anything urgent) from comment #1)
> Ping. See bug 1406996 comment 37 for why we should get moving with this.

Thanks. I'll make this my current task.
Flags: needinfo?(pbone)
Assignee: nobody → pbone
Status: NEW → ASSIGNED
Hi bz, you were suggested as a reviewer for this patch.  We want to add a canary value to nsStringBuffer and have it in nightly for a few days to try to detect suspected heap curruption.

Try: https://treeherder.mozilla.org/#/jobs?repo=try&revision=ab2a107d0312ca4acbd8e6c0816eaae5b69bab11

Thanks.
Attachment #8921862 - Flags: review?(bzbarsky)
Comment on attachment 8921862 [details] [diff] [review]
Bug 1410276 - Add a canary field to nsStringBuffer

You probably don't want to include the trailing ';' in the CHECK_STRING_BUFFER_CANARY macro (both versions).

Making FromData out of line might be a perf hit that we notice in some cases, but that's OK for a temporary thing.
Attachment #8921862 - Flags: review?(bzbarsky) → review+
(In reply to Boris Zbarsky [:bz] (no decent commit message means r-) from comment #4)
> Comment on attachment 8921862 [details] [diff] [review]
> Bug 1410276 - Add a canary field to nsStringBuffer
> 
> You probably don't want to include the trailing ';' in the
> CHECK_STRING_BUFFER_CANARY macro (both versions).

Derp.

> Making FromData out of line might be a perf hit that we notice in some
> cases, but that's OK for a temporary thing.

I thought so too, I'll add a comment for anyone else reading this.

FWIW I wrote this in a way that we can keep it in if we chose to.  If we do that we could remove the assertion in this function and move it back to the class definition.

Thanks.
r+ carried forward.
Attachment #8921862 - Attachment is obsolete: true
Attachment #8922128 - Flags: review+
Pushed by ryanvm@gmail.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/a3785ec9a48c
Add a canary field to nsStringBuffer. r=bz
Keywords: checkin-needed
Crash Signature: mozalloc_abort | Abort | NS_DebugBreak | nsStringBuffer::Release mozalloc_abort | NS_DebugBreak | nsStringBuffer::FromData → [@ mozalloc_abort | Abort | NS_DebugBreak | nsStringBuffer::Release] [@ mozalloc_abort | NS_DebugBreak | nsStringBuffer::FromData]
Emilio, do you have access to the crash report dumps?
Flags: needinfo?(emilio)
Yes, I do. Paul, let me know what I should look for, or reach to me on IRC so we can talk about it or what not.

Though probably you should get access to the dumps given you're working on the GC...
Flags: needinfo?(emilio) → needinfo?(pbone)
The "moz crash reason" field for all of these crashes don't contain any useful information. These are all content process crashes, and in the content process NS_RUNTIMEABORT  doesn't do anything: http://searchfox.org/mozilla-central/rev/af86a58b157fbed26b0e86fcd81f1b421e80e60a/xpcom/base/nsDebugImpl.cpp#394
Well, of course it crashes. I guess the actual field for this would be AbortMessage but that is blank, too.
Bug 1412048 replaced some uses of NS_RUNTIMEABORT, but it looks like it missed this one. But you can just land a patch to do a similar transformation for your call site.
mcrr8:

When I read that other bug it sounds like all the relevant data _is_ captured, it's just not public.  So hopefully having access to the full crash reports will help me.

All:

I have applied for minidump access and should be able to access this shortly.

I noticed that all the crashes in Release() are on windows and all the crashes in FromData() are on Linux.  This is probably just a coincidence: there are only 8 crashes in total.
Flags: needinfo?(pbone)
Well, that comment isn't quite right. It only applies to annotations in the parent process, AFAICT. I have minidump access and I couldn't see them. It looks like a more recent patch did update the canary annotation so these should start showing up: https://hg.mozilla.org/mozilla-central/rev/e85f59ea455d
Okay thanks.

I expect that with a minidump I'll be able to inspect the memory and find out what these values were.
(In reply to Andrew McCreight [:mccr8] from comment #14)
> Bug 1412048 replaced some uses of NS_RUNTIMEABORT, but it looks like it
> missed this one. But you can just land a patch to do a similar
> transformation for your call site.

That is not necessary. When I landed the patch for bug 1412048 that removed NS_RUNTIMEABORT last week, I replaced CHECK_STRING_BUFFER_CANARY's NS_RUNTIMEABORT call with MOZ_CRASH_UNSAFE_PRINTF:

https://searchfox.org/mozilla-central/diff/4a8bdd4ca5087c07464aaab990dfc3ccd4d3c909/xpcom/string/nsSubstring.cpp#44
Depends on: 1415465
Hi bz,

The original patch was backed out because 58 is getting close to beta.  But it hasn't caught many assertions yet.  This patch should affect performance less and might be valuable.  Is this worth persuing or should we close this bug?
Attachment #8922128 - Attachment is obsolete: true
Attachment #8926718 - Flags: review?(bzbarsky)
Comment on attachment 8926718 [details] [diff] [review]
Bug 1410276 - Add a canary field to nsStringBuffer

This seems fine, though FromDataOutlineCheckCanary is only called when the check would fail, so we could just rename it to FromDataCanaryCheckFailed and nix the checks inside it.  Either way, we should document what it does.
Attachment #8926718 - Flags: review?(bzbarsky) → review+
Updated based on review feedback. r+ carried forward.
Attachment #8926718 - Attachment is obsolete: true
Attachment #8926735 - Flags: review+
Here are some talos jobs to help determine what affect this has on performance and help us decide if it's worth committing.  Do we want this to go to Beta with 58 assuming there's no affect on performance?

Try: https://treeherder.mozilla.org/#/jobs?repo=try&revision=6b6a49fa7d322ed4e8acaf6f7556452659157b9a
Talos: https://treeherder.mozilla.org/perf.html#/comparechooser?newProject=try&newRevision=6b6a49fa7d322ed4e8acaf6f7556452659157b9a

NI: jonco, I'm not sure who else to ask and this relates to  Bug 1405525
Flags: needinfo?(jcoppeard)
(In reply to Paul Bone [:pbone] from comment #24)
If there's no impact on performance I'd say let's put this in and see what it turns up.
Flags: needinfo?(jcoppeard)
The latest patch causes a single 4.82% perf regression for tp5o_webext responsiveness opt e10s.  I only bothered to test on linux64, there shouldn't be any platform differences.

https://treeherder.mozilla.org/perf.html#/compare?originalProject=mozilla-central&originalRevision=e2f87726b608&newProject=try&newRevision=6b6a49fa7d322ed4e8acaf6f7556452659157b9a&framework=1&showOnlyImportant=1

I'm going to wait until 58 goes to beta and then push this on 59 for a while.
Pushed by pbone@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/ced8af085927
Add a canary field to nsStringBuffer r=bz
I've committed this again so that it can run for a while on 59.

Ionuț this will likely cause some perf regressions, like last time but hopefully fewer / less sagnificant.

Thanks.
Flags: needinfo?(igoldan)
Keywords: leave-open
(In reply to Paul Bone [:pbone] from comment #28)
> I've committed this again so that it can run for a while on 59.
> 
> Ionuț this will likely cause some perf regressions, like last time but
> hopefully fewer / less sagnificant.
> 
> Thanks.

Thanks for the heads up!
Flags: needinfo?(igoldan)
Here are a few Mac nightly crashes in 59 that I spotted today while going through crash stats: http://bit.ly/2iWsYBg
Oh,

I'll take a look at those crashes later today as Bug 1405525.

I lost track of this bug and didn't realise it was still providing crashes.  I'm happy to keep this code in since it's only used in Nightly and Debug builds.  I'll close this bug and continue in the original Bug 1405525.
Status: ASSIGNED → RESOLVED
Closed: 2 years ago
Flags: needinfo?(pbone)
Keywords: leave-open
Resolution: --- → FIXED
Summary: Add a temporary canary field to nsStringBuffer → Add a canary field to nsStringBuffer
Blocks: 1435994
No longer blocks: 1435994
Blocks: 1517538
You need to log in before you can comment on or make changes to this bug.