Closed Bug 1427860 Opened 3 years ago Closed 2 years ago

Crash [@ JSScript::functionNonDelazifying] involving saveIncrementalBytecode


(Core :: JavaScript Engine, defect, P3)




Tracking Status
firefox-esr52 --- wontfix
firefox-esr60 --- wontfix
firefox59 --- wontfix
firefox60 --- wontfix
firefox61 --- wontfix
firefox62 --- wontfix
firefox63 --- wontfix
firefox64 --- fixed
firefox65 --- fixed


(Reporter: gkw, Assigned: tcampbell)



(4 keywords, Whiteboard: [jsbugmon:update,testComment=9,origRev=d7e0e17065ca])

Crash Data


(2 files)

The following testcase crashes on mozilla-central revision ac93fdadf102 (build with --enable-debug, run with --fuzzing-safe --ion-offthread-compile=off --ion-eager):

// jsfunfuzz-generated
var x = []
for (var i = 0; i < 9; ++i) {
    x[i] = 0;
// Adapted from randomly chosen test: js/src/jit-test/tests/xdr/bug1390856.js
var t = cacheEntry("function f(){};");
evaluate(t, {
    sourceIsLazy: true,
    saveIncrementalBytecode: true
oomAtAllocation(9, 4);
try {
} catch (e) {}


#0  JSScript::functionNonDelazifying (this=0x7f58dfcbe128) at js/src/jsscript.h:1619
#1  0x0000000000a76fe5 in JSScript::isTopLevel (this=<optimized out>) at js/src/jsscript.h:1695
#2  GenerateLcovInfo (cx=cx@entry=0x7f58e1616000, comp=0x7f58e162d800, out=...) at js/src/jsopcode.cpp:2903
#3  0x0000000000a85507 in js::GetCodeCoverageSummary (cx=cx@entry=0x7f58e1616000, length=length@entry=0x7ffd0f9ecdd8) at js/src/jsopcode.cpp:2978
#4  0x00000000008bae30 in GetLcovInfo (cx=0x7f58e1616000, argc=<optimized out>, vp=<optimized out>) at js/src/builtin/TestingFunctions.cpp:4263

For detailed crash information, see attachment.
autoBisect shows this is probably related to the following changeset:

The first bad revision is:
user:        Nicolas B. Pierron
date:        Tue Jan 31 20:03:58 2017 +0000
summary:     Bug 1316078 part 5 - Add XDR off-thread decoder test cases. r=bhackett

Nicolas, is bug 1316078 a likely regressor?
Blocks: 1316078
Flags: needinfo?(nicolas.b.pierron)
So far I cannot explain the sudden spike in crashes, especially since this seems to be affecting every channel starting on the 27 of december.

I will investigate this issue tomorrow.
Priority: -- → P2
Whiteboard: [jsbugmon:update] → [jsbugmon:update,ignore]
JSBugMon: The testcase found in this bug no longer reproduces (tried revision 3ee38289dac8).
Whiteboard: [jsbugmon:update,ignore] → [jsbugmon:bisectfix]
Whiteboard: [jsbugmon:bisectfix] → [jsbugmon:]
JSBugMon: Fix Bisection requested, result:
autoBisect shows this is probably related to the following changeset:

The first good revision is:
user:        Jon Coppeard
date:        Tue Jan 30 17:57:40 2018 +0000
summary:     Bug 1432794 - Skip prototype and constructor intialization for off-thread parsing r=jandem

This iteration took 269.721 seconds to run.
Jon/Nicolas, is bug 1432794 a likely fix?
Flags: needinfo?(jcoppeard)
No, I thnk that just made the testcase not reproduce by changing allocation patterns.
Flags: needinfo?(jcoppeard)
Well, it's possible I guess if the problem turned out to be OOM handling while setting up off-thread parsing.  I think we still need to find out what the original problem was.
// Adapted from randomly chosen test: js/src/jit-test/tests/xdr/bug1390856.js
let x = cacheEntry("");
evaluate(x, {
    saveIncrementalBytecode: true
for (let i = 1; i < 19; ++i) {
    try {
        oomAtAllocation(i, 4);
    } catch (e) {}
// jsfunfuzz-generated

crashes 64-bit debug shell on m-c rev d7e0e17065ca at JSScript::functionNonDelazifying with --fuzzing-safe --ion-offthread-compile=off --ion-eager

Please look at this testcase before the situation changes again.
Summary: Crash [@ JSScript::functionNonDelazifying] → Crash [@ JSScript::functionNonDelazifying] involving saveIncrementalBytecode
Whiteboard: [jsbugmon:] → [jsbugmon:update,testComment=9,origRev=d7e0e17065ca]
Priority: P2 → P1
The problem here is that GenerateLcovInfo is iterating over a partially initialized JSScript.

Because of oomAtAllocation(i, 4), we fail to allocate in NewEmptyScopeData. MOZ_TRY fails all the way up: NewEmptyScopeData -> Scope::XDRSizedBindingNames -> GlobalScope::XDR -> js::XDRScript -> js::XDRState::codeScript -> ScriptDecodeTask::parse. In parse(), we see that res.isOk() is false, so we don't append the new script to the scripts GCVector that ScriptDecodeTask wants to return. However, the script allocation still exists. Because we bailed halfway through the decoding process, some of the script data is just garbage.

In GenerateLcovInfo, for each zone, we iterate over the JSScript cells in that zone:

    for (ZonesIter zone(rt, SkipAtoms); !zone.done(); {
        for (auto script = zone->cellIter<JSScript>(); !script.done(); {

The half-initialized script is still sitting in an arena, so the iterator finds it and checks to see if we need to collect its code coverage information. The call to isTopLevel() dereferences a null scope pointer, and we segfault.

I'm not sure if the solution is to fix the iterator in GenerateLcovInfo, to be more aggressive about cleaning up the failed script decode, or a combination of both.
Assignee: nobody → iireland
QA Contact: sdetar
QA Contact: sdetar
Thanks for analyzing this! Taking this bug as part of my JSScript::data / XDRScript cleanup. (Bug 1471062 and its follow-ups).
Flags: needinfo?(nicolas.b.pierron)
Assignee: iireland → tcampbell
Crash-stat no longer report the signature on Firefox 63.
As this is only a JS Shell / code-coverage harness issue, and not a fuzzblocker I will downgrade it to P3.
Failures during XDR may leave the script partially initialized in a way
that confuses coverage collection. This ensures the shared script data
is removed from a script if there are any XDR failures in it.
Pushed by
XDR failures should leave script isUncompleted(). r=nbp
Closed: 2 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla65
Comment on attachment 9020466 [details]
Bug 1427860 - XDR failures should leave script isUncompleted(). r?nbp

[Beta/Release Uplift Approval Request]

Feature/Bug causing the regression: None

User impact if declined: This might explain/fix some startup / XDR crashes that still occur. We are early in Beta cycle and this patch is a low-risk change.

Is this code covered by automated tests?: Yes

Has the fix been verified in Nightly?: Yes

Needs manual test from QE?: No

If yes, steps to reproduce: 

List of other uplifts needed: None

Risk to taking this patch: Low

Why is the change risky/not risky? (and alternatives if risky): Low risk change and only impacting low-mem / profile-corruption.

String changes made/needed:
Attachment #9020466 - Flags: approval-mozilla-beta?
I'd like to try this on beta and see if it has any impacts on long-standing crashes. If there is any improvement, it might be a good candidate for ESR60.
Flags: qe-verify-
Comment on attachment 9020466 [details]
Bug 1427860 - XDR failures should leave script isUncompleted(). r?nbp

[Triage Comment]
Fix which may help some of the remaining XDR crashes we see. Approved for 64.0b6.
Attachment #9020466 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
Do any of the crashes on ESR60 with this signature look related to this bug? If so, should we consider uplifting this fix?
Flags: needinfo?(tcampbell)
Don't see anything that suggests this fixes anything but lcov under OOM.
Flags: needinfo?(tcampbell)
You need to log in before you can comment on or make changes to this bug.