Intermittent crash in jsreftest.html?test=ecma_5/RegExp/regress-617935.js | load failed: null, or application timed out after 330 seconds with no output [@ JSScript::markChildren]

RESOLVED FIXED in mozilla14

Status

()

Core
JavaScript Engine
RESOLVED FIXED
6 years ago
5 years ago

People

(Reporter: RyanVM, Assigned: Waldo)

Tracking

({intermittent-failure})

Trunk
mozilla14
All
Linux
intermittent-failure
Points:
---

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(3 attachments)

(Reporter)

Description

6 years ago
https://tbpl.mozilla.org/php/getParsedLog.php?id=10130105&tree=Mozilla-Inbound&full=1

Rev3 Fedora 12x64 mozilla-inbound opt test jsreftest on 2012-03-16 12:17:19 PDT for push 372f90787ec8

builder: mozilla-inbound_fedora64_test-jsreftest
slave: talos-r3-fed64-013
starttime: 1331925439.4
results: warnings (1)
buildid: 20120316120415
builduid: fcd51f6d0b8a4571a4fe336bc74bb6c6
revision: 372f90787ec8

TEST-UNEXPECTED-FAIL | file:///home/cltbld/talos-slave/test/build/jsreftest/tests/jsreftest.html?test=ecma_5/RegExp/regress-617935.js | application timed out after 330 seconds with no output


PROCESS-CRASH | file:///home/cltbld/talos-slave/test/build/jsreftest/tests/jsreftest.html?test=ecma_5/RegExp/regress-617935.js | application crashed (minidump found)
Crash dump filename: /tmp/tmpBjo6mW/minidumps/0bd53a3f-fd69-13d1-006f7468-36cc1944.dmp
Operating system: Linux
                  0.0.0 Linux 2.6.31.5-127.fc12.x86_64 #1 SMP Sat Nov 7 21:11:14 EST 2009 x86_64
CPU: amd64
     family 6 model 23 stepping 10
     2 CPUs

Crash reason:  SIGABRT
Crash address: 0x1f40000087c

Thread 0 (crashed)
 0  libxul.so!JSScript::markChildren [jsscript.cpp : 2007 + 0x0]
    rbx = 0x584602e0   r12 = 0x0000002a   r13 = 0x61346860   r14 = 0x57acd000
    r15 = 0x6c207284   rip = 0x6bf84501   rsp = 0x871bab60   rbp = 0x61346180
    Found by: given as instruction pointer in context
 1  libxul.so!js::MarkRuntime [jsgc.cpp : 2154 + 0xb]
    rbx = 0x57acc140   r12 = 0x61346000   r13 = 0x61346860   r14 = 0x57acd000
    r15 = 0x6c207284   rip = 0x6bf098c4   rsp = 0x871bab80   rbp = 0x61346180
    Found by: call frame info
 2  libxul.so!BeginMarkPhase [jsgc.cpp : 2986 + 0x4]
    rbx = 0x61346000   r12 = 0x4b022c00   r13 = 0x4b022c00   r14 = 0x00000000
    r15 = 0x00000000   rip = 0x6bf09c64   rsp = 0x871bac20   rbp = 0x61346250
    Found by: call frame info
 3  libxul.so!GCCycle [jsgc.cpp : 3285 + 0x4]
    rbx = 0x61346000   r12 = 0x59f92a10   r13 = 0x00000000   r14 = 0x00000000
    r15 = 0x00000000   rip = 0x6bf0e384   rsp = 0x871bac50   rbp = 0x61346000
    Found by: call frame info
 4  libxul.so!Collect [jsgc.cpp : 3722 + 0xd]
    rbx = 0x61346000   r12 = 0x00000000   r13 = 0x59f92a10   r14 = 0x61346250
    r15 = 0x00000000   rip = 0x6bf0e6b4   rsp = 0x871bad00   rbp = 0x61346860
    Found by: call frame info
 5  libxul.so!nsJSContext::ScriptEvaluated [nsJSEnvironment.cpp : 2967 + 0x8]
    rbx = 0x57932ca0   r12 = 0x00000000   r13 = 0x51f13701   r14 = 0x00000000
    r15 = 0x4c43b520   rip = 0x6b68f5fa   rsp = 0x871bad60   rbp = 0x871bafef
    Found by: call frame info
 6  libxul.so!nsJSContext::EvaluateString [nsJSEnvironment.cpp : 1487 + 0x10]
    rbx = 0x57932ca0   r12 = 0x00000000   r13 = 0x51f13740   r14 = 0x00000000
    r15 = 0x4c43b520   rip = 0x6b6941dc   rsp = 0x871bad90   rbp = 0x871bafef
    Found by: call frame info
 7  libxul.so!nsScriptLoader::EvaluateScript [nsScriptLoader.cpp : 923 + 0x4e]
    rbx = 0x00000002   r12 = 0x4ced50d0   r13 = 0x4c37b0a0   r14 = 0x4ced5080
    r15 = 0x00000001   rip = 0x6b540e05   rsp = 0x871baf00   rbp = 0x57932ca0
    Found by: call frame info
 8  libxul.so!nsScriptLoader::ProcessRequest [nsScriptLoader.cpp : 816 + 0xd]
    rbx = 0x515a6600   r12 = 0x00000000   r13 = 0x00000002   r14 = 0x515a6620
    r15 = 0x871bb130   rip = 0x6b5410c5   rsp = 0x871bb030   rbp = 0x4ced5080
    Found by: call frame info
 9  libxul.so!nsScriptLoader::ProcessPendingRequests [nsScriptLoader.cpp : 966 + 0xc]
    rbx = 0x4ced5080   r12 = 0x00000000   r13 = 0x6e55b000   r14 = 0x871bb36f
    r15 = 0x6e53c6c0   rip = 0x6b541dbf   rsp = 0x871bb1b0   rbp = 0x00000000
    Found by: call frame info
10  libxul.so!nsScriptLoader::OnStreamComplete [nsScriptLoader.cpp : 1202 + 0x7]
    rbx = 0x4ced5080   r12 = 0x00000000   r13 = 0x6e55b000   r14 = 0x871bb36f
    r15 = 0x6e53c6c0   rip = 0x6b54219d   rsp = 0x871bb1f0   rbp = 0x00000000
    Found by: call frame info

...and some more below it.
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Jeff, could you take a look at this? There are two suspicious things going on here that may be related.

1. When I run this test from a debug-mode shell, it gets a lot slower. When I bisected, I got this:

The first bad revision is:
changeset:   89509:1788def989ae
user:        Jeff Walden <jwalden@mit.edu>
date:        Wed Mar 14 15:29:29 2012 -0700
summary:     Bug 735313 - StringBuffer still needs length validation.  r=luke

2. Then I looked at the jsreftest logs before and after this changeset.

Log for parent changeset:
  https://tbpl.mozilla.org/php/getParsedLog.php?id=10114237&tree=Mozilla-Inbound&full=1

Log for 1788def989ae:
  https://tbpl.mozilla.org/php/getParsedLog.php?id=10111551&tree=Mozilla-Inbound&full=1

Note that in the first log, the test seems to finish normally. The slowest test is regress-476427.js, which took ~70s. In the second log, the slowest test is regress-617935.js, which took over 200s. I see similar results for later pushes.

I don't really know what's going on here, but it seems like something did get slower and it may be causing timeouts.
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
I have a patch that seems to trigger this 100% of the time on my local machine...
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Assignee: general → jwalden+bmo
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Created attachment 609839 [details] [diff] [review]
Check operation limit on each iteration of matching

So we discussed doing this, and we discussed readding back length checks in appends.  The former is easy, and this patch does it.  The latter is complicated to do if you want to use access controls to guarantee every method includes the right check.  So let's see if just the former is sufficient for the instant problem.
Attachment #609839 - Flags: review?(luke)

Updated

6 years ago
Attachment #609839 - Flags: review?(luke) → review+
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
https://hg.mozilla.org/integration/mozilla-inbound/rev/4930669183ed
Status: NEW → ASSIGNED
Target Milestone: --- → mozilla14
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
https://hg.mozilla.org/mozilla-central/rev/4930669183ed
Status: ASSIGNED → RESOLVED
Last Resolved: 6 years ago
Resolution: --- → FIXED
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Hmm, so I guess that didn't fix it.  Maybe I really do have to length-check all the appends.  :-\
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Summary: Intermittent crash in jsreftest.html?test=ecma_5/RegExp/regress-617935.js | application timed out after 330 seconds with no output [@ JSScript::markChildren] → Intermittent crash in jsreftest.html?test=ecma_5/RegExp/regress-617935.js | load failed: null, or application timed out after 330 seconds with no output [@ JSScript::markChildren]
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Created attachment 612014 [details] [diff] [review]
Take two: make string buffers never allocate more memory than necessary to represent a maximum-sized string

Let's see if this works.  The alternative is to add back the checks to all the append methods, or something, which is pretty hairy to get right, so let's hope this does the trick.
Attachment #612014 - Flags: review?(luke)

Updated

5 years ago
Attachment #612014 - Flags: review?(luke) → review+
Comment hidden (Treeherder Robot)
One more time!

https://hg.mozilla.org/integration/mozilla-inbound/rev/6c435c96f10a
Alas, backed out in https://hg.mozilla.org/integration/mozilla-inbound/rev/d0a19172c707 for Mac "jsreftest.html?test=js1_5/extensions/regress-342960.js | Unknown file:///Users/cltbld/talos-slave/test/build/jsreftest/tests/js1_5/extensions/regress-342960.js:68: allocation size overflow item 1" like https://tbpl.mozilla.org/php/getParsedLog.php?id=10621505&tree=Mozilla-Inbound
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
https://tbpl.mozilla.org/php/getParsedLog.php?id=10751431&tree=Mozilla-Inbound
https://tbpl.mozilla.org/php/getParsedLog.php?id=10753425&tree=Mozilla-Inbound
https://tbpl.mozilla.org/php/getParsedLog.php?id=10759395&tree=Mozilla-Inbound
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Created attachment 615563 [details] [diff] [review]
Change the test's loop limits

At the time the test was written, string lengths could be any arbitrary uint32_t.  (I think.)  The test was verifying that strings whose lengths had the high bit set were handled correctly, and that the length wasn't treated as wrapping.

These days string lengths aren't uint32_t, they're, um, uint28_t.  So the loop limits used now are pointlessly big.  I'm going to bet we can fix the intermittency here by lowering those limits, so we don't try creating an overbig string that we know for a fact won't overflow.  It'll still be possible to create a StringBuffer wrapping characters that are overlong, but I think this is so uncommon that it's not necessary to complain eagerly when it happens.  One way or the other we'll be throwing an exception at an unpredictable time, so code trying to rely on this is already all sorts of dodgy.
Attachment #615563 - Flags: review?(luke)

Updated

5 years ago
Attachment #615563 - Flags: review?(luke) → review+
https://hg.mozilla.org/mozilla-central/rev/6b33a2960e68
Status: REOPENED → RESOLVED
Last Resolved: 6 years ago5 years ago
Resolution: --- → FIXED
Keywords: intermittent-failure
Whiteboard: [orange]
You need to log in before you can comment on or make changes to this bug.