Last Comment Bug 674522 - s390x freeze in javascript interpreter
: s390x freeze in javascript interpreter
Status: RESOLVED FIXED
[js-triage-done]
:
Product: Core
Classification: Components
Component: JavaScript Engine (show other bugs)
: 5 Branch
: Other Linux
: -- normal (vote)
: mozilla8
Assigned To: Luke Wagner [:luke]
:
Mentors:
: 680652 (view as bug list)
Depends on:
Blocks:
  Show dependency treegraph
 
Reported: 2011-07-27 07:26 PDT by Jan Horak
Modified: 2011-08-20 05:48 PDT (History)
9 users (show)
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---


Attachments
fix (626 bytes, patch)
2011-07-28 10:16 PDT, Luke Wagner [:luke]
no flags Details | Diff | Review
take 2 (2.13 KB, patch)
2011-07-29 08:46 PDT, Luke Wagner [:luke]
no flags Details | Diff | Review
removed unrelevant static assert (1.38 KB, patch)
2011-08-03 03:38 PDT, Jan Horak
jwalden+bmo: review+
Details | Diff | Review

Description Jan Horak 2011-07-27 07:26:52 PDT
We're having problems with s390x builds of Firefox and Thunderbird. They freeze in execution of following javascript code (from Thunderbird code: modules/gloda/log4moz.js):
    for (let i = 0; i < pieces.length - 1; i++) {
      dump("start i = "+i+"\n");
      dump(pieces.length+" pieces.length\n")
      dump(cur+"\n");
      if (cur)
        cur += '.' + pieces[i];
      else
        cur = pieces[i];
      if (cur in this._loggers)
        parent = cur;
      dump("end i = "+i+"\n")
    }
[dumps added by me]
I get following output:
start i = 0
2 pieces.length
undefined
end i = 0
start i = 2.121995791e-314
2 pieces.length
gloda
end i = 2.121995791e-314
start i = 0
3 pieces.length
undefined
end i = 0
start i = 2.121995791e-314
3 pieces.length
gloda
end i = 2.121995791e-314
start i = 1
3 pieces.length
gloda.undefined
end i = 1
start i = 4.2439915824e-314
3 pieces.length
gloda.undefined.ds
end i = 4.2439915824e-314
start i = 1
3 pieces.length
gloda.undefined.ds.undefined
end i = 1
start i = 4.2439915824e-314
3 pieces.length
gloda.undefined.ds.undefined.ds
end i = 4.2439915824e-314
start i = 1
3 pieces.length
gloda.undefined.ds.undefined.ds.undefined
end i = 1
...[etc]...
It seems something wrong is going on with i property (or whole js stack?). Any debugging thoughts? Thanks in advance.
Comment 1 David Mandelin [:dmandelin] 2011-07-27 09:27:32 PDT
1. I assume this runs in the interpreter only on s390x?

2. Is s390x 32-bit or 64-bit? 2.121995791e-314 is 0x200000000 and 4.2439915824e-314 is 0x200000001. The low 32 bits look like a plausible value for i (modulo its not getting updated propertly), but the high 32 bits are wrong.

3. What happens if you s/let/var ?
Comment 2 Luke Wagner [:luke] 2011-07-27 09:32:48 PDT
Also, it would be useful to know a regression range.  In particular, has worked since after bug 549143 landed (http://hg.mozilla.org/mozilla-central/rev/9c869e64ee26)?
Comment 3 Jan Horak 2011-07-27 23:30:53 PDT
. (In reply to comment #1)
> 1. I assume this runs in the interpreter only on s390x?
Yes, problems occures in js::Interpret function

> 2. Is s390x 32-bit or 64-bit? 
s390x is 64 bit 
> 2.121995791e-314 is 0x200000000 and 4.2439915824e-314 is 0x200000001. The low 32 bits look like a plausible
> value for i (modulo its not getting updated propertly), but the high 32 bits
> are wrong.
Hm, I didn't try to convert double to hex. Checking it now cycle starts with:
0                 = 0x0000000000000000
2.121995791e-314  = 0x0000000100000000
0                 = 0x0000000000000000
2.121995791e-314  = 0x0000000100000000
and followed by:
1                 = 0x0000000000000001
4.2439915824e-314 = 0x0000000200000001
forever.


> 3. What happens if you s/let/var ?
Same result - nothing changed.

Could this be related to:
https://bugzilla.mozilla.org/show_bug.cgi?id=673097
or 
https://bugzilla.mozilla.org/show_bug.cgi?id=589735 ?
Comment 4 Jan Horak 2011-07-27 23:36:20 PDT
(In reply to comment #2)
> Also, it would be useful to know a regression range.  In particular, has
> worked since after bug 549143 landed
> (http://hg.mozilla.org/mozilla-central/rev/9c869e64ee26)?
We were fine with Firefox 3.6.19 (1.9.2) but that's rather old version. I try to revert mentioned changeset an let you know.
Comment 5 Luke Wagner [:luke] 2011-07-28 08:31:23 PDT
Ah, comment 3 makes this look like an endian-ness thing.  It looks like jsval.h uses #if defined(IS_LITTLE_ENDIAN) to catch little-endian and #else assumes big endian.  Could you check whether IS_LITTLE_ENDIAN is correct for your build?
Comment 6 Jan Horak 2011-07-28 09:49:35 PDT
First of all thanks for helping me out with this.
S390x should be big endian and it seems that this is detected correctly.

(gdb) ptype jsval_layout
type = union jsval_layout {
    uint64 asBits;
    struct {
        JSValueTag tag : 17;
        uint64 payload47 : 47;
    } debugView;
    struct {
        union {...} payload;
    } s;
    double asDouble;
    void *asPtr;
}
which should be right part of jsval.h

content of jsautocfg.h:

#ifndef js_cpucfg___
#define js_cpucfg___

/* AUTOMATICALLY GENERATED - DO NOT EDIT */

#undef  IS_LITTLE_ENDIAN
#define IS_BIG_ENDIAN 1

#ifdef __hppa
# define JS_STACK_GROWTH_DIRECTION (1)
#else
# define JS_STACK_GROWTH_DIRECTION (-1)
#endif
#endif /* js_cpucfg___ */
Comment 7 Luke Wagner [:luke] 2011-07-28 10:16:03 PDT
Created attachment 549155 [details] [diff] [review]
fix

Ah ha.  The 64-bit jsval_layout struct is not properly ladding out the 32-bit payload and js::Value::getInt32Ref() (used for the i++) is thus referencing the wrong 32-bit word.  Something tells me bug 618485 didn't actually execute code after getting it to compile :)

Pre-emptive question: does s390 ensure that all your pointers to static, malloced, and mmaped memory has the high 17 bits set to 0?  If you are running code at all it gives me hope that the answer is 'yes'.
Comment 8 Luke Wagner [:luke] 2011-07-28 10:16:35 PDT
Oh, I forgot to ask: could you test whether the patch in comment 7 compiles and works?
Comment 9 Jan Horak 2011-07-29 00:40:05 PDT
Thanks for the patch. Following asserts failed during compile:

JS_STATIC_ASSERT(offsetof(jsval_layout, s.payload) == 0);
JS_STATIC_ASSERT(offsetof(JSPropertyDescriptor, shortid) == offsetof(PropertyDescriptor, shortid));
JS_STATIC_ASSERT(sizeof(JSPropertyDescriptor) == sizeof(PropertyDescriptor));
JS_STATIC_ASSERT(sizeof(JSObject) % sizeof(js::Value) == 0);

When ignoring them js engine become unusable (can't call DumpJSStack() in gdb).
Comment 10 Luke Wagner [:luke] 2011-07-29 08:46:27 PDT
Created attachment 549386 [details] [diff] [review]
take 2

Ah, I see someone has added a word-sized member to the payload union.  That means my last patch was bloating jsval_layout to 12 bytes.  (I'm surprised we don't static assert that somewhere...)  What about this patch?
Comment 11 Jan Horak 2011-07-29 10:42:35 PDT
Um, still one assert remains:
thunderbird-5.0/comm-miramar/mozilla/js/src/jsvalue.h:294: error: size of array 'js_static_assert6' is negative:
JS_STATIC_ASSERT(offsetof(jsval_layout, s.payload) == 0);

ignoring it by commenting it out leads to the same result as with previous patch.
Comment 12 Luke Wagner [:luke] 2011-07-29 11:04:55 PDT
Yeah, that static assert seems bogus.

In general, though, you may be the first person executing this code on 64-bit big-endian so to get it working you may have to actually step into the code and debug it a bit.

Btw, do you have an answer to my question in comment 7?
Comment 13 Jan Horak 2011-08-03 03:38:56 PDT
Created attachment 550344 [details] [diff] [review]
removed unrelevant static assert

I've wrongly linked the libraries and reported this patch as non-working. Actually it is working perfectly. I'm sending a bit modified version with removed no longer necessary static assert. It would be nice to have it in trunk. Thanks.
Comment 14 Luke Wagner [:luke] 2011-08-03 11:06:24 PDT
Comment on attachment 550344 [details] [diff] [review]
removed unrelevant static assert

Yes, we'll get this landed.  I wrote the patch, so I suppose someone else to review it :)
Comment 16 Luke Wagner [:luke] 2011-08-04 23:21:27 PDT
Oops, backed out because of silly jsuword/uint64 incompatibility on 64-bit osx64:
http://hg.mozilla.org/integration/mozilla-inbound/rev/706c47369f83
Relanded:
http://hg.mozilla.org/integration/mozilla-inbound/rev/1d186a5f3a96
Comment 17 Marco Bonardo [::mak] 2011-08-05 09:08:56 PDT
http://hg.mozilla.org/mozilla-central/rev/1d186a5f3a96
Comment 18 Landry Breuil (:gaston) 2011-08-06 01:32:43 PDT
Sorry to bring bad news, but it seems that change broke sparc64 builds :

http://buildbot.rhaalovely.net/builders/mozilla-central-sparc64/builds/108/steps/build/logs/stdio with rev 5684f06138f39e6c6b95cb076cdbe449875a1c2d was ok (well, busted because of bug 676924 but that's another story) and now with rev be090ee1747a378bef88e392164ad01548d912ed including that commit, it fails with :

/var/buildslave/mozilla-central-sparc64/build/js/src/jsvalue.h:298: error: size of array 'arg' is negative

See http://buildbot.rhaalovely.net/builders/mozilla-central-sparc64/builds/109/steps/build/logs/stdio for full log.

Should i reopen or file another bug ?
Comment 19 Landry Breuil (:gaston) 2011-08-07 02:01:25 PDT
Of course, commenting out that offending JS_ASSERT makes the build go further, but i doubt its the way to go.

JS_STATIC_ASSERT(offsetof(jsval_layout, s.payload) == 0);
Comment 20 Luke Wagner [:luke] 2011-08-08 09:02:36 PDT
D'oh, I forgot to remove that.  I think that static assert should be bogus.  comment 13 seems to confirm this.
Comment 21 Luke Wagner [:luke] 2011-08-08 10:49:28 PDT
Removed, and fix strict-aliasing warning:
http://hg.mozilla.org/integration/mozilla-inbound/rev/6f2c0dbb88d3
Comment 22 :Ehsan Akhgari (out sick) 2011-08-09 08:55:52 PDT
http://hg.mozilla.org/mozilla-central/rev/6f2c0dbb88d3
Comment 23 Mike Hommey [:glandium] 2011-08-20 05:48:27 PDT
*** Bug 680652 has been marked as a duplicate of this bug. ***

Note You need to log in before you can comment on or make changes to this bug.