Closed Bug 425576 Opened 16 years ago Closed 16 years ago

Crash on login to Excite Japan Blog (exblog.jp) after updating to Firefox 2.0.0.13 [@ js_MarkGCThing]

Categories

(Core :: JavaScript Engine, defect, P1)

1.8 Branch
defect

Tracking

()

VERIFIED FIXED

People

(Reporter: kohei, Assigned: igor)

References

()

Details

(7 keywords, Whiteboard: [sg:critical?])

Crash Data

Attachments

(4 files, 2 obsolete files)

Some Japanese users reported that they have encountered crashes on login to Excite Blog <http://www.exblog.jp/> after updating to Firefox 2.0.0.13.

http://forums.mozillazine.jp/viewtopic.php?t=7215
http://takublog.exblog.jp/7581761/
http://asfradio.exblog.jp/8538149/
http://sanggar.exblog.jp/6943437/
http://ddrer.exblog.jp/7736962/
http://th69.exblog.jp/7574222/
http://runtotorun.exblog.jp/7735841/

- error message says: The instruction at "0x600f0290" referenced memory at "0x000013e8". The Memory could not be "read". Click OK to terminate the application.
- both on Windows and Mac
- no problem with the "secure mode login" (with SSL)
- no problem when login from the home page, crash when login form "login" links on each blogs
Attached file Apple Crash Report
Confirmed on Windows XP and Mac OS X v10.4.11. Here's the Apple crash report. Talkback was not invoked.
Severity: normal → critical
Flags: blocking1.8.1.14?
Priority: -- → P1
(In reply to comment #0)
> Some Japanese users reported that they have encountered crashes on login to
> Excite Blog <http://www.exblog.jp/> after updating to Firefox 2.0.0.13.

Yoshino-san, what are the steps to reproduce this?


1. input username/password on http://www.exblog.jp/
2. wait a few seconds after successful login
3. crash

I didn't see the error message in my comment 0.
Summary: Crash on login to Excite Blog after updating to Firefox 2.0.0.13 → Crash on login to Excite Japan Blog (exblog.jp) after updating to Firefox 2.0.0.13
The crash didn't happen every time.

On Windows Vista, Talkback was invoked instead of the system's crash reporter.
TB43179753
Incident ID: 43179753
Stack Signature	js_MarkGCThing 5dfb3e16
Product ID	Firefox2
Build ID	2008031114
Trigger Time	2008-03-27 21:40:54.0
Platform	Win32
Operating System	Windows NT 6.0 build 6000
Module	js3250.dll + (0001f93d)
URL visited	http://www.exblog.jp
User Comments	
Since Last Crash	100 sec
Total Uptime	283 sec
Trigger Reason	Access violation
Source File, Line No.	c:/builds/tinderbox/Fx-Mozilla1.8-Release/WINNT_5.2_Depend/mozilla/js/src/jsgc.c, line 2447
Stack Trace 	
js_MarkGCThing  [mozilla/js/src/jsgc.c, line 2447]
JS_GC  [mozilla/js/src/jsapi.c, line 1879]
nsAppStartup::Run  [mozilla/toolkit/components/startup/src/nsAppStartup.cpp, line 152]
main  [mozilla/browser/app/nsBrowserApp.cpp, line 61]
kernel32.dll + 0x43833 (0x76d93833)
ntdll.dll + 0x3a9bd (0x76fba9bd)
Assignee: nobody → general
Component: General → JavaScript Engine
Product: Firefox → Core
QA Contact: general → general
Summary: Crash on login to Excite Japan Blog (exblog.jp) after updating to Firefox 2.0.0.13 → Crash on login to Excite Japan Blog (exblog.jp) after updating to Firefox 2.0.0.13 [@ js_MarkGCThing]
Version: 2.0 Branch → 1.8 Branch
Used latest 1.8 trunk build, no crash:

Mozilla/5.0 (Macintosh; U; PPC Mac OS X Mach-O; en-US; rv:1.8.1.14pre) Gecko/20080327 BonEcho/2.0.0.14pre

Steps used:

1. go to http://www.exblog.jp/
2. using the textboxes at the top left corner, enter:
   user: firefoxcrash
   pw: password
3. unclick checkbox to select non-SSL login
4. click on the orange login button
5. wait several seconds

> no problem when login from the home page, crash when login form "login" links
on each blogs

????

If the login process is not on the main Excite page, please note better steps, thanks.
Crash confirmed with the 1.8 trunk:

Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.8.1.14pre) Gecko/20080327 BonEcho/2.0.0.14pre

TB43180349, same as comment 6.
 
>> no problem when login from the home page, crash when login form "login" links
> on each blogs
> 
> ????
> 
> If the login process is not on the main Excite page, please note better steps,
> thanks.

The crash happens on the home page. Sorry for confusing.
OK, this seems to work consistently:

Steps
1. go to http://testingff.exblog.jp/
2. click on the login [ ログイン ] link in the upper left corner
3. using the textboxes at the top left corner, enter:
   user [ エキサイトID ]: firefoxcrash
   pw [ パスワード ]: password
4. leave the SSL checkbox enabled
5. click on the orange sign in button

Result: returns to initial blog page but crashes after a few seconds

Same stack trace as other reports:

#0  0x2302e978 in js_MarkGCThing ()
#1  0x2302f348 in js_GC ()
#2  0x23003c74 in JS_GC ()
#3  0x005c0868 in nsJSContext::Notify ()
#4  0x2c04bb10 in nsTimerImpl::Fire ()
#5  0x2c04bc1c in handleTimerEvent ()
#6  0x2c047aa0 in PL_HandleEvent ()
#7  0x2c0479c4 in PL_ProcessPendingEvents ()
#8  0x907df2ec in __CFRunLoopDoSources0 ()
#9  0x907de81c in __CFRunLoopRun ()
#10 0x907de29c in CFRunLoopRunSpecific ()
#11 0x932abb20 in RunCurrentEventLoopInMode ()
#12 0x932ab1b4 in ReceiveNextEventCommon ()
#13 0x932f0348 in AcquireNextEventInMode ()
#14 0x932f0138 in RunApplicationEventLoop ()
#15 0x00298d2c in nsAppShell::Run ()
#16 0x00336a48 in nsAppStartup::Run ()
#17 0x00012a14 in XRE_main ()
#18 0x0000d768 in start ()
Flags: blocking1.8.0.15?
Narrowed down the regression range:

Does not occur in:

Firefox 2.0.0.12
Mozilla/5.0 (Macintosh; U; PPC Mac OS X Mach-O; ja-JP-mac; rv:1.8.1.12) Gecko/20080201 Firefox/2.0.0.12

Mozilla/5.0 (Macintosh; U; PPC Mac OS X Mach-O; en-US; rv:1.8.1.13pre) Gecko/20080215 BonEcho/2.0.0.13pre
Build time: 15-Feb-2008 05:35

But does occur in:

Mozilla/5.0 (Macintosh; U; PPC Mac OS X Mach-O; en-US; rv:1.8.1.13pre) Gecko/20080216 BonEcho/2.0.0.13pre
Build time: 16-Feb-2008 08:15

Checkins to mozilla/js during that time:

http://bonsai.mozilla.org/cvsquery.cgi?treeid=default&module=all&branch=MOZILLA_1_8_BRANCH&branchtype=match&dir=mozilla%2Fjs&file=&filetype=match&who=&whotype=match&sortby=Date&hours=2&date=explicit&mindate=2008-02-15&maxdate=2008-02-17&cvsroot=%2Fcvsroot

Possible regression caused by patch for bug 411025?



Assignee: general → igor
It's possible. I commented in bug 411025 comment 24 (and 25) about a couple of problems that could cause crashes like this. However, I'd expect to see an error in the error console for either of those problems to occur and when I try to test locally, I don't see any errors in the error console.
Not sure this is at all related, but don't we need a JS_POP_TEMP_ROOT to match the JS_PUSH_TEMP_ROOT_FUNCTION before the return here:

http://mxr.mozilla.org/mozilla1.8/source/js/src/jsapi.c#4116

The JS_PUSH_TEMP_ROOT_FUNCTION pushes on a stack-allocated value, so if we somehow "escape" this function that we'll be pointing to random poo on the stack, hence the crash later in jsgc.c at line 2947:

  JSTempValueRooter tvr;

  /* From this point the control must flow through the label out. */
  JS_PUSH_TEMP_ROOT_FUNCTION(cx, fun, &tvr);

But I really don't know this code so I could be way off base here...
bug 411025 comment 25. how timely. good indepedent spot
Depends on: 411025
Great, looks like timeless has set up a potential patch for this attached to bug 411025.


Simple patch based on Blake's suggestion and the patch timeless set up on 411025.  This patch is a simple delta on the MOZILLA_1_8_BRANCH code.

Fixes the testcase originally reported.

Will try to set up a simpler JS testcase for this to make it easier to verify later.
Attachment #312233 - Flags: superreview?(brendan)
Attachment #312233 - Flags: review?(igor)
Attachment #312233 - Flags: approval1.8.1.14?
Comment on attachment 312233 [details] [diff] [review]
patch, v.0.1, based on Blake's and timeless comments

Patches must have all required review(s) before requesting approval.
Attachment #312233 - Flags: approval1.8.1.14?
Attachment #312233 - Flags: review?(igor) → review+
Can we get a js testsuite test added for this?
Flags: in-testsuite?
I can't see all the characters on the page, and can't login to get the crashing page. Can someone who can reproduce the crash, save page complete, zip it up and attach here?
Comment on attachment 312233 [details] [diff] [review]
patch, v.0.1, based on Blake's and timeless comments

No sr required for js/src.

/be
Attachment #312233 - Flags: superreview?(brendan)
Attachment #312233 - Flags: superreview+
Attachment #312233 - Flags: approval1.9?
Do we need this on 1.9?  I thought the problem was from an error in the 1.8.1 patch.  Can people reproduce it on trunk?
(In reply to comment #21)
> Do we need this on 1.9? 

No, this the problem was introduced when I ported trunk's patch into the 1.8 branch.
It would be good to also address bug 411025 comment 24 at the same time with this patch...
Comment on attachment 312233 [details] [diff] [review]
patch, v.0.1, based on Blake's and timeless comments

It is a must to fix the issue that Blake has found here as well.
Attachment #312233 - Flags: approval1.9?
Attached patch v1 (obsolete) — Splinter Review
This is the previous patch plus the wrong goto fix.
Attachment #312233 - Attachment is obsolete: true
Attachment #312323 - Flags: review?(mrbkap)
Comment on attachment 312323 [details] [diff] [review]
v1

Thanks.
Attachment #312323 - Flags: review?(mrbkap) → review+
Attachment #312323 - Attachment is obsolete: true
Attachment #312323 - Flags: review+
Attached patch v2Splinter Review
When porting the trunk patch to the branch I not only made the goto/returns bugs, bit I also completely forgot that on 1.8 JSFunction is GCX_PRIVATE and has no associated marking/tracing code. So the patch not only introduced the bug but was completely useless to protect against GC hazards!

The new patch replaces that JS_PUSH_TEMP_FUNCTION insanity by JS_PUSH_TEMP_OBJECT and makes sure that the result of the function is rooted via GCX_OBJECT newborn.
Attachment #312333 - Flags: review?(mrbkap)
Attachment #312333 - Flags: review?(brendan)
Comment on attachment 312333 [details] [diff] [review]
v2

Many apologies for the bad review. The 1.8 branch is fading from memory and I made the mistake of ass-u-ming things fresh in mind from the trunk.

/be
Attachment #312333 - Flags: review?(brendan) → review+
Comment on attachment 312333 [details] [diff] [review]
v2

That makes a lot more sense.
Attachment #312333 - Flags: review?(mrbkap) → review+
Attachment #312333 - Flags: approval1.8.1.14?
Blocks: 425594
Comment on attachment 312333 [details] [diff] [review]
v2

approved for 1.8.1.14, a=dveditz
Attachment #312333 - Flags: approval1.8.1.14? → approval1.8.1.14+
Tried creating a simple testcase by saving the complete page but the bug doesn't occur with just the saved page unfortunately.  This is related to the use of eval's in the js, correct?  Seems like there should be a much, much simpler way of causing the bad codepath to get executed.

Excite Japan has updated their website to 'fix' this problem, so the need for a quick turn-around on 2.0.0.14 is lessened.

http://blog.excite.co.jp/staff/6962026/

That said, we'll be extra careful and test with Excite Blog to ensure that the changes that Excite Japan made today won't be reflected in the updated branch.
Flags: blocking1.8.1.14?
Attachment #312333 - Flags: approval1.8.1.14?
(In reply to comment #32)
> Excite Japan has updated their website to 'fix' this problem, so the need for a
> quick turn-around on 2.0.0.14 is lessened.

This is a new, extremely high-frequency topcrash (especially when you count the new js_GC and js_GetGCThingFlags crashes as the same thing), and almost none of the submitted comments mention Excite. I have not noticed a change in the incoming rates for these crashes after Excite fixed their site--in fact the rate seems to have gone up a little due to the usual increase in the number of mid-week users, totally swamping any relief from Excite Japan. We still need to fix this.

Igor: please land on the MOZILLA_1_8_BRANCH as soon as possible so we can test whether this does seem to help.
I checked in the patch from the comment 27 to MOZILLA_1_8_BRANCH:

Checking in jsapi.c;
/cvsroot/mozilla/js/src/jsapi.c,v  <--  jsapi.c
new revision: 3.214.2.42; previous revision: 3.214.2.41
done
Checking in jscntxt.h;
/cvsroot/mozilla/js/src/jscntxt.h,v  <--  jscntxt.h
new revision: 3.80.4.24; previous revision: 3.80.4.23
done
Checking in jsprvtd.h;
/cvsroot/mozilla/js/src/jsprvtd.h,v  <--  jsprvtd.h
new revision: 3.17.20.7; previous revision: 3.17.20.6
done
Status: NEW → RESOLVED
Closed: 16 years ago
Resolution: --- → FIXED
Question: should I mark the bug as fixed1.8.1.14 or fixed1.8.1.15 given that the fix is landed on MOZILLA_1_8_BRANCH?
(In reply to comment #35)
> Question: should I mark the bug as fixed1.8.1.14 or fixed1.8.1.15 given that
> the fix is landed on MOZILLA_1_8_BRANCH?

fixed1.8.1.15. 1.8.1.14 will come off a relbranch.

Igor, is there any way to test this programmatically?
Keywords: fixed1.8.1.15
fyi, been trying the urls found in talkback and scanning top sites and haven't yet come up with a reproducible test case.
I couldn't create a crash testcase, but the Excite Japan team told us that they only fixed their JavaScript redirection like this:

<html>
<head>
    <script type="text/javascript">
    <!--
    document.location.href= "http://www.exblog.jp/";
    //-->  
    </script>
</head>
<body>
</body>
</html>

Fixed:

<html>
<head>
    <script type="text/javascript">
    <!--
    window.onload = function(){
            window.location.replace("http://www.exblog.jp");
            return;
    }
    //-->  
    </script>
</head>
<body>
</body>
</html>
(In reply to comment #38)
> I couldn't create a crash testcase, but the Excite Japan team told us that they
> only fixed their JavaScript redirection like this:
> 
> <html>
> <head>
>     <script type="text/javascript">
>     <!--
>     document.location.href= "http://www.exblog.jp/";
>     //-->  
>     </script>
> </head>
> <body>
> </body>
> </html>

This sample code does not execute the code with the bug in JS_CompileUCFunctionForPrincipals from js/src/jsapi.c.
(In reply to comment #39)
> This sample code does not execute the code with the bug in
> JS_CompileUCFunctionForPrincipals from js/src/jsapi.c.

Just to be clear, before Excite JP updated their code I could reproduce this problem using the steps noted in comment 9.  After building with the attached patch v.0.1 (the change in JS_CompileUCFunctionForPrincipals) the crash no longer occurred.

I have the saved webpage at home but with the saved page the crash does not occur.  I'll attach it later tonight, in case that might help.
Flags: blocking1.8.0.15? → blocking1.8.0.15+
Comment on attachment 312333 [details] [diff] [review]
v2

approved for 1.8.1.14, a=dveditz for release-drivers.

Please land on the GECKO181_20080311_RELBRANCH for this re-release.
Attachment #312333 - Flags: approval1.8.1.14? → approval1.8.1.14+
I checked in the patch from the comment 27 to GECKO181_20080311_RELBRANCH:

Checking in jsapi.c;
/cvsroot/mozilla/js/src/jsapi.c,v  <--  jsapi.c
new revision: 3.214.2.41.4.1; previous revision: 3.214.2.41
done
Checking in jscntxt.h;
/cvsroot/mozilla/js/src/jscntxt.h,v  <--  jscntxt.h
new revision: 3.80.4.23.4.1; previous revision: 3.80.4.23
done
Checking in jsprvtd.h;
/cvsroot/mozilla/js/src/jsprvtd.h,v  <--  jsprvtd.h
new revision: 3.17.20.6.4.1; previous revision: 3.17.20.6
done
Keywords: fixed1.8.1.14
Flags: blocking1.8.1.15?
Flags: blocking1.8.1.15+
Flags: blocking1.8.1.14?
Flags: blocking1.8.1.14+
we|i have no idea how to reproduce or create a test for this. 
Flags: in-testsuite? → in-testsuite-
Igor is still trying to find a way to reproduce and create a test for this. Let's give him a bit before we minus this.
Flags: in-testsuite- → in-testsuite?
saved version of page that caused crash, before Excite JP made changes to prevent the crash.  Does *not* crash with saved version but maybe there are clues in here as to why it crashed.
Keywords: topcrash
Blocks: 411025
No longer depends on: 411025
Status: RESOLVED → VERIFIED
Whiteboard: [sg:critical?]
Crash Signature: [@ js_MarkGCThing]
Flags: in-testsuite? → in-testsuite-
You need to log in before you can comment on or make changes to this bug.