Closed Bug 1008791 Opened 10 years ago Closed 10 years ago

Crash in stability test [@ ? | pthread_kill | raise | __libc_android_abort | ... | js::BaseProxyHandler::get(JSContext*, JS::Handle<JSObject*>, JS::Handle<JSObject*>, JS::Handle<jsid>, JS::MutableHandle<JS::Value>) ]

Categories

(Core :: JavaScript Engine, defect)

ARM
Gonk (Firefox OS)
defect
Not set
critical

Tracking

()

RESOLVED INVALID
blocking-b2g 2.0+
Tracking Status
b2g-v1.4 --- affected
b2g-v2.0 --- affected

People

(Reporter: ggrisco, Assigned: mjrosenb)

References

Details

(Keywords: crash, Whiteboard: [caf-crash 208][caf priority: p1][CR 662112][b2g-crash][POVB])

Crash Data

Attachments

(11 files, 5 obsolete files)

1. Make an MO call.
2. While the call is in progress, send multiple SMS messages.
3. Browse YouTube for a few minutes.
4. Download games from Marketplace.
5. Open the camera and take pictures continuously for a few minutes.
6. Toggle Wi-Fi on/off multiple times.
7. Toggle Bluetooth on/off multiple times. Minidumps are generated on the phone.

Throughout the above test steps, the call stays in the background.

 [@ ? | pthread_kill | raise | __libc_android_abort | ... | js::BaseProxyHandler::get(JSContext*, JS::Handle<JSObject*>, JS::Handle<JSObject*>, JS::Handle<jsid>, JS::MutableHandle<JS::Value>) ]
Only seen once so far.
Component: General → JavaScript Engine
Keywords: crash
Product: Firefox OS → Core
Whiteboard: [CR 662112] → [CR 662112][b2g-crash]
Crash observed on: 

Device: msm8226
Firmware: AU_LABEL
Moz BuildID: 20140505000201
B2G Version: 1.4
Gecko Version: 30.0
Gaia:  http://git.mozilla.org/?p=releases/gaia.git;a=commit;h=fccb15d6940db51615545574877a62d69406b1c2
Gecko: http://git.mozilla.org/?p=releases/gecko.git;a=commit;h=3ed3bbf1941e608bc9630942c7063a8b818f36bc
Naveed,

Looks like the crash is in the JS area. Can we please investigate?
Flags: needinfo?(nihsanullah)
Marty please take a look and let Jan know if this is a JIT issue and not an ARM one.
Assignee: nobody → mrosenberg
Flags: needinfo?(nihsanullah) → needinfo?(mrosenberg)
This seems like an unrealistic use case based on the description. I'll let JS comment but not sure this would block.
Severity: major → critical
Preeti -- this is stability testing and the use case may not be realistic but the goal is to stress the system and find any issues.
(In reply to Preeti Raghunath(:Preeti) from comment #6)
> This seems like an unrealistic use case based on the description. I'll let
> JS comment but not sure this would block.

Note - we're looking to know if the info given in the bug is enough to go off of to fix this bug.
Summary: Crash in stability test → Crash in stability test [@ ? | pthread_kill | raise | __libc_android_abort | ... | js::BaseProxyHandler::get(JSContext*, JS::Handle<JSObject*>, JS::Handle<JSObject*>, JS::Handle<jsid>, JS::MutableHandle<JS::Value>) ]
Flags: needinfo?(efaustbmo)
(seen 4 times so far on v1.4)
blocking-b2g: 1.4? → 1.4+
Boy, those steps to reproduce are tough. Marty is working on getting a build flashed and trying to get some more reliable way to reproduce this so we can see what's going on.

If that doesn't work out quickly, I will need a device to work on this. Several people have offered to help me work on that, so it should go easily.

At first glance, it looks like something JIT-related, but I won't testify to that in court.

In the meantime, is there anything we can do to try to improve the STR here? It looks like that's just a stress-test that produced the crash, but otherwise we have no idea what causes it? Is that correct?
Flags: needinfo?(efaustbmo)
What timeframe do we have to work with?
(In reply to Greg Grisco from comment #1)
> Only seen once so far.

Something which is only observed once has by definition no STR.

The point of STR (Steps To Reproduce) is to be able to reproduce the bug.  Something that triggered a bug once is not an STR; otherwise we would have crashed multiple times by repeating the same steps.

Unless we have a clear way to investigate this issue, investigating it would be a waste of our time until it appears more frequently.

I am really happy that you are opening these bugs; thank you for doing so.  Attaching the procedure used to reach these bugs is also useful.  But unless we get a sense of what is going on, by comparing the steps from multiple crashes to refine what might be involved in reproducing the issue, there is not much we can do.

On the other hand, if you have a reliable way of reproducing these issues, feel free to share it with us so that we can investigate.

So unless we get a real STR, and not a crash that was seen once, I do not see any reason to set the severity to either major or critical.
Flags: needinfo?(jsmith)
Given that a crash stack is not enough information to go off of to fix this issue, we should not be blocking on this.
blocking-b2g: 1.4+ → 1.4?
Flags: needinfo?(jsmith)
The STR are unlikely to improve; this is normal with long-term stability testing of this nature.  However, if you write a patch with additional logging that you'd like to see to help debug, please let us know and we'll pick it up ASAP.  Make sure it applies cleanly on v1.4, of course.
(In reply to Nicolas B. Pierron [:nbp] from comment #12)
> Something which is only observed once has by definition no STR.

Please read comment 9
Comment on attachment 8420735 [details]
decoded minidump of crash

 7  libxul.so!js::BaseProxyHandler::get(JSContext*, JS::Handle<JSObject*>, JS::Handle<JSObject*>, JS::Handle<jsid>, JS::MutableHandle<JS::Value>) [jsproxy.cpp : 132 + 0x9]
     r0 = 0xb6ef8a59    r1 = 0x00000006    r2 = 0xb6ef8a19    sp = 0xbedf52d8
     pc = 0xb5d39f2b

I have no idea what the compiled code looks like, but my blind guess is that r1 might be a pointer in which case this bug looks like a null-deref.
(In reply to Michael Vines [:m1] [:evilmachines] from comment #15)
> (In reply to Nicolas B. Pierron [:nbp] from comment #12)
> > Something which is only observed once has by definition no STR.
> 
> Please read comment 9

Do we have the steps to failure for each of these 4 occurrences? If so, we might be able to refine them in such a way that we find a small, reproducible STR.
(In reply to Nicolas B. Pierron [:nbp] from comment #16)
> Comment on attachment 8420735 [details]
> decoded minidump of crash
> 
>  7  libxul.so!js::BaseProxyHandler::get(JSContext*, JS::Handle<JSObject*>,
> JS::Handle<JSObject*>, JS::Handle<jsid>, JS::MutableHandle<JS::Value>)
> [jsproxy.cpp : 132 + 0x9]
>      r0 = 0xb6ef8a59    r1 = 0x00000006    r2 = 0xb6ef8a19    sp = 0xbedf52d8
>      pc = 0xb5d39f2b
> 
> I have no idea what the compiled code looks like, but my blind guess is that
> r1 might be a pointer in which case this bug looks like a null-deref.

Unless I'm missing something, the following line appears in the EXTRA file:

> 05-09 16:53:03.199   211   211 F libc    : bionic/libstdc++/src/pure_virtual.cpp:6: void __cxa_pure_virtual(): assertion "!"Pure virtual function called. Are you calling virtual methods from a destructor?"" failed

This corresponds nicely with the line of the crash in jsproxy.cpp in BaseProxyHandler::get(), where we try to call getPropertyDescriptor() on the handler, which is indeed pure virtual on BaseProxyHandler.

I had assumed this was the problem. An audit of the BaseProxyHandlers defined across the codebase looking for the trivially missing function is not promising, though.
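For readers unfamiliar with how a "pure virtual function called" abort can arise, the sketch below shows the underlying C++ rule: while a base-class destructor (or constructor) runs, virtual calls dispatch to the base class, not the derived one. The classes here are hypothetical stand-ins, not the real BaseProxyHandler hierarchy, and kind() is deliberately non-pure so that the sketch runs instead of aborting. If kind() were pure virtual, the call made from ~Handler() would end up in __cxa_pure_virtual.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for a handler base class (not SpiderMonkey code).
struct Handler {
    explicit Handler(std::string* log) : log_(log) { assert(log_ != nullptr); }
    // The destructor makes an indirect virtual call, like the libc hint suggests.
    virtual ~Handler() { record(); }
    // Non-pure on purpose: a pure virtual here would abort inside ~Handler().
    virtual const char* kind() const { return "base"; }
    // Virtual dispatch through `this`; the result depends on the current dynamic type.
    void record() const { *log_ = kind(); }

  private:
    std::string* log_;
};

struct DerivedHandler : Handler {
    using Handler::Handler;
    const char* kind() const override { return "derived"; }
};

struct Observation {
    std::string alive;       // what kind() returned while the object was fully alive
    std::string destroying;  // what kind() returned from inside ~Handler()
};

Observation observe() {
    Observation obs;
    std::string log;
    {
        DerivedHandler h(&log);
        h.record();
        obs.alive = log;
    }  // ~DerivedHandler() runs, then ~Handler() calls record() again
    obs.destroying = log;
    return obs;
}
```

Calling observe() should report "derived" for the live object and "base" for the call made during destruction. A stale pointer to a handler whose derived part has already been destroyed could therefore reach the base class's pure virtual slot.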
(In reply to Eric Faust [:efaust] from comment #18)
> (In reply to Nicolas B. Pierron [:nbp] from comment #16)
> > Comment on attachment 8420735 [details]
> > decoded minidump of crash
> > 
> >  7  libxul.so!js::BaseProxyHandler::get(JSContext*, JS::Handle<JSObject*>,
> > JS::Handle<JSObject*>, JS::Handle<jsid>, JS::MutableHandle<JS::Value>)
> > [jsproxy.cpp : 132 + 0x9]
> >      r0 = 0xb6ef8a59    r1 = 0x00000006    r2 = 0xb6ef8a19    sp = 0xbedf52d8
> >      pc = 0xb5d39f2b
> > 
> > I have no idea what the compiled code looks like, but my blind guess is that
> > r1 might be a pointer in which case this bug looks like a null-deref.
> 
> Unless I'm missing something, the following line appears in the EXTRA file:
> 
> > 05-09 16:53:03.199   211   211 F libc    : bionic/libstdc++/src/pure_virtual.cpp:6: void __cxa_pure_virtual(): assertion "!"Pure virtual function called. Are you calling virtual methods from a destructor?"" failed
> 
> This corresponds nicely with the line of the crash in jsproxy.cpp in
> BaseProxyHandler::get(), where we try to call getPropertyDescriptor() on the
> handler, which is indeed pure virtual on BaseProxyHandler.
> 
> I had assumed this was the problem. An audit of the BaseProxyHandlers
> defined across the codebase looking for the trivially missing function is
> not promising, though.

Eric,

Can you please help me understand what the next steps are?
Flags: needinfo?(efaustbmo)
At the risk of speaking for Eric I see potential next steps as:

1) We need (from you?) reliable repeatable STR or a system in Mtn View that reliably demonstrates this bug. Eric and NBP are guessing at potential problems but there isn't anything obvious pointed to by the crash dump.

2) A potential next step could be to add instrumentation to the code to make a future crash easier to diagnose. At the moment it doesn't appear that Eric or NBP have found a good candidate method or set of methods to instrument. Without #1 this may be the best we can do.

Perhaps Eric and Nicolas may have other suggestions.
#2 is probably the best we can do at the moment.  We're also continuing to run stability over the weekend and week, so we'll report new occurrences of this crash here as soon as they become known.   

We've been running stability for a while now on v1.4 and comment 2 was the first occurrence of this bug.  Not exactly a tight regression range but it's maybe a start.
Crash observed on: 

Device: msm8226
Gonk Version: AU_LINUX_GECKO_B2G_KK_3.5.01.04.00.113.097
Moz BuildID: 20140511000204
B2G Version: 1.4
Gecko Version: 30.0
Gaia:  http://git.mozilla.org/?p=releases/gaia.git;a=commit;h=17fb44880e95bc7ae363a609d811bf5a9a067b5b
Gecko: http://git.mozilla.org/?p=releases/gecko.git;a=commit;h=2f11e3aba98eb785ec24504fe9988ab61a03b64d
(In reply to Michael Vines [:m1] [:evilmachines] from comment #21)
> #2 is probably the best we can do at the moment.  We're also continuing to
> run stability over the weekend and week, so we'll report new occurrences of
> this crash here as soon as they become known.   

Are you running Debug builds too?  We have a lot of things which are checked by our assertions.
Maybe we could have a better view on the origin of this issue if we find an early point of failure.

> We've been running stability for a while now on v1.4 and comment 2 was the
> first occurrence of this bug.  Not exactly a tight regression range but it's
> maybe a start.

Hum … ok, so this is likely something related to the backported patches.
I will look into it if something sounds obvious.
Flags: needinfo?(nicolas.b.pierron)
Crash observed on: 

Device: msm8226
Gonk Version: AU_LINUX_GECKO_B2G_KK_3.5.01.04.00.113.102
Moz BuildID: 20140515000202
B2G Version: 1.4
Gecko Version: 30.0
Gaia:  http://git.mozilla.org/?p=releases/gaia.git;a=commit;h=2e97bee6bb79d3577dba1bf2a1bbfcba64ee99ab
Gecko: http://git.mozilla.org/?p=releases/gecko.git;a=commit;h=35f27a8e9b3f651748aa22095553024556272de8
(In reply to Nicolas B. Pierron [:nbp] from comment #23)
> (In reply to Michael Vines [:m1] [:evilmachines] from comment #21)
> > #2 is probably the best we can do at the moment.  We're also continuing to
> > run stability over the weekend and week, so we'll report new occurrences of
> > this crash here as soon as they become known.   
> 
> Are you running Debug builds too?  We have a lot of things which are checked
> by our assertions.
> Maybe we could have a better view on the origin of this issue if we find an
> early point of failure.
> 
> > We've been running stability for a while now on v1.4 and comment 2 was the
> > first occurrence of this bug.  Not exactly a tight regression range but it's
> > maybe a start.
> 
> Hum … ok, so this is likely something related to the backported patches.
> I will look into it if something sounds obvious.

What are our next steps here? Are we able to reproduce this?
Crash observed on: 

Device: msm8226
Gonk Version: AU_LINUX_GECKO_B2G_KK_3.5.01.04.00.113.106
Moz BuildID: 20140519000201
B2G Version: 1.4
Gecko Version: 30.0
Gaia:  http://git.mozilla.org/?p=releases/gaia.git;a=commit;h=defd0650fb9d30c6515d50a89e72d8fb74ce7e62
Gecko: http://git.mozilla.org/?p=releases/gecko.git;a=commit;h=49653a2e9c8709028640af39c919f1f8f4c53806
I looked through the history of the b2g_30 repository and was not able to find any JS commit which might have caused this recent regression.

Sadly, looking in Bugzilla, I found the same signature in Bug 942917 and Bug 946956.
I will note that in all cases, this seems to happen on ARM, with 2 cores. (so far)
Flags: needinfo?(nicolas.b.pierron)
I think our remaining option is to fix known bugs that might be related.

I looked in the crash-stats results [1] and searched for any mention of BaseProxyHandler in the signature.  We have a few signatures with common stack traces.  Most of the failures started to "spike" after version 29.0 (and are mostly reported on 29.0.1).  This is also the case for the current signature [2b], even though it existed before the argument type change [2a].  I do not know whether all the signatures are related, but they seem to appear on Gecko 29.0.

Among the signatures with comments, I found one user who managed to crash twice [3] (in js::BaseProxyHandler::construct) after installing the LastPass add-on.  This might be a way for us to start investigating these issues.

[1] https://crash-stats.mozilla.com/search/?signature=~BaseProxyHandler&_facets=signature&_facets=version&_columns=date&_columns=signature&_columns=product&_columns=version&_columns=build_id&_columns=platform&_columns=cpu_info&_columns=android_cpu_abi&_columns=android_cpu_abi2

[2a] https://crash-stats.mozilla.com/report/list?signature=js%3A%3ABaseProxyHandler%3A%3Aget%28JSContext*%2C+JS%3A%3AHandle%3CJSObject*%3E%2C+JS%3A%3AHandle%3CJSObject*%3E%2C+JS%3A%3AHandle%3Cint%3E%2C+JS%3A%3AMutableHandle%3CJS%3A%3AValue%3E%29&#tab-reports
[2a] https://crash-stats.mozilla.com/report/list?signature=xul.dll%400x160f03+|+js%3A%3ABaseProxyHandler%3A%3Aget%28JSContext*%2C+JS%3A%3AHandle%3CJSObject*%3E%2C+JS%3A%3AHandle%3CJSObject*%3E%2C+JS%3A%3AHandle%3Cint%3E%2C+JS%3A%3AMutableHandle%3CJS%3A%3AValue%3E%29&#tab-reports
[2a] https://crash-stats.mozilla.com/report/list?signature=mozalloc_abort%28char+const*+const%29+|+xul.dll%400xddbccf+|+xul.dll%400x880423+|+xul.dll%400xe9f21+|+js%3A%3ABaseProxyHandler%3A%3Aget%28JSContext*%2C+JS%3A%3AHandle%3CJSObject*%3E%2C+JS%3A%3AHandle%3CJSObject*%3E%2C+JS%3A%3AHandle%3Cint%3E%2C+JS%3A%3AMutableHandle%3CJS%3A%3AValue%3E%29&#tab-reports

[2b] https://crash-stats.mozilla.com/report/list?signature=js%3A%3ABaseProxyHandler%3A%3Aget%28JSContext*%2C+JS%3A%3AHandle%3CJSObject*%3E%2C+JS%3A%3AHandle%3CJSObject*%3E%2C+JS%3A%3AHandle%3Cjsid%3E%2C+JS%3A%3AMutableHandle%3CJS%3A%3AValue%3E%29&#tab-reports

[3] https://crash-stats.mozilla.com/report/list?signature=js%3A%3ABaseProxyHandler%3A%3Aconstruct%28JSContext*%2C+JS%3A%3AHandle%3CJSObject*%3E%2C+JS%3A%3ACallArgs+const%26%29&#tab-reports
This might not be related, but here is the list of changes which have added/removed the text BaseProxyHandler:


commit http://git.mozilla.org/?p=releases/gecko.git;a=commitdiff;h=b4170c4e663b5c51dad5c4cd95b498e10d08d8ba
Author: Bill McCloskey <wmccloskey@mozilla.com>
Date:   Fri May 16 16:40:37 2014 -0700

    Bug 996785 - Bidirectional CPOWs (r=mrbkap)

commit http://git.mozilla.org/?p=releases/gecko.git;a=commitdiff;h=4ea552c32844de5d4974d4f32ad04de99e2eac0d
Author: Bill McCloskey <wmccloskey@mozilla.com>
Date:   Fri May 16 16:40:36 2014 -0700

    Bug 996785 - Move CPOW wrapper owner code (r=mrbkap)

commit http://git.mozilla.org/?p=releases/gecko.git;a=commitdiff;h=dd38d4256007b57988490ef4fc305a5e612394dd
Author: Brian Hackett <bhackett1024@gmail.com>
Date:   Fri May 9 17:31:07 2014 -0700

    Bug 976446 - Use different names for helper functions in baseline and ion ICs to avoid unified build breakage, r=jandem.

commit http://git.mozilla.org/?p=releases/gecko.git;a=commitdiff;h=c112841a98db7cea6068323cb608d05aa4789e06
Author: Wes Kocher <wkocher@mozilla.com>
Date:   Fri May 9 15:31:41 2014 -0700

    Backed out changeset 91579a455888 (bug 976446) for build bustage on a CLOSED TREE

commit http://git.mozilla.org/?p=releases/gecko.git;a=commitdiff;h=5499c85404d705d24705a2e01c681ecd24e1fbf8
Author: Brian Hackett <bhackett1024@gmail.com>
Date:   Fri May 9 14:51:03 2014 -0700

    Bug 976446 - Use different names for helper functions in baseline and ion ICs to avoid unified build breakage, r=jandem.

commit http://git.mozilla.org/?p=releases/gecko.git;a=commitdiff;h=6635c054308929c00354272ea90990e1cfe17851
Author: Eric Faust <efaustbmo@gmail.com>
Date:   Sat Feb 1 00:29:52 2014 -0800

    Bug 947487 - Part 2: Generate and use js::Class structs for DOM proxies. (r=bz)

commit http://git.mozilla.org/?p=releases/gecko.git;a=commitdiff;h=cab18b58552dd4b913812411a0a216f71b008caf
Author: Bobby Holley <bobbyholley@gmail.com>
Date:   Thu Feb 13 10:54:07 2014 -0800

    Bug 965901 - Track and assert the policy action in AutoEnterPolicy/assertEnteredPolicy. r=gabor sr=mrbkap

commit http://git.mozilla.org/?p=releases/gecko.git;a=commitdiff;h=f66a185e68b91ce5eaf63e22d01d30872e32ecb7
Author: Jason Orendorff <jorendorff@mozilla.com>
Date:   Fri Apr 25 15:07:18 2014 -0500

    Bug 987007, part 2 - Handle assignment to named and indexed setters without using JSRESOLVE_ASSIGNING. r=bz, r=bholley.
(CCing people who might know something more about the BaseProxyHandler)
And I forgot the following patches, because of a lack of push-log in git:

commit http://git.mozilla.org/?p=releases/gecko.git;a=commitdiff;h=9052b9f50991174aeb02b408d8f53b89b1205446
Bug 697343 - Introduce a slice hook to allow optimizing Array.prototype.slice for Proxies etc. r=jandem,bz

commit http://git.mozilla.org/?p=releases/gecko.git;a=commitdiff;h=11675716d56a322419f4062624d3ca9d1f4b7269
Bug 697343 - Remove getElementIfPresent. r=Waldo

commit http://git.mozilla.org/?p=releases/gecko.git;a=commitdiff;h=50c0f8f4e6de7defec7bd6de1967b659dee95790
Bug 697343 - Introduce a slice hook to allow optimizing Array.prototype.slice for Proxies etc. r=jandem,bz

commit http://git.mozilla.org/?p=releases/gecko.git;a=commitdiff;h=922370489e86b39913f187caf6d25b5d247037b0
Bug 697343 - Remove getElementIfPresent. r=Waldo

commit http://git.mozilla.org/?p=releases/gecko.git;a=commitdiff;h=7f87debdf912a60f08b77e629c09d59581cebf86
Bug 926012 - Part 2: Allow __proto__ sets on proxies. (r=Waldo)
Among the previous list, only patches from Bug 947487, Bug 697343 and Bug 926012 made it to Firefox 29 (still assuming that the failures are caused by a modification which added/removed the name BaseProxyHandler)
Crash observed on: 

Device: msm8226
Gonk Version: AU_LINUX_GECKO_B2G_KK_3.5.01.04.00.113.106
Moz BuildID: 20140519000201
B2G Version: 1.4
Gecko Version: 30.0
Gaia:  http://git.mozilla.org/?p=releases/gaia.git;a=commit;h=defd0650fb9d30c6515d50a89e72d8fb74ce7e62
Gecko: http://git.mozilla.org/?p=releases/gecko.git;a=commit;h=49653a2e9c8709028640af39c919f1f8f4c53806
Crash observed on: 

Device: msm8226
Gonk Version: AU_LINUX_GECKO_B2G_KK_3.5.01.04.00.113.106
Moz BuildID: 20140519000201
B2G Version: 1.4
Gecko Version: 30.0
Gaia:  http://git.mozilla.org/?p=releases/gaia.git;a=commit;h=defd0650fb9d30c6515d50a89e72d8fb74ce7e62
Gecko: http://git.mozilla.org/?p=releases/gecko.git;a=commit;h=49653a2e9c8709028640af39c919f1f8f4c53806
sorry for the cafbot spam.  Please ignore comments 26, 32, 33.
(In reply to Greg Grisco from comment #34)
> sorry for the cafbot spam.  Please ignore comments 26, 32, 33.

Are these crashes related to this signature?
If so, I do not mind having them here. On the contrary, this highlights that this is important to fix, but it would be better if we could isolate a simple STR from all these combined crashes.
Crash observed on: 

Device: msm8226
Gonk Version: AU_LINUX_GECKO_B2G_KK_3.5.01.04.00.113.106
Moz BuildID: 20140519000201
B2G Version: 1.4
Gecko Version: 30.0
Gaia:  http://git.mozilla.org/?p=releases/gaia.git;a=commit;h=defd0650fb9d30c6515d50a89e72d8fb74ce7e62
Gecko: http://git.mozilla.org/?p=releases/gecko.git;a=commit;h=49653a2e9c8709028640af39c919f1f8f4c53806
(In reply to Nicolas B. Pierron [:nbp] from comment #35)
> (In reply to Greg Grisco from comment #34)
> > sorry for the cafbot spam.  Please ignore comments 26, 32, 33.
> 
> Are these crashes related to this signature?
> If so, I do not mind having them here, on the contrary this highlight that
> this is important to fix, but it would be better if we could isolate a
> simple STR from all these combined crashes.

Yes, these notifications are all from the same crash signature.  Each of the crashes that we've seen so far has come from stability testing while running scripts as noted in the Description.  I don't have a simple STR at this time.
How many devices are you using for running these tests, and how frequently are you running these tests?
The question is, will this impact an important portion of Firefox OS users?

We have a website [1] that tracks failures which appear in our continuous integration, but not frequently enough to be investigated.

[1] http://brasstacks.mozilla.com/orangefactor/
Flags: needinfo?(mvines)
Nicolas,

Is there any specific info you need to proceed with the blocker?
Flags: needinfo?(nicolas.b.pierron)
The question for Michael was can your team run debug builds of FxOS?

The debug builds have many assertions that Nicolas thinks could help diagnose this crash. Without STR, this is our next best approach.
We don't run Gecko debug builds in our normal setup, but we can enable an extra set of logging if needed, as Michael suggested in comment 14.
Considering that we are hitting this issue so frequently in our stability testing, it will have an impact on user experience.
Flags: needinfo?(mvines)
Eric, Nicolas, and I (JS engine engineers) met and talked about this bug for an hour or more.

We have no theory that explains the chain of events in the stack in the first attachment.

That said, we have a lot of ideas of things we could do to capture more information.

- Force all proxies to be finalized in the foreground, to rule out
  the possibility that the proxy is being swept in a background thread.

- Change the layout of data structures in an attempt to observe
  their being clobbered.
    - change the value of ProxyObject::HANDLER_SLOT
    - change the offset of the vtable in BaseProxyHandler
    - put constant values in the vacated locations and use RELEASE_ASSERTs
      to check them before calling into ProxyHandler methods
      
- Make all ProxyHandler objects const. We are confirming that the
  vtables themselves are mapped read-only; but since the ProxyHandlers
  themselves are non-const, they are presumably mapped read-write.
  We can change that. If ProxyHandler objects' __vptrs are actually
  being clobbered (doubtful), then this change will pinpoint the bug.
  
- Add a crash report annotation when we have gc'd during the current event
  (to detect if perhaps a proxy is being swept while a corresponding
  Proxy method is on the stack).

- Poison proxies when they are swept, to crash earlier.
  (But note that this would conflict with some other ideas here, as
  poisoning memory overwrites information we might like to know.)

- Add release-mode sanity asserts in ProxyObject::handler().

- Capture the name of the wrapper type in the crash dump somehow.

Merely landing these changes in Nightly may be enough to get the information we need.

Anyway, we're going to sleep on it. We'll decide what to do tomorrow.
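To illustrate the "constant values plus RELEASE_ASSERT" idea from the list above, here is a minimal sketch. All names (Handler, kHandlerMagic, checkedGet, clobberForDemo) are hypothetical and are not SpiderMonkey APIs. The sketch also does not move the vtable pointer, which cannot be done portably. It only places a known constant next to it and checks that constant in release builds before dispatching.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <cstdlib>

// Hypothetical canary value; any unlikely bit pattern would do.
constexpr std::uint32_t kHandlerMagic = 0x50524F58;  // "PROX"

class Handler {
  public:
    virtual ~Handler() = default;
    virtual int get(int key) const = 0;

    // True while the canary still holds the expected constant.
    bool looksIntact() const { return magic_ == kHandlerMagic; }

  private:
    // Demo-only hook that simulates memory corruption.
    friend void clobberForDemo(Handler& h);
    std::uint32_t magic_ = kHandlerMagic;
};

void clobberForDemo(Handler& h) { h.magic_ = 0; }

class DoublingHandler final : public Handler {
  public:
    int get(int key) const override { return key * 2; }
};

// Stand-in for a release-mode assert: always compiled in, always fatal.
[[noreturn]] inline void crashWithMessage(const char* msg) {
    std::fprintf(stderr, "%s\n", msg);
    std::abort();
}

// Check the canary before making the virtual call, so that corruption is
// reported at the call site rather than as an opaque pure-virtual abort.
int checkedGet(const Handler* handler, int key) {
    if (handler == nullptr || !handler->looksIntact()) {
        crashWithMessage("proxy handler canary mismatch");
    }
    return handler->get(key);
}
```

If a check like this fired in the field, the crash signature would change from the generic abort to the canary message. That would suggest that the handler's memory had been overwritten rather than that a method was simply missing.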
Crash observed on: 

Device: msm8226
Gonk Version: AU_LINUX_GECKO_B2G_KK_3.5.01.04.00.113.106
Moz BuildID: 20140519000201
B2G Version: 1.4
Gecko Version: 30.0
Gaia:  http://git.mozilla.org/?p=releases/gaia.git;a=commit;h=defd0650fb9d30c6515d50a89e72d8fb74ce7e62
Gecko: http://git.mozilla.org/?p=releases/gecko.git;a=commit;h=49653a2e9c8709028640af39c919f1f8f4c53806
(In reply to Preeti Raghunath(:Preeti) from comment #40)
> Is there any specific info you need to proceed with the blocker?

Yes, looking at the compiled code of …/bin/b2g (before the symbols are stripped), or even having a core dump, would help us understand the registers that we see in the reported minidump, and potentially check the hypothesis about the vtable.

Currently the most likely cause of failure is that the JS heap is being corrupted by something else.  Sadly we have limited opportunities to find what is corrupting the JS heap / proxy's wrappers.

One option would be to improve Terrence's patch (Bug 995649) to make it work on nightlies while avoiding false positives.
Flags: needinfo?(nicolas.b.pierron)
blocking-b2g: 1.4? → 1.4+
More ideas:

- Build a data structure of all ProxyHandler instances and assert in all Proxy methods that proxy->handler() points to one of them.

  We thought of this yesterday and I just forgot to write it down. Though this promises to catch the error earlier, it won't necessarily catch it close enough to its cause.

- In a special build just for tracking down this bug, add a destructor to BaseProxyHandler that crashes hard. (Because destroying a derived-class proxy handler is one of the few things that would leave a __vptr that points to BaseProxyHandler's vtable.)

- Examine the build binaries to make sure the code is what we think it is.

That last I think we want to do anyway. Where can I find the exact binaries being tested?
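The first idea above (tracking every handler instance and asserting membership) could look roughly like the following sketch. The class and function names are hypothetical, not taken from the SpiderMonkey source. The registry is mutex-protected on the assumption that handlers may be touched from more than one thread.

```cpp
#include <cassert>
#include <mutex>
#include <unordered_set>

class Handler;

// Hypothetical process-wide set of all live handler instances.
class HandlerRegistry {
  public:
    static HandlerRegistry& instance() {
        static HandlerRegistry registry;
        return registry;
    }
    void add(const Handler* h) {
        std::lock_guard<std::mutex> lock(mutex_);
        live_.insert(h);
    }
    void remove(const Handler* h) {
        std::lock_guard<std::mutex> lock(mutex_);
        live_.erase(h);
    }
    bool contains(const Handler* h) {
        std::lock_guard<std::mutex> lock(mutex_);
        return live_.count(h) != 0;
    }

  private:
    std::mutex mutex_;
    std::unordered_set<const Handler*> live_;
};

class Handler {
  public:
    // Each handler registers itself on construction and unregisters on
    // destruction, so the registry always reflects the live instances.
    Handler() { HandlerRegistry::instance().add(this); }
    virtual ~Handler() { HandlerRegistry::instance().remove(this); }
    Handler(const Handler&) = delete;
    Handler& operator=(const Handler&) = delete;

    virtual int get(int key) const = 0;
};

class IdentityHandler final : public Handler {
  public:
    int get(int key) const override { return key; }
};

// The check a proxy method would run on proxy->handler() before using it.
bool isKnownHandler(const Handler* h) {
    return HandlerRegistry::instance().contains(h);
}
```

As the comment notes, a membership check of this kind only shows that the pointer is bad at the time of use. It would not by itself show which earlier write corrupted it.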
Crash observed on: 

Device: msm8226
Gonk Version: AU_LINUX_GECKO_B2G_KK_3.5.01.04.00.113.112
Moz BuildID: 20140524000202
B2G Version: 1.4
Gecko Version: 30.0
Gaia:  http://git.mozilla.org/?p=releases/gaia.git;a=commit;h=8f4201a44676eb70926a3d2645d94bf92fcd6718
Gecko: http://git.mozilla.org/?p=releases/gecko.git;a=commit;h=c69048b9a3fcacc456f281db08a5e6162655ecec
This is one of the tries discussed above.
Attachment #8431047 - Flags: review?(jorendorff)
Flags: needinfo?(efaustbmo)
The only potential problem here is that wrapper targets might be background finalizable, but I can't imagine that would be bad, here.
Attachment #8431047 - Attachment is obsolete: true
Attachment #8431047 - Flags: review?(jorendorff)
Attachment #8431050 - Flags: review?(jorendorff)
Attachment #8431050 - Flags: review?(jorendorff) → review+
https://hg.mozilla.org/integration/mozilla-inbound/rev/bba94df69c13

Landed; will nominate for uplift in a few hours.
Keywords: leave-open
(In reply to Eric Faust [:efaust] from comment #50)
> https://hg.mozilla.org/integration/mozilla-inbound/rev/bba94df69c13
> 
> Landed; will nominate for uplift in a few hours.

Do we really need to check in test patches on inbound? Could we just send them to try and land them on top of b2g30 (not beta), to see how the crash changes?

The goal is not for these patches to stick; they are only meant for analyzing the behaviour of the crash to get a better understanding of it.
Flags: needinfo?(lsblakk)
If this isn't intending to land on beta branch, not sure why ni? on me?
Flags: needinfo?(lsblakk)
(In reply to Lukas Blakk [:lsblakk] from comment #53)
> If this isn't intending to land on beta branch, not sure why ni? on me?

Who should we ask to try these patches on b2g30?  Isn't b2g30 following beta?
beta is merging to b2g30 during this cycle (to pick up common security/stability fixes), but b2g30 is its own independent branch. 1.4 blockers are *only* landing on b2g30, for example.
We want to analyze this issue, based on the crashes reported here.  One way is to land patches and wait for the outcome to see whether the bug signature changes.  The goal is not to let the patches stick, but to land them for analyzing the issue.
Greg,

Can you please try the patch in your branch? If this works we can land in 1.4 as well.
Flags: needinfo?(ggrisco)
(In reply to Preeti Raghunath(:Preeti) from comment #57)
> Greg,
> 
> Can you please try the patch in your branch? If this works we can land in
> 1.4 as well.

tapas is going to help out with this.
Flags: needinfo?(ggrisco) → needinfo?(tkundu)
(In reply to Preeti Raghunath(:Preeti) from comment #57)
> Greg,
> 
> Can you please try the patch in your branch? If this works we can land in
> 1.4 as well.

We are testing it internally. I will update here if this patch fixes this bug in v1.4
(In reply to Tapas Kumar Kundu from comment #59)
> (In reply to Preeti Raghunath(:Preeti) from comment #57)
> > Greg,
> > 
> > Can you please try the patch in your branch? If this works we can land in
> > 1.4 as well.
> 
> We are testing it internally. I will update here if this patch fixes this
> bug in v1.4

Thanks :)

This is not guaranteed to fix anything, but this would help us identify/exclude the kind of failure which was happening before.
Whiteboard: [CR 662112][b2g-crash] → [caf priority: p1][CR 662112][b2g-crash]
(In reply to Tapas Kumar Kundu from comment #59)
> We are testing it internally. I will update here if this patch fixes this
> bug in v1.4

I do not see any more reports from cafbot. Did the signature change in any way?  Did we notice any new bugs which were not reported before and which started after the patch was applied?
Flags: needinfo?(ggrisco)
(In reply to Nicolas B. Pierron [:nbp] from comment #61)
> (In reply to Tapas Kumar Kundu from comment #59)
> > We are testing it internally. I will update here if this patch fixes this
> > bug in v1.4
> 
> I do not see any more report from cafbot, did the signature change in
> anyway?  Did we noticed any new bug which were not reported before and which
> started since the patch is applied?

We have seen many crashes with JS in the signature recently, although, you're right, this particular signature has not been seen since AU112 as cafbot reported.  I don't know if these are related, but some other bugs that we've created to track these JS crashes are bug 1016512, bug 1017757, and bug 1017740.
Flags: needinfo?(ggrisco)
Crash observed on: 

Device: msm8610
Gonk Version: AU_LINUX_GECKO_B2G_KK_3.5.01.04.00.113.122
Moz BuildID: 20140604000202
B2G Version: 1.4
Gecko Version: 30.0
Gaia:  http://git.mozilla.org/?p=releases/gaia.git;a=commit;h=0c16adced7c51f795ef250aebe184f60b6a9b987
Gecko: http://git.mozilla.org/?p=releases/gecko.git;a=commit;h=157a45f1fa280296dc9204de6def0b5b370ed2bd
This is also worth adding to the builds searching for this bug. It has some hope of stopping it, and should at least give us more assert information otherwise.
This should also go in the patch set. It both has a chance to fix, and also will provide more information on the failure.
Attachment #8437211 - Attachment is obsolete: true
Attachment #8437229 - Attachment is obsolete: true
(In reply to Eric Faust [:efaust] from comment #65)
> Created attachment 8437229 [details] [diff] [review]
> Remove virtual destructor from BaseProxyHandler and add assert
> 
> This should also go in the patch set. It both has a chance to fix, and also
> will provide more information on the failure.

Do you want me to try a new patch other than the one from comment 59? If so, please provide a patch rebased on the v1.4 branch.
Flags: needinfo?(efaustbmo)
This should be applied to the 1.4 testing branch and let's see where that gets us.
Flags: needinfo?(efaustbmo)
Crash observed on: 

Device: msm8226
Gonk Version: AU_LINUX_GECKO_B2G_KK_3.5.01.04.00.113.125
Moz BuildID: 20140609000201
B2G Version: 1.4
Gecko Version: 30.0
Gaia:  http://git.mozilla.org/?p=releases/gaia.git;a=commit;h=8b239e41bbd85aa7b6a2c5d388e775ba7de6fb2b
Gecko: http://git.mozilla.org/?p=releases/gecko.git;a=commit;h=33f62f84905c1a3ed8387db8372f8449d1c65e3a
We had a good conference call about these bugs this morning. Ender reports the bug still occurs even with the attached patch. The stability testing team will attach a fresh crash report to this bug. I'm very curious to see what the new crash report says.

In the meantime, Eric is working on another patch (to make the ProxyHandler objects const, I think). Also, nbp is working on producing a B2G 1.4 build with a custom-built JS engine with assertions enabled and compiled with gcc -Og (disabling C++ compiler optimizations that interfere with capturing a good stack).

A lot more happened in the meeting. There are action items on all sides. I'm looking into sharing our stability tests and other ways we can improve our testing of the JS engine on devices.
Crash observed on: 

Device: msm8610
Gonk Version: AU_LINUX_GECKO_B2G_KK_3.5.01.04.00.113.130
Moz BuildID: 20140614000202
B2G Version: 1.4
Gecko Version: 30.0
Gaia:  http://git.mozilla.org/?p=releases/gaia.git;a=commit;h=164644d91290708a71436dfdf4301e33b92e2c77
Gecko: http://git.mozilla.org/?p=releases/gecko.git;a=commit;h=89e926ba3a65372a74d72c30a8773081820cccbb
With the fix from bug 964537, this issue does not reproduce. Marking this as a duplicate.
Status: NEW → RESOLVED
Closed: 10 years ago
Flags: needinfo?(tkundu)
Flags: needinfo?(mrosenberg)
Resolution: --- → DUPLICATE
Crash observed on: 

Device: msm8610
Gonk Version: AU_LINUX_GECKO_B2G_KK_3.5.01.04.00.113.132
Moz BuildID: 20140616000202
B2G Version: 1.4
Gecko Version: 30.0
Gaia:  http://git.mozilla.org/?p=releases/gaia.git;a=commit;h=164644d91290708a71436dfdf4301e33b92e2c77
Gecko: http://git.mozilla.org/?p=releases/gecko.git;a=commit;h=078efc631bb49ae6318abc5f9cff439d879130e3
This issue still comes with the fix from bug 964537. Reopening..
Status: RESOLVED → REOPENED
Resolution: DUPLICATE → ---
(In reply to Inder from comment #73)
> This issue still comes with the fix from bug 964537. Reopening..

If only we could have been so lucky. This makes sense, though. The patch in bug 964537 has nothing to do with this system whatsoever.

Let's try two new patches:

The first tries to combat potential corruption of BaseProxyHandler objects by making them all const, and the second adds a canary-like assertion targeting this case. The canary is not so much looking for corruption directly (hopefully the const will take care of that), as it is fact-checking our casting. We do some void * to proxy handler conversions, and I want to make sure we're actually laying the struct over memory correctly. While it's unlikely that this assertion will trip because of the nature of that failure, better to be safe than sorry.

So, where does that leave us from a testing point of view, after we apply these patches:

Any new SEGV signature is potentially relevant, and I'm eager to see what comes from them, or if any new ones turn up.

Secondly, obviously if the newly added MOZ_CRASH trips, then that's clearly relevant.
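
For readers following along, here is a minimal sketch of the canary idea described above. It is not the attached patch: HandlerSketch, kCanaryValue, and handlerFromPrivate are hypothetical names standing in for BaseProxyHandler, BPH_CANARY_VALUE, and the void * to proxy handler conversion, and the canary value itself is an arbitrary assumption.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <cstring>

// Assumed canary value, chosen only for illustration.
static const uint32_t kCanaryValue = 0xC0FFEE42u;

// Hypothetical stand-in for a proxy handler; not the real js::BaseProxyHandler.
class HandlerSketch {
  public:
    explicit HandlerSketch(const void* family)
      : canary_(kCanaryValue), family_(family) {}

    // True while the canary still holds the value written by the constructor.
    bool canaryIntact() const { return canary_ == kCanaryValue; }
    const void* family() const { return family_; }

  private:
    uint32_t canary_;
    const void* family_;
};

// Mimics a void* -> handler conversion and fact-checks the result.
// Returns nullptr (where the testing patch would crash) on a mismatch.
const HandlerSketch* handlerFromPrivate(const void* priv) {
    const HandlerSketch* handler = static_cast<const HandlerSketch*>(priv);
    if (!handler->canaryIntact()) {
        std::fprintf(stderr, "handler canary mismatch\n");
        return nullptr;
    }
    return handler;
}
```

A mismatch here would mean either that the memory was overwritten after construction or that the pointer never referred to a handler in the first place, which is the distinction the comment above is after.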
Comment on attachment 8443686 [details] [diff] [review]
Add canary, checked in BPH::get

Review of attachment 8443686 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/jsproxy.cpp
@@ +130,5 @@
>  {
>      assertEnteredPolicy(cx, proxy, id, GET);
>  
> +    if (canary != BPH_CANARY_VALUE) {
> +        fprintf(stderr, "canary value %x differs from expected value\n");

You forgot to pass canary to fprintf. %x will print (the wrong) garbage. :)
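
To illustrate the review note: each %x conversion consumes one unsigned int argument, and omitting it is undefined behavior. A small sketch, assuming a hypothetical helper formatCanaryMessage that formats into a buffer so the result can be inspected:

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical helper, not part of the patch under review.
std::string formatCanaryMessage(unsigned int canary) {
    char buf[64];
    // The value is passed explicitly so that %x has an argument to print.
    std::snprintf(buf, sizeof(buf),
                  "canary value %x differs from expected value\n", canary);
    return std::string(buf);
}
```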
Attachment #8443686 - Attachment is obsolete: true
Has anything happened with these patches? I don't really know how to get more direct feedback, but at least a ping that they've been pulled in and are being tested would be helpful.
Inder:

1. Has CAF run its stability tests with Eric's canary and Proxy handler patches?

2. Can we move this bug from blocking 1.4 to a later B2G release like 2.0 or 2.1?
Flags: needinfo?(ikumar)
> 1. Has CAF run its stability tests with Eric's canary and Proxy handler
> patches?
We have switched our stability testing over to 2.0, and we are trying to see whether this issue manifests there.

> 
> 2. Can we move this bug from blocking 1.4 to a later B2G release like 2.0 or
> 2.1?
Sure. Updated the flags. Will update here shortly after we receive the stability results.
blocking-b2g: 1.4+ → 2.0?
Flags: needinfo?(ikumar)
blocking-b2g: 2.0? → 2.0+
Any word on this? Did it go away with 2.0? Does it still reproduce on 1.4? Where are we?
Not observed on v2.0 yet here, perhaps because we are currently hitting a bunch of other crashes first.  cafbot should report back if/when this is reproduced on v2.0.
Is v1.4 still affected?  I do not see any reports from cafbot since comment 72.
If it does not reproduce on v1.4, should we close this issue, and re-open it if it appears on v2.0?
Nicolas: QC is no longer running stability tests on 1.4, so we can probably close this bug if no one at Mozilla can repro this bug on 1.4.
Preeti, how could we flag this bug as blocking 2.0 without any report mentioning 2.0?

Naoki mentioned on #b2g that he would try to reproduce this issue within the next few days.  If we cannot reproduce this issue locally, I guess we should mark this bug as Works-For-Me until somebody/somebot can reproduce this bug.
Flags: needinfo?(praghunath)
Flags: needinfo?(nhirata.bugzilla)
?, per comment 83
blocking-b2g: 2.0+ → 2.0?
2.0- As per comment 83 and comment 86, this looks to be a 1.4 blocker that was moved to 2.0 without confirming that it is reproducible on 2.0. If this bug does surface on 2.0, we can reconsider the blocking designation.
blocking-b2g: 2.0? → -
Looking in Socorro for a similar crash, I saw only one:
https://crash-stats.mozilla.com/report/index/92a8aae6-9ef9-4b9e-a8b0-157fe2140704

[search : https://crash-stats.mozilla.com/search/?signature=~js%3A%3ABaseProxyHandler&product=B2G&_facets=signature&_columns=date&_columns=signature&_columns=product&_columns=version&_columns=build_id&_columns=platform ]

It's a self-made build on a Geeksphone Peak running 2.0.0. If this is a valid crash, we could still hit it on 2.0.

Having said that, I don't have any STRs, which makes this bug unactionable, if I am not mistaken. I am closing this as WFM until we figure out STRs or cafbot reproduces it again.
Status: REOPENED → RESOLVED
Closed: 10 years ago
Flags: needinfo?(nhirata.bugzilla)
Resolution: --- → WORKSFORME
[To note, I did try testing a bunch of actions to see if I can reproduce this crash.]
We are seeing this on 2.0 now, a total of 3 times so far.  I'm going to attach latest minidump/extra files.
Status: RESOLVED → REOPENED
Resolution: WORKSFORME → ---
Blocks: CAF-v2.0-FC-metabug
No longer blocks: MTBF-QC
blocking-b2g: - → 2.0?
Attached file decoded minidump #2
(In reply to Greg Grisco from comment #91)
> We are seeing this on 2.0 now, a total of 3 times so far.  I'm going to
> attach latest minidump/extra files.

Well, that's no good...

Inder, what combination of the patches I have posted in this bug actually made their way onto testing machines while 1.4 testing was still happening? It was never clear to me which things had actually been tried, since we changed testing platforms about the same time. It looks like from your comment the canary and const patches were never tried? I am happy to rebase any untested patches to 2.0 so we can start churning on this again.
Flags: needinfo?(ikumar)
Eric -- I don't think we ended up testing with canary and proxy handler patches. Please rebase for 2.0 and we can do a run with these patches.

Greg -- once these patches are rebased please land it in our build.
Flags: needinfo?(ikumar)
blocking-b2g: 2.0? → 2.0+
Flags: needinfo?(efaustbmo)
(In reply to Inder from comment #95)
> Eric -- I don't think we ended up testing with canary and proxy handler
> patches. Please rebase for 2.0 and we can do a run with these patches.
> 
> Greg -- once these patches are rebased please land it in our build.

Inder, Eric is going to have the 2.0 patches uploaded here soon, just spoke to him on irc.
This patch does a bunch of stuff for v2.0:

1) It rolls together the patches so far posted in this bug.
    It removes virtual destructors from BPH and friends
    It makes BPH instances const
    It adds a canary checked in BPH::get, where the crash occurs

2) I also took the liberty of adding one more assert. Since it looks like we are coming from the event loop, I have asserted that we have not raced the event loop against JS_ShutDown, which could cause a whole host of strange behaviors.

It is my hope that one of these checks will fail, and mask the signature in this bug. None of these patches are a proposed fix, but will hopefully give us more specific information about the mode of failure. Let me know what the results are, and we can hopefully try to get more traction on this.
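
As a rough illustration of the shutdown-race check in item 2: the sketch below assumes a hypothetical process-wide flag, gEngineShutDown, set when the engine is torn down and consulted before any entry from the event loop. How the actual patch detects the race is not shown in this bug, so treat the names and the mechanism as assumptions.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical flag; set once engine teardown has begun.
static std::atomic<bool> gEngineShutDown{false};

// Called from the (assumed) shutdown path.
void markEngineShutDown() {
    gEngineShutDown.store(true, std::memory_order_release);
}

// Called before entering the engine from the event loop. Returns false
// rather than crashing so that the sketch can be exercised in a test;
// a testing build would assert on the result instead.
bool safeToEnterEngine() {
    return !gEngineShutDown.load(std::memory_order_acquire);
}
```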
Attachment #8459010 - Flags: checkin?(ggrisco)
Flags: needinfo?(efaustbmo)
Inder, did this patch make it into the testing build over the weekend, as was discussed?
Flags: needinfo?(ikumar)
Eric -- the patch was too late for the weekend testing build, which was created on Thursday night. It is included in the next round of testing. We will update you with our findings.
Flags: needinfo?(ikumar)
Attachment #8459010 - Flags: checkin?(ggrisco)
Observed on: 

Device: msm8610
Gonk Version: AU_LINUX_GECKO_B2G_KK_3.6.01.04.00.000.035
Moz BuildID: 20140713000201
B2G Version: 2.0
Gecko Version: 32.0a2
Gaia:  http://git.mozilla.org/?p=releases/gaia.git;a=commit;h=ca022f811bcbbda0f89086094a9e92bb220fea18
Gecko: http://git.mozilla.org/?p=releases/gecko.git;a=commit;h=be6908fec84d3e39453275da96c031336f58f23d
Observed on: 

Device: msm8610
Gonk Version: AU_LINUX_GECKO_B2G_KK_3.6.01.04.00.000.040
Moz BuildID: 20140716000201
B2G Version: 2.0
Gecko Version: 32.0a2
Gaia:  http://git.mozilla.org/?p=releases/gaia.git;a=commit;h=5f8b1b8a2da9e3b531eee817a669f57fa4d9b9c6
Gecko: http://git.mozilla.org/?p=releases/gecko.git;a=commit;h=e00f7e464333689fcf54edb4945ece94f97f930b
Flags: needinfo?(praghunath)
Whiteboard: [caf priority: p1][CR 662112][b2g-crash] → [caf-crash 208][caf priority: p1][CR 662112][b2g-crash]
Inder - Do the cafbot comments mean that the bug is still reproducible with the patch?
Flags: needinfo?(ikumar)
Lawrence -- the cafbot comment is from previous runs without the patch. Stability run with the patch is ongoing and we will see if it reproduces. ETA of the report is Monday (7/28). We have also noticed some other errors in the logs indicating the issue could be somewhere else in gonk. We are investigating from that perspective too.
Flags: needinfo?(ikumar)
The patch is built into AU_LINUX_GECKO_B2G_KK_3.6.01.04.00.000.043 and later AUs.
I think it's ok to land this now that we haven't seen it reproduce on AU43.
Flags: needinfo?(efaustbmo)
Well, I am interested in seeing a signature change. Did you see any new crash or assertion signatures related to these patches?
Flags: needinfo?(efaustbmo)
Yes, we've seen two other new signatures on AU 43:

[@ js::BaseProxyHandler::get(JSContext*, JS::Handle, JS::Handle, JS::Handle, JS::MutableHandleJS::Value) | js::Proxy::get(JSContext*, JS::Handle, JS::Handle, JS::Handle, JS::MutableHandleJS::Value) | js::proxy_GetElement(JSContext*, JS::Handle, JS::Handle, unsigned int, JS::MutableHandleJS::Value) | Interpret ]

[@ js::BaseProxyHandler::get(JSContext*, JS::Handle, JS::Handle, JS::Handle, JS::MutableHandleJS::Value) | js::Proxy::get(JSContext*, JS::Handle, JS::Handle, JS::Handle, JS::MutableHandleJS::Value) | js::proxy_GetElement(JSContext*, JS::Handle, JS::Handle, unsigned int, JS::MutableHandleJS::Value) | JSObject::getElement(JSContext*, JS::Handle, JS::Handle, unsigned int, JS::MutableHandleJS::Value) ]
Those are both potentially interesting. Do you have output logs for samples of them, or bug numbers where they were filed?
Flags: needinfo?(ggrisco)
(In reply to Eric Faust [:efaust] from comment #108)
> Those are both potentially interesting. Do you have output logs for samples
> of them, or bug numbers where they were filed?

bug 1046461 and bug 1046465
Flags: needinfo?(ggrisco)
OK, these look highly interesting. Is there any way we can map that line number (jsproxy.cpp:138) onto a line of code? Is there somewhere those builds, or the source revs for them, are available?
NI greg to see if there is anything he can do to help with request in the above comment.
Flags: needinfo?(ggrisco)
(In reply to Eric Faust [:efaust] from comment #110)
> OK, these look highly interesting. Is there any way we can map that line
> number (jsproxy.cpp:138) onto a line of code? Is three somewhere those
> builds, or the source-revs for them are available?

It looks like this line:

http://git.mozilla.org/?p=releases/gecko.git;a=blob;f=js/src/jsproxy.cpp;h=505db3ff6f3aee80076736e55fc3a7024e24a7d2;hb=e00f7e464333689fcf54edb4945ece94f97f930b#l138
Flags: needinfo?(ggrisco)
(In reply to Greg Grisco from comment #112)
> (In reply to Eric Faust [:efaust] from comment #110)
> > OK, these look highly interesting. Is there any way we can map that line
> > number (jsproxy.cpp:138) onto a line of code? Is three somewhere those
> > builds, or the source-revs for them are available?
> 
> It looks like this line:
> 
> http://git.mozilla.org/?p=releases/gecko.git;a=blob;f=js/src/jsproxy.cpp;
> h=505db3ff6f3aee80076736e55fc3a7024e24a7d2;
> hb=e00f7e464333689fcf54edb4945ece94f97f930b#l138

Wait, but this line is from a source that doesn't include the testing patch. That's the only reason why the mapping is interesting.

Looking at a copy of the recent sources with the patches applied, it looks like it's the MOZ_CRASH for the canary check. That's very interesting.
Greg, does that seem correct? I'm still seeing cafbot reports that have the line you are talking about, and still don't seem to include the patch.

FWIW, crashing at that line makes very little sense.

I'm also curious about how quickly we can turn around patches for further testing. I have one more assert I want to add, if I can, but I am told the deadline for this stuff is rapidly approaching. Can we get such a patch into a testing build and on a farm, with results by mid-next-week?
Flags: needinfo?(ggrisco)
Sorry, I thought I would see our locally applied patches indexed by our local grok server, but that's not the case.  I found the crashing line to be at the MOZ_CRASH here:

    if (canary != BPH_CANARY_VALUE) {
        fprintf(stderr, "canary value %x differs from expected value\n", canary);
        MOZ_CRASH();
    }

Be aware that we are talking about the crash reported in bug 1046461 with the 2.0 rebased patch from this bug applied.

If you need further patches applied, the sooner you get them to us the better chance we have at getting some feedback to you before the deadline.  If we have it early enough we can make an engineering build, test overnight, and get feedback the next day.
Flags: needinfo?(ggrisco)
I wasn't able to find that "canary" message in the log though.
Depends on: 1046465
Depends on: 1046461
We have a couple display related changes that we are currently testing that we think might be the underlying cause of these issues.  You might want to hold off on further investigation on this one until we have the results (in next day or two).

Can we mark bug 1046461 and 1046465 as dup of this one for now?
(In reply to Greg Grisco from comment #117)
> Can we mark bug 1046461 and 1046465 as dup of this one for now?

Yes, and indeed we should. They both crash at the MOZ_CRASH added in the testing patch from this bug.
Attachment #8468949 - Flags: review?(jorendorff)
Neither of these should land in testing builds just yet, but could probably go in early tomorrow, once Jason has had a chance to look at them
Comment on attachment 8468949 [details] [diff] [review]
Another testing idea: mprotect BPH

Review of attachment 8468949 [details] [diff] [review]:
-----------------------------------------------------------------

Big chunks of this patch are actually really nice and we should land them, if you have the time --- and take them just a bit further, removing the now-unnecessary #defines in xpconnect/wrappers/FilteringWrapper.cpp for example.

r=me for testing.

(Not to land on m-c in this form, just to be totally clear to anyone lurking in this bug. The patch is against b2g2.0, not m-c.)
Attachment #8468949 - Flags: review?(jorendorff) → feedback+
Greg,

If the bug still reproduces with the display patches, then we should apply the mprotect patch on top of the existing testing patches and run with that for a day and see what happens. Our hope is that we will isolate the source of the data corruption with a direct fault at corruption time.

Please advise as to the state of the world relative to those display patches. I would also like to see them, if they came from us, to try and verify that I believe they would indeed cause this failure mode.
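
For context on why an mprotect-based patch gives "a direct fault at corruption time": once a page is read-only, any stray write to it raises SIGSEGV at the offending instruction instead of being noticed later by a canary check. The sketch below is not the attached patch. It assumes a POSIX system with MAP_ANONYMOUS, and HandlerData and makeProtectedHandlerData are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <new>
#include <sys/mman.h>
#include <unistd.h>

// Hypothetical stand-in for the handler state we want to guard.
struct HandlerData {
    uint32_t canary;
};

// Places a HandlerData on its own page, then makes the page read-only.
// Returns nullptr if the page cannot be mapped or protected.
const HandlerData* makeProtectedHandlerData(uint32_t canary) {
    const size_t pageSize = static_cast<size_t>(sysconf(_SC_PAGESIZE));
    void* page = mmap(nullptr, pageSize, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (page == MAP_FAILED)
        return nullptr;

    // Construct the object while the page is still writable.
    HandlerData* data = new (page) HandlerData{canary};

    // From here on, reads succeed but any write to the page faults.
    if (mprotect(page, pageSize, PROT_READ) != 0) {
        munmap(page, pageSize);
        return nullptr;
    }
    return data;
}
```

The cost is one page per protected object (or per group of objects), which is acceptable for a testing build but is one reason such a patch would not land as-is.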
Flags: needinfo?(ggrisco)
I'm marking this [POVB] for now.  We're still working through the display issue; will apply the patch and re-open this if need be once that is handled correctly.
Status: REOPENED → RESOLVED
Closed: 10 years ago
Flags: needinfo?(ggrisco)
Resolution: --- → INVALID
Whiteboard: [caf-crash 208][caf priority: p1][CR 662112][b2g-crash] → [caf-crash 208][caf priority: p1][CR 662112][b2g-crash][POVB]
Observed on: 

Device: msm8610
Gonk Version: AU_LINUX_GECKO_B2G_KK_3.6.01.04.00.000.054
Moz BuildID: 20140804000204
B2G Version: 2.0
Gecko Version: 32.0
Gaia:  http://git.mozilla.org/?p=releases/gaia.git;a=commit;h=9e5907995c9327f14cb5d182cee5ff16b1743ed4
Gecko: http://git.mozilla.org/?p=releases/gecko.git;a=commit;h=ed87f0a54baf646eb0b20b4debc090c8016d2104
Observed on: 

Device: msm8610
Gonk Version: AU_LINUX_GECKO_B2G_KK_3.6.01.04.00.000.066
Moz BuildID: 20140810160202
B2G Version: 2.0
Gecko Version: 32.0
Gaia:  http://git.mozilla.org/?p=releases/gaia.git;a=commit;h=de28796a8956a48bb98ca67df6a33e0622d642d1
Gecko: http://git.mozilla.org/?p=releases/gecko.git;a=commit;h=2b27becae85092d46bfadcd4fb5605e82e1e1093
Removing leave-open keyword from resolved bugs, per :sylvestre.
Keywords: leave-open