Closed Bug 1338646 Opened 7 years ago Closed 7 years ago

Crash in js::jit::MBasicBlock::NewWithResumePoint

Categories

(Core :: JavaScript Engine: JIT, defect, P2)

55 Branch
defect

Tracking

()

VERIFIED FIXED
mozilla56
Tracking Status
firefox51 --- unaffected
firefox52 --- wontfix
firefox-esr52 --- fixed
firefox53 --- wontfix
firefox54 --- wontfix
firefox55 --- verified
firefox56 --- verified

People

(Reporter: mccr8, Assigned: nbp)

References

Details

(Keywords: crash, regression)

Crash Data

Attachments

(1 file)

This bug was filed from the Socorro interface and is 
report bp-98e128af-b31a-4d62-8667-77d782170210.
=============================================================

These all look like null deref crashes. I only see them on 52 and 53. 228 crashes in the last week. Maybe something got fixed that we could backport?
Priority: -- → P1
I have 2 potential bugs that could have moved the needle here:
bug 1328133 or bug 1307651.

I'll look to uplift them to aurora and see if they make a dent in the crashrates here.
They seem to have gotten less on b7 and b7. Not sure why. Moving to P2, unless the volume spikes again.
Priority: P1 → P2
I can reproduce this quickly when clicking around the menu on the left side of https://demo.partkeepr.org/. I get reports like https://crash-stats.mozilla.com/report/index/bp-0717ff7c-2d75-4f0a-a694-f7f152170303 each time.
Mass wontfix for bugs affecting firefox 52.
This bug is affecting one of our products. Can I do anything to help you investigate on the issue?

Florent
(In reply to Florent Fayolle from comment #5)
> This bug is affecting one of our products. Can I do anything to help you
> investigate on the issue?

Yes! Are you able to trigger this reliably? Would it be possible to give someone access to this website? 

You could also try a debug build to see if you get any assertion failures.
It seems that the resume point allocated for the InlinePropertyTable, is either missing, or got already took by some other functions before.  I can think of 2 possible recent related modifications. Either this issue comes from the re-try mechanism added in case of inlining failures, or from the recent work on the Inline Caches.

In all cases, these are not OOM bugs.
I still have this bug in Firefox version 54.0b14 (64-bit) . I tried it with normal Firefox and Firefox-Developer, both are crashing.
I can reproduce this crashes, when I click around on this website for example: https://demo.partkeepr.org/

This is the link directly to my crash report: https://crash-stats.mozilla.com/report/index/3b185e42-9f47-4259-ab87-721450170607
(In reply to Raphael Schachenmayr from comment #8)
> I can reproduce this crashes, when I click around on this website for
> example: https://demo.partkeepr.org/

Thanks for posting this. Where do you click exactly? Does it crash after a few seconds or does it take longer? A screencast or detailed steps to reproduce would be great :)
Flags: needinfo?(r.schachenmayr)
Hi Jan, thanks for the fast reply. I get the crash always on websites with a lot of javascript. In the example it is the Sencha  ExtJS libary. I have my own software with ExtJs and there it crashes also.

There is no special time or something, it crashes when i click some items. An example video you can find here:
https://workupload.com/file/4ewjJDt

If I can help any more, just tell me.

Thank you!!!
Flags: needinfo?(r.schachenmayr)
(In reply to Raphael Schachenmayr from comment #10)
> There is no special time or something, it crashes when i click some items.
> An example video you can find here:
> https://workupload.com/file/4ewjJDt
I get the crash in this video ways with 52.2.0esr on macOS.
Here is crash report: https://crash-stats.mozilla.com/report/index/9a04dc49-d327-4f14-b292-72ae10170616
But I only get the crash after closing Tips window (notes: sometime Tips window doesn't show).
Also I tested enable/disable e10s cases, I get same results.

Also, my customer reports some crashes in our website with ExtJS after updating 52.2.0esr at today.
And I got the crash in our website with ExtJS with 53.0.3 at yesterday.
Crash report in 52.2.0esr on Windows7: https://crash-stats.mozilla.com/report/index/3b9e7d2b-2636-4b57-8e7a-f0c880170616
Crash report in 53.0.3 on macOS: https://crash-stats.mozilla.com/report/index/b4b51bfd-3e80-4edd-9022-726ea1170615
In my case, the crash raise when closing ExtJS's window.
(In reply to Taro Matsuzawa from comment #11)
> But I only get the crash after closing Tips window (notes: sometime Tips
> window doesn't show).

I just get the crash without Tips window in https://demo.partkeepr.org/ .
https://crash-stats.mozilla.com/report/index/16f3c57a-7f15-496f-878b-e4fa01170616
Nicolas can you look at this one?
Flags: needinfo?(nicolas.b.pierron)
(In reply to Jan de Mooij [:jandem] from comment #6)

> Are you able to trigger this reliably? 

I can trigger it by opening two browser tabs with https://demo.partkeepr.org/
Close all "message of the day" and other dialog boxes that appear in any of the two tabs.
After that, in the first tab, click on "1 Semiconductors" in the "Categories" pane on the left hand side; wait until the "parts list" view on the right hand side is updated, and after that click and collapse all of the four transistor entries in the "parts list".

> You could also try a debug build to see if you get any assertion failures.

I've tried a nightly debug build (Version 54.0.1, Build ID 20170615031719), and there is an assertions failure immediately before it crashes:

Assertion failure: priorResumePoint != nullptr, at /builds/slave/m-rel-m64-d-000000000000000000/build/src/js/src/jit/IonBuilder.cpp:4334
Thanks all for finding a way to reproduce this issue.
I will investigate it later today.
I managed to reproduce it immediately, with another url from the crash-stat bug reports.
Whatever I do, I did not managed to reproduce it with https://demo.partkeepr.org/.

I will comment once I know more about this issue.
I found the issue, which is located at the opcodes 332 & 335 of the function at line 1507 and column 229 of https://www.draw.io/js/embed-static.min.js :

# from ifeq @ 00057
00306:  jumptarget                      #
00307:  getarg 2                        # d
00310:  dup                             # d d
00311:  callprop "setId"                # d d.setId
00316:  swap                            # d.setId d
00317:  getarg 1                        # d.setId d c
00320:  dup                             # d.setId d c c
00321:  callprop "getAttribute"         # d.setId d c c.getAttribute
00326:  swap                            # d.setId d c.getAttribute c
00327:  string "id"                     # d.setId d c.getAttribute c "id"
00332:  call 1                          # d.setId d c.getAttribute(...)
00335:  call-ignores-rv 1               # d.setId(...)

which corresponds to:  d.setId(c.getAttribute("id"))

We have a MGetPropertyCache with a prior resume point produced at the opcode 311.
It is reseted by the call at the opcode 332, which is expected, because we should not be allowed to use a resume point from the GetPropertyCache after other effectful instructions got executed in the mean time.
When looking at call opcode 335, we fail to fetch the prior resume point.

I will investigate this last failure, but it seems that we might be missing a guard around the MGetPropertyCache content.
Fyi, bughunter has also reproduced this on Windows and Linux also on 64bit on http://ffsim.tumblr.com/lz

Assertion failure: priorResumePoint != nullptr, at /mozilla/builds/nightly/mozilla/js/src/jit/IonBuilder.cpp:4388

also crashes opt[@ js::jit::MBasicBlock::NewWithResumePoint ]:
bp-e5d20f2f-5f7a-4793-98b0-6605e0170622
bp-7d20abb5-51cf-4eb2-85b7-082e20170622

Appears to have regressed during 55:
bp-a96584a6-6cc9-48e7-8a8e-188710170622
OS: Windows 10 → All
Hardware: x86 → All
Version: unspecified → 55 Branch
(In reply to Bob Clary [:bc:] from comment #18)
> Fyi, bughunter has also reproduced this on Windows and Linux also on 64bit
> on http://ffsim.tumblr.com/lz

That's the one that I have been using so far, but I did not knew if I could paste it on bugzilla.
Which lead me to the analysis in comment 17.
Seems ok. Fortunately we could load it in 54 to see it. Public, nothing really personally identifiable and sfw. I don't see any reason to not share it in bugzilla.
I failed to reproduce this issue in the JS shell. I wish I could figure it
out, but as soon as I mimic the piece of code from comment 17, we no longer
encode the fallback path.
Attachment #8880417 - Flags: review?(jdemooij)
Flags: needinfo?(nicolas.b.pierron)
Comment on attachment 8880417 [details] [diff] [review]
Do not attempt to move the MGetPropertyCache into the fallback path, if the resume point got discarded.

Review of attachment 8880417 [details] [diff] [review]:
-----------------------------------------------------------------

Thanks for fixing. Any idea how we could fuzz this better? I wonder if we could add a testing mode where we use polymorphic inlining also if there's a single target... Maybe worth filing a follow-up bug if you think it could make sense.
Attachment #8880417 - Flags: review?(jdemooij) → review+
Pushed by npierron@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/7ef778a735f3
Do not attempt to move the MGetPropertyCache into the fallback path, if the resume point got discarded. r=jandem
(In reply to Jan de Mooij [:jandem] from comment #22)
> Thanks for fixing. Any idea how we could fuzz this better? I wonder if we
> could add a testing mode where we use polymorphic inlining also if there's a
> single target... Maybe worth filing a follow-up bug if you think it could
> make sense.

I think the problem might be that we do not have much function to emulate the dom, thus lacking a function which is seen frequently on an object, predictable, but not-inlinable.
https://hg.mozilla.org/mozilla-central/rev/7ef778a735f3
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla56
I request to uplift this bug to ESR52.
Looking at crash-stats, that seems like a good idea to me too. Nicolas, can you please request Beta/ESR52 approval on this when you get a chance?
Assignee: nobody → nicolas.b.pierron
Flags: needinfo?(nicolas.b.pierron)
Comment on attachment 8880417 [details] [diff] [review]
Do not attempt to move the MGetPropertyCache into the fallback path, if the resume point got discarded.

This patch applies on top of esr52 without merge conflict.

> [Approval Request Comment]
> If this is not a sec:{high,crit} bug, please state case for ESR consideration:
more than ~300 crashes per day

> User impact if declined:
Near-0 segfault.

> Fix Landed on Version:
Firefox 56

> Risk to taking this patch (and alternatives if risky):
Low.

> String or UUID changes made by this patch: 
N/A

> See https://wiki.mozilla.org/Release_Management/ESR_Landing_Process for more info.

> Approval Request Comment
> [Feature/Bug causing the regression]:
I have no idea, the code modified in this patch was written in Bug 837312, but the conditions for making it happen only started to appear recently after a heuristic change, which is unlikely to be the root cause.

> [User impact if declined]:
Near-0 segfault.

> [Is this code covered by automated tests?]:
No. (comment 21 & comment 24)

> [Has the fix been verified in Nightly?]:
Locally. Not enough time since landing to verify from crash-stat reports volume:
https://crash-stats.mozilla.com/signature/?version=56.0a1&signature=js%3A%3Ajit%3A%3AMBasicBlock%3A%3ANewWithResumePoint&date=%3E%3D2016-12-28T13%3A41%3A21.000Z&date=%3C2017-06-28T13%3A41%3A21.000Z#graphs

> [Needs manual test from QE? If yes, steps to reproduce]:
Test 1:
 - Visit http://ffsim.tumblr.com/lz
 - Firefox should not crash.
Test 2: see comment 3.

> [List of other uplifts needed for the feature/fix]:
None.

> [Is the change risky?]:
No.

> [Why is the change risky/not risky?]:
This change add a check for one of the mandatory value needed to use the optimize path. If not present, it fallbacks on a generic path which does not require this value.

> [String changes made/needed]:
N/A
Flags: needinfo?(nicolas.b.pierron)
Attachment #8880417 - Flags: approval-mozilla-esr52?
Attachment #8880417 - Flags: approval-mozilla-beta?
Comment on attachment 8880417 [details] [diff] [review]
Do not attempt to move the MGetPropertyCache into the fallback path, if the resume point got discarded.

crash fix for beta55
Attachment #8880417 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
Flags: qe-verify+
Comment on attachment 8880417 [details] [diff] [review]
Do not attempt to move the MGetPropertyCache into the fallback path, if the resume point got discarded.

Fix a crash. ESR52+. Let's uplift this to ESR52.3.
Attachment #8880417 - Flags: approval-mozilla-esr52? → approval-mozilla-esr52+
I have managed to reproduce this issue with an old Nightly build from 2017-02-10 using STR from comment 3.
- https://crash-stats.mozilla.com/report/index/da5aed13-3aa4-4adb-ba7c-745e10170710

This issue is verified fixed on latest Nightly 56.0a1 (2017-07-09) and 55 beta 7 (20170706085221) across platforms:
- Windows 10 x86
- Ubuntu 16.04 x86
- Mac OS X 10.11.5
Status: RESOLVED → VERIFIED
Flags: qe-verify+
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: