The crash happens on ARM due to unaligned access in mop_sf64

RESOLVED FIXED in Q4 11 - Anza

Status

defect
P3
normal
RESOLVED FIXED
8 years ago
8 years ago

People

(Reporter: kenichia, Assigned: kenichia)

Tracking

unspecified
Q4 11 - Anza
Bug Flags:
flashplayer-injection -
flashplayer-qrb +
flashplayer-bug +

Details

(Whiteboard: WE:2932642, loose-end)

Attachments

(2 attachments)

User Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_7_1) AppleWebKit/535.1 (KHTML, like Gecko) Chrome/13.0.782.220 Safari/535.1

Steps to reproduce:

The url of the reproducible swf is described in Watson# 2932642.

The crash happens in avmplus::mop_sf64.
---
    double_overlay d(value);
    uint8_t* u = (uint8_t*)addr;

    *(uint32_t*)u = d.words.lsw;
    *((uint32_t*)u+1) = d.words.msw;
---

Here is a back trace. The address indicates an odd address.
#0  avmplus::mop_sf64 (addr=0x45c5569, value=nan(0xfffffe0000000))

The following assembly code is generated for mop_sf64.
00000000 <avmplus::mop_sf64>:
   0:   e880000c        stm     r0, {r2, r3}
   4:   e12fff1e        bx      lr

mop_sf32 and mop_lf32 were fixed in Bug# 569691, but mop_sf64/lf64 haven't been fixed.
The changeset is https://hg.mozilla.org/tamarin-redux/rev/813358258f11
Probably we need to apply the same fix to some ARM platforms.
Assignee: nobody → jason.boyer
Flags: flashplayer-qrb+
Flags: flashplayer-injection-
Flags: flashplayer-bug+
Priority: -- → P3
Whiteboard: WE:2932642
Target Milestone: --- → Q1 12 - Brannan
Assignee: jason.boyer → lhansen
Kenichi, I see you've submitted a fix for the Watson Express bug.  Please post a diff of the bug for review here and we'll submit it for you.  Thanks!
Assignee: lhansen → kenichia
Whiteboard: WE:2932642 → WE:2932642, loose-end
Target Milestone: Q1 12 - Brannan → Q4 11 - Anza
Posted a diff in mop_lf64/sf64. Please review the patch.
Also, please refer to the Bug 569691, which fixed mop_lf32/sf32.
Comment on attachment 559830 [details] [diff] [review]
fix mop_lf64 and mop_sf64 in instr.cpp

Lars, please review and land.
Attachment #559830 - Flags: review?(lhansen)
Comment on attachment 559830 [details] [diff] [review]
fix mop_lf64 and mop_sf64 in instr.cpp

Kenichi, I don't understand this change.  If unaligned int access causes a fault on the platform then why is it enabled in the config settings?
Attachment #559830 - Attachment is patch: true
Comment on attachment 559830 [details] [diff] [review]
fix mop_lf64 and mop_sf64 in instr.cpp

R- until we resolve the config issue I raised in my previous comment.
Attachment #559830 - Flags: review?(lhansen) → review-
Lars, thank you for your comment.

I just fixed same as the change of mop_sf32(https://hg.mozilla.org/tamarin-redux/rev/813358258f11).

After my investigation, I found that it depends on the ARM instruction type.
When 'str' instruction is generated, unaligned data access is allowed on the device. But, when 'stm' is generated, unaligned data access is not allowed.
Although ARMv7 supports unaligned data access, it seems that not all load/store instructions support it.
(See "ARM Architecture Reference Manual ARMv7-A and ARMv7-R edition" - A3.2.1 Unaligned data access)

In the case of this bug, mop_sf64 is unfortunately compiled to 'stm'.
On the other hand, mop_si32 is compiled to 'str'. Basically, VMCFG_UNALIGNED_INT_ACCESS gives performance benefit on the device.

mop_si32:
    *(int32_t*)(addr) = int32_t(value);
    --> str r1, [r0]

mop_sf64:
    *(uint32_t*)u = d.words.lsw;
    *((uint32_t*)u+1) = d.words.msw;
    --> stm r0, {r2, r3}
Small bit of trivia: the Palm Pre supported unaligned data access and had the OS take the fault and complete the data transfer.  This allowed unaligned data access to be supported but came at a high overhead.  In this scenario, VMCFG_UNALIGNED_INT_ACCESS provided some functionality gains but it was a performance loss.

Resetting review request to R? given update in comment 7.
Attachment #559830 - Flags: review- → review?
Attachment #559830 - Flags: review? → review?(lhansen)
Sorry, I completely overlooked the reply to my previous comment.  Will look at the patch again today.
Status: UNCONFIRMED → NEW
Ever confirmed: true
Comment on attachment 559830 [details] [diff] [review]
fix mop_lf64 and mop_sf64 in instr.cpp

I do not like this change just as I did not like the change to the 32-bit variants; effectively we're saying that what works on ARM gets to determine what we do on all non-x86 platforms that allow unaligned loads.

That said, it seems that such platforms are few and far between, so the problem is really mostly academic.  R+.  Let me know if you want me to land the patch for you.

There is an impact from this work on bug #692042, I'll make a note on that bug.
Attachment #559830 - Flags: review?(lhansen) → review+
Comment on attachment 559830 [details] [diff] [review]
fix mop_lf64 and mop_sf64 in instr.cpp

The ABC/Wordcode interpreters don't call mop_xxx() functions, instead they inline equivalent code.  Does that code also need patching?
FWIW, its also not ideal if the JIT is calling mop_anything(); the call overhead could easily be higher than the cost of the load+shift+or instructions;  See this ifdef nest in CodegenLIR.cpp:

#if NJ_EXPANDED_LOADSTORE_SUPPORTED && defined(VMCFG_UNALIGNED_INT_ACCESS) && defined(VMCFG_LITTLE_ENDIAN)
    #define VMCFG_MOPS_USE_EXPANDED_LOADSTORE_INT
#endif

#if NJ_EXPANDED_LOADSTORE_SUPPORTED && defined(VMCFG_UNALIGNED_FP_ACCESS) && defined(VMCFG_LITTLE_ENDIAN)
    #define VMCFG_MOPS_USE_EXPANDED_LOADSTORE_FP
#endif

If the VMCFG_UNALIGNED_[INT|FP]_ACCESS and VMCFG_LITTLE_ENDIAN are not set properly, then we are compiling calls to these helper functions, which will be slow.
(In reply to Edwin Smith from comment #11)
> Comment on attachment 559830 [details] [diff] [review] [diff] [details] [review]
> fix mop_lf64 and mop_sf64 in instr.cpp
> 
> The ABC/Wordcode interpreters don't call mop_xxx() functions, instead they
> inline equivalent code.  Does that code also need patching?

It does not need patching.  The interpreters rely on unaligned FP access if that's available but otherwise call the mop_xxx functions.  See macros MOPS_LOAD_FP and MOPS_LOAD_INT.
(In reply to Edwin Smith from comment #12)
> FWIW, its also not ideal if the JIT is calling mop_anything(); the call
> overhead could easily be higher than the cost of the load+shift+or
> instructions;  See this ifdef nest in CodegenLIR.cpp:
> 
> #if NJ_EXPANDED_LOADSTORE_SUPPORTED && defined(VMCFG_UNALIGNED_INT_ACCESS)
> && defined(VMCFG_LITTLE_ENDIAN)
>     #define VMCFG_MOPS_USE_EXPANDED_LOADSTORE_INT
> #endif
> 
> #if NJ_EXPANDED_LOADSTORE_SUPPORTED && defined(VMCFG_UNALIGNED_FP_ACCESS) &&
> defined(VMCFG_LITTLE_ENDIAN)
>     #define VMCFG_MOPS_USE_EXPANDED_LOADSTORE_FP
> #endif
> 
> If the VMCFG_UNALIGNED_[INT|FP]_ACCESS and VMCFG_LITTLE_ENDIAN are not set
> properly, then we are compiling calls to these helper functions, which will
> be slow.

The actual problem is that VMCFG_UNALIGNED_INT_ACCESS has richer semantics than we originally realized because of the over-eager optimizations in the ARM compilers; it's not a question of setting these defines properly, it's a question of properly understanding what their meaning is and using them correctly.

Specifically, VMCFG_UNALIGNED_INT_ACCESS may not be used as a license to access floating point data with integer load instructions.

Sadly this restriction is similar to the no-aliasing restriction, which we've not been good at tackling, but thankfully it's a more controlled problem than that...
Note the docs are *way* too subtle here:  avmfeatures.as says that an unaligned load "… will be at least as efficient as separate instructions to load and assembly the word one byte at a time."  That looks like an assertion but it's actually a *requirement* for enabling the setting.  Rephrasing is desirable, and we *must* document the fact that we're not allowed to use int instructions for FP accesss.
Just clarify what the rules are by beefing up the documentation in avmfeatures.as (patch also includes generated comment for avmfeatures.h).

The use of ( rather than { in the union declaration in the example is unfortunately necessary because { plays a part in the E4X XML literal syntax.
Attachment #565159 - Flags: review?(edwsmith)
Attachment #565159 - Flags: review?(edwsmith) → review+
changeset: 6633:dc04812ad032
user:      Lars T Hansen <lhansen@adobe.com>
summary:   For 685441 - The crash happens on ARM due to unaligned access in mop_sf64: Clean up the documentation for AVMSYSTEM_UNALIGNED_x_ACCESS (r=edwsmith)

http://hg.mozilla.org/tamarin-redux/rev/dc04812ad032
changeset: 6635:d74aa43b6864
user:      Lars T Hansen <lhansen@adobe.com>
summary:   Fix 685441 - The crash happens on ARM due to unaligned access in mop_sf64 (patch=kenichia, r=lhansen, push=lhansen)

http://hg.mozilla.org/tamarin-redux/rev/d74aa43b6864
(In reply to Edwin Smith from comment #12)
> FWIW, its also not ideal if the JIT is calling mop_anything(); the call
> overhead could easily be higher than the cost of the load+shift+or
> instructions;  See this ifdef nest in CodegenLIR.cpp:
> 
> #if NJ_EXPANDED_LOADSTORE_SUPPORTED && defined(VMCFG_UNALIGNED_INT_ACCESS)
> && defined(VMCFG_LITTLE_ENDIAN)
>     #define VMCFG_MOPS_USE_EXPANDED_LOADSTORE_INT
> #endif
> 
> #if NJ_EXPANDED_LOADSTORE_SUPPORTED && defined(VMCFG_UNALIGNED_FP_ACCESS) &&
> defined(VMCFG_LITTLE_ENDIAN)
>     #define VMCFG_MOPS_USE_EXPANDED_LOADSTORE_FP
> #endif
> 
> If the VMCFG_UNALIGNED_[INT|FP]_ACCESS and VMCFG_LITTLE_ENDIAN are not set
> properly, then we are compiling calls to these helper functions, which will
> be slow.

Will open a separate PACMAN item to track this.
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
(In reply to Lars T Hansen from comment #19)

> Will open a separate PACMAN item to track this.

Bug #694026.
You need to log in before you can comment on or make changes to this bug.