Closed Bug 1026919 Opened 10 years ago Closed 10 years ago

IonMonkey: (ARM) Simplify the assembler buffer with constant pools.

Categories

(Core :: JavaScript Engine: JIT, defect)

ARM
All
defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla33
Tracking Status
firefox31 --- wontfix
firefox32 --- fixed
firefox33 --- fixed
b2g-v2.0 --- fixed
b2g-v2.1 --- fixed

People

(Reporter: dougc, Assigned: dougc)

References

Details

(Keywords: verifyme, Whiteboard: QAExclude)

Attachments

(6 files, 23 obsolete files)

559.47 KB, patch
mjrosenb
: review+
RyanVM
: checkin+
Details | Diff | Splinter Review
128.33 KB, patch
dougc
: review+
Details | Diff | Splinter Review
576.13 KB, patch
dougc
: checkin+
Details | Diff | Splinter Review
124.99 KB, patch
Details | Diff | Splinter Review
566.33 KB, patch
Details | Diff | Splinter Review
125.24 KB, patch
Details | Diff | Splinter Review
The ARM backend uses an assembler buffer with support for constant pools. This code has long standing correctness issues that can be addressed relatively quickly and safely by simplifying the code. The simplification reduces the allocator to use only one pool, and to only allocate to a forward pool. This removes a lot of code paths and focuses test coverage on the remainder. The remaining code has be scrutinized for bugs and several corrections made.

This simplification abandons some useful optimizations, such as allocating to backwards pools (the ARM instructions support both positive and negative offsets for PC relative addressing modes), and does not take advantage of the offset reach of some instruction classes. This is expected to cause pools to be placed more frequently and there could be some performance loss. Measurements suggest that the performance loss will be small, a few % at most.

The simplified code no longer has the correctness issues and does not bail out of JIT compilation upon allocation failures, so code affect by these issues will perform much better. This is a particular issue with asm.js code because bailing out affects the entire module. ARMv6 code depends on the pool allocation much more than ARMv7 code so fixing the correctness issues is a particularly good improvement for ARMv6 based devices.

A patch has been tested back to beta (31) which is an ESR release. FWIW Debian Linux appears to use the ESR releases and a Debian derivative Raspbian is used on the Raspberry Pi which is ARMv6 based. This patch would be a good enabler for the Raspberry Pi.

An attempt has been made to split out as much as possible into separate standalone patches, and many of these appear to be landing. The remainder will be combined into a patch for fuzzing, and perhaps we can give the fuzzers a challenge for a change, and increase the case for, and confidence in, the patch.
(In reply to Douglas Crosher [:dougc] from comment #0)
> The simplified code no longer has the correctness issues and does not bail
> out of JIT compilation upon allocation failures, so code affect by these
> issues will perform much better. This is a particular issue with asm.js code
> because bailing out affects the entire module. ARMv6 code depends on the
> pool allocation much more than ARMv7 code so fixing the correctness issues
> is a particularly good improvement for ARMv6 based devices.

I was worried about the performance impact of this Assembler-bailing mechanism (aborting compilation is also not very asm.js-like) and I think removing it outweighs a small performance loss caused by the assembler buffer simplification. Great work!
Initial patch. The major changes are to IonAssemblerBufferWithConstantPools.h, the other changes are minor. The patch does not need much modification for Aurora (32) and Beta (31) so long as the dependant patches land. Just noticed that pool init. needs cleaning up and Assembler::align() needs some attention, and shall make some more passes over it.

https://tbpl.mozilla.org/?tree=Try&rev=a01775c25cd2
Attached patch Combined patch for fuzzing, v1. (obsolete) — — Splinter Review
Combined patch, bug 1026905, bug 1012694, bug 1020834, and bug 1026919 relative to m-c revision 1cea544c74c5. These patch the weak links in the ARM backend that I am aware of, and might be a worthy target for fuzzing?

The ARM backend still fails for large code objects, around 32M, but should now fail gracefully, rather than through a random crash.

This patch adds two js shell command line arguments that might be of interest for testing:

 --arm-asm-nop-fill=<n> fills the code with nop instructions and can test for a class of errors. The obvious value of <n> to test is 1, but it should work with any amount of fill so long as it does not exceed the code size limit.

 --arm-pool-size=<n> the maximum constant pool size. The default and maximum is 1024. I have not yet explored the lower limit. Lowering the pool size may change the pool placement and frequency of pools.

Alternatively I can supply patches relative to Aurora or Beta.
Attachment #8442085 - Flags: feedback?(choller)
Depends on: 1027079
Comment on attachment 8442085 [details] [diff] [review]
Combined patch for fuzzing, v1.

(In reply to Douglas Crosher [:dougc] from comment #3)
> Created attachment 8442085 [details] [diff] [review]
> Combined patch for fuzzing, v1.
> 
>  --arm-asm-nop-fill=<n> fills the code with nop instructions and can test
> for a class of errors. The obvious value of <n> to test is 1, but it should
> work with any amount of fill so long as it does not exceed the code size
> limit.

Any minimum, maximum value?

> 
>  --arm-pool-size=<n> the maximum constant pool size. The default and maximum
> is 1024. I have not yet explored the lower limit. Lowering the pool size may
> change the pool placement and frequency of pools.

Likewise, any minimum value here?
Attachment #8442085 - Flags: feedback?(gary)
Flags: needinfo?(dtc-moz)
Depends on: 1027441
Depends on: 1027476
Flags: needinfo?(dtc-moz)
Bug 1020834 is also sizeable and bundled in the fuzz patch. Fuzzing them together seems a good strategy because there seems little point fuzzing with a weak link.
Depends on: 1020834, 1026905
(In reply to Gary Kwong [:gkw] [:nth10sd] from comment #4)
> Comment on attachment 8442085 [details] [diff] [review]
> Combined patch for fuzzing, v1.
> 
> (In reply to Douglas Crosher [:dougc] from comment #3)
> > Created attachment 8442085 [details] [diff] [review]
> > Combined patch for fuzzing, v1.
> > 
> >  --arm-asm-nop-fill=<n> fills the code with nop instructions and can test
> > for a class of errors. The obvious value of <n> to test is 1, but it should
> > work with any amount of fill so long as it does not exceed the code size
> > limit.
> 
> Any minimum, maximum value?

The minimum is zero, which is the default. The maximum is 2^31-1, but keep in mind that this adds (n * 4) bytes for most instructions emitted so it will hit the 32M code size limit as <n> increases. Large values could be useful for testing the code size limits. It will also slow down the code. I suggest just testing a nop fill of 0 and 1 for a start because a nop fill of 1 might detect most errors in this class.

> > 
> >  --arm-pool-size=<n> the maximum constant pool size. The default and maximum
> > is 1024. I have not yet explored the lower limit. Lowering the pool size may
> > change the pool placement and frequency of pools.
> 
> Likewise, any minimum value here?

The lower usable minimum is 5.  Actually it might be poorly named, it is the pc relative offset limit to a pool entry encoded in an instruction. The encoded offset is restricted to be less than the supplied size <n>. This indirectly controls the pool size. Some ARM instructions limit this offset to being less than 1024 which sets the upper limit for <n> of 1024. A low value of 5 permits an encoded offset of 4 which is just enough to reach over a pool guard plus header word (the encoding has a bias of 8 on the ARM to give a total of 12, or 3x4). Perhaps this will be renamed in review.
Attached patch Combined patch for fuzzing, v2. (obsolete) — — Splinter Review
Sorry, last minute assertions added to the code picked up some issues.

Revised combined patch, bug 1027476, bug 1027441, bug 1026905, bug 1012694, bug 1020834, and bug 1026919 relative to m-c revision f78e532e8a10.

Another option that this patch opens for fuzzing is the ARMv6 code paths which are now expected to pass testing. These code generation paths can be set in the simulator or on hardware using the ARMHWCAP environment variable. e.g. export ARMHWCAP="vfp" for ARMv6 testing.
Attachment #8442085 - Attachment is obsolete: true
Attachment #8442085 - Flags: feedback?(gary)
Attachment #8442085 - Flags: feedback?(choller)
Attachment #8442681 - Flags: feedback?(gary)
Attachment #8442681 - Flags: feedback?(choller)
(In reply to Douglas Crosher [:dougc] from comment #7)
> Created attachment 8442681 [details] [diff] [review]
> Combined patch for fuzzing, v2.
> 
> Sorry, last minute assertions added to the code picked up some issues.
> 
> Revised combined patch, bug 1027476, bug 1027441, bug 1026905, bug 1012694,
> bug 1020834, and bug 1026919 relative to m-c revision f78e532e8a10.
> 
> Another option that this patch opens for fuzzing is the ARMv6 code paths
> which are now expected to pass testing. These code generation paths can be
> set in the simulator or on hardware using the ARMHWCAP environment variable.
> e.g. export ARMHWCAP="vfp" for ARMv6 testing.

Fwiw this combined patch passes tbpl jit-tests on ARVMv6 hardware, namely the Raspberry Pi, and also on the simulator set for ARMv6 code generation.
Thanks for letting us know about suggested minimum and maximum values!

(In reply to Douglas Crosher [:dougc] from comment #7)
> Another option that this patch opens for fuzzing is the ARMv6 code paths
> which are now expected to pass testing. These code generation paths can be
> set in the simulator or on hardware using the ARMHWCAP environment variable.
> e.g. export ARMHWCAP="vfp" for ARMv6 testing.

Can this be a run-time flag (for ARM-simulator/ARM-hardware builds) instead? e.g. --enable-armv6
Flags: needinfo?(dtc-moz)
Depends on: 1028008
(In reply to Gary Kwong [:gkw] [:nth10sd] from comment #9)
> Thanks for letting us know about suggested minimum and maximum values!
> 
> (In reply to Douglas Crosher [:dougc] from comment #7)
> > Another option that this patch opens for fuzzing is the ARMv6 code paths
> > which are now expected to pass testing. These code generation paths can be
> > set in the simulator or on hardware using the ARMHWCAP environment variable.
> > e.g. export ARMHWCAP="vfp" for ARMv6 testing.
> 
> Can this be a run-time flag (for ARM-simulator/ARM-hardware builds) instead?
> e.g. --enable-armv6

Bug 1028008 adds an --arm-hwcap=[capabilities] js shell argument. The capabilities are the same as the for the ARMHWCAP environment variable. This is probably more useful than a simple --enable-armv6 argument because you might want to test different sets of capabilities someday. For ARMv6 testing use --arm-hwcap=vfp  I'll update the combined fuzz patch soon to include this.
Flags: needinfo?(dtc-moz)
Comment on attachment 8442681 [details] [diff] [review]
Combined patch for fuzzing, v2.

Thanks for taking all of these into account! :)

I'll wait for the new patch before continuing testing as I'll need to update the harness to support the new flags and values.
Attachment #8442681 - Flags: feedback?(gary)
Blocks: fxos-rpi
Rebase, and made another pass over the code cleaning up style issues.
Attachment #8442043 - Attachment is obsolete: true
Attached patch Combined patch for fuzzing, v3. (obsolete) — — Splinter Review
Revised combined patch, bug 1027476, bug 1027441, bug 1028008, bug 1012694, bug 1020834, and bug 1026919 relative to m-c revision 366b5c0c02d3.

This picks up the latest versions of all the patches, and includes support for the --arm-hwcap js shell argument for ARMv6 code path testing.
Attachment #8442681 - Attachment is obsolete: true
Attachment #8442681 - Flags: feedback?(choller)
Attachment #8444229 - Flags: feedback?(gary)
Attachment #8444229 - Flags: feedback?(choller)
I'm likely to get to this sometime later this week.
Rework the implementation of align(). Made another pass over the code addressing style issues.

Renamed the --arm-pool-size js shell argument to --asm-pool-max-offset which is more meaningful.

Putting the patch up for review as requested. Feedback welcomed.
Attachment #8444227 - Attachment is obsolete: true
Attachment #8445805 - Flags: review?(jdemooij)
Attached patch Combined patch for fuzzing, v4. (obsolete) — — Splinter Review
Revised combined patch, bug 1020834, and bug 1026919 relative to m-c revision a19e0434ea52.

Many of the dependant patches have now landed. This combined patch picks up the latest versions of the remaining patches.

It includes support for the --arm-hwcap js shell argument for ARMv6 code path testing.

The --arm-pool-size js shell argument has been renamed to --asm-pool-max-offset which is more meaningful.

If fuzzing is not urgent then it might be best to wait for review and revision.
Attachment #8444229 - Attachment is obsolete: true
Attachment #8444229 - Flags: feedback?(gary)
Attachment #8444229 - Flags: feedback?(choller)
Comment on attachment 8445805 [details] [diff] [review]
Simplify the assembler buffer with constant pools.

Review of attachment 8445805 [details] [diff] [review]:
-----------------------------------------------------------------

Nice work, it does look simpler! I'm not very familiar with the constant pools, so I added some requests for comments below. Hopefully once these are addressed everything is clearer and I can take a closer look :)

::: js/src/jit/arm/Assembler-arm.cpp
@@ +1531,5 @@
>    public:
>      enum LoadType {
>          // set 0 to bogus, since that is the value most likely to be
>          // accidentally left somewhere.
> +        poolDTR    = 0,

Nit: this comment no longer matches the code. Also, please capitalize the enum values: PoolDTR etc.

@@ +2791,5 @@
> +
> +uint32_t Assembler::AsmPoolMaxOffset = 1024;
> +
> +uint32_t
> +Assembler::getPoolMaxOffset() {

Nit: { on its own line.

::: js/src/jit/arm/Assembler-arm.h
@@ +1238,5 @@
>      BufferOffset actualOffset(BufferOffset) const;
>      static uint32_t NopFill;
>      static uint32_t getNopFill();
> +    static uint32_t AsmPoolMaxOffset;
> +    static uint32_t getPoolMaxOffset();

Nit: capitalize

@@ +2250,5 @@
>  
>  class AutoForbidPools {
>      Assembler *masm_;
>    public:
> +    AutoForbidPools(Assembler *masm, int32_t maxBytes) : masm_(masm) {

It'd be good to add a comment explaining the maxBytes argument. Also make this uint32_t, as it will never be negative.

::: js/src/jit/arm/MacroAssembler-arm.cpp
@@ +1707,5 @@
>      uint32_t descriptor = MakeFrameDescriptor(framePushed(), JitFrame_IonJS);
>  
>      Push(Imm32(descriptor)); // descriptor_
>  
> +    enterNoPool(2 * 4);

Is "4" the instruction size? In that case, 2 * sizeof(Instruction) would make that more explicit, also a few times below.

@@ +4343,5 @@
>  CodeOffsetLabel
>  MacroAssemblerARMCompat::toggledJump(Label *label)
>  {
>      // Emit a B that can be toggled to a CMP. See ToggleToJmp(), ToggleToCmp().
>      

Nit: pre-existing trailing whitespace; you can just remove this blank line.

::: js/src/jit/shared/IonAssemblerBufferWithConstantPools.h
@@ +25,1 @@
>      const int bias;

Make these unsigned, and add a brief explanation for each of them.

@@ +28,4 @@
>      uint8_t *poolData;
>      uint32_t numEntries;
>      uint32_t buffSize;
>      LoadOffsets loadOffsets;

Same here. Also, could we make these private and add accessors for the ones we need outside the class?

@@ +36,1 @@
>      int limitingUsee;

These could also use a comment.

@@ +260,5 @@
>      AssemblerBufferWithConstantPool(int guardSize_, int headerSize_, int footerSize_,
> +                                    int instBufferAlign_, uint32_t poolMaxOffset_, int pcBias_,
> +                                    uint32_t alignFillInst_, uint32_t nopFillInst_,
> +                                    uint32_t nopFill_ = 0)
> +        : entryCount(0), guardSize(guardSize_), headerSize(headerSize_),

Pre-existing, but it's pretty weird that the arguments have a trailing "_" instead of the members. Can you change that while you're here? (so we'd have guardSize_(guardSize) etc)

@@ +370,5 @@
> +        // The alloction of pool entries is not supported in a no-pool region,
> +        // check.
> +        JS_ASSERT_IF(numPoolEntries, !canNotPlacePool);
> +
> +        if (this->oom() && !this->bail())

Does removing bail() completely require other patches?

@@ +470,5 @@
> +        BufferSlice *perforatedSlice = *getTail();
> +        BufferOffset perforation = this->nextOffset();
> +        Parent::perforate();
> +        perforatedSlice->isNatural = false;
> +        IonSpew(IonSpew_Pools, "[%d] Adding a perforation at offset %d", id, perforation.getOffset());

What is "perforation" exactly? Could you add a comment explaining it?

@@ +542,5 @@
> +    }
> +
> +    void enterNoPool(int32_t maxBytes) {
> +        // Don't allow re-entry.
> +        JS_ASSERT(!canNotPlacePool);

Does this mean we can change the type of canNotPlacePool to |bool|?

@@ +569,5 @@
> +    void leaveNoPool() {
> +        canNotPlacePool--;
> +        JS_ASSERT(!canNotPlacePool);
> +
> +        // Validate the maxBytes argument supplied to enterNoPool().

Nice!

@@ +579,5 @@
> +        // Given a buffer offset return the cumulative size of the pools before
> +        // this offset. This is used to convert a buffer offset to its final
> +        // actual offset.
> +        int cur = 0;
> +        while(cur < numDumps && poolInfo[cur].offset <= offset)

Nit: space between "while" and "(". Also, I'd use "unsigned cur = 0;"

@@ +588,5 @@
> +    }
> +
> +    void align(uint32_t alignment)
> +    {
> +        JS_ASSERT((alignment & (alignment - 1)) == 0);

Nit: JS_ASSERT(IsPowerOfTwo(alignment));

IsPowerOfTwo is defined in jsutil.h

@@ +649,5 @@
>          mozilla::DebugOnly<Chunk *> start = (Chunk*)dest_;
>          Chunk *dest = (Chunk*)(((uint32_t)dest_ + instBufferAlign - 1) & ~(instBufferAlign -1));
>          int curIndex = 0;
>          int curInstOffset = 0;
>          JS_ASSERT(start == dest);

I don't understand the Chunk abstraction. It's just a single instruction right? If that's true, can we just use: |uint32_t *destIns| or |Inst *destIns| or whatever, and kill the mozilla::Array typedef and casts?

@@ +650,5 @@
>          Chunk *dest = (Chunk*)(((uint32_t)dest_ + instBufferAlign - 1) & ~(instBufferAlign -1));
>          int curIndex = 0;
>          int curInstOffset = 0;
>          JS_ASSERT(start == dest);
>          for (BufferSlice * cur = *getHead(); cur != nullptr; cur = cur->getNext()) {

Nit: BufferSlice *cur

@@ +660,2 @@
>                  // Is the current instruction a branch?
> +                if (cur->isBranch[idx >> 3] & (1 << (idx & 7))) {

Can you make isBranch private, rename it to isBranch_ and add an isBranch(unsigned index) method to BufferSlice or something? Then we move some complexity out of this loop, and isBranch() can also do bounds checking etc maybe.

@@ +663,4 @@
>                      patchBranch((Inst*)&src[idx], curIndex, BufferOffset(curInstOffset));
>                  }
>                  memcpy(&dest[idx], &src[idx], sizeof(Chunk));
>              }

This loop seems more complicated than it has to be. Wouldn't something like this work:

Inst *destInst = ...;
...
for (Inst *srcInst = cur->insStart(), srcInstEnd = cur->insEnd();
     srcInst < srcInstEnd;
     srcInst++, destInst++)
{
    // If the current instruction is a branch, patch it.
    if (cur->isBranch(srcInst))
        patchBranch(srcInst, curIndex, BufferOffset(curInstOffset));
    *destInst = *srcInst;
}

@@ +663,5 @@
>                      patchBranch((Inst*)&src[idx], curIndex, BufferOffset(curInstOffset));
>                  }
>                  memcpy(&dest[idx], &src[idx], sizeof(Chunk));
>              }
> +            dest += cur->size() / InstBaseSize;

Then you can remove this line as well.

@@ +724,2 @@
>          int32_t offset;
>          int32_t poolNum;

Can these be uint32_t?
Attachment #8445805 - Flags: review?(jdemooij) → feedback+
(In reply to Gary Kwong [:gkw] [:nth10sd] from comment #14)
> I'm likely to get to this sometime later this week.

Sorry, I've been distracted this week by other stuff + build failures, so will wait again for post-review and revisions.
I'll get to fuzzing of v4 tomorrow most likely :)
(In reply to Christian Holler (:decoder) from comment #19)
> I'll get to fuzzing of v4 tomorrow most likely :)

Could I request deferring this testing for now. All the dependent bugs have now landed so it's just down to this bug now, and the patch is being revised based on the feedback in comment 17. A new patch will be out in a few days.
Attached patch Cleanup of the assembler and backend. (obsolete) — — Splinter Review
This patch address only style issues and some dead code, and makes no substantive changes. These have been split out to keep this noise out of the substantive patch.

* Made a pass over the comments. Capitalizing sentences, adding full stops, reformatting for consistency, and correct some spelling mistakes.

* Checked the enum member names and tried to rename them as suggested in feedback in comment 17.

* Picked up some other minor style issues, space between binary operations.

* Renamed some private variable names as name_ as directed in comment 17.

If the changes look ok then it would be nice to land this patch soon as it will probably bit-rot quickly.

If you spot any other classes of style issues then I'll address them too?
Attachment #8450959 - Flags: review?(jdemooij)
This is a wip patch that addresses many of issues raised in the review of comment 17, but I intend to add more documentation and still need to explore some of the issues.
Attachment #8445805 - Attachment is obsolete: true
Attachment #8445807 - Attachment is obsolete: true
Attachment #8450963 - Attachment is obsolete: true
Attachment #8451497 - Flags: review?(jdemooij)
(In reply to Jan de Mooij [:jandem] from comment #17)
> Comment on attachment 8445805 [details] [diff] [review]
> Simplify the assembler buffer with constant pools.
...

Thank you for the feedback. Most of the issues raised have been addressed.

> > +        if (this->oom() && !this->bail())
> 
> Does removing bail() completely require other patches?

bail() is still used when the code is larger than 32MB. This is not addressed in this patch.

> @@ +649,5 @@
> >          mozilla::DebugOnly<Chunk *> start = (Chunk*)dest_;
> >          Chunk *dest = (Chunk*)(((uint32_t)dest_ + instBufferAlign - 1) & ~(instBufferAlign -1));
> >          int curIndex = 0;
> >          int curInstOffset = 0;
> >          JS_ASSERT(start == dest);
> 
> I don't understand the Chunk abstraction. It's just a single instruction
> right? If that's true, can we just use: |uint32_t *destIns| or |Inst
> *destIns| or whatever, and kill the mozilla::Array typedef and casts?

The Chunk appears to represent a block of size InstBaseSize. This is 32 bit for the ARM. The intention appears to be to make the code ready for different InstBaseSize sizes. This has been retained for now.

> @@ +663,4 @@
> >                      patchBranch((Inst*)&src[idx], curIndex, BufferOffset(curInstOffset));
> >                  }
> >                  memcpy(&dest[idx], &src[idx], sizeof(Chunk));
> >              }
> 
> This loop seems more complicated than it has to be. Wouldn't something like
> this work:
> 
> Inst *destInst = ...;
> ...
> for (Inst *srcInst = cur->insStart(), srcInstEnd = cur->insEnd();
>      srcInst < srcInstEnd;
>      srcInst++, destInst++)
> {
>     // If the current instruction is a branch, patch it.
>     if (cur->isBranch(srcInst))
>         patchBranch(srcInst, curIndex, BufferOffset(curInstOffset));
>     *destInst = *srcInst;
> }

With Chunk retained this is not practical, but the loop has been cleaned up a bit more in other ways.

There is the option to hard-code in a 32 bit instruction size and this would simplify the code further, and some of the code already makes this assumption.

Looking forward, would the assembler-buffer-with-constant-pools always be layered on another implementation and be specific to the ARM and MIPS with a 32 bit instruction size? If the x86 and x64 ever get a patchable elastic assembler buffer then they still may not use the assembler-buffer-with-constant-pools.

If you want to hard code in the 32 bit instruction size for the assembler-buffer-with-constant-pools then let me know and I'll clean it up further?
Flags: needinfo?(jdemooij)
Attached patch 1. Cleanup of the assembler and backend. (obsolete) — — Splinter Review
Fixed a few more style issues in IonAssemblerBuffer.h.
Attachment #8450959 - Attachment is obsolete: true
Attachment #8450959 - Flags: review?(jdemooij)
Attachment #8451547 - Flags: review?(jdemooij)
Rebase.
Attachment #8451497 - Attachment is obsolete: true
Attachment #8451497 - Flags: review?(jdemooij)
Attachment #8451548 - Flags: review?(jdemooij)
Attached patch 1. Cleanup of the assembler and backend. (obsolete) — — Splinter Review
Spotted some more suspect class static method names and class static variables that need the first letter capitalized, and made another pass looking to these style issues.
Attachment #8452155 - Flags: review?(mrosenberg)
Attachment #8451547 - Attachment is obsolete: true
Attachment #8451547 - Flags: review?(jdemooij)
Rebase.
Attachment #8451548 - Attachment is obsolete: true
Attachment #8451548 - Flags: review?(jdemooij)
Attachment #8452180 - Flags: review?(jdemooij)
Comment on attachment 8452155 [details] [diff] [review]
1. Cleanup of the assembler and backend.

Review of attachment 8452155 [details] [diff] [review]:
-----------------------------------------------------------------

Bunch of nits. Might as well fix them, since this is a clean up patch.

::: js/src/jit/arm/Architecture-arm.h
@@ +38,5 @@
>  // These offsets are related to bailouts.
>  ////
>  
> +// Size of each bailout table entry. On arm, this is presently a single call
> +// (which is wrong!). the call clobbers lr.

Capitalize 'the' while you're here.

@@ +39,5 @@
>  ////
>  
> +// Size of each bailout table entry. On arm, this is presently a single call
> +// (which is wrong!). the call clobbers lr.
> +// For now, I've dealt with this by ensuring that we never allocate to lr.  it

Single space, capitalize 'it'

::: js/src/jit/arm/Assembler-arm.cpp
@@ +96,5 @@
>      return r.code() << 8;
>  }
>  
> +// Encode a standard register when it is being used as src1, the dest, and an
> +// extra register.  For these, an InvalidReg is used to indicate a optional

One space after period.

@@ +234,2 @@
>          return (InstNOP*) (&i);
>      return nullptr;

Since these are all basically the same, we should probably standardize on not having the parens around &i

@@ +671,5 @@
>  
>          // Make sure we're branching to the same register.
>  #ifdef DEBUG
> +        // A toggled call sometimes has a NOP instead of a branch for the third
> +        // instruction.  No way to assert that it's valid in that situation.

One space after period

@@ +2174,4 @@
>      return writeVFPInst(sz, ls | 0x01000000 | addr.encode() | VD(vd) | c, dest);
>  }
>  
> +// VFP's ldm/stm work differently from the standard arm ones.  You can only

One space after period.

@@ +2486,5 @@
>  {
>      return;
>  }
>  
> +// The size of an arbitrary 32-bit call in the instruction stream.  On ARM this

One space after period

@@ +2499,3 @@
>  {
>      Instruction *inst = (Instruction *) start.raw();
> +    // Overwrite whatever instruction used to be here with a call.  Since the

One space after period.

@@ +2774,5 @@
>  
> +    *inst = InstALU(InvalidReg, index, imm8, OpCmp, SetCond, Always);
> +    // NOTE: we don't update the Auto Flush Cache!  this function is currently
> +    // only called from within AsmJSModule::patchHeapAccesses, which does that
> +    // for us.  Don't call this!

one space after period.

@@ +2782,5 @@
>  {
>      // Work around pools with an artificial pool guard and around nop-fill.
>      i = i->skipPool();
>  }
> +Assembler *Assembler::Dummy = nullptr;

I'm not sure the capitalized-static values rule holds for non-functions.

::: js/src/jit/arm/Assembler-arm.h
@@ +19,5 @@
>  
>  namespace js {
>  namespace jit {
>  
> +// NOTE: there are duplicates in this list!  Sometimes we want to specifically

One space.

@@ +128,5 @@
>  static MOZ_CONSTEXPR_VAR FloatRegister d13(FloatRegisters::d13);
>  static MOZ_CONSTEXPR_VAR FloatRegister d14(FloatRegisters::d14);
>  static MOZ_CONSTEXPR_VAR FloatRegister d15(FloatRegisters::d15);
>  
> +// For maximal awesomeness, 8 should be sufficent.  ldrd/strd (dual-register

One space (and again two lines down)

@@ +161,5 @@
>  uint32_t VN(VFPRegister vr);
>  uint32_t VM(VFPRegister vr);
>  
> +// For being passed into the generic vfp instruction generator when there is an
> +// instruction that only takes two registers

Terminating period.

@@ +247,5 @@
>      IsStore = 0 << 20
>  };
> +// You almost never want to use this directly.  Instead, you wantto pass in a
> +// signed constant, and let this bit be implicitly set for you.  this is
> +// however, necessary if we want a negative index

One space (x2), capitalize 'this', terminating period.

@@ +331,5 @@
> +// component Imm8data structures.  The reason this is so horribly round-about is
> +// I wanted to have Imm8 and RegisterShiftedRegister inherit directly from
> +// Operand2 but have all of them take up only a single word of storage.  I also
> +// wanted to avoid passing around raw integers at all since they are error
> +// prone.

bunch of double spaces here.

@@ +440,5 @@
>        : imm4L(-1U & 0xf), imm4H(-1U & 0xf), isInvalid(-1)
>      { }
>  
>      Imm8VFPImmData(uint32_t imm)
> +      : imm4L(imm&0xf), imm4H(imm >> 4), isInvalid(0)

Spaces around '&' as well.

@@ +666,5 @@
> +// An offset from a register to be used for ldr/str.  This should include the
> +// sign bit, since ARM has "signed-magnitude" offsets.  That is it encodes an
> +// unsigned offset, then the instruction specifies if the offset is positive or
> +// negative.  The +/- bit is necessary if the instruction set wants to be able
> +// to have a negative register offset e.g. ldr pc, [r1,-r2];

More spaces.

@@ +1146,5 @@
>          { }
>      };
>  
> +    // TODO: this should actually be a pool-like object.  It is currently a big
> +    // hack, and probably shouldn't exist

Single space, terminal period.

@@ +1168,5 @@
> +    // but pass it a destination address, which will be overwritten with a new
> +    // instruction. In order to do this very after assembly buffers no longer
> +    // exist, when calling with a third dest parameter, a this object is still
> +    // needed.  dummy always happens to be null, but we shouldn't be looking at
> +    // it in any case.

Spaces, not sure if the variable 'Dummy' should be capitalized, but if it should be, then it should be updated in the comment.

@@ +1389,2 @@
>      BufferOffset as_FImm64Pool(VFPRegister dest, double value, Condition c = Always);
> +    // Load a 32 bit floating point immediate from a pool into a register

terminal periods, here and above.

@@ +1396,4 @@
>      BufferOffset as_bx(Register r, Condition c = Always, bool isPatchable = false);
>  
> +    // Branch can branch to an immediate *or* to a register.  Branches to
> +    // immediates are pc relative, branches to registers are absolute

spaces, terminal periods.

@@ +1488,1 @@
>      // determined by the float2core.

Single space.

@@ +1494,4 @@
>      // to uniquely specify the encoding that we are going to use.
>      BufferOffset as_vcvt(VFPRegister vd, VFPRegister vm, bool useFPSCR = false,
>                           Condition c = Always);
> +    // Hard coded to a 32 bit fixed width result for now

period.

@@ +1716,5 @@
>      }
>  }; // Assembler
>  
> +// An Instruction is a structure for both encoding and decoding any and all ARM
> +// instructions.  many classes have not been implemented thusfar.

spaces, capitals.

@@ +1769,5 @@
>      const uint32_t *raw() const { return &data; }
>      uint32_t size() const { return 4; }
>  }; // Instruction
>  
> +// Make sure that it is the right size

period.

@@ +2113,5 @@
>          bool a = value >> 7;
>          bool b = value >> 6 & 1;
>          bool B = !b;
>          uint32_t cdefgh = value & 0x3f;
> +        return a << 31 | B << 30 | rep(b, 8) << 22 | cdefgh << 16;

Personally, I like the multi-line setup better, possibly with more alignment of the operands to |<<|

::: js/src/jit/arm/CodeGenerator-arm.cpp
@@ +425,5 @@
>                      Register src = ToRegister(lhs);
>                      uint32_t shift = FloorLog2(constant);
>                      uint32_t rest = constant - (1 << shift);
> +                    // See if the constant has one bit set, meaning it can be
> +                    // encoded as a bitshift

Period.

@@ +432,5 @@
>                          handled = true;
>                      } else {
> +                        // If the constant cannot be encoded as (1<<C1), see if
> +                        // it can be encoded as (1<<C1) | (1<<C2), which can be
> +                        // computed using an add and a shift

Period. You may also want to make the comments follow our coding guidelines, e.g. (1 << C1)

@@ +682,5 @@
>      // If (Y < 0), then we compare X with 0, and bail if X == 0
> +    // If (Y == 0), then we simply want to bail.  Since this does not set the
> +    // flags necessary for LT to trigger, we don't test X, and take the bailout
> +    // because the EQ flag is set.
> +    // If (Y > 0), we don't set EQ, and we don't trigger LT, so we don't take

Single space.

@@ +1104,5 @@
> +    // $pc+8.  In other words, there is an empty word after the branch into the
> +    // switch table before the table actually starts.  Since the only other
> +    // unhandled case is the default case (both out of range high and out of
> +    // range low) I then insert a branch to default case into the extra slot,
> +    // which ensures we don't attempt to execute the address table.

Spaces.

::: js/src/jit/arm/MacroAssembler-arm.cpp
@@ +26,5 @@
>  isValueDTRDCandidate(ValueOperand &val)
>  {
>      // In order to be used for a DTRD memory function, the two target registers
> +    // need to be a) Adjacent, with the tag larger than the payload, and b)
> +    // Aligned to a multiple of two.

Capitalize 'need'

@@ +90,5 @@
> +// was clamped to INT_MIN/INT_MAX, and we can test it.  NOTE: if the value
> +// really was supposed to be INT_MAX / INT_MIN then it will be wrong.
> +//
> +// 2. Convert the floating point value to an integer, if it did not fit, then it
> +// set one or two bits in the fpcsr.  Check those.

Spaces x2

@@ +125,5 @@
>      if (negativeZeroCheck) {
>          ma_cmp(dest, Imm32(0));
> +        // Test and bail for -0.0, when integer result is 0.  Move the top word
> +        // of the double into the output reg, if it is non-zero, then the
> +        // original value was -0.0

One  space, trailing period.

@@ +154,5 @@
>      if (negativeZeroCheck) {
>          ma_cmp(dest, Imm32(0));
> +        // Test and bail for -0.0, when integer result is 0.  Move the float
> +        // into the output reg, and if it is non-zero then the original value
> +        // was -0.0

One space, trailing period.

@@ +252,5 @@
> +    // For the most part, there is no good reason to set the condition codes for
> +    // the first instruction.  We can do better things if the second instruction
> +    // doesn't have a dest, such as check for overflow by doing first operation
> +    // don't do second operation if first operation overflowed.  this preserves
> +    // the overflow condition code.  Unfortunately, it is horribly brittle.

One space x 3, capitalize 'this'

@@ +286,5 @@
> +    // points, dest can be replaced (nearly always invalid => ScratchRegister)
> +    // This is useful if we wish to negate tst.  tst has an invalid (aka not
> +    // used) dest, but its negation is bic *requires* a dest.  We can
> +    // accomodate, but it will need to clobber *something*, and the scratch
> +    // register isn't being used, so...

Spaces.

@@ +294,5 @@
>      }
>  
>      if (HasMOVWT()) {
> +        // If the operation is a move-a-like then we can try to use movw to move
> +        // the bits into the destination.  Otherwise, we'll need to fall back on

Spaces.

@@ +321,5 @@
> +            // already exists.  If it can't be done with a single movw then we
> +            // *need* to use two instructions since this must be some sort of a
> +            // move operation, we can just use a movw/movt pair and get the
> +            // whole thing done in two moves.  This does not work for ops like
> +            // add, sinc we'd need to do:

Spaces x2, 'sinc'->'since'

@@ +351,5 @@
> +    // Since we want the condition codes (and don't know which ones will be
> +    // checked), we need to assume that the overflow flag will be checked and
> +    // add{,s} dest, src, 0xff00; add{,s} dest, dest, 0xff is not guaranteed to
> +    // set the overflow flag the same as the (theoretical) one instruction
> +    // variant.

Spaces, spaces everywhere.  also, some first words need to be capitalized.

@@ +368,5 @@
>          as_movw(ScratchRegister, imm.value & 0xffff, c);
>          if ((imm.value >> 16) != 0)
>              as_movt(ScratchRegister, (imm.value >> 16) & 0xffff, c);
>      } else {
> +        // Going to have to use a load.  If the operation is a move, then just

Spaces

@@ +423,5 @@
>      }
>      switch(rs) {
>        case L_MOVWT:
>          as_movw(dest, Imm16(imm & 0xffff), c, i);
> +        // I can be nullptr here.  that just means "insert in the next in

'i' was referring to the variable here, and should remain lower case.

@@ +1653,5 @@
>      Register base = Register::FromCode(addr.base());
>      if (off > -1024 && off < 1024)
>          return as_vdtr(ls, rt, addr.toVFPAddr(), cc);
>  
> +    // We cannot encode this offset in a a single ldr.  Try to encode it as an

Spaces.

@@ +2493,3 @@
>          as_vxfer(output, InvalidReg, ScratchFloat32Reg.uintOverlay(), FloatToCore);
> +        // See if this value *might* have been an exact integer after adding
> +        // 0.5.  This tests the 1/2 through 1/16,777,216th places, but 0.5 needs

Spaces.

@@ +3082,5 @@
>  MacroAssemblerARMCompat::branchTestValue(Condition cond, const ValueOperand &value, const Value &v,
>                                           Label *label)
>  {
> +    // If cond == NotEqual, branch when a.payload != b.payload || a.tag !=
> +    // b.tag.  If the payloads are equal, compare the tags. If the payloads are

Spaces.

@@ +3628,5 @@
> +// Also ION is breaking the ARM EABI here (sort of). The ARM EABI says that a
> +// function call should move the pc into the link register, then branch to the
> +// function, and *sp is data that is owned by the caller, not the callee.  The
> +// ION ABI says *sp should be the address that we will return to when leaving
> +// this function

Spaces, trailing period.

@@ +3633,5 @@
>  void
>  MacroAssemblerARM::ma_callIon(const Register r)
>  {
> +    // When the stack is 8 byte aligned, we want to decrement sp by 8, and write
> +    // pc+8 into the new sp.  when we return from this call, sp will be its

Spaces, capitalize 'when'

@@ +3652,5 @@
>  
>  void
>  MacroAssemblerARM::ma_callIonHalfPush(const Register r)
>  {
> +    // The stack is unaligned by 4 bytes.  We push the pc to the stack to align

Spaces.

@@ +4188,5 @@
>  
> +    // The argument is a positive number, truncation is the path to glory. Since
> +    // it is known to be > 0.0, explicitly convert to a larger range, then a
> +    // value that rounds to INT_MAX is explicitly different from an argument
> +    // that clamps to INT_MAX

Trailing period.

@@ +4212,5 @@
>      ma_vcvt_U32_F64(ScratchDoubleReg.uintOverlay(), ScratchDoubleReg);
>      compareDouble(ScratchDoubleReg, input);
>      ma_add(output, Imm32(1), output, NoSetCond, NotEqual);
> +    // Negate the output.  Since INT_MIN < -INT_MAX, even after adding 1, the
> +    // result will still be a negative number.

Spaces.

@@ +4217,5 @@
>      ma_rsb(output, Imm32(0), output, SetCond);
>      // Flip the negated input back to its original value.
>      ma_vneg(input, input);
> +    // If the result looks non-negative, then this value didn't actually fit
> +    // into the int range, and special handling is required.  Zero is also

Spaces x2.

@@ +4268,5 @@
>      ma_rsb(output, Imm32(0), output, SetCond);
>      // Flip the negated input back to its original value.
>      ma_vneg_f32(input, input);
> +    // If the result looks non-negative, then this value didn't actually fit
> +    // into the int range, and special handling is required.  Zero is also

Spaces x2.

@@ +4430,5 @@
>  
> +    // The argument is a positive number, truncation is the path to glory; Since
> +    // it is known to be > 0.0, explicitly convert to a larger range, then a
> +    // value that rounds to INT_MAX is explicitly different from an argument
> +    // that clamps to INT_MAX

Trailing period.

@@ +4447,5 @@
>      ma_b(&fin);
>  
>      bind(&handleNeg);
> +    // Negative case, negate, then start dancing.  This number may be positive,
> +    // since we added 0.5

Spaces, trailing period.

@@ +4458,5 @@
>      ma_vcvt_U32_F64(ScratchDoubleReg.uintOverlay(), ScratchDoubleReg);
>      compareDouble(ScratchDoubleReg, tmp);
>      ma_sub(output, Imm32(1), output, NoSetCond, Equal);
> +    // Negate the output.  Since INT_MIN < -INT_MAX, even after adding 1, the
> +    // result will still be a negative number

Spaces, trailing period.

@@ +4497,5 @@
>  
> +    // The argument is a positive number, truncation is the path to glory; Since
> +    // it is known to be > 0.0, explicitly convert to a larger range, then a
> +    // value that rounds to INT_MAX is explicitly different from an argument
> +    // that clamps to INT_MAX

Trailing period.

@@ +4506,5 @@
>      ma_b(&fin);
>  
>      bind(&handleZero);
> +    // Move the whole float32 into the output reg, if it is non-zero, then the
> +    // original value was -0.0

Trailing period.

@@ +4514,5 @@
>      ma_b(&fin);
>  
>      bind(&handleNeg);
> +    // Negative case, negate, then start dancing.  This number may be positive,
> +    // since we added 0.5

Spaces, trailing period.

@@ +4525,5 @@
>      ma_vcvt_U32_F32(ScratchFloat32Reg.uintOverlay(), ScratchFloat32Reg);
>      compareFloat(ScratchFloat32Reg, tmp);
>      ma_sub(output, Imm32(1), output, NoSetCond, Equal);
> +    // Negate the output.  Since INT_MIN < -INT_MAX, even after adding 1, the
> +    // result will still be a negative number

spaces, trailing period.

::: js/src/jit/arm/MacroAssembler-arm.h
@@ +109,5 @@
>      void ma_movPatchable(ImmPtr imm, Register dest, Assembler::Condition c,
>                           RelocStyle rs, Instruction *i = nullptr);
> +
> +    // These should likely be wrapped up as a set of macros or something like
> +    // that.  I cannot think of a good reason to explicitly have all of this

Spaces.

@@ +278,5 @@
>      void ma_sdiv(Register num, Register div, Register dest, Condition cond = Always);
>      void ma_udiv(Register num, Register div, Register dest, Condition cond = Always);
>  
> +    // Memory:
> +    // Sshortcut for when we know we're transferring 32 bits of data.

'Sshortcut'->'shortcut'

@@ +424,5 @@
>      }
>  
>  private:
>      // Implementation for transferMultipleByRuns so we can use different
> +    // iterators for forward/backward traversals.  The sign argument should be 1

Spaces.

@@ +470,1 @@
>      // alignment requirements

Trailing period.

::: js/src/jit/shared/CodeGenerator-x86-shared.cpp
@@ +2005,5 @@
>  {
>      // Ensure that there is enough space in the buffer for the OsiPoint
>      // patching to occur. Otherwise, we could overwrite the invalidation
>      // epilogue.
> +    for (size_t i = 0; i < sizeof(void *); i+= Assembler::NopSize())

space before |+=|

::: js/src/jit/shared/IonAssemblerBuffer.h
@@ +31,5 @@
>      // A BOffImm is a Branch Offset Immediate. It is an architecture-specific
> +    // structure that holds the immediate for a pc relative branch.  diffB takes
> +    // the label for the destination of the branch, and encodes the immediate
> +    // for the branch.  This will need to be fixed up later, since A pool may be
> +    // inserted between the branch and its destination

Spaces x2, trailing period.

::: js/src/jit/shared/IonAssemblerBufferWithConstantPools.h
@@ +861,4 @@
>              // Since the pool entry was filled back-to-front, and in the next buffer, the elements
>              // should be front-to-back, this insertion also needs to proceed backwards
>              int idx = outcasts[poolIdx].length();
>              for (BufferOffset *iter = outcasts[poolIdx].end()-1;

Spaces around the '-'.

@@ +865,5 @@
>                   iter != outcasts[poolIdx].begin()-1;
>                   --iter, --idx) {
>                  pools[poolIdx].updateLimiter(*iter);
>                  Inst *inst = this->getInst(*iter);
> +                Asm::InsertTokenIntoTag(pools[poolIdx].instSize, (uint8_t*)inst, outcasts[poolIdx].end()-1-iter);

Spaces around the '-'es.

@@ +870,1 @@
>                  pools[poolIdx].insertEntry(&outcastEntries[poolIdx][idx*pools[poolIdx].immSize], *iter, this->LifoAlloc_);

spaces around '*'.

@@ +872,5 @@
>              delete[] outcastEntries[poolIdx];
>          }
>          // this (*2) is not technically kosher, but I want to get this bug fixed.
>          // It should actually be guardSize + the size of the instruction that we're attempting
>          // to insert. Unfortunately that vaue is never passed in.  On ARM, these instructions

Spaces.

@@ +919,5 @@
>              if (shouldMarkAsBranch)
>                  this->markNextAsBranch();
>          }
>  
>          // We have a perforation.  Time to cut the instruction stream, patch in the pool

Spaces.

@@ +953,2 @@
>                      // it already only aligns when the pool has data in it, but we want to not
>                      // align when all entries will end up in the backwards half of the pool

Trailing period.

@@ +953,5 @@
>                      // it already only aligns when the pool has data in it, but we want to not
>                      // align when all entries will end up in the backwards half of the pool
>                      poolOffset = p->align(poolOffset);
>                      IonSpew(IonSpew_Pools, "[%d] Entry %d in pool %d is before the pool.", id, idx, poolIdx);
>                      // Everything here is known, we can safely do the necessary substitutions

Trailing period.

@@ +954,5 @@
>                      // align when all entries will end up in the backwards half of the pool
>                      poolOffset = p->align(poolOffset);
>                      IonSpew(IonSpew_Pools, "[%d] Entry %d in pool %d is before the pool.", id, idx, poolIdx);
>                      // Everything here is known, we can safely do the necessary substitutions
>                      Inst * inst = this->getInst(*iter);

For a change, too many spaces around |*|

@@ +962,1 @@
>                      // pool entry that is being loaded.  We need to do a non-trivial amount

Spaces.
Attachment #8452155 - Flags: review?(mrosenberg)
Attached patch 1. Cleanup of the assembler and backend. (obsolete) — — Splinter Review
Address the reviewer feedback in comment 30, thank you for the detailed feedback. Fixed more issues while addressing these. Fixes for the MIPS changes which are now confirmed to build.

The naming of class static variables is unresolved and I am requesting a call to be made on this? The patch capitalizes them. For example:
    static Assembler *Dummy;
Attachment #8452155 - Attachment is obsolete: true
Attachment #8452366 - Flags: feedback?(jdemooij)
(In reply to Douglas Crosher [:dougc] from comment #31)
> Created attachment 8452366 [details] [diff] [review]
> 1. Cleanup of the assembler and backend.
> 
> Address the reviewer feedback in comment 30, thank you for the detailed
> feedback. Fixed more issues while addressing these. Fixes for the MIPS
> changes which are now confirmed to build.
> 
> The naming of class static variables is unresolved and I am requesting a
> call to be made on this? The patch capitalizes them. For example:
>     static Assembler *Dummy;

fwiw, doing a search for ".  " still returns a handful of results in the right column (sorry if I didn't catch those the first time)
Thank you for the feedback. Doubled checked for ".  " and fixed all in the arm/ directory, and found more comments to clean up.

There is still the issue of 'static Assembler *Dummy;', can you accept this being capitalized, or I'll revert it?

Apparently enum member names should be capitalized, and are the follow an exception? Even if these need changing, could this is left for another patch?
class Registers { enum RegisterID { r0 = 0, ... }}
class FloatRegisters { enum FPRegisterID { d0, ...}}
Attachment #8452366 - Attachment is obsolete: true
Attachment #8452366 - Flags: feedback?(jdemooij)
Attachment #8452797 - Flags: review?(mrosenberg)
Attachment #8452797 - Flags: review?(mrosenberg) → review+
The change to ensureOsiSpace() seems bad, got a crash in local testing. It does appear that there can be multiple OSI points at the same code location for the ARM too, and some fill appears to be necessary. The change to ensureOsiSpace() has been reverted.

There are still have some concerns about ensureOsiSpace(). It might be inserting excessive fill. Patching the code without taking into account pools looks like trouble if the patched code is not abandoned. The need to know code instruction positions at compile time is a blocker for planned work. Need to study this code, however these would be pre-existing issues that could be dealt with in a separate bug.
Attachment #8452180 - Attachment is obsolete: true
Attachment #8452180 - Flags: review?(jdemooij)
Attachment #8453761 - Flags: review?(jdemooij)
Attached patch 1.2 Cleanup of the assembler and backend. (obsolete) — — Splinter Review
A few more style issues and fixes to the prior patch. Had already removed dumpPool(), it just called flushBuffer().
Attachment #8454475 - Flags: review?(mrosenberg)
Approval Request Comment
[Feature/regressing bug #]: General style issues.

[User impact if declined]: None. This is only to help developers. There have been a lot of trivial changes between 32 and 33 and these make backporting difficult, time consuming, and more risky. There are other substantial patches to uplift.

[Describe test coverage new/current, TBPL]: The patch has landed on m-c. The backport has been tested locally; the tbpl run has problems not seen locally and they appear unrelated: https://tbpl.mozilla.org/?tree=Try&rev=741d05353097

[Risks and why]: Negligible. This only reformats comments and renames some method, variables, and enum member names to conform to the expected style. Errors are most likely to be caught at compile time.

[String/UUID change made/needed]:
Attachment #8454853 - Flags: approval-mozilla-aurora?
Approval Request Comment
[Feature/regressing bug #]: General style issues.

[User impact if declined]: None. This is only to help developers. There have been a lot of trivial changes between 31 and 33 and these make backporting difficult, time consuming, and more risky. There are other substantial patches to uplift although they are not expect to land until a later ESR 31 point release, however given the negligible risk you might want to consider landing this now.

[Describe test coverage new/current, TBPL]: The patch has landed on m-c. The backport has been tested locally; the tbpl run has problems not seen locally and they appear unrelated: https://tbpl.mozilla.org/?tree=Try&rev=4cd33e705987

[Risks and why]: Negligible. This only reformats comments and renames some methods, variables, and enum member names to conform to the expected style. Errors are most likely to be caught at compile time.

[String/UUID change made/needed]:
Attachment #8454856 - Flags: approval-mozilla-beta?
Comment on attachment 8454856 [details] [diff] [review]
1. Cleanup of the assembler and backend. Backport for ESR 31.

Withdraw the request for FF31, shall try again for ESR31.
Attachment #8454856 - Attachment description: 1. Cleanup of the assembler and backend. Backport for Beta 31. → 1. Cleanup of the assembler and backend. Backport for ESR 31.
Attachment #8454856 - Flags: approval-mozilla-beta?
(In reply to Douglas Crosher [:dougc] from comment #43)
> Comment on attachment 8454856 [details] [diff] [review]
> 1. Cleanup of the assembler and backend. Backport for ESR 31.
> 
> Withdraw the request for FF31, shall try again for ESR31.

After reading up on the ESR criteria it's now clear to me that this work would not be in scope. "Backports of any functional enhancements and/or stability fixes would not be in scope." Sorry for the noise.
Doug, sorry for the delay. I'll do my best to review this patch tomorrow.

And yeah let's be careful with backports. These patches fix some issues but they are also very big and it's easy to introduce new bugs (or expose old bugs) :)
Attachment #8454475 - Flags: review?(mrosenberg) → review+
mjrosenb concurs that this code can be specialized for a fixed 32-bit instruction size and this will address some of your feedback in comment 17 related to the 'Chunk' type and rewriting some of the loops. I'll have a new patch out soon.

(In reply to Jan de Mooij [:jandem] from comment #45)
> Doug, sorry for the delay. I'll do my best to review this patch tomorrow.
> 
> And yeah let's be careful with backports. These patches fix some issues but
> they are also very big and it's easy to introduce new bugs (or expose old
> bugs) :)
Flags: needinfo?(jdemooij)
Simplify a little more using the assumption of a fix 32-bit instruction size.
Attachment #8453761 - Attachment is obsolete: true
Attachment #8453761 - Flags: review?(jdemooij)
Attachment #8456082 - Flags: review?(jdemooij)
Comment on attachment 8456082 [details] [diff] [review]
2. Simplify the assembler buffer with constant pools.

Review of attachment 8456082 [details] [diff] [review]:
-----------------------------------------------------------------

Nice work! The comments were really useful. My comments below are mostly nits.

It would be great if we could land it this week/cycle (sorry for the review delay).

::: js/src/jit/arm/CodeGenerator-arm.cpp
@@ +1112,5 @@
>      int32_t cases = mir->numCases();
>      // Lower value with low value.
>      masm.ma_sub(index, Imm32(mir->low()), index, SetCond);
>      masm.ma_rsb(index, Imm32(cases - 1), index, SetCond, Assembler::NotSigned);
> +    AutoForbidPools afp(&masm, 1 + 3 + cases);

Is the 1 for the ma_b and the 3 for the instructions ma_ldr can emit? Please add a short comment explaining these numbers.

::: js/src/jit/shared/IonAssemblerBufferWithConstantPools.h
@@ +11,5 @@
>  
>  #include "jit/IonSpewer.h"
>  #include "jit/shared/IonAssemblerBuffer.h"
>  
> +// This code extends the AssemblerBuffer to support the pooling of values loaded

Thanks, awesome comment! Very helpful.

@@ +27,5 @@
> +// stored separately from the buffer slice data during assembly and is
> +// concatenated together with the instructions when finishing the buffer. The
> +// parent AssemblerBuffer slices are extended by the BufferSliceTail class below
> +// to include the pool information. The pool guard instructions (the branches
> +// around the pools) are included in the buffer slices.

Can you also describe the actual pool format somewhere, like why it needs a header etc?

@@ +30,5 @@
> +// to include the pool information. The pool guard instructions (the branches
> +// around the pools) are included in the buffer slices.
> +//
> +// To illustrate.  Without the pools there are just slices of instructions. The
> +// slice size may varying - they need not be completely full.

Nit: s/varying/vary

@@ +75,5 @@
> +// distance, between instructions and their pool entries, for all such
> +// pairs. This pair is recorded as the limiter, and it is updated when new pool
> +// entries are added, see updateLimiter()
> +//
> +// It is often required to keep a reference to an pool entry, to patch it after

Nit: s/an/a

@@ +76,5 @@
> +// pairs. This pair is recorded as the limiter, and it is updated when new pool
> +// entries are added, see updateLimiter()
> +//
> +// It is often required to keep a reference to an pool entry, to patch it after
> +// the buffer is finished. Each pool entry is assigned a unique in, counting up

s/in/id?

@@ +81,5 @@
> +// from zero (see the poolEntryCount slot below). The can be mapped back to the
> +// offset of the pool entry in the finished buffer, see poolEntryOffset().
> +//
> +// The code supports no-pool regions, and for these the size of the region, in
> +// instruction bytes, must be supplied. This size is used to determine if

It's now the number of instructions instead of instruction bytes right?

@@ +91,5 @@
> +// The only planned instruction sets that require inline constant pools are the
> +// ARM, ARM64, and MIPS, and these all have fixed 32-bit sized instructions so
> +// for simplicity the code below is specialized for 32-bit sized instructions
> +// and makes no attempt to support variable length instructions. The base
> +// assembler buffer which supports variable width instruction is by the x86 and

s/is by/is used by

@@ +149,3 @@
>      {
>      }
>      static const int garbage=0xa5a5a5a5;

Nit: static const unsigned Garbage = 0xa5a5a5a5;

@@ +198,3 @@
>      }
>  
> +    int insertEntry(unsigned num, uint8_t *data, BufferOffset off, LifoAlloc &LifoAlloc_) {

Nit: LifoAlloc &lifoAlloc

@@ +198,5 @@
>      }
>  
> +    int insertEntry(unsigned num, uint8_t *data, BufferOffset off, LifoAlloc &LifoAlloc_) {
> +        if (numEntries_ + num >= buffSize) {
> +            // Grow the poolData_ vector.

Would be nice if we could make poolData_ a Vector<PoolAllocUnit, 8, IonAllocPolicy> or something, then we don't have to reallocate the vector ourselves (and we get bounds checks and inline storage for small pools etc).

That requires passing a TempAllocator& to the constructor though; that may require more changes and I'm not entirely sure about it, so it's probably best to file a follow-up for this.

@@ +233,4 @@
>  };
>  
>  
> +template <size_t SliceSize, unsigned InstSize>

Nit: make them both size_t

@@ +237,5 @@
>  struct BufferSliceTail : public BufferSlice<SliceSize> {
> +  private:
> +    // Bit vector to record which instructions in the slice have a branch, so
> +    // that they can be patched when the final positions are known.
> +    mozilla::Array<uint8_t, (SliceSize + (8 * InstSize - 1)) / (8 * InstSize)> isBranch_;

Why do we add 8 * IntSize - 1? I expected something like this:

mozilla::Array<uint8_t, (SliceSize / InstSize) / sizeof(uint8_t)>

We can static_assert SliceSize is a multiple of InstSize if needed.

@@ +244,1 @@
>      bool isNatural : 1;

Can you add a comment here as well? isNatural is stored in the pool header, right? It'd be good to briefly explain why we care about this, here or in the big comment.

@@ +244,2 @@
>      bool isNatural : 1;
>      BufferSliceTail *getNext() {

Nit: getNext() const {

@@ +245,5 @@
>      BufferSliceTail *getNext() {
>          return (BufferSliceTail *)this->next;
>      }
> +    BufferSliceTail() : pool(nullptr), isNatural(true) {
> +        memset(&isBranch_[0], 0, sizeof(isBranch_));

mozilla::PodArrayZero(isBranch_);

(see mfbt/PodOperations.h, it has an overload for Array<..>)

@@ +250,3 @@
>      }
>      void markNextAsBranch() {
> +        size_t idx = this->nodeSize / InstSize;

Nit: MOZ_ASSERT(nodeSize % IntSize == 0); before this line.

@@ +254,3 @@
>      }
> +    bool isBranch(unsigned idx) const {
> +        return (isBranch_[idx >> 3] >> (idx & 0x7)) & 1;

Nit: MOZ_ASSERT(idx < nodeSize / InstSize);

@@ +266,4 @@
>  
> +// The InstSize is the sizeof(Inst) but is needed here because the buffer is
> +// defined before the Instruction.
> +template <int SliceSize, unsigned InstSize, class Inst, class Asm>

Nit: size_t SliceSize, size_t InstSize

@@ +270,1 @@
>  struct AssemblerBufferWithConstantPool : public AssemblerBuffer<SliceSize, Inst> {

Nit: this should be Pools (plural) right, to match the file name? Can be a follow-up patch/bug.

@@ +277,4 @@
>      class PoolEntry {
> +        size_t index_;
> +      public:
> +        PoolEntry(size_t index) : index_(index) {

Nit: "explicit PoolEntry"

@@ +305,5 @@
> +    // pool placement and frequency of pool placement.
> +    const size_t poolMaxOffset_;
> +
> +    // The bias on pc relative addressing mode offsets, in units of byts. The
> +    // ARM as a bias of 8 bytes.

Nit: s/as/has

@@ +309,5 @@
> +    // ARM as a bias of 8 bytes.
> +    const unsigned pcBias_;
> +
> +    // The current working pool. Copied out as needed before resetting.
> +    Pool pool;

Nit: the other vars have a trailing underscore, so pool_

@@ +317,3 @@
>  
> +    // The number of times we've dumped a pool.
> +    unsigned numDumps;

Nit: numDumps_

@@ +337,3 @@
>      PoolInfo *poolInfo;
> +    // The number of PoolInfo elements available to use in the above poolInfo vector.
> +    size_t poolInfoSize;

Nit: trailing underscore here too.

Also it'd be good if we could use Vector<PoolInfo, 8, IonAllocPolicy> here as well.

@@ +346,5 @@
> +    // The buffer offset when entering the no-pool region.
> +    size_t canNotPlacePoolStartOffset;
> +    // The maximum number of word sized instructions declared for the no-pool
> +    // region.
> +    size_t canNotPlacePoolMaxInst;

Nit: underscores here too.

@@ +362,3 @@
>      // Inhibit the insertion of fill NOPs in the dynamic context in which they
>      // are being inserted.
>      bool inhibitNops;

And here.

@@ +368,1 @@
>      int id;

Nit: can we make this #ifdef DEBUG ?

@@ +377,1 @@
>      }

It looks like all callers of getHead and getTail dereference the result immediately, to get a BufferSlice*. Can you change the return type to BufferSlice * to simplify the callers?

@@ +382,5 @@
>          if (!tmp) {
>              this->m_oom = true;
>              return nullptr;
>          }
>          new (tmp) BufferSlice;

This can be simplified a bit:

BufferSlice *slice = a.new_<BufferSlice>();
if (!slice) {
    ...
}
return slice;

@@ +407,2 @@
>      void initWithAllocator() {
> +        poolInfo = static_cast<PoolInfo *>(this->LifoAlloc_.alloc(sizeof(PoolInfo) * poolInfoSize));

poolInfo = this->LifoAlloc_.newArrayUninitialized<PoolInfo>(poolInfoSize);

Also s/LifoAlloc_/lifoAlloc_.

@@ +416,1 @@
>      const PoolInfo & getInfo(int x) const {

Nit: const PoolInfo &getInfo(unsigned x) const {

@@ +416,5 @@
>      const PoolInfo & getInfo(int x) const {
>          static const PoolInfo nil = {0,0,0};
>          if (x < 0 || x >= numDumps)
>              return nil;
>          return poolInfo[x];

Is x ever out-of-bounds? If not we can just:

MOZ_ASSERT(x < numDumps);
return poolInfo[x];

@@ +491,5 @@
> +        }
> +        if (numPoolEntries)
> +            return pool.insertEntry(numPoolEntries, data, this->nextOffset(), this->LifoAlloc_);
> +        else
> +            return INT_MIN;

Nit: no else after a return. Also it'd be good to add a one-line comment here to explain what it means to return INT_MIN here.

@@ +513,3 @@
>              IonSpewStart(IonSpew_Pools, "[%d] data is: 0x", id);
> +            size_t length = numPoolEntries * sizeof(PoolAllocUnit);
> +#if IS_LITTLE_ENDIAN

Let's wrap the whole "if (numPoolEntries) {" in an #ifdef DEBUG or something.

And we don't have a JIT on big endian platforms, AFAIK, so I'd just remove this #if and assume little endian (worst-case you get bogus spew, we can deal with that if it ever becomes a problem).

@@ +607,5 @@
> +        Parent::perforate();
> +        perforatedSlice->isNatural = false;
> +        IonSpew(IonSpew_Pools, "[%d] Adding a perforation at offset %d", id, perforation.getOffset());
> +
> +        // With the pools final position determined it is now possible to patch

Nit: s/pools/pool's

@@ +625,5 @@
> +             ++iter, ++idx)
> +        {
> +            // All entries should be before the pool.
> +            JS_ASSERT(iter->getOffset() < perforation.getOffset());
> +            // Everything here is known so we can safely do the necessary

Nit: space before this line.

@@ +629,5 @@
> +            // Everything here is known so we can safely do the necessary
> +            // substitutions.
> +            Inst *inst = this->getInst(*iter);
> +            size_t codeOffset = poolOffset - iter->getOffset();
> +            // That is, PatchConstantPoolLoad wants to be handed the address of

And here. We don't do this consistently, but most people do and IMO it reads better when there's a blank line before comments (except when the comment is at the start of a block).

@@ +642,5 @@
> +        // Record the pool info.
> +        if (numDumps >= poolInfoSize) {
> +            // Grow the poolInfo vector.
> +            poolInfoSize *= 2;
> +            PoolInfo *tmp = static_cast<PoolInfo *>(this->LifoAlloc_.alloc(sizeof(PoolInfo) * poolInfoSize));

PoolInfo *tmp = LifoAlloc_.newArrayUninitialized<PoolInfo>(poolInfoSize);

But again, this should be a Vector so that we don't have to take care of this ourselves, but that doesn't have to happen in this patch.

@@ +665,3 @@
>              return;
>          }
> +        memcpy(*tmp, &pool, sizeof(Pool));

Nit: mozilla::PodCopy(*tmp, &pool);

@@ +792,5 @@
> +        JS_ASSERT(pool.numEntries() == 0);
> +        // Expecting the dest_ to already be aligned to instBufferAlign_, check.
> +        JS_ASSERT(uintptr_t(dest_) == ((uintptr_t(dest_) + instBufferAlign_ - 1) & ~(instBufferAlign_ - 1)));
> +        // Assuming the Instruction size is 4 bytes, check.
> +        JS_STATIC_ASSERT(InstSize == sizeof(uint32_t));

Nit: static_assert(InstSize == sizeof(uint32_t), "Assuming instruction size is 4 bytes");
Attachment #8456082 - Flags: review?(jdemooij) → review+
Comment on attachment 8454853 [details] [diff] [review]
1. Cleanup of the assembler and backend. Backport for Aurora 32.

Skimming through the patch this does look to be low risk. Happy to take a change that makes developers' lives easier. Aurora+
Attachment #8454853 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Address reviewer feedback, thank you for great feedback. All the suggestions have been integrated except: the 'id' has been kept to avoid lots of DEBUG conditionals; the code still manages the vector allocation itself and a follow up bug will be filed to explore using the library vector code.

Carrying forward r+.

https://tbpl.mozilla.org/?tree=Try&rev=bbd7ff49265f
Attachment #8454475 - Attachment is obsolete: true
Attachment #8456082 - Attachment is obsolete: true
Attachment #8456890 - Flags: review+
Attachment #8454853 - Attachment is obsolete: true
Rebase, take two. The ARM backend builds now locally, tested with debug enabled and disabled. Carrying forward the approval from attachment 8454853 [details] [diff] [review].
Attachment #8456969 - Attachment is obsolete: true
Part 2 for m-c is ready to checkin. The try run in comment 50 looks green enough and it builds and passes the jit-tests locally. https://bugzilla.mozilla.org/attachment.cgi?id=8456890
Spoke to :dougc over IRC - I didn't get to fuzzing this with the new flags, but since this new assembler buffer might fix quite a few older ARM bugs, we'll land this first and file more bugs later.
Update and rebase.
Attachment #8454859 - Attachment is obsolete: true
Options available for testing:

* --asm-pool-max-offset=<n>  OR  environment variable ASM_POOL_MAX_OFFSET

  Range of <n>: 5 to 1024
  Default: 1024
  Description: The maximum pc relative OFFSET permitted in pool reference instructions.
  Introduced in bug: 1026919
  Available: 33

* --arm-asm-nop-fill=<n>  OR  environment variable ARM_ASM_NOP_FILL

  Range: 0 to 2^31-1
  Default: 0
  Suggested testing range: 0 and 1
  Description: Insert the given number of NOP instructions at all possible pool locations.
  Introduced in bug: 1020834
  Available: 32, 33. (Only useful in 33)

* --arm-hwcap="capability1 capability2 ..."  OR  environment variable ARMHWCAP

  Default (HW): read from /proc/cpuinfo
  Default (Simulator): "armv7 vfp vfpv3 vfpv4 neon"
  Suggested for hardfp testing (Simulator): "armv7 vfp vfpv3 vfpv4 neon hardfp"
  Suggested for ARMv6 testing: "vfp"
  Suggested for ARMv6 hardfp testing (Simulator): "vfp hardfp"
  Description: Specify ARM code generation features, or 'help' to list all features.
  Introduced in bug: 1028008
  Available: 33
Flags: needinfo?(gary)
Attachment #8454856 - Attachment is obsolete: true
Updated and rebased patches. This depends on the ESR31 patch in bug 1020834 and the Part 1 cleanup patch above for ESR31, in that order.

Combined these make a very significant improvement in the robustness of the ARM backend on ARMv6 devices, and also help for ARMv7 devices.

I particularly recommend these patches for builders of Firefox for Android ARMv6, and Firefox for the Linux on the Raspberry Pi (an ARMv6 device), because without these patches the compilation of asm.js code is very fragile and it often fails to compile and falls back to the JIT compiler.
Attachment #8454860 - Attachment is obsolete: true
https://hg.mozilla.org/mozilla-central/rev/f114c4101f02
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Attachment #8456991 - Flags: checkin+
Comment on attachment 8457678 [details] [diff] [review]
2. Simplify the assembler buffer with constant pools. Backport for Aurora 32.

Approval Request Comment
[Feature/regressing bug #]: Long standing issues.

[User impact if declined]: The ARM assembler has correctness issues in pool allocation. It generally detects these and bails out of compilation so they are not security issue. There have also been some unresolved bugs found via fuzzing and some of these have been resolved. The bail out paths are not well exercised and this has been tickling other bugs in these paths (not resolved by this patch). When compilation bails out it can often fall back to an alternative compilation or interpretation approach which typically generates slower code. In the case of asm.js module compilation, the module is compiled in one big assembler block and bailing out on any pool allocation failure causes the entire module to fail to compile. Asm.js code is typically performance critical and the ahead-of-time compilation can save memory at runtime, so bailing out is a show stopper. The ARMv6 code generation is particularly dependant on pool allocated constant (the ARMv7 can inline the loading of integer constants), and the assembler's failure on ARMv6 devices is a big real show stopper.

[Describe test coverage new/current, TBPL]: I have done a lot of local testing using the ARM simulator and hardware including an Android Nexus 4, the Flame, and the RaspberryPi (an ARMv6 device). I test large asm.js demos, and run the jit-tests. This patch and supporting patches have added options that extend the test coverage and this has detected a number of new issues and all have been resolved. An earlier version of the patch has been fuzz tested in bug 1027079 and was report to be 'flawless' (I did find more flaws in subsequent testing, all fixed). The patch has landed on m-c. Here's a fresh Aurora try run: https://tbpl.mozilla.org/?tree=Try&rev=9af92a1bd001

[Risks and why]: There is some risk. This makes significant changes to low level code. It might uncover issues not considered. However this bug has taken the least risky approach possible to improve the ARM assemblers robustness: it has focused only on the simplification of the code and at some potential expense to performance; it allocates only to a forward pool rather than forward and backwards pools; it allocates to a single pool rather than multiple pools; it ignores differing reach of different classes of load instructions and uses only the minimum common reach; it ignores alignment of the constants etc. Many code paths have just been abandoned focusing the test coverage on those remaining. It does not add any fancy new optimizations or features. The code style has been cleaned up and the code documented, and the remaining code scrutinised for correctness and corrected, and it has been reviewed by the module owner. The code is much simpler and much more maintainable in the event that there are issues.

[String/UUID change made/needed]: n/a
Attachment #8457678 - Flags: approval-mozilla-aurora?
Comment on attachment 8457678 [details] [diff] [review]
2. Simplify the assembler buffer with constant pools. Backport for Aurora 32.

It is still aurora (just by a few days). We will have all the beta cycle to test it and backout if needed. I guess this kind of refactoring improves the life of developers during the beta cycle.

Ryan, before uplifting it, please wait for tbpl to be green.
Attachment #8457678 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Adding QA keywords to make sure QA sees it and knows that changed happened on the low level side of the ARM backend.
Keywords: qawanted, verifyme
About the ESR patches, except if you have some strong and great arguments, I don't see us approving such big patch on this branch.
(In reply to Sylvestre Ledru [:sylvestre] from comment #69)
> About the ESR patches, except if you have some strong and great arguments, I
> don't see us approving such big patch on this branch.

Ok. I would like to leave the ESR31 patches for third party builders of Firefox for ARM Linux such as for the Raspberry Pi. Debian Linux seems to follow the ESR releases. There was not much difference in the patch between 31 and 32, and the file with most of the substantive changes is common between 31, 32, and 33.

Users of Firefox for Android have only a short wait for 32, and I am not aware of any builders of Firefox ESR for Android.

Thank you for approving the Aurora 32 uplift.
>  I am not aware of any builders of Firefox ESR for Android.
Douglas, well, you might be interested by bug 1040319. Let me know if it changes anything on your side.

n-i on Mike to let him know about this in case he wants to cherry-pick for Debian.
Flags: needinfo?(mh+mozilla)
Try run looks green enough on the platforms we care about for this patch.

https://hg.mozilla.org/releases/mozilla-aurora/rev/7feb49f7e6f9
(In reply to Douglas Crosher [:dougc] from comment #60)
> Options available for testing:
> 
> * --asm-pool-max-offset=<n>  OR  environment variable ASM_POOL_MAX_OFFSET
> 
> * --arm-asm-nop-fill=<n>  OR  environment variable ARM_ASM_NOP_FILL

Thanks Douglas! I've implemented these 2 flags in fuzzing rev 72054f0f2210.

> * --arm-hwcap="capability1 capability2 ..."  OR  environment variable
> ARMHWCAP

These sound like the defaults are sufficient (as they can be read from my ARM board's hardware).
Flags: needinfo?(gary)
(In reply to Gary Kwong [:gkw] [:nth10sd] PTO Jul 19 to 27 from comment #73)
> (In reply to Douglas Crosher [:dougc] from comment #60)
...
> > * --arm-hwcap="capability1 capability2 ..."  OR  environment variable
> > ARMHWCAP
> 
> These sound like the defaults are sufficient (as they can be read from my
> ARM board's hardware).

If you want test coverage for compilation paths other than those of the ARM board then these flags are still needed. I you have hardware to cover the compilation paths you wish to test then it would not be necessary.

Note, when testing on hardware, only features supported by the hardware should be used - the code does not check the supplied features against /proc/cpuinfo so if a feature is requested that is not support on the hardware then invalid code will be generated for the hardware. The suggested ARMv6 flags, "vfp", should be a subset of all supported hardware features, but you should check: vfpv3, vfpv4, neon, and idiva.
^^^ comment 74.
Flags: needinfo?(gary)
This is my ARM board /proc/cpuinfo result:

Processor       : ARMv7 Processor rev 3 (v7l)
processor       : 0
BogoMIPS        : 1785.85

processor       : 1
BogoMIPS        : 1785.85

processor       : 2
BogoMIPS        : 2182.71

processor       : 3
BogoMIPS        : 2182.71

Features        : swp half thumb fastmult vfp edsp neon vfpv3 tls vfpv4 idiva idivt
CPU implementer : 0x41
CPU architecture: 7
CPU variant     : 0x2
CPU part        : 0xc0f
CPU revision    : 3

Hardware        : ODROIDXU
Revision        : 0000
Serial          : 0000000000000000
Board Type      : XU+E

I'm not sure if it can compile ARMv6 binaries on the board, but I've gotten hardfp binaries for the board by changing the libraries the compiler calls. Do you mean I can also pass in the hardfp flag for these binaries as well?
Flags: needinfo?(gary) → needinfo?(dtc-moz)
(In reply to Sylvestre Ledru [:sylvestre] from comment #71)
> >  I am not aware of any builders of Firefox ESR for Android.
> Douglas, well, you might be interested by bug 1040319. Let me know if it
> changes anything on your side.

Ok, thanks, so Mozilla will distribution ESR31 Firefox for Android for ARMv6 devices for 6 months and then discontinue it, and will not distribute any later versions for ARMv6. I can only offer it to builders and make the case, it's their decision. Perhaps the community will step up to continue ARMv6 Firefox for Android releases if there is any interest.
(In reply to Gary Kwong [:gkw] [:nth10sd] PTO Jul 19 to 27 from comment #76)
> This is my ARM board /proc/cpuinfo result: 
> Processor       : ARMv7 Processor rev 3 (v7l)
...
> Features        : swp half thumb fastmult vfp edsp neon vfpv3 tls vfpv4
> idiva idivt

> I'm not sure if it can compile ARMv6 binaries on the board, but I've gotten
> hardfp binaries for the board by changing the libraries the compiler calls.
> Do you mean I can also pass in the hardfp flag for these binaries as well?

Your board supports all the features that are currently used by the JIT compiler. The --arm-hwcap argument allows the features that the JIT compiler uses to be overridden, so using "vfp" tells the JIT compiler not to use the ARMv7 features and the JIT compiler ARMv6 code generation paths will be tested. It is not necessary to actually compile the binary for the ARMv6 target. The 'hardfp' feature can only be chosen dynamically when using the simulator as this is an ABI issue.
Flags: needinfo?(dtc-moz) → needinfo?(gary)
I hit crash on Nexus-5 (b2g project). If I backout 194555:f114c4101f02 locally, I don't see any crash while scrolling homescreen.
Blocks: 1012680
So, am I right to say that wrt. --arm-hwcap:

For my ARM board, possible options are:
1 - <no options> (default)
2 - vfp

For the ARM simulator on other platforms:
1 - <no options> (default)
2 - hardfp

Are there any other possibilities to consider?
Flags: needinfo?(gary) → needinfo?(dtc-moz)
issue is resolved - removing old / stale qa-wanted keywords
Keywords: qawanted
Unable to verify due to this being a backend issue without STR.
QA Whiteboard: [QAnalyst-Triage?][QAnalyst-verify-]
Flags: needinfo?(ktucker)
QA Whiteboard: [QAnalyst-Triage?][QAnalyst-verify-] → [QAnalyst-Triage+][QAnalyst-verify-]
Flags: needinfo?(ktucker)
Flags: needinfo?(mh+mozilla)
See Also: → 1207827
Qanalyst is unable to verify this issue.
Flags: needinfo?(jmercado)
Whiteboard: QAExclude
Flags: needinfo?(jmercado)
I'm not sure we care too much about ARM32 boards anymore, clearing needinfo?.
Flags: needinfo?(dtc-moz)
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: