Change nsTArray to use a custom iterator based on index instead of pointers to improve iterator invalidation safety of ranged for loops

RESOLVED FIXED in Firefox 53

Status

()

defect
RESOLVED FIXED
3 years ago
2 years ago

People

(Reporter: smaug, Assigned: Nika)

Tracking

(Blocks 1 bug, {sec-want})

50 Branch
mozilla53
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(firefox53 fixed)

Details

(Whiteboard: [adv-main53-])

Attachments

(1 attachment, 2 obsolete attachments)

Ranged loop is very much security hazard when done with arrays as member variables which can be often modified during looping.
It has happened several times during reviews that code author wasn't thinking about the issue and it would have lead to security critical bugs unless caught by reviewer.
And I think there has been couple of security bugs because of this.
So, we should force one to either use some other way to loop or take the array out from heap to a stack variable and loop over that.
Component: XPCOM → Rewriting and Analysis
Depends on: 1299509
ehsan. I'm worried about the impact this will have on breaking inbound because people wrote code which works locally but fails when pushed to infra. Do you think that the security benefits we could get from this are worth the backouts which will probably occur if we write an analysis like this?
Flags: needinfo?(ehsan)
FWIW, this would safe reviewers' time (no need to explain yet again how ranged loop is security hazard) and would probably catch security critical issues.
(In reply to Olli Pettay [:smaug] from comment #2)
> FWIW, this would safe reviewers' time (no need to explain yet again how
> ranged loop is security hazard) and would probably catch security critical
> issues.

Another option is to simply prohibit begin()/end() methods on datastructures (and related member functions, and overloads of std::{begin,end}()...) from returning pointer-based things.  Might get people to think about iterator invalidation a little harder, though it's also difficult to figure out what datastructures you'd want to whitelist.  Oh, and I guess the analysis proposed in comment 0 would also prevent people from doing dumb things with std::vector and similar...
How about instead of using pointers for the begin()/end() iterator values, instead using something potentially less performant, but probably similarially safe to the index based approach (and hopefully optimized away because iterators are usually local).

// XXX: Strawman impl - will be more complex
template<typename T>
struct nsTArrayIterator {
    nsTArray<T>& mArray;
    uint32_t mIndex;

    uint32_t Index() const {
        return min(mIndex, mArray.Length());
    }

    bool operator==(const nsTArrayIterator<T>& aOther) const {
        return Index() == aOther.Index();
    } 

    // Implement prefix and postfix operator++
    nsTArrayIterator<T>& operator++() {
        ++mIndex;
        return *this;
    }
    nsTArrayIterator<T> operator++(int) {
        nsTArrayIterator<T> it = *this;
        ++it;
        return it;
    }
};

Which doesn't have the same problems as the pointer based iterator approach.

N.B. /me doesn't know how much of a perf impact this would have...
perf impact shouldn't be significant. And yes, I like that approach, and could live with it.
Modifying the array might still lead to unexpected results, like missing some entries if one removes the current entry or so, but at least wouldn't lead to security bugs.
This is an implementation of the above idea. I'm doing a try run to see if it regresses talos.
Attachment #8786977 - Flags: review?(nfroyd)
Assignee: nobody → michael
Status: NEW → ASSIGNED
Component: Rewriting and Analysis → XPCOM
No longer depends on: 1299509
Flags: needinfo?(ehsan)
Summary: Add static analysis to prevent ranged looping of nsTArray/mozilla::Array when the array is not stack allocated → Change nsTArray to use a custom iterator based on index instead of pointers to improve iterator invalidation safety of ranged for loops
Attachment #8786977 - Attachment is obsolete: true
Attachment #8786977 - Flags: review?(nfroyd)
It seems like every platform other than linux dislikes this patch - I imagine that fixing that probably won't be _too_ hard.
Comment on attachment 8786980 [details] [diff] [review]
Change nsTArray to use a custom iterator based on index instead of pointers to improve iterator invalidation safety of ranged for loops

Review of attachment 8786980 [details] [diff] [review]:
-----------------------------------------------------------------

What does assembly for a range-based for loop using this patch look like compared to a regular for loop that iterates against Length()?  Bonus points if you do it for MSVC, too.

::: xpcom/glue/nsTArray.h
@@ +470,5 @@
> +    return Index() - aOther.Index();
> +  }
> +
> +  value_type& operator[](difference_type aIndex) const {
> +    return *(*this + aIndex);

I think this would be more clearly written as:

*this->operator+(aIndex);

I am, in general, skeptical that operator{+,-}{,=} and operator[] are needed; are people using it with STL things and so we need to opt in to all the STL iterator stuff?
(In reply to Nathan Froyd [:froydnj] from comment #9)
> What does assembly for a range-based for loop using this patch look like
> compared to a regular for loop that iterates against Length()?  Bonus points
> if you do it for MSVC, too.

Assembly takes up a lot of vertical space, so here's my testing of the Linux x64 assembly: https://pastebin.mozilla.org/8906989
I can test MSVC next time I reboot my computer into Windows.

> ::: xpcom/glue/nsTArray.h
> @@ +470,5 @@
> > +    return Index() - aOther.Index();
> > +  }
> > +
> > +  value_type& operator[](difference_type aIndex) const {
> > +    return *(*this + aIndex);
> 
> I think this would be more clearly written as:
> 
> *this->operator+(aIndex);

Fixed

> I am, in general, skeptical that operator{+,-}{,=} and operator[] are
> needed; are people using it with STL things and so we need to opt in to all
> the STL iterator stuff?

People are using this iterator with STL stuff, such as `std::stable_sort`. I ran into a bunch of them, and ended up reading the STL iterator specification and implementing a full STL `random_access_iterator`, to make sure that this iterator works correctly everywhere it is used.
Comment on attachment 8786980 [details] [diff] [review]
Change nsTArray to use a custom iterator based on index instead of pointers to improve iterator invalidation safety of ranged for loops

Review of attachment 8786980 [details] [diff] [review]:
-----------------------------------------------------------------

I'm not entirely sure about dropping the bounds-checking on Index().  It's obviously faster, but it may make comparisons do very screwy things...ideally people wouldn't do screwy things with comparisons, but you never know...

I'd like to see the assembly from MSVC for the various cases.

::: xpcom/glue/nsTArray.h
@@ +370,5 @@
> +  typedef value_type&                     reference;
> +  typedef std::random_access_iterator_tag iterator_category;
> +
> +  const array_type* mArray;
> +  index_type mIndex;

Do you really intend for these to be public?

@@ +381,5 @@
> +    }
> +  }
> +
> +public:
> +  nsTArrayIterator() : mArray(nullptr), mIndex(0) {}

I believe that we don't need this constructor (assuming I am reading the C++ standard correctly), and can therefore safely dispense the check that mArray is nullptr, above.  I *really* hope I am reading the standard correctly.
Attachment #8786980 - Flags: review?(nfroyd)
(In reply to Nathan Froyd [:froydnj] from comment #11)
> Comment on attachment 8786980 [details] [diff] [review]
> Change nsTArray to use a custom iterator based on index instead of pointers
> to improve iterator invalidation safety of ranged for loops
> 
> Review of attachment 8786980 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> I'm not entirely sure about dropping the bounds-checking on Index().  It's
> obviously faster, but it may make comparisons do very screwy
> things...ideally people wouldn't do screwy things with comparisons, but you
> never know...

What it means is that the loop will act like this:

uint32_t len = arr.Length();
for (int i = 0; i < len; ++i)

Instead of this:

for (int i = 0; i < arr.Length(); ++i)

(namely if you mutate the underlying array, we won't prevent you from running off the end, and then MOZ_CRASHing)

> I'd like to see the assembly from MSVC for the various cases.

Will do. I'll try to have something up today?

> ::: xpcom/glue/nsTArray.h
> @@ +370,5 @@
> > +  typedef value_type&                     reference;
> > +  typedef std::random_access_iterator_tag iterator_category;
> > +
> > +  const array_type* mArray;
> > +  index_type mIndex;
> 
> Do you really intend for these to be public?

Nope, I'll fix that

> @@ +381,5 @@
> > +    }
> > +  }
> > +
> > +public:
> > +  nsTArrayIterator() : mArray(nullptr), mIndex(0) {}
> 
> I believe that we don't need this constructor (assuming I am reading the C++
> standard correctly), and can therefore safely dispense the check that mArray
> is nullptr, above.  I *really* hope I am reading the standard correctly.

The C++ standard requires that all iterators be default constructable, and that all default constructed iterators compare equal.

Section 24.2.5 [forward.iterators] says:
A class or pointer type X satisfies the requirements of a forward iterator if
...
X satisfies the DefaultConstructible requirements (17.6.3.1),

Also see this for a less official source: http://www.cplusplus.com/reference/iterator/
(In reply to Michael Layzell [:mystor] from comment #12)
> (In reply to Nathan Froyd [:froydnj] from comment #11)
> > I'm not entirely sure about dropping the bounds-checking on Index().  It's
> > obviously faster, but it may make comparisons do very screwy
> > things...ideally people wouldn't do screwy things with comparisons, but you
> > never know...
> 
> What it means is that the loop will act like this:
> 
> uint32_t len = arr.Length();
> for (int i = 0; i < len; ++i)
> 
> Instead of this:
> 
> for (int i = 0; i < arr.Length(); ++i)
> 
> (namely if you mutate the underlying array, we won't prevent you from
> running off the end, and then MOZ_CRASHing)

OK, well that would defeat the whole purpose, then, wouldn't it.  But our accesses are still bounds-checked, so we would still crash, no?

> > @@ +381,5 @@
> > > +    }
> > > +  }
> > > +
> > > +public:
> > > +  nsTArrayIterator() : mArray(nullptr), mIndex(0) {}
> > 
> > I believe that we don't need this constructor (assuming I am reading the C++
> > standard correctly), and can therefore safely dispense the check that mArray
> > is nullptr, above.  I *really* hope I am reading the standard correctly.
> 
> The C++ standard requires that all iterators be default constructable, and
> that all default constructed iterators compare equal.

This comment box is not big enough to contain the WAT I want to write here.
(In reply to Nathan Froyd [:froydnj] from comment #13)
> OK, well that would defeat the whole purpose, then, wouldn't it.  But our
> accesses are still bounds-checked, so we would still crash, no?

Yes, we would still crash in that situation, because our accesses are bounds-checked. It just means that we won't avoid triggering the crash when you mutate the underlying array. (which might be OK, because the program is already misbehaving in that situation).

> This comment box is not big enough to contain the WAT I want to write here.

The C++ standard is pretty awesome.
Attachment #8786980 - Attachment is obsolete: true
I didn't manage to test this on a windows machine before heading home for the weekend. I'll test it next week when I'm back in the office.
Comment on attachment 8787792 [details] [diff] [review]
Change nsTArray to use a custom iterator based on index instead of pointers to improve iterator invalidation safety of ranged for loops

Review of attachment 8787792 [details] [diff] [review]:
-----------------------------------------------------------------

Clearing this until we get assembly from MSVC.  I wish we had some way to avoid the bounds checking etc. for ranged-for loop over local arrays.
Attachment #8787792 - Flags: review?(nfroyd)
?TakesTArray2@@YAXABV?$nsTArray@H@@@Z (void __cdecl TakesTArray2(class nsTArray<int> const &)):
  00000000: 56                 push        esi
  00000001: 57                 push        edi
  00000002: 8B 7C 24 0C        mov         edi,dword ptr [esp+0Ch]
  00000006: 33 F6              xor         esi,esi
  00000008: 8B 07              mov         eax,dword ptr [edi]
  0000000A: 39 30              cmp         dword ptr [eax],esi
  0000000C: 76 17              jbe         00000025
  0000000E: 56                 push        esi
  0000000F: 8B CF              mov         ecx,edi
  00000011: E8 00 00 00 00     call        ?ElementAt@?$nsTArray_Impl@HUnsTArrayInfallibleAllocator@@@@QBEABHI@Z
  00000016: FF 30              push        dword ptr [eax]
  00000018: E8 00 00 00 00     call        ?BlackBox@@YAXH@Z
  0000001D: 8B 07              mov         eax,dword ptr [edi]
  0000001F: 46                 inc         esi
  00000020: 59                 pop         ecx
  00000021: 3B 30              cmp         esi,dword ptr [eax]
  00000023: 72 E9              jb          0000000E
  00000025: 5F                 pop         edi
  00000026: 5E                 pop         esi
  00000027: C3                 ret

?TakesTArray@@YAXABV?$nsTArray@H@@@Z (void __cdecl TakesTArray(class nsTArray<int> const &)):
  00000000: 53                 push        ebx
  00000001: 56                 push        esi
  00000002: 57                 push        edi
  00000003: 8B 7C 24 10        mov         edi,dword ptr [esp+10h]
  00000007: 33 F6              xor         esi,esi
  00000009: 8B 07              mov         eax,dword ptr [edi]
  0000000B: 8B 18              mov         ebx,dword ptr [eax]
  0000000D: 85 DB              test        ebx,ebx
  0000000F: 74 15              je          00000026
  00000011: 56                 push        esi
  00000012: 8B CF              mov         ecx,edi
  00000014: E8 00 00 00 00     call        ?ElementAt@?$nsTArray_Impl@HUnsTArrayInfallibleAllocator@@@@QBEABHI@Z
  00000019: FF 30              push        dword ptr [eax]
  0000001B: E8 00 00 00 00     call        ?BlackBox@@YAXH@Z
  00000020: 46                 inc         esi
  00000021: 59                 pop         ecx
  00000022: 3B F3              cmp         esi,ebx
  00000024: 75 EB              jne         00000011
  00000026: 5F                 pop         edi
  00000027: 5E                 pop         esi
  00000028: 5B                 pop         ebx
  00000029: C3                 ret

Above is the MSVC assembly. Just like with gcc it is pretty much identical when we don't do the weird iterator clamping behavior.
Attachment #8787792 - Flags: review?(nfroyd)
Comment on attachment 8787792 [details] [diff] [review]
Change nsTArray to use a custom iterator based on index instead of pointers to improve iterator invalidation safety of ranged for loops

Review of attachment 8787792 [details] [diff] [review]:
-----------------------------------------------------------------

Sorry for letting this sit so long.  I wonder if it would be worth having:

  for (auto& t : LocalIteration(array)) { ... }

which would use direct pointers.  You'd need a static analysis to verify that, of course, and you'd need to check that you're not modifying the array in the process...which is probably kind of difficult.

::: xpcom/glue/nsTArray.h
@@ +356,5 @@
>    }
>  };
>  
> +template<class Element>
> +class nsTArrayIterator

Maybe worth a comment:

// We provide a custom iterator type for nsTArray, rather than using
// pointers directly, so that ranged for loops are safe in the presence
// of element removal and/or unexpected array modification via script.

Or somesuch?

@@ +404,5 @@
> +  bool operator>=(const iterator_type& aRhs) const {
> +    return mIndex >= aRhs.mIndex;
> +  }
> +
> +  value_type* operator->() const {

Might be worth a comment saying that we're depending on the bounds-checking that nsTArray does for this operator to be safe.  And possibly vice-versa: a comment on the nsTArray::{operator[],ElementAt} methods saying that the iterator depends on the bounds checking for safety?

I'm more in favor of the former than the latter.
Attachment #8787792 - Flags: review?(nfroyd) → review+
(In reply to Nathan Froyd [:froydnj] from comment #19)
> Sorry for letting this sit so long.  I wonder if it would be worth having:
> 
>   for (auto& t : LocalIteration(array)) { ... }
> 
> which would use direct pointers.  You'd need a static analysis to verify
> that, of course, and you'd need to check that you're not modifying the array
> in the process...which is probably kind of difficult.

I'm not convinced that we would get that significant of a performance advantage because of that, and I don't really feel like I have the time to write the static analysis right now. It would be really easy to implement though.

// strawman skeleton impl:
template<typename T>
class LocalIterate {
public:
  LocalIterate(nsTArray<T>& aArray) : mArray(aArray) {}
  T* begin() { return aArray.Elements(); }
  T* end() { return aArray.Elements() + aArray.Length(); }
  const T* begin() const { return aArray.Elements(); }
  const T* end() const { return aArray.Elements() + aArray.Length(); }
private:
  nsTArray<T>& mArray;
};

This feels to me like its outside of the scope of this bug however, maybe we should do it in a followup?
(In reply to Michael Layzell [:mystor] from comment #20)
> This feels to me like its outside of the scope of this bug however, maybe we
> should do it in a followup?

Sure.  I'm not expecting you or anybody else to work on it now or in the future, to be clear.
Blocks: 1306692
Pushed by michael@thelayzells.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/3851e5f51530
Change nsTArray to use a custom iterator based on index instead of pointers to improve iterator invalidation safety of ranged for loops, r=froydnj
I expect this to result in a funny story.

Backed out in https://hg.mozilla.org/integration/mozilla-inbound/rev/f879fa2dfc097f0427fe12afc277552620d9e061 for making e10s Linux32 debug test_peerConnection_capturedVideo.html permaorange with a 330 seconds without output hang with a funky looking stack, https://treeherder.mozilla.org/logviewer.html#?job_id=37106431&repo=mozilla-inbound, and making Linux32 debug non-e10s test_peerConnection_capturedVideo.html just softly and silently vanish with nothing but an exit code -11, https://treeherder.mozilla.org/logviewer.html#?job_id=37131825&repo=mozilla-inbound
Pushed by michael@thelayzells.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/8cc548682ae3
Change nsTArray to use a custom iterator based on index instead of pointers to improve iterator invalidation safety of ranged for loops, r=froydnj
Keywords: sec-want
https://hg.mozilla.org/mozilla-central/rev/8cc548682ae3
Status: ASSIGNED → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla53
Whiteboard: [adv-main53-]
You need to log in before you can comment on or make changes to this bug.