Closed Bug 1635359 Opened 5 years ago Closed 3 years ago

Ensure iterating a mozilla::Span doesn't have an overhead over iterating raw pointers

Categories

(Core :: MFBT, task)

task

Tracking

()

RESOLVED FIXED

People

(Reporter: sg, Unassigned)

Details

Attachments

(5 files)

It would be great to be able to change all functions accepting a pair a raw pointer and an array length to use mozilla::Span. However, this requires that this doesn't introduce a (significant) overhead. What is somewhat concerning at the moment are the release-mode assertions in the Span iterators, e.g. here:

In particular, such uses should generate code that is essentially equivalent to using raw pointers:

// range-based for
for (auto x : span) { }

// copy
nsTArray<uint8_t> target;
target.SetLength(span.Length());
std::copy(span.cbegin(), span.cend(), target.begin());

While other release-mode assertions make sense to make (e.g. at https://searchfox.org/mozilla-central/rev/54f965e51e4f77866bec42737978d40d4c1fdfc5/mfbt/Span.h#576), the ones in the iterators seem overcautious to me. We don't have any assertions at all in https://searchfox.org/mozilla-central/rev/54f965e51e4f77866bec42737978d40d4c1fdfc5/xpcom/ds/ArrayIterator.h#49, e.g.

I think either of these should be done:

  • assert that the overhead of such uses as above is neglible. However, I fear this is not the case. The copy case e.g. should be mostly equivalent to a memcpy, but I can't imagine that the assertions can be optimized away entirely to have the optimizer figure that out. In that case the Span version will be magnitudes slower than the raw pointer version.
  • degrade the assertions in iterators to at most diagnostic assertions

mozilla::Span should increase safety against out-of-bounds accesses, but if these assertions prevent the use of mozilla::Span in certain situations, and the using code has to revert to raw pointers, safety is effectively reduced.

(In reply to Simon Giesecke [:sg] [he/him] from comment #0)

While other release-mode assertions make sense to make (e.g. at https://searchfox.org/mozilla-central/rev/54f965e51e4f77866bec42737978d40d4c1fdfc5/mfbt/Span.h#576), the ones in the iterators seem overcautious to me. We don't have any assertions at all in https://searchfox.org/mozilla-central/rev/54f965e51e4f77866bec42737978d40d4c1fdfc5/xpcom/ds/ArrayIterator.h#49, e.g.

https://searchfox.org/mozilla-central/rev/54f965e51e4f77866bec42737978d40d4c1fdfc5/xpcom/ds/ArrayIterator.h#101-108 have release-mode checks based on the ones in nsTArray. That is, all of our iterations and accesses to nsTArray have release-mode bounds checks. I'm not particularly worried about the ones in Span and if for some reason we find hotspots, we can address those. (The copy example should probably be using AppendElements, which should be free to manipulate raw pointers underneath; even if Span weren't bounds-checked, the copy example still has the bounds checking in nsTArray to contend with.)

Ok, I missed the bounds checks in the dereferencing operators. Despite the comment explicitly mentioning them.

The span_iterator has additional checks on the incrementing/decrementing operators, though. And I guess at least one of those is redundant. If we check the position is "valid" when changing the iterator state, then we know it's valid when dereferencing.

What's the rationale behind those checks? For mozilla::Span, they make less sense than for nsTArray. While the nsTArray knows its current size, and we get some extra safety against iterator invalidation from that, the Span cannot know if the size of the underlying data structure changed in between its creation and the time of the access. It only provides a guard against rather "obvious" mistakes such as doing std::copy(span.begin(), span.end() + 5, target.begin()) and I am not sure if there is value to do such checks in release mode.

On std::copy vs. AppendElements: If we know it's a span, then we can use AppendElements. But in a template where the iterators are not known to be contiguous, we can't use that. But that's admittedly hypothetical.

(In reply to Simon Giesecke [:sg] [he/him] from comment #2)

Ok, I missed the bounds checks in the dereferencing operators. Despite the comment explicitly mentioning them.

The span_iterator has additional checks on the incrementing/decrementing operators, though. And I guess at least one of those is redundant. If we check the position is "valid" when changing the iterator state, then we know it's valid when dereferencing.

What's the rationale behind those checks? For mozilla::Span, they make less sense than for nsTArray. While the nsTArray knows its current size, and we get some extra safety against iterator invalidation from that, the Span cannot know if the size of the underlying data structure changed in between its creation and the time of the access. It only provides a guard against rather "obvious" mistakes such as doing std::copy(span.begin(), span.end() + 5, target.begin()) and I am not sure if there is value to do such checks in release mode.

Henri, do you remember why we have both the asserts for dereferencing and the the asserts for iterator operations? It does seem to me like Simon is correct in saying that we could dispense with the latter (and indeed nsTArray's iterators don't feature checks for iterator ++/--/etc.); did the checks come in via GSL debug-mode assertions, and we just upgraded them to release-mode?

Flags: needinfo?(hsivonen)

Henri, do you remember why we have both the asserts for dereferencing and the the asserts for iterator operations?

Just because Microsoft had them in the code that I imported. I'm OK with removing the ones that are bad for performance.

did the checks come in via GSL debug-mode assertions, and we just upgraded them to release-mode?

IIRC, they weren't debug-mode per se originally.

Flags: needinfo?(hsivonen)

I had a look at the (current) GSL version and you remember correctly: the iterator assertions (using Expects) are there, and IIUC, they are enabled in release-mode.

Before proposing any patch, I will check what the generated code for cases like I mentioned above looks like. Maybe I am too pessimistic about compiler optimizations :)

There's one interesting thing though, which might make sense for MOZ_ASSERT/MOZ_RELEASE_ASSERT et al. in general: They use __builtin_expect for the conditions, which might benefit optimization decisions (even if the assertion itself is not active in the build). Maybe the effect is limited since PGO will figure that out anyway, but I guess it wouldn't hurt either.

(In reply to Simon Giesecke [:sg] [he/him] from comment #5)

Before proposing any patch, I will check what the generated code for cases like I mentioned above looks like. Maybe I am too pessimistic about compiler optimizations :)

For https://phabricator.services.mozilla.com/D51122 (bug 1592588), I've created https://godbolt.org/z/xruokz to argue against using mozilla::Span iterators in performance-sensitive code.

In bug 1583953, mozilla::Span iterators were also replaced to avoid the assertion overhead.

(In reply to André Bargull [:anba] from comment #7)

(In reply to Simon Giesecke [:sg] [he/him] from comment #5)

Before proposing any patch, I will check what the generated code for cases like I mentioned above looks like. Maybe I am too pessimistic about compiler optimizations :)

For https://phabricator.services.mozilla.com/D51122 (bug 1592588), I've created https://godbolt.org/z/xruokz to argue against using mozilla::Span iterators in performance-sensitive code.

In bug 1583953, mozilla::Span iterators were also replaced to avoid the assertion overhead.

Amazingly, the only release assert one need to comment out is the release assert for the iterator creation (!). Once you do that, the compiler is smart enough to optimize away the bounds checks: https://godbolt.org/z/5e-8AT

It's not entirely clear to me why that is the case, but you can play around with the example yourself.

It turns out that you need both to remove the bounds check on the initial construction (which we can "know" is true because we're constructing the iterators from inside Span) and the bounds check on the copy construction (which seems pretty reasonable: if we have an iterator in some state, copying the iterator shouldn't have to assert): https://godbolt.org/z/-WeRmP

Note that we are already a bit lax on this: operator= is defaulted, even though it should theoretically be doing the same checking as the copy constructor is. So bypassing the asserts in the two cases above ought to be fine.

Actually, I take that back: you just need to remove the checks on the constructor of const instances from non-const instances: https://godbolt.org/z/sUZf7R . That seems uncontroversial, so I'll write a patch to do that.

Turns out Simon's std::copy example still contains bounds checks, for reasons that are not clear to me; I would have expected the compiler to be able to optimize those away...

Severity: S1 → N/A
Assignee: nobody → nfroyd
Status: NEW → ASSIGNED

These are redundant with the checks that we already have for dereferencing.

Depends on D77120

We're going to introduce some friend declarations in the next patch, and
this change makes doing so easier. It's also slightly less confusing.

Depends on D77121

The compiler is not always able to remove these (always-successful)
checks, so we might as well help it along.

Depends on D77122

There's no need to provide the compiler a bunch of stuff that it's
trivially going to optimize away anyway, and we'd rather the release
assert not mess with the optimizer's view of the code.

Depends on D77123

Marking this as leave-open because I don't think we're done with this. In particular, I think it'd be nice to get rid of the span_iterator assertions that span_ exists, maybe at least downgrade the assertions on iterator <, and maybe a few other things.

Keywords: leave-open
Pushed by nfroyd@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/48fad69ddbff eliminate release assert in const `span_iterator` construction; r=sg https://hg.mozilla.org/integration/autoland/rev/e4c71f6c5012 remove release asserts for `span_iterator` {de,in}crement; r=sg https://hg.mozilla.org/integration/autoland/rev/48f3f6c588f7 rename Span type parameter for `span_iterator`; r=sg https://hg.mozilla.org/integration/autoland/rev/9aebd99826a3 provide a private constructor for `span_iterator` to bypass bounds checks; r=sg https://hg.mozilla.org/integration/autoland/rev/fe7259f87f29 remove bounds checks on trivial `span_iterator` construction; r=sg

The leave-open keyword is there and there is no activity for 6 months.
:sg, maybe it's time to close this bug?

Flags: needinfo?(sgiesecke)

It still makes sense to pursue the suggestions from Comment 17.

Flags: needinfo?(sgiesecke)

The bug assignee didn't login in Bugzilla in the last 7 months.
:sg, could you have a look please?
For more information, please visit auto_nag documentation.

Assignee: froydnj+bz → nobody
Status: ASSIGNED → NEW
Flags: needinfo?(simon.giesecke)

Let's spin the follow-up to a new bug number so that landed patches and bug numbers have an obvious correspondence: bug 1762904.

Status: NEW → RESOLVED
Closed: 3 years ago
Flags: needinfo?(simon.giesecke)
Keywords: leave-open
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: