Document that writing past mLength code units via BeginWriting() is not OK

RESOLVED FIXED in Firefox 63

Status

()

enhancement
RESOLVED FIXED
11 months ago
11 months ago

People

(Reporter: hsivonen, Assigned: hsivonen)

Tracking

unspecified
mozilla63
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(firefox63 fixed)

Details

Attachments

(1 attachment)

As seen in bug 1482828, we have code that does this:

 1) Call SetCapacity() on a string
 2) Use BeginWriting() to write past mLength of the string but staying within the capacity set in the first step.
 3) Call SetLength() with the number of code units actually written.

This is an obvious encapsulation violation: In step 2, data is written to the part of the string's buffer that as far as the string implementation is aware is not in use.

If SetLength() reallocates, it has to memcpy the *new* length amount of code units into the new buffer instead of copying the *old* length amount of code units. This results in needless copying when the above pattern is not in use. The pattern, of course, assumes that SetLength() with an argument that's not greater than the argument of the previous SetCapacity() will never allocate. That in turn poses the problem that SetLength() can't shrink the buffer to save memory. Being able to shrink the buffer seems beneficial in the light of the Fission MemShrink effort.

Worse, the pattern isn't even particularly useful compared to replacing the initial SetCapacity() call with a SetLength() call, which would not be an encapsulation violation. Compared to using SetLength() before the write, using SetCapacity() saves only one write to the mLength field. In particular, as currently implemented, SetCapacity() does *not* save a zero-terminator write and instead causes a cache-unfriendly non-linear write pattern anyway.

As far as I can tell, the pattern of using BeginWriting() to write past length after SetCapacity() is not documented as a permitted pattern either in the string headers or in https://developer.mozilla.org/en-US/docs/Mozilla/Tech/XPCOM/Guide/Internal_strings .

Since bug 1482828 adds a proper API that avoids useless or cache-unfriendly writes while enabling low-level writes to extend the string in an orderly fashion, I suggest explicitly documenting that writing past mLength via BeginWriting() is not OK.
(In reply to Henri Sivonen (:hsivonen) from comment #0)
> As seen in bug 1482828, we have code that does this:

As seen in bug 1472113, rather.
Comment on attachment 9002691 [details]
Bug 1484668 - Document that writing past mLength code units via BeginWriting() is not OK.

Nathan Froyd [:froydnj] has approved the revision.
Attachment #9002691 - Flags: review+
Pushed by hsivonen@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/8fd8a747cd87
Document that writing past mLength code units via BeginWriting() is not OK. r=froydnj
https://hg.mozilla.org/mozilla-central/rev/8fd8a747cd87
Status: ASSIGNED → RESOLVED
Closed: 11 months ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla63
I'm having trouble to understand this change. The whole point of SetCapacity is that it reserves enough memory so that can can append and set length larger.
If the meaning of SetLength changes, how?

I don't anymore know how to create large string in an efficient way, avoid as many allocations as possible. (Allocations are very slow).

+   * If your string is an nsAuto[C]String and you are calling
+   * SetCapacity() with a constant N, please instead declare the
+   * string as nsAuto[C]StringN<N+1> without calling SetCapacity().
That is rather scary suggestion. Could use way too much stack space.
(In reply to Olli Pettay [:smaug] from comment #6)
> I'm having trouble to understand this change. The whole point of SetCapacity
> is that it reserves enough memory so that can can append and set length
> larger.

No, the point is to be able to do a sequence of Append()s without reallocation.

> If the meaning of SetLength changes, how?

The change to SetLength() is that it can sometimes reallocate to reduce capacity. (There's a threshold so that it doesn't reallocate from very short to even shorter.)

> I don't anymore know how to create large string in an efficient way, avoid
> as many allocations as possible.

https://groups.google.com/forum/#!topic/mozilla.dev.platform/amMU6GKRIb4 suggests two ways.

> +   * If your string is an nsAuto[C]String and you are calling
> +   * SetCapacity() with a constant N, please instead declare the
> +   * string as nsAuto[C]StringN<N+1> without calling SetCapacity().
> That is rather scary suggestion. Could use way too much stack space.

Do you have a suggestion of a guideline for max size of N for which the above guidance should apply?
You need to log in before you can comment on or make changes to this bug.