Closed Bug 1283710 Opened 8 years ago Closed 8 years ago

Simplify or clarify the lifetime and the ownership of JSErrorReport members.

Categories

(Core :: JavaScript Engine, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla52
Tracking Status
firefox50 --- affected
firefox52 --- fixed

People

(Reporter: arai, Assigned: arai)

References

Details

Attachments

(9 files, 8 obsolete files)

1.80 KB, patch
arai
: review+
Details | Diff | Splinter Review
26.20 KB, patch
Waldo
: review+
Details | Diff | Splinter Review
6.17 KB, patch
arai
: review+
Details | Diff | Splinter Review
42.25 KB, patch
arai
: review+
Details | Diff | Splinter Review
7.38 KB, patch
arai
: review+
Details | Diff | Splinter Review
8.81 KB, patch
Waldo
: review+
Details | Diff | Splinter Review
1.37 KB, patch
Waldo
: review+
Details | Diff | Splinter Review
29.29 KB, patch
froydnj
: review+
bholley
: review+
Waldo
: review+
Details | Diff | Splinter Review
6.53 KB, patch
bholley
: review+
Details | Diff | Splinter Review
related to bug 1283058.

lifetime and ownership own each pointer member in JSErrorReport are so much complicated, and messageArgs elements are unnecessarily hold after actual usage.

simplifying it will help adding notes for error message in bug 420857 and bug 104442.
> lifetime and ownership own each pointer member
ownership *of each pointer member
Blocks: 1283712
Assignee: nobody → arai.unmht
draft patches passed try run.
will post after bug 1283058.
https://treeherder.mozilla.org/#/jobs?repo=try&revision=9d8545c3d486
Depends on: 1283058
Prepared 7 patches.

[Part 1]

JSErrorReport.messageArgs is used only in ExpandErrorArgumentsVA, and in most case JSErrorReport.messageArgs is created or assigned there.  Also, if it's created from va_list, I think it points to invalid address after leaving the frame.

Anyway, we don't need to store such information in JSErrorReport struct, so removed the messageArgs member.  and also removed copying logic from CopyErrorReport.

There's one place that uses JSErrorReport.messageArgs in jsapi-tests/testParseJSON.cpp, but it's used to check the line/column number of the JSON error message, that can also be checked by message's string representation, so changed it to use `js::ErrorReport::message()` instead.
Attachment #8771234 - Flags: review?(jwalden+bmo)
[Part 2]

Before tweaking other JSErrorReport members, reordered existing members to reduce padding.
Attachment #8771235 - Flags: review?(jwalden+bmo)
[Part 3]

In ExpandErrorArgumentsVA, we duplicate JSErrorReport.ucmessage to `*messagep`, or `*messagep` to JSErrorReport.ucmessage, but in most case we need only one of them.

Also, it's unclear that who owns the allocated memory, and when it will be free'd.

So, first, changed JSErrorReport.ucmessage to private member JSErrorReport.message_, with an union of `const char16_t*` and `const char*`, to store one of them.
Also, added 2 flags
  * JSErrorReport.isTwoByteMessage_ to indicate if JSErrorReport.message_ is in char16_t or char.
  * JSErrorReport.ownMessage_ to indicate if the JSErrorReport instance owns the memory pointed by JSErrorReport.message_

In this part, only char16_t type is used, just to minimize the diff.

And added some accessor methods.
  * bool hasMessage()
  * bool hasUCMessage()
  * bool ownMessage()
  * const char* message()
  * const char16_t* ucmessage()
  * void ensureMessage(JSContext* cx)
  * void initMessageOwn(const char* messageArg)
  * void initMessageNotOwn(const char* messageArg)
  * void initUCMessageOwn(const char16_t* ucmessageArg)
  * void initUCMessageNotOwn(const char16_t* ucmessageArg)
  * JSString* newMessageString(JSContext* cx)
  * void freeMessage()

there JSErrorReport::init* have 2 variants *Own and *NotOwn.
Initializing message with *Own sets JSErrorReport.ownMessage_ to true, and the memory is free'd in JSErrorReport::freeMessage.
JSErrorReport::freeMessage is called in destructor.

JSErrorReport::ensureMessage converts char16_t* message to char* if needed.
This is here because in some case consumer needs char* specifically.

JSErrorReport::newMessageString converts message to JSString.
This is just to hide char16_t/char differences.
Attachment #8771236 - Flags: review?(jwalden+bmo)
[Part 4]

Like JSErrorReport.message_, the owner of JSErrorReport.linebuf_ is also unclear.
So added JSErrorReport.ownLinebuf_ flag.

JSErrorReport::initLinebuf is changed into 2 function, for *Own and *NotOwn.
 * initLinebufOwn
 * initLinebufNotOwn

Also added JSErrorReport::freeLinebuf and it's called inside destructor.
Attachment #8771237 - Flags: review?(jwalden+bmo)
[Part 5]

As mentioned in Part 3, we duplicate char16_t message to char message or the opposite way.
This part removes the code and store allocated string in JSErrorReport.message_ only.

In WarningReporter, message parameter is always the same as report.ucmessage(), so removed the parameter.

In error reporting, there could be 2 strings used, one is message, and other is the result of toString().  so there char* parameter is still kept (the name is changed in Part 6)

CompileError.message and js::ErrorReport.ownedMessage are used to hold the char* message, so removed them.

In js::PrintError, `report` parameter is always non-null, and |if (!report)| branch is not used, so removed it for simplicity.
Attachment #8771239 - Flags: review?(jwalden+bmo)
[Part 6]

As mentioned in Part 5, in error reporting related function, char* parameter is actually the result of toString(), so changed the name of those parameters to toStringResult or aToStringResult or such.
Attachment #8771240 - Flags: review?(jwalden+bmo)
[Part 7]

As the result of Part 5, CompileError has no other member than JSErrorReport, so changed it to the subclass of JSErrorReport.
Attachment #8771241 - Flags: review?(jwalden+bmo)
Comment on attachment 8771234 [details] [diff] [review]
Part 1: Remove JSErrorReport.messageArgs.

Review of attachment 8771234 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/jsapi-tests/testParseJSON.cpp
@@ +288,5 @@
>  }
>  
> +template<size_t N> inline bool
> +Error(JSContext* cx, const char (&input)[N], int32_t expectedLine,
> +      int32_t expectedColumn)

Use uint32_t for this, since our type for line/column is uint32_t.  (I'm fairly sure you can pass int to a template function like this without hitting compile errors -- if that's wrong, ignore me but let me know.)

::: js/src/jscntxt.cpp
@@ +471,5 @@
> +        uint16_t i = 0;
> +        while (messageArgs[i])
> +            js_free((void*)messageArgs[i++]);
> +    }
> +    js_free((void*)messageArgs);

I think you can get rid of the cast on this last line.

@@ +534,5 @@
>                      /* Do nothing. */
>                  } else if (argumentsType == ArgumentsAreASCII) {
>                      char* charArg = va_arg(ap, char*);
>                      size_t charArgLength = strlen(charArg);
> +                    messageArgs[i] = InflateString(cx, charArg, &charArgLength);

In passing I lol at how we get an exact length here, then we discard it and use js_strlen instead, below, and thus potentially have truncated the string.  Not something to fix now, and IMO something to fix by making all error-handling strings UTF-8 (or maybe UTF-16, or at least UCS-2 with explicit length passed around).

@@ +627,5 @@
>              goto error;
>          JS_snprintf(*messagep, nbytes, defaultErrorMessage, errorNumber);
>      }
> +    if (!messageArgsPassed)
> +        FreeMessageArgs(messageArgs, argumentsType);

This is fugly and should be an RAII'd thing, but that's definitely more change than we should be doing in this cleanup.  If I'm lucky, it's in one of the later patches.  :-)

@@ +632,5 @@
>      return true;
>  
>  error:
> +    if (!messageArgsPassed)
> +        FreeMessageArgs(messageArgs, argumentsType);

Ugh, even more fugly.
Attachment #8771234 - Flags: review?(jwalden+bmo) → review+
Attachment #8771235 - Flags: review?(jwalden+bmo) → review+
Comment on attachment 8771236 [details] [diff] [review]
Part 3: Change JSErrorReport.ucmessage to JSErrorReport.message_ with const char* and const char16_t* union, and release them in destructor if own.

Review of attachment 8771236 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/jsapi.cpp
@@ +5996,5 @@
> +
> +JSString*
> +JSErrorReport::newMessageString(JSContext* cx)
> +{
> +    return hasMessage() ? JS_NewStringCopyZ(cx, message())

This is interpreting the message as Latin-1, but the function above is creating a UTF-8 message.  If we're adding/improving all this stuff, I want us to be enforcing consistent encoding -- ideally UTF-8 if that's possible without changing too much beyond the patch's current scope.

::: js/src/jsapi.h
@@ +5143,5 @@
>  
>  class JSErrorReport
>  {
> +    // The (default) error message.
> +    // If ownMessage_ is true, the it is freed in destructor.

Ugh, I guess this freeing thing is enough to preclude using Variant, if its size-consumption considerations were solved.  (Just occurred to me the tag is always size_t -- filed and fixed bug 1287243 for that.)

@@ +5145,5 @@
>  {
> +    // The (default) error message.
> +    // If ownMessage_ is true, the it is freed in destructor.
> +    union {
> +        const char* oneByteChars;

latin1Chars or utf8Chars, please, whichever ends up being done.

@@ +5166,5 @@
>          flags(0), errorNumber(0),
> +        exnType(0), isMuted(false),
> +        isTwoByteMessage_(false), ownMessage_(false)
> +    {
> +        message_.oneByteChars = nullptr;

I think message_ could be inited in the decl list now, possibly with a static_cast<>'d nullptr to the first member type.  But this works too.

@@ +5183,5 @@
> +    bool            isMuted : 1;    /* See the comment in ReadOnlyCompileOptions. */
> +
> +private:
> +    bool isTwoByteMessage_ : 1;
> +    bool ownMessage_ : 1;

ownsMessage_

@@ +5185,5 @@
> +private:
> +    bool isTwoByteMessage_ : 1;
> +    bool ownMessage_ : 1;
> +
> +public:

These private/public should be indented two spaces.

@@ +5197,5 @@
>          return tokenOffset_;
>      }
>      void initLinebuf(const char16_t* linebuf, size_t linebufLength, size_t tokenOffset);
> +
> +    bool hasMessage() const {

hasLatin1Message or hasUTF8Message

Assume this comment about adding "Latin1" or "UTF8" applies to every name that refers to just "message" right now, please.

@@ +5203,5 @@
> +    }
> +    bool hasUCMessage() const {
> +        return isTwoByteMessage_ && message_.twoByteChars;
> +    }
> +    bool ownMessage() {

ownsMessage is a little more common for verb tense, I think -- this instance "owns the message".  Compare with this instance "has [a] message".

@@ +5220,5 @@
> +    void ensureMessage(JSContext* cx);
> +
> +    void initMessageOwn(const char* messageArg) {
> +        initMessageNotOwn(messageArg);
> +        ownMessage_ = true;

initOwnedMessage I think would be mildly better, and initUnownedMessage (or maybe initBorrowedMessage now that Rust and all?).  Same for the UC names.

::: js/src/jscntxt.cpp
@@ +593,5 @@
>                  }
>                  MOZ_ASSERT(expandedArgs == argCount);
>                  *out = 0;
>                  js_free(buffer);
> +                size_t msgLen = PointerRangeSize(static_cast<const char16_t*>(ucmessage),

This cast isn't necessary.

::: js/xpconnect/src/XPCComponents.cpp
@@ +2347,4 @@
>          const char16_t* linebuf = err->linebuf();
>  
>          nsresult rv = scripterr->InitWithWindowID(
> +                err->hasMessage() ? NS_ConvertASCIItoUTF16(err->message())

AAAAaaugughghghgh now you're interpreting this as ASCII which is kinda technically not Latin-1 or UTF-8 why would you do this to me aaaaaaugghghghghhhhhhhhhh

::: js/xpconnect/src/XPCConvert.cpp
@@ +1215,5 @@
>      RefPtr<nsScriptError> data;
>      if (report) {
>          nsAutoString bestMessage;
> +        if (report && report->hasMessage()) {
> +            CopyASCIItoUTF16(report->message(), bestMessage);

Now you're just twisting the knife.  :-P

@@ +1217,5 @@
>          nsAutoString bestMessage;
> +        if (report && report->hasMessage()) {
> +            CopyASCIItoUTF16(report->message(), bestMessage);
> +        } else if (report && report->hasUCMessage()) {
> +            bestMessage = static_cast<const char16_t*>(report->ucmessage());

Remove the unnecessary cast.

::: js/xpconnect/src/nsXPConnect.cpp
@@ +297,5 @@
>              AssignJSFlatString(aString, name);
>              aString.AppendLiteral(": ");
>          }
> +        if (aReport->hasMessage())
> +            aString.AppendASCII(aReport->message());

e_e

::: netwerk/base/ProxyAutoConfig.cpp
@@ +326,5 @@
>    nsString formattedMessage(NS_LITERAL_STRING("PAC Execution "));
>    formattedMessage += aKind;
>    formattedMessage += NS_LITERAL_STRING(": ");
> +  if (aReport->hasMessage())
> +    formattedMessage.AppendASCII(aReport->message());

(╯°□°)╯︵ ┻━┻
Attachment #8771236 - Flags: review?(jwalden+bmo) → review-
Comment on attachment 8771237 [details] [diff] [review]
Part 4: Release JSErrorReport.linebuf_ in destructor if own.

Review of attachment 8771237 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/jsapi.cpp
@@ +5983,5 @@
> +JSErrorReport::freeLinebuf()
> +{
> +    if (ownLinebuf_ && linebuf_) {
> +        js_free((void*)linebuf_);
> +        linebuf_ = nullptr;

You could maybe remove this and rely on fallthrough -- for this small a function it seems reasonable to me to do so.

::: js/src/jsapi.h
@@ +5203,5 @@
> +        initLinebufNotOwn(linebufArg, linebufLengthArg, tokenOffsetArg);
> +        ownLinebuf_ = true;
> +    }
> +    void initLinebufNotOwn(const char16_t* linebufArg, size_t linebufLengthArg, size_t tokenOffsetArg);
> +    bool ownLinebuf() {

ownsLinebuf
Attachment #8771237 - Flags: review?(jwalden+bmo) → review+
Comment on attachment 8771239 [details] [diff] [review]
Part 5: Do not duplicate message in ExpandErrorArgumentsVA, and remove message parameter from WarningReporter.

Review of attachment 8771239 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/jsapi.h
@@ +5276,5 @@
>  
>  namespace JS {
>  
>  typedef void
> +(* WarningReporter)(JSContext* cx, JSErrorReport* report);

If we're touching this, use C++11's better typedef syntax:

using WarningReporter = void (*)(JSContext* cx, JSErrorReport* report);

::: js/src/jscntxt.cpp
@@ +327,5 @@
>  
>      if (checkReportFlags(cx, &flags))
>          return true;
>  
>      message = JS_vsmprintf(format, ap);

Move the char* declaration down to here, while you're at it.

@@ +332,5 @@
>      if (!message) {
>          ReportOutOfMemory(cx);
>          return false;
>      }
> +    report.initMessageOwn(message);

Ugh, if we have message flowing in here, we're going to have to audit every caller of this function (and callers of it, etc.) to make sure they're all passing UTF-8-ful stuff.  This is going to turn into the same dumpster fire that bug 987069 is, isn't it.  :-(

@@ +376,5 @@
>  }
>  
>  bool
>  js::PrintError(JSContext* cx, FILE* file, const char* message, JSErrorReport* report,
>                 bool reportWarnings)

Could you file a bug to get rid of the |message| parameter, in favor of callers filling the message into |report| themselves?

@@ +596,5 @@
>               * entire message.
>               */
>              if (efs->format) {
> +                const char* message;
> +                message = DuplicateString(cx, efs->format).release();

I think this all fits on one line without exceeding 99ch.
Attachment #8771239 - Flags: review?(jwalden+bmo) → review+
Comment on attachment 8771240 [details] [diff] [review]
Part 6: Rename message to toStringResult if it is the result of toString.

Review of attachment 8771240 [details] [diff] [review]:
-----------------------------------------------------------------

Bah, so message/toStringResult has an encoding mess as well.  This looks fine except for that.  Since this necessarily touches all the different users, let's fix its encoding as well while we're here -- all the callers/users should at least be obvious, given this patch.

::: js/src/jscntxt.cpp
@@ +401,5 @@
>          JS_free(cx, tmp);
>      }
>  
> +    report->ensureMessage(cx);
> +    const char* message = toStringResult ? toStringResult : report->message();

This technically isn't equivalent -- maybe do

  const char* message;
  if (toStringResult) {
      message = toStringResult;
  } else {
      report->ensureMessage(cx);
      message = report->message();
  }

so as to avoid allocating memory ensuring a message that would otherwise go completely ignored?

::: js/src/jsfriendapi.h
@@ +1483,4 @@
>      // And for our filename.
>      JSAutoByteString filename;
>  
> +    // We may have a result of error.toString().

Ugh, this is so bogus.  See bug 633623.  Please clarify somehow in a comment that this is Bad and we should get rid of it, referring to that bug.

@@ +1484,5 @@
>      JSAutoByteString filename;
>  
> +    // We may have a result of error.toString().
> +    const char* toStringResult_;
> +    JSAutoByteString toStringResultBytesStorage;

Having both these fields seems...strange.  But I guess consistent with what we already do, so okay, I guess.  Still more muck to clean out eventually.

::: js/xpconnect/src/XPCConvert.cpp
@@ +1219,5 @@
>              CopyASCIItoUTF16(report->message(), bestMessage);
>          } else if (report && report->hasUCMessage()) {
>              bestMessage = static_cast<const char16_t*>(report->ucmessage());
> +        } else if (toStringResult) {
> +            CopyASCIItoUTF16(toStringResult, bestMessage);

Nooooooooooo ASCII again when will the madness stop
Attachment #8771240 - Flags: review?(jwalden+bmo) → review-
Comment on attachment 8771241 [details] [diff] [review]
Part 7: Make CompileError subclass of JSErrorReport.

Review of attachment 8771241 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/jsexn.cpp
@@ +908,5 @@
>                                  nullptr, ArgumentsAreASCII, &ownedReport, ap)) {
>          return false;
>      }
>  
> +    ownedReport.ensureMessage(cx);

Erm, should this have been in a previous patch, probably?
Attachment #8771241 - Flags: review?(jwalden+bmo) → review+
Thank you for reviewing :)

Now checking encoding of message, to replace it with UTF-8.
There are several places that passes Latin1, that could be fixed easily.

however, there are also some places that passes more random encoding, like, native encoding or depending on LC_MESSAGES env :/
So, our error reporting API provides 2 variants, ASCII (JS_ReportErrorNumber) and Unicode (JS_ReportErrorNumberUC).
And ASCII variant actually accepts Latin1 in so many case.

Maybe we'd better adding UTF-8 variant as well?  like JS_ReportErrorUTF8, JS_ReportErrorNumberUTF8, etc?
it will clarify what encoding they should use for format and parameters.
And hopefully we could deprecate ASCII variant at some point.
This is gonna be a bug 987069 trail of tears no matter how we do it, isn't it.  :-(

I would suggest having JS_ReportError*{Latin1,UTF8,UC} and then change every user of the ASCII-ish thing we have now (I think technically it's Latin-1, tho it might not be documented that way) to Latin1 or UTF8, in however many preliminary patches are necessary (feel free to split them up across people, this part's parallelizable).  Once that's done, the patch here can be about as minimal as it originally was, just more consistent about encodings without some big inherent sprawl to it.
Depends on: 1289003
Depends on: 1289050
Comment on attachment 8771234 [details] [diff] [review]
Part 1: Remove JSErrorReport.messageArgs.

will land this and RAII patch separately in bug 1290422, before adding Latin1 variant.
Attachment #8771234 - Attachment is obsolete: true
there are several places that uses really ASCII only string, such like static string without arguments, and as we're going to use UTF-8 as internal representation and Latin1 needs conversion or at least character range cgeck,
it might be better to add JS_ReportError*ASCII variant, so that we can convert it to UTF-8 without any conversion/range check.
we could be able to validate the string in debug build.
(In reply to Tooru Fujisawa [:arai] from comment #20)
> there are several places that uses really ASCII only string, such like
> static string without arguments, and as we're going to use UTF-8 as internal
> representation and Latin1 needs conversion or at least character range cgeck,
> it might be better to add JS_ReportError*ASCII variant, so that we can
> convert it to UTF-8 without any conversion/range check.
> we could be able to validate the string in debug build.

sorry, nvm, such case should use UTF8 variant.
Depends on: 1294940
No longer depends on: 1289003
No longer depends on: 1294940
After we change the internal representation to UTF-8, we'd better changing js::ExpandErrorArgumentsVA to operate on UTF-8, not UTF-16.
finally, we're back here.
Attachment #8771235 - Attachment is obsolete: true
Attachment #8796825 - Flags: review+
Changed JSErrorReport.message_ to JS::ConstUTF8CharsZ.

Now following methods are added to JSErrorReport:
  * JSErrorReport::message()
  * JSErrorReport::initOwnedMessage
  * JSErrorReport::initBorrowedMessage
  * JSErrorReport::newMessageString
    creates JSString from message
  * JSErrorReport::freeMessage
    free message if owned
Attachment #8771236 - Attachment is obsolete: true
Attachment #8796826 - Flags: review?(jwalden+bmo)
now encoding issue should be fixed :)
everything is in UTF-8.
Attachment #8771240 - Attachment is obsolete: true
Attachment #8796829 - Flags: review?(jwalden+bmo)
Most callsites use ASCII and UTF-8 and we don't need UTF-16 as a result.  no reason to use UTF-16 for processing parameter.
Changed js::ExpandErrorArgumentsVA and AutoMessageArgs. to use UTF-8.
Attachment #8796831 - Flags: review?(jwalden+bmo)
efs->format seems to be static string in all case.
so I don't see any reason to duplicate it.
am I missing some case?

Green on try run:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=0a07c85458f1
Attachment #8796832 - Flags: review?(jwalden+bmo)
Comment on attachment 8796826 [details] [diff] [review]
Part 2: Change JSErrorReport.ucmessage to JSErrorReport.message_ with ConstUTF8CharsZ, and release them in destructor if it is owned.

Review of attachment 8796826 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/jscntxt.cpp
@@ +666,1 @@
>                                                   static_cast<const char16_t*>(out));

Betting this fits in 99ch now.

@@ +686,5 @@
>                  if (!*messagep)
>                      goto error;
> +
> +                char* message = DuplicateString(cx, efs->format).release();
> +                if (!message)

Duplicating the same string twice here is a bit whack.  Maybe we can clean that up later, or it's cleaned up later in this patch series.

::: js/src/jsexn.cpp
@@ +842,2 @@
>              // correspond to |ErrorMessage|. But this is what we've historically
>              // done for duck-typed error objects.

Rewrap this paragraph to 79ch if necessary, looks like it might be.

@@ +843,5 @@
>              // done for duck-typed error objects.
>              //
>              // If only this stuff could get specced one day...
> +            if (str->ensureFlat(cx) && strChars.initTwoByte(cx, str)) {
> +                char* utf8 = JS::CharsToNewUTF8CharsZ(cx, strChars.twoByteRange()).c_str();

I think this might have been a pre-existing problem: if |str->ensureFlat| fails, we should clear a pending exception.  And now that we're also allocating, that could fail (maybe with pending exception, I can't remember).

Additionally, I'm not sure we should be returning false if the 16-bit allocation fails.  The rest of the error-handling in this function is careful not to return false on error, almost always.  (And always in the code surrounding this fillip.)  It seems like this should do similar things.

So I think we want something more like this:

  if (str) {
    // Note that using...
    char* utf8;
    if (str->ensureFlat(cx) &&
        strChars.initTwoBytes(cx, str) &&
        (utf8 = JS::CharsToNewUTF8CharsZ(...)))
    {
      ownedReport.initOwnedMessage(utf8);
       
    } else {
      cx->clearPendingException();
      str = nullptr;
    }
  }

::: js/src/vm/ErrorObject.cpp
@@ +155,1 @@
>          return nullptr;

Seems to me that

  UniquePtr<char[], JS::FreePolicy> utf8 = StringToNewUTF8CharsZ(...);
  if (!utf8)
      return nullptr;
  report.initOwnedMessage(utf8.release());

is slightly better.  (Best would be initOwnedMessage taking UniquePtr itself, but given what the other callers look like this isn't unreasonable.)
Attachment #8796826 - Flags: review?(jwalden+bmo) → review+
Comment on attachment 8796829 [details] [diff] [review]
Part 5: Rename message to toStringResult if it is the result of toString.

Review of attachment 8796829 [details] [diff] [review]:
-----------------------------------------------------------------

::: dom/workers/WorkerPrivate.cpp
@@ +5826,4 @@
>        // Try again, with only a 1 KB string. Do this infallibly this time.
>        // If the user doesn't have 1 KB to spare we're done anyways.
> +      nsDependentCString truncatedToStringResult(aToStringResult, 1024);
> +      AppendUTF8toUTF16(truncatedToStringResult, message);

Um.  Truncating UTF-8 this way -- which I think just is a hard byte count -- could hit the middle of a codepoint.  And who knows what happens when such a string is consumed by anything else.

I literally do not know what the *proper* approach is here.  But here's a hard-coded version that suffices until *someone* somewhere (maybe the XPConnect reviewer?) can tell us what the right approach is:

  if (!AppendUTF8toUTF16(toStringResult, message, mozilla::fallible)) {
    uint32_t index = std::min(uint32_t(1024), toStringResult.Length());
    while (index > 0 && (toStringResult[index] & 0xC0) == 0x80)
      index--;
    nsDependentCString truncatedToStringResult(aToStringResult, index);
    AppendUTF8toUTF16(truncatedToStringResult, message);
  }

Ugh.

::: dom/workers/WorkerPrivate.h
@@ +1141,5 @@
>    bool
>    NotifyInternal(JSContext* aCx, Status aStatus);
>  
>    void
> +  ReportError(JSContext* aCx, const char* aToStringResult,

Make |aToStringResult| a ConstUTF8CharsZ.

::: js/src/jscntxt.h
@@ +630,5 @@
>   * is true.
>   * Returns false otherwise.
>   */
>  extern bool
> +PrintError(JSContext* cx, FILE* file, const char* toStringResult,

I think |toStringResult| can be ConstUTF8CharsZ -- every caller looks like it passes in a .c_str() of such, so easy to change.

::: js/xpconnect/src/XPCConvert.cpp
@@ +1094,3 @@
>                  JSString* str;
>                  if (nullptr != (str = ToString(cx, s)))
> +                    toStringResult.encodeLatin1(cx, str);

Can't this be Utf8, if the CopyASCIIToUTF16 below changes to CopyUTF8toUTF16?  Please do that.

I think this patch probably needs an XPConnect review at this point.

@@ +1190,5 @@
>  /********************************/
>  
>  // static
>  nsresult
> +XPCConvert::JSErrorToXPCException(const char* toStringResult,

This function's never used outside this function -- should convert it to file-static in a followup patch, seems to me.

::: js/xpconnect/src/nsXPConnect.cpp
@@ +177,5 @@
>      mWindowID = aWindowID;
>  
>      ErrorReportToMessageString(aReport, mErrorMsg);
> +    if (mErrorMsg.IsEmpty() && aToStringResult) {
> +        mErrorMsg.AssignWithConversion(aToStringResult);

mErrorMsg is nsString, so we have

void
nsString::AssignWithConversion( const nsACString& aData )
{
  CopyASCIItoUTF16(aData, *this);
}

as what's done here, with a UTF-8-encoded string.  Ixnay.  I think this wants to be AppendUTF8toUTF16, doesn't it?  (Double-check me on this.)
Attachment #8796829 - Flags: review?(jwalden+bmo) → review-
Comment on attachment 8796831 [details] [diff] [review]
Part 7: Use UTF-8 in js::ExpandErrorArgumentsVA and AutoMessageArgs.

Review of attachment 8796831 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/jscntxt.cpp
@@ +467,1 @@
>      mozilla::Array<size_t, JS::MaxNumErrorArguments> lengths_;

How about Array<UTF8Chars, JS::MaxNumErrorArguments> instead?

@@ +471,5 @@
>    public:
>      AutoMessageArgs()
> +      : totalLength_(0), count_(0), allocatedElements_(false)
> +    {
> +        memset(args_.begin(), 0, args_.end() - args_.begin());

PodArrayZero(args_);

This wouldn't even be right, anyway -- subtracting const char** as this does will divide the byte-count by sizeof(const char*), so you won't clear the entire array.  Would be nice if somehow we could have a test that fails with this mistake in place, but it may not be writable.
Attachment #8796831 - Flags: review?(jwalden+bmo) → review+
Comment on attachment 8796832 [details] [diff] [review]
Part 8: Do not duplicate zero arguments error message.

Review of attachment 8796832 [details] [diff] [review]:
-----------------------------------------------------------------

Hm, I think this is okay, yeah.  In principle I could be worried that an |efs| could be dynamically allocated.  But that's not really possible with the current callback API just returning a pointer, with no ownership sense attached to it.
Attachment #8796832 - Flags: review?(jwalden+bmo) → review+
(In reply to Jeff Walden [:Waldo] (remove +bmo to email) from comment #33)
> Comment on attachment 8796831 [details] [diff] [review]
> Part 7: Use UTF-8 in js::ExpandErrorArgumentsVA and AutoMessageArgs.
> 
> Review of attachment 8796831 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> ::: js/src/jscntxt.cpp
> @@ +467,1 @@
> >      mozilla::Array<size_t, JS::MaxNumErrorArguments> lengths_;
> 
> How about Array<UTF8Chars, JS::MaxNumErrorArguments> instead?

UTF8Chars cannot be assigned :/

> .../mozilla-central/js/src/jscntxt.cpp:542:22: error: object of type 'JS::UTF8Chars' cannot be assigned
>       because its copy assignment operator is implicitly deleted
>             args_[i] = UTF8Chars(utf8, length);
>                      ^
> .../mozilla-central/obj-sm-ed2/dist/include/js/CharacterEncoding.h:75:19: note: copy assignment
>       operator of 'UTF8Chars' is implicitly deleted because base class 'mozilla::Range<unsigned char>' has a deleted
>       copy assignment operator
> class UTF8Chars : public mozilla::Range<unsigned char>
>                   ^
> .../mozilla-central/obj-sm-ed2/dist/include/mozilla/Range.h:20:22: note: copy assignment operator of
>       'Range<unsigned char>' is implicitly deleted because field 'mStart' has no copy assignment operator
>   const RangedPtr<T> mStart;
>                      ^
> 1 error generated.

also

>   /*
>    * You can only assign one RangedPtr into another if the two pointers have
>    * the same valid range:
>    *
>    *   char arr1[] = "hi";
>    *   char arr2[] = "bye";
>    *   RangedPtr<char> p1(arr1, 2);
>    *   p1 = RangedPtr<char>(arr1 + 1, arr1, arr1 + 2); // works
>    *   p1 = RangedPtr<char>(arr2, 3);                  // asserts
>    */
>   RangedPtr<T>& operator=(const RangedPtr<T>& aOther)

is there any way to store UTF8Chars to array?
Flags: needinfo?(jwalden+bmo)
Hm, right.

I'm increasingly convinced Range isn't the right representation for strings.  Possibly for strings in general, but *definitely* for UTF-8 strings in particular where indexing and byte length are generally-meaningless information.

I guess stick with what you have for now, because there's nothing quite right for this.  Sad.  :-\
Flags: needinfo?(jwalden+bmo)
Addressed review comments.
Attachment #8796829 - Attachment is obsolete: true
Attachment #8798407 - Flags: review?(jwalden+bmo)
Attachment #8798407 - Flags: review?(bobbyholley)
Comment on attachment 8798407 [details] [diff] [review]
Part 5: Rename message to toStringResult if it is the result of toString.

Review of attachment 8798407 [details] [diff] [review]:
-----------------------------------------------------------------

rs=me on the rename parts outside of the js engine.

::: dom/workers/WorkerPrivate.cpp
@@ +5840,5 @@
> +
> +      // Drop the last code point that may be cropped.
> +      while (index > 0 && (toStringResult[index] & 0xC0) == 0x80) {
> +        index--;
> +      }

This should move into a helper somewhere and be reviewed by somebody who knows more about unicode than I do (Waldo might work).
Attachment #8798407 - Flags: review?(bobbyholley) → review+
Attachment #8798408 - Flags: review?(bobbyholley) → review+
Comment on attachment 8798407 [details] [diff] [review]
Part 5: Rename message to toStringResult if it is the result of toString.

Review of attachment 8798407 [details] [diff] [review]:
-----------------------------------------------------------------

::: dom/workers/WorkerPrivate.cpp
@@ +5840,5 @@
> +
> +      // Drop the last code point that may be cropped.
> +      while (index > 0 && (toStringResult[index] & 0xC0) == 0x80) {
> +        index--;
> +      }

froydnj: how about we add (with any tpyos fixed)

#include <type_traits>

template<typename Char, typename UnsignedT>
inline void
RewindToPriorUTF8Codepoint(const Char* utf8Chars, UnsignedT* index)
{
  static_assert(std::is_same<Char, char>::value ||
                std::is_same<Char, unsigned char>::value ||
                std::is_same<Char, signed char>::value,
                "UTF-8 data must be in 8-bit units");
  static_assert(std::is_unsigned<UnsignedT>::value, "index type must be unsigned");
  while (*index > 0 && (utf8Chars[*index] & 0xC0) == 0x80)
    --*index;
}

to xpcom/string/nsUTF8Utils.h for use here as

  RewindToPriorUTF8Codepoint(toStringResult.BeginReading(), &index);

?  (Probably should be in mfbt, but there's only the one user so far, and I don't think we have quite the critical mass of UTF-8-related functionality yet to start mfbt/UTF8.h or something.)
Attachment #8798407 - Flags: review?(nfroyd)
Attachment #8798407 - Flags: review?(jwalden+bmo)
Attachment #8798407 - Flags: review+
Comment on attachment 8798407 [details] [diff] [review]
Part 5: Rename message to toStringResult if it is the result of toString.

Review of attachment 8798407 [details] [diff] [review]:
-----------------------------------------------------------------

Sorry about the delay in reviewing; PTO and whatnot.  Waldo's proposed change is OK with two modifications:

- I'd suggest using traits from mozilla/TypeTraits.h instead of type_traits; we ran into some sticky problems when trying seemingly straightforward replacements of TypeTraits.h with type_traits, and I'd rather this patch not be held up by those problems.
- We change the prototype of this new function to:

template<typename Char, typename UnsignedT>
inline UnsignedT
RewindToPriorUTF8Codepoint(const Char* utf8Chars, UnsignedT index)

and avoid in-out parameters.

r=me with those changes.
Attachment #8798407 - Flags: review?(nfroyd) → review+
https://hg.mozilla.org/integration/mozilla-inbound/rev/23938284613786e648c3d11002fa09ba64cffa95
Bug 1283710 - Part 1: Reorder JSErrorReport members. r=jwalden

https://hg.mozilla.org/integration/mozilla-inbound/rev/61f8250cbe0b82d7f918756d86e135d7b3fb401d
Bug 1283710 - Part 2: Change JSErrorReport.ucmessage to JSErrorReport.message_ with ConstUTF8CharsZ, and release them in destructor if it is owned. r=jwalden

https://hg.mozilla.org/integration/mozilla-inbound/rev/dcedbaefe39987d5f802b2c579391d7352df2891
Bug 1283710 - Part 3: Release JSErrorReport.linebuf_ in destructor if own. r=jwalden

https://hg.mozilla.org/integration/mozilla-inbound/rev/ee5215f1a38e702d7335da05316ead5978e64986
Bug 1283710 - Part 4: Do not duplicate message in ExpandErrorArgumentsVA, and remove message parameter from WarningReporter. r=jwalden

https://hg.mozilla.org/integration/mozilla-inbound/rev/d72527b7d3c0c328e65bdaf4c259921a1e8481c2
Bug 1283710 - Part 5: Rename message to toStringResult if it is the result of toString. r=bholley,jwalden,froydnj

https://hg.mozilla.org/integration/mozilla-inbound/rev/51b8d69edca019360cfc445812d51fbbba56450e
Bug 1283710 - Part 6: Make CompileError subclass of JSErrorReport. r=jwalden

https://hg.mozilla.org/integration/mozilla-inbound/rev/98339fa564f1b5ba8e0c518a0ff476447bcbf1d3
Bug 1283710 - Part 7: Use UTF-8 in js::ExpandErrorArgumentsVA and AutoMessageArgs. r=jwalden

https://hg.mozilla.org/integration/mozilla-inbound/rev/fed60fbf645df22d963397aa96c0a926005d2086
Bug 1283710 - Part 8: Do not duplicate zero arguments error message. r=jwalden

https://hg.mozilla.org/integration/mozilla-inbound/rev/f727edc4be4805fb275be192145f0a5902477589
Bug 1283710 - Part 9: Make JSErrorToXPCException a file static function. r=bholley
https://hg.mozilla.org/integration/mozilla-inbound/rev/eb95a12e5d860587a32637da920e0ecb2496c8c4
Bug 1283710 - followup: Free owned linebuf in JSErrorReport. r=jwalden CLOSED TREE
thanks :)
it's from other 2 issues, will fix them and re-land this.
Flags: needinfo?(arai.unmht)
https://hg.mozilla.org/integration/mozilla-inbound/rev/ca517b3fa4fbc9c7abac3de25f6a95df22b34b7f
Bug 1283710 - Part 1: Reorder JSErrorReport members. r=jwalden

https://hg.mozilla.org/integration/mozilla-inbound/rev/8900adb4c36dffa8ef6b89ba3e277e39551dea17
Bug 1283710 - Part 2: Change JSErrorReport.ucmessage to JSErrorReport.message_ with ConstUTF8CharsZ, and release them in destructor if it is owned. r=jwalden

https://hg.mozilla.org/integration/mozilla-inbound/rev/efd6ab563cb9e854fe3c0fbec60d7c491b213648
Bug 1283710 - Part 3: Release JSErrorReport.linebuf_ in destructor if own. r=jwalden

https://hg.mozilla.org/integration/mozilla-inbound/rev/e8bc246a57ad418864039ffe1dfb2cbe9a83fba1
Bug 1283710 - Part 4: Do not duplicate message in ExpandErrorArgumentsVA, and remove message parameter from WarningReporter. r=jwalden

https://hg.mozilla.org/integration/mozilla-inbound/rev/fe8948da2efd64f8695f00f3dd49437f0945051a
Bug 1283710 - Part 5: Rename message to toStringResult if it is the result of toString. r=bholley,jwalden,froydnj

https://hg.mozilla.org/integration/mozilla-inbound/rev/357f9ca720b4e71c12c75e2325c77e8c3386af88
Bug 1283710 - Part 6: Make CompileError subclass of JSErrorReport. r=jwalden

https://hg.mozilla.org/integration/mozilla-inbound/rev/c00f9da4f27def5b26dc39bfa6671c0c25f02b7c
Bug 1283710 - Part 7: Use UTF-8 in js::ExpandErrorArgumentsVA and AutoMessageArgs. r=jwalden

https://hg.mozilla.org/integration/mozilla-inbound/rev/5bde779b6b887d540053e675e29a20f3f9867bc7
Bug 1283710 - Part 8: Do not duplicate zero arguments error message. r=jwalden

https://hg.mozilla.org/integration/mozilla-inbound/rev/00e511e83e80366c8b2392ff159296dc752ef149
Bug 1283710 - Part 9: Make JSErrorToXPCException a file static function. r=bholley
Pushed by arai_a@mac.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/ca517b3fa4fb
Part 1: Reorder JSErrorReport members. r=jwalden
https://hg.mozilla.org/integration/mozilla-inbound/rev/8900adb4c36d
Part 2: Change JSErrorReport.ucmessage to JSErrorReport.message_ with ConstUTF8CharsZ, and release them in destructor if it is owned. r=jwalden
https://hg.mozilla.org/integration/mozilla-inbound/rev/efd6ab563cb9
Part 3: Release JSErrorReport.linebuf_ in destructor if own. r=jwalden
https://hg.mozilla.org/integration/mozilla-inbound/rev/e8bc246a57ad
Part 4: Do not duplicate message in ExpandErrorArgumentsVA, and remove message parameter from WarningReporter. r=jwalden
https://hg.mozilla.org/integration/mozilla-inbound/rev/fe8948da2efd
Part 5: Rename message to toStringResult if it is the result of toString. r=bholley,jwalden,froydnj
https://hg.mozilla.org/integration/mozilla-inbound/rev/357f9ca720b4
Part 6: Make CompileError subclass of JSErrorReport. r=jwalden
https://hg.mozilla.org/integration/mozilla-inbound/rev/c00f9da4f27d
Part 7: Use UTF-8 in js::ExpandErrorArgumentsVA and AutoMessageArgs. r=jwalden
https://hg.mozilla.org/integration/mozilla-inbound/rev/5bde779b6b88
Part 8: Do not duplicate zero arguments error message. r=jwalden
https://hg.mozilla.org/integration/mozilla-inbound/rev/00e511e83e80
Part 9: Make JSErrorToXPCException a file static function. r=bholley
You need to log in before you can comment on or make changes to this bug.