Closed Bug 887437 Opened 11 years ago Closed 11 years ago

Optimize encodeURIComponent

Categories

(Core :: JavaScript Engine, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla25

People

(Reporter: jandem, Assigned: jandem)

Details

Attachments

(1 file, 1 obsolete file)

encodeURIComponent perf is critical for Kraken stanford-crypto-sha256.

Currently, for every char, Encode does a js_strchr on this table:

    {'0', '1', '2', '3', '4', '5', '6', '7', '8', '9',
     'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M',
     'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z',
     'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm',
     'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z',
     '-', '_', '.', '!', '~', '*', '\'', '(', ')', 0};

Using a few range checks instead wins at least 12% on this test. Ideally we'd use the char as index in an array of booleans.
Attached patch Patch (obsolete) — Splinter Review
The conversion from array-of-chars to array-of-booleans was a bit tedious but after that it was pretty straight-forward.
Attachment #768216 - Flags: review?(hv1989)
Attached patch PatchSplinter Review
Attachment #768216 - Attachment is obsolete: true
Attachment #768216 - Flags: review?(hv1989)
Attachment #768218 - Flags: review?(hv1989)
Comment on attachment 768218 [details] [diff] [review]
Patch

Review of attachment 768218 [details] [diff] [review]:
-----------------------------------------------------------------

Nice!

::: js/src/jsstr.cpp
@@ +4419,2 @@
>              if (!sb.append(c))
>                  return false;

We can remove the check now here, since you reserved it. Though I know there hasn't been a concensus about removing/leaving this. So I'm fine either way.

@@ +4564,5 @@
>      Rooted<JSLinearString*> str(cx, ArgToRootedString(cx, args, 0));
>      if (!str)
>          return false;
>  
> +    return Encode(cx, str, js_isUriUnescaped, js_isUriReservedPlusPound, args.rval());

Is there a particular reason the arguments got switched? Is that just because "str_encodeURI_Component" has js_isUriUnescaped on the first place?
Attachment #768218 - Flags: review?(hv1989) → review+
Thanks for the quick review!

https://hg.mozilla.org/integration/mozilla-inbound/rev/e5f664c8b66d


(In reply to Hannes Verschore [:h4writer] from comment #3)
> We can remove the check now here, since you reserved it.

We reserve input.length chars, but encoding can result in a longer output string, for instance "#" -> "%23"

> Is there a particular reason the arguments got switched? Is that just
> because "str_encodeURI_Component" has js_isUriUnescaped on the first place?

Yeah that's one reason, but it's also a micro-optimization: js_isUriUnescaped includes more characters (and common ones like a-z, A-Z, 0-9) so it's a bit more efficient to test these first.
https://hg.mozilla.org/mozilla-central/rev/e5f664c8b66d
Status: ASSIGNED → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla25
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: