Closed Bug 1488192 Opened 6 years ago Closed 6 years ago

Return the input when no characters were decoded/encoded in decodeURI/encodeURI

Status: RESOLVED FIXED
Milestone: mozilla64

Tracking Flags:

             Tracking   Status
firefox63    ---        wontfix
firefox64    ---        fixed

People

(Reporter: anba, Assigned: anba)

Details

Attachments

(3 files)

not-for-checkin-count-decode-encode.patch, 6 years ago, André Bargull [:anba], 3.38 KB, text/x-patch
decode-encode-counter.txt, 6 years ago, André Bargull [:anba], 4.83 KB, text/plain
bug-1488192.patch, 6 years ago, André Bargull [:anba], 11.53 KB, patch, jandem: review+

André Bargull [:anba]

Assignee

Description

6 years ago

Attached file: not-for-checkin-count-decode-encode.patch

It looks like {de,en}codeURI[Component] are often called on websites even though no characters need to be decoded or encoded, respectively.

For example, I got the following results after applying the attached patch and then browsing some news sites and Alexa 50 sites:

(gdb) p Decode_IdenticalTransferCount
$1 = 3592393
(gdb) p Decode_NonIdenticalTransferCountLatin1
$2 = 177340
(gdb) p Decode_NonIdenticalTransferCountUTF16
$3 = 809
(gdb) p Encode_IdenticalTransferCount
$4 = 445947
(gdb) p Encode_NonIdenticalTransferCountLatin1
$5 = 38846
(gdb) p Encode_NonIdenticalTransferCountUTF16
$6 = 2335

where:
- Decode_IdenticalTransferCount: decodeURI or decodeURIComponent was called and no characters needed to be decoded.
- Decode_NonIdenticalTransferCountLatin1: decodeURI or decodeURIComponent was called, some characters were decoded, and the input was Latin-1.
- Decode_NonIdenticalTransferCountUTF16: decodeURI or decodeURIComponent was called, some characters were decoded, and the input was UTF-16.
- Encode_IdenticalTransferCount: encodeURI or encodeURIComponent was called and no characters needed to be encoded.
- Encode_NonIdenticalTransferCountLatin1: encodeURI or encodeURIComponent was called, some characters were encoded, and the input was Latin-1.
- Encode_NonIdenticalTransferCountUTF16: encodeURI or encodeURIComponent was called, some characters were encoded, and the input was UTF-16.
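
To put the counters above in proportion, here is a small sketch that computes the share of calls that left the string unchanged. The helper name identityShare is hypothetical and is not part of the attached patch; it only does the arithmetic on the three counters of one direction.

```cpp
#include <cstdint>

// Hypothetical helper (not from the patch): fraction of calls in which
// no character had to be decoded or encoded.
inline double identityShare(std::uint64_t identical,
                            std::uint64_t nonIdenticalLatin1,
                            std::uint64_t nonIdenticalUTF16) {
    const std::uint64_t total = identical + nonIdenticalLatin1 + nonIdenticalUTF16;
    if (total == 0) {
        return 0.0;  // no calls recorded
    }
    return static_cast<double>(identical) / static_cast<double>(total);
}
```

With the numbers from this browsing session, identityShare(3592393, 177340, 809) is about 0.953 for decoding and identityShare(445947, 38846, 2335) is about 0.915 for encoding, so well over 90% of the calls in both directions were identity transfers.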

André Bargull [:anba]

Assignee

Comment 1

6 years ago

Attached file: decode-encode-counter.txt

More detailed results for a couple of sites.

André Bargull [:anba]

Assignee

Comment 2

6 years ago

Attached patch: bug-1488192.patch

Modifies Encode(...) and Decode(...) to append string ranges to the StringBuffer instead of single characters, and to leave the StringBuffer empty when no characters were decoded or encoded. In that case, the callers can return the input string unchanged.
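
The range-appending idea can be sketched outside SpiderMonkey as follows. This is only an illustration under stated assumptions, not the code from the patch: it uses std::string in place of the engine's string types, treats the input as UTF-8 bytes, and hard-codes the unescaped set of encodeURIComponent. The names isUnescaped, encodeRanges, and resultOf are hypothetical.

```cpp
#include <cstddef>
#include <string>

// ASCII characters that encodeURIComponent leaves unescaped.
inline bool isUnescaped(unsigned char c) {
    if ((c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') || (c >= '0' && c <= '9')) {
        return true;
    }
    switch (c) {
      case '-': case '_': case '.': case '!': case '~':
      case '*': case '\'': case '(': case ')':
        return true;
      default:
        return false;
    }
}

// Appends the encoded form of |input| to |sb|, which must start out empty.
// |sb| is left empty when no byte needed escaping.
inline void encodeRanges(const std::string& input, std::string& sb) {
    static const char kHex[] = "0123456789ABCDEF";
    std::size_t start = 0;   // start of the pending run of unchanged bytes
    bool modified = false;
    for (std::size_t i = 0; i < input.size(); ++i) {
        const unsigned char c = static_cast<unsigned char>(input[i]);
        if (isUnescaped(c)) {
            continue;        // extend the pending run, append nothing yet
        }
        sb.append(input, start, i - start);  // flush the unchanged range
        const char hexBuf[3] = {'%', kHex[c >> 4], kHex[c & 0xF]};
        sb.append(hexBuf, 3);                // no null terminator needed
        start = i + 1;
        modified = true;
    }
    if (modified) {
        sb.append(input, start, std::string::npos);  // trailing unchanged range
    }
}

// Caller side: reuse the input when the buffer stayed empty.
inline const std::string& resultOf(const std::string& input, const std::string& sb) {
    return sb.empty() ? input : sb;
}
```

For an input such as "abc-123" the buffer stays empty and resultOf hands back the input itself, while "a b&c" produces "a%20b%26c". In the engine the saving is the skipped string allocation; in this sketch it is only the skipped copy.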


Drive-by changes:
- Remove the unnecessary null-character terminator in |hexBuf| in the Encode() function.
- Modify [1] to call StringBuffer::append(Latin1Char) instead of StringBuffer::append(char16_t), because the former should be slightly faster (unless the compiler already figured out that the input is definitely a Latin-1 character, because |B < 128| is true).
- Correct the OOM handling in DebugState::debugDisplayURL() to check for |cx->isThrowingOutOfMemory()|. Also add an assertion for "over-recursed" exceptions. These probably don't happen when calling |EncodeURI|, but if they do (and the assertion fails), we should change the code to handle over-recursed errors the same way as OOM errors.

[1] https://searchfox.org/mozilla-central/rev/721842eed881c7fcdccb9ec0fe79e4e6d4e46604/js/src/builtin/String.cpp#3890,3895-3896

Attachment #9005999 - Flags: review?(jdemooij)

Jan de Mooij [:jandem]

Comment 3

6 years ago

Review of attachment 9005999 (bug-1488192.patch):
-----------------------------------------------------------------

Wow, great find. That eliminates a lot of string allocations.

::: js/src/builtin/String.cpp
@@ +3755,3 @@
>  {
> +    if (!sb.empty()) {
> +        str = sb.finishString();

I was wondering about the empty-input-string case, but I see finishString returns cx->names().empty if length == 0, so that will be optimized correctly :)
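
The caller pattern in the quoted hunk, including the empty-input case, can be illustrated with a simplified decoder. This is a sketch under assumptions and not the reviewed code: it works on std::string bytes, does no UTF-8 validation, ignores the reserved set that decodeURI preserves, and reports malformed escapes with a false return value instead of throwing a URIError. The names hexValue, decodeRanges, and decodeInPlace are hypothetical.

```cpp
#include <cstddef>
#include <string>
#include <utility>

// Value of a hex digit, or -1 if |c| is not a hex digit.
inline int hexValue(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

// Appends the decoded form of |input| to |sb|, which must start out empty.
// |sb| is left empty when the input contains no escape sequence.
// Returns false on a malformed escape; |sb| may then hold partial output.
inline bool decodeRanges(const std::string& input, std::string& sb) {
    std::size_t start = 0;   // start of the pending run of unchanged bytes
    bool modified = false;
    for (std::size_t i = 0; i < input.size(); ++i) {
        if (input[i] != '%') {
            continue;
        }
        if (i + 2 >= input.size()) {
            return false;    // truncated escape
        }
        const int hi = hexValue(input[i + 1]);
        const int lo = hexValue(input[i + 2]);
        if (hi < 0 || lo < 0) {
            return false;    // not two hex digits
        }
        sb.append(input, start, i - start);  // flush the unchanged range
        sb.push_back(static_cast<char>(hi * 16 + lo));
        i += 2;
        start = i + 1;
        modified = true;
    }
    if (modified) {
        sb.append(input, start, std::string::npos);
    }
    return true;
}

// Mirrors the shape of the hunk: only replace |str| when something was decoded.
inline bool decodeInPlace(std::string& str) {
    std::string sb;
    if (!decodeRanges(str, sb)) {
        return false;
    }
    if (!sb.empty()) {
        str = std::move(sb);
    }
    return true;
}
```

An empty input never appends anything, so the buffer stays empty and the original (empty) string is kept, which is the same outcome the review comment describes for finishString.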

Attachment #9005999 - Flags: review?(jdemooij) → review+

André Bargull [:anba]

Assignee

Comment 4

6 years ago

Try: https://treeherder.mozilla.org/#/jobs?repo=try&revision=64c2058bd3ad33ff36f4f32d760e417e53e099cb

Keywords: checkin-needed

Pulsebot

Comment 5

6 years ago

Pushed by btara@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/c3b29fcce16f
Return input if no characters were modified in decode/encodeURI. r=jandem

Keywords: checkin-needed

Sebastian Hengst [:aryx] (needinfo me if it's about an intermittent or backout)

Comment 6

6 years ago

bugherder

https://hg.mozilla.org/mozilla-central/rev/c3b29fcce16f

Status: ASSIGNED → RESOLVED

Closed: 6 years ago

status-firefox64: --- → fixed

Resolution: --- → FIXED

Target Milestone: --- → mozilla64

Ryan VanderMeulen [:RyanVM]

Updated

6 years ago

status-firefox63: affected → wontfix

Categories

(Core :: JavaScript: Standard Library, defect)
