Closed Bug 1454460 Opened 7 years ago Closed 7 years ago

Simplify BOM handling in StreamLoader

Tracking

()

Status:

RESOLVED FIXED

Milestone:

mozilla61

Tracking Flags:

Tracking

Status

firefox61

---

fixed

People

(Reporter: bholley, Assigned: bholley)

References

Details

Attachments

(2 files)

Buffer the potentially-BOM-related bytes separately and handle them on the fly. v1 7 years ago Bobby Holley (:bholley) 12.44 KB, patch	bzbarsky : review+	Details \| Diff \| Splinter Review
Part 2 - Pass a bonafide nsACString to Servo. v2 7 years ago Bobby Holley (:bholley) 15.29 KB, patch	bzbarsky : review+	Details \| Diff \| Splinter Review

Bobby Holley (:bholley)

Assignee

Description

•

7 years ago

See bug 1454030 comment 6.

Bobby Holley (:bholley)

Assignee

Comment 1

•

7 years ago

https://treeherder.mozilla.org/#/jobs?repo=try&revision=b0f2599a8ed0b5e091fb6befc870bad51639b2e0

Bobby Holley (:bholley)

Assignee

Comment 2

•

7 years ago

https://treeherder.mozilla.org/#/jobs?repo=try&revision=7df5f56fdc336064ecd71a35dbcfb60be6326e69

Bobby Holley (:bholley)

Assignee

Comment 3

•

7 years ago

https://treeherder.mozilla.org/#/jobs?repo=try&revision=4938d68edb7e4686dfb9d98dd4b3302aecf6c417

Bobby Holley (:bholley)

Assignee

Comment 4

•

7 years ago

Attached patch Buffer the potentially-BOM-related bytes separately and handle them on the fly. v1 — Details — Splinter Review

MozReview-Commit-ID: 5zrKyadBppO

Attachment #8968328 - Flags: review?(bzbarsky)

Bobby Holley (:bholley)

Assignee

Comment 5

•

7 years ago

Attached patch Part 2 - Pass a bonafide nsACString to Servo. v2 — Details — Splinter Review

This is the rebased patch from the other bug. MozReview-Commit-ID: HCop9h2abZU

Attachment #8968345 - Flags: review?(bzbarsky)

Bobby Holley (:bholley)

Assignee

Updated

•

7 years ago

Blocks: 1346988
No longer blocks: 1454030

Boris Zbarsky [:bzbarsky]

Comment 6

•

7 years ago

Comment on attachment 8968328 [details] [diff] [review] Buffer the potentially-BOM-related bytes separately and handle them on the fly. v1 >+ * For uninteresting reasons, the caller may have the data split into two Why not just have CSS use the "streaming mode" SRICheckDataVerifier instead? That's designed to handle multiple chunks, etc... >+++ b/layout/style/SheetLoadData.h >+ const nsACString& aPrefix, >+ const nsACString& aRest, This needs at least documentation about how aPrefix differs from aRest. >+ if (encoding == UTF_8_ENCODING && Encoding::UTF8ValidUpTo(bytes) == bytes.Length()) { This is losing an optimization the old code used to have: when we had a UTF8 encoding (at least via the BOM) but not all the data was valid, it would pass in the valid length to DecodeWithoutBOMHandling. It seems like it should be possible to preserve that optimization. Maybe something like: size_t validated = 0; if (encoding == UTF_8_ENCODING) { validated = Encoding::UTF8ValidUpTo(bytes); } if (validated == bytes.Length()) { // Either this is UTF-8 and all valid, or it's not UTF-8 but is an // empty string. This assumes that an empty string in any encoding // decodes to empty string, which seems like a plausible assumption. utf8String.Assign(bytes); } else { rv = encoding->DecodeWithoutBOMHandling(bytes, utf8String, validated); } or something. >+++ b/layout/style/StreamLoader.h > nsCString mBytes; Please add a comment about how this never contains the BOM when there is a BOM involved, just the actual text bytes. >+ nsAutoCStringN<3> mFirstThreeBytes; I'm torn between that and mBOM or something. By the time we get to the end it has the latter meaning. During load it can be the first 3, or 2, or 1 or 0 bytes... Either way, we should document what's stored in this string at what stage of the load. r=me

Attachment #8968328 - Flags: review?(bzbarsky) → review+

Boris Zbarsky [:bzbarsky]

Comment 7

•

7 years ago

Comment on attachment 8968345 [details] [diff] [review] Part 2 - Pass a bonafide nsACString to Servo. v2 >+ if (aUTF8.IsEmpty()) { Worth testing && !aUTF16.IsEmpty()? Or should we just not worry too much about the empty-data case and the extra "copy" we end up doing then? r=me

Attachment #8968345 - Flags: review?(bzbarsky) → review+

Pulsebot

Comment 8

•

7 years ago

Pushed by bholley@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/c92f6b8e882b Buffer the potentially-BOM-related bytes separately and handle them on the fly. r=bz https://hg.mozilla.org/integration/autoland/rev/985679bb1928 Pass a bonafide nsACString to Servo. r=bz https://hg.mozilla.org/integration/autoland/rev/a368c8d6676b Hoist SRI helper into Loader.cpp. r=bz

Bobby Holley (:bholley)

Assignee

Comment 9

•

7 years ago

(In reply to Boris Zbarsky [:bz] (no decent commit message means r-) from comment #6) > Comment on attachment 8968328 [details] [diff] [review] > Buffer the potentially-BOM-related bytes separately and handle them on the > fly. v1 > > >+ * For uninteresting reasons, the caller may have the data split into two > > Why not just have CSS use the "streaming mode" SRICheckDataVerifier instead? > That's designed to handle multiple chunks, etc... Per discussion on IRC, I'll add a third patch to move the helper to Loader.cpp. > > >+++ b/layout/style/SheetLoadData.h > >+ const nsACString& aPrefix, > >+ const nsACString& aRest, > > This needs at least documentation about how aPrefix differs from aRest. Done. I also renamed to aBytes1 and aBytes2, which I've decided is clearer. > > >+ if (encoding == UTF_8_ENCODING && Encoding::UTF8ValidUpTo(bytes) == bytes.Length()) { > > This is losing an optimization the old code used to have: when we had a UTF8 > encoding (at least via the BOM) but not all the data was valid, it would > pass in the valid length to DecodeWithoutBOMHandling. It seems like it > should be possible to preserve that optimization. Maybe something like: ok. > > >+++ b/layout/style/StreamLoader.h > > nsCString mBytes; > > Please add a comment about how this never contains the BOM when there is a > BOM involved, just the actual text bytes. > > >+ nsAutoCStringN<3> mFirstThreeBytes; > > I'm torn between that and mBOM or something. By the time we get to the end > it has the latter meaning. During load it can be the first 3, or 2, or 1 or > 0 bytes... I'll make it mBOMBytes. > > Either way, we should document what's stored in this string at what stage of > the load. Done. (In reply to Boris Zbarsky [:bz] (no decent commit message means r-) from comment #7) > Comment on attachment 8968345 [details] [diff] [review] > Part 2 - Pass a bonafide nsACString to Servo. v2 > > >+ if (aUTF8.IsEmpty()) { > > Worth testing && !aUTF16.IsEmpty()? Or should we just not worry too much > about the empty-data case and the extra "copy" we end up doing then? I'll just switch it to !aUTF16.IsEmpty() and MOZ_ASSERT(aUTF16.IsEmpty() || aUTF8.IsEmpty());

Cristina Coroiu [:ccoroiu]

Comment 10

•

7 years ago

bugherder

https://hg.mozilla.org/mozilla-central/rev/c92f6b8e882b https://hg.mozilla.org/mozilla-central/rev/985679bb1928 https://hg.mozilla.org/mozilla-central/rev/a368c8d6676b

Status: NEW → RESOLVED

Closed: 7 years ago

status-firefox61: --- → fixed

Resolution: --- → FIXED

Target Milestone: --- → mozilla61

You need to log in before you can comment on or make changes to this bug.

Bugzilla

Simplify BOM handling in StreamLoader

Categories

(Core :: CSS Parsing and Computation, enhancement)

Tracking

()

People

(Reporter: bholley, Assigned: bholley)

References

Details

Crash Data

Security

(public)

User Story

Attachments

(2 files)

Description

Comment 1

Comment 2

Comment 3

Comment 4

Comment 5

Updated

Comment 6

Comment 7

Comment 8

Comment 9

Comment 10

Attachment

General

Description

File Name

Content Type