Open Bug 1248745 Opened 8 years ago Updated 2 years ago

CSS errors in data URI encoded documents can be extremely large

Categories

(Core :: CSS Parsing and Computation, defect)

defect

Tracking

()

Tracking Status
firefox47 --- affected

People

(Reporter: mccr8, Unassigned)

References

Details

(Whiteboard: [MemShrink:P2])

This is from the DMD output from the page in bug 1248261:

Unreported {
  65 blocks in heap block record 1 of 1,964
  377,663,488 bytes (377,542,270 requested / 121,218 slop)
  Individual block sizes: 6,070,272 x 2; 6,008,832 x 2; 5,926,912 x 2; 5,910,528 x 4; 5,898,240; 5,881,856 x 4; 5,877,760 x 3; 5,873,664 x 2; 5,861,376 x 2; 5,857,280 x 2; 5,849,088 x 3; 5,812,224 x 2; 5,808,128; 5,799,936 x 2; 5,787,648 x 2; 5,754,880 x 4; 5,742,592 x 2; 5,738,496 x 2; 5,734,400 x 4; 5,730,304 x 4; 5,726,208 x 3; 5,722,112; 5,718,016 x 4; 5,713,920 x 7
  18.02% of the heap (18.02% cumulative)
  24.42% of unreported (24.42% cumulative)
  Allocated at {
    #01: nsStringBuffer::Alloc (c:\m-c\xpcom\string\nssubstring.cpp:218)
    #02: nsAString_internal::MutatePrep (c:\m-c\xpcom\string\nstsubstring.cpp:133)
    #03: nsAString_internal::SetCapacity (c:\m-c\xpcom\string\nstsubstring.cpp:665)
    #04: AppendUTF8toUTF16 (c:\m-c\xpcom\string\nsreadableutils.cpp:234)
    #05: AppendUTF8toUTF16 (c:\m-c\xpcom\string\nsreadableutils.cpp:213)
    #06: `anonymous namespace'::ShortTermURISpecCache::GetSpec (c:\m-c\layout\style\errorreporter.cpp:40)
    #07: mozilla::css::ErrorReporter::OutputError (c:\m-c\layout\style\errorreporter.cpp:192)
    #08: `anonymous namespace'::CSSParserImpl::ParseDeclaration (c:\m-c\layout\style\nscssparser.cpp:7211)
    #09: `anonymous namespace'::CSSParserImpl::ParseDeclarationBlock (c:\m-c\layout\style\nscssparser.cpp:6526)
    #10: `anonymous namespace'::CSSParserImpl::ParseRuleSet (c:\m-c\layout\style\nscssparser.cpp:5281)
    #11: `anonymous namespace'::CSSParserImpl::ParseSheet (c:\m-c\layout\style\nscssparser.cpp:1698)

It looks like this line in ErrorReporter.cpp:
  mFileName = sSpecCache->GetSpec(mURI);

...is allocating strings that are multiple megabytes long. It would be good if we could truncate those.
Whiteboard: [MemShrink]
It would be nice if we had a "capped string" for error messages that would just silently cap at say 10kb.
I'll bet the URI in question here is a data-URI-encoded SVG file, like the 2MB one that I attached on bug 1248261.

(In reply to Andrew McCreight [:mccr8] from comment #0)
> It looks like this line in ErrorReporter.cpp:
>   mFileName = sSpecCache->GetSpec(mURI);
> 
> ...is allocating strings that are multiple megabytes long. It would be good
> if we could truncate those.

GetSpec() there is implemented like so:
> 33   nsString const& GetSpec(nsIURI* aURI) {
> 34     if (mURI != aURI) {
> 35       mURI = aURI;
> 36 
> 37       nsAutoCString cSpec;
> 38       mURI->GetSpec(cSpec);
> 39       CopyUTF8toUTF16(cSpec, mSpec);
> 40     }
> 41     return mSpec;

The first time this is called for a given URI, it'll first have to allocate temporary space for a CString that's large enough to fit the URI, and then it'll have to allocate a UTF-16 string (in the outparam) that's large enough to fit the URI.

(In reply to Andrew McCreight [:mccr8] from comment #1)
> It would be nice if we had a "capped string" for error messages that would
> just silently cap at say 10kb.

Indeed! That would be a useful abstraction.  In this case, we could do "good enough" by truncating cSpec before we do the CopyUTF8toUTF16 operation, I think.
(Narrowing the scope of the bug summary slightly.)
Summary: CSS errors can be extremely large → CSS errors in data URI encoded documents can be extremely large
Thanks!

This is also extra fun because ShortTermURISpecCache::GetSpec does this:
      nsAutoCString cSpec;
      mURI->GetSpec(cSpec);
      CopyUTF8toUTF16(cSpec, mSpec);

So we do GetSpec(), which creates a new huge string in cSpec, then copy that string again, then destroys cSpec. So there's likely to be a ton of memory churn here. If you had some kind of magical capped length string, you could use one version for cSpec and one for mSpec and avoid allocating as many strings. Of course, the page is still using gigantic data URIs, so there's going to be problems either way.
Oops, you already posted about that.
(In reply to Andrew McCreight [:mccr8] from comment #4)
> So we do GetSpec(), which creates a new huge string in cSpec, then copy that
> string again, then destroys cSpec.

If nsAutoCString can benefit from string-buffer-sharing when not using its own stack-buffer (not sure, but I suspect it can), then we could do away with the copying-into-cSpec if we simply made nsSimpleURI::GetSpec() smarter, as discussed in bug 803709.

Then we could truncate cSpec (just imposing a shorter length on it, while still letting it share the same source buffer), to limit the amount that we'll have to copy in our CopyUTF8toUTF16 operation.  With all this, we'd only end up doing one copy here (for the type conversion), and that copy operation would be bounded by whatever length limit we want to impose.
See Also: → 803709
Whiteboard: [MemShrink] → [MemShrink:P2]

The C++ CSSParserImpl::ParseDeclaration code referenced in comment 0 doesn't exist anymore. (It's been replaced via stylo.)

Also: this particular SVG document in bug 1248261 doesn't seem to report any CSS parse errors anymore, which is presumably due to part of the stylo switchover. It's possible that this is still a problem for documents that do report CSS errors, though. I tried a quick local testcase with a 2000+ character data URI with color:1px on some element (which triggers a CSS parse error), and it seemed that the full URI was included (albeit visually ellipsized) in the console errors that resulted.

Severity: normal → S3
You need to log in before you can comment on or make changes to this bug.