Closed Bug 1083913 Opened 10 years ago Closed 10 years ago

Switch statement too large internal error

Categories

(Core :: JavaScript Engine, defect)

33 Branch
defect
Not set
major

Tracking

()

RESOLVED FIXED
mozilla36
Tracking Status
firefox33 + wontfix
firefox34 + wontfix
firefox35 + wontfix
firefox36 + fixed

People

(Reporter: pavel.kakolin, Assigned: jimb)

References

Details

(Keywords: dev-doc-complete, regression, site-compat, Whiteboard: [Workaround in comment 55])

Attachments

(7 files, 7 obsolete files)

12.73 KB, patch
shu
: review+
Details | Diff | Splinter Review
11.05 KB, patch
shu
: review+
Details | Diff | Splinter Review
4.31 KB, patch
shu
: review+
Details | Diff | Splinter Review
11.36 KB, patch
shu
: review+
Details | Diff | Splinter Review
11.71 KB, patch
Details | Diff | Splinter Review
12.62 KB, patch
shu
: review+
Details | Diff | Splinter Review
5.04 KB, patch
shu
: review+
Details | Diff | Splinter Review
We develop complex GWT WebGL application which includes several megabytes of 3D models and animation data. At the moment the most complex model, "Elderly Woman", is slightly less than 10mb.

We experienced no problems with "Elderly Woman" model in previous versions of FireFox browser, but it is not possible to use it after FireFox 33 release.

1. Go to https://app.coachingspaces.com/#Space:BANKdjljXgW.D0pnNrrUrnE
2. Add Elderly Woman to stage

Expected result:
Elderly Woman is added to stage.

Actual result:
Elderly Woman is not added to stage.
"Switch statement too large" internal javascript error is thrown, although there are no switch statements in code in question.

At the moment we're working hard to create a simple testcase, but haven't succeeded yet because it seems that application should be complex enough for this error to be thrown.
Ooops, the right link to reproduce the problem is https://app.coachingspaces.com/localDemo.html?space=fd151fd61a23
It looks like we land in SetSrcNoteOffset with offset == -315513, which then does:

7100        if (size_t(offset) > SN_MAX_OFFSET) {
7101            ReportStatementTooLarge(bce->parser->tokenStream, bce->topStmt);
7102            return false;
7103        }

I expect a negative offset is not so much expected here....

How we get here... We seem to be doing:

  ptrdiff_t colspan = ptrdiff_t(columnIndex) - ptrdiff_t(bce->current->lastColumn);

where columnIndex == 68668 and bce->current->lastColumn == 8772789.  Then we detect taht colspan < 0, and add SN_COLSPAN_DOMAIN, which is 1 << 23.  But unfortunately colspan is < -(1<<23), so we still end up with a negative number, which we then pass in like so:

  NewSrcNote2(cx, bce, SRC_COLSPAN, colspan)

which is how we end up throwing.

So the point is, to get this codepath to throw in this way you have to have lines that are over 1<<23 characters long in your script source, which is why you're not having much luck with simple testcases.

That said, this code hasn't obviously changed much since bug 916564 at least based on a quick look.
I've just checked couple of previous ff releases from ftp://ftp.mozilla.org/pub/firefox/releases/:
that is not reproducible on 32.0.3 build for os x and is reproducible on 33.0b1
Right; I totally believe that this first appeared in 33.  I'm just not sure what caused it yet.

If you have the time and inclination to use http://mozilla.github.io/mozregression/ to pin this down to a one-day checkin range, that would be really helpful.  I can totally understand if you're not particularly interested in doing that, though.  ;)
Regression window(m-i)
Good:
https://hg.mozilla.org/integration/mozilla-inbound/rev/31bf8c4b3e19
Mozilla/5.0 (Windows NT 6.1; WOW64; rv:33.0) Gecko/20100101 Firefox/33.0 ID:20140623084043
Bad:
https://hg.mozilla.org/integration/mozilla-inbound/rev/83d14fd3d739
Mozilla/5.0 (Windows NT 6.1; WOW64; rv:33.0) Gecko/20100101 Firefox/33.0 ID:20140623084347
Pushlog:
https://hg.mozilla.org/integration/mozilla-inbound/pushloghtml?fromchange=31bf8c4b3e19&tochange=83d14fd3d739

Suspect:
80e4d6b3723a	Nick Fitzgerald — Bug 1000967 - Add source notes for |new| expression and function calls to improve source maps and debugging. r=ejpbruel
Blocks: 1000967
[Tracking Requested - why for this release]: Breaks web pages

Alice0775, Loic, thank you!
Status: UNCONFIRMED → NEW
Ever confirmed: true
Flags: needinfo?(nfitzgerald)
Flags: needinfo?(ejpbruel)
I reported a similar issue (https://bugzilla.mozilla.org/show_bug.cgi?id=997730) with our app. It looks like it was solved, but this issue tells me it´s still a problem. This is really a pity, because our app runs only on Google Chrome so far. I had to remove the "We suggest Firefox" from our Website because of that.

When will someone start to fix it?
I'm going to look into this Soon(tm). Lots on my plate at the moment.
(In reply to Pavel Kakolin from comment #1)
> Ooops, the right link to reproduce the problem is
> https://app.coachingspaces.com/localDemo.html?space=fd151fd61a23

I'm, trying to repro with this test case and the elderly woman isn't added to the stage but I'm also not getting the "switch statement too large error" either. Instead, I'm seeing the following messages in the console after load, before even trying to add the elderly woman to the stage:

------------8<------------8<----------------8<------------------------

The character encoding of the HTML document was not declared. The document will render with garbled text in some browser configurations if the document contains characters from outside the US-ASCII range. The character encoding of the page must be declared in the document or in the transfer protocol. localDemo.html
Use of getPreventDefault() is deprecated.  Use defaultPrevented instead. 000-jquery-1.9.1.min.js:2
syntax error fileAccess:1
"WGLGraphics information:" localDemo.html:3
"	extensions: ANGLE_instanced_arrays EXT_blend_minmax EXT_frag_depth EXT_sRGB EXT_texture_filter_anisotropic OES_element_index_uint OES_standard_derivatives OES_texture_float OES_texture_float_linear OES_texture_half_float OES_texture_half_float_linear OES_vertex_array_object WEBGL_compressed_texture_s3tc WEBGL_depth_texture WEBGL_draw_buffers WEBGL_lose_context MOZ_WEBGL_lose_context MOZ_WEBGL_compressed_texture_s3tc MOZ_WEBGL_depth_texture " localDemo.html:3
"	max texture size: 4096" localDemo.html:3
"	max vertex shader uniform vectors: 1024" localDemo.html:3
"	anisotropy max = 16" localDemo.html:3
"Render target 3266x2134, devicePixelRatio = 2" localDemo.html:3
"CullFaceMode = 405" localDemo.html:3
"Canvas element: 3266 x 2134" localDemo.html:3
"RtCanvas: setRtSize 3266, 1312" localDemo.html:3
syntax error stat-mongo:1
"Statistics is sent [DEMO_SPACE_EVENT]" localDemo.html:3
syntax error ipinfo:1

------------8<------------8<----------------8<------------------------
Flags: needinfo?(nfitzgerald)
(In reply to Nick Fitzgerald [:fitzgen] from comment #10)

Thank you for your feedback on that issue.

You are right, in production we hide exceptions from our clients. You can force them to be shown be appending &errors=true to the URL, eg https://app.coachingspaces.com/localDemo.html?space=fd151fd61a23&errors=true

With that parameter added, after you try to add elderly woman to stage, popup dialog will appear with a list of caught exceptions on its left side.

You can click on the "com.google.gwt.core.client.JavaScriptException: (InternalError)" in that list and see a stacktrace and some other details in the right part of the dialog. I guess it won't be much helpful though.
> but I'm also not getting the "switch statement too large error"

Try just breakpointing on where it would be reported to deal with the "site catches the exception" issue...
Release managers want assignees for bugs blocking FF 34, so I'll assign this to you Nick. Feel free to reassign or rope in others.
Assignee: nobody → nfitzgerald
Status: NEW → ASSIGNED
Attached patch magic.patch (obsolete) — Splinter Review
Great fix, thanks to jimb and shu.
Flags: needinfo?(ejpbruel)
Assignee: nfitzgerald → jimb
Should fix the comments too, right?
Comment fixes, and at least some abstraction to get a handle on the insanity.
Attachment #8509938 - Attachment is obsolete: true
Attachment #8509981 - Flags: review?(shu)
(The above *does* also contain the actual fix.)
It does not contain a test case. :(
Comment on attachment 8509938 [details] [diff] [review]
magic.patch

Review of attachment 8509938 [details] [diff] [review]:
-----------------------------------------------------------------

Are SN_COLSPAN_DOMAIN and SN_MAX_OFFSET correlate ? Should we fix SN_MAX_OFFSET as well ?
Also 8Mb JS source seems to be reasonable, but is 2Gigabytes source reasonable ?
(In reply to Kirill from comment #19)
> Are SN_COLSPAN_DOMAIN and SN_MAX_OFFSET correlate ? Should we fix
> SN_MAX_OFFSET as well ?

Yes! But they had become un-correlated. That is the fix.

> Also 8Mb JS source seems to be reasonable, but is 2Gigabytes source
> reasonable ?

I think it is --- but expanding the size beyond 2GiB will require changing the source notes encoding, and that's a larger project, and a riskier patch.
Now, with tests, and a clarified relationship between offset sizes and column span sizes.

Try:
https://tbpl.mozilla.org/?tree=Try&rev=088251f2522e
Attachment #8509981 - Attachment is obsolete: true
Attachment #8509981 - Flags: review?(shu)
Attachment #8510453 - Flags: review?(shu)
It seems like the test times out on Windows XP debug builds. The timeout is 150s. A debug build runs in under 10s on my Linux box. Why would Windows XP be that much slower?
Windows XP is a 32-bit build; would a 128MiB source string be a problem there?
Comment on attachment 8510453 [details] [diff] [review]
Column spans are restricted to 31-bit, not 24-bit, signed values.

Review of attachment 8510453 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/frontend/BytecodeEmitter.cpp
@@ +427,5 @@
>      ptrdiff_t colspan = ptrdiff_t(columnIndex) - ptrdiff_t(bce->current->lastColumn);
>      if (colspan != 0) {
> +        // If the column span is so large that we can't store it, then just
> +        // discard this information because column information would most
> +        // likely be useless anyway once the column numbers are ~4000000.

I do wonder how this 4 million number was arrived at.

@@ -437,2 @@
>              return true;
> -        }

Saving those branches!

::: js/src/frontend/SourceNotes.h
@@ +168,5 @@
> +inline ptrdiff_t
> +SN_COLSPAN_TO_OFFSET(ptrdiff_t colspan) {
> +    MOZ_ASSERT(SN_REPRESENTABLE_COLSPAN(colspan));
> +    // Convert a 31-bit two's complement value into a 31-bit unsigned value.
> +    return (colspan + (1 << 30)) ^ (1 << 30);

Would it be clearer to write (SN_OFFSET_BITS - 1) here, as elsewhere?

@@ +175,5 @@
> +inline ptrdiff_t
> +SN_OFFSET_TO_COLSPAN(ptrdiff_t offset) {
> +    MOZ_ASSERT(SN_REPRESENTABLE_OFFSET(offset));
> +    // Convert a 31-bit unsigned value into a 31-bit two's complement value.
> +    return (offset ^ (1 << 30)) - (1 << 30);

Ditto.
Attachment #8510453 - Flags: review?(shu) → review+
(In reply to Shu-yu Guo [:shu] from comment #24)
> > +        // If the column span is so large that we can't store it, then just
> > +        // discard this information because column information would most
> > +        // likely be useless anyway once the column numbers are ~4000000.
> 
> I do wonder how this 4 million number was arrived at.

Seems bogus. I rewrote the comment for the modern era:

        // If the column span is so large that we can't store it, then just
        // discard this information. This can happen with minimized or otherwise
        // machine-generated code. Even gigantic column numbers are still
        // valuable if you have a source map to relate them to something real;
        // but it's better to fail soft here.

> Would it be clearer to write (SN_OFFSET_BITS - 1) here, as elsewhere?

Definitely. I'd actually already changed this after comment 19.
We can use this to test the fix without actually generating 128MiB strings of source code.
Attachment #8510453 - Attachment is obsolete: true
Attachment #8511368 - Flags: review?(shu)
Revised per comments, and to use new testing facility. Carrying forward r=shu, but further review welcome.
All I wanted was to have a testing facility that reported errors reasonably, so the fuzzers wouldn't hate it...
Attachment #8512290 - Flags: review?(shu)
... and the next thing I know I'm three patches deep ...
Attachment #8511368 - Attachment is obsolete: true
Attachment #8511368 - Flags: review?(shu)
Attachment #8512292 - Flags: review?(shu)
... and writing comments about undefined behavior and wrap-around pointer arithmetic.
Attachment #8511369 - Attachment is obsolete: true
Attachment #8512293 - Flags: review?(shu)
Attachment #8512290 - Flags: review?(shu) → review+
Comment on attachment 8512292 [details] [diff] [review]
Add a 'columnNumber' parameter to the JS shell's 'evaluate' and 'offThreadCompileScript' functions.

Review of attachment 8512292 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/frontend/Parser.cpp
@@ +526,5 @@
>      initCalled = true;
>  #endif
>  
> +    if (!tokenStream.init())
> +        return false;

This should be before initCalled = true for maximum safety, I think.

::: js/src/frontend/TokenStream.h
@@ +256,5 @@
>                  const char16_t *base, size_t length, StrictModeGetter *smg);
>  
>      ~TokenStream();
>  
> +    bool init();

Hm, this doesn't really initialize any state (nor does Parser::init). Perhaps rename to checkOffsetDomains (to make the game more general than checkColumnDomain, in anticipation of future checks) or something?
Attachment #8512292 - Flags: review?(shu) → review+
Comment on attachment 8512293 [details] [diff] [review]
Column spans are restricted to 30-bit, not 24-bit, signed values.

Review of attachment 8512293 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/frontend/TokenStream.cpp
@@ +319,5 @@
> +    // could yield userbuf.base values that point into the source buffer even
> +    // after being biased, such that the column number would suddenly wrap
> +    // around to zero in the middle of the source code. To avoid this, we
> +    // restrict column numbers on all platforms to UINTPTR_MAX / 4 on 32-bit
> +    // platforms in TokenStream::init.

This hack is all kinds of crazy.

In addition to what we talked about on IRC (that this comment suggest int31s, not int30s), isn't wrap around dependent on how low the base pointer is in memory? Do all our pages start at addresses >= UINTPTR_MAX / 4 or something?
Comment on attachment 8512293 [details] [diff] [review]
Column spans are restricted to 30-bit, not 24-bit, signed values.

Review of attachment 8512293 [details] [diff] [review]:
-----------------------------------------------------------------

I think TokenStream should always initialize linebase = base, store options().column explicitly, and use it to offset the pointer arithmetic in getColumn().

::: js/src/frontend/TokenStream.cpp
@@ +319,5 @@
> +    // could yield userbuf.base values that point into the source buffer even
> +    // after being biased, such that the column number would suddenly wrap
> +    // around to zero in the middle of the source code. To avoid this, we
> +    // restrict column numbers on all platforms to UINTPTR_MAX / 4 on 32-bit
> +    // platforms in TokenStream::init.

This hack is all kinds of crazy.

In addition to what we talked about on IRC (that this comment suggest int31s, not int30s), isn't wrap around dependent on how low the base pointer is in memory? Do all our pages start at addresses >= UINTPTR_MAX / 4 or something?
Oops, didn't mean to post the same comment twice.
(In reply to Shu-yu Guo [:shu] from comment #32)
> Comment on attachment 8512292 [details] [diff] [review]
> Add a 'columnNumber' parameter to the JS shell's 'evaluate' and
> 'offThreadCompileScript' functions.
> 
> Review of attachment 8512292 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> ::: js/src/frontend/Parser.cpp
> @@ +526,5 @@
> >      initCalled = true;
> >  #endif
> >  
> > +    if (!tokenStream.init())
> > +        return false;
> 
> This should be before initCalled = true for maximum safety, I think.

Actually, since we check it in Parser's destructor, it needs to be set even if we throw here.


> 
> ::: js/src/frontend/TokenStream.h
> @@ +256,5 @@
> >                  const char16_t *base, size_t length, StrictModeGetter *smg);
> >  
> >      ~TokenStream();
> >  
> > +    bool init();
> 
> Hm, this doesn't really initialize any state (nor does Parser::init).
> Perhaps rename to checkOffsetDomains (to make the game more general than
> checkColumnDomain, in anticipation of future checks) or something?

I've renamed them to checkOptions.
Some confusions that we cleared up:

 - The domain of the column span as used in encoding source notes differs from the domain of the column offset in the TokenStream.
 - Pointer arithmetic itself is signed, even though for something like a TokenStream, we never get negative results.
 - The linebase hack is not worth the investment to remove in favor of something clearer.

I think the improved comment would just need to make clear up front that the COLSPAN representation is used by both TokenStream and source note encoding for simplicity, despite different invariants:

 (1) Source notes' colspans are signed and have the capacity to store 31 bits.

 (2) TokenStream's column offset is always conceptually unsigned, *but* it is computed via pointer arithmetic, the type of which is always signed. Plus, due to the way the linebase pointer is computed and that our character stream is on char16_t instead of char, we need int30s. This, of course, assuming that our source buffer isn't so big as to overflow this.

To use the same encoding then means taking the min of the two use cases' requirements, getting us int30s.
I am going to skip this patch for 33.1 (too big, too late, does not seem to affect too many people).

However, we could take it in 34.
I'm going to restart this with some cleanup patches. This one should cause no visible changes, but replaces the TokenBuf::base accessor with a bounds-checked alternative, and isolates all uses of base_ to within TokenBuf itself, in preparation for changing the meaning of base_ and avoiding the pointer-biasing hack altogether.
Attachment #8516483 - Flags: review?(shu)
This removes the pointer biasing hack from TokenBuf, replacing it with an explicit initial offset. At this point, TokenStream::linebase still uses the hack.
Attachment #8516484 - Flags: review?(shu)
Carrying forward shu's earlier r+, but re-review is always welcome.

This changes the initial column range check to use a limit that is independent of the range of our colspan representation, as those are two separate concerns. The tests are slightly different, too.
This should make a lot more sense than the prior version, as the other bizarre constraints on column numbers have been removed.
Attachment #8512292 - Attachment is obsolete: true
Attachment #8512293 - Attachment is obsolete: true
Attachment #8512293 - Flags: review?(shu)
Attachment #8517036 - Flags: review?(shu)
Comment on attachment 8516483 [details] [diff] [review]
Replace TokenBuf::base and TokenStream::rawBase with better abstractions.

Review of attachment 8516483 [details] [diff] [review]:
-----------------------------------------------------------------

Beautiful
Attachment #8516483 - Flags: review?(shu) → review+
Attachment #8516484 - Flags: review?(shu) → review+
Attachment #8516873 - Flags: review?(shu) → review+
Comment on attachment 8517036 [details] [diff] [review]
Column spans are restricted to 31-bit, not 24-bit, signed values.

Review of attachment 8517036 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/frontend/SourceNotes.h
@@ +168,5 @@
> +
> +inline ptrdiff_t
> +SN_OFFSET_TO_COLSPAN(ptrdiff_t offset) {
> +    // There should be no bits set outside the field we're going to sign-extend.
> +    MOZ_ASSERT(!(offset & ~((1U << SN_OFFSET_BITS) - 1)));

This is true because js_GetSrcNoteOffset strips the tag bit, right?
Attachment #8517036 - Flags: review?(shu) → review+
(In reply to Shu-yu Guo [:shu] from comment #47)
> > +    // There should be no bits set outside the field we're going to sign-extend.
> > +    MOZ_ASSERT(!(offset & ~((1U << SN_OFFSET_BITS) - 1)));
> 
> This is true because js_GetSrcNoteOffset strips the tag bit, right?

Yes.

Thanks for the reviews!
(In reply to Jim Blandy :jimb from comment #46)
> Try push for the whole stack:
> https://treeherder.mozilla.org/ui/#/jobs?repo=try&revision=2ee82bf3ffcc

That try push was unhappy for minor reasons:
- false -> ptr conversion
- unconditional use of offThreadCompileScript in tests

Fresh try:
https://treeherder.mozilla.org/ui/#/jobs?repo=try&revision=be169fdade27
That try push showed up a regression in js/src/tests/js1_5/Regress/regress-115436.js. This is not S-S, and will be quick to fix.

Details: we try to compile a lazy function:

function x(y,y)
{
  return 3;
}

The lazy function compiler retrieves the source code, and passes to the compiler the substring starting with the opening parenthesis of the argument list, and ending after the body's closing curly brace. This is all correct; that's what js::frontend::CompileLazyFunction expects. But do note that this substring starts in the middle of the line, at column 10.

The compiler notices the duplicate argument names, and tries to report an error. The code that produces the error message tries to gather the text of the source line containing the error. The true start of the line, which we track so that we can produce accurate column numbers, is off the beginning of the buffer, but the source line gathering in code TokenStream::reportCompileErrorNumberVA tries to fetch it anyway.

We used to just let it grab all the text. This turns out to have been safe, because we were always passing CompileLazyFunction a pointer into the midst of the full decompressed source, so the text was actually there. Now we have tighter checks on random access to source text characters, and an assertion fails.

I don't think we can continue to let it fetch the text: CompileLazyFunction really has no right accessing text outside the buffer it was passed. Perhaps it could know that it was parsing a portion of a larger enclosing string, but that's a bigger change.

I think the fix for now is to simply clip the text included in the error message to the text that CompileLazyFunction actually has a right to assume is there.
Attachment #8517656 - Flags: review?(shu) → review+
Given the risk associated with taking this somewhat largish change so late in the beta cycle, I'm marking this as wontfix for 34. jimb will post a workaround for those of you with impacted sites. We can consider this for 35, which has more time left in the cycle. I would want to see this land on 35 while it is still on Aurora.
Since this bug is triggered by very large (multi-million) column numbers, it should be easy to work around it by placing newlines in the code periodically. For example, even minimized source will often contain the string "{var ". Replacing that with "{var\n" (being sure to avoid string and regexp literals!) should effectively avoid triggering the bug.
Whiteboard: Workaround in comment 55
Whiteboard: Workaround in comment 55 → [Workaround in comment 55]
This should have been marked wontfix for 35 when we moved to Beta.  At this point we'll let this ride and ship with FF36.
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: