Closed Bug 1746374 Opened 3 years ago Closed 1 year ago

Stack trace column numbers differ from Safari and Chrome if line contains Unicode characters

Categories

(Core :: JavaScript Engine, defect, P3)

Firefox 95
Desktop
All
defect

Tracking

()

RESOLVED FIXED
118 Branch
Tracking Status
firefox-esr102 --- wontfix
firefox-esr115 --- wontfix
firefox95 --- wontfix
firefox96 --- wontfix
firefox97 --- wontfix
firefox116 --- wontfix
firefox117 --- wontfix
firefox118 --- fixed

People

(Reporter: robertknight, Assigned: arai)

References

(Blocks 2 open bugs)

Details

Attachments

(2 files)

Attached file emoji-test.html

User Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/15.1 Safari/605.1.15

Steps to reproduce:

Load the attached file in Firefox and Chrome and look at the exception logged in devtools.

Actual results:

Firefox reports the error at line 6, column 28. Chrome reports the issue as being at line 6, column 40.

The difference appears to be due to Chrome reporting column numbers in terms of code units, whereas Firefox reports code points. When a source line contains a significant number of Unicode characters requiring multiple UTF-16 code units to represent, the resulting column numbers can differ significantly.
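As a rough illustration of the two counting schemes (a minimal sketch, not the attached test case; the `line` string and variable names are made up for this example), the snippet below computes the column of a `throw` statement both ways. Every emoji outside the Basic Multilingual Plane takes two UTF-16 code units but is a single code point, so the two columns drift apart by one per emoji.

```javascript
// Hypothetical source line: four emoji, each encoded as a surrogate pair.
const line = 'const faces = "😀😁😂🤣"; throw new Error("boom");';

// String#indexOf returns an offset in UTF-16 code units.
const unitOffset = line.indexOf("throw");

// The string iterator yields one element per code point, so spreading the
// prefix into an array counts each surrogate pair once.
const pointOffset = [...line.slice(0, unitOffset)].length;

// 1-based columns, as typically shown in stack traces.
const codeUnitColumn = unitOffset + 1;
const codePointColumn = pointOffset + 1;

console.log({ codeUnitColumn, codePointColumn });
```

The gap between the two columns equals the number of surrogate pairs that precede the position on that line.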

In the context of a downstream tool or service using sourcemaps to map locations in a minified JS file to source column numbers, this can result in the error being mapped to the wrong line.

In our case we had a file containing a table of emojis in our source bundle, and both Sentry and Firefox dev tools mapped errors from other files, that occurred later in the bundle, to the correct location when the error came from Chrome but the wrong location when it came from Firefox.

Expected results:

Since many tools that process sourcemaps or the error.stack property are written in JavaScript, I think it makes sense to report column numbers in UTF-16 code units. Safari seems to also report code units.

Related downstream issue and workaround in our application: https://github.com/hypothesis/client/issues/4045 and https://github.com/hypothesis/client/pull/4047.

The TL;DR of the workaround is that we configured the minifier to limit the length of lines in the output, in order to limit the impact.

Managed to reproduce this issue on Windows 10 x64, macOS 11.6 and on Ubuntu 20.04 x64.

Severity: -- → S4
Status: UNCONFIRMED → NEW
Component: Untriaged → Console
Ever confirmed: true
OS: Unspecified → All
Product: Firefox → DevTools
Hardware: Unspecified → Desktop

Let's see what the SpiderMonkey folks think of this.

Note that Robert filed an issue on the tc39 repo too: https://github.com/tc39/proposal-error-stacks/issues/42

Component: Console → JavaScript Engine
Product: DevTools → Core

Waldo did a bunch of work a few years back to count code points instead of code units for column numbers. Bug 1551916 appears to be where we flipped the switch, and bug 1601863 is where we removed the old code-unit support. Looking at this patch, it doesn't seem like it would be too hard to revert to the old behaviour if we decided that was the best course of action.

Given that it both makes working with the numbers in JS more difficult (since JS strings are always only counted in code units), and also drastically complicates the error stack proposal (possibly to the point of killing it), imo it would be ideal if it could be reverted.

Severity: S4 → S3
Priority: -- → P3

Note that merely reverting that would give you counts of code units for all column numbers, but code units depend on the encoding of the script in question. Because some scripts are handled internally as UTF-8 and others as UTF-16, depending on context, the exact same script could produce different column numbers.

If you wanted to forcibly apply UTF-16-code unit counts as column numbers, different sorts of work would have to be done to compute that, for the case of scripts whose source text is UTF-8.
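To give a sense of what that extra work involves, here is a minimal sketch of counting UTF-16 code units directly from UTF-8 bytes without decoding. It is written in JavaScript for illustration only and is not the approach taken in the engine; `utf16LengthOfUtf8` is a hypothetical helper name, and the input is assumed to be well-formed UTF-8.

```javascript
// Hypothetical helper: number of UTF-16 code units needed to represent
// a well-formed UTF-8 byte sequence.
function utf16LengthOfUtf8(bytes) {
  let units = 0;
  for (const b of bytes) {
    // Continuation bytes (10xxxxxx) belong to a character already counted.
    if ((b & 0xc0) === 0x80) continue;
    // A 4-byte sequence (lead byte 11110xxx) encodes a code point above
    // U+FFFF, which needs a surrogate pair; every other lead byte maps to
    // a single code unit.
    units += b >= 0xf0 ? 2 : 1;
  }
  return units;
}

// Usage: the result should match String#length for the same text.
const sample = 'a é 日本 😀';
const utf8 = new TextEncoder().encode(sample);
console.log(utf16LengthOfUtf8(utf8), sample.length);
```

Only the lead byte of each sequence needs to be inspected, so a column can be computed in a single pass over the bytes of the line prefix.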

Or, arguably, there could somehow be "display" column numbers that are accountings of code points, and column numbers that are simple indexable code unit counts. Programmatic access could use the indexable kind. Columns for display -- such as those in some places in the developer console, correlating with column numbers as displayed in a code editor working on the corresponding source code -- would use the code point variety.
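If both kinds of column were exposed, tools would need to convert between them given the text of the line. A minimal sketch of that conversion, using 0-based offsets and hypothetical function names (nothing here is an existing engine or devtools API):

```javascript
// Convert a code point offset ("display" column) into a UTF-16 code unit
// offset (indexable column) within a single line.
function codePointToCodeUnitOffset(line, pointOffset) {
  let units = 0;
  let points = 0;
  for (const ch of line) {
    if (points === pointOffset) break;
    units += ch.length; // 1 for BMP characters, 2 for surrogate pairs
    points += 1;
  }
  return units;
}

// Convert a UTF-16 code unit offset into a code point offset.
function codeUnitToCodePointOffset(line, unitOffset) {
  let units = 0;
  let points = 0;
  for (const ch of line) {
    if (units >= unitOffset) break;
    units += ch.length;
    points += 1;
  }
  return points;
}

// Usage: in 'a😀b', the character 'b' is at code point offset 2
// and at code unit offset 3.
const text = 'a😀b';
console.log(codePointToCodeUnitOffset(text, 2), codeUnitToCodePointOffset(text, 3));
```

Both directions require the source text of the line, which is one reason a consumer that only has a stack trace cannot normalize the columns on its own.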

I want to draw attention to this issue again. From my testing (https://github.com/source-map/source-map-rfc/issues/5#issuecomment-1538000188), Firefox is the only browser that does not report UTF-16 offsets.

Duplicate of this bug: 1845093

The spec was also changed in https://github.com/tc39/source-map-spec/pull/8

I also want to stress this.
I have spent more than half an hour trying to reproduce a bug which turned out to be a bad stack trace in the end (and it will likely happen again if this is not fixed). Our current solution is to tell users to use Chrome instead, and we configure our code minimizers to avoid Unicode and force a maximum line length to work around the issue.

I have prototyped a solution in the duplicate mentioned above but won't be able to get it production ready.
Would be great if somebody who is familiar with the codebase could have a look.

I'll look into this, using the UTF-16 code unit count for both UTF-16 source and UTF-8 source.

Assignee: nobody → arai.unmht
Status: NEW → ASSIGNED
Depends on: 1846913
Blocks: 1144340
Pushed by arai_a@mac.com:
https://hg.mozilla.org/integration/autoland/rev/b4cb88d106bd
Use the number of UTF-16 code units as column number. r=iain
Status: ASSIGNED → RESOLVED
Closed: 1 year ago
Resolution: --- → FIXED
Target Milestone: --- → 118 Branch