js::PrintError should probably print its data as UTF-8
Categories
(Core :: JavaScript Engine, defect, P3)
Tracking
()
People
(Reporter: Waldo, Assigned: ptomato)
References
Details
Attachments
(1 file)
[jwalden@find-waldo-now src]$ cat ~/moz/inflight/tc-utf8/burrito-1F32F.js function f() { var x = ` 🌯🌯🌯🌯🌯🌯`; 🌯; } f(); [jwalden@find-waldo-now src]$ dbg/js/src/js -f ~/moz/inflight/tc-utf8/burrito-1F32F.js /home/jwalden/moz/inflight/tc-utf8/burrito-1F32F.js:3:15 SyntaxError: illegal character: /home/jwalden/moz/inflight/tc-utf8/burrito-1F32F.js:3:15 </</</</</</`; </; } f(); /home/jwalden/moz/inflight/tc-utf8/burrito-1F32F.js:3:15 ...............^ The reason for the </ goo is, unsurprisingly, that the UTF-16 code units are getting cast-truncated to char when individually fputc'd by PrintErrorLine, called by js::PrintError: 491 for (size_t i = 0; i < n; i++) { (rr) n 492 fputc(static_cast<char>(linebuf[i]), file); (rr) n <491 for (size_t i = 0; i < n; i++) { (rr) n 492 fputc(static_cast<char>(linebuf[i]), file); (rr) n /491 for (size_t i = 0; i < n; i++) { (note the '<' and '/' at the starts of the lines) with context being (rr) bt 5 #0 PrintErrorLine (file=0x7f6920f25680 <_IO_2_1_stderr_>, prefix=0x7f69127c46a0 "/home/jwalden/moz/inflight/tc-utf8/burrito-1F32F.js:3:27 ", report=0x7f69126bc7e0) at /home/jwalden/moz/after/js/src/vm/JSContext.cpp:491 #1 0x00005574c5361c7e in PrintSingleError<JSErrorReport> (cx=0x7f6913618000, file=0x7f6920f25680 <_IO_2_1_stderr_>, toStringResult=..., report=0x7f69126bc7e0, kind=PrintErrorKind::Error) at /home/jwalden/moz/after/js/src/vm/JSContext.cpp:577 #2 0x00005574c53617f2 in js::PrintError (cx=0x7f6913618000, file=0x7f6920f25680 <_IO_2_1_stderr_>, toStringResult=..., report=0x7f69126bc7e0, reportWarnings=false) at /home/jwalden/moz/after/js/src/vm/JSContext.cpp:603 #3 0x00005574c4f1c5b1 in js::shell::AutoReportException::~AutoReportException (this=0x7ffd78921210) at /home/jwalden/moz/after/js/src/shell/js.cpp:9114 #4 0x00005574c4f22873 in Shell (cx=0x7f6913618000, op=0x7ffd78921418, envp=0x7ffd78921658) at /home/jwalden/moz/after/js/src/shell/js.cpp:10537 (More stack frames follow...) We could just change this to print out UTF-8, but sadly at the moment js::PrintError doesn't print all data in wholly consistent encoding -- earlier in the line a file name is printed out, and at present file names are not consistently encoded as UTF-8. So, this probably requires multiple steps of work beyond just changing the aforementioned loop to print UTF-16 code points, not truncated code units.
Comment 1•5 years ago
|
||
(In reply to Jeff Walden [:Waldo] from comment #0) > [...] at present file names are not consistently encoded as UTF-8. Marking as depends on bug 1492090.
Reporter | ||
Comment 2•5 years ago
|
||
(In reply to André Bargull [:anba] from comment #1) > (In reply to Jeff Walden [:Waldo] from comment #0) > > [...] at present file names are not consistently encoded as UTF-8. > > Marking as depends on bug 1492090. Huh, nice find! Maybe someone should review that patchwork.
Updated•5 years ago
|
![]() |
||
Updated•4 years ago
|
![]() |
||
Comment 3•4 years ago
|
||
[Tracking Requested - why for this release]:
Release Note Request (optional, but appreciated)
[Why is this notable]:
[Affects Firefox for Android]:
[Suggested wording]:
[Links (documentation, blog post, etc)]:
Comment 4•4 years ago
|
||
Tracking and relnote flag nominations require a comment. Resetting.
Updated•4 years ago
|
Assignee | ||
Comment 5•4 years ago
|
||
We could just change this to print out UTF-8, but sadly at the moment js::PrintError doesn't print all data in wholly consistent encoding -- earlier in the line a file name is printed out, and at present file names are not consistently encoded as UTF-8.
So, this probably requires multiple steps of work beyond just changing the aforementioned loop to print UTF-16 code points, not truncated code units.
Given that the file name stuff is probably fixed by bug 1492090, which has patches waiting for review already, would it be reasonable to just fix the loop here?
Assignee | ||
Comment 6•3 years ago
|
||
This would be nicer if we changed the loop that prints tokenOffset()
number of '.' characters, to take into account the visual column width
of the code points that it's supposed to line up with. But I don't see
any readily available function for that.
Depends on D73519
Updated•3 years ago
|
Pushed by ncsoregi@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/faeed99f813e Print code points in PrintErrorLine. r=jwalden
Comment 8•3 years ago
|
||
Backed out for bustages on for bustages on testPrintError.cpp
Backout link: https://hg.mozilla.org/integration/autoland/rev/ea102a49618733aae60628d0f4304e8708274016
Log link: https://treeherder.mozilla.org/logviewer.html#/jobs?job_id=302143544&repo=autoland&lineNumber=7304
Assignee | ||
Comment 9•3 years ago
•
|
||
I've updated the patches from this bug and bug 1506323 to fix the build failures. Here's a retry of the jobs that failed: https://treeherder.mozilla.org/#/jobs?repo=try&revision=880f1222a163c8cbd31c1ebf88a25603755990b4
Comment 10•3 years ago
|
||
Pushed by btara@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/96a3f8a62db6 Print code points in PrintErrorLine. r=jwalden
Comment 11•3 years ago
|
||
bugherder |
Updated•3 years ago
|
Description
•