1694067 - Install a telemetry probe for JS parse errors on UI resources

Assignee

Description

•

4 years ago

I'm investigating the problem called Yellow Screen of Death (bug 1675823).

My current hypothesis is that YSOD is part of a wider range of issues coming from broken I/O transfers from the omni.ja! files.
In case of files that are part of the FIrefox UI, we have to assume that those errors result in broken UI.

YSOD exposes the problem in XHTML and, indirectly, DTD files, but it doesn't show us whether the same problem is also affecting other files types like JavaScript.

There are generally speaking two types of broken UI resources experiences - one when the file sent is empty, and another when the file is interrupted.

The new probe we added in bug 1693146 is giving us insight into the former and we see a number of JS UI files that are potentially being sent zero-byte-loads.

On nightly, we've only been monitoring it for 4 days but we see 333 zero_byte_loads of omni.ja!/chrome/browser/content/browser/aboutNetErrorCodes.js and 193 of omni.ja!/chrome/toolkit/content/global/certviewer/pkijs_bundle.js.

Unfortunately, at the moment, the probe only shows zero_byte_loads, and not interrupted loads. Those are harder to recognize without additionally stating the file.

To validate the hypothesis that JavaScript resources are also affected, I'd like to see if it's possible to add a probe in SpiderMonkey that would report JS equivalent of YSOD - an error resulting in malformed (zero bytes or interrupted) JS resource load.

Due to the nature of JS an empty file would not result in a JS error, but an interrupted resource would likely trigger an error. Unfortunately, we can't reason about what JS error would that be because it depend on which byte the load stops at.

But, if we can assume that we should experience no JS parser errors coming from omni.ja! resources at runtime, then the probe should give us a good signal/noise ratio.

Zibi Braniecki [:zbraniecki][:gandalf]

Assignee

Updated

•

4 years ago

Blocks: 1675823

Zibi Braniecki [:zbraniecki][:gandalf]

Assignee

Comment 1

•

4 years ago

Is there a reasonable way to install this probe?

I'm looking at https://searchfox.org/mozilla-central/source/js/src/frontend/TokenStream.cpp and I see a bunch of error() calls that seem tobe defined in https://searchfox.org/mozilla-central/source/js/src/frontend/ErrorReporter.h#83 which unfortunately is used outside of TokenStream as well.

If my reasoning is correct, we'd want an event in error() but only if it's coming from TokenStream. Is that correct?

Tooru Fujisawa [:arai]

Comment 2

•

4 years ago

if JS files in omni.ja is guaranteed not to hit any syntax error in "complete load" case, "interrupted load" case can be caught by checking parser error at the top level.

Normal JS script case:

https://searchfox.org/mozilla-central/rev/7bb1cc6abf6634b2a20f71935e1e519e73402b63/js/src/frontend/BytecodeCompiler.cpp#625

template <typename Unit>
bool frontend::ScriptCompiler<Unit>::compileScriptToStencil(
    JSContext* cx, CompilationStencil& stencil, SharedContext* sc) {
...
      pn = parser->globalBody(sc->asGlobalContext());

ES module case:

https://searchfox.org/mozilla-central/rev/7bb1cc6abf6634b2a20f71935e1e519e73402b63/js/src/frontend/BytecodeCompiler.cpp#694

template <typename Unit>
bool frontend::ModuleCompiler<Unit>::compile(JSContext* cx,
                                             CompilationStencil& stencil) {
...
  ParseNode* pn = parser->moduleBody(&modulesc);

pn becomes nullptr only if the parser hits any syntax-related error.

The URL of the script is stored in stencil.source.filename() or compilationState_.input.options.filename().

Ted Campbell [:tcampbell]

Comment 3

•

4 years ago

We have a mechanism in place for scripts throwing exceptions: https://searchfox.org/mozilla-central/rev/c8ce16e4299a3afd560320d8d094556f2b5504cd/js/src/vm/JSContext.cpp#1029-1063

I would have thought that the interceptor would catch parse errors, but there may be some missing cases or might be early in process.
@arai, can you do some digging?

Flags: needinfo?(arai.unmht)

Tooru Fujisawa [:arai]

Comment 4

•

4 years ago

Parser error is also set to pending exception, so it can catch.

for off-thread compilation, parser error will be stored into ParseTask::errors first, and then reported when finishing the task.
so, the order might be different, but still can be caught (I think the order isn't important here)

So, if we can differentiate initial compilation error from others there, the hook can be used as well.
maybe by checking JSEXN_SYNTAXERR, and the stack is top-level (not inside any Function or eval), and filename

Flags: needinfo?(arai.unmht)

Zibi Braniecki [:zbraniecki][:gandalf]

Assignee

Updated

•

4 years ago

Summary: Can we set a telemetry event for errors coming from omni.ja scripts? → Install a telemetry probe for JS parse errors on UI resources

Zibi Braniecki [:zbraniecki][:gandalf]

Assignee

Comment 5

•

4 years ago

Proposal for the Event probe:

ui.resource.broken:
  parse:
    bug_numbers:
    description: >-
        This probe is an extension of `ysod` probe to cover wider range of
        parser errors coming from resources in our UI.
        Each parser error is likely to result in broken user interface.
    products:
      - "firefox"
    record_in_processes: ["main", "content"]
    release_channel_collection: opt-out
    expiry_version: "never"
    notification_emails:
      - zbraniecki
      - vchin
      - kershaw
    objects:
      - "js"
      - "html"
      - "css"
      - "xml"
      - "dtd"
    extra_keys:
      error_code: Code of an error as a number. The actual value depends on the file type.
      location: Location as Row:Column of where the error happened.
      file_name: The name of the file being loaded.

This would eventually swallow YSOD event and live side-by-side with the network event as parser event.

Nicolas B. Pierron [:nbp]

Updated

•

4 years ago

Assignee: nobody → zbraniecki

Severity: -- → N/A

Priority: -- → P1

Zibi Braniecki [:zbraniecki][:gandalf]

Assignee

Comment 6

•

4 years ago

Attached file Bug 1694067 - Install a telemetry probe for JS parse errors on UI resources. — Details

Zibi Braniecki [:zbraniecki][:gandalf]

Assignee

Comment 7

•

4 years ago

I tried to follow the instructions, but hit the roadblack in that SM doesn't like to operate on XPCOM Strings. What's the procedure here?

Flags: needinfo?(arai.unmht)

Tooru Fujisawa [:arai]

Comment 8

•

4 years ago

in SpiderMonkey, telemetry is handled by callback into the embedding side.

example usage:

https://searchfox.org/mozilla-central/rev/eeb8cf278192d68b3977d0adb4d43f1463439269/js/src/gc/Statistics.cpp#1166

  runtime->addTelemetry(JS_TELEMETRY_GC_REASON, uint32_t(reason));

https://searchfox.org/mozilla-central/rev/eeb8cf278192d68b3977d0adb4d43f1463439269/js/src/vm/Runtime.cpp#314

void JSRuntime::addTelemetry(int id, uint32_t sample, const char* key) {
  if (telemetryCallback) {
    (*telemetryCallback)(id, sample, key);
  }
}

https://searchfox.org/mozilla-central/rev/eeb8cf278192d68b3977d0adb4d43f1463439269/js/xpconnect/src/XPCJSRuntime.cpp#2967

  JS_SetAccumulateTelemetryCallback(cx, AccumulateTelemetryCallback);

https://searchfox.org/mozilla-central/rev/eeb8cf278192d68b3977d0adb4d43f1463439269/js/xpconnect/src/XPCJSRuntime.cpp#2521

static void AccumulateTelemetryCallback(int id, uint32_t sample,
                                        const char* key) {
  switch (id) {
    case JS_TELEMETRY_GC_REASON:
      Telemetry::Accumulate(Telemetry::GC_REASON_2, sample);
      break;

Flags: needinfo?(arai.unmht)

Tooru Fujisawa [:arai]

Comment 9

•

4 years ago

So, add a new ID to this list, and call runtime->addTelemetry with it,

https://searchfox.org/mozilla-central/rev/eeb8cf278192d68b3977d0adb4d43f1463439269/js/public/friend/UsageStatistics.h#25-59

#define MAP_JS_TELEMETRY(_)                 \
  _(JS_TELEMETRY_GC_REASON)                 \
...

and add the corresponding branch in the callback
if the parameter number or type doesn't match, it might require new callback, or new parameter.

Zibi Braniecki [:zbraniecki][:gandalf]

Assignee

Comment 10

•

4 years ago

That seems to work with histograms, not events. I may have to add my own machinery to funnel event info to embedding side :(

Matthew Gaudet (he/him) [:mgaudet]

Updated

•

4 years ago

Priority: P1 → P2

Bugzilla

Install a telemetry probe for JS parse errors on UI resources

Categories

(Core :: JavaScript Engine, task, P2)

Tracking

()

People

(Reporter: zbraniecki, Assigned: zbraniecki)

References

(Blocks 1 open bug)

Details

Crash Data

Security

(public)

User Story

Attachments

(1 file)

Description

Updated

Comment 1

Comment 2

Comment 3

Comment 4

Updated

Comment 5

Updated

Comment 6

Comment 7

Comment 8

Comment 9

Comment 10

Updated

Attachment

General

Description

File Name

Content Type