Open Bug 1694067 Opened 5 years ago Updated 4 years ago

Install a telemetry probe for JS parse errors on UI resources

Categories

(Core :: JavaScript Engine, task, P2)

task

Tracking

()

People

(Reporter: zbraniecki, Assigned: zbraniecki)

References

(Blocks 1 open bug)

Details

Attachments

(1 file)

I'm investigating the problem called Yellow Screen of Death (bug 1675823).

My current hypothesis is that YSOD is part of a wider range of issues coming from broken I/O transfers from the omni.ja! files.
In case of files that are part of the FIrefox UI, we have to assume that those errors result in broken UI.

YSOD exposes the problem in XHTML and, indirectly, DTD files, but it doesn't show us whether the same problem is also affecting other files types like JavaScript.

There are generally speaking two types of broken UI resources experiences - one when the file sent is empty, and another when the file is interrupted.

The new probe we added in bug 1693146 is giving us insight into the former and we see a number of JS UI files that are potentially being sent zero-byte-loads.

On nightly, we've only been monitoring it for 4 days but we see 333 zero_byte_loads of omni.ja!/chrome/browser/content/browser/aboutNetErrorCodes.js and 193 of omni.ja!/chrome/toolkit/content/global/certviewer/pkijs_bundle.js.

Unfortunately, at the moment, the probe only shows zero_byte_loads, and not interrupted loads. Those are harder to recognize without additionally stating the file.

To validate the hypothesis that JavaScript resources are also affected, I'd like to see if it's possible to add a probe in SpiderMonkey that would report JS equivalent of YSOD - an error resulting in malformed (zero bytes or interrupted) JS resource load.

Due to the nature of JS an empty file would not result in a JS error, but an interrupted resource would likely trigger an error. Unfortunately, we can't reason about what JS error would that be because it depend on which byte the load stops at.

But, if we can assume that we should experience no JS parser errors coming from omni.ja! resources at runtime, then the probe should give us a good signal/noise ratio.

Is there a reasonable way to install this probe?

I'm looking at https://searchfox.org/mozilla-central/source/js/src/frontend/TokenStream.cpp and I see a bunch of error() calls that seem tobe defined in https://searchfox.org/mozilla-central/source/js/src/frontend/ErrorReporter.h#83 which unfortunately is used outside of TokenStream as well.

If my reasoning is correct, we'd want an event in error() but only if it's coming from TokenStream. Is that correct?

if JS files in omni.ja is guaranteed not to hit any syntax error in "complete load" case, "interrupted load" case can be caught by checking parser error at the top level.

Normal JS script case:

https://searchfox.org/mozilla-central/rev/7bb1cc6abf6634b2a20f71935e1e519e73402b63/js/src/frontend/BytecodeCompiler.cpp#625

template <typename Unit>
bool frontend::ScriptCompiler<Unit>::compileScriptToStencil(
    JSContext* cx, CompilationStencil& stencil, SharedContext* sc) {
...
      pn = parser->globalBody(sc->asGlobalContext());

ES module case:

https://searchfox.org/mozilla-central/rev/7bb1cc6abf6634b2a20f71935e1e519e73402b63/js/src/frontend/BytecodeCompiler.cpp#694

template <typename Unit>
bool frontend::ModuleCompiler<Unit>::compile(JSContext* cx,
                                             CompilationStencil& stencil) {
...
  ParseNode* pn = parser->moduleBody(&modulesc);

pn becomes nullptr only if the parser hits any syntax-related error.

The URL of the script is stored in stencil.source.filename() or compilationState_.input.options.filename().

We have a mechanism in place for scripts throwing exceptions: https://searchfox.org/mozilla-central/rev/c8ce16e4299a3afd560320d8d094556f2b5504cd/js/src/vm/JSContext.cpp#1029-1063

I would have thought that the interceptor would catch parse errors, but there may be some missing cases or might be early in process.
@arai, can you do some digging?

Flags: needinfo?(arai.unmht)

Parser error is also set to pending exception, so it can catch.

for off-thread compilation, parser error will be stored into ParseTask::errors first, and then reported when finishing the task.
so, the order might be different, but still can be caught (I think the order isn't important here)

So, if we can differentiate initial compilation error from others there, the hook can be used as well.
maybe by checking JSEXN_SYNTAXERR, and the stack is top-level (not inside any Function or eval), and filename

Flags: needinfo?(arai.unmht)
Summary: Can we set a telemetry event for errors coming from omni.ja scripts? → Install a telemetry probe for JS parse errors on UI resources

Proposal for the Event probe:

ui.resource.broken:
  parse:
    bug_numbers:
    description: >-
        This probe is an extension of `ysod` probe to cover wider range of
        parser errors coming from resources in our UI.
        Each parser error is likely to result in broken user interface.
    products:
      - "firefox"
    record_in_processes: ["main", "content"]
    release_channel_collection: opt-out
    expiry_version: "never"
    notification_emails:
      - zbraniecki
      - vchin
      - kershaw
    objects:
      - "js"
      - "html"
      - "css"
      - "xml"
      - "dtd"
    extra_keys:
      error_code: Code of an error as a number. The actual value depends on the file type.
      location: Location as Row:Column of where the error happened.
      file_name: The name of the file being loaded.

This would eventually swallow YSOD event and live side-by-side with the network event as parser event.

Assignee: nobody → zbraniecki
Severity: -- → N/A
Priority: -- → P1

I tried to follow the instructions, but hit the roadblack in that SM doesn't like to operate on XPCOM Strings. What's the procedure here?

Flags: needinfo?(arai.unmht)

in SpiderMonkey, telemetry is handled by callback into the embedding side.

example usage:

https://searchfox.org/mozilla-central/rev/eeb8cf278192d68b3977d0adb4d43f1463439269/js/src/gc/Statistics.cpp#1166

  runtime->addTelemetry(JS_TELEMETRY_GC_REASON, uint32_t(reason));

https://searchfox.org/mozilla-central/rev/eeb8cf278192d68b3977d0adb4d43f1463439269/js/src/vm/Runtime.cpp#314

void JSRuntime::addTelemetry(int id, uint32_t sample, const char* key) {
  if (telemetryCallback) {
    (*telemetryCallback)(id, sample, key);
  }
}

https://searchfox.org/mozilla-central/rev/eeb8cf278192d68b3977d0adb4d43f1463439269/js/xpconnect/src/XPCJSRuntime.cpp#2967

  JS_SetAccumulateTelemetryCallback(cx, AccumulateTelemetryCallback);

https://searchfox.org/mozilla-central/rev/eeb8cf278192d68b3977d0adb4d43f1463439269/js/xpconnect/src/XPCJSRuntime.cpp#2521

static void AccumulateTelemetryCallback(int id, uint32_t sample,
                                        const char* key) {
  switch (id) {
    case JS_TELEMETRY_GC_REASON:
      Telemetry::Accumulate(Telemetry::GC_REASON_2, sample);
      break;
Flags: needinfo?(arai.unmht)

So, add a new ID to this list, and call runtime->addTelemetry with it,

https://searchfox.org/mozilla-central/rev/eeb8cf278192d68b3977d0adb4d43f1463439269/js/public/friend/UsageStatistics.h#25-59

#define MAP_JS_TELEMETRY(_)                 \
  _(JS_TELEMETRY_GC_REASON)                 \
...

and add the corresponding branch in the callback
if the parameter number or type doesn't match, it might require new callback, or new parameter.

That seems to work with histograms, not events. I may have to add my own machinery to funnel event info to embedding side :(

Priority: P1 → P2
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: