Closed Bug 1419432 Opened 2 years ago Closed 2 years ago

Crashes at 0x100 (Mac, Linux) or 0x10101 (Windows) in strlen | nsTSubString<char>::Assign | ContentProcess::Init reading past end of mozilla::dom::ContentPrefs::gInitPrefs

Categories

(Core :: Preferences: Backend, defect, P2, critical)

Unspecified
All
defect

Tracking

()

RESOLVED WORKSFORME
Tracking Status
firefox57 --- wontfix
firefox58 - wontfix
firefox59 + wontfix
firefox60 + wontfix
firefox61 --- fixed

People

(Reporter: marcia, Assigned: njn)

References

Details

(Keywords: crash, regression)

Crash Data

This bug was filed from the Socorro interface and is
report bp-9db6abbc-6f4b-4b82-9fa5-ebaf50171121.
=============================================================

Seen while looking at crash stats - this is one of two signatures that have risen into the top 10: http://bit.ly/2iBc9M8. This looks to be a startup crash, so there are no comments.

There seems to be a stack common to the reports:

double_conversion::BignumDtoa(double, double_conversion::BignumDtoaMode, int, double_conversion::Vector<char>, int*, int*)

Top 10 frames of crashing thread:

0 libc-2.26.so libc-2.26.so@0x96506 
1 libxul.so nsTSubstring<char>::Assign xpcom/string/nsTSubstring.cpp:365
2 libxul.so mozilla::dom::ContentProcess::Init xpcom/string/nsTString.h:81
3 libxul.so XRE_InitChildProcess toolkit/xre/nsEmbedFunctions.cpp:671
4 firefox content_process_main ipc/contentproc/plugin-container.cpp:63
5 firefox main.cold.3 
6 libc-2.26.so libc-2.26.so@0x20f69 
7 firefox firefox@0x15e5f 
8 firefox double_conversion::BignumDtoa 
9 ld-2.26.so ld-2.26.so@0xf625 

=============================================================
Seems to crash in xpcom/string/ on startup
David, could you help with that?
Thanks
Flags: needinfo?(dbaron)
The other crashes that happen within ContentProcess::Init are perhaps relevant:

https://crash-stats.mozilla.com/search/?proto_signature=~ContentProcess%3A%3AInit&product=Firefox&date=%3E%3D2017-11-14T10%3A39%3A28.000Z

The crashes in this bug in particular are #3 and #4 on that list.  bug 1407751 is #2.  #1 is unfiled.  They may be the same thing reflected on different OSes.  I'll try to dig in more later.
(In reply to David Baron :dbaron: ⌚️UTC-8 from comment #2)
> The crashes in this bug in particular are #3 and #4 on that list.  bug
> 1407751 is #2.  #1 is unfiled.  They may be the same thing reflected on

Though bug 1407751 doesn't cover all crashes in nsTString::Assign and may be unrelated to those crashes.
So the common element here is that the crash address is *always* 0x100.

On Linux the crashes show up under the raw libc signatures above. Mac produces better signatures (differing between 10.13 and 10.8-10.12), although some Mac crashes also show up as libsystem_c.dylib.


I think a query for the crashes that I think are covered by the bug is:
https://crash-stats.mozilla.com/search/?proto_signature=~ContentProcess%3A%3AInit&address=%3D0x100
Note that clicking on a signature in that result list produces a constrained result for that signature, not a full result.
Crash Signature: [@ libc-2.26.so@0x96506] [@ libc-2.26.so@0x1574b1] → [@ strlen | nsTSubstring<T>::Assign ] [@ nsTSubstring<T>::Assign ] [@ libc-2.26.so@0x96506] [@ libc-2.26.so@0x1574b1] [@ libc-2.23.so@0x8b746] [@ libsystem_c.dylib@0xf72] [@ libsystem_c.dylib@0x1132] [@ libsystem_c.dylib@0x1732] [@ libsystem_c.dyl…
Summary: Crash in libc-2.26.so@0x96506 → Crashes at 0x100 in strlen | nsTSubString<char>::Assign | ContentProcess::Init
There are also a few crashes on Windows that seem likely to be the same, but they all have a crash address of 0x10101:

https://crash-stats.mozilla.com/search/?proto_signature=~ContentProcess%3A%3AInit&address=%3D0x10101
Flags: needinfo?(dbaron)
Summary: Crashes at 0x100 in strlen | nsTSubString<char>::Assign | ContentProcess::Init → Crashes at 0x100 (Mac, Linux) or 0x10101 (Windows) in strlen | nsTSubString<char>::Assign | ContentProcess::Init
I debugged bp-50b1cee2-207f-4305-ba87-b86260171118 since Windows crashes are easier to debug.

I suspect we're getting a bad index into the ContentPrefs array here:
https://hg.mozilla.org/mozilla-central/annotate/45715ece25fc/dom/ipc/ContentProcess.cpp#l174 though I haven't actually confirmed that.  I did confirm that we're trying to construct a string whose data is 0x10101.

gps suggested the possibility of a frankenbuild, though it seems like there are other possibilities.
Component: General → Preferences: Backend
After the end of gContentPrefs, there are a few pointers that do point to strings ("8", "7", ... "2", "val"), and then the first thing that isn't a pointer to a string is in fact 0x10101.
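For illustration only (this is not the Firefox code): if an out-of-range index is used on an array of `const char*`, the read is undefined behaviour and in practice typically yields whatever word follows the array, which `strlen` then dereferences. A minimal sketch of a bounds-checked lookup, assuming a hypothetical three-entry table `kInitPrefs` and a hypothetical helper `PrefNameForIndex`:

```cpp
#include <cstddef>

// Hypothetical stand-in for mozilla::dom::ContentPrefs::gInitPrefs; the
// real table had a couple of hundred entries at the time.
static const char* const kInitPrefs[] = {
    "accessibility.monoaudio.enable",
    "ui.popup.disable_autohide",
    "view_source.editor.external",
};

static const std::size_t kInitPrefsLength =
    sizeof(kInitPrefs) / sizeof(kInitPrefs[0]);

// Returns the pref name for `index`, or nullptr if `index` is outside the
// table. An unchecked kInitPrefs[index] with a too-large index is undefined
// behaviour; the value it produces may not be a pointer to a string at all.
const char* PrefNameForIndex(std::size_t index) {
  if (index >= kInitPrefsLength) {
    return nullptr;
  }
  return kInitPrefs[index];
}
```

A caller would treat a null result as "reject the startup arguments" rather than constructing a string from it.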
Flags: needinfo?(wmccloskey)
Flags: needinfo?(robert.strong.bugs)
Flags: needinfo?(mconley)
Summary: Crashes at 0x100 (Mac, Linux) or 0x10101 (Windows) in strlen | nsTSubString<char>::Assign | ContentProcess::Init → Crashes at 0x100 (Mac, Linux) or 0x10101 (Windows) in strlen | nsTSubString<char>::Assign | ContentProcess::Init reading past end of mozilla::dom::ContentPrefs::gInitPrefs
I loaded the raw dump of https://crash-stats.mozilla.com/report/index/1aed4417-34ec-42be-841d-a8d090171117.

In ContentProcess::Init() https://hg.mozilla.org/mozilla-central/annotate/f0c0fb9182d6/dom/ipc/ContentProcess.cpp#l174

I got index = 0xda (218) and the content of str is "219:0|220:0|"

The content around str is:
0x0000020FE7C199EB  32 30 34 3a 30 7c 32 30 35 3a 30 7c 32 30 37 3a 31 7c  204:0|205:0|207:1|
0x0000020FE7C199FD  32 31 36 3a 31 7c 32 31 37 3a 31 7c 32 31 38 3a 30 7c  216:1|217:1|218:0|
0x0000020FE7C19A0F  32 31 39 3a 30 7c 32 32 30 3a 30 7c 00 00 00 00 00 00  219:0|220:0|......
0x0000020FE7C19A21  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ..................
0x0000020FE7C19A33  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ..................

Index 218 is valid on that build. It should be "ui.popup.disable_autohide" so I don't think we have an out-of-bound access of gInitPrefs.

I ran the build and tried to debug an out-of-bounds access of gInitPrefs (221 elements), like gInitPrefs[228] or gInitPrefs[250]. I still got valid strings.
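As a rough sketch of what validating such a string could look like: the function below parses only the `index:value|` shape visible in the dump above and rejects any index that is not smaller than the table size. The names `ParseIndexedPrefs` and `IndexedPref` are hypothetical, and the real argument format may carry more than this (string prefs in particular), so treat it as an assumption-laden illustration rather than the actual parser.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

using IndexedPref = std::pair<std::size_t, std::string>;

// Parses "index:value|index:value|..." into (index, value) pairs.
// Returns false and leaves `out` empty if an entry is malformed or names
// an index that is not smaller than `tableSize`.
bool ParseIndexedPrefs(const std::string& str, std::size_t tableSize,
                       std::vector<IndexedPref>& out) {
  out.clear();
  auto fail = [&out]() {
    out.clear();
    return false;
  };

  std::size_t pos = 0;
  while (pos < str.size()) {
    const std::size_t colon = str.find(':', pos);
    if (colon == std::string::npos || colon == pos) {
      return fail();
    }
    const std::size_t bar = str.find('|', colon + 1);
    if (bar == std::string::npos) {
      return fail();
    }

    // Accumulate the decimal index, bailing out as soon as it leaves the
    // valid range (this also keeps the accumulator from overflowing).
    std::size_t index = 0;
    for (std::size_t i = pos; i < colon; ++i) {
      const char c = str[i];
      if (c < '0' || c > '9') {
        return fail();
      }
      index = index * 10 + static_cast<std::size_t>(c - '0');
      if (index >= tableSize) {
        return fail();
      }
    }

    out.emplace_back(index, str.substr(colon + 1, bar - colon - 1));
    pos = bar + 1;
  }
  return true;
}
```

With the string from the dump, a 221-entry table accepts indices 219 and 220, while a 210-entry table rejects the whole input.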
(In reply to Cervantes Yu [:cyu] [:cervantes] from comment #8)
> 
> I got index = 0xda (218) and the content of str is "219:0|220:0|"
> 
> Index 218 is valid on that build. It should be "ui.popup.disable_autohide"
> so I don't think we have an out-of-bound access of gInitPrefs.
> 

Maybe I'm missing something - but that doesn't square with what I've found.

Here's ContentPrefs.cpp for the build for that crash signature: (rev f0c0fb9182d6, apparently)

https://hg.mozilla.org/mozilla-central/file/f0c0fb9182d6/dom/ipc/ContentPrefs.cpp

Here are the content prefs:

This user is on the Nightly channel, so we can keep the browser.startup.record entry. I think we can remove the fuzzing.enabled entry. The Stylo entries should stay.

That leaves me with this list:

[
  "accessibility.monoaudio.enable",
  "accessibility.mouse_focuses_formcontrol",
  "accessibility.tabfocus_applies_to_xul",
  "app.update.channel",
  "browser.dom.window.dump.enabled",
  "browser.sessionhistory.max_entries",
  "browser.sessionhistory.max_total_viewers",
  "browser.startup.record",
  "content.cors.disable",
  "content.cors.no_private_data",
  "content.notify.backoffcount",
  "content.notify.interval",
  "content.notify.ontimer",
  "content.sink.enable_perf_mode",
  "content.sink.event_probe_rate",
  "content.sink.initial_perf_time",
  "content.sink.interactive_deflect_count",
  "content.sink.interactive_parse_time",
  "content.sink.interactive_time",
  "content.sink.pending_event_mode",
  "content.sink.perf_deflect_count",
  "content.sink.perf_parse_time",
  "device.storage.prompt.testing",
  "device.storage.writable.name",
  "dom.allow_XUL_XBL_for_file",
  "dom.allow_cut_copy",
  "dom.enable_frame_timing",
  "dom.enable_performance",
  "dom.enable_resource_timing",
  "dom.event.handling-user-input-time-limit",
  "dom.event.touch.coalescing.enabled",
  "dom.forms.autocomplete.formautofill",
  "dom.ipc.processPriorityManager.backgroundGracePeriodMS",
  "dom.ipc.processPriorityManager.backgroundPerceivableGracePeriodMS",
  "dom.ipc.useNativeEventProcessing.content",
  "dom.max_chrome_script_run_time",
  "dom.max_ext_content_script_run_time",
  "dom.max_script_run_time",
  "dom.mozBrowserFramesEnabled",
  "dom.performance.enable_notify_performance_timing",
  "dom.performance.enable_user_timing_logging",
  "dom.storage.testing",
  "dom.url.encode_decode_hash",
  "dom.url.getters_decode_hash",
  "dom.use_watchdog",
  "dom.vibrator.enabled",
  "dom.vibrator.max_vibrate_list_len",
  "dom.vibrator.max_vibrate_ms",
  "dom.webcomponents.customelements.enabled",
  "dom.webcomponents.enabled",
  "focusmanager.testmode",
  "font.size.inflation.disabledInMasterProcess",
  "font.size.inflation.emPerLine",
  "font.size.inflation.forceEnabled",
  "font.size.inflation.lineThreshold",
  "font.size.inflation.mappingIntercept",
  "font.size.inflation.maxRatio",
  "font.size.inflation.minTwips",
  "full-screen-api.allow-trusted-requests-only",
  "full-screen-api.enabled",
  "full-screen-api.unprefix.enabled",
  "gfx.font_rendering.opentype_svg.enabled",
  "hangmonitor.timeout",
  "html5.flushtimer.initialdelay",
  "html5.flushtimer.subsequentdelay",
  "html5.offmainthread",
  "intl.charset.fallback.tld",
  "intl.ime.hack.on_ime_unaware_apps.fire_key_events_for_composition",
  "javascript.enabled",
  "javascript.options.asmjs",
  "javascript.options.asyncstack",
  "javascript.options.baselinejit",
  "javascript.options.baselinejit.threshold",
  "javascript.options.baselinejit.unsafe_eager_compilation",
  "javascript.options.discardSystemSource",
  "javascript.options.dump_stack_on_debuggee_would_run",
  "javascript.options.gczeal",
  "javascript.options.gczeal.frequency",
  "javascript.options.ion",
  "javascript.options.ion.offthread_compilation",
  "javascript.options.ion.threshold",
  "javascript.options.ion.unsafe_eager_compilation",
  "javascript.options.jit.full_debug_checks",
  "javascript.options.native_regexp",
  "javascript.options.parallel_parsing",
  "javascript.options.shared_memory",
  "javascript.options.streams",
  "javascript.options.strict",
  "javascript.options.strict.debug",
  "javascript.options.throw_on_asmjs_validation_failure",
  "javascript.options.throw_on_debuggee_would_run",
  "javascript.options.wasm",
  "javascript.options.wasm_baselinejit",
  "javascript.options.wasm_ionjit",
  "javascript.options.werror",
  "javascript.use_us_english_locale",
  "jsloader.shareGlobal",
  "layout.css.servo.chrome.enabled",
  "layout.css.stylo-blocklist.blocked_domains",
  "layout.css.stylo-blocklist.enabled",
  "layout.idle_period.required_quiescent_frames",
  "layout.idle_period.time_limit",
  "layout.interruptible-reflow.enabled",
  "mathml.disabled",
  "media.clearkey.persistent-license.enabled",
  "media.cubeb.backend",
  "media.cubeb.sandbox",
  "media.cubeb_latency_msg_frames",
  "media.cubeb_latency_playback_ms",
  "media.decoder-doctor.wmf-disabled-is-failure",
  "media.decoder.recycle.enabled",
  "media.dormant-on-pause-timeout-ms",
  "media.eme.audio.blank",
  "media.eme.enabled",
  "media.eme.video.blank",
  "media.ffmpeg.enabled",
  "media.ffvpx.enabled",
  "media.ffvpx.low-latency.enabled",
  "media.flac.enabled",
  "media.forcestereo.enabled",
  "media.gmp.decoder.enabled",
  "media.gmp.insecure.allow",
  "media.gpu-process-decoder",
  "media.libavcodec.allow-obsolete",
  "media.ogg.enabled",
  "media.ogg.flac.enabled",
  "media.resampling.enabled",
  "media.ruin-av-sync.enabled",
  "media.rust.test_mode",
  "media.suspend-bkgnd-video.delay-ms",
  "media.suspend-bkgnd-video.enabled",
  "media.use-blank-decoder",
  "media.video_stats.enabled",
  "media.volume_scale",
  "media.webspeech.recognition.enable",
  "media.webspeech.recognition.force_enable",
  "media.webspeech.synth.force_global_queue",
  "media.webspeech.test.enable",
  "media.webspeech.test.fake_fsm_events",
  "media.webspeech.test.fake_recognition_service",
  "media.wmf.allow-unsupported-resolutions",
  "media.wmf.enabled",
  "media.wmf.skip-blacklist",
  "media.wmf.vp9.enabled",
  "network.IDN.blacklist_chars",
  "network.IDN.restriction_profile",
  "network.IDN.use_whitelist",
  "network.IDN_show_punycode",
  "network.buffer.cache.count",
  "network.buffer.cache.size",
  "network.captive-portal-service.enabled",
  "network.cookie.cookieBehavior",
  "network.cookie.lifetimePolicy",
  "network.dns.disablePrefetch",
  "network.dns.disablePrefetchFromHTTPS",
  "network.jar.block-remote-files",
  "network.loadinfo.skip_type_assertion",
  "network.notify.changed",
  "network.offline-mirrors-connectivity",
  "network.protocol-handler.external.jar",
  "network.proxy.type",
  "network.security.ports.banned",
  "network.security.ports.banned.override",
  "network.standard-url.enable-rust",
  "network.standard-url.max-length",
  "network.sts.max_time_for_events_between_two_polls",
  "network.sts.max_time_for_pr_close_during_shutdown",
  "network.tcp.keepalive.enabled",
  "network.tcp.keepalive.idle_time",
  "network.tcp.keepalive.probe_count",
  "network.tcp.keepalive.retry_interval",
  "network.tcp.sendbuffer",
  "nglayout.debug.invalidation",
  "privacy.donottrackheader.enabled",
  "privacy.firstparty.isolate",
  "privacy.firstparty.isolate.restrict_opener_access",
  "privacy.resistFingerprinting",
  "security.data_uri.unique_opaque_origin",
  "security.fileuri.strict_origin_policy",
  "security.sandbox.content.level",
  "security.sandbox.content.tempDirSuffix",
  "security.sandbox.logging.enabled",
  "security.sandbox.mac.track.violations",
  "security.sandbox.windows.log.stackTraceDepth",
  "signed.applets.codebase_principal_support",
  "svg.disabled",
  "svg.display-lists.hit-testing.enabled",
  "svg.display-lists.painting.enabled",
  "svg.new-getBBox.enabled",
  "svg.paint-order.enabled",
  "svg.path-caching.enabled",
  "svg.transform-box.enabled",
  "toolkit.asyncshutdown.crash_timeout",
  "toolkit.asyncshutdown.log",
  "toolkit.osfile.log",
  "toolkit.osfile.log.redirect",
  "toolkit.telemetry.enabled",
  "toolkit.telemetry.idleTimeout",
  "toolkit.telemetry.initDelay",
  "toolkit.telemetry.log.dump",
  "toolkit.telemetry.log.level",
  "toolkit.telemetry.minSubsessionLength",
  "toolkit.telemetry.scheduler.idleTickInterval",
  "toolkit.telemetry.scheduler.tickInterval",
  "toolkit.telemetry.testing.overridePreRelease",
  "toolkit.telemetry.unified",
  "ui.key.menuAccessKeyFocuses",
  "ui.popup.disable_autohide",
  "ui.use_activity_cursor",
  "view_source.editor.external",
];

That list has only 210 entries. 218 goes way past the end.

Or did I miss something?
Flags: needinfo?(mconley) → needinfo?(cyu)
(In reply to Mike Conley (:mconley) (:⚙️) - Backlogged on reviews and needinfos from comment #9)
> 
> That list has only 210 entries. 218 goes way past the end.
> 
> Or did I miss something?

Some prefs are removed in bug 1414759. The build I was debugging is 20171113220112 and it has 221 prefs so 218~220 are still in the range.
Flags: needinfo?(cyu)
I think it's also worth double-checking that the debug identifiers match the build you think they're from.  A possible cause here is that we're somehow mixing libraries (e.g., firefox vs. libxul or something similar) between different firefox versions.
OS: Linux → All
Some of these crashes are at random addresses, and at least one (likely more) is a wildptr EXEC crash -> Sec-critical

https://crash-stats.mozilla.com/report/index/1da77afb-e97b-4912-aa6b-2dc3a0171118

It's unclear if the stack full of Necko code is relevant

Also: there may be multiple bugs under this signature; the EXEC wildptr might be different than the big spike
Group: core-security
Group: core-security → dom-core-security
Digging down into these signatures a bit, if you filter off processes that have the process type set you get down to the child init reports - 

https://crash-stats.mozilla.com/search/?signature=%3DnsTSubstring%3CT%3E%3A%3AAssign&process_type=%21content&product=Firefox&date=%3E%3D2017-11-29T07%3A59%3A28.000Z&date=%3C2017-12-06T07%3A59%3A28.000Z&_sort=-date&_facets=signature&_facets=shutdown_progress&_facets=process_type&_facets=startup_time&_columns=date&_columns=signature&_columns=product&_columns=version&_columns=build_id&_columns=platform#crash-reports


- High uptimes implying this is content process restart related
- really high occurrence level on OSX compared to other platforms
- seems to be more common in beta 57 builds than 57 release for some reason.

Stephen, please take a look at some of these osx content process startup crashes, see if you can find anything.
Flags: needinfo?(wmccloskey)
Flags: needinfo?(spohl.mozilla.bugs)
Flags: needinfo?(robert.strong.bugs)
Priority: -- → P2
On macOS, the following crash signature seemed to account for most of the crashes:

0 	strlen 	
1 	nsTSubstring<char>::Assign(char const*, unsigned int, mozilla::fallible_t const&)
2 	nsTSubstring<char>::Assign(char const*, unsigned int)
3 	mozilla::dom::ContentProcess::Init(int, char**)
4 	XRE_InitChildProcess(int, char**, XREChildData const*)
5 	main
6 	start

I can't seem to find any crashes with the above signature after the 57.0b99 (20171112125346) build.

I've had trouble finding the pushlog for 57.0b99, but bug 1415799 caught my attention. This overhauled libpref and landed around the same time.
Flags: needinfo?(spohl.mozilla.bugs)
> I've had trouble finding the pushlog for 57.0b99, but bug 1415799 caught my
> attention. This overhauled libpref and landed around the same time.

There have been a ton of changes to libpref recently, all under bug 1407526. Bug 1415799 was a very innocuous one -- it just inlined a few functions -- and seems unlikely to be a cause.
(In reply to Stephen A Pohl [:spohl] from comment #14)
> I can't seem to find any crashes with the above signature after the 57.0b99
> (20171112125346) build.

Marcia, the crash rate on nightly seems to have gone back down to virtually zero. Could you confirm?
Flags: needinfo?(mozillamarcia.knous)
Aha.... I wonder if this is the known-ASAN-problem with process init when you have MOZ_LOG/etc set -- gcp?  (At least the ones referenced in comment 15)
Flags: needinfo?(gpascutto)
(In reply to Stephen A Pohl [:spohl] from comment #16)
> (In reply to Stephen A Pohl [:spohl] from comment #14)
> > I can't seem to find any crashes with the above signature after the 57.0b99
> > (20171112125346) build.
> 
> Marcia, the crash rate on nightly seems to have gone back down to virtually
> zero. Could you confirm?

yes, it appears on nightly we have no crashes, but still have them in beta in nsTSubstring<T>::Assign.
Flags: needinfo?(mozillamarcia.knous)
Crashes that occur immediately at startup in a content process are unlikely to be exploitable, so I'm going to downgrade this to sec-high.
Keywords: sec-critical → sec-high
3 or even 4K crashes/day doesn't sound like MOZ_LOG/etc
>I wonder if this is the known-ASAN-problem with process init when you have MOZ_LOG/etc set -- gcp?

I don't know. The fix for that is turning out to be pretty hard to land.
Flags: needinfo?(gpascutto)
Marcia, is there a way to determine the nightly build id when this stopped showing up?
Flags: needinfo?(mozillamarcia.knous)
(In reply to Stephen A Pohl [:spohl] from comment #22)
> Marcia, is there a way to determine the nightly build id when this stopped
> showing up?

Bugzilla Crash Stop extension can give you a good visual indicator - you can install it from AMO.
Flags: needinfo?(mozillamarcia.knous)
The extension doesn't seem to work. I just get empty white space where it says "Crash Data:". If someone has this extension working, it would be great to know in what build these crashes stopped occurring so we can close this bug out. Thanks!
Bug 1297740 (ASAN issue with MOZ_LOG) just landed on Nightly.
Marcia, can you help Stephen get this nailed down? Maybe he was forgetting a needinfo.
Flags: needinfo?(mozillamarcia.knous)
(In reply to Stephen A Pohl [:spohl] from comment #22)
> Marcia, is there a way to determine the nightly build id when this stopped
> showing up?

Here is what I can see for each nightly signature:

nsTSubstring<T>::Assign - Last seen in 20180119103852 in 59 Nightly. Don't see any crashes after that even into 60.
nsTSubstring<T>::Assign - Fennec last seen in 20180121100227 (1 crash) in 59 nightly


There are still crashes in 57.0.4, and in 58 betas up to the release candidate. I don't yet see any crashes in 58 release. In 57.0.4, nsTSubstring<T>::Assign is the highest-volume signature, but it still isn't super high for release (around 250 crashes).

Let me know if you need more.
Flags: needinfo?(mozillamarcia.knous)
(In reply to Marcia Knous [:marcia - use ni] from comment #27)
> nsTSubstring<T>::Assign - Last seen in 20180119103852 in 59 Nightly. Don't
> see any crashes after that even into 60.

Ok, so this continued to show up well beyond my comment 16. It seems like there is just a delay until we start seeing reports from the most recent builds. This is therefore not fixed yet.
For some reason, the nsTSubstring<T>::Assign signature in Firefox spiked in Nightly in the build from the 30th, with almost 5200 crashes but only 7 installs.
So... could this be a case of:
another process or Firefox profile updated the code in the install directory, so that when we started a new content process it was a different version than the running instance of the Master process; and if preferences had been removed in between: boom?

We've seen some issues with this mostly with developers/nightly (since they update far more often, and automatically).

We *should* block such occurrences.  But we don't.  Note: I'm referring to the stack like in comment 14; there are other bugs in these signatures as per above.

bz: does this sound like what you've hit?
Flags: needinfo?(spohl.mozilla.bugs)
Flags: needinfo?(bzbarsky)
Could you reference any bugs where we've encountered this? I haven't heard of this issue before.
Flags: needinfo?(spohl.mozilla.bugs) → needinfo?(rjesup)
bz has hit this... we've discussed it, but I don't know if there's a bug filed
Flags: needinfo?(rjesup)
> Could you reference any bugs where we've encountered this?

My Bugzilla searches are failing me so far, but this is a well-known issue that's trivial to reproduce:

1)  Make sure you have a downloaded but not installed update.
2)  Start a new process (browser toolbox devtool, browser content devtool, launch with new profile via
    about:profiles, etc).  This will install the update.
3)  Open some tabs and navigate to places.

When you open new tabs, we start new content processes, and these are not the same version as the parent.  They will often disagree on IPC bits and crash....
Flags: needinfo?(bzbarsky)
Oh, here we go: bug 1366808 and its many duplicates.
Ugh... This is just terrible. I'll make a note of this and see if we can get this addressed sooner rather than later, even if this doesn't turn out to be related to this security bug.
Tracking for 59/60 since it's a sec high issue and it sounds like we may have a chance to fix it before the 59 release.
This shouldn't be a security issue. The original issue, which seems to be the most common cause of this crash by far and is most of the discussion in this bug, is a crash at the address 0x100 which seems safe.
Group: dom-core-security
nsTSubstring<T>::Assign is an extremely common function so different crashes are going to be bucketed together. Somebody could write a Socorro patch if they want to improve that.
ContentChild::Init() checks that the parent has the right build ID on this line:
  GetIPCChannel()->SendBuildID();
But the pref loading happens in ContentProcess::Init(), which I'm guessing happens first, so the parent could easily send down stuff in an unexpected way that crashes the child. The pref parsing code is a bunch of random char* stuff, so it could have problems. I think I did one round of hardening on it, but I doubt we've fuzzed it or anything so who knows what could go wrong.
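A sketch of the ordering concern, under loud assumptions: suppose the parent's build ID were available to the child alongside the pref arguments (it is not passed that way in the code discussed here; `StartupArgs`, `ChildInit` and `CheckBeforePrefs` are hypothetical names). The child could then compare build IDs before touching the pref data at all:

```cpp
#include <string>

enum class ChildInit { kOk, kBuildMismatch };

// Hypothetical startup data. In the code discussed in this bug the prefs
// arrive as command-line arguments and the build ID is checked later, over
// IPC, so this struct is an assumption made for the sketch.
struct StartupArgs {
  std::string parentBuildID;
  std::string prefs;
};

// Compares build IDs first. Only a kOk result means the caller may go on
// to parse args.prefs; a mismatch is reported before any pref data is read.
ChildInit CheckBeforePrefs(const StartupArgs& args,
                           const std::string& ownBuildID) {
  if (args.parentBuildID != ownBuildID) {
    return ChildInit::kBuildMismatch;
  }
  return ChildInit::kOk;
}
```

The point is only the order of the two steps; how a mismatch should be handled (exit quietly, ask the parent to restart, and so on) is left open.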
Nicholas?
Flags: needinfo?(n.nethercote)
What's the question?
Flags: needinfo?(n.nethercote) → needinfo?(rjesup)
njn: comment 40, and this bug in general -- see comment 30 for a hypothesis as to the source of the problem.  Is your pref work hardening it against this sort of problem?  Have you considered this (evil) case?  (I'd hope we'll resolve this general mismatch, but that might be a while.)
Flags: needinfo?(rjesup)
If the binary used by the parent process doesn't match the binary used by the content processes, I can believe it would cause this crash. It might also cause prefs to get bogus values and/or types, which could lead to all sorts of bad behaviour, depending on which pref was affected.
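To make the "bogus values and/or types" case concrete, here is a small sketch with two made-up pref tables (the names `a.bool`, `b.removed`, `c.int` and `d.string` are invented for the example, as is the helper `Resolve`). Once one entry is removed in the newer build, the same index names a different pref on each side, and the last valid parent index is out of range for the child:

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical tables for two builds. The newer (child) build dropped
// "b.removed", so every later entry moved down by one slot.
const std::vector<std::string> kParentTable = {
    "a.bool", "b.removed", "c.int", "d.string"};
const std::vector<std::string> kChildTable = {
    "a.bool", "c.int", "d.string"};

// Resolves an index against a table; returns an empty string when the
// index is out of range for that table.
std::string Resolve(const std::vector<std::string>& table,
                    std::size_t index) {
  return index < table.size() ? table[index] : std::string();
}
```

Here index 2 means `c.int` to the parent but `d.string` to the child, and index 3 is valid only in the parent.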

My work on prefs does not relate to this at all. I wasn't even aware that binary mismatches like these were a possibility or a problem until a couple of days ago.

One possibility is to use pref names instead of indices in the -{int,bool,string}Prefs arguments, but that would make them a lot uglier than they already are.
> One possibility is to use pref names instead of indices in the
> -{int,bool,string}Prefs arguments, but that would make them a lot uglier
> than they already are.
Depends on: 1436911
(In reply to Nicholas Nethercote [:njn] from comment #45)
> > One possibility is to use pref names instead of indices in the
> > -{int,bool,string}Prefs arguments, but that would make them a lot uglier
> > than they already are.

I have a plan for this now: it will involve using the pref names instead of indices, and they will be passed via shared memory.
See the tree of blocking bugs for the path forward.
Assignee: nobody → n.nethercote
With bug 1436911 landed this should now be impossible, because pref names (which don't change between different Firefox builds) are used instead of pref indices (which can change.)
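A minimal sketch of why name-based lookup tolerates a version mismatch, assuming hypothetical names throughout (`ApplyNamedPrefs`, `NamedPref`, and the example pref names); it does not model the shared-memory transport or the actual data structures used in bug 1436911. A name the receiving build does not know is simply skipped, instead of being turned into an array index:

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

using NamedPref = std::pair<std::string, std::string>;

// Applies (name, value) pairs to the prefs this build knows about.
// Names missing from `known` are skipped; the return value counts them.
std::size_t ApplyNamedPrefs(const std::vector<NamedPref>& incoming,
                            std::map<std::string, std::string>& known) {
  std::size_t skipped = 0;
  for (const auto& pref : incoming) {
    auto it = known.find(pref.first);
    if (it == known.end()) {
      ++skipped;  // Unknown to this build: ignore it, never index with it.
      continue;
    }
    it->second = pref.second;
  }
  return skipped;
}
```

In this sketch a pref that exists only in the sender's build costs one skipped entry and nothing else.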
Status: NEW → RESOLVED
Closed: 2 years ago
Resolution: --- → WORKSFORME
I keep getting release-mgmt emails about this being tracked for Beta 60. Does anybody know how to stop that from happening? Bug 1436911, which fixes this, landed in 61.