Closed
Bug 1322204
Opened 8 years ago
Closed 8 years ago
Crash in IPCError-browser | PURLClassifierParent::StartClassify
Categories
(Toolkit :: Safe Browsing, defect)
Tracking
()
RESOLVED
FIXED
mozilla53
Tracking | Status | |
---|---|---|
firefox50 | --- | unaffected |
firefox51 | --- | unaffected |
firefox52 | --- | unaffected |
firefox53 | + | fixed |
People
(Reporter: ting, Assigned: ehsan.akhgari)
References
Details
(Keywords: crash, regression, topcrash)
Crash Data
Attachments
(2 files)
1.85 KB,
patch
|
smaug
:
review+
gchang
:
approval-mozilla-aurora-
|
Details | Diff | Splinter Review |
1.92 KB,
patch
|
jcristau
:
approval-mozilla-beta-
|
Details | Diff | Splinter Review |
This bug was filed from the Socorro interface and is
report bp-73286bd8-5009-4df8-a805-f50152161206.
=============================================================
#3 top crash of Nightly 20161204030210 on Windows, 53 crashes from 10 installations. It was first observed at build 20161203030204.
Reporter | ||
Updated•8 years ago
|
Crash Signature: [@ IPCError-browser | PURLClassifierParent::StartClassify ] → [@ IPCError-browser | PURLClassifierParent::StartClassify ]
[@ mozilla::ipc::FatalError | mozilla::dom::indexedDB::PBackgroundIDBDatabaseChild::Write ]
Reporter | ||
Updated•8 years ago
|
Crash Signature: [@ IPCError-browser | PURLClassifierParent::StartClassify ]
[@ mozilla::ipc::FatalError | mozilla::dom::indexedDB::PBackgroundIDBDatabaseChild::Write ] → [@ IPCError-browser | PURLClassifierParent::StartClassify ]
[@ mozilla::ipc::FatalError | mozilla::dom::indexedDB::PBackgroundIDBDatabaseChild::Write ]
[@ mozilla::ipc::FatalError | mozilla::ipc::IProtocol::HandleFatalError | mozilla::ipc::IProtocol::Fa…
Comment 1•8 years ago
|
||
75% of the crashes with the PURLClassifierParent::StartClassify signature also contain mozilla::ipc::MessageChannel::WaitForSyncNotifyWithA11yReentry(). Out of the remaining 50 crashes I spot checked, the stacks all looked junky. Does this mean we're waiting for some sync IPC a11y messages unrelated to the URL classfier? It would be good to improve the crash signature for this case if that's true. The other two signatures don't seem to involve a11y at all, so maybe that's a red herring.
The URL classifier IPC stuff seems to have been added recently, in bug 1318768, but that hit Nightly around 11/22 or 11/23, which doesn't explain this regression.
Comment 2•8 years ago
|
||
Maybe this is a regression from bug 1321868? That landed in the regression range, and it sounds like it could add uses of the URL classfier in the content process. Ehsan, could you look into that? Thanks.
Flags: needinfo?(ehsan)
Keywords: regression,
topcrash
Comment 3•8 years ago
|
||
(In reply to Andrew McCreight [:mccr8] from comment #1)
> 75% of the crashes with the PURLClassifierParent::StartClassify signature
> also contain
> mozilla::ipc::MessageChannel::WaitForSyncNotifyWithA11yReentry(). Out of the
> remaining 50 crashes I spot checked, the stacks all looked junky. Does this
> mean we're waiting for some sync IPC a11y messages unrelated to the URL
> classfier? It would be good to improve the crash signature for this case if
> that's true. The other two signatures don't seem to involve a11y at all, so
> maybe that's a red herring.
I think the a11y thing appears in the stack because the ipc message is a sync message so we are waiting replys from the parent. I didn't load the mindumps but I think the parent probably killed the content process here:
http://searchfox.org/mozilla-central/rev/a8b5f53e7df90df655a0982e94087ee83290c22e/dom/ipc/URLClassifierParent.cpp#24
Updated•8 years ago
|
Crash Signature: mozilla::ipc::IProtocol::FatalError | mozilla::dom::PContentBridgeChild::Write ] → mozilla::ipc::IProtocol::FatalError | mozilla::dom::PContentBridgeChild::Write ]
[@ mozilla::ipc::FatalError | mozilla::dom::PContentChild::Write]
Assignee | ||
Comment 5•8 years ago
|
||
Definitely looks like a regression from bug 1321868.
Assignee | ||
Comment 6•8 years ago
|
||
I think this happens in safe mode, because of <http://searchfox.org/mozilla-central/rev/b9c9d257366379ac984622dbef38432a7fecd2e9/toolkit/components/url-classifier/nsUrlClassifierDBService.cpp#1275>. All of the reports I looked at were in safe mode, and they all had low runtimes, because this crash will happen the first time Web content loads any scripts.
Sadly Socorro was unable to detect the 100% correlation with safe mode. Nick, any idea why? Or do you know who is the right person to ask? Thanks!
Flags: needinfo?(n.nethercote)
Assignee | ||
Comment 7•8 years ago
|
||
Attachment #8817458 -
Flags: review?(bugs)
Updated•8 years ago
|
Attachment #8817458 -
Flags: review?(bugs) → review+
Comment 8•8 years ago
|
||
> Sadly Socorro was unable to detect the 100% correlation with safe mode.
> Nick, any idea why? Or do you know who is the right person to ask? Thanks!
Marco, is the SafeMode field looked at by the correlation detector? If not, perhaps it should be.
Flags: needinfo?(n.nethercote) → needinfo?(mcastelluccio)
Updated•8 years ago
|
Crash Signature: mozilla::ipc::IProtocol::FatalError | mozilla::dom::PContentBridgeChild::Write ]
[@ mozilla::ipc::FatalError | mozilla::dom::PContentChild::Write] → mozilla::ipc::IProtocol::FatalError | mozilla::dom::PContentBridgeChild::Write ]
[@ mozilla::ipc::FatalError | mozilla::dom::PContentChild::Write]
[@ mozilla::ipc::FatalError | mozilla::dom::PBrowserChild::Write]
Comment 10•8 years ago
|
||
Pushed by eakhgari@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/7849f81b44de
Fail gracefully when the URL classifier service isn't available because we're in safe mode; r=smaug
Comment 11•8 years ago
|
||
bugherder |
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla53
Comment 12•8 years ago
|
||
The signatures "mozilla::ipc::FatalError | mozilla::dom::indexedDB::PBackgroundIDBDatabaseChild::Write", "mozilla::ipc::FatalError | mozilla::ipc::IProtocol::HandleFatalError | mozilla::ipc::IProtocol::FatalError | mozilla::dom::PContentBridgeChild::Write", "mozilla::ipc::FatalError | mozilla::dom::PContentChild::Write", "mozilla::ipc::FatalError | mozilla::dom::PBrowserChild::Write" are not correlated with safe mode (100% false in all of them).
The signature "IPCError-browser | PURLClassifierParent::StartClassify" is correlated to safe mode, but there was a bug in the correlation tool (or rather, in the version of Spark we have on analysis.telemetry.mozilla.org: https://issues.apache.org/jira/browse/SPARK-16664) that caused it to only look for correlations in addons and modules.
I've fixed it now with a workaround, and the correlation to safe mode is showing up: "(100.0% in signature vs 01.87% overall) safe_mode = 1".
Flags: needinfo?(mcastelluccio)
Comment 13•8 years ago
|
||
(In reply to Marco Castelluccio [:marco] from comment #12)
> The signatures "mozilla::ipc::FatalError |
> mozilla::dom::indexedDB::PBackgroundIDBDatabaseChild::Write",
> "mozilla::ipc::FatalError | mozilla::ipc::IProtocol::HandleFatalError |
> mozilla::ipc::IProtocol::FatalError |
> mozilla::dom::PContentBridgeChild::Write", "mozilla::ipc::FatalError |
> mozilla::dom::PContentChild::Write", "mozilla::ipc::FatalError |
> mozilla::dom::PBrowserChild::Write" are not correlated with safe mode (100%
> false in all of them).
>
> The signature "IPCError-browser | PURLClassifierParent::StartClassify" is
> correlated to safe mode
Does this mean we have combined distinct bugs into this bug report? Should those other signatures be put into a new bug? (I'm not sure who is the best person to answer this question.)
> but there was a bug in the correlation tool (or
> rather, in the version of Spark we have on analysis.telemetry.mozilla.org:
> https://issues.apache.org/jira/browse/SPARK-16664) that caused it to only
> look for correlations in addons and modules.
> I've fixed it now with a workaround, and the correlation to safe mode is
> showing up: "(100.0% in signature vs 01.87% overall) safe_mode = 1".
Good to hear it's been fixed. Are there tests for the correlation tool? Could a new test be added to catch problems like this in the future?
Flags: needinfo?(mcastelluccio)
Comment 15•8 years ago
|
||
(In reply to Nicholas Nethercote [:njn] from comment #13)
> Does this mean we have combined distinct bugs into this bug report? Should
> those other signatures be put into a new bug? (I'm not sure who is the best
> person to answer this question.)
I looked over the signatures, and they are all gone from recent builds, except for mozilla::ipc::FatalError | mozilla::dom::PContentChild::Write. I'll file a new bug for that, and remove it from here.
Updated•8 years ago
|
Crash Signature: mozilla::ipc::IProtocol::FatalError | mozilla::dom::PContentBridgeChild::Write ]
[@ mozilla::ipc::FatalError | mozilla::dom::PContentChild::Write]
[@ mozilla::ipc::FatalError | mozilla::dom::PBrowserChild::Write] → mozilla::ipc::IProtocol::FatalError | mozilla::dom::PContentBridgeChild::Write ]
[@ mozilla::ipc::FatalError | mozilla::dom::PBrowserChild::Write]
Comment 16•8 years ago
|
||
(In reply to Andrew McCreight [:mccr8] from comment #15)
> (In reply to Nicholas Nethercote [:njn] from comment #13)
> > Does this mean we have combined distinct bugs into this bug report? Should
> > those other signatures be put into a new bug? (I'm not sure who is the best
> > person to answer this question.)
>
> I looked over the signatures, and they are all gone from recent builds,
> except for mozilla::ipc::FatalError | mozilla::dom::PContentChild::Write.
> I'll file a new bug for that, and remove it from here.
The original mozilla::ipc::FatalError | mozilla::dom::PContentChild::Write actually should be constrained by |proto signature contains PURLClassifer|
The last build id with this signature is 20161213030215
https://crash-stats.mozilla.com/search/?signature=~mozilla%3A%3Aipc%3A%3AFatalError%20%7C%20mozilla%3A%3Adom%3A%3APContentChild%3A%3AWrite&proto_signature=~PURLClassifier&date=%3E%3D2016-09-14T08%3A38%3A36.000Z&date=%3C2016-12-14T08%3A38%3A36.000Z&_sort=-date&_facets=signature&_columns=date&_columns=signature&_columns=product&_columns=version&_columns=build_id&_columns=platform#crash-reports
Comment 17•8 years ago
|
||
(In reply to Nicholas Nethercote [:njn] from comment #13)
> > but there was a bug in the correlation tool (or
> > rather, in the version of Spark we have on analysis.telemetry.mozilla.org:
> > https://issues.apache.org/jira/browse/SPARK-16664) that caused it to only
> > look for correlations in addons and modules.
> > I've fixed it now with a workaround, and the correlation to safe mode is
> > showing up: "(100.0% in signature vs 01.87% overall) safe_mode = 1".
>
> Good to hear it's been fixed. Are there tests for the correlation tool?
> Could a new test be added to catch problems like this in the future?
I have some basic tests for the tool, but I wouldn't have caught this anyway as it's a pretty unexpected bug (Spark 2.0.0 silently fails if a dataset contains more than _exactly_ 200 columns) and I was using the version which happened to have the fix (2.0.1).
BTW, I've now added a test specifically for this particular failure case and I'm locally using 2.0.0, the same version we have on analysis.telemetry.mozilla.org.
I've also implemented some sanity checks to make sure the results are generated every day and are reasonable.
Flags: needinfo?(mcastelluccio)
Assignee | ||
Comment 18•8 years ago
|
||
[Tracking Requested - why for this release]:
A crash regression introduced in this release.
tracking-firefox53:
--- → ?
Updated•8 years ago
|
Updated•8 years ago
|
status-firefox50:
--- → unaffected
status-firefox51:
--- → unaffected
status-firefox52:
--- → unaffected
Comment 19•8 years ago
|
||
Comment on attachment 8817458 [details] [diff] [review]
Fail gracefully when the URL classifier service isn't available because we're in safe mode
Approval Request Comment
[Feature/Bug causing the regression]: This feature is needed for the Shield study to be run on Release 52 (bug 1335232), which we'll use to study the effect of making flash click-to-play by default.
[User impact if declined]: Can't run the study as intended
[Is this code covered by automated tests?]: Yes
[Has the fix been verified in Nightly?]: Yes
[Needs manual test from QE? If yes, steps to reproduce]: Not for this feature independently. We'll do QE on the study as a whole to make sure all pieces work as expected
[List of other uplifts needed for the feature/fix]: Bug 1318768, Bug 1323220, Bug 1325255, Bug 1322204, Bug 1325651, Bug 1319571, Bug 1321377, Bug 1307604, Bug 1323064, Bug 1335549, Bug 1333303, Bug 1333483, Bug 1336714, Bug 1338287
[Is the change risky?]: No
[Why is the change risky/not risky?]: Fixes Bug 1318768
[String changes made/needed]: none
Attachment #8817458 -
Flags: approval-mozilla-aurora?
Comment 20•8 years ago
|
||
Attachment #8837400 -
Flags: approval-mozilla-beta?
Comment 21•8 years ago
|
||
Comment on attachment 8817458 [details] [diff] [review]
Fail gracefully when the URL classifier service isn't available because we're in safe mode
It's been in Aurora53 already. Aurora53-.
Attachment #8817458 -
Flags: approval-mozilla-aurora? → approval-mozilla-aurora-
Comment 22•8 years ago
|
||
Comment on attachment 8837400 [details] [diff] [review]
Fail gracefully when the URL classifier service isn't available because we're in safe mode (beta uplift)
this was deemed too risky for beta
Attachment #8837400 -
Flags: approval-mozilla-beta? → approval-mozilla-beta-
You need to log in
before you can comment on or make changes to this bug.
Description
•