Closed
Bug 1408631
Opened 8 years ago
Closed 7 years ago
Crash in shutdownhang | nsThread::Shutdown | nsUrlClassifierDBService::Shutdown
Categories
(Toolkit :: Safe Browsing, defect, P1)
Tracking
()
RESOLVED
FIXED
mozilla58
People
(Reporter: skywalker333, Assigned: dimi)
References
(Blocks 1 open bug)
Details
(Keywords: crash, topcrash, Whiteboard: #sbv4-m10)
Crash Data
Attachments
(3 files, 1 obsolete file)
|
59 bytes,
text/x-review-board-request
|
francois
:
review+
|
Details |
|
26.28 KB,
patch
|
francois
:
review+
ritu
:
approval-mozilla-esr52+
|
Details | Diff | Splinter Review |
|
1.18 KB,
patch
|
tnguyen
:
review+
ritu
:
approval-mozilla-esr52+
|
Details | Diff | Splinter Review |
This bug was filed from the Socorro interface and is
report bp-318d46dc-b5f5-4477-8a8e-842bf0171014.
=============================================================
Firefox crashed while shutting down.
Four crash reports were generated during the shutdown process. I submitted them manually via about:crashes.
bp-318d46dc-b5f5-4477-8a8e-842bf0171014
10/14/2017 2:33 am
bp-8481b722-082c-4a64-8a84-fce970171014
10/14/2017 2:32 am
bp-d800ad24-94aa-44e3-98de-b74930171014
10/14/2017 2:30 am
bp-dadfd412-a085-4a06-a216-0a7320171014
10/14/2017 2:18 am
bp-3d0413f6-08d5-4837-b974-f064e0171014
10/14/2017 1:33 am
| Reporter | ||
Comment 1•8 years ago
|
||
bp-318d46dc-b5f5-4477-8a8e-842bf0171014
10/14/2017 2:33 am
Firefox 57.0b8 Crash Report [@ shutdownhang | nsThread::Shutdown | nsUrlClassifierDBService::Shutdown ]
Processor Notes processor_prod-processor-i-092db6dce0c531c7c_1383; MozillaProcessorAlgorithm2015; skunk_classifier: reject - not a plugin hang
bp-8481b722-082c-4a64-8a84-fce970171014
10/14/2017 2:32 am
Firefox 57.0b8 Crash Report [@ IPCError-browser | ShutDownKill ]
Processor Notes processor_prod-processor-i-0df39363997b2d110_1395; MozillaProcessorAlgorithm2015; non-integer value of "SecondsSinceLastCrash"; Signature replaced with an IPC Channel Error, was: "RtlAnsiStringToUnicodeString | MsgWaitForMultipleObjectsEx | mozilla::widget::WinUtils::WaitForMessage | nsAppShell::ProcessNextNativeEvent"; skunk_classifier: reject - not a plugin hang
bp-d800ad24-94aa-44e3-98de-b74930171014
10/14/2017 2:30 am
Firefox 57.0b8 Crash Report [@ IPCError-browser | ShutDownKill ]
Processor Notes processor_prod-processor-i-0807ffe45d5860cab_1392; MozillaProcessorAlgorithm2015; non-integer value of "SecondsSinceLastCrash"; Signature replaced with an IPC Channel Error, was: "js::GCMarker::lazilyMarkChildren"; skunk_classifier: reject - not a plugin hang
bp-dadfd412-a085-4a06-a216-0a7320171014
10/14/2017 2:18 am
Firefox 57.0b8 Crash Report [@ IPCError-browser | ShutDownKill ]
Processor Notes processor_prod-processor-i-0adac543901330d10_1396; MozillaProcessorAlgorithm2015; non-integer value of "SecondsSinceLastCrash"; Signature replaced with an IPC Channel Error, was: "RtlAnsiStringToUnicodeString | MsgWaitForMultipleObjectsEx | mozilla::widget::WinUtils::WaitForMessage | nsAppShell::ProcessNextNativeEvent"; skunk_classifier: reject - not a plugin hang
bp-3d0413f6-08d5-4837-b974-f064e0171014
10/14/2017 1:33 am
Firefox 57.0b8 Crash Report [@ shutdownhang | nsThread::Shutdown | nsUrlClassifierDBService::Shutdown ]
Processor Notes processor_prod-processor-i-0dd4f3221581842b7_1390; MozillaProcessorAlgorithm2015; skunk_classifier: reject - not a plugin hang
| Reporter | ||
Comment 2•8 years ago
|
||
Signature report for shutdownhang | nsThread::Shutdown | nsUrlClassifierDBService::Shutdown
Showing results from 7 days ago
5,364 Results
Windows 7 2329 43.4%
Windows Vista 1565 29.2%
Windows XP 1016 18.9%
Windows 10 349 6.5%
Windows 8.1 85 1.6%
Windows 8 10 0.2%
Windows Srv2003 10 0.2%
Firefox 52.4.0esr 2840 52.9% 2138
Firefox 45.2.0esr 1004 18.7% 443
Firefox 52.3.0esr 380 7.1% 377
Firefox 52.2.1esr 206 3.8% 90
Firefox 44.0b1 200 3.7% 168
Firefox 52.2.0esr 79 1.5% 47
Firefox 52.0esr 75 1.4% 56
Firefox 52.4.1esr 54 1.0% 40
Firefox 45.1.1esr 51 1.0% 37
Uptime Range
1-5 min 2341 43.6%
5-15 min 1471 27.4%
> 1 hour 846 15.8%
15-60 min 700 13.0%
< 1 min 6 0.1%
Architecture
x86 5199 96.9%
amd64 165 3.1%
| Reporter | ||
Comment 3•8 years ago
|
||
Signature report for shutdownhang | nsThread::Shutdown | nsUrlClassifierDBService::Shutdown
ccd9b01d-80e0-4f9d-9b10-f20af0171013 not connecting 2017-10-13 10:17:03 en-GB
f0fd3cac-1a71-4edd-a1d5-3a9670171012 Firefox keeps not responding I have refreshed firefox at least 8 times today but it is still a problem 2017-10-12 19:38:28 en-GB
c4db96d7-54cb-40cd-8613-b6a350171012 I can't do a Google search. 2017-10-12 19:15:39 en-US
4ae3099e-b2fb-4f04-ba1b-4b9da0171012 have no idea why 2017-10-12 10:47:43 en-US
34e61f50-0cd5-4522-acd8-503310171012 Unable to access. No connection. 2017-10-12 09:53:25 en-GB
9fa3153d-389a-48d9-a049-006820171012 Impossible d'avoir accès aux sites de mes avoris Internet 2017-10-12 07:25:45 fr
3c4805b9-42da-41cd-bce8-0699e0171011 it takes 30 seconds to go from one tab to another tab and the screen freezes a lot. 2017-10-11 18:48:38 en-US
06930b16-316c-4175-ba25-5cf990171011 Unable to open the desk icon. 2017-10-11 12:35:16 en-GB
7a61de79-5375-4124-87ef-bd18c0171008 no internet 2017-10-08 18:00:12 en-US
9a2a7b56-ef73-4341-9e2a-e40bb0171008 it does it alot 2017-10-08 01:49:04 en-US
b2c29f1a-7952-49c3-b604-b5eed0171007 why does firefox ALWAYS crash? 2017-10-07 23:06:04 en-US
f3586715-54e8-47e8-a642-678100171007 Was trying to update Firefox, would not do so, have had these problems since. 2017-10-07 18:33:10 en-US
Comment 4•8 years ago
|
||
Not sure what happened but I guess that "URL classifier" thread could not be shutdowned.
Frame Module Signature Source
0 xul.dll nsTArray_Impl<nsTArray<unsigned short>, nsTArrayInfallibleAllocator>::RemoveElementsAt(unsigned __int64, unsigned __int64) xpcom/ds/nsTArray.h:2059
1 xul.dll nsUrlClassifierPrefixSet::~nsUrlClassifierPrefixSet() toolkit/components/url-classifier/nsUrlClassifierPrefixSet.cpp:66
2 xul.dll nsUrlClassifierPrefixSet::`scalar deleting destructor'(unsigned int)
3 xul.dll nsUrlClassifierPrefixSet::Release() toolkit/components/url-classifier/nsUrlClassifierPrefixSet.cpp:36
The above frames appeared in "URL classifier" thread of all crash report
Updated•8 years ago
|
Component: Untriaged → Safe Browsing
Priority: -- → P1
Product: Firefox → Toolkit
| Assignee | ||
Comment 5•8 years ago
|
||
As Thomas mentioned in Comment 4, almost every shutdown hang happened while releasing nsUrlClassifierPrefixSet::mIndexDeltas [1] which is a two-dimension array stores all the prefix data.
From crash reports, it doesn't look like we have any deadlock here because we are just freeing memory at this moment (nsTArray).
I guess most cases free |mIndexDelta| won't take too much time but maybe when CPU is very busy then freeing this big array may cause shutdown hang, not sure if we should add a telemetry to confirm this.
If free the array is really time-consuming in certain circumstances, one solution I can think of now is replace
fixed-length prefix set with variable-length prefix set implementation. The drawback of this approach will be lost the memory space optimization we have done for 4-bytes prefix set, and also, it is a big change for safebrowsing.
[1] https://dxr.mozilla.org/mozilla-central/source/toolkit/components/url-classifier/nsUrlClassifierPrefixSet.h#83
Comment 6•8 years ago
|
||
(In reply to Dimi Lee[:dimi][:dlee] from comment #5)
> I guess most cases free |mIndexDelta| won't take too much time but maybe
> when CPU is very busy then freeing this big array may cause shutdown hang,
> not sure if we should add a telemetry to confirm this.
GCP, have you dealt with something like that before in the Safe Browsing code?
> If free the array is really time-consuming in certain circumstances, one
> solution I can think of now is replace
> fixed-length prefix set with variable-length prefix set implementation. The
> drawback of this approach will be lost the memory space optimization we have
> done for 4-bytes prefix set, and also, it is a big change for safebrowsing.
While it could in theory help with this shutdown hang, it's likely to make memory usage significantly worse.
Updated•8 years ago
|
Flags: needinfo?(gpascutto)
Whiteboard: #sbv4-m10
| Assignee | ||
Comment 7•8 years ago
|
||
> > If free the array is really time-consuming in certain circumstances, one
> > solution I can think of now is replace
> > fixed-length prefix set with variable-length prefix set implementation. The
> > drawback of this approach will be lost the memory space optimization we have
> > done for 4-bytes prefix set, and also, it is a big change for safebrowsing.
>
> While it could in theory help with this shutdown hang, it's likely to make
> memory usage significantly worse.
Not sure if we could replace nsTArray<nsTArray<uint16_t>> with nsTArray<nsCString>.
Maybe it is faster to release nsCString ? Also need to check the "append" performance for nsTArray & nsCString
Comment 8•8 years ago
|
||
(In reply to François Marier [:francois] from comment #6)
> (In reply to Dimi Lee[:dimi][:dlee] from comment #5)
> > I guess most cases free |mIndexDelta| won't take too much time but maybe
> > when CPU is very busy then freeing this big array may cause shutdown hang,
> > not sure if we should add a telemetry to confirm this.
>
> GCP, have you dealt with something like that before in the Safe Browsing
> code?
No.
>Not sure if we could replace nsTArray<nsTArray<uint16_t>> with nsTArray<nsCString>.
If you try to use contiguous storage, you'll just move this bug from a slow shutdown to OOM crashes, see the original bug which added this code, bug 1046038.
If the string uses ropes or some other kind of fragmented storage, then you're at a fairly large risk of simply ending back up at the beginning of this bug.
I see 3 approaches:
1) Try to find a middle ground. Right now we use ~6000 arrays instead of one. Maybe a 100 or so will solve this bug while not regressing the OOM problem much. Of course the handling code might end up more complicated :-/
2) Fix bug 662444. This has enormous benefits for the project aside from making this bug moot.
3) Investigate why the freeing takes so long. 6000 arrays is indeed a lot, but should it really take so long that we pick it up? Surely a complicated webpage is worse? Note that the user comments in comment 3 talk about the browser being entirely broken.
Flags: needinfo?(gpascutto)
Comment 9•8 years ago
|
||
From reading bug 1046038, fixing this might be as simple as tuning DELTAS_LIMIT = 120. There is some trickiness there if you want to keep the file format compatible, but you might be in luck that for increasing the value there's no issue...
Comment 10•8 years ago
|
||
...and note that DELTAS_LIMIT also affects the time for a SafeBrowsing lookup (it's your worst case linear scan for a certain prefix). But I suspect you can get away with quite a bit bigger values without any measurable performance effect.
| Assignee | ||
Comment 11•8 years ago
|
||
(In reply to Gian-Carlo Pascutto [:gcp] from comment #8)
> 1) Try to find a middle ground. Right now we use ~6000 arrays instead of
> one. Maybe a 100 or so will solve this bug while not regressing the OOM
> problem much. Of course the handling code might end up more complicated :-/
> 2) Fix bug 662444. This has enormous benefits for the project aside from
> making this bug moot.
> 3) Investigate why the freeing takes so long. 6000 arrays is indeed a lot,
> but should it really take so long that we pick it up? Surely a complicated
> webpage is worse? Note that the user comments in comment 3 talk about the
> browser being entirely broken.
GCP, Thanks for all these great suggestions!
Hi francois,
The shutdown hang happened when clear lookupCache[1], I am thinking maybe we could do this first in "quit-application" WITHOUT blocking main-thread. which may give us some extra time before the blocking call inside nsUrlClassifierDBService::Shutdown[2].
I just checked the code I think it is doable in current SafeBrowinsg implementation (After Henry modified the update mechanism).
This approach doesn't fix the huge nsTArray release problem mentioned by GCP and if "quit-application" to "profile-before-change" doesn't take too long it also won't solve this issue. But I think we can give it a try because in theory, it should be able to improve shutdown time a little bit. If the crash rate is still high, then we can take one of the approaches suggested by GCP, how do you think?
[1] https://dxr.mozilla.org/mozilla-central/source/toolkit/components/url-classifier/Classifier.cpp#906
[2] https://dxr.mozilla.org/mozilla-central/source/toolkit/components/url-classifier/nsUrlClassifierDBService.cpp#2441
Flags: needinfo?(francois)
Comment 12•8 years ago
|
||
(In reply to Dimi Lee[:dimi][:dlee] from comment #11)
> This approach doesn't fix the huge nsTArray release problem mentioned by GCP
> and if "quit-application" to "profile-before-change" doesn't take too long
> it also won't solve this issue. But I think we can give it a try because in
> theory, it should be able to improve shutdown time a little bit. If the
> crash rate is still high, then we can take one of the approaches suggested
> by GCP, how do you think?
That's a good idea. It's worth trying this simple change before playing with the DELTA_LIMITS.
Also, thanks GCP for all of this background on the existing storage code!
Flags: needinfo?(francois)
| Assignee | ||
Updated•8 years ago
|
Assignee: aosmond → dlee
Status: NEW → ASSIGNED
| Comment hidden (mozreview-request) |
Comment 14•8 years ago
|
||
| mozreview-review | ||
Comment on attachment 8920265 [details]
Bug 1408631 - Release SafeBrowsing lookupcache in worker thread while shutdown.
https://reviewboard.mozilla.org/r/191278/#review196546
Code looks great, as always :)
Just a few documentation suggestions to make the intention clearer.
::: commit-message-31af3:2
(Diff revision 1)
> +Bug 1408631 - Release SafeBrowsing lookupcache in worker thread while shutdown. r?francois
> +The blocking shutdown code is handled in ::Shutdown while receive
I'm not entirely sure, but I think you might need an empty line between the short description and the long one.
However, given that this comment is already in the code, you can simply remove it from the commit message.
::: toolkit/components/url-classifier/nsUrlClassifierDBService.h:227
(Diff revision 1)
>
> // Provide a way to forcibly close the db connection.
> nsresult GCC_MANGLING_WORKAROUND CloseDb();
>
> + // Release objects that won't affect update thread.
> + // Currently only Classifier::mLookupcache is released.
We can drop that comment since it's obvious from the code in the cpp file.
::: toolkit/components/url-classifier/nsUrlClassifierDBService.cpp:820
(Diff revision 1)
> LOG(("urlclassifier db closed\n"));
>
> return NS_OK;
> }
>
> +// PreShutdown is executed prior than close the update thread.
"prior to terminating the update thread."
::: toolkit/components/url-classifier/nsUrlClassifierDBService.cpp:831
(Diff revision 1)
> + if (mClassifier) {
> + // Classifier close will release all lookup caches which may be a time-consuming job.
> + // See Bug 1408631.
> + mClassifier->Close();
> + }
> +
I would actually move the comment that you have before the function name, right in here.
That way, it will be closer to the place where a person would mistakenly add more cleanups:
// WARNING: nothing we put here should affect an ongoing update in the update thread. When in doubt, put things in Shutdown() instead.
::: toolkit/components/url-classifier/nsUrlClassifierDBService.cpp:2446
(Diff revision 1)
> }
> } else if (!strcmp(aTopic, "quit-application")) {
> // Tell the update thread to finish as soon as possible.
> gShuttingDownThread = true;
> +
> + // The blocking shutdown code is handled in ::Shutdown while receive
I think I know what you mean, but this comment is a little bit confusing.
Does the following match what you meant?
> The code in ::Shutdown() is run on a 'profile-before-change' event and ensures that objects are freed by blocking on this freeing.
>
> We can however speed up the shutdown time by using the worker thread to release, in an earlier event, any objects that cannot affect an ongoing update on the update thread."
::: toolkit/components/url-classifier/nsUrlClassifierDBService.cpp:2461
(Diff revision 1)
> }
>
> return NS_OK;
> }
>
> +// Post a PreShutdown task to work thread to release objects without blocking
"worker thread"
Attachment #8920265 -
Flags: review?(francois) → review-
| Comment hidden (mozreview-request) |
| Assignee | ||
Comment 16•8 years ago
|
||
(In reply to François Marier [:francois] from comment #14)
>
> Just a few documentation suggestions to make the intention clearer.
>
Thanks for your review! I am really appreciate that help me make the comment way much better than the original version :)
Comment 17•8 years ago
|
||
| mozreview-review | ||
Comment on attachment 8920265 [details]
Bug 1408631 - Release SafeBrowsing lookupcache in worker thread while shutdown.
https://reviewboard.mozilla.org/r/191278/#review196890
Attachment #8920265 -
Flags: review?(francois) → review+
| Assignee | ||
Comment 18•8 years ago
|
||
Comment 19•8 years ago
|
||
Pushed by dlee@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/1a4d6bea605f
Release SafeBrowsing lookupcache in worker thread while shutdown. r=francois
Comment 20•8 years ago
|
||
| bugherder | ||
Status: ASSIGNED → RESOLVED
Closed: 8 years ago
status-firefox58:
--- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla58
Comment 21•8 years ago
|
||
This signature has over 2500 reports from ESR52 in the last week. Is this something we can/should try to uplift?
Flags: needinfo?(dlee)
Comment 22•8 years ago
|
||
We should probably consider a fix for ESR52.5.0. I wonder if we can figure out the reason for the regression in ESR 52.4.0.
tracking-firefox-esr52:
--- → ?
| Assignee | ||
Comment 23•8 years ago
|
||
Reopen this bug because the patch landed in Comment 19 is only an improvement of nsUrlClassifierDBService shutdown time, it doesn't guarantee the bug is fixed. I'll still need to keep tracking the crash rate to see if it really helps.
Hi Ryan, Liz
I'll set this bug resolve fixed if the crash rate is decreased, then we can see if we want to uplift this patch or not.
Status: RESOLVED → REOPENED
Flags: needinfo?(dlee)
Resolution: FIXED → ---
Updated•8 years ago
|
Target Milestone: mozilla58 → ---
| Assignee | ||
Updated•8 years ago
|
Status: REOPENED → ASSIGNED
| Assignee | ||
Comment 24•8 years ago
|
||
Just an update that there is no crash on nightly since we landed this patch.
| Assignee | ||
Comment 25•8 years ago
|
||
There is no crash on nightly for the past two weeks.
I think the patch landed did reduce the reproduce rate of this bug.
Hi francois,
I'll set this bug as resolve fixed if you are ok with it.
Flags: needinfo?(francois)
Comment 26•8 years ago
|
||
(In reply to Dimi Lee[:dimi][:dlee] from comment #25)
> I'll set this bug as resolve fixed if you are ok with it.
Sounds good to me. Do you want to port your fix to ESR52 too?
Flags: needinfo?(francois)
Comment 27•8 years ago
|
||
We would normally build ESR 52.5.0 today, but I have put it off until tomorrow morning.
I don't think we would want to uplift this to 57, but if it lands on ESR52 branch by the time I go to build then, great. If not then it will have to wait till 52.6.0.
Flags: needinfo?(dlee)
| Assignee | ||
Comment 28•8 years ago
|
||
Hi Liz,
Sorry I found ESR52 doesn't contain the latest update scheme introduced in Bug 1338017.
This patch is built on the top of that, so I can't uplift this.
Flags: needinfo?(dlee)
| Assignee | ||
Updated•8 years ago
|
Status: ASSIGNED → RESOLVED
Closed: 8 years ago → 8 years ago
Resolution: --- → FIXED
Comment 29•8 years ago
|
||
Leaving 57 set to fix-optional in case the crash rate is high enough to warrant dot release consideration eventually.
tracking-firefox-esr52:
? → ---
Target Milestone: --- → mozilla58
Comment 30•8 years ago
|
||
The crash rate on ESR52 is extremely high (~40000 crashes in the last week) - is there anything we can do to stop the bleeding there?
tracking-firefox-esr52:
--- → ?
Flags: needinfo?(dlee)
| Reporter | ||
Comment 31•8 years ago
|
||
[Tracking Requested - why for this release]:
55,632 Results in the past 7 days.
Top Crasher for 52.5.0esr.
Still high crash rate for 57.x.
Signature report for shutdownhang | nsThread::Shutdown | nsUrlClassifierDBService::Shutdown
Showing results from 7 days ago
Windows Vista 22024 39.6%
Windows 7 17702 31.8%
Windows XP 6549 11.8%
Windows 10 6182 11.1%
Windows 8.1 2611 4.7%
Windows 8 331 0.6%
Windows Srv'03 232 0.4%
Linux 1 0.0%
Firefox 57.0.1 62 0.1% 85
Firefox 57.0 499 0.9% 399
Firefox 52.5.0esr 23123 41.8% 12435
Firefox 52.4.1esr 2240 4.1% 1582
Firefox 52.4.0esr 8834 16.0% 8314
Firefox 52.3.0esr 4162 7.5% 3251
Firefox 52.2.1esr 1123 2.0% 1198
Firefox 52.2.0esr 1148 2.1% 839
Firefox 52.1.2esr 361 0.7% 224
Firefox 52.1.1esr 158 0.3% 134
Firefox 52.1.0esr 148 0.3% 135
Firefox 52.0esr 985 1.8% 722
Firefox 52.0.2esr 359 0.6% 200
Firefox 52.0.2 488 0.9% 522
Firefox 52.0.1esr 484 0.9% 222
x86 54029 97.1%
amd64 1603 2.9%
Top Crasher for Firefox 52.5.0esr
Top 50 Crashing Signatures. 7 days ago
1 12.46% 2.73% shutdownhang | nsThread::Shutdown | nsUrlClassifierDBService::Shutdown
application
23123 23123 0 0 17052 1 2016-10-22
Recent crash report:
Report ID Date Submitted
bp-0d12ed6b-2951-490b-9e2a-dfe400171204
12/03/17 7:09 PM
Firefox 52.5.0esr
| Assignee | ||
Comment 32•8 years ago
|
||
I just checked the crash report for ESR52, most of shutdown hangs happened because we were updating.
We can apply the fix in Bug 1315386 to ESR 52 to reduce the crash rate.
As for 57, we can use the patch in thus bug.
Hi francois, do you have any concern about this?
Flags: needinfo?(dlee)
| Assignee | ||
Updated•8 years ago
|
Flags: needinfo?(francois)
Comment 33•8 years ago
|
||
(In reply to Dimi Lee[:dimi][:dlee] from comment #32)
> I just checked the crash report for ESR52, most of shutdown hangs happened
> because we were updating.
> We can apply the fix in Bug 1315386 to ESR 52 to reduce the crash rate.
>
> As for 57, we can use the patch in thus bug.
>
> Hi francois, do you have any concern about this?
That sounds fine to me. It's not a small patch, but since it's the top crasher for 52, we probably need it.
BTW, we'll have to grab the patch on bug 1336005 too.
Thomas, do you remember any other fixes you may have made to the shutdown-aware patch that we should also backport to 52?
Flags: needinfo?(francois) → needinfo?(tnguyen)
Comment 34•8 years ago
|
||
I think that's all. Other shutdown fixes are landed after we rewrote thread model of url-classifier.
Flags: needinfo?(tnguyen)
| Assignee | ||
Comment 35•8 years ago
|
||
Hi francois,
This patch merged the fix in Bug 1315386 and Bug 1336005.
Would you mind taking a quick look to make sure everything is ok?
Attachment #8934406 -
Flags: review?(francois)
Comment 36•8 years ago
|
||
Comment on attachment 8934406 [details] [diff] [review]
esr52-uplift.patch
Review of attachment 8934406 [details] [diff] [review]:
-----------------------------------------------------------------
Thanks for taking care of this Dimi!
Attachment #8934406 -
Flags: review?(francois) → review+
| Assignee | ||
Comment 37•8 years ago
|
||
Comment on attachment 8934406 [details] [diff] [review]
esr52-uplift.patch
[Approval Request Comment]
If this is not a sec:{high,crit} bug, please state case for ESR consideration:
Not a sec bug
User impact if declined:
High crash rate when shutdown
Fix Landed on Version:
53
Risk to taking this patch (and alternatives if risky):
The risk is that this fix doesn't reduce the rate of shutdown hang or causes other shutdown crash.
But since we have test this for a while (This fix first landed on Nov 2016), I think it won't be too risky.
String or UUID changes made by this patch:
None
Attachment #8934406 -
Flags: approval-mozilla-esr52?
Comment on attachment 8934406 [details] [diff] [review]
esr52-uplift.patch
This patch isn't trivial. Typically ESR uplifts are limited to security and stability fixes. The crash volume for this signature is high enough to justify inclusion in ESR52.6.
Attachment #8934406 -
Flags: approval-mozilla-esr52? → approval-mozilla-esr52+
| Assignee | ||
Updated•7 years ago
|
| Assignee | ||
Updated•7 years ago
|
Comment 40•7 years ago
|
||
Also, not seeing any crash reports on Fx58, so re-resolving this as fixed. Trevor - status flags track older branches, bug resolution tracks m-c.
Status: REOPENED → RESOLVED
Closed: 8 years ago → 7 years ago
Resolution: --- → FIXED
Comment 41•7 years ago
|
||
Dimi, it would appears that the ESR52 patch is hitting Valgrind leaks extremely frequently. Can you please take a look?
https://treeherder.mozilla.org/logviewer.html#?job_id=155218195&repo=mozilla-esr52
Flags: needinfo?(dlee)
| Assignee | ||
Comment 42•7 years ago
|
||
(In reply to Ryan VanderMeulen [:RyanVM] from comment #41)
> Dimi, it would appears that the ESR52 patch is hitting Valgrind leaks
> extremely frequently. Can you please take a look?
> https://treeherder.mozilla.org/logviewer.html#?job_id=155218195&repo=mozilla-
> esr52
Hi Ryan,
I suspect this is related to a fix(Bug 1336240) we didn't include in the patch uplifted, but I am not 100% sure
I'll upload a patch that includes the fix in Bug 1336240, could you help check if it fixes the Valgrind leaks? Thanks!
I am out of office tomorrow, also, I don't know how to run try on mozilla-esr52.
Flags: needinfo?(ryanvm)
| Assignee | ||
Comment 43•7 years ago
|
||
Comment 44•7 years ago
|
||
For Valgrind, pushing to Try from ESR52 should Just Work with the |try: -b o -p linux64-valgrind -u none -t none| syntax. That said, I ran a push your attached patch and it still leaks :(
https://treeherder.mozilla.org/logviewer.html#?job_id=155645562&repo=try
Flags: needinfo?(ryanvm)
| Assignee | ||
Comment 45•7 years ago
|
||
(In reply to Ryan VanderMeulen [:RyanVM] from comment #44)
> For Valgrind, pushing to Try from ESR52 should Just Work with the |try: -b o
> -p linux64-valgrind -u none -t none| syntax. That said, I ran a push your
> attached patch and it still leaks :(
>
> https://treeherder.mozilla.org/logviewer.html#?job_id=155645562&repo=try
Thanks for help, I am working on this now.
| Assignee | ||
Comment 46•7 years ago
|
||
| Assignee | ||
Comment 47•7 years ago
|
||
Attachment #8941769 -
Attachment is obsolete: true
Flags: needinfo?(dlee)
| Assignee | ||
Comment 48•7 years ago
|
||
Comment on attachment 8942859 [details] [diff] [review]
valgrind.patch
Hi Thomas,
This patch fixes the Valgrind leaks in esr52, could you help check this? Thanks!
Attachment #8942859 -
Flags: review?(tnguyen)
Comment 49•7 years ago
|
||
Comment on attachment 8942859 [details] [diff] [review]
valgrind.patch
Review of attachment 8942859 [details] [diff] [review]:
-----------------------------------------------------------------
::: toolkit/components/url-classifier/nsUrlClassifierDBService.cpp
@@ +1514,5 @@
> return NS_OK;
> }
> + if (gShuttingDownThread) {
> + *didLookup = false;
> + return NS_ERROR_ABORT;
This lgtm. But I am still a bit concerned about the root cause of the leak.
Given that adding this could fix the leak (or at lease slim the leak). Because we add a shutdown check to mWorkerProxy->Lookup https://dxr.mozilla.org/mozilla-esr52/rev/e8ff30ea74ad56ef53d435613672f05ff7675740/toolkit/components/url-classifier/nsUrlClassifierDBService.cpp#1577, we could guess that the leak is related to our queue (thread queue and lookup request queue).
The leak log looks weird, I don't see anything from that. But if it relates to thread queue, I think it would be worth to add the shutdown check to other calls contain mWorkerProxy like GetTables
Attachment #8942859 -
Flags: review?(tnguyen) → review+
| Assignee | ||
Comment 50•7 years ago
|
||
(In reply to Thomas Nguyen[:tnguyen] ni? plz from comment #49)
> This lgtm. But I am still a bit concerned about the root cause of the leak.
> Given that adding this could fix the leak (or at lease slim the leak).
> Because we add a shutdown check to mWorkerProxy->Lookup
> https://dxr.mozilla.org/mozilla-esr52/rev/
> e8ff30ea74ad56ef53d435613672f05ff7675740/toolkit/components/url-classifier/
> nsUrlClassifierDBService.cpp#1577, we could guess that the leak is related
> to our queue (thread queue and lookup request queue).
> The leak log looks weird, I don't see anything from that. But if it relates
> to thread queue, I think it would be worth to add the shutdown check to
> other calls contain mWorkerProxy like GetTables
Thanks for your review, thomas.
I agree it would worth to add shutdown check to other calls, also maybe need to check why
we don't have this in central.
But for uplift, I want to make the changes as few as possible, so I'll only add one more in this case.
try looks good:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=87c81806244cf581865306927a49bb1319f4cbf6
| Assignee | ||
Comment 51•7 years ago
|
||
Comment on attachment 8942859 [details] [diff] [review]
valgrind.patch
[Approval Request Comment]
If this is not a sec:{high,crit} bug, please state case for ESR consideration:
Not a sec bug
User impact if declined:
No, this fix a try error. The memory leaks only occurs while shutdown
Fix Landed on Version:
This fix previous try error for the uplift patch, it doesn't include in other branch.
Risk to taking this patch (and alternatives if risky):
I think this patch should be safe seems it just add one more shutdown check to prevent memory leak.
String or UUID changes made by this patch:
No
Attachment #8942859 -
Flags: approval-mozilla-esr52?
Comment on attachment 8942859 [details] [diff] [review]
valgrind.patch
Low risk, ESR52+
Attachment #8942859 -
Flags: approval-mozilla-esr52? → approval-mozilla-esr52+
Comment 53•7 years ago
|
||
| bugherder uplift | ||
Thanks for sorting this out, Dimi!
https://hg.mozilla.org/releases/mozilla-esr52/rev/6f5c8df1925d
https://hg.mozilla.org/releases/mozilla-esr52/rev/3b1faddc0e5f
You need to log in
before you can comment on or make changes to this bug.
Description
•