Closed
Bug 1205110
Opened 10 years ago
Closed 5 years ago
frequent crashes when visiting linkedin.com profiles, often with high purple buffer value
Categories
(Core :: General, defect)
Tracking
()
People
(Reporter: bugzilla.mozilla.org, Unassigned)
References
Details
(Keywords: crash, perf, Whiteboard: [platform-rel-Linkedin])
Crash Data
Attachments
(10 files, 3 obsolete files)
70.83 KB, image/png
85.37 KB, image/png
87.31 KB, image/png
115.05 KB, image/png
102.80 KB, image/png
230.41 KB, image/png
44.83 KB, text/plain
183.79 KB, application/x-gzip
496.19 KB, application/x-gzip
676.80 KB, application/x-gzip
For quite a while I've been getting reproducible crashes when visiting profile pages on linkedin.com. Going by https://community.linkedin.com/questions/2148/mozilla-firefox-not-loading-linkedin-pages.html others seem to be experiencing similar issues.
Upon visiting one of the triggering pages, FF becomes unresponsive, greys out, consumes 100% of CPU, and rapidly increases its RAM usage until it crashes.
Example crashes
bp-40972f72-d0a9-4347-9ac3-504672150915
bp-890c81e7-827c-46eb-9540-9b4152150915
possibly related bugs
bug 1165934
bug 1140519
bug 1098484
| Reporter | ||
Comment 1•10 years ago
|
||
I've deleted cookies to no avail.
| Reporter | ||
Updated•10 years ago
|
OS: Unspecified → Linux
Hardware: Unspecified → x86
| Reporter | ||
Comment 2•10 years ago
|
||
I'm running Ubuntu Trusty.
| Reporter | ||
Comment 3•10 years ago
|
||
This is reproducible in safe-mode.
Comment 4•9 years ago
|
||
Are you still seeing this? If so, is there any chance that you can do some process sampling or checks with the gecko profiler ( https://developer.mozilla.org/en-US/docs/Mozilla/Performance/Reporting_a_Performance_Problem ) to get a better idea of what's hanging here?
Component: General → Untriaged
Flags: needinfo?(bugzilla.mozilla.org)
| Reporter | ||
Comment 5•9 years ago
|
||
I am sorry about the delay in responding.
The problem is still the same. I will have to read up on the URL given before I can provide the additional requested information.
Flags: needinfo?(bugzilla.mozilla.org)
| Reporter | ||
Comment 6•9 years ago
|
||
Apparently, this is the #1 crasher for FF44 with a number of open bug tickets for it.
http://wc.devil.mx/proxify.php?u=rSKmbsBFPEniK20TNhrhQ6p1BcqcoIq5JkQgHVP5L%2F3arQZvCO3K6n4K16RUiUFttnLxX94E8bhtOe530xhnPq%2BU93Z6boDCc0pGvQ%3D%3D
Comment 7•9 years ago
|
||
Hi,
The issue is pretty hard to reproduce, but it's still reproducible on the latest Firefox release (45.0.2, Build ID 20160407164938). I can only reproduce it if opening Firefox and going to LinkedIn.com is the first thing I do once 32-bit Linux is up and running. Opening 5-6 tabs of people's profiles results in a browser hang, an unresponsive slow-script dialog, and eventually a crash (bp-8354bf87-5323-476f-8e47-2ec202160415). I tried to get a Gecko profile, but it proved impossible due to the immediate hang.
As the browser hangs the CPU usage is all over the place, see the attached screenshot.
The issue is only reproducible on 32-bit Linux, and it is no longer reproducible if you have worked on the machine before attempting to reproduce it.
Thanks,
Cipri
Comment 8•9 years ago
|
||
(In reply to Rolf Leggewie from comment #6)
> Apparently, this is the #1 crasher for FF44 with a number of open bug
> tickets for it.
Unfortunately, OOM|small can happen for a lot of different reasons (you'll see that basically any time Firefox fails to allocate a small amount of memory due to lack of contiguous address space), so that by itself doesn't say much. What matters more for diagnosing what's wrong here is what's below that (i.e. where we were trying to allocate memory when we failed to do so). Unfortunately, that's also broken for some reason (we should be seeing more useful call stack information than just libxul.so at random addresses)!
I don't understand why we're not getting usable crash stacks in the submitted crash reports. I know at least the people on my team have been explicitly using official Mozilla binaries rather than Ubuntu's while trying to reproduce. Ted, do you have any thoughts for what might be going on?
Flags: needinfo?(ted)
Comment 9•9 years ago
|
||
Are these Ubuntu binaries? The build ID there doesn't match the build id from our official 40.0.3 Linux x86 build:
http://ftp.mozilla.org/pub/firefox/candidates/40.0.3-candidates/build1/linux-i686/en-US/firefox-40.0.3.txt
If Ubuntu builds are missing symbols we need Chris Coulson to help.
Flags: needinfo?(ted)
Comment 10•9 years ago
|
||
The build in comment 7 was a Mozilla binary IIUC.
https://crash-stats.mozilla.com/report/index/8354bf87-5323-476f-8e47-2ec202160415
Comment 11•9 years ago
|
||
OK, there's another issue there (from the modules tab):
Ø libxul.so 000000000000000000000000000000000 libxul.so
This is probably due to us trying to `mmap` libxul and failing due to OOM (or address space exhaustion or fragmentation or whatever). Breakpad generates the debug identifiers there by `mmap`ing each file and reading some data out of it:
https://chromium.googlesource.com/breakpad/breakpad/+/master/src/client/linux/minidump_writer/linux_dumper.cc#148
On Windows we VirtualAlloc some extra memory and free it in the pre-minidump-writing callback to work around this problem:
https://dxr.mozilla.org/mozilla-central/rev/21bf1af375c1fa8565ae3bb2e89bd1a0809363d4/toolkit/crashreporter/nsExceptionHandler.cpp#396
We could do the same thing on Linux (although it's probably only necessary on Linux/x86, since x86-64 has lots of address space). Alternately we could try to fix Breakpad to not `mmap` the entire file.
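The reserve-and-release idea can be sketched roughly as follows. This is a minimal Python illustration of the pattern only, not the crash reporter's actual code: `ReservedBlock`, `RESERVE_BYTES` and `on_pre_dump` are hypothetical names, and the 1 MiB size is an arbitrary assumption.

```python
import mmap

# Hypothetical reserve size; a real value would need tuning.
RESERVE_BYTES = 1024 * 1024


class ReservedBlock:
    """Holds an anonymous mapping that can be released on demand."""

    def __init__(self, size=RESERVE_BYTES):
        # A file descriptor of -1 requests an anonymous mapping.
        self._map = mmap.mmap(-1, size)
        self.size = size

    @property
    def held(self):
        return self._map is not None

    def release(self):
        """Unmap the block so later mappings have room; safe to call twice."""
        if self._map is not None:
            self._map.close()
            self._map = None


def on_pre_dump(reserve):
    """Hypothetical pre-dump hook: free the reserve before files get mapped."""
    reserve.release()
    return not reserve.held
```

The reserve would be created once at startup and released only from the pre-dump hook, so that whatever address space it occupied is available again when the dumper tries to map libxul.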
Comment 12•9 years ago
|
||
Are you able to get a short profile?
https://developer.mozilla.org/en-US/docs/Mozilla/Performance/Profiling_with_the_Built-in_Profiler
Flags: needinfo?(bugzilla.mozilla.org)
Comment 13•9 years ago
|
||
(In reply to Ted Mielczarek [:ted.mielczarek] from comment #11)
> ...
> We could do the same thing on Linux (although it's probably only necessary
> on Linux/x86, since x86-64 has lots of address space). Alternately we could
> try to fix Breakpad to not `mmap` the entire file.
followup bug?
Flags: needinfo?(ted)
Comment 14•9 years ago
|
||
StartingLinkedIn
Comment 15•9 years ago
|
||
MemoryEatenUp, but swap hardly touched
Comment 16•9 years ago
|
||
System Monitor
Comment 17•9 years ago
|
||
All memory used now
Comment 18•9 years ago
|
||
After killing Firefox CPU is freed and memory released.
Attachment #8764651 -
Attachment is obsolete: true
Comment 19•9 years ago
|
||
Ubuntu System monitor
Comment 20•9 years ago
|
||
In my case, with Firefox 45.0 on 64-bit Ubuntu 16.04 LTS, LinkedIn's "People you may know" page gobbles up all memory. I have attached the screenshots above to illustrate the issue. I will run the Gecko profiler and report back.
Comment 21•9 years ago
|
||
I've managed to get a crash on the latest Nightly:
https://crash-stats.mozilla.com/report/index/75c61a78-5dff-4efc-a4bb-843d82160623
Component: Untriaged → DOM: CSS Object Model
Product: Firefox → Core
Comment 22•9 years ago
|
||
For an out-of-memory crash, the stack is uninteresting. What is needed is knowledge of what memory usage is increasing. Can you keep an eye on about:memory and see what memory usage is going up?
Comment 23•9 years ago
|
||
I've attached the logs I got through about:memory, but honestly I'm not entirely sure how to interpret them. Could you please help me understand them?
Flags: needinfo?(ciprian.muresan) → needinfo?(dbaron)
Comment 24•9 years ago
|
||
(In reply to Wayne Mery (:wsmwk, NI for questions) from comment #13)
> (In reply to Ted Mielczarek [:ted.mielczarek] from comment #11)
> > ...
> > We could do the same thing on Linux (although it's probably only necessary
> > on Linux/x86, since x86-64 has lots of address space). Alternately we could
> > try to fix Breakpad to not `mmap` the entire file.
>
> followup bug?
Feel free to file one.
Flags: needinfo?(ted)
Comment 25•9 years ago
|
||
This:
│ ├──289.36 MB (56.11%) ── purple-buffer
seems the most suspicious.
mccr8, could you take a look, or if not, bounce to somebody else?
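For anyone scanning a saved about:memory text log for entries like this one, a small script can pull out the lines that account for a large share of their tree. This is a rough sketch, assuming the `amount (percent%) ── name` line layout seen in the logs quoted in this bug; `parse_line` and `large_entries` are illustrative names and the 50% threshold is arbitrary.

```python
import re

# Matches e.g. "289.36 MB (56.11%) ── purple-buffer" anywhere in a line.
LINE_RE = re.compile(
    r"(?P<amount>\d[\d,]*(?:\.\d+)?)\s*(?P<unit>[KMG]?B)?\s*"
    r"\((?P<pct>\d+(?:\.\d+)?)%\)\s*(?:──|--|\+\+)\s*(?P<name>.+)$"
)


def parse_line(line):
    """Return amount, unit, percent and name for one tree line, or None."""
    m = LINE_RE.search(line)
    if m is None:
        return None
    return {
        "amount": float(m.group("amount").replace(",", "")),
        "unit": m.group("unit"),  # None for unitless counts
        "percent": float(m.group("pct")),
        "name": m.group("name").strip(),
    }


def large_entries(lines, min_percent=50.0):
    """Keep only parsed entries whose share is at least min_percent."""
    entries = (parse_line(line) for line in lines)
    return [e for e in entries if e is not None and e["percent"] >= min_percent]
```

Run over the line quoted above, `large_entries` would keep the purple-buffer entry because its 56.11% share exceeds the assumed threshold.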
Flags: needinfo?(dbaron) → needinfo?(continuation)
Comment 26•9 years ago
|
||
Maybe Olli could take a look on Tuesday. If not, I should be able to on Wednesday or Thursday.
Comment 27•9 years ago
|
||
I have seen a large-ish purple buffer a few times, though I can't reproduce the rest of it. I'll try to figure out what is in the purple buffer.
Assignee: nobody → continuation
Comment 28•9 years ago
|
||
Well, I can't reproduce the issue here, and I am not seeing a large purple buffer, so I am not going to be able to investigate this.
Assignee: continuation → nobody
Flags: needinfo?(continuation)
Updated•9 years ago
|
platform-rel: --- → ?
Whiteboard: [platform-rel-Linkedin]
Updated•9 years ago
|
platform-rel: ? → +
Comment 29•8 years ago
|
||
Florin, is it possible to get some help with QA on LinkedIn profiling to see if this is still happening in case there have been design changes on their end since this was filed?
Flags: needinfo?(florin.mezei)
| Reporter | ||
Comment 30•8 years ago
|
||
I'm sorry for not getting back sooner to this ticket. I continued to experience this problem reproducibly until I switched my i386 system to an amd64 kernel and amd64 Firefox. It seems the larger address space available with 64-bit makes the problem harder to reproduce there.
Flags: needinfo?(bugzilla.mozilla.org)
Comment 31•8 years ago
|
||
Ciprian, could you please take a look and see if this is still reproducible on latest Firefox?
Flags: needinfo?(florin.mezei) → needinfo?(ciprian.muresan)
Comment 32•8 years ago
|
||
The issue is still reproducible on the latest Firefox release (51.0.1) and on the latest Nightly (54.0a1, Build ID 20170214110212).
bp-678566e5-5847-459c-a84c-985ab2170215 - Release crash
bp-e8d90d36-78a3-43a1-ad10-128762170215 - Nightly crash
Attached about:memory logs from when the issue started to appear.
Flags: needinfo?(ciprian.muresan)
Comment 33•8 years ago
|
||
Attached about:memory logs taken after I let the issue manifest itself for a while.
Comment 34•8 years ago
|
||
Is that the memory log from only the parent process? Can you attach one from the content process, please?
Flags: needinfo?(ciprian.muresan)
Comment 35•8 years ago
|
||
(In reply to Ciprian Muresan [:cmuresan] from comment #32)
> bp-e8d90d36-78a3-43a1-ad10-128762170215 - Nightly crash
The stack is around allocating from GetBorderTopWidth.
Comment 36•8 years ago
|
||
Apparently, if I wait too long before saving memory logs, it won't save logs for the content process.
Attachment #8837539 -
Attachment is obsolete: true
Attachment #8837540 -
Attachment is obsolete: true
Flags: needinfo?(ciprian.muresan)
Comment 37•8 years ago
|
||
Thanks, Ciprian. Olli, WDYT about the memory report here? It doesn't look like a whole lot of memory used to me.
Flags: needinfo?(bugs)
Comment 38•8 years ago
|
||
Nick, do you have any thoughts on how we could capture what's causing the OOM here?
Flags: needinfo?(n.nethercote)
Comment 39•8 years ago
|
||
Various things...
On Windows we have a mechanism that periodically takes memory reports if we're close to running out of address space, and the most recent memory report gets incorporated into the crash report. But that doesn't run on Linux. So the best way to make progress is to get memory reports from about:memory as close to the point of crashing as possible.
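As a coarse complement while trying to time an about:memory capture, the kernel's own per-process counters can be polled on Linux. The following is a minimal sketch, assuming the standard `VmSize` and `VmRSS` fields of `/proc/<pid>/status`; `parse_proc_status` and `read_status` are illustrative names, not part of Firefox, and this is not the Windows mechanism described above.

```python
def parse_proc_status(text):
    """Extract VmSize and VmRSS (in kB) from /proc/<pid>/status text."""
    wanted = {"VmSize": None, "VmRSS": None}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        if key in wanted:
            # Values look like "\t 2269358 kB"; keep the number only.
            wanted[key] = int(rest.split()[0])
    return wanted


def read_status(pid):
    """Read the counters for a live process (Linux only)."""
    with open(f"/proc/{pid}/status") as f:
        return parse_proc_status(f.read())
```

Calling `read_status` on the content process pid every few seconds would show when vsize starts climbing, which is a reasonable cue to save an about:memory report.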
The "linkedin.memory2.txt" attachment *does* have the content process. Search for "Web Content (pid 16186)".
The only suspicious thing I see in the "linkedin.memory2.txt" attachment is the high purple buffer value, which dbaron mentioned in comment 25. It's possible that was a temporary transient thing. Other than that, the most important numbers are as follows.
> Main Process
> 340.95 MB ── resident
> 1,045.40 MB ── vsize
>
> Web Content (pid 16186)
> 624.46 MB ── resident
> 2,216.17 MB ── vsize
Those look pretty normal. It's typical on Linux for vsize to be significantly higher than resident. (E.g. in my current Linux session I have 278 & 1206 in the main process and 410 & 1024 in the content process.) I wouldn't expect this to be a problem... unless you're running a 32-bit build, which is not standard on Linux. Rolf said in comment 30 that he was running a 32-bit build, then switched to 64-bit and the problem went away. And Ciprian is also running a 32-bit build (comment 7).
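To see why the same numbers matter more on a 32-bit build, the vsize figures can be compared against the address space a 32-bit process has available. This is a back-of-the-envelope sketch: the 3 GiB limit is an assumption (the common 3G/1G user/kernel split on 32-bit Linux), and `parse_mb` and `address_space_used` are illustrative names.

```python
# Assumed usable user address space for a 32-bit Linux process, in MB.
# The real limit depends on the kernel configuration.
USER_SPACE_32BIT_MB = 3 * 1024


def parse_mb(text):
    """Convert an about:memory figure such as '2,216.17 MB' to a float."""
    number, unit = text.split()
    if unit != "MB":
        raise ValueError(f"expected a value in MB, got {unit!r}")
    return float(number.replace(",", ""))


def address_space_used(vsize_mb, limit_mb=USER_SPACE_32BIT_MB):
    """Fraction of the assumed address space already covered by vsize."""
    return vsize_mb / limit_mb
```

Under that assumption, the content process's 2,216.17 MB vsize is roughly 72% of the address space and the main process's 1,045.40 MB roughly 34%, so a 32-bit content process has far less headroom for further contiguous mappings than a 64-bit one.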
In the memory-report-linkedin3.json.gz attachment, the only surprising things I see are these:
> Web Content (pid 2561)
>
> ├────459 (12.96%) -- top(https://www.linkedin.com/, id=6442450949)/active
> │ ├──389 (10.99%) -- window(https://www.linkedin.com/)/dom
> │ │ ├──378 (10.67%) ── event-listeners
> │ │ └───11 (00.31%) ── event-targets
> │ ├───37 (01.04%) ++ window(https://ad-emea.doubleclick.net/adi/linkedin.dart/oz-winner;optout=false;lang=en;tile=2;sz=300x250;s=0;v=6;u=cQENd4lKkTBjdD1LnSRvgz9f;mod=50;title=en;func=qa;coid=1963799;ind=4;occ=407;pocc=3;pocc=8687;pocc=3076;cntry=ro;reg=0;sub=0;gdr=m;seg=9005;seg=548;sjt=554;tile_p=2;adsuite=v2.2.6-min;sfadapter=t;ord=3358423959628?li_ads_3p=control)/dom
> │ └───33 (00.93%) ++ (9 tiny)
> ├────342 (09.66%) -- top(https://www.linkedin.com/in/hanspeschke?authType=NAME_SEARCH&authToken=jgA3&locale=de_DE&srchid=4932510911487150419061&srchindex=1&srchtotal=229198&trk=vsrp_people_res_name&trkInfo=VSRPsearchId%3A4932510911487150419061%2CVSRPtargetId%3A361578890%2CVSRPcmpt%3Aprimary%2CVSRPnm%3Atrue%2CauthType%3ANAME_SEARCH, id=6442450959)/active
> │ ├──257 (07.26%) -- window(https://www.linkedin.com/in/hanspeschke?authType=NAME_SEARCH&authToken=jgA3&locale=de_DE&srchid=4932510911487150419061&srchindex=1&srchtotal=229198&trk=vsrp_people_res_name&trkInfo=VSRPsearchId%3A4932510911487150419061%2CVSRPtargetId%3A361578890%2CVSRPcmpt%3Aprimary%2CVSRPnm%3Atrue%2CauthType%3ANAME_SEARCH)/dom
> │ │ ├──255 (07.20%) ── event-listeners
> │ │ └────2 (00.06%) ── event-targets
> │ └───85 (02.40%) ++ (12 tiny)
> ├────333 (09.40%) -- top(https://www.linkedin.com/in/hanspeschke?authType=NAME_SEARCH&authToken=jgA3&locale=de_DE&srchid=4932510911487150419061&srchindex=1&srchtotal=229198&trk=vsrp_people_res_name&trkInfo=VSRPsearchId%3A4932510911487150419061%2CVSRPtargetId%3A361578890%2CVSRPcmpt%3Aprimary%2CVSRPnm%3Atrue%2CauthType%3ANAME_SEARCH, id=6442450953)/active
> │ ├──257 (07.26%) -- window(https://www.linkedin.com/in/hanspeschke?authType=NAME_SEARCH&authToken=jgA3&locale=de_DE&srchid=4932510911487150419061&srchindex=1&srchtotal=229198&trk=vsrp_people_res_name&trkInfo=VSRPsearchId%3A4932510911487150419061%2CVSRPtargetId%3A361578890%2CVSRPcmpt%3Aprimary%2CVSRPnm%3Atrue%2CauthType%3ANAME_SEARCH)/dom
> │ │ ├──255 (07.20%) ── event-listeners
> │ │ └────2 (00.06%) ── event-targets
>
> 2,116 (100.0%) -- observer-service-suspect
> ├────522 (24.67%) ── referent(topic=memory-pressure)
> ├────205 (09.69%) ── referent(topic=xpcom-shutdown)
> ├────159 (07.51%) ── referent(topic=dom-private-storage2-changed)
> ├────159 (07.51%) ── referent(topic=dom-storage2-changed)
> ├────159 (07.51%) ── referent(topic=network:offline-status-changed)
> ├────138 (06.52%) ── referent(topic=service-worker-get-client)
> ├────114 (05.39%) ── referent(topic=chrome-flush-skin-caches)
> ├────110 (05.20%) ── referent(topic=agent-sheet-added)
> ├────110 (05.20%) ── referent(topic=agent-sheet-removed)
> ├────110 (05.20%) ── referent(topic=author-sheet-added)
> ├────110 (05.20%) ── referent(topic=author-sheet-removed)
> ├────110 (05.20%) ── referent(topic=user-sheet-added)
> └────110 (05.20%) ── referent(topic=user-sheet-removed)
>
> Main Process
>
> 2,163 (100.0%) -- event-counts
> ├──2,145 (99.17%) -- window-objects
> │ ├──1,695 (78.36%) -- top(chrome://browser/content/browser.xul, id=3)/active
> │ │ ├──1,693 (78.27%) -- window(chrome://browser/content/browser.xul)/dom
> │ │ │ ├──1,661 (76.79%) ── event-listeners
> │ │ │ └─────32 (01.48%) ── event-targets
> │ │ └──────2 (00.09%) ── window(about:blank)/dom/event-targets [2]
> │ ├────200 (09.25%) -- top(about:memory, id=106)/active
> │ │ ├──184 (08.51%) -- window(about:newtab)/dom
> │ │ │ ├──183 (08.46%) ── event-listeners
> │ │ │ └────1 (00.05%) ── event-targets
> │ │ └───16 (00.74%) ++ window(about:memory)/dom
> │ ├────198 (09.15%) -- top(about:newtab, id=150)/active/window(about:newtab)/dom
> │ │ ├──197 (09.11%) ── event-listeners
> │ │ └────1 (00.05%) ── event-targets
That seems like a lot of event listeners and observers. Not sure what to make of that. It could just be LinkedIn, or maybe we have some kind of leak? The fact that we have many event listeners in the main process is interesting.
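Totals like these can also be computed directly from a saved `.json.gz` memory report rather than read off the tree. The sketch below assumes the report is gzip-compressed JSON with a top-level `reports` list whose entries carry `process`, `path` and `amount` fields; that layout is an assumption about the file format, and `load_report` and `event_listeners_by_process` are illustrative names.

```python
import gzip
import json
from collections import defaultdict


def load_report(raw_bytes):
    """Decode a gzip-compressed JSON memory report into a dict."""
    return json.loads(gzip.decompress(raw_bytes).decode("utf-8"))


def event_listeners_by_process(report):
    """Sum the amounts of all '.../event-listeners' entries per process."""
    totals = defaultdict(int)
    for entry in report.get("reports", []):
        if entry["path"].endswith("/event-listeners"):
            totals[entry["process"]] += entry["amount"]
    return dict(totals)
```

Comparing the per-process totals from reports taken a few minutes apart would show whether the listener counts keep growing, which would point toward a leak rather than a steady-state cost of the page.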
Anyway, I think the biggest question here is about 32-bit builds on Linux. I don't think we distribute them. Are they a high priority? A standard 64-bit build should avoid these problems.
Flags: needinfo?(n.nethercote)
Updated•8 years ago
|
Blocks: QF-LinkedIn
Updated•8 years ago
|
Whiteboard: [platform-rel-Linkedin] → [platform-rel-Linkedin][qf]
Comment 40•8 years ago
|
||
This is incredibly reproducible on Firefox 52.0.2 64-bit on Windows 10. Merely scrolling through your social feed, or even worse, checking messages will kill the content process. I've eliminated all extensions (except for the Gecko profiler), and I even set dom.ipc.processCount to 50 to ensure that my tab was in its own process, thus eliminating other website interference.
Comment 41•8 years ago
|
||
Scrolling down LinkedIn continually creates more and more DOM content, so that naturally takes more and more memory. But how much... that is hard to say.
(This could be something totally different, but it reminds me a bit of a similar issue in Facebook, where they kept adding more and more objects to some array.)
But so far I haven't managed to reproduce this on FF. No unexpected memory usage or slowness.
I did get some weird behavior in Chrome. After scrolling down for a while, scrolling became really janky and the relevant process started to take some CPU constantly.
Kenan, could you perhaps use about:memory when you start to see some badness? Hopefully the child process isn't completely blocked and about:memory can still get a memory report.
Flags: needinfo?(bugs) → needinfo?(koenigseggcc)
Updated•8 years ago
|
Whiteboard: [platform-rel-Linkedin][qf] → [platform-rel-Linkedin][qf-]
Comment 42•8 years ago
|
||
I reported this issue, or at least the issue I see, to LinkedIn.
Comment 43•8 years ago
|
||
Hi Olli,
I have an about:memory measurement from my session as the tab was loading. However, to clarify, the content process dies (i.e. it says 'Gah. Your tab just crashed.') pretty much 100% of the time when browsing to messages. I've seen this on every computer I use on Windows 10.
The profiles have a bunch of tabs in them, so if you search for 'linkedin' I think that will give you what you want.
Comment 44•8 years ago
|
||
Memory profile when loading LinkedIn in new tab.
Comment 45•8 years ago
|
||
Memory profile after LinkedIn crashes its content process.
Flags: needinfo?(koenigseggcc)
Comment 46•8 years ago
|
||
Hmm, is the tab load memory report somehow busted? I can't seem to load it locally in about:memory.
Oh, btw, remove Gecko Profiler if you're seeing crashes.
Comment 47•8 years ago
|
||
Kenan, do you get crash-reports?
Comment 48•8 years ago
|
||
I've disabled the Gecko profiler. That seems to have made the crashes less likely, but they still happen. I do get crash reports, and I just added a comment to one of them referencing the link to this ticket.
Comment 49•8 years ago
|
||
Could you give a link to the relevant crash report here?
Comment 50•8 years ago
|
||
I think this is one of them. https://crash-stats.mozilla.com/report/index/6bcec12d-6ca5-4bc9-916a-0d5db2170408
Comment 51•8 years ago
|
||
"Uptime 16 seconds" hints that it isn't about LinkedIn.
And the stack trace indicates that the crash is in the Gecko profiler.
Comment 52•8 years ago
|
||
(In reply to Kenan from comment #50)
> I think this is one of them.
> https://crash-stats.mozilla.com/report/index/6bcec12d-6ca5-4bc9-916a-0d5db2170408
See bug 1330193
Comment 53•5 years ago
|
||
Can you still reproduce?
Flags: needinfo?(koenigseggcc)
Flags: needinfo?(bugzilla.mozilla.org)
OS: Linux → All
Summary: frequent crashes when visiting linkedin.com profiles → frequent crashes when visiting linkedin.com profiles, often with high purple buffer value
Comment 54•5 years ago
|
||
Have not seen this issue for a long while, now on 80.0.1 (64-bit) Ubuntu 20.04.
Comment 55•5 years ago
|
||
I think things are fine. Sorry for the slow response.
Flags: needinfo?(koenigseggcc)
Comment 56•5 years ago
|
||
Resolving this per the last few comments. Thank you!
Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → WORKSFORME
| Reporter | ||
Updated•5 years ago
|
Flags: needinfo?(bugzilla.mozilla.org)
Updated•3 years ago
|
Performance Impact: --- → -
Whiteboard: [platform-rel-Linkedin][qf-] → [platform-rel-Linkedin]