Closed
Bug 1202080
Opened 10 years ago
Closed 10 years ago
[e10s] Reproducible crash in telemetry.mozilla.org: nsTArray_Impl<T>::AppendElements<T>(unsigned int) | nsLineBreaker::AppendText(nsIAtom*, unsigned char const*, unsigned int, unsigned int, nsILineBreakSink*)
Categories
(Core :: DOM: UI Events & Focus Handling, defect)
Tracking
()
RESOLVED
FIXED
mozilla43
Tracking | Status | |
---|---|---|
e10s | m8+ | --- |
People
(Reporter: vladan, Assigned: masayuki)
References
Details
(Keywords: crash)
Crash Data
Attachments
(1 file)
16.78 KB,
text/plain
|
Details |
This bug was filed from the Socorro interface and is
report bp-9ae67460-fb62-44da-9360-0da3a2150905.
=============================================================
This is a reproducible crash in Nightly 41 and Aurora 42 (on Windows 7) -- this is an e10s only crash!
STR:
1. Load https://telemetry.mozilla.org/new-pipeline/evo.html#!aggregates=median&cumulative=0&end_date=2015-08-15&keys=!__none__!__none__&max_channel_version=nightly%252F43&measure=TELEMETRY_PENDING_LOAD_FAILURE_PARSE&min_channel_version=nightly%252F40&product=Firefox&sanitize=1&sort_keys=submissions&start_date=2015-08-10&trim=1&use_submission_date=0
2. Type "firstpaint" (no quotes) into the histogram name box
Other signatures:
https://crash-stats.mozilla.com/report/index/e4e05602-7f40-4cc4-856f-f8e362150905
https://crash-stats.mozilla.com/report/index/ce4dc923-c116-4d7a-bbc4-2beee2150905
Reporter | ||
Updated•10 years ago
|
Component: General → Layout: Text
Product: Firefox → Core
Reporter | ||
Updated•10 years ago
|
Group: core-security
Comment 1•10 years ago
|
||
Based on the stack this just looks like some kind of stack overflow, so I don't think this really needs to be hidden.
The reason code is EXCEPTION_STACK_OVERFLOW. Is it reasonable to have a stack so deep?
Flags: needinfo?(dmajor) → needinfo?(dbaron)
If it turns out that this amount of stack is really necessary (and it's not "just a logic bug") then maybe we could look at the delta between ChildEBP values and slim down the heaviest frames. I bet PGO gets pretty aggressive on inlining in layout code.
This recursive step:
3d 0015b24c 5b47d457 xul!PresShell::ProcessReflowCommands+0x9c
3e 0015b308 5b47d248 xul!PresShell::FlushPendingNotifications+0x209
3f 0015b324 5c40e4ab xul!PresShell::FlushPendingNotifications+0x1d
40 0015b330 5c40e5a7 xul!mozilla::ContentEventHandler::InitBasic+0x1b
41 0015b348 5c40e42c xul!mozilla::ContentEventHandler::InitCommon+0x1d
42 0015b36c 5c40fbec xul!mozilla::ContentEventHandler::Init+0xf
43 0015b384 5bb0254d xul!mozilla::ContentEventHandler::OnQueryEditorRect+0xf
44 0015b5f0 5ba4a90d xul!mozilla::EventStateManager::PreHandleEvent+0x451e50
45 0015b690 5b740d13 xul!PresShell::HandleEvent+0x5d1102
46 0015b6bc 5b741a00 xul!nsViewManager::DispatchEvent+0xb8
47 0015b6dc 5c6af8e5 xul!nsView::HandleEvent+0x43
48 0015b70c 5c6adf6f xul!mozilla::widget::PuppetWidget::DispatchEvent+0x6a
49 0015b878 5c6b4603 xul!mozilla::ContentCacheInChild::CacheEditorRect+0x9a
4a 0015b88c 5c6b44c2 xul!mozilla::widget::PuppetWidget::NotifyIMEOfPositionChange+0x1f
4b 0015b89c 5c6aaba2 xul!mozilla::widget::PuppetWidget::NotifyIMEInternal+0x54
4c 0015b8b0 5bab88af xul!nsBaseWidget::NotifyIME+0x86
4d 0015b8e4 5b47f519 xul!mozilla::IMEStateManager::NotifyIME+0x4e99f4
4e 0015b8ec 5ba24698 xul!nsCOMPtr_base::~nsCOMPtr_base+0x15
4f 0015b914 6bbc2cad xul!nsDocShell::NotifyScrollObservers+0x62b9ac
50 0015b930 5c416fdf mozglue!arena_dalloc_small+0x3d
51 0015b968 5b687eee xul!mozilla::IMEContentObserver::PositionChangeEvent::Run+0x38
52 0015b970 5b817e86 xul!nsRunnable::Release+0x1f
53 0015b990 5b3d3d85 xul!nsAutoScriptBlocker::~nsAutoScriptBlocker+0x81
54 0015ba24 5b47d64e xul!mozilla::ScrollFrameHelper::ReflowFinished+0x17f
55 0015ba44 5b47a250 xul!PresShell::HandlePostedReflowCallbacks+0xd0
56 0015ba60 5b479288 xul!PresShell::DidDoReflow+0x1b
57 0015babc 5b47d457 xul!PresShell::ProcessReflowCommands+0xe2
seems like an infinite recursion bug. (The stuff from frames 00-3c is normal layout recursion; this bit is not.)
Flags: needinfo?(dbaron)
Maybe Mats can look?
Flags: needinfo?(mats)
Comment 8•10 years ago
|
||
(In reply to Vladan Djeric (:vladan) -- please needinfo! from comment #0)
> STR:
>
> 1. Load <url>
> 2. Type "firstpaint" (no quotes) into the histogram name box
To clarify:
1. Load <url>
1b. wait for the page to fully load (takes about 20 sec)
1c. click on the TELEMETRY_PENDING_LOAD_FAILURE_PARSE "link" near the
top of the page. In the drop-down menu that is presented, there is
a "Search" text input field: this is the "histogram name box"
2. Type "firstpaint" (no quotes) into that box
(please correct me if I misunderstood)
Comment 9•10 years ago
|
||
The bug doesn't seem to occur on Linux (trunk or aurora).
A breakpoint at mozilla::ContentEventHandler::OnQueryEditorRect
is never hit.
Comment 10•10 years ago
|
||
Looking at the stack in comment 6, it seems like an IME problem.
Since PositionChangeEvent::Run leads to flushing layout, it seems like
we should dispatch an async event instead of using a script runner?
Flags: needinfo?(mats) → needinfo?(masayuki)
Assignee | ||
Comment 11•10 years ago
|
||
Yeah, that looks the root cause is a call of IMEContentObserver::PositionChangeEvent::Run() fails to check if it's safe to notify IME of reflow.
But I'm not sure why IMEContentObserver::AChangeEvent::IsSafeToNotifyIME()
http://hg.mozilla.org/mozilla-central/annotate/fb720c90eb49/dom/events/IMEContentObserver.cpp#l1029
and IMEContentObserver::IsSafeToNotifyIME()
http://hg.mozilla.org/mozilla-central/annotate/fb720c90eb49/dom/events/IMEContentObserver.cpp#l903
are returns true at that time.
Flags: needinfo?(masayuki)
Assignee | ||
Comment 12•10 years ago
|
||
Oh, but this should block the nested flush:
http://hg.mozilla.org/mozilla-central/annotate/fb720c90eb49/dom/events/IMEContentObserver.cpp#l949
I guess that before flushing the pending notifications of IMEContentObserver, two or more IMEContentObserver::PositionChangeEvent is added to the script runner. Next, during a notification is flushed, script blocker is added again and destroyed during the previous script blocker is being destroyed. This must cause this symptom. I'll try to write the patch to fix this unexpected nest.
Assignee | ||
Comment 13•10 years ago
|
||
FYI: If it's true, this may be affected to current release too. Because IME or our IME event handler tries to retrieve something (e.g., caret rect) when it's notified of NOTIFY_IME_OF_POSITION_CHANGE.
Reporter | ||
Comment 14•10 years ago
|
||
(In reply to Andrew McCreight [:mccr8] from comment #2)
> Is it okay to unhide this?
I'm fine with unhiding it if you guys don't think there's a security risk (i.e. it's just stack space getting exhausted)
Flags: needinfo?(vdjeric)
Assignee | ||
Comment 15•10 years ago
|
||
(In reply to Vladan Djeric (:vladan) -- please needinfo! from comment #14)
> (In reply to Andrew McCreight [:mccr8] from comment #2)
> > Is it okay to unhide this?
>
> I'm fine with unhiding it if you guys don't think there's a security risk
> (i.e. it's just stack space getting exhausted)
Could you check it?
Flags: needinfo?(continuation)
Assignee | ||
Comment 16•10 years ago
|
||
I filed bug 1203381 as performance bug which I found before this bug was reported. I guess that the optimization will fix this bug.
Comment 17•10 years ago
|
||
(In reply to Masayuki Nakano (:masayuki) (Mozilla Japan) from comment #15)
> Could you check it?
I'm sorry, I don't understand what you mean. Would you like me to unhide the bug? (I don't personally know if this is stack space exhaustion.)
Flags: needinfo?(continuation)
Assignee | ||
Comment 18•10 years ago
|
||
(In reply to Andrew McCreight [:mccr8] from comment #17)
> Would you like me to unhide the bug?
Yes, if it's safe.
Assignee | ||
Comment 19•10 years ago
|
||
> 2. Type "firstpaint" (no quotes) into the histogram name box
Where is the "histogram name box"? I cannot find it...
Flags: needinfo?(vladan.bugzilla)
Reporter | ||
Comment 20•10 years ago
|
||
(In reply to Masayuki Nakano (:masayuki) (Mozilla Japan) from comment #19)
> > 2. Type "firstpaint" (no quotes) into the histogram name box
>
> Where is the "histogram name box"? I cannot find it...
Click on the link in comment 0, click on the underlined text TELEMETRY_PENDING_LOAD_FAILURE_PARSE text. You will see a list box pop out as well as a search box. Type "firstpaint" (no quotes) into the search box
Flags: needinfo?(vladan.bugzilla)
Assignee | ||
Comment 21•10 years ago
|
||
(In reply to Vladan Djeric (:vladan) -- please needinfo! from comment #20)
> (In reply to Masayuki Nakano (:masayuki) (Mozilla Japan) from comment #19)
> > > 2. Type "firstpaint" (no quotes) into the histogram name box
> >
> > Where is the "histogram name box"? I cannot find it...
>
> Click on the link in comment 0, click on the underlined text
> TELEMETRY_PENDING_LOAD_FAILURE_PARSE text. You will see a list box pop out
> as well as a search box. Type "firstpaint" (no quotes) into the search box
Thank you. But I cannot reproduce this bug with typing firstpaint there... Do you reproduce it randomly?
Flags: needinfo?(vladan.bugzilla)
Assignee | ||
Comment 22•10 years ago
|
||
This build includes performance optimized patches. With the patches, recursive call should be prevented completely. Do you reproduce with this build?
https://treeherder.mozilla.org/#/jobs?repo=try&revision=f0d27dc6b463
Win x86:
http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/masayuki@d-toybox.com-f0d27dc6b463/try-win32/
Updated•10 years ago
|
Group: core-security
Reporter | ||
Comment 23•10 years ago
|
||
(In reply to Masayuki Nakano (:masayuki) (Mozilla Japan) from comment #22)
> This build includes performance optimized patches. With the patches,
> recursive call should be prevented completely. Do you reproduce with this
> build?
I can't reproduce this crash in Nightly anymore, possibly because telemetry.mozilla.org has fewer histograms in that list box. I'll look into it and test your patch if I can find a way to reproduce the crash
Flags: needinfo?(vladan.bugzilla)
Reporter | ||
Comment 24•10 years ago
|
||
I was able to reproduce the crash again. It turns out e10s is required for this crash
The crash does not reproduce when I use your build
Reporter | ||
Updated•10 years ago
|
tracking-e10s:
--- → ?
Comment 25•10 years ago
|
||
Masayuki, assigning to you since your patch on bug 1203381 fixes this
Assignee: nobody → masayuki
Assignee | ||
Comment 26•10 years ago
|
||
This should be fixed by the fix of bug 1203381. Please reopen this if it's not so.
Status: NEW → RESOLVED
Closed: 10 years ago
Component: Layout: Text → Event Handling
Resolution: --- → FIXED
Target Milestone: --- → mozilla43
Reporter | ||
Comment 27•10 years ago
|
||
Did you uplift the fix in bug 1203381 to Aurora though?
Assignee | ||
Comment 28•10 years ago
|
||
(In reply to Vladan Djeric (:vladan) -- please needinfo! from comment #27)
> Did you uplift the fix in bug 1203381 to Aurora though?
It was landed before the merge. So, already fixed on 43 Aurora.
Updated•6 years ago
|
Component: Event Handling → User events and focus handling
You need to log in
before you can comment on or make changes to this bug.
Description
•