Closed Bug 1119938 Opened 10 years ago Closed 10 years ago

Tab loading throbber stops spinning at random times

Categories

(Core :: Graphics: ImageLib, defect)

All
Windows 10
defect
Not set
normal

Tracking

VERIFIED FIXED
Tracking Status
firefox36 --- unaffected
firefox37 + verified
firefox38 + verified

People

(Reporter: ntim, Assigned: seth)

References

Details

(Keywords: regression)

Attachments

(1 file)

Reproducible :
- with and without e10s
- on a blank profile
- on Windows 10 (haven't tried on other platforms)

STR :
- Start latest Nightly
- Load a page

Actual results :
The loading throbber spins fine for a few seconds, then stops spinning on subsequent page loads.

Expected results :
Loading throbber should spin correctly.
This bug manifests inconsistently: I can reproduce it on every startup, but the throbber sometimes spins briefly and then gets stuck in a random state.
Judging from when bug 927349 landed, I think bug 927349 is the culprit.
Blocks: 927349
Flags: needinfo?(bbirtles)
Keywords: regression
Summary: Loading throbber no longer spins → Tab loading throbber stops spinning at random times
Component: Theme → Layout
Product: Firefox → Core
Version: unspecified → Trunk
Backing bug 927349 out locally does seem to have improved the situation for me.
I'm afraid I won't be able to look into this until I get back to the office next Tuesday. Based on comment 3, however, it seems like the next step is to get a more precise regression window.
Flags: needinfo?(bbirtles)
Per the dupe, this looks to be bug 1116733. Seth?
Blocks: 1116733
No longer blocks: 927349
Component: Layout → ImageLib
Flags: needinfo?(seth)
(In reply to :Gijs Kruitbosch from comment #6)
> Per the dupe, this looks to be bug 1116733. Seth?

Or not... (comment #3) -- Ryan, was this completely fixed with the local backout you did, or just "better than it was"? It looks like several of these changes are involved (per bug 1117607, quite a few things were backed out at the same time), and it's not clear whether there is one issue or several, or which exact changeset broke which parts. :-(
Blocks: 927349
Flags: needinfo?(ryanvm)
See Also: → 1120036
Not fixed. The green loading throbber still stops at 3 o'clock for me.
Flags: needinfo?(ryanvm)
I just used mozregression to bisect it down. It's pointing to bug 1116733 as the culprit. Will back it out locally to confirm.
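For readers unfamiliar with the approach, the idea behind a regression bisection can be sketched as a binary search over an ordered list of builds. This is only an illustration, not mozregression's actual code or API: `bisect_first_bad`, `builds`, and `is_bad` are hypothetical names, and the sketch assumes the first build is good, the last build is bad, and the breakage persists once introduced.

```python
def bisect_first_bad(builds, is_bad):
    """Return the index of the first bad build.

    Assumptions (hypothetical, for illustration only):
    - builds is ordered oldest to newest,
    - builds[0] is known good and builds[-1] is known bad,
    - once a build is bad, every later build is bad too.
    """
    good, bad = 0, len(builds) - 1
    while bad - good > 1:
        mid = (good + bad) // 2
        if is_bad(builds[mid]):
            bad = mid   # regression is at mid or earlier
        else:
            good = mid  # regression is after mid
    return bad


# Made-up example: ten numbered builds, with the breakage starting at build 6.
builds = list(range(10))
first_bad = bisect_first_bad(builds, lambda build: build >= 6)
print(first_bad)
```

In practice `is_bad` is a person launching the build and watching the throbber. Because this bug shows up at random times, a build may need several page loads before it can reasonably be judged good.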
[Tracking Requested - why for this release]: Throbber mostly busted.

Confirmed that the throbber is working fine again with changesets c356ab8b348a, b4cdc04f6555, and c96ef32cd8a5 (bug 1116733) and changeset a49774cdd1b1 (bug 1116747) backed out.

Adding in-testsuite? to this, but I can't help but wonder if one of our existing tests would have caught this if they weren't so insanely flaky...
Assignee: nobody → seth
No longer blocks: 927349
Flags: in-testsuite?
It's possible that bug 1121297 will fix this bug.
(In reply to Seth Fowler [:seth] from comment #11)
> It's possible that bug 1121297 will fix this bug.

I applied those two patches locally and the problem remains :(
Incidentally, I just viewed this animated GIF, and it froze on the last frame: https://i.imgur.com/xXj3pdl.gif  Seems likely to be related to this bug.  I reproduced it once in a fresh profile, too, but was unable to reproduce on later (re)loads (even with shift+reload & clearing my cache). I can reproduce pretty reliably in my main profile, though.  I have e10s disabled, if that matters. Using Linux Nightly 38.0a1 (2015-01-14)
So: my guess is that what we're seeing here is RasterImage::OnAddedFrame and
RasterImage::DecodingComplete getting reordered, in such a way that the |if
(mAnim)| check in DecodingComplete fails (because we haven't gotten the first
OnAddedFrame notification yet), which causes us to never call
FrameAnimator::SetDoneDecoding, which causes the affected animation to freeze at
the final frame awaiting more data that never comes.

If that's the cause, then bug 1079627 should have fixed it, as indeed Alice0775
White reports in bug 1120036 comment 7. Post-bug 1079627, we should never start
a decode on one thread and finish it on another, which is the root cause of the
scenario above.

However, if this is still reproducible after bug 1079627 makes it into Nightly
(and I don't expect that to happen, if my guess about the cause is correct) then
this patch should fix it by removing any ordering dependency between
RasterImage::OnAddedFrame and RasterImage::DecodingComplete.
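
The suspected reordering can be illustrated with a small toy model. This is a sketch under stated assumptions, not the attached patch and not Gecko code: the Python classes below are hypothetical stand-ins for RasterImage and FrameAnimator, the model assumes the animator is created on the first OnAddedFrame notification, and the "order-independent" variant shows just one possible way to remove the dependency (remembering that decoding finished and replaying it when the animator appears).

```python
class FrameAnimator:
    """Toy stand-in for the animator; only tracks whether decoding is done."""

    def __init__(self):
        self.done_decoding = False

    def set_done_decoding(self, done):
        self.done_decoding = done


class OrderDependentImage:
    """Models the suspected bug: completion is dropped if no animator exists yet."""

    def __init__(self):
        self.anim = None

    def on_added_frame(self):
        # Assumption: the animator is created lazily on the first notification.
        if self.anim is None:
            self.anim = FrameAnimator()

    def decoding_complete(self):
        if self.anim:  # mirrors the |if (mAnim)| check described above
            self.anim.set_done_decoding(True)


class OrderIndependentImage:
    """Hypothetical fix: remember completion and replay it for a late animator."""

    def __init__(self):
        self.anim = None
        self.has_been_decoded = False

    def on_added_frame(self):
        if self.anim is None:
            self.anim = FrameAnimator()
            if self.has_been_decoded:
                # Completion arrived first; tell the new animator now.
                self.anim.set_done_decoding(True)

    def decoding_complete(self):
        self.has_been_decoded = True
        if self.anim:
            self.anim.set_done_decoding(True)


def animation_finishes(image, events):
    """Deliver the named events in order; report whether the animator knows decoding is done."""
    for event in events:
        getattr(image, event)()
    return image.anim is not None and image.anim.done_decoding


reordered = ["decoding_complete", "on_added_frame"]
print(animation_finishes(OrderDependentImage(), reordered))    # False
print(animation_finishes(OrderIndependentImage(), reordered))  # True
```

With the events delivered in the expected order (`on_added_frame` first), both variants report completion; only the reordered case separates them, which matches the "freezes at the final frame" symptom in this model.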

I'd be interested to hear if anyone can still reproduce this issue with the
Nightly that gets spun tonight, and if so, what effect this patch has. I can't
really verify the fix myself as I just cannot reproduce this reliably on OS X
*or* Linux.
Here's a try job for this patch, just so we can move more quickly if it does turn out to be useful:

https://tbpl.mozilla.org/?tree=Try&rev=94a0375f4e50
Flags: needinfo?(seth)
Ben asked over in bug 1120036, which is now resolved, if there's any information that people who can reproduce this could provide to help it get fixed faster. If anyone can still reproduce and wants to get dirty with a debugger and help, please see bug 1120036 comment 20 for directions on how to get some information that would help identify the problem.
Fresh build off m-c tip today is working great. Looks like your theory is confirmed :)
(In reply to Ryan VanderMeulen [:RyanVM UTC-5] from comment #19)
> Fresh build off m-c tip today is working great. Looks like your theory is
> confirmed :)

Fantastic, thanks for verifying the fix!

I'm going to keep this open over the weekend in case anyone can still reproduce. If I don't hear from anyone by Tuesday, I'll mark this resolved.
Seems to be ok in Nightly:

gecko.mstone = 38.0a1
gecko.buildID = 20150118030202
Mozilla/5.0 (Windows NT 6.1; WOW64; rv:38.0) Gecko/20100101 Firefox/38.0

Though I did encounter a page this morning where the throbber wasn't visible in a tab as it was loading. I'm not sure if it's related to this. Filed as bug #1123090.


Still freezes in Aurora:

gecko.mstone = 37.0a2
gecko.buildID = 20150118004006
Mozilla/5.0 (Windows NT 6.1; WOW64; rv:37.0) Gecko/20100101 Firefox/37.0
(In reply to Ray Satiro from comment #21)
> Though I did encounter a page this morning where the throbber wasn't visible
> in a tab as it was loading. I'm not sure if it's related to this. Filed as
> bug #1123090.

I suspect that's different; thanks for filing the bug!

> Still freezes in Aurora:

Yeah, the fix needs to be uplifted. I'll aim to get that done this week.
FWIW, bug 1079627 landing fixed more than just the throbber for me - Nightly was being crazy slow in general during page load (especially on image heavy pages), and now it's back to normal. Since this bug is on Aurora, I assume that slowness is too, so uplifting bug 1079627 is critical (AWSY regression notwithstanding).
That's interesting, Emanuel! The only performance feedback I've heard about bug 1079627 is that it made Nightly *slower* on single-core devices. I haven't heard anything about a positive performance impact until your comment.

Would you mind describing the computer or device you saw this on, Emanuel? OS, processor speed, number of cores, amount of memory?
Flags: needinfo?(emanuel.hoogeveen)
Windows 7 x64 w/ SP1
8-core (4 physical) Intel Ivy Bridge @4.2GHz
32 gigs of DDR3

Running 64-bit Nightly. Notably, all cores are currently busy doing some Mathematica calculations, which is probably making things less responsive.

The behavior I saw before bug 1079627 landed is that reloading a page would hang the browser for a second or so at a time, seeming to slowly load the page in chunks. On an image heavy page, images seemed to load in batches. It didn't always reproduce, although this bug did, and session restore felt slow also (I have about 20 pinned tabs).

If I had to guess I'd say it felt like it was waiting on a batch of n threads and only moving on after some timeout value expired. But I'm only saying that because I happened to catch njn's DMD doing that because of the awful NS_StackWalk implementation ;)
Flags: needinfo?(emanuel.hoogeveen)
(In reply to Seth Fowler [:seth] from comment #24)
> That's interesting, Emanuel! The only performance feedback I've heard about
> bug 1079627 is that it made Nightly *slower* on single-core devices. I
> haven't heard anything about a positive performance impact until your
> comment.
> 
> Would you mind describing the computer or device you saw this on, Emanuel?
> OS, processor speed, number of cores, amount of memory?

I also noticed what Emanuel describes but in Aurora. I figured it's the lack of throbber movement that made things seem slower but I don't know if they actually are slower. Is there a good test for that? Intel i7 2600k Quad Core/8GB RAM, Win 7 x64 SP1.
(In reply to Ray Satiro from comment #26)
> I also noticed what Emanuel describes but in Aurora.

You noticed the slow down he described, right? Not the subsequent speed up? (Since the speed up Emanuel mentioned was from bug 1079627, which hasn't been uplifted to Aurora yet.) Just making sure I understand.

> Is there a good test for that?

I'm not aware of a good *simple* test; usually we have to use a profiler to measure these things. There's work going on right now to add native events like image decoding to the Firefox developer tools timeline view. So soon this stuff will become easier to visualize, but it's not quite ready yet.
(In reply to Daniel Holbert [:dholbert] from comment #14)
> Incidentally, I just viewed this animated GIF, and it froze on the last
> frame: https://i.imgur.com/xXj3pdl.gif

(following up on this comment -- I can't repro this imgur gif-freeze in today's nightly, 2015-01-20, so I think this part is fixed. If I hit it again, I'll file a new bug.)
Just to confirm, the issue doesn't seem to occur anymore.
(In reply to Seth Fowler [:seth] from comment #27)
> (In reply to Ray Satiro from comment #26)
> > I also noticed what Emanuel describes but in Aurora.
> 
> You noticed the slow down he described, right? Not the subsequent speed up?
> (Since the speed up Emanuel mentioned was from bug 1079627, which hasn't
> been uplifted to Aurora yet.) Just making sure I understand.

Right, not the subsequent speed up. My activity is different in Aurora than it is in Nightly so I don't know if I would have noticed it in Nightly. What I can confirm for Nightly is that the throbber works again, that's all. In Aurora I often open up several tabs at once by middle-clicking on any of a number of bookmark bar folders. That seems much slower currently. I figure it's because the throbbers in each tab are frozen that it just looks slower. I don't have a metric to give you, sorry.

(In reply to Tim Nguyen [:ntim] from comment #29)
> Just to confirm, the issue doesn't seem to occur anymore.

But where? In Nightly it doesn't but in Aurora 37.0a2 (2015-01-20) it still does for me anyway. My understanding is there is a fix that will be uplifted.
(In reply to Ray Satiro from comment #30)
> (In reply to Tim Nguyen [:ntim] from comment #29)
> > Just to confirm, the issue doesn't seem to occur anymore.
> 
> But where? In Nightly it doesn't but in Aurora 37.0a2 (2015-01-20) it still
> does for me anyway. My understanding is there is a fix that will be uplifted.
In Nightly. Yes, it still needs to be uplifted.
Thanks for sharing your experiences!

I am planning to uplift this but it can't happen yet because I need to fix bug 1122704 first. Expect it soon, though.
I've been seeing general slowness too with Aurora, but it's hard to tell if it seems slower only because the throbbers aren't spinning.
I've noticed Firefox got slower with pages that contain many images (CPU usage spiked too while loading these pages).
(In reply to Marco Castelluccio [:marco] from comment #34)
> I've been seeing general slowness too with Aurora, but it's hard to tell if
> it seems slower only because the throbbers aren't spinning.

Hopefully that will be resolved soon with the uplift of bug 1079627. I'll try to get that done ASAP; unfortunately there is another bug blocking the uplift.
If the only blocking bug is the memory regression, I'd take the memory regression instead of such a slow browser.
Aurora has generally been very slow for the last 3 weeks.
Why is this bug still not fixed in Aurora?
(In reply to Maxim Shpakov from comment #40)
> Aurora has generally been very slow for the last 3 weeks.
> Why is this bug still not fixed in Aurora?

It's not fixed in Aurora because there was another bug, bug 1122704, blocking uplift. It turned out to be very complex to fix, and so I was only able to get it resolved yesterday. I will be requesting uplift for bug 1079627 (which fixes this issue and also the slowness people are talking about in earlier comments) today, so you can expect the fix in Aurora soon.
I reproduced the initial issue (throbber stops spinning) on Windows 10 64-bit using an old Nightly (2015-01-09).
I was not able to reproduce the issue anymore using the STR from this bug and the duplicates using the latest Nightly, with e10s both enabled and disabled, on Windows 10 64-bit, Mac OS X 10.9.5 and Ubuntu 14.04 32-bit.
Seth - Bug 1122704 states that it was fixed by bug 1125490. Does bug 1125490 need to be uplifted to fix this bug or can this be resolved now that bug 1079627 has been uplifted to 37?
Flags: needinfo?(seth)
So bug 1125490 is indeed uplifted. We can resolve this.
Status: NEW → RESOLVED
Closed: 10 years ago
Flags: needinfo?(seth)
Resolution: --- → FIXED
(In reply to Bogdan Maris, QA [:bogdan_maris] from comment #42)
> I reproduced the initial issue (throbber stops spinning) on Windows 10
> 64-bit using an old Nightly (2015-01-09).
> I was not able to reproduce the issue anymore using the STR from this bug
> and the duplicates using the latest Nightly, with e10s both enabled and
> disabled, on Windows 10 64-bit, Mac OS X 10.9.5 and Ubuntu 14.04 32-bit.

Same results using Firefox 37 beta 1 and latest Aurora 37.0a2 (2015-02-23)
Status: RESOLVED → VERIFIED