Closed Bug 926430 Opened 6 years ago Closed 6 years ago

The wrong thumbnails are displayed in empty grid positions

Categories

(Firefox for Android :: General, defect)

All
Android
defect
Not set

Tracking

()

VERIFIED FIXED
Firefox 28
Tracking Status
firefox27 --- verified
firefox28 --- verified
fennec 27+ ---

People

(Reporter: AdrianT, Assigned: rnewman)

References

Details

(Keywords: regression)

Attachments

(8 files, 3 obsolete files)

44.52 KB, image/png
Details
4.81 KB, patch
bnicholson
: review+
Details | Diff | Splinter Review
3.41 KB, patch
bnicholson
: review+
Details | Diff | Splinter Review
6.85 KB, patch
bnicholson
: review+
Details | Diff | Splinter Review
4.79 KB, patch
bnicholson
: review+
Details | Diff | Splinter Review
7.12 KB, patch
bnicholson
: review+
Details | Diff | Splinter Review
12.75 KB, patch
bnicholson
: review+
Details | Diff | Splinter Review
35.87 KB, image/png
Details
No description provided.
Sorry submitted the bug to early

Nightly 27.0a1 2013-10-14
Samsung Galaxy R (Android 2.3)/ LG Optimus 2X (Android 2.2)

Steps to reproduce:
1) Start with a clean profile or create one if needed
2) Start the app

Expected results:
In the Top Sites grid only the first 2 positions are displayed

Actual results:
The last 2 grid positions which are empty display the thumbnails for firefox support and AMO pages

This is not reproducible on tablets or on phones running Android 4.0 and higher
This seems like a regression from bug 914296.
tracking-fennec: --- → ?
(In reply to Adrian Tamas (:AdrianT) from comment #1)
> This is not reproducible on tablets or on phones running Android 4.0 and
> higher

Ten points for guessing the Android version of the device I tested the patches on :P
Flags: needinfo?(adrian.tamas)
I'm not in a good position to be able to debug this one - don't have any devices with Android <4. Hard to fix that which I can't reproduce. 

Merry christmas: Have a bug.
Regression window:
good build: Nightly 27.0a1 (2013-10-13)
bad build: Nightly 27.0a1 (2013-10-14)
http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=a50141faaab9&tochange=211337f7fb83
Regression window:
good build: 1381680689
bad build: 1381685430
http://hg.mozilla.org/integration/mozilla-inbound/pushloghtml?fromchange=33563ee5a859&tochange=20ffeada8eca

Is it still necessary to do a regression window in fx-team branch?
Flags: needinfo?(adrian.tamas) → needinfo?(aaron.train)
No. This is from bug 914296 more or less which is confirmed in that inbound range.
Flags: needinfo?(aaron.train)
I will try to look at this when I'm done with my review queue and Bug 922694.
Flags: needinfo?(rnewman)
tracking-fennec: ? → 27+
Looking at this now.
Assignee: nobody → rnewman
Status: NEW → ASSIGNED
tracking-fennec: 27+ → ?
Flags: needinfo?(rnewman)
Reproduced on 2.3.6.

UI-related logging:

02-10 12:04:46.289 I/GeckoToolbar( 2201): zerdatime 515572 - Throbber stop
02-10 12:04:46.359 D/GeckoTopSitesPage( 2201): Creating TopSitesLoader: 0
02-10 12:04:46.367 D/GeckoTopSitesPage( 2201): TopSitesLoader.loadCursor()
02-10 12:04:46.632 D/GeckoTopSitesPage( 2201): onLoadFinished: 6 rows.
02-10 12:04:46.656 D/GeckoFavicons( 2201): Cancelling favicon load (1)
02-10 12:04:49.445 D/GeckoFavicons( 2201): Cancelling favicon load (7)
02-10 12:04:49.484 D/GeckoTopSitesPage( 2201): TopSitesLoader.loadCursor()
02-10 12:04:49.593 D/GeckoTopSitesPage( 2201): onLoadFinished: 6 rows.
02-10 12:04:49.617 D/GeckoFavicons( 2201): Cancelling favicon load (2)
02-10 12:04:49.617 D/GeckoFavicons( 2201): Cancelling favicon load (3)
02-10 12:04:49.671 D/GeckoFavicons( 2201): Cancelling favicon load (5)
02-10 12:04:49.671 D/GeckoFavicons( 2201): Cancelling favicon load (6)
A little bit of safety/style/logging I noticed along the way.
Attachment #823657 - Flags: review?(bnicholson)
This didn't work anyway, but at least now we don't write to the DB!
Attachment #823659 - Flags: review?(bnicholson)
Here we do two things:

* Try not to do work on redundant calls.
* Try not to display a favicon in the case of view recycling. My logging never revealed that this was happening, but adding the code made the bug go away, so I point my finger at races in the logging!

You'll note that we still run the bindView body twice for thumbnails, because those are loaded late and swapped in.
Attachment #823664 - Flags: review?(bnicholson)
Comment on attachment 823657 [details] [diff] [review]
Part 0: trivial stuff.

Review of attachment 823657 [details] [diff] [review]:
-----------------------------------------------------------------

::: mobile/android/base/home/TopSitesPage.java
@@ +514,5 @@
>      }
>  
>      public class TopSitesGridAdapter extends CursorAdapter {
>          // Cache to store the thumbnails.
> +        private volatile Map<String, Bitmap> mThumbnails;

I think mThumbnails must be used only from the UI thread, so I'd prefer we just note that requirement here instead of making this volatile.
Attachment #823657 - Flags: review?(bnicholson) → review+
Comment on attachment 823659 [details] [diff] [review]
Part 1: don't (fail to!) store JAR icons in the DB.

Review of attachment 823659 [details] [diff] [review]:
-----------------------------------------------------------------

::: mobile/android/base/favicons/LoadFaviconTask.java
@@ +183,5 @@
> +        final String uriString = targetFaviconURI.toString();
> +        Bitmap image = fetchJARFavicon(uriString);
> +        if (image != null) {
> +            return image;
> +        }

downloadFavicon() is called in only two places, and they're both immediately after your `image = fetchJARFavicon(mFaviconUrl);` below. So there's no need to call fetchJARFavicon() again here, right?
(In reply to Brian Nicholson (:bnicholson) from comment #15)

> downloadFavicon() is called in only two places, and they're both immediately
> after your `image = fetchJARFavicon(mFaviconUrl);` below. So there's no need
> to call fetchJARFavicon() again here, right?

guessDefaultFavicon can return a jar: URI, so I wanted to catch that in the call from line 327.

We could call fetchJARFavicon on the guessed URI, of course, and remove the call from downloadFavicon. Your choice.
Comment on attachment 823659 [details] [diff] [review]
Part 1: don't (fail to!) store JAR icons in the DB.

Review of attachment 823659 [details] [diff] [review]:
-----------------------------------------------------------------

(In reply to Richard Newman [:rnewman] from comment #16)
> guessDefaultFavicon can return a jar: URI, so I wanted to catch that in the
> call from line 327.
> 
> We could call fetchJARFavicon on the guessed URI, of course, and remove the
> call from downloadFavicon. Your choice.

Argh, this code is such a mess. I do think I'd prefer having it there instead of downloadFavicon.
Attachment #823659 - Flags: review?(bnicholson) → review+
Comment on attachment 823703 [details] [diff] [review]
Part 3: more error handling, plus review comments. v1

Review of attachment 823703 [details] [diff] [review]:
-----------------------------------------------------------------

Just spotted the following...

::: mobile/android/base/favicons/LoadFaviconTask.java
@@ -179,5 @@
>          if (targetFaviconURI == null) {
>              return null;
>          }
>  
>          final String uriString = targetFaviconURI.toString();

This variable is now unused.
Attachment #823664 - Flags: review?(bnicholson) → review+
Comment on attachment 823703 [details] [diff] [review]
Part 3: more error handling, plus review comments. v1

Review of attachment 823703 [details] [diff] [review]:
-----------------------------------------------------------------

::: mobile/android/base/favicons/LoadFaviconTask.java
@@ +170,5 @@
>          if (uri.startsWith("jar:jar:")) {
>              Log.d(LOGTAG, "Fetching favicon from JAR.");
> +            try {
> +                return GeckoJarReader.getBitmap(sContext.getResources(), uri);
> +            } catch (Exception e) {

Swallowing exceptions with no output is bad -- Log.e/w here with the exception?

I wish we had some way to report generic exceptions without crashing. There are lots of places in our code where a problem shouldn't be fatal, but catching a generic exception and not reporting it means it's much easier for bugs to go on unfixed. This isn't exactly a prime example of where we would want such a feature compared to some other places we catch generic exceptions (DB upgrades come to mind), but I'll rant here anyway. :)
Attachment #823703 - Flags: review?(bnicholson) → review+
Richard, could you test your build to see if this fixes bug 931949?
Flags: needinfo?(rnewman)
Duplicate of this bug: 931949
Done.

I did find two minor issues in testing:

* In some circumstances, the first attempt to pin a site fails.
* Once I saw one of the default support thumbnails fail to draw anything.

I don't know if these are regressions; in general the behavior now is better than it was. I'll try to repro and fix these two before I land.
Flags: needinfo?(rnewman)
Blocks: 932594
Here's why pinning doesn't work first time.

02-14 12:26:11.328 I/GeckoTopSitesPage(17037): onContextItemSelected: pin
02-14 12:26:11.328 I/GeckoTopSitesPage(17037): Pinning Firefox: Support
02-14 12:26:11.382 I/GeckoLocalBrowserDB(17037): pinSite updated: 0
02-14 12:26:15.421 I/GeckoTopSitesPage(17037): onContextItemSelected: pin
02-14 12:26:15.421 I/GeckoTopSitesPage(17037): Pinning Firefox: Support
02-14 12:26:15.531 I/GeckoLocalBrowserDB(17037): pinSite updated: 1


It doesn't update any rows. Why? Beats me.
(In reply to Richard Newman [:rnewman] from comment #24)

> It doesn't update any rows. Why? Beats me.

*stab stab*

        insertBookmark(uri, values);

        // Return 0 if we added a new row
        return 0;
    }

...

        if (updated > 0)
            getContext().getContentResolver().notifyChange(uri, null);

*stab*
Found another existing bug. We have logic to avoid drawing a favicon on top of an existing thumbnail. That doesn't work if a view is recycled and we don't have a thumbnail for the new site.

This bug never ends.
Attachment #824382 - Flags: review?(bnicholson) → review+
This adds:

* Inline favicon cache fetching. We now show favicons instantaneously, rather than spawning an async task to do so, in most cases. This is blazingly fast.
* Pinning works.
* Random DB operations are now quicker and code is less stupid.
* TopSitesGridItemView makes sense.
* Thumbnails always stick with their sites.
* We don't even flicker favicons unless they're not in the cache.
Attachment #824382 - Attachment is obsolete: true
Attachment #824400 - Flags: review?(bnicholson)
Comment on attachment 824400 [details] [diff] [review]
Part 4: correct the use of update-or-insert, such that pinning starts to work. v2

Review of attachment 824400 [details] [diff] [review]:
-----------------------------------------------------------------

::: mobile/android/base/home/TopSitesGridItemView.java
@@ +134,5 @@
>  
> +    public static enum UpdateResult {
> +        NO_CHANGE,
> +        THUMBNAIL_SHOWN,
> +        FAVICON_SHOWN,

From what I can see, THUMBNAIL_SHOWN and FAVICON_SHOWN are handled identically -- can we just merge these together into a single IMAGE_SHOWN?

@@ +200,5 @@
> +        if (favicon != null) {
> +            // Great!
> +            displayFavicon(favicon.image, favicon.faviconURL);
> +            return UpdateResult.FAVICON_SHOWN;
> +        }

It's not clear to me why this is necessary. Doesn't Favicons#getSizedFaviconForPageFromLocal check the cache already? Your comment says we save an AsyncTask, but looking at [1], I don't see any AsyncTasks being spawned for cache hits. I see that we do unconditionally post a Runnable at [2], but that could easily be changed to check if we're on the UI thread already.

[1] http://mxr.mozilla.org/mozilla-central/source/mobile/android/base/favicons/Favicons.java#155
[2] http://mxr.mozilla.org/mozilla-central/source/mobile/android/base/favicons/Favicons.java#74

::: mobile/android/base/home/TopSitesPage.java
@@ +573,5 @@
>  
> +            switch (updated) {
> +              case FAVICON_SHOWN:
> +                // Nothing to do.
> +                return;

I think you're missing a THUMBNAIL_SHOWN fall-through here, but why not just omit this case altogether?
(In reply to Brian Nicholson (:bnicholson) from comment #29)

> It's not clear to me why this is necessary. Doesn't
> Favicons#getSizedFaviconForPageFromLocal check the cache already?

It does, but it also does more work.


> comment says we save an AsyncTask, but looking at [1], I don't see any
> AsyncTasks being spawned for cache hits. 

Yeah, the comment is insufficient.

For cache hits, that method fires a runnable, which is the exact opposite of what we want here: we need to show something *right now*, and we want to avoid flicker. (I wrote this patch because the implementation without it looked *so janky*!)

For cache misses in getSizedFaviconForPageFromLocal, we spawn a LoadFaviconTask (Favicons.java line 210). That's the thorough-but-slow path that we want to trigger on failure.

The approach in this patch:

* Doesn't scale images, 'cos that's expensive.
* Doesn't spawn a runnable, 'cos we want the result immediately, on this thread.
* Doesn't spawn a task on failure, 'cos we're going to fall back to doing the whole caboodle.

That is: it's a super optimistic look into the cache to avoid drawing the temporary favicon for 50msec. And the return code is to avoid triggering the slow/long path.


> I think you're missing a THUMBNAIL_SHOWN fall-through here, but why not just
> omit this case altogether?

Fair!
Comment on attachment 823664 [details] [diff] [review]
Part 2: actually fix the bug.

Review of attachment 823664 [details] [diff] [review]:
-----------------------------------------------------------------

::: mobile/android/base/home/TopSitesPage.java
@@ -575,5 @@
> -                        @Override
> -                        public void onFaviconLoaded(String url, String faviconURL, Bitmap favicon) {
> -                            view.displayFavicon(favicon, faviconURL);
> -                        }
> -                    }));

Drive-by: I'm still not entirely convinced this is going to bring the ideal UX. Before the new Favicons API landed, we were loading thumbnails and favicons (where thumbnails were not available) in ThumbnailsLoader using batched queries. This means that we were updating the UI only once using more efficient queries. Now we're first trying to load thumbnails and then loading favicons individually for each item for which a thumbnail was not found. Furthermore, this approach adds a lot more complexity to handle view recycling and will make it much harder to implement things like bug 914872 properly.

I realize this was a necessary thing to do in order to land the new Favicons API. But, going forward, I think our efforts should go into changing the Favicons API to allow us to load everything in one go (thumbnails + favicons) in ThumbnailsLoader (or something along these lines) instead of pushing this complexity to the UI code.
(In reply to Lucas Rocha (:lucasr) from comment #31)

> Drive-by: I'm still not entirely convinced this is going to bring the ideal
> UX. Before the new Favicons API landed, we were loading thumbnails and
> favicons (where thumbnails were not available) in ThumbnailsLoader using
> batched queries. This means that we were updating the UI only once using
> more efficient queries. Now we're first trying to load thumbnails and then
> loading favicons individually for each item for which a thumbnail was not
> found.

That's orthogonal to this bug, and the last patch on this bug isn't intended to yield the ideal UX (though it's way better than we have right now).

All this patch does is avoid flicker and/or late display due to bindView occurring some time before a delayed favicon fetch returns.

It avoids flicker in two ways: if an icon is in the cache, it uses that; if it's not, it uses the default icon, and triggers the delayed fetch to do more work (DB fetch, scaling).

We can populate the cache however we choose.

(Earlier patches avoid more jank by killing a needless DB write, amongst other things.)


> Furthermore, this approach adds a lot more complexity to handle view
> recycling

No it doesn't. It dramatically cleans up a lot of the mess that was already in the code, and adds a single optimization to do something strictly better than using the default favicon. Please read the patch again.


> But, going forward, I think our efforts should go into changing the
> Favicons API to allow us to load everything in one go (thumbnails +
> favicons) in ThumbnailsLoader (or something along these lines) instead of
> pushing this complexity to the UI code.

I agree to a point. But consider:

* Sometimes UI work has to occur synchronously (e.g., the view has to be ready to show when bindView returns).
* Sometimes fetches need to be asynchronous and cross-thread.

There will inevitably need to be a "load now if we can" and "load later" primitive pair, and all of the consequent touchpoints between UI code and loading code. You have exactly the same pair of concepts in Smoothie, IIRC.

Furthermore, how we *load* does not necessarily dictate a particular binding between the loading code and the UI. Right now the view is super dumb, with the page poking it to set state. This patch shifts that ever so slightly to be better OO, but the logic change is really just the introduction of a small "get it now if we can" method.
(In reply to Richard Newman [:rnewman] from comment #32)
> (In reply to Lucas Rocha (:lucasr) from comment #31)
> 
> > Drive-by: I'm still not entirely convinced this is going to bring the ideal
> > UX. Before the new Favicons API landed, we were loading thumbnails and
> > favicons (where thumbnails were not available) in ThumbnailsLoader using
> > batched queries. This means that we were updating the UI only once using
> > more efficient queries. Now we're first trying to load thumbnails and then
> > loading favicons individually for each item for which a thumbnail was not
> > found.
> 
> That's orthogonal to this bug, and the last patch on this bug isn't intended
> to yield the ideal UX (though it's way better than we have right now).

I agree this is orthogonal to this bug. I should have said so. My points are more about the current approach for loading favicons, not about your patches. 

> > Furthermore, this approach adds a lot more complexity to handle view
> > recycling
> 
> No it doesn't. It dramatically cleans up a lot of the mess that was already
> in the code, and adds a single optimization to do something strictly better
> than using the default favicon. Please read the patch again.

Yeah, I'm not blaming your patches for the complexity (you're right, they do improve things) but the underlying approach for loading favicons (that was introduced when we landed new Favicons API) that your patches are trying to fix here. For instance, some of these changes would be necessary if were doing the batched loading of thumbnails/favicons in a Loader.

This patch fixes actual bugs in our code so let's go ahead and land them. My only request here is, please, file a follow-up bug to track this discussion.
(In reply to Lucas Rocha (:lucasr) from comment #33)

> I agree this is orthogonal to this bug. I should have said so. My points are
> more about the current approach for loading favicons, not about your
> patches. 

Gotcha. Sorry for misinterpreting!

> This patch fixes actual bugs in our code so let's go ahead and land them. My
> only request here is, please, file a follow-up bug to track this discussion.

Absolutely.

Is Bug 926139 enough coverage?
This shouldn't be contentious.
Attachment #824400 - Attachment is obsolete: true
Attachment #824400 - Flags: review?(bnicholson)
Attachment #824851 - Flags: review?(bnicholson)
Attachment #824852 - Flags: review?(bnicholson)
Now fixes that other bug, too…
Attachment #824962 - Flags: review?(bnicholson)
Attachment #824852 - Attachment is obsolete: true
Attachment #824852 - Flags: review?(bnicholson)
Blocks: 926566
Attachment #824851 - Flags: review?(bnicholson) → review+
Comment on attachment 824962 [details] [diff] [review]
Part 5: inline favicon lookup, bnicholson styleee. v2

Review of attachment 824962 [details] [diff] [review]:
-----------------------------------------------------------------

LGTM!
Attachment #824962 - Flags: review?(bnicholson) → review+
tracking-fennec: ? → 27+
Duplicate of this bug: 931918
Flagging myself for needinfo for uplift.
Flags: needinfo?(rnewman)
Keywords: verifyme
Just a note: I've observed that this bug is only 99% fixed once I start fixing inefficiencies elsewhere.

I've seen the first site's favicon draw in the last spot (even when the first site has a thumbnail). This only happens sometimes, and it isn't easily reproducible. The original issue reliably showed two favicons in the last two empty spots.

Must be another much narrower timing issue hiding; I'll look into it as a follow-up.
Flags: needinfo?(rnewman)
Comment on attachment 823657 [details] [diff] [review]
Part 0: trivial stuff.

[Approval Request Comment]

** Multi-patch approval request! There are six commits in this series. **

Bug caused by (feature/regressing bug #): 
  Long, long, troubled history.

User impact if declined: 
  This is a multi-parter. It dramatically improves behavior wrt the actual bug (doesn't 100% solve it, but close), and it also makes our DB operations more efficient -- including saving us a DB write when about:home first loads!

  The final part makes the responsiveness of favicon loads *much* better, by avoiding a thread switch on cache hits.

Testing completed (on m-c, etc.): 
  Been in Nightly for a few days with no complaints. Lots of manual testing and some Robocop coverage.

Risk to taking this patch (and alternatives if risky): 
  It's not trivial, but it's nowhere near as risky as it looks.

String or IDL/UUID changes made by this patch:
  None.
Attachment #823657 - Flags: approval-mozilla-aurora?
Comment on attachment 823657 [details] [diff] [review]
Part 0: trivial stuff.

Although moderately risky its ok to uplift given the perf wins and where we are in the cycle. It will be helpful if we can identify testcases for QA to do extensive testing here to help catch fallouts.
Attachment #823657 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Blocks: 935144
The bug is still reproducible on Aurora and Nightly
Device: HTC Desire (Android 2.2)
Check the screen capture attached.
See Comment 43.
(In reply to flaviu.cos from comment #47)
> Created attachment 832151 [details]
> The file attached is a screen capture
> 
> The bug is still reproducible on Aurora and Nightly
> Device: HTC Desire (Android 2.2)
> Check the screen capture attached.

(In reply to Richard Newman [:rnewman] from comment #48)
> See Comment 43.

Let's open a new bug to track the remaining issues.
Depends on: 939060
Verified as fixed in builds:
- 27 beta 6;
- Aurora 28.0a2 (2014-01-14);
Device: HTC Desire (Android 2.2)
Status: RESOLVED → VERIFIED
Keywords: verifyme
You need to log in before you can comment on or make changes to this bug.