Closed Bug 796474 Opened 7 years ago Closed 6 years ago

[email] quasi-infinite scroll may hitch under OMTC; consider using absolute positioning with virtual coordinate space

Categories

(Firefox OS Graveyard :: Gaia::E-Mail, defect, P1)

defect

Tracking

(blocking-b2g:1.4+, b2g-v1.4 fixed, b2g-v2.0 fixed)

VERIFIED FIXED
2.0 S1 (9may)
blocking-b2g 1.4+
Tracking Status
b2g-v1.4 --- fixed
b2g-v2.0 --- fixed

People

(Reporter: ghtobz, Assigned: jrburke)

References

Details

(Keywords: perf, Whiteboard: [c=handeye p= s=2014.05.09.t u=1.4] [label:email] [UX-?][p=13])

Attachments

(10 files, 2 obsolete files)

[GitHub issue by asutherland on 2012-08-11T23:21:55Z, https://github.com/mozilla-b2g/gaia/issues/3373]
When we scroll down far enough, we may remove (newer) messages from the top of the list.  We do this by removing their DOM nodes and adjusting scrollTop to compensate.  Per Doug Sherk, there is likely an inherent race between the off-the-main-thread-compositing mechanisms' value and the JS execution thread for reading/manipulating this variable.  Specifically, it's very likely that we would perceive and adjust an out-of-date value, causing the display to jump back up the list a bit.

The proposed solution is to avoid causing a change in coordinate space by using position: absolute.  When scrolling stops (or when compelled to due to running low on coordinate space), we can perform the instantaneous fix-up then.  Sticking the messages in bounded-size position:absolute containers in order to reduce reflows might make sense.

Note that hitching in our scrolling could also be the result of our back-end running in the same thread as the UI, GC, etc.  Some type of logging/profiling should be used before assuming it's this, and platform people should be consulted to see if there are better things we can do or platform could be enhanced to help us do.  For example, roc had at one point posited about dealing with a similar situation where one would replace the XUL tree-view with an HTML approach using a virtual coordinate space and having the system call into user code to create DOM as-needed based on the virtual coordinate space.
Component: Gaia → Gaia::E-Mail
There has been no movement on this bug for sometime.
Requesting QA to test id this bug is still valid.
Flags: needinfo?(carlos.martinez)
Keywords: qawanted
No idea of what this bug is about... I´ve tried to browse a lot of emails in the inbox, but I haven´t seen anything strange.
Flags: needinfo?(carlos.martinez)
Keywords: qawanted
(In reply to Carlos Martínez Toral [:carlosmartinez] from comment #2)
> No idea of what this bug is about... I´ve tried to browse a lot of emails in
> the inbox, but I haven´t seen anything strange.

Ok thanks for looking Carlos, when i reference comment #0 I too am unclear as to the specifics of this bug.

So. RFI to Andrew Sutherland (reporter) to clarify the steps to reproduce please.
Flags: needinfo?(bugmail)
Flags: needinfo?(bugmail)
Hi Andrew. Thanks for your comments in response to other RFI's on the other email app bugs so far. Do you think you could clarify for QA the steps to reproduce this bug please.
Flags: needinfo?(bugmail)
Whiteboard: [label:email] → [label:email] [UX-?]
It's not a QA issue, it's a speculative technical issue that we need to keep an eye on.  This bug would most likely come into play if people start reporting that scrolling is jerky/choppy/jumps backwards while they are scrolling and the e-mail app is not actively synchronizing.  We would try and get profiling traces for those bugs or otherwise duplicate and then this bug would track the technical details.
Flags: needinfo?(bugmail)
(In reply to Andrew Sutherland (:asuth) from comment #5)
> It's not a QA issue, it's a speculative technical issue that we need to keep
> an eye on.  This bug would most likely come into play if people start
> reporting that scrolling is jerky/choppy/jumps backwards while they are
> scrolling and the e-mail app is not actively synchronizing.  We would try
> and get profiling traces for those bugs or otherwise duplicate and then this
> bug would track the technical details.

Thanks for clarifying
Summary: [email] quasi-infinite scroll may hitch under OMTC → [email] quasi-infinite scroll may hitch under OMTC; consider using absolute positioning with virtual coordinate space
Whiteboard: [label:email] [UX-?] → [label:email] [UX-?][p=13]
Target Milestone: --- → 1.4 S2 (28feb)
Keywords: perf
Priority: -- → P1
Whiteboard: [label:email] [UX-?][p=13] → [label:email] [UX-?][p=13][c=handye p= s= u=]
I'm going to work this initially, at least through providing the following back-end support:

- Get the number of currently displayable messages
- Providing random seek support based on an absolutely positioned message number

The primary open question at this time I think is how to deal with changes in the number of messages above our current position.  I think it's clear that we want the current set of messages visible to the user to be sticky regardless of what is happening outside that view range, just like we do currently.

There seem to be a few possible things we can do / ingredients

A) Use (relative) grow/shrink requests when we haven't scrolled outside our buffer range.  This effectively keeps the coordinate space relative.

B) If we scroll outside the pre-buffered area, re-open/re-seek the slice... there are variants here:
B1) Always seek absolutely based on coordinate position.  This depends on us adjusting the coordinate space by simultaneously re-positioning elements and manipulating scrollTop potentially aggressively.
B2) Distort the coordinate space by a factor related to the delta in messages above our previous viewing area.  So like if there were 100 messages above our viewing location and there are now 10 more and we go into empty space, treat every 10 pixels of movement as if they were 11.  This avoids jumpiness and potentially minimizes distortion of the user's concept of velocity/etc. in the app *unless there are a lot of messages*.
C) Always do things relatively.  When we hit the top (y=0) and it turns out there should still be a bunch more messages, do the fix-up then.

C) Perform fix-ups opportunistically when we think the scroll is quiescent.  It sounds like even though onscroll is not particularly realtime for us, we can probably infer things by also explicitly hooking touch events.  Specifically, we perform a fix-up after we haven't seen scrolling/touching for a little bit and there's no active touch.  We can also perfectly fix things up when the display switches to the message reader.

D) Hold the absolute positioning thing effectively an invariant.  Do 'C' for fix-ups, but anywhere we'd have a concept of slop or "I'll fix this up when things quiesce", don't.  Fix things up immediately or cause visual jumping as needed.  (Which might not be that needed...)


I would lean towards A, B1, C.  I would probably also lean towards having all of this position information be controlled via the slice since that should make things consistent and transparent-ish for unified folders and search slices  (which this plays into server-side searches where we lazily fetch messages realllllly well).  Which is not to say we can't put a convenience getter on MailHeader to provide the number; we can, the MailHeader knows the slice it belongs to.  But in terms of the messages we send over the wire, we'd be saying "hey, slice, I am saying your first header can be found at absolute position N out of M in the folder/whatever."


Useful context is :roc's post at http://robert.ocallahan.org/2014/02/implementing-virtual-widgets-on-web.html.  Particularly interesting is :Rik's question near the bottom where :roc's reply indicates that using "transform: translate" potentially has the same performance as using top.  This seems potentially useful to make it more visually obvious to the user when we're inserting messages in the middle of the message list.  At least as of :gal's message of 7/26/2012 we weren't support to animate 'top', but that may now be optimized in gecko.

James, I'm very interested in your thoughts on all this; no decision really needed until early next week.
Assignee: nobody → bugmail
Status: NEW → ASSIGNED
Flags: needinfo?(jrburke)
Some thoughts, but maybe we spend some time in video on Friday or Monday walking through some of it:

1) We have HeaderCursor now, and we talked about collapsing that more into the MailAPI (bug 937959). The main thing to note: it has a "currently selected message" concept, and I think that will play into some slice solution here too, tracking that, and having the slice fetch more as the user goes next/prev in the message_reader.

2) I am a bit worried about how we can track the fancy scroll acceleration rate to know how far/fast to grow the slice, and if we can get the data over the from the worker fast enough. Maybe it just means picking really large slices, like 20-30 messages at a time. I think this is similar to your concerns in C.

3) It is tempting to create a branch that uses some dummy data to see how fast we need to be, and in what chunks we need to grab things, and if there is a miss (we don't have the data ready for when the scrollTop enters that section) what does that look like, what mitigation strategy to we use, and if it is different based on the rate of change of scrollTop, and can we (or should we) sample it fast enough to know.

4) My rough attempt at the separation of concerns:

The front end UI component tracks scrollTop and uses that to translate into figuring out, based on the max number of messages possible in the slice and how big the UI component knows each message entry is in the DOM, what slice range is needed, and asks the slice to be get them ready. The slice might have them already ready in the buffer, or it may need to fetch. Maybe the slice needs a rate of change number too to know if it should just speculatively skip over some entries?

The slice does so and populates slice.items with those entries. I wonder, does this mean now that slice.items will have holes? Could we get into a fast enough scroll that the front end would end up with ranges that are not fetched.

I also wonder if we need a "slice, do not notify me of events right now because I'm scrolling, just get things into slice.items" mode.

Reading this back, it may be slightly different than just saying "grow slice" in a direction and in a way that there are no holes. Maybe that is not possible if the IDs all need to be known to know how to keep growing. I periodically purge my slice memory (hey just like a slice?).

I feel like we need some experimentation done, mainly #3, to set a baseline on our limitations. Although we probably have enough to know some basics on how to extend the slices, and perhaps melding in what HeaderCursor does now. 

A more realtime talk is worthwhile though, and after that, you start in with some of the more basic slice changes (max number, possibly collapsing HeaderCursor into the slice), and I start with the #3 dummy data branch to get a feel for UI limits/failure modes perhaps. But a realtime talk will do better to fix actual direction.
Flags: needinfo?(jrburke)
Let's do the realtime talk on Friday.  Maybe you can play with roc's demo from his blog at http://people.mozilla.org/~roc/virtual-list-demo.html on a buri device/similar with some instrumentation to get a better idea of what type of events we're likely to see?

Food for thought for that though:

In terms of slice growing, I think it's perfectly fine for us to aggressively re-seek the slice based on an absolute message number.  The only thing the back-end uses the slice for is to answer the questions:
- Should I tell the front-end about the changes to this message?  If the message is in the slice range, yes.
- Should I keep caching the blocks associated with this message?  If the block covers the slice range, yes.

We could even change things up entirely and say forget grow/shrink, everything is seek(focal index / possibly suid, num messages before that one, num messages after that one).  The back-end gets that, checks for overlap with what it currently has told the front-end about, reuses what it can, initiates loads for what is new.

The invariant is that the back-end only ever tells the front-end about a continuous range.  No holes; you only find out about what-turned-out-to-be-holes by having them filled in with new messages.  In the case a seek leaves us with no overlap, you basically hear about a delete for everything you already knew about followed by a new splice covering the core of what you seeked to.

(If you want gaps/holes, you create multiple slices and drive them around and de-dupe/avoid overlap yourself.)


It probably does make sense to maintain the concept of the current message cursor in the back-end, as you say.  Right now the back-end is effectively maintaining precise slice-start and slice-end cursors.  We identify the message by timestamp and id which is as full a name we get.  And if those messages are deleted, we update the effective cursor by having them then point in whatever was left at the edge position in the slice.  (We will probably freak out and explode if all of the messages go away, though.)

If we make that selected/focused message cursor canonical, it could likewise automatically adjust.  Although for UX purposes, I'm currently thinking we really just want to be showing something like "hey, this message got deleted!" in the message reader rather than jumping the user somewhere, so I'm still not sure how valuable having the worker thread back-end be aware of it is.  Having it in MailAPI so the UI-ish code doesn't have to be so clever still makes tremendous sense.
blocking-b2g: --- → 1.4+
Clearing blocking flag - per a drivers' decision, QC needs to nominate individual actionable bugs to be FC & QC blockers, not block on meta bugs.
blocking-b2g: 1.4+ → ---
Before going down this road we might want to see why the simple demo using absolute positioning in bug 975831 still checkerboards.
Yes - I was also going to type the following to productivity guys:

We are currently prototyping the virtual list thing in bug 975680 for contacts. We are looking at adding a small library in shared/ to accomplish this. At first implementation it does seem fairly implementation agnostic and would be great if the email app could also leverage this. We are currently still experiencing checkerboarding, but if we can fix this in bug 975831 then it would be nice if we can share an implementation.
(In reply to Kevin Grandon :kgrandon from comment #12)
> At first implementation it does seem fairly implementation agnostic
> and would be great if the email app could also leverage this.

Absolutely; nothing e-mail specific about this.  The only issue is that I think e-mail's needs are probably greater out of the gate than contact's.  We expect that messages will be added and removed, potentially in large numbers, while the user may be actively scrolling (comment 7).  I suspect contacts can currently make the simplifying assumption that there will be minimal changes to the contacts database by other applications while the contacts app is visible and avoid that altogether.  So the shared code may get a bit more complex...
Andrew, since this is on Gaia side, let me know if e-mail can hit the performance goals for 1.4 without this fixed.
Flags: needinfo?(bugmail)
(In reply to Milan Sreckovic [:milan] from comment #14)
> Andrew, since this is on Gaia side, let me know if e-mail can hit the
> performance goals for 1.4 without this fixed.

If performance goals includes specific numbers, I do not know them and cannot say.  I think there's a few things generally going on.:

- Glitchy UX while scrolling due to our infinite scroll logic.  Problem tracked as bug 959782, this bug is the solution to that bug.
- Checkerboarding of the e-mail app.  This was investigated on bug 965019 where all data thus far indicates that the problem is likely caused by our infinite scroll logic touching scrollTop more than it should be.  That's bug 862872, but the fix for that is again this bug.

If we only care about specific checkerboarding metrics, we may be there already.  If we care about the glitchy UX for v1.4, we definitely need to fix this bug.
Flags: needinfo?(bugmail)
Assignee: bugmail → jrburke
Re-noming to keep this on the radar.
blocking-b2g: --- → 1.4?
I should have checked bugmail sooner, will need to investigate work in bug 975831 some more. In the meantime:

I was working on something for email for faster scroll, first pass at it here:

http://jrburke.com/work/gaia/email-vscroll-1.zip

Installable via the app manager:
https://github.com/jrburke/gaia-dev-zip#for-the-ux-person

This version of the email app requires an account to be configured, but just uses canned data for the actual message list, and plugs into some of the existing data/card machinery for the message list.

However, the data is all fake, and not calling into the back end slices yet to grow or shrink data on the scroll. Also, if you end up tapping on a message, it will likely not end well.

I was just prepping something that could have hooks to call out for those slice actions, and something that would run within the DOM that email normally has.

The current compare diff from email master:

https://github.com/jrburke/gaia/compare/max-headroom

The main action is in the vscroll.js file. VScroll uses a list of NodeCache objects, where a NodeCache object is a div that contains a few of the message item divs. 

The hope was that by using this sort of coarser reusable/repositionable element, instead of updating individual item divs was the scroll happened, it would allow for less work.

However, I have not profiled it yet, and it could be that the batched work is more of a drag on painting than the list app :bkelly has adapted in bug 975831.

I can also investigate more in the next coming days, including adapting what :bkelly has to email if the vscroll.js thing does not work out, but just had a moment to update while tending to family -- will be online only sporadically tomorrow too.
(In reply to Andrew Sutherland (:asuth) from comment #15)
> ...
> If we only care about specific checkerboarding metrics, we may be there
> already.  If we care about the glitchy UX for v1.4, we definitely need to
> fix this bug.

Thanks!  Making 1.4+
blocking-b2g: 1.4? → 1.4+
Duplicate of this bug: 959782
No longer blocks: 959782
Target Milestone: 1.4 S2 (28feb) → 1.4 S3 (14mar)
Whiteboard: [label:email] [UX-?][p=13][c=handye p= s= u=] → [c=handeye p= s= u=1.4] [label:email] [UX-?][p=13]
Target Milestone: 1.4 S3 (14mar) → 1.4 S4 (28mar)
Candice,

Mind if you can sync-up with the team for the status update and ETA on this bug. QC is looking at the APZC blockers bug list and this is one of them in the list
Flags: needinfo?(cserran)
Whiteboard: [c=handeye p= s= u=1.4] [label:email] [UX-?][p=13] → [c=handeye p= s= u=1.4] [label:email] [UX-?][p=13]ETA: 3/28
Flags: needinfo?(cserran)
An in-progress update:

I have a scroll approach working that basically just grows the back-end data slice as the user scrolls down, but does not shrink it. So once reaching the bottom of the scrollable area, all data is loaded for that folder, so further scrolls in that folder do not trigger any data actions, besides snippet fetching.

This is likely a bit memory wasteful, so further enhancements are shrinking the slice and holding on to offsets, and asuth and I have talked about a "seek" API for the slices to jump past some items for very large jumps in the scroll position.

There are still some rough areas, not working yet with the new code:

* drafts
* search view
* loading of new items places the user at the end of that new item fetch list instead of keeping them at the top
* cookie cache (currently disabled)

I also do not consider the code that is there polished or ready for review, but this is just meant to give a feel for how the scrolling may work once all the loose ends are finished up, and if this sort of scroll behavior is decidedly better than what is there now.

The VScroll module handles the virtual scrolling parts, and is a bit different than the approach demonstrated by roc/bkelly/kgrandon in that there are three container divs that hold a set of message divs. Since the backend slice data source fetches items in blocks, I felt this matched better to that data source method, by filling in the container div with that data fetch result, and resulted in fewer div repositions in the scroll area.

It could have undesirable behaviors though, still evaluating it. Perhaps having those larger container divs (each one is twice the size of window.innerHeight) causes other problems for APZC, still evaluating. The plus side though is that if it is a problem, that should be more easily reworkable once all of the data interaction and VScroll APIs are sorted out and how they interact with the rest of the code.

Zip file is here:

http://jrburke.com/work/gaia/email-scroll-1.zip

Installable via the app manager following this process:

https://github.com/jrburke/gaia-dev-zip#for-the-ux-person

Working branch diff:

https://github.com/jrburke/gaia/compare/bug796474-email-virtual-scroll

I continue to work on the code, with the rough areas mentioned above as the next work items.
(In reply to James Burke [:jrburke] from comment #21)
> An in-progress update:
> 
> I have a scroll approach working that basically just grows the back-end data
> slice as the user scrolls down, but does not shrink it. So once reaching the
> bottom of the scrollable area, all data is loaded for that folder, so
> further scrolls in that folder do not trigger any data actions, besides
> snippet fetching.
> 
> This is likely a bit memory wasteful, so further enhancements are shrinking
> the slice and holding on to offsets, and asuth and I have talked about a
> "seek" API for the slices to jump past some items for very large jumps in
> the scroll position.
> 
> There are still some rough areas, not working yet with the new code:
> 
> * drafts
> * search view
> * loading of new items places the user at the end of that new item fetch
> list instead of keeping them at the top
> * cookie cache (currently disabled)
> 
> I also do not consider the code that is there polished or ready for review,
> but this is just meant to give a feel for how the scrolling may work once
> all the loose ends are finished up, and if this sort of scroll behavior is
> decidedly better than what is there now.
> 
> The VScroll module handles the virtual scrolling parts, and is a bit
> different than the approach demonstrated by roc/bkelly/kgrandon in that
> there are three container divs that hold a set of message divs. Since the
> backend slice data source fetches items in blocks, I felt this matched
> better to that data source method, by filling in the container div with that
> data fetch result, and resulted in fewer div repositions in the scroll area.
> 
> It could have undesirable behaviors though, still evaluating it. Perhaps
> having those larger container divs (each one is twice the size of
> window.innerHeight) causes other problems for APZC, still evaluating. The
> plus side though is that if it is a problem, that should be more easily
> reworkable once all of the data interaction and VScroll APIs are sorted out
> and how they interact with the rest of the code.
> 
> Zip file is here:
> 
> http://jrburke.com/work/gaia/email-scroll-1.zip
> 
> Installable via the app manager following this process:
> 
> https://github.com/jrburke/gaia-dev-zip#for-the-ux-person
> 
> Working branch diff:
> 
> https://github.com/jrburke/gaia/compare/bug796474-email-virtual-scroll
> 
> I continue to work on the code, with the rough areas mentioned above as the
> next work items.

Thanks for the update James. I understand this is a complex bug and work still remains. Could you please then update the ETA since it seems like this ended up being much larger than expected?
Updated ETA to 4/11. I am not confident at all in the code right now. I thought I had the issues in comment 21 sorted, but found three new problems introduced, and cookie cache is still not restored. I was also getting set to do a perf capture run, but do not feel it wise until the code just works as expected.
Whiteboard: [c=handeye p= s= u=1.4] [label:email] [UX-?][p=13]ETA: 3/28 → [c=handeye p= s= u=1.4] [label:email] [UX-?][p=13]ETA: 4/11
Target Milestone: 1.4 S4 (28mar) → 1.4 S5 (11apr)
Blocks: 986699
Duplicate of this bug: 991900
New version is up, zip file is here:

http://jrburke.com/work/gaia/email-scroll-2.zip

Installable via the app manager following this process:

https://github.com/jrburke/gaia-dev-zip#for-the-ux-person

Working branch diff:

https://github.com/jrburke/gaia/compare/bug796474-email-virtual-scroll

This one should work for all the cases where the previous one did not, and is cleaned up a bit from the previous one.

Notable behaviors as compared to existing master code:
* No scroll area resizings as the slice data comes in (one of the main goals of the change).
* If scroll gets ahead of available data, a "..." placeholder data is used, and the data eventually replaces it after it is fetched.
* when loading a folder for the first time, it now waits and shows the top of the messages, instead of showing possibly a mid-range of the messages with newer ones popping in above them.
* snippets are not fetched until scrolling stops completely.

Things still left to consider:

* doing the scrolling using just the item divs instead of container divs that hold item divs. I see some checkerboarding now on really fast scrolls, and could be because of the container divs, needs some profiling and more experimentation. Also probably needs some others to try it to see what is acceptable. My super fast may be considered "extreme".
* consider shrinking the slice as the user scrolls down/away from active areas, if memory is a concern. Will lead to seeing the "..." placeholders more though.
* code cleanup, review.
(In reply to James Burke [:jrburke] from comment #25)
> * doing the scrolling using just the item divs instead of container divs
> that hold item divs. I see some checkerboarding now on really fast scrolls,
> and could be because of the container divs, needs some profiling and more
> experimentation. Also probably needs some others to try it to see what is
> acceptable. My super fast may be considered "extreme".

So at our recent performance work week I believe Gordon told us some small amount of checkerboarding at extreme scroll speeds was acceptable.  In these cases the device is still clearly responding so the user does not feel it has broken.

Also, I believe he spoke with gfx about implementing a long term plan to implement low-res tiles instead of checkerboarding when we detect we can't keep up.

Is that correct Gordon?
Flags: needinfo?(gbrander)
I captured a quick video of the current state of the scrolling and how it behaves in the app. Unfortunately it looks like the light conditions combined with my phone capture resulted in seeing a grid pattern on the screen. That grid pattern is not something the user sees though, just a lighting/capture artifact:

http://jrburke.com/work/gaia/email-scroll.mov
I used get_about_memory.py to capture the memory for this scenario:

* configured one IMAP email account.
* loaded 4 chunks of messages (tapped "load more messages" 3 times)
* killed email
* reopened email
* scrolled to the bottom of the list, then back to the top
* waited 8 seconds and then ran get_about_memory.py

I uploaded the memory captures to this bug. Download them and load them up in about:memory to view them:

memory-reports-old: the "old" capture (labeled as "email-master" in the process list).

memory-reports-vscroll: the "new" code for virtual scrolling (labeled as "E-Mail" in the process list).

The total explicit allocation for "email-master" is 19.58 MB where the new one, "email" is 18.39 MB. I think some of that is just swirl with memory, and believe some of the sub-numbers do show a slight increase in JS data size. However, as the numbers seem to indicate, there may be some general memory thrash savings since the new approach does not create and destroy nodes for messages, but instead reuses nodes.

Given this peek into the memory, I do not believe the concerns I had for extra memory usage from storing more data as the slice grows is worth trying to address as part of this change. I believe it is not a big of an effect compared to other memory churn.

For me, this means we do not need to worry so much about shrinking slices for this change and we do not need to implement a "seek" method for the slices right now, as the data seems to populate in a reasonable time on a fast scroll, and the "..." placeholder indicates that the data will be there soon.

Also based on feedback I got about the video, checkboarding does not seem to be an issue, so I will stick with the current "container + item" div approach for the scrolling.

I will proceed with just making sure the existing code is ready for review. 

If after practical in-field use we feel like changing some of the behaviors, those can be filed as follow-on bugs. However, the code as-is seems to be a general improvement compared to what exists in master now.
I agree that it's a reasonable simplification to avoid the issue of shrinking/seeking in this patch given the complexity we're already at.  However, we are going to want to change our cache flushing a little.  Your testing regimen won't be catching the potential badness of the body block retention.  To see that you'd need to use the message reader navigation buttons to effectively load all the involved body blocks.

We want to:
- Amend FolderStorage.flushExcessCachedBlocks so that it just discards all the body blocks it knows about.  The logic there was written assuming the slice would be fairly constrained and was arguably already playing a bit fast-and-loose in potentially keeping all the body blocks it knew about in memory.
- Make a call to flushExcessCachedBlocks from within growSlice (but mutex-guarded just like in sliceShrunk! v. important!).  Given the current cache flushing mechanism and the prior change, I don't think it really matters where we do the flush, but in terms of working with the GC, I'd trigger the flush prior to fetching more headers since then the new headers can use the forgotten-about bodies.


It's worth noting that in general the cache flushing logic needs to be smarter to better bound our memory usage.  Or it could be stupid and a lot more aggressive.  The only real correctness requirement we have is that we don't evict blocks loaded by the things that hold the mutexes; theoretically (with a good amount of confidence) if flushExcessCachedBlock discarded all the blocks it knew about, things would still work.  There'd just be slightly more latency when viewing a message and less chance to benefit from locality.  (Although we don't flush all that aggressively, we would flush as a result of flag changes from reading unread messages.)
:asuth: I did the following change for flushing (link will eventually not work, whenever I rebase for final code review):

https://github.com/jrburke/gaia-email-libs-and-more/commit/068927299efc13702f0cb4fd527bf9c9da3d2c81

Thinking more about it: while that change helps while slices are growing, at some point the slice will not grow any more, and could be fairly big, but this change will not help with cleanup once the user starts to use next/previous in that big slice?

On the "in general" topic: my first thought is to go the "stupid aggressive" route: only keep one bodyBlock, the most recent one, or capped to cycling over a limited 5 slots. This would hopefully make flushing easier too as we just limited slot overwrites, no pruning. Not sure if that works well for multiple message operations, but as far as message body use in the UI, I don't expect the front end to need to deal with one at a time, at least with the current UI.
(In reply to James Burke [:jrburke] from comment #32)
> Thinking more about it: while that change helps while slices are growing, at
> some point the slice will not grow any more, and could be fairly big, but
> this change will not help with cleanup once the user starts to use
> next/previous in that big slice?

Yeah, my previous request was just to have us roughly maintain the same amount of covering our ears and yelling "LA LA LA MEMORY IS CHEAP LA LA LA" ;)

> On the "in general" topic: my first thought is to go the "stupid aggressive"
> route: only keep one bodyBlock, the most recent one, or capped to cycling
> over a limited 5 slots. This would hopefully make flushing easier too as we
> just limited slot overwrites, no pruning. Not sure if that works well for
> multiple message operations, but as far as message body use in the UI, I
> don't expect the front end to need to deal with one at a time, at least with
> the current UI.

Building on your suggestion, I propose the following:

- Continue to not flush blocks while the mutex is held.  This is mainly a locality/worst-case-garbage-avoidance thing; I don't think it actually affects correctness.  The sync process loads a lot of headers and potentially wants to touch all of them, it would be a waste if we both had to re-do the I/O to fetch them and our heap grew because they didn't get GC'ed before the new copies got loaded.

- (Continue to not flush dirty blocks, required for correctness)

- Keep 1 body block, the most recent one, like you say.  Do this by having maybeDiscard take a number of blocks to keep around; have it subtract that N=1 off the length of the list.

- Make _loadBlock for a body block trigger a setTimeout iff _mutexQueue.length === 0.  Have the timeout-triggered logic also check _mutexQueue when it actually fires before making the call.  Use a timeout-value on human-scale around the order of how long we expect it would take the user to realize they hit the opposite arrow navigation button from what they meant.  I would say 5 seconds seems reasonable.  The question is then whether we worry that the worst-case scenario of a QA-person hitting the arrow buttons as fast as they can needs to be addressed and merits additional logic like a hard-limit to call the flush method immediately (or to constantly reset the timer and divice the timeout by the number of blocks in the list). Arguably letting a user super-quickly navigate down through a list and back up again is a functionality win for us as long as it doesn't kill the phone.

Since this is fairly simple and uplift begging is not fun, I'd suggest maybe making this change as part of this patch after all to reduce headaches.
Current state of the code, which seems to be performing well, includes some changes since last posting, updated marionette tests to work with the new code, and did some cleanup:

http://jrburke.com/work/gaia/email-scroll-3.zip

Branch compares:

https://github.com/jrburke/gaia-email-libs-and-more/compare/bug796474-email-virtual-scroll
https://github.com/jrburke/gaia/compare/bug796474-email-virtual-scroll

Given the above desire to clean up body blocks, I do not expect this to land tomorrow. I need to do that change, then start the review process, which will likely have some back and forth. Even without cleaning up the body blocks, landing tomorrow was unlikely given the expected review cycle. Pushing ETA out by a week.
Whiteboard: [c=handeye p= s= u=1.4] [label:email] [UX-?][p=13]ETA: 4/11 → [c=handeye p= s= u=1.4] [label:email] [UX-?][p=13]ETA: 4/18
Target Milestone: 1.4 S5 (11apr) → 1.4 S6 (25apr)
Does this issue fix bug 986699?

Upon landing this fix what else would we need to fix on bug 986699?
Flags: needinfo?(jrburke)
I do not have access to the type of hardware that was used to report bug 986699, so I am not sure. I believe in general scrolling is better. Someone who does could use this version of the app to give it an initial try:

http://jrburke.com/work/gaia/email-scroll-3.zip

One way to install it using the app manager:

https://github.com/jrburke/gaia-dev-zip#for-the-ux-person
Flags: needinfo?(jrburke)
Tapas

Could use your help here for testing.
Flags: needinfo?(tkundu)
(In reply to Preeti Raghunath(:Preeti) from comment #37)
> Tapas
> 
> Could use your help here for testing.

I installed email-scroll app from #comment 36 and I still see white screen during scrolling emails using 'email-scroll app'.
Flags: needinfo?(tkundu)
Attached file GELAM pull request
Attachment #8407180 - Flags: review?(bugmail)
Attached file Gaia pull request (obsolete) —
Attachment #8407182 - Flags: review?(bugmail)
Latest zip file that matches the latest code, which includes the bodyBlock changes suggested in comment 33:

http://jrburke.com/work/gaia/email-scroll-4.zip
(In reply to James Burke [:jrburke] from comment #41)
> Latest zip file that matches the latest code, which includes the bodyBlock
> changes suggested in comment 33:
> 
> http://jrburke.com/work/gaia/email-scroll-4.zip

We should get someone on QA to pretest this to verify it & determine if there's regressions. Let me look for an assignee to help out on this.
Some clarification from QA was wanted on what this pull request for this bug addresses, so here it is:

This zip is just for the email app that includes the pull request changes in it:

http://jrburke.com/work/gaia/email-scroll-4.zip

Installable via the app manager following this process:

https://github.com/jrburke/gaia-dev-zip#for-the-ux-person

This means it should be installable next to the existing email app, as an app called "email-scroll-4". So you can launch both the existing email app and this patched one to see how the behaviors are different.

The main issue that is addressed by this pull request:

No scroll area resizings as the slice data comes in. This means that as you scroll, you should not see abrupt stops in scrolling, a flash, then the scrollbar resize to adjust to the new data that was loaded.

Instead, the scroll area is sized to the total number of messages the email app has downloaded, and scrolling should be smooth, with a high frame rate.

You may see some display of message divs that are just "..." while the data is loaded from the back end, but you should continue to be able to scroll, and once scrolling stops, you should see the message data pop in, with the "..." going away. 

Sometimes, depending on how far and fast you scroll, inertial scrolling may need to completely stop for a moment before the final message data replaces the "..." placeholders.

That behavior is "working as designed" as the data is not available yet from the back end of the mail app. It is up to you and product/UX to decide if that is "working as desired", however I do believe the overall scroll experience is much better than before.

There still may be some cases where you can induce some amount of partial checkerboarding, but it should be very temporary, and could just be partially on the screen. We can continue to hone in on any causes of that as part of bug 986699, but it should be much better than before, and any further fixes would require the changes in this changeset as a baseline anyway.

So main point: checkerbaording may not be totally solved with this changeset, but the scroll hitching, the main point of this ticket, should be fixed.

Other notable behaviors as compared to existing master code:

* when loading a folder for the first time, it now waits and shows the top of the messages, instead of showing possibly a mid-range of the messages with newer ones popping in above them.
* snippets are not fetched until scrolling stops completely. This is done to help avoid checkerboarding.
i've done some thorough testing against the app, and overall it seems fairly usable.  It still checkerboards a little when inertial scrolling down a large inbox.  this is usually when i have just started up the app, or if i'm switching folders.    If i close the app or return to the folder, then thats when the ... appears when inertial scrolling, as mentioned above.   

So i'm not sure if this really fixed bug 986699, if the expectation is not to see any white scrolling in bug 986699, then this doesnt fully fix it.  

scrolling within a content body seems to work as expected, once the content in the body has loaded.  I did get an OOM when scrolling a really long body, and filed bug 996991 to track that.  

The rest of the basic functionality seems to work fine.  i went through save draft, send, receive, delete, move to folders, flag messages, switch accounts, notifications, and search functionality.  Didnt see any new bugs that wasnt already reproducing on the default email app.

If anyone else has time, please test it out and give a second opinion here.
Thanks James!
After I tested the local patch (via POP3+IMAP accounts), my test results are same as Tony mentioned.

By the way, as you mentioned, "we can see some display of message divs that are just '...' while the data is loaded from the back end."
It might be inappropriate. I attach a screenshot for UX team to review it. (2014-04-15-09-41-22.png)

* Build Information
 - Local patch: http://jrburke.com/work/gaia/email-scroll-4.zip
 - Gaia:        3534b4880653f24529c973ec2f09ef51a3d6392f
 - Gecko:       https://hg.mozilla.org/releases/mozilla-aurora/rev/12171cfe5ded
 - BuildID:     20140414160201
 - Version:     30.0a2
(In reply to William Hsu [:whsu] from comment #46)
> Created attachment 8407366 [details]
> 2014-04-15-09-41-22.png

While i understand this is the way that James has designed the perceptive workaround to the checkerboarding, i have to agree with William on this.  This is really strange UX to be showing the user.   showing rows of "..." in email scrolling is unconventional and unintuitive if you're a normal user.   Flagging UX here for some better thoughts on improving this flow.
Flags: needinfo?(jhuang)
Flags: needinfo?(hhsu)
Couple of thing need to be confirmed after I tested done.

(1) When I scroll down to end, user can see "Load more messages from server". Why not be able to auto loading message when user scrrow down to bottom?
(2) after I saw around over 140 mails, I quickly scroll up to top (or scroll down to bottom) user can see the blank page. Although it is not always reporduce. (Check attachement)
(3) As William mentioned that user sometime can see the "..." while loading messages.

Edward
Flags: needinfo?(gbrander)
Comment on attachment 8407180 [details] [review]
GELAM pull request

Removing review flag until we get broader agreement on the design goals for this ticket. I will post a comment soon that outlines the higher level design decisions involved in this ticket.
Attachment #8407180 - Flags: review?(bugmail)
Attachment #8407182 - Flags: review?(bugmail)
 (In reply to Edward Chen[:edchen] from comment #48)
> Couple of thing need to be confirmed after I tested done.
> 
> (1) When I scroll down to end, user can see "Load more messages from
> server". Why not be able to auto loading message when user scrrol down to
> bottom?
  [William] This is current UX spec. Please refer to V1.3 POP3 specification and you will know why we need the design.
> (2) after I saw around over 140 mails, I quickly scroll up to top (or scroll
> down to bottom) user can see the blank page. Although it is not always
> reporduce. (Check attachement)
  [William] Please see Tony's test results(Comment 44)

> (3) As William mentioned that user sometime can see the "..." while loading
> messages.
> 
> Edward
So there are few design decisions that are needed for this feature, and while I made a choice for some of them, it makes sense to explicitly enumerate them so that we are all on the same page, and we can get agreement on those decisions. First, the thinking that lead to the current behavior in the pull request:

High level goal: smooth scrolling that responds quickly to user action.

The email behavior in master right now makes this difficult because:

* not all messages that in the backend are shown in the scroll area at once, to reduce memory requirements and to help with general UI performance: the fewer DOM nodes in the DOM, the better it is for UI responsiveness, in general.
* the scroll area right now is only sized to the number of messages currently fetched from the back end.

For instance, on my first sync of an imap account, it fetched 19 inbox messages. When I restart the app, it only has 15 of those fetched initially. As I scroll down, it fetches the 4 other messages. I see a jump in the scroll bar and the scroll area height as those other 4 messages are fetched.

That jump in the scroll bar and scroll height is the primary fix target of this bug.

Ideally checkerboarding is also improved. However, with current email master, and that 19 messages example, I still see some partial checkerboarding as just those 19 messages are scrolled, like the type that Tony saw. For email master, all the divs for the messages are already in the DOM, so unlike in the pull request version, no fancy reusing of DOM nodes, just plain scrolling of a div area with its full contents.

That partial checkerboarding could just be due to the type of DOM/styles we are using or background/other scroll hooks, but I also suspect some issues just with APZC.

## Design decisions for virtual scroll

The virtual scroll solution uses a set of reused DOM nodes to show the messages. So not a new DOM node for each message. Plus, now the UI knows the scroll height that is needed to show all the messages (that the backend synced locally) in that folder.

This helps because now inertial scrolling can continue past the amount of messages the UI knows about. In the above example, the UI front end knows the message data for 15 of the messages at first, but knows there is a total of 19.

If the user scrolls faster than we can fetch data from the back end, which is very possible -- a quick flick scrolls the UI very far -- then the user could end up in a scroll position that is past that 15 message mark. So instead of just showing a blank space, the pull request shows "..." in the reused div for the messages that would be in that scroll location. Once the data is fetched from the back end, it should update to the real message data.

So, the high level design decisions for this ticket are:

1) Should the user be able to scroll to areas where the data is not available from the back end yet?

If answer to 1) is YES:
2) What sort of placeholder should we use in that case, as not showing a placeholder and just a white screen seems bad. The pull request currently uses '...' as it was non-locale specific, and it was lightweight, just text, so nothing extra needed for image display for example. So the UX team needs to decide if '...' is acceptable or if something else is desired.

If answer to 1) is NO:
3) Is the existing behavior on email master preferable? If YES, then we can close this ticket.

If answer to 3) is NO:
4) Should the email app fetch all the message header data from the app's back end so there is no need for a placeholder state for messages? I assumed the answer was closer to NO. Saying YES could mean potentially fetching a lot of data, and the most likely use cases for email are dealing with new messages near the top, not needing to scroll to old messages. So that seemed wasteful and slower for initial startup. But YES is an option.

Currently what is implemented for 4) is GRADUAL: not all data is fetched at the start. However, if the user scrolls down, missing data is fetched, so '...' is seen at first, but once the bottom is reached, all the data is now in memory for the front end and it no longer needs to show the '...'.

So right now, the pull request made these design decisions:

1) YES
2) '...'
3) NO
4) GRADUAL: not initially, but once fetched, the front end holds on to the messages, so should not see '...' again unless the folder is changed, or older messages are pulled from the email server.

If UX wants something different, then I can update the pull request to match.

Here is a video I did previously, but uploaded to vimeo. It helps illustrate the current design choices if reading all of this text is not as helpful:

https://vimeo.com/92172879

As for checkerboarding: I do not believe it will fully go away. I do believe some of it is just inherent in the platform, but also open that email needs to do some changes. But those changes need very fine grained perf captures for specific reproducible cases, and I would want to follow up on closing any holes with it in bug 986699.

However, I will look more into :edchen's full white screen in comment 49, and I will look at some tweaks to see if I can improve it any more while waiting on overall design agreement for this bug. Although for the full white screen: it looks like a capture from simulator or desktop firefox, with maybe dragging the scroll cursor really far either up or down using the mouse. So, while I would still want to be sure that the UI is not stuck in that white screen state (:edchen is it stuck like that?), also that case seems be a bit further outside the most likely target uses of email on touch devices.
Attached image Dot_3line_1.png
Hi James, thank you for your profoundly explanation! We definitely go with (1)

After discussed with Mike & Harly, we like the idea that rendering the data gradually, and we also think the "..." that James proposed is somehow reasonable to show the "loading" process if the data is not yet fully rendered. 

So here we got another proposal which is gradually rendering the data by 4 levels:

Level 1: Full info - Name, title & message body

Level 2: Half info - Name & title

Level 3: One info - Name only

Level 4: No info - a new UI (see attachment)

We are thinking about reducing no-information cases as much as possible. Level 3 is to show at least the name of the message so that more lists will show informations rather than show nothing.

If it really really has no time to render the data at all, then a new UI is proposed to replace "...". It's like the same concept that indicate the data is rendering.

Yet we're not quite sure if showing the name only is helpful for reducing the blank lists or not. James could you give some more feedbacks here?

As for the new UI, it'd be better to do it by coding if possible. Yet it also need feedback from James to see if we could coding it. I'll deliver accurate visual assets for it later.

Any thoughts? Let me know:)
Flags: needinfo?(jhuang)
The states a header can have are:

- No info (fetching from database)

- All info but the snippet (fetched from the database, but a snippet has not yet been downloaded from the server).

- All info including the snippet.  Note that in some cases a snippet cannot be computed until the body has been fully downloaded by displaying the message in the message reader.

I think those correspond to level 4, level 2, and level 1.  There is no level 3 and it's unlikely we would realize any performance benefit from simulating it by pretending we don't have the message subject/snippet/etc.
I will experiment with providing just background fills for some of the slots where the message data shows, as the attachment in comment 54 indicates, instead of using the '...' placeholder. 

I think we should be able to do it, I just need to observe the impact -- tweaking too many styles could trigger reflows that would slow down the scrolling. I hope to have more information end of Monday.

As :asuth mentioned, the possible rendering we will show is either 2 or 1 (1 if the message snippet was fetched), otherwise, 4, with the gray boxes as shown in the attachment in comment 54 will be shown (if the experimentation works out).

Because of that experimentation and because I am in a country that has a holiday tomorrow, I am moving the ETA to 4/25. The good news is that this code is in the review stage now: getting QA/UX feedback, and :asuth has already started a light review of the code.
Whiteboard: [c=handeye p= s= u=1.4] [label:email] [UX-?][p=13]ETA: 4/18 → [c=handeye p= s= u=1.4] [label:email] [UX-?][p=13]ETA: 4/25
James, have you tried enabling layer borders in the developer menu?  I'm curious if you are getting large scrollable layers for visible content or if individual items or group divs are getting the layers.

While looking at roc's virtual-list-demo I found I had to actually remove a DOM node if I wanted to re-use it.  Leaving it attached to the document and repositioning would eventually lead to each list item getting its own layer while scrolling fast.
Here is a variant of the virtual scroll that uses gray backgrounds for some cells of information, for when the data is not yet available for that item:

App zip:

http://jrburke.com/work/gaia/email-redacto.zip

Video:

https://vimeo.com/92457670

In the video, the placeholders only show up in the first 10 seconds, you may need to pause the video in that first 10 second range to see them.

:bkelly asked about layer tile borders. With this gray block background, video of the layer tile borders:

https://vimeo.com/92457793

Looks like maybe a couple extra on first scroll, when the placeholder styles are used, but then settles down after that.

The gray placeholders where constructed by just turning on the background color for some existing divs in the item entry. I felt this was the least intrusive style change. I feel like it creates an extra bit more of work to do, but may just be subjective as the gray boxes are wider, so more visible when they are not there, as compared to the other zip in comment 43 that uses "...".

Asking Juwei to for UX feedback on if this gray box style is good, or if something else is desired.
Flags: needinfo?(jhuang)
(In reply to Tony Chung [:tchung] from comment #44)
> i've done some thorough testing against the app, and overall it seems fairly
> usable.  It still checkerboards a little when inertial scrolling down a
> large inbox.  this is usually when i have just started up the app, or if i'm
> switching folders.    If i close the app or return to the folder, then thats
> when the ... appears when inertial scrolling, as mentioned above.   
> 
> So i'm not sure if this really fixed bug 986699, if the expectation is not
> to see any white scrolling in bug 986699, then this doesnt fully fix it.  
> 
> scrolling within a content body seems to work as expected, once the content
> in the body has loaded.  I did get an OOM when scrolling a really long body,
> and filed bug 996991 to track that.  
> 
> The rest of the basic functionality seems to work fine.  i went through save
> draft, send, receive, delete, move to folders, flag messages, switch
> accounts, notifications, and search functionality.  Didnt see any new bugs
> that wasnt already reproducing on the default email app.
> 
> If anyone else has time, please test it out and give a second opinion here.

Thanks for the information. I start to wonder, is it acceptable to fix this in the next version of Firefox OS?
blocking-b2g: 1.4+ → 1.4?
Hi James, thanks for your quick updates!! U are the best :)
I'm totally ok with the timing the grey box appear, yet can the style be fine tune a little bit more?

I wish the grey box can be 3 lines in order to present the loading process:
First line is shorter, second & third line is longer, thiner and lighter (just like the link https://bug796474.bugzilla.mozilla.org/attachment.cgi?id=8408095) 

I'm not quite sure if it's available to code the grey box precisely like the attachment?
If yes, Fang can provide a visual spec for you to measurement!
If not, let's think about how to make a better layout based on coding constraint (such as how thick the grey box is, how may lines it should be or how long/short a grey box can be)
Flags: needinfo?(jhuang)
Flags: needinfo?(hhsu)
(In reply to James Burke [:jrburke] from comment #58)
> :bkelly asked about layer tile borders. With this gray block background,
> video of the layer tile borders:
> 
> https://vimeo.com/92457793

For what its worth, the layering looks reasonable to me.  (Not an expert on layering, though.)
(In reply to Kevin Hu [:khu] from comment #59)
> Thanks for the information. I start to wonder, is it acceptable to fix this
> in the next version of Firefox OS?

This patch is a prerequisite to fixing any other checkerboarding/scroll responsiveness issues on v1.4.  If everyone is okay with the state of things on v1.4, then we can defer to 2.0.

However, bug 986699 is currently blocking-b2g=1.4+ and in bug 986699 comment 11 :jsmith has indicated that it is a QC blocker.
(In reply to Juwei Huang from comment #60)
> I wish the grey box can be 3 lines in order to present the loading process:
> First line is shorter, second & third line is longer, thiner and lighter
> (just like the link
> https://bug796474.bugzilla.mozilla.org/attachment.cgi?id=8408095) 
> 
> I'm not quite sure if it's available to code the grey box precisely like the
> attachment?

I believe James is reusing the existing display DOM structure when we have content.  For performance reasons, I believe we want to leave that layout intact so we can minimize reflows.  (Noting that we may need to further improve our CSS to fully realize gains from minimizing reflows since the current CSS does at least use "display: table" for .msg-header-author-and-date.  However the rest of the header content is less content dependent including the subject and snippet.)

As such, we are limited to being able to manipulate the padding/margin/border of the header items within the confines of the existing layout.  I think it would be ideally if you could use the app manager to connect to the e-mail app on a device and use the DOM inspector to manipulate the CSS so you can see what it looks like within the existing constraints.  You can delete the text content of the nodes to assist.  And then I think you can even save your CSS changes to disk afterwards!

A starting point for this is https://developer.mozilla.org/en-US/Firefox_OS/Using_the_App_Manager.  There should also be separate documentation on the developer tools which work the same way for local webpages as they do for connected devices.  Engineers like :evanxd at the Taipei office should be able to help you get started if you need any assistance.  The app manager is an invaluable tool for this type of thing, so I think it's definitely worth taking the hit now to learn how to use it if you are not already familiar.  This will help UX and engineering avoid round-trip cycles related to styling.
Attached image bgfill.png
On Juwei's query about styling: I originally tried coloring the background of the From, Subject and Snippet divs, but there currently is no space between them, so it all connects together as a block. To get spacing in there would be enough to trigger layout changes that would slow things down. See the uploaded attachment bgfill.png for what it looks like with two and three of the sections colored in. Juwei, is one of the options in bgfill.png (the two or three of the connected divs) more desirable than what I did before?

:asuth's suggestion of playing with the colors via the app manager is a good suggestion if something in the existing implementation or in bgfill.png is not desired.

On :khu's query about the suitability of this for 1.4, and its relation to bug 986699, note that even when this lands, I do not expect for all checkboarding to go away, and working on checkerboarding (the focus of bug 986699) will take a while, needing to target specific cases and needing very specific perf captures, which I expect will be hard to do.

I do think it is worth it for someone to ask QC how much of a blocker bug 986699, particularly since the bug is "You will see white screens occasionally", which is very vague. I was hoping that by switching to this virtual scroll it would solve all cases of checkerboarding, but playing with it now, it is not going to do that, and I believe some cases are just because of APZC quirks, or because of an interaction of email workload and the moment of scrolling, and the amount of effort to figure out what cases are app and what are APZC will be difficult to do as most of the time, exact causes are not 100% reproducible.

I'll also add this note to bug 986699, also suggesting they try the latest virtual scroll app zip to see if it is better overall for them.
Flags: needinfo?(jhuang)
Milan

Please weigh in for technical analysis on APZC.
Flags: needinfo?(milan)
I don't think we should care about checkerboarding in this bug.  It was really the "glitchy UX while scrolling" (see comment 15).  If that is already taken care of, I don't think the remaining stuff is a blocker (in this bug.)
Flags: needinfo?(milan)
> A starting point for this is
> https://developer.mozilla.org/en-US/Firefox_OS/Using_the_App_Manager.  There
> should also be separate documentation on the developer tools which work the
> same way for local webpages as they do for connected devices.  Engineers
> like :evanxd at the Taipei office should be able to help you get started if
> you need any assistance.  The app manager is an invaluable tool for this
> type of thing, so I think it's definitely worth taking the hit now to learn
> how to use it if you are not already familiar.  This will help UX and
> engineering avoid round-trip cycles related to styling.
Sure.
Juwei can use app manager to hack Email app now.
Attached image Email_scrolling.png
Hi James& Andrew,
I've asked Evan to help me with the App Manager, and it works! Thanks guys :)
I think instead of using the gray box, we probably could try to use special symbols to present the gray line....
I tested it over App Manager so it would be like :
▃▃▃▃▃▃▃
▃▃▃▃▃▃▃▃▃▃▃▃▃▃▃▃▃▃▃▃▃▃▃▃▃
▃▃▃▃▃▃▃▃▃▃▃▃▃▃▃▃▃▃▃▃▃▃▃▃▃

And I use #e7e8e9 as the colour changes, you can check the attachment for the result.
It looks pretty close to what we want:)

Still not quite sure if it effects the scrolling performance or not... James can you help to try it? Thanks !!
Flags: needinfo?(jhuang)
Comment on attachment 8407182 [details] [review]
Gaia pull request

Juwei:

I updated the code to use the unicode character and coloring you specified, the app zip is here:

http://jrburke.com/work/gaia/email-redacto-2.zip

video:

https://vimeo.com/92651265

I was a bit conservative on the length of the bars used, since different resolution/sizes of screens can be used for FirefoxOS, and by using characters, it makes it harder to scale efficiently, without doing some size calculations first. However, I think what is in place in this new app zip seems decent. Just note it will not scale wider for wider screened devices, it will stay the same size as it is shown in that video.

Also, if the font ever changes to make that character look weird, that could also change this appearance.

I think these are all fine, just calling out the constraints.

Asking for ui-review? now on that design, and if given a + by Juwei, then I'll integrate it into this bug's pull request.
Attachment #8407182 - Flags: ui-review?(jhuang)
Adam

Can you let us know if this is needed for 1.4 or 2.0? We have a dependent QC bug which needs to fixed for 1.4 and hence the ask.
Flags: needinfo?(arogers)
This is essentially a new feature (despite blocking a bug), if the work is done I'd like to see this in 1.4.  That written, I would not block on this.
Flags: needinfo?(arogers)
Comment on attachment 8407182 [details] [review]
Gaia pull request

Hi James, thank you for your reminding of all these concerns. I'll mark the review as a + now and keep tracking on the possible changes due to different device resolution.
Attachment #8407182 - Flags: ui-review?(jhuang) → ui-review+
Comment on attachment 8407180 [details] [review]
GELAM pull request

r=asuth with the commit I pushed to your branch to slightly adjust the discard logic so that we will retain 0 body blocks if there are no active folder slices.  (Feel free to squash into your commit; no need to retain attribution.)

The core rationale is that an inactive FolderStorage should not be holding any cached data.  This is important since we maintain a FolderStorage object for all known folders at all times.  (Although in the future we may move to lazily instantiate and definitely will move for them to not hold full block-info state at all times.)
Attachment #8407180 - Flags: review+
Pending Release Management decision. Leaving as 1.4? (nom) - FxOS Perf Triage
Moving back to 1.4+ after input from :jrburke, :asuth, and partners. This is nearly complete and is our best path toward addressing checkerboarding in email scroll.
blocking-b2g: 1.4? → 1.4+
Attachment #8407822 - Attachment is obsolete: true
Comment on attachment 8407182 [details] [review]
Gaia pull request

:jrburke, as part of my review process I've written up a bunch of block comments to document state variables and have pushed them to your branch.  To have a place to hang the comments I added them to the prototype.  Arguably it might make more sense/be more efficient if they were moved to the constructor as direct assignments.

Unfortunately I still have a bit more investigation to go.  Current notes:

- onChange's logic isn't sufficiently stable if the user jiggle-scrolls where they scroll down 5 pixels, then up 5 pixels, then down 5 pixels, etc.  Since the book-keeping relating to NodeCaches is not entirely trivial, it might make sense to just explicitly track absolutely-positioned thresholds in either pixel or item index-space.  We trigger when we cross those.  Removing/repurposing a NodeCache would subtract off its height from the corresponding threshold, adding a NodeCache would increase the threshold.  This approach is safe in the face of insertions/deletions because our coordinate space is forced to be stable by just shunting ourselves to the _recalculate code-path in that case.

- I think I fixed minor scroll accounting issues related to that.

- I've started to make NodeCaches aware of their own height because it makes me uncomfortable for us to assume they all always have a constant number of nodes in them given that we do have boundary conditions at the top/bottom/in small folders and more efficient treatment of insertions/removals will absolutely require that in the future.

Everything else is hopefully obvious from any comments I added to the source.

Feel free to revise/rewrite/remove any comments that are wrong/you don't like/etc.
Blocks: 989116
Attached file Gaia pull request
This pull request supersedes the previous one for this issue:
https://github.com/mozilla-b2g/gaia/pull/18345

This one is similar in concept, but the vscroll module no longer uses the larger NodeCache approach, and just uses a simpler, direct cache of message item nodes. The outside API to vscroll stayed the same, so the rest of the pull request is the same, except for message_list's loadNextChunk, where there is an extra guard around the requestGrowth call.

As the UI is the same, carrying the ux-review+ over from the previous pull request.
Attachment #8407182 - Attachment is obsolete: true
Attachment #8412439 - Flags: review?(bugmail)
Forgot to put this up last night -- app zip of latest pull request code:

http://jrburke.com/work/gaia/email-scroll-5.zip
Awesome work in removing the NodeCaches and other improvements!  I think we are basically there.

I've pushed another patch to your branch.  You can see my changes here in a friendly side-by-side diff: https://reviewboard.allizom.org/r/76/diff/2-3/

The main problem I ran into was a persistent gap in the middle of the visible messages, as mentioned on the screen.  I think this was mainly due to visibility guards in _render being inside the loop instead of outside the loop, but there also might have been a node shortage.  I tried to address these things and help my brain by:

- Cleaned up the constants we use (name-wise) to decide how many nodes to make/etc, posing things in terms of prerendering, prefetching, and extra (which gives us our trailing stuff).  The prior choice of constants was leftover from the NodeCache stuff.  I chose numeric values that were most equivalent to what the code was already doing.  It may be desirable to alter them slightly (but obeying any suggested constraints in the comments.)

- Made the _render/prepareData/friends be very explicit in terms of exhaustive indices they want to deal with rather than using counts or inferred priority of the visible nodes.  I also stripped off guards/pre-checks in favor of only having the guards in the callees.

- Different changes to loadNextChunk.  The important one is that we definitely don't want to pass "true" to requestGrowth if the user didn't push the "get more messages" button.  I converted the awkward loop-based probing logic to closed form math.  There's also some churn related to names / the waiting mechanism that may look a little arbitrary but is mainly the result of iteration and confusion on my part.  Right now I've tried to keep the state machine minimal and with comments indicating clearly why we won't drop a request on the floor.  Also, I think I nearly screwed up and created a potential infinite asynchronous call loop at one point (that your logic did not have), so I really wanted to avoid that.

- resize logic.  I'm a little confused by the spec on this, but it seems like resize might only ever be sent to document.defaultView (AKA window), although Firefox may behave differently right now.  Anyways, I didn't understand what the impact of us treating a resize like a scroll was since all of our logic already pivots on the scrollTop which I think isn't going to change in the face of the keyboard stealing our screen real estate (which is a separate issue; see mentioned bug).  So I removed our resize handler and added some comments.


A very nice bug I added that we need to fix:

- I broke the search base-case when trying to ensure that our concept of screen size didn't get screwed up by the keyboard.  I did this by having us setData immediately prior to us forcing the search field to be focused  (I started with directly calling _init, which is why it has a guard inside it now, but then I changed it to setData when I realized _init really wanted setData).

Except, as I'm sure you know because of how you structured it, this doesn't work because we don't actually have a valid search slice at that point.  So what might be happening is we still have the old message slice happening under the hood or something like that.  (note: in my confusion, I also added a || 0 for the headerCount since I was thinking there actually was a search slice...)

I think the best thing might be for us to normalize things by letting MailAPI create an empty search slice at startup.  This will hopefully let us normalize things, and then maybe we can make it do handy things like setTimeout() a fake oncomplete call for a subsequent turn of the event loop if that lets us normalize things further.

The other alternative is to allow us to save off the metrics before we focus, but change the other stuff back to pre-asuth meddling.


Note: Please don't feel compelled to do any work on fixing the bug I added on Sunday.  I won't be hacking again until east coast evening, but I will be around and I will happily fix my bug as proposed.  Standard IRC write-token rules apply.
Er, and one other bug I may or may not have introduced:

The first time I tested the current code that I pushed, the "load more messages" button did not appear.  But then the next few times it did.  I worry that there may be some type of bug wherein we never quite grow the slice up to the boundary.  So I modified some console logging so when we hit the boundary we should see something like:
  LOG: message_list loadNextChunk growing 7 to 190 of 191
the message is actually (growth requested, high index desired, headerCount) and high index desired + 1 should === hearerCount.  So that case right there is correct although weird-looking.

Looking at the logs I'm also slightly worried that there may be a situation where we are showing "load more messages" before we've actually displayed all the messages.  This would only really be a problem if the UI is not properly representing headerCount and then I think could be self-solving, but I am going to want to think about that a bit more tomorrow.
Comment on attachment 8412439 [details] [review]
Gaia pull request

From my perspective, I think we're good with my change stack, but I need you to review them.  r=asuth.  If you have any substantial changes on top of them, please flip review back to me.  I think some of the constants might need upward adjustments; I don't think I need to review for that.

The changes here are:

- I un-exploded search by reverting my change of having us use setData when there was no search yet.  Instead, as proposed, I capture the metrics at that point.  Everything else stays the same.

- I did some GELAM enhances/cleanups relating to search (and pushed to your branch for that too).  We now lie and pretend we have one more header than we really have in order to get the UI to trigger us to search more messages on demand.  The key thing here is that if we don't end up finding another message, we need to retract that phantom message so we get rid of the default-data message that the UI ends up showing.  Our logic is pretty solid in there, but there's still the underlying issue about headerCount propagation needing to really be more of a first-class thing.  We should address that when we support shrinking off the top.

- GELAM: We weren't reporting headerCount changes on newly fetched messages because of the _curSyncSlice weirdness that I'm not crazy about (but that is my fault of course :).  I added some comments to clarify why we're doing what we're doing there in order to hopefully make it easier to fix that.

- GELAM: The search filter logic could stall out if it encountered a header-fetch that reports a chunk of 0 headers.  This led me to notice that we had some potential ordering hazards relating to getMessageBody and caching, so I think I've addressed that (see comments).  Thinking about this more, I do realize that it is fairly gibberish for us to get a 0-header response.

I think this is the edge case where we're asking for messages older than a given message M but M is the oldest message in its block.  We issue these calls on a per-block granularity.  Clearly we should just eat a callback that would occur with zero messages unless we're doing it to say we're totally at the end of the line.  However because of the flag edge case there and potentially required test cases and sleepiness factors, I don't want to do that right now.

- caching bug; I noticed the number of cookies we were writing to was way up.  The change to use expandos for the index is great but they don't get cloned in cloneNode so I added some logic for that.  Cookie cache usage is the same for me as it was before.

- As noted on IRC, my GELAM diff is showing whatever newline problem we had with make "install-into-gaia" is back.  My memory is foggy about causality here; I think at least one of the problems we saw once was just due to stale submodules and a failure to "make clean".  Please double-check your gelam checkout has all the most up-to-date submodules ("git submodule sync --recursive" "git submodule update --recursive").  I've done so on mine, so if your modules stay the same, that will hopefully help track down the problem.  I suspect we just need to newline normalize the bad JS module in question.

We should address this prior to landing in gaia just to reduce uplift incompatibilities.


In terms of any checkerboarding with this, I'm pretty much of the mind that we should only delay landing this if we see correctness issues.  (Although again, minor constant tweaks make good sense to me before landing.)


For the GELAM stuff, let's avoid squashing the commits for history purposes.  (But do squash gaia for sure, authorship to you.)

reviewboard full diff: https://reviewboard.allizom.org/r/76/diff/4/
delta diff: https://reviewboard.allizom.org/r/76/diff/3-4/
Attachment #8412439 - Flags: review?(jrburke)
Attachment #8412439 - Flags: review?(bugmail)
Attachment #8412439 - Flags: review+
Here is a zip of the latest code in the pull request, :tchung expressed an interest in trying out the latest state:

http://jrburke.com/work/gaia/email-scroll-6.zip

The code is still under review, so not final. Hopefully the only diffs now will be minor things around constants and then possibly a fix related to the HTML cookie cache -- I do not see the app resuming to it. Or if it is, it is getting wiped out quickly with a "loading messages" message.

So feel free to check it out, but there will likely be another (hopefully small) follow-on commit that will be done before this pull request reaching the "final" merge state.

On the plus side, it seems to be performing even better, at least it seems very rare to have a data placeholder view now.
Target Milestone: 1.4 S6 (25apr) → 2.0 S1 (9may)
Comment on attachment 8412439 [details] [review]
Gaia pull request

Changes look good to me. Since I changed the dom structure since when I started the pull request, I had to adjust some of the marionette email test helpers to match. The good part was that they went back to be closer to existing master tests. Squashed/rebased and it passes travis.

For QA:

The background design decisions are outlined in comment 52, but the summary is: scrolling should no longer stop abruptly, the scroll bar jumps around a bit, then the user can continue to scroll.

Instead, now scrolling should allow full inertial scrolling movement for the number of messages the email app has stored for that folder. Sometimes you will see the data placeholder that Juwei describes in comment 68 when the scroll gets to a position where the email UI does not have the data from the email backend worker, but it should eventually be replaced with real data.

You may see occasional checkerboarding for parts of the screen while scrolling, particularly if the email backend is doing work, but separate bugs should be filed for any cases that are seen as not up to standards and have reasonable steps to reproduce. And bugs like bug 986699 should likely be retested with this latest code.

Overall, scrolling feels much better than before.

Merged to master:
https://github.com/mozilla-b2g/gaia/commit/4788795b80c317a97391445512b3e0a72b7d02f4

from pull request:
https://github.com/mozilla-b2g/gaia/pull/18671
Attachment #8412439 - Flags: review?(jrburke) → review+
Status: ASSIGNED → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Needs rebasing for v1.4 uplift.
Flags: needinfo?(jrburke)
Whiteboard: [c=handeye p= s= u=1.4] [label:email] [UX-?][p=13]ETA: 4/25 → [c=handeye p= s= u=1.4] [label:email] [UX-?][p=13]
(In reply to James Burke [:jrburke] from comment #85)
> Comment on attachment 8412439 [details] [review]
> Gaia pull request
> 
> Changes look good to me. Since I changed the dom structure since when I
> started the pull request, I had to adjust some of the marionette email test
> helpers to match. The good part was that they went back to be closer to
> existing master tests. Squashed/rebased and it passes travis.
> 
> For QA:
> 
> The background design decisions are outlined in comment 52, but the summary
> is: scrolling should no longer stop abruptly, the scroll bar jumps around a
> bit, then the user can continue to scroll.
> 
> Instead, now scrolling should allow full inertial scrolling movement for the
> number of messages the email app has stored for that folder. Sometimes you
> will see the data placeholder that Juwei describes in comment 68 when the
> scroll gets to a position where the email UI does not have the data from the
> email backend worker, but it should eventually be replaced with real data.
> 
> You may see occasional checkerboarding for parts of the screen while
> scrolling, particularly if the email backend is doing work, but separate
> bugs should be filed for any cases that are seen as not up to standards and
> have reasonable steps to reproduce. And bugs like bug 986699 should likely
> be retested with this latest code.
> 
> Overall, scrolling feels much better than before.
> 
> Merged to master:
> https://github.com/mozilla-b2g/gaia/commit/
> 4788795b80c317a97391445512b3e0a72b7d02f4
> 
> from pull request:
> https://github.com/mozilla-b2g/gaia/pull/18671

Edward, William, do you mind retesting the app?   See the notes above for focus areas.   This has landed on nightly, so you can just grab a fresh master build and test it.   Report back here what you find.

Thanks, Tony
Flags: needinfo?(whsu)
Flags: needinfo?(edchen)
Fixed on v1.4 branch:
https://github.com/mozilla-b2g/gaia/commit/81e97c3ca58be0487292011bc59efa4cebab30be

from pull request:
https://github.com/mozilla-b2g/gaia/pull/18808

That pull request was a roll up commit of the following bugs, since bug 796474, a 1.4+ bug, depended on changes in the other bugs, and the other bugs also block another 1.4 bug, bug 989116.

Commits that were part of the roll up:

bug 838843: fc4a74a8400838e5fd18da6b7d8851a5a4380019
bug 882917: 0a8ccfdb26a33789bec754d769b0786570ceb28c
bug 796474: 67868019817334815ebb881ef8cd1b478989aa01
Flags: needinfo?(jrburke)
Thanks James and Tony! :)
It doesn't have any side effect. Good Job!

* Test Scenario:
 - Precondition: Set up IMAP and POP3 account on the email app
 - Scenario:
  i. Scroll the inbox quickly         (Pass)
 ii. Scroll the inbox slowly          (Pass)
iii. Move files then scroll the page  (Pass)
 iv. Kill the email app while scrolling the page, then check the inbox again (Pass)
  v. Check the implement of visual design (As attachment - 2014-04-30-16-06-25.png)
 vi. Measure the FPS (Pass)
    - Before: 33
    - After:  33

* Test Device:
  - Hamachi

* Build Information:
  - Gaia      cadddcac2b8ce162a5e27e6dc105557b00a94478
  - Gecko     https://hg.mozilla.org/mozilla-central/rev/b681a6daea3b
  - BuildID   20140428160200
  - Version   32.0a1
--------------------------------------------------------------------------------

Hello~Edward,
Please review my test results/cases then do extra tests if you think I missed something.
Also, please mark as "VERIFIED" if you feel the patch is qualified.
Thanks!
Flags: needinfo?(whsu)
(In reply to William Hsu [:whsu] from comment #89)
> Thanks James and Tony! :)
> It doesn't have any side effect. Good Job!
> 
> * Test Scenario:
>  - Precondition: Set up IMAP and POP3 account on the email app
>  - Scenario:
>   i. Scroll the inbox quickly         (Pass)
>  ii. Scroll the inbox slowly          (Pass)
> iii. Move files then scroll the page  (Pass)
>  iv. Kill the email app while scrolling the page, then check the inbox again
> (Pass)
>   v. Check the implement of visual design (As attachment -
> 2014-04-30-16-06-25.png)
>  vi. Measure the FPS (Pass)
>     - Before: 33
>     - After:  33
> 
> * Test Device:
>   - Hamachi
> 
> * Build Information:
>   - Gaia      cadddcac2b8ce162a5e27e6dc105557b00a94478
>   - Gecko     https://hg.mozilla.org/mozilla-central/rev/b681a6daea3b
>   - BuildID   20140428160200
>   - Version   32.0a1
> -----------------------------------------------------------------------------
> ---
> 
> Hello~Edward,
> Please review my test results/cases then do extra tests if you think I
> missed something.
> Also, please mark as "VERIFIED" if you feel the patch is qualified.
> Thanks!

Thanks guys.  i couple of other things to try out to make sure we didnt regress:
* email search combinations
* save as draft, retrieve draft 
* add / view / delete attachments
* email notifications still work
* save emails to folder, move back to inbox
* scroll a really long email in the body  (ie. bug 996991 didnt regress)
Currently we are not observed side-effect in this build, so I marked it to "VERIFIED".
Status: RESOLVED → VERIFIED
Flags: needinfo?(edchen)
Whiteboard: [c=handeye p= s= u=1.4] [label:email] [UX-?][p=13] → [c=handeye p= s=2014.05.09.t u=1.4] [label:email] [UX-?][p=13]
Depends on: 1009183
Duplicate of this bug: 986699
Whiteboard: [c=handeye p= s=2014.05.09.t u=1.4] [label:email] [UX-?][p=13] → [c=handeye p= s=2014.05.22.t u=1.4] [label:email] [UX-?][p=13]
Whiteboard: [c=handeye p= s=2014.05.22.t u=1.4] [label:email] [UX-?][p=13] → [c=handeye p= s=2014.05.09.t u=1.4] [label:email] [UX-?][p=13]
No longer blocks: 1014278
Depends on: 1014278
No longer blocks: 986699
You need to log in before you can comment on or make changes to this bug.