Closed Bug 893838 Opened 11 years ago Closed 7 years ago

[Messages] Deleting threads and messages takes forever

Categories

(Firefox OS Graveyard :: Gaia::SMS, defect, P3)

ARM
Gonk (Firefox OS)
defect

Tracking

(Not tracked)

RESOLVED WONTFIX

People

(Reporter: sankha, Assigned: rwaldron)

References

Details

(Keywords: perf, Whiteboard: [c=progress p= s= u=])

Attachments

(1 file)

Device: Keon
Software: Boot2Gecko 1.0.1.0-prerelease

When deleting multiple message threads, the gray overlay with the deleting message and spinner stays on screen forever. We have to long-press the home button and then kill the Messages app to use it again.

The messages are actually deleted, but the spinner stays on screen.
Assignee: nobody → waldron.rick
Assignee: waldron.rick → nobody
Josh, why did you remove me as owner?
Assignee: nobody → waldron.rick
Just a Bugzilla quirk; it wasn't intentional.
(In reply to Josh Matthews [:jdm] from comment #2)
> Just a Bugzilla quirk; it wasn't intentional.

Ok, cool. I wasn't sure if there was something I was missing :) Thanks!
Summary: Deleting threads takes forever → [SMS][MMS] Deleting threads takes forever
blocking-b2g: --- → leo?
Summary: [SMS][MMS] Deleting threads takes forever → [SMS][MMS] Deleting threads and messages takes forever
Hi Rick,

This was reported on v1.0.1 but you're nom'ing it for leo?

In that case, would the almost ready patch @ https://bugzilla.mozilla.org/show_bug.cgi?id=890174 cover this bug also?
Flags: needinfo?(waldron.rick)
(In reply to Wayne Chang [:wchang] from comment #5)
> Hi Rick,
> 
> This was reported on v1.0.1 but you're nom'ing it for leo?

The problem exists on all branches.

> 
> In that case, would the almost ready patch @
> https://bugzilla.mozilla.org/show_bug.cgi?id=890174 cover this bug also?

It covers only part of this bug, but it's fine to go ahead with bug 890174 to make the deadline.
Flags: needinfo?(waldron.rick)
(In reply to Rick Waldron from comment #6)
> (In reply to Wayne Chang [:wchang] from comment #5)
> > Hi Rick,
> > 
> > This was reported on v1.0.1 but you're nom'ing it for leo?
> 
> The problem exists on all branches
> 
> > 
> > In that case, would the almost ready patch @
> > https://bugzilla.mozilla.org/show_bug.cgi?id=890174 cover this bug also?
> 
> It covers only part of this bug, but it's fine to go ahead with bug
> 890174 to make the deadline.

See https://bugzilla.mozilla.org/show_bug.cgi?id=890174#c62
Leo+'ing this bug.

Hi Youngjun, 
as discussed, we would prefer to take this patch, which covers more performance issues, instead of bug 890174.
blocking-b2g: leo? → leo+
Flags: needinfo?(jjoons79)
Target Milestone: --- → 1.1 QE5
Just to check in: I've confirmed instant "Delete All" on the light and medium reference workloads, which are 200 and 500 messages respectively (same for heavy and x-heavy, but see below).

Instantly deletes all threads:

- if NOT all threads have loaded. 
- if all threads have loaded.

(leaves no threads behind)

The real problem occurs when the user completely closes the app and then re-opens it: if you close the app shortly after attempting to delete a large number of threads, the deletion process is terminated. This is completely insane, but there is no way around it, because the underlying message service is pathologically broken: it provides no mechanism to handle storage-service-level delete requests. No amount of in-app optimization will mitigate this problem (see Borja's points about "deleteThreads")

The platform/service (i.e. the implementation backing navigator.mozMobileMessage) should:

1. mark threads and all of their corresponding messages (at that moment) as "deleted"
2. not send threads or messages that are marked "deleted" to the application.
3. remove the "deleted" mark if a new message arrives for a thread currently marked "deleted", and send it to the application. The previously "deleted" messages stay marked as "deleted"; the new message should not be marked "deleted".

Now the platform/service (again, the implementation backing navigator.mozMobileMessage) can wait until a more opportune moment to execute the delete tasks.
Note that if a user doesn't actually close the app all the way, just minimizes it, the deleted threads do not return and the perceived "instant delete" is upheld.
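To make those three rules concrete, here is a minimal sketch of the proposed platform-side behavior. Every name below is invented for illustration; this is not the actual Gecko implementation, only the semantics described above:

// Hypothetical sketch; all names are made up for illustration.
var deletedThreads = new Set();

function requestDeleteThreads(threadIds) {
  // Rule 1: mark instantly; the expensive DB work can happen later.
  threadIds.forEach(function (id) { deletedThreads.add(id); });
  scheduleBackgroundPurge(threadIds);
}

function threadsVisibleToApp(allThreads) {
  // Rule 2: never hand marked threads back to the application.
  return allThreads.filter(function (t) { return !deletedThreads.has(t.id); });
}

function onIncomingMessage(message) {
  // Rule 3: a new message unmarks the thread, but the old messages stay
  // marked "deleted"; only the new message is delivered.
  deletedThreads.delete(message.threadId);
  deliverToApp(message);
}

function scheduleBackgroundPurge(ids) { /* purge rows at an opportune moment */ }
function deliverToApp(message) { /* notify the Messages app */ }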
As we agreed with Gabriele and other peers in bug https://bugzilla.mozilla.org/show_bug.cgi?id=890174, we *cannot* delete all threads directly from the DOM. Since retrieving all messages for a given thread can return an error from the cursor, you need to ensure that we delete *only* the threads whose getMessages cursor completes without an error. As we have decided to move here for a complete solution, this should be included as well. Details here: https://bugzilla.mozilla.org/show_bug.cgi?id=892447. As you are going to be in charge of this, I'm going to mark the other bug as a DUP of this one.
Comment on attachment 778211 [details] [review]
Github Pull Request https://github.com/mozilla-b2g/gaia/pull/11053

Hi Rick. As we are going to work on a complete solution, please take into account all the suggestions agreed in bug https://bugzilla.mozilla.org/show_bug.cgi?id=890174.

That's why this patch should also include https://bugzilla.mozilla.org/show_bug.cgi?id=893838#c11. Thanks!
Attachment #778211 - Flags: review?(fbsc)
(In reply to Borja Salguero [:borjasalguero] from comment #11)
> As we agreed with Gabriele and other peers in bug
> https://bugzilla.mozilla.org/show_bug.cgi?id=890174, we *cannot* delete all
> threads directly from the DOM. Since retrieving all messages for a given
> thread can return an error from the cursor, you need to ensure that we
> delete *only* the threads whose getMessages cursor completes without an
> error. As we have decided to move here for a complete solution, this should
> be included as well. Details here:
> https://bugzilla.mozilla.org/show_bug.cgi?id=892447. As you are going to be
> in charge of this, I'm going to mark the other bug as a DUP of this one.

I've read this bug and I don't see any discussion explaining the rationale for ignoring the user's intention to "Delete all threads" if the getMessages cursor has an error. 

Under what circumstances would there be an error? This doesn't necessarily matter, but I'm curious whether there are any known cases and, if so, how we can simulate this error for testing.
(In reply to Rick Waldron from comment #14)
> (In reply to Borja Salguero [:borjasalguero] from comment #11)
> > As we agreed with Gabriele and other peers in bug
> > https://bugzilla.mozilla.org/show_bug.cgi?id=890174, we *cannot* delete
> > all threads directly from the DOM. Since retrieving all messages for a
> > given thread can return an error from the cursor, you need to ensure that
> > we delete *only* the threads whose getMessages cursor completes without
> > an error. As we have decided to move here for a complete solution, this
> > should be included as well. Details here:
> > https://bugzilla.mozilla.org/show_bug.cgi?id=892447. As you are going to
> > be in charge of this, I'm going to mark the other bug as a DUP of this one.
> 
> I've read this bug and I don't see any discussion explaining the rationale
> for ignoring the user's intention to "Delete all threads" if the getMessages
> cursor has an error. 

I think the error can happen on the first delete operation (e.g. a corrupted database) and then we want to stop. I don't expect there could be an error for one thread and not for another, but still, it is good practice to handle errors even when they're unexpected ;)

The main concern I have here is that if there is an error, what's displayed to the user differs from what is in the database. The user would therefore think that the messages were first deleted and then reappeared, and would consider this a bug.

Maybe we should display a banner when there is an error? I think we're not explaining anything to the user right now either...
I've finally had some time to test Rick's PR, and the deletion of all messages in the x-heavy workload now happens in less than 5 seconds (at least from the user's perspective), which is an excellent result!

Now that's fast enough to make the switch-to-edit-mode transition look bad (it takes a looong time), but I'd say that's material for a follow-up.
I'm still not convinced we should do stuff in the background without notifying the user.
(In reply to Julien Wajsberg [:julienw] from comment #17)
> I'm still not convinced we should do stuff in the background without
> notifying the user.

I think that's a problem only if this has significant side effects; one thing that comes to mind right now is if the user manually kills the application from the application switcher while the operation is in progress. What happens then? Will only some of the messages be deleted? Similarly it would be interesting to test what happens if we receive or send a new message while the delete operation is ongoing.
The application could be killed by the system too.

But the send/receive cases are interesting too. Note that we could receive an SMS with the current master behaviour too; I don't know what happens ;)
(In reply to Julien Wajsberg [:julienw] from comment #19)
> But the send/receive cases are interesting too. Note that we could receive
> an SMS with the current master behaviour too; I don't know what happens ;)

In the current system, I think that since we're gathering the ids from the selected threads one at a time, it would depend on when the message comes in and whether it belongs to a thread:

- If it doesn't belong to any thread, it won't be affected, so when deletion is over it will be the only message left.

- If it belongs to a thread, it depends on when it arrives. If it arrives before the IDs from the relevant thread are read, it will be deleted together with the others; if it arrives later, it won't.

... or at least, that's the theory; I've never tried it myself :)
My gut feeling here is that we should never delete an unread message, unless it was already there when the user checked the thread (which is earlier than when the user hits the "delete" button). Which might be doable or not.
We should ask Gene, but AFAIK when deleting a thread, if a new message comes in we are going to get a *new* thread id, so take this into account, because the threading is not the same as before deleting...
(In reply to Gabriele Svelto [:gsvelto] from comment #18)
> (In reply to Julien Wajsberg [:julienw] from comment #17)
> > I'm still not convinced we should do stuff in the background without
> > notifying the user.
> 
> I think that's a problem only if this has significant side effects; one
> thing that comes to mind right now is if the user manually kills the
> application from the application switcher while the operation is in
> progress. What happens then? Will only some of the messages be deleted?
> Similarly it would be interesting to test what happens if we receive or send
> a new message while the delete operation is ongoing.

I described a solution here https://bugzilla.mozilla.org/show_bug.cgi?id=893838#c8 but it requires platform changes. There is no way to avoid the outstanding issues without some level of platform support.
(In reply to Julien Wajsberg [:julienw] from comment #21)
> My gut feeling here is that we should never delete an unread message, unless
> it was already there when the user checked the thread (which is earlier than
> when the user hits the "delete" button). Which might be doable or not.

This is why I added the Zombie API: to create messages and threads from Zombies that I could generate during the delete process. Newly received messages (which may create threads) are not included in the delete list.
(In reply to Julien Wajsberg [:julienw] from comment #17)
> I'm still not convinced we should do stuff in the background without
> notifying the user.

The alternative is blocking the user until the operation is complete, which defeats the purpose of using async operations eg. DOMCursor, don't you think? To me, this is the quintessential use case for async programming: user executes command and carries on interacting with application, meanwhile a process with indeterminate operational latency executes until it has completed. Wouldn't you agree?
Corey and I just discussed the possibility of storing a "deleted threads" list in IndexedDB (along with a last-updated timestamp), which would allow us to prevent threads from re-rendering when the app is closed and re-opened. If a previously deleted thread re-appears and hasn't been updated since the delete flag was set, we don't render it and we queue a task to delete it (again).
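For illustration, a minimal sketch of that journal, assuming a dedicated IndexedDB object store (the store name, schema, and helper names are invented; this is not the actual Gaia code):

var request = indexedDB.open('sms-delete-journal', 1);
request.onupgradeneeded = function (e) {
  e.target.result.createObjectStore('deletedThreads', { keyPath: 'threadId' });
};

function markThreadDeleted(db, threadId) {
  db.transaction('deletedThreads', 'readwrite')
    .objectStore('deletedThreads')
    .put({ threadId: threadId, deletedAt: Date.now() });
}

function shouldRenderThread(db, thread, callback) {
  var get = db.transaction('deletedThreads')
              .objectStore('deletedThreads')
              .get(thread.id);
  get.onsuccess = function () {
    var tombstone = get.result;
    // Render only if there is no tombstone, or the thread has been updated
    // since the delete flag was set (i.e. a new message arrived).
    var render = !tombstone || thread.timestamp > tombstone.deletedAt;
    if (!render) {
      queueDeleteTask(thread.id); // delete it (again) in the background
    }
    callback(render);
  };
}

function queueDeleteTask(threadId) { /* re-issue the platform delete */ }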
(In reply to Rick Waldron from comment #25)
> (In reply to Julien Wajsberg [:julienw] from comment #17)
> > I'm still not convinced we should do stuff in the background without
> > notifying the user.
> 
> The alternative is blocking the user until the operation is complete, which
> defeats the purpose of using async operations eg. DOMCursor, don't you
> think? To me, this is the quintessential use case for async programming:
> user executes command and carries on interacting with application, meanwhile
> a process with indeterminate operational latency executes until it has
> completed. Wouldn't you agree?

I thought I wrote this somewhere but can't find it anymore: I think that whenever something happens in the background we should have a notification somewhere.

My proposition was:
- keep the blocking panel in 1.1.
- for 1.2 :
  + think of a generic way to show a background task is happening. (UX work needed)
  + think of a generic way to handle an interrupted background task (a "journal" like in filesystems, which you are suggesting in comment 26)

The feature will _not_ be complete without all this.

Now, to me, bug 890174 comment 66 implies that we're good with the current performance for 1.1 and therefore we don't need to do more leo+ work on this topic. Wayne, is that right? Should we remove leo+ from this bug so that we can quietly work on a more durable solution?
Flags: needinfo?(wchang)
Corey and I discussed this yesterday and after some clarification, I agree. More inline...

(In reply to Julien Wajsberg [:julienw] from comment #27)

> 
> My proposition was:
> - keep the blocking panel in 1.1.

Unfortunate, but I support this

> - for 1.2 :
>   + think of a generic way to show a background task is happening. (UX work
> needed)

Corey suggested reusing the banner that shows other operational indicators (eg. Resize notice, convert notice) 

>   + think of a generic way to handle an interrupted background task (a
> "journal" like in filesystems, which you are suggesting in comment 26)

Sounds good.

> 
> The feature will _not_ be complete without all this.

+1

> 
> Now, to me, bug 890174 comment 66 implies that we're good with the current
> performance for 1.1 and therefore we don't need to do more leo+ work on this
> topic. Wayne, is that right? Should we remove leo+ from this bug so that we
> can quietly work on a more durable solution?

As long as we're ok with leaving the system in the state it's in, I will support this decision.
(In reply to Rick Waldron from comment #28)
> >   + think of a generic way to show a background task is happening. (UX work
> > needed)
> 
> Corey suggested reusing the banner that shows other operational indicators
> (eg. Resize notice, convert notice) 

Yep that's also what I had in mind, although it's maybe too intrusive if it's displayed for a long time. UX will decide :)
Just confirmed this with the partner. Bumping to koi?

> 
> Now, to me, bug 890174 comment 66 implies that we're good with the current
> performance for 1.1 and therefore we don't need to do more leo+ work on this
> topic. Wayne, is that right? Should we remove leo+ from this bug so that we
> can quietly work on a more durable solution?
blocking-b2g: leo+ → koi?
Flags: needinfo?(wchang)
(In reply to Rick Waldron from comment #28)
> 
> >   + think of a generic way to handle an interrupted background task (a
> > "journal" like in filesystems, which you are suggesting in comment 26)
> 
> Sounds good.
> 

Discussing this with Gabriele, we think it's probably not a good idea to have this in Gaia. Rather, we could ask the platform to perform an action, and this action would be done asynchronously rather than synchronously as it is now. All the journaling stuff would be hidden inside the platform.

This is actually already how message sending is done; this could be extended to other actions, notably deletions.

Gene, does that sound reasonable?
Flags: needinfo?(gene.lian)
Status: NEW → ASSIGNED
Keywords: perf
OS: Linux → Gonk (Firefox OS)
Hardware: x86 → ARM
Whiteboard: [c= ]
(In reply to Julien Wajsberg [:julienw] from comment #32) 
> Discussing this with Gabriele, we think it's probably not a good idea to
> have this in Gaia. Rather, we could ask the platform to perform an action,
> and this action would be done asynchronously rather than synchronously as
> it is now. All the journaling stuff would be hidden inside the platform.

The thread is too long to follow all the details, but comment #26 seems to be the final idea we want to achieve. I think if we really want to implement a journaling mechanism, it should indeed be constructed where the database lives. That is, put it in Gecko, not in Gaia.

> This is actually already how message sending is done; this could be
> extended to other actions, notably deletions.

So, let me define this problem again to make sure we're on the same page. The problem definition is: deleting threads can take a long time, and the user (or the OOM killer) may kill the app while the threads are still being deleted. Later, when the app is reopened, getThreads() should return the threads that are still valid instead of the ones that have been deleted.

For example, suppose we have threads #1, #2, #3 and #4 and we want to delete #1, #2 and #3, but deleting thread #2 takes hours to finish. If we kill the app and reopen it at that moment, the ideal thread list should contain only thread #4, even if deletion of thread #3 hadn't started before the app was relaunched.

However, from the point of view of Gecko, the DB operations won't get interrupted when the content process is killed, because the DB is maintained in the parent process. Therefore, we don't really need to *restart* the pending delete tasks, since all the delete tasks keep processing anyway.

What we really need to worry about is that getThreads() shouldn't return a thread that has been asked to be deleted (thread #3 in the above example), and that's what the journaling mechanism is for. That is, when Gaia wants to delete certain threads, the platform should *quickly* mark those threads as "to-be-deleted" before the DB really starts to delete them. From then on, getThreads() should never return any threads marked "to-be-deleted".

Is that accurate? Please correct me if I'm wrong. Thanks!
Flags: needinfo?(gene.lian)
Gene,

I think we really want some improvements here in the device API in general - The suggestions I recall are as follows:

1) Make the filter object able to accept an array of thread ids, i.e.:

var filter = new MozSmsFilter();
filter.threadId = [2, 4, 5]; // give me all messages in threads 2, 4 and 5

Currently, we have to make 3 calls to getMessages() to get three threads' worth of message ids. If we could reduce the number of calls to getMessages(), we would gain some improvement.


2) Make deleteMessages able to take a filter instead of a list of message ids, i.e.:

navigator.mozMobileMessage.delete(filter); // delete all messages in threads 2, 4 and 5

Currently, we have to cursor through every message we want to delete and collect an array of message ids to pass to delete(). If delete() could just take the filter, I'm sure the query could be optimized even further inside Gecko. This would be a big win even if filters didn't take an array: 3 * delete() instead of 3 * (getMessages() + cursors) + delete().

3) Make the "delete" operation set a 'deleted' flag on the message which will then hide that message from any of the further calls to mozMobileMessage.  The system can then delete at will anything with the deleted flag.  This seems like it would be the fastest path that would stop "phantom threads" from re-appearing after we request deletion.

Having all 3 of these would make delete a close-to-instant operation from the Gaia side.
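For context, a sketch of the delete flow those suggestions would replace, as I understand the API from this thread (the commented-out last line is the *proposed* filter-based delete from suggestion 2, which does not exist yet):

function deleteThread(threadId, done) {
  var filter = new MozSmsFilter();
  filter.threadId = threadId;       // today: one thread per getMessages() call
  var ids = [];
  var cursor = navigator.mozMobileMessage.getMessages(filter, false);
  cursor.onsuccess = function () {
    if (cursor.result) {
      ids.push(cursor.result.id);   // cursor through every message id...
      cursor.continue();
    } else {
      // ...then hand the collected array to delete()
      navigator.mozMobileMessage.delete(ids).onsuccess = done;
    }
  };
}

// Proposed instead (hypothetical API, suggestion 2 above):
// navigator.mozMobileMessage.delete(filter);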
If I'm not wrong, I think the long-term plan is to move all the threads mechanism to Gaia, and not keep it in Gecko (bug 898364).

To me, the "journaling" system would kick in when we ask to delete eg 1000 message ids.
(In reply to Julien Wajsberg [:julienw] from comment #35)
> If I'm not wrong, I think the long-term plan is to move all the threads
> mechanism to Gaia, and not keep it in Gecko (bug 898364).

Agreed; in fact I'd be willing to work on that if there's enough interest. This would allow us to remove complexity from Gecko and move it into the app, while at the same time giving us more flexibility.

> To me, the "journaling" system would kick in when we ask to delete eg 1000
> message ids.

Yes, I would say that any operation affecting the data should be atomic from the app's perspective. This would also mean that all those operations should be async by default and should notify errors appropriately.
(In reply to Corey Frang [:gnarf] [:gnarf37] from comment #34)
> 
> 1) Make the filter object able to accept an array of thread ids, i.e.:
> 
> 2) Make deleteMessages able to take a filter instead of a list of message
> ids.

Bug 895734 was already filed to do exactly what the above 2 requirements want. Adding the dependency.
Depends on: 895734
(In reply to Julien Wajsberg [:julienw] from comment #35)
> If I'm not wrong, I think the long-term plan is to move all the threads
> mechanism to Gaia, and not keep it in Gecko (bug 898364).

As Jonas mentioned in bug 898364 comment #6, we still need to wait for the DataStore API to be finalized, and that will take a long time. I think we can keep moving on without considering the long-term plan.

> To me, the "journaling" system would kick in when we ask to delete eg 1000
> message ids.

Filed bug 901410 to make the database *journaling*.
Flag cleared for koi?. Just land it when it's ready. Thanks!
blocking-b2g: koi? → ---
Summary: [SMS][MMS] Deleting threads and messages takes forever → [Messages] Deleting threads and messages takes forever
I'll have a look at it again, do some profiling again, and see if we can move forward here.
Flags: needinfo?(felash)
Blocks: 905075
No longer blocks: 905075
Hey guys, how about the extreme situation where messages are being deleted and the device turns off?
Attachment mime type: text/plain → text/x-github-pull-request
That's one of the reasons why I don't like the current patch and request doing a new profile.
Flags: needinfo?(felash)
Gabriele, would you find time to do a profile for this again? Deleting only 1 thread, and 1 message in a thread?

I'd like to see whether we can optimize the code more, or if we "need" to do a UI workaround.
Flags: needinfo?(gsvelto)
Attachment #778211 - Flags: review?(gnarf37)
Here's a profile for the "delete thread" case using the medium reference workload:

http://people.mozilla.org/~bgirard/cleopatra/#report=40dba8662ae51c8cabaabf751a0fc43a30fa286a

The relevant sample range is [52197, 53988], which accounts for the delete operation alone (~1.8s). Interpreting the profile is non-trivial because it shows the overlapping effect of the deletion itself and the spinner being animated at the same time. The time spent actually retrieving the messages to be deleted is 350ms (you can notice a large IndexedDB call). If you look carefully you'll notice that this call is intermixed with paint events, each one taking ~50ms: that's the spinner. Once the call is finished, ThreadListUI::thlui_delete() removes the thread node from the view and triggers a full re-layout (CSS style flush, followed by a reflow and then a repaint).

This is all OK-ish, but it degenerates when the user has a lot of threads, in which case the final re-layout can take a long time. As previously discussed, the only way to solve this is to display only a subset of the threads at a time and add more as the user scrolls.

I'll post the "delete message" case - which is more interesting - soon.
Flags: needinfo?(gsvelto)
Here's the second part of my testing. Using the same setup as before, I entered a small mixed SMS/MMS thread (8 messages in total) and deleted one message from it. The profile is here:

http://people.mozilla.org/~bgirard/cleopatra/#report=f1218b9604f713640ae48d6fbbd4e168864d62ac

The relevant interval is [709070, 717598]; as you can see, this lasts almost 8s (!). The early stages of the profile are fairly similar to the "delete thread" case, with a small delay caused by waiting for the DB to return all the threads after the deletion (~600ms), but right after that things turn really ugly.

Starting roughly from sample 710500 you can see a long chain of CSS style flushes, each followed by a reflow, which keeps going until the process quiets down again. This is caused by the MessageManager.getThreads() call we invoke right after deleting the message; here's the culprit:

https://github.com/mozilla-b2g/gaia/blob/48599c4db185a973cbcb3b0d1606d663480f8c46/apps/sms/js/thread_ui.js#L1428

MessageManager.getThreads() will start reading back all messages in the database and will re-populate the screen. This would be pretty bad in and of itself, since we're re-reading all the threads, re-laying them out and then discarding them again when going back to the current thread view. However, it's made even worse by the fact that ThreadListUI::renderThreads() appends one thread at a time, scheduling the next one using setTimeout(). I assume this was done to keep the view responsive to user commands while appending large numbers of threads, but in this case it works against us, allowing CSS/layout flushes to happen in between each addition even though the view is not visible, and enormously slowing down the trivial operation of getting rid of a single message.
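To illustrate the pattern described above, a simplified sketch (renderThreadNode() is a made-up helper, not the real renderThreads() code):

// Appending one node per setTimeout() lets a CSS style flush and a reflow
// happen between every single append:
function appendOneAtATime(threads, container) {
  if (!threads.length) { return; }
  container.appendChild(renderThreadNode(threads.shift()));
  setTimeout(function () { appendOneAtATime(threads, container); }, 0);
}

// Batching the nodes into a DocumentFragment costs a single layout flush,
// one possible mitigation when the list isn't even visible:
function appendBatched(threads, container) {
  var fragment = document.createDocumentFragment();
  threads.forEach(function (t) { fragment.appendChild(renderThreadNode(t)); });
  container.appendChild(fragment);
}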
so:

* the behavior with setTimeout is the same as what we have with cursors in bug 903763
* bug 854417 is about not clearing and retrieving the whole list each time we need to change one thing there
* bug 859730 is about not rendering all the threads

All these are already quite bad, but what strikes me in what Gabriele is saying is that _we're waiting until all the threads are rendered before removing the overlay_!!

The problem is at https://github.com/mozilla-b2g/gaia/blob/master/apps/sms/js/thread_ui.js#L1427-L1443

I think Rick's patch was tackling this problem but was a little too aggressive (i.e. it removed the overlay too early for my taste), and I think we can find a middle ground here.
The setTimeout issue could be dealt with in bug 865743 (thanks Steve for reminding me of this bug).
Flags: needinfo?(felash)

(In reply to Julien Wajsberg [:julienw] from comment #47)
> so:
> 
> * the behavior with setTimeout is the same as what we have with cursors in
> bug 903763
> * bug 854417 is about not clearing and retrieving the whole list each time
> we need to change one thing there
> * bug 859730 is about not rendering all the threads
> 
> All these are already quite bad, but what strikes me in what Gabriele is
> saying is that _we're waiting until all the threads are rendered before
> removing the overlay_!!
> 
> The problem is at
> https://github.com/mozilla-b2g/gaia/blob/master/apps/sms/js/thread_ui.
> js#L1427-L1443
> 
> I think Rick's patch was tackling this problem but was a little too
> aggressive (i.e. it removed the overlay too early for my taste), and I
> think we can find a middle ground here.

Yes, I'm definitely willing to re-work the system for a better experience.
Flags: needinfo?(jjoons79)
Priority: -- → P3
Whiteboard: [c= ] → [c=progress p= s= u=]
Target Milestone: 1.1 QE5 → ---
No longer depends on: messaging-perf
Gabriele, Julien, Vivien, and I all talked about this over IRC for a bit. Here are some more up-to-date profiles for deleting a single message, with a couple of performance fixes I made (see bug 994865 and bug 994818):

b2g process: http://people.mozilla.org/~bgirard/cleopatra/#report=2180a3bb78f5f985bc95c5ac13176635690206c6
SMS process: http://people.mozilla.org/~bgirard/cleopatra/#report=bd08b690e8e0992545b94026b1603bf4c07c4c37

Gabriele and I agree that there's not much we can do here in terms of reflows and repainting. The reflows happening right after we begin deleting the message are caused by the removal of the deleted messages from the DOM. The painting here comes from a combination of the animated spinner and the repainting after the aforementioned reflow.

Once the message deletion is done, we then have to reflow and repaint a bunch because we're switching from edit->view mode for this thread. Once again, I'm not sure there's much we can do here. After we talked a bit, a few possibilities we came up with are:

1) We can clean up the CSS, improve our selectors, etc, since the time taken during reflows is dominated by CSS calculations.
2) We can move the switch from edit->view modes so that it happens while we have the "in progress" modal up. This would actually worsen perceptible performance, but it would make it more responsive after the messages are deleted, and allow us to take advantage of the idle state between IPC calls.
3) We could implement comment 26. I really like this idea.

I'm going to try all of these in the order that I listed.
Depends on: 994818, 994865
Re 3): I still think that we need to show something to the user if we start doing things the background-task way, so UX needs to be involved first. I agree this is a great idea but it needs refinement.
(In reply to Julien Wajsberg [:julienw] from comment #51)
> Re 3): I still think that we need to show something to the user if we start
> doing things the background-task way, so UX needs to be involved first. I
> agree this is a great idea but it needs refinement.

Why? If killing the app doesn't cancel it, then why would the user need to know that it's in progress? The only reason I can think of is that they might want to know why performance has degraded, but we spend a lot of time in IPC and idle during deletions, so it shouldn't actually be that bad.
All I'm saying is that we need UX involved; maybe UX will say the same as you :)

Now, I don't exactly see why it would take that much longer to delete from the SMS db than to add something to a local db.
(In reply to Julien Wajsberg [:julienw] from comment #53)
> Now, I don't exactly see why it would take that much longer to delete from
> the SMS db than to add something to a local db.

The SMS db in Gecko has some seriously heavyweight code, last time I checked, and IIRC an in-app IndexedDB database wouldn't incur the IPC overhead either, so we should do a fair bit better.
(In reply to Julien Wajsberg [:julienw] from comment #53)
> All I'm saying is that we need UX involved; maybe UX will say the same as you :)
> 
> Now, I don't exactly see why it would take that much longer to delete from
> the SMS db than to add something to a local db.

Who should we be flagging for this?
Depends on: 996776
(In reply to Doug Sherk (:drs) from comment #50)
> 1) We can clean up the CSS, improve our selectors, etc, since the time taken
> during reflows is dominated by CSS calculations.

Filed bug 996776 for this.
(In reply to Doug Sherk (:drs) from comment #56)
> (In reply to Doug Sherk (:drs) from comment #50)
> > 1) We can clean up the CSS, improve our selectors, etc, since the time taken
> > during reflows is dominated by CSS calculations.

Note that this also proves we have too much DOM ;)

This is my next work after bug 881469: start using a local db to be able to have less DOM and generate it on demand.
Depends on: 997391
Bug 996776 was a bust (see bug 996776 comment 6), so I'm moving on to this:

(In reply to Doug Sherk (:drs) from comment #50)
> 2) We can move the switch from edit->view modes so that it happens while we
> have the "in progress" modal up. This would actually worsen perceptible
> performance, but it would make it more responsive after the messages are
> deleted, and allow us to take advantage of the idle state between IPC calls.

Filed bug 997391.

My intuition tells me that this also won't give us very good gains, if any at all, so I expect that we're only going to get serious benefits when we get to 3).
(In reply to Doug Sherk (:drs) from comment #50)
> 3) We could implement comment 26. I really like this idea.

Filed bug 997420 for this.
Depends on: 997420
Doug, we need to separate the 2 cases:
1. deleting threads from the inbox
2. deleting messages from the thread view

For 1), currently, deleting a thread is costly because we need to retrieve all messages in that thread. So maybe a "deleteThread" gecko API would help a lot here?

Do we have a profile for 2)?
Flags: needinfo?(drs+bugzilla)
We noticed a while ago that the spinner was taking a lot of time painting, so I tried removing it. Here are the results:

with: 1112 ms painting
without: 934 ms painting

I don't think this is worth removing it over. I tried adding |will-change: animation| to it to layerize it, but this had no noticeable impact, and it might already have been getting layerized anyway, based on what I saw with the paint flashing tool.
(In reply to Julien Wajsberg [:julienw] from comment #60)
> Doug, we need to separate the 2 cases:
> 1. deleting threads from the inbox
> 2. deleting messages from the thread view
> 
> For 1), currently, deleting a thread is costly because we need to retrieve
> all messages in that thread. So maybe a "deleteThread" gecko API would help
> a lot here?
> 
> Do we have a profile for 2)?

The profiles I've posted are of 2). I agree that a "deleteThread" API may help here.
Flags: needinfo?(drs+bugzilla)
I'm pretty thoroughly confused now because the b2g process is effectively idle when we're deleting a message. It's doing a small amount of work for about 200ms, so there must be a ton of overhead or latency here. Fabrice says that Ben T has diagnosed similar problems with indexedDB code. I haven't read nor do I understand the internal workings of the SMS database, but indexedDB is showing up in the profiles. More to come.
OK, today I learned that the SMS DB is actually a specialized indexedDB database exposed through a Web API. I talked with Ben T, who helped me analyze the B2G main-process profile. He said that the indexedDB transaction, from the point where the main process receives the request to delete a message to the point where the message is deleted and a response is sent back to the SMS app process, takes about 500ms. This is not enough to account for the slowness that we're seeing. He suggested going back to the SMS app profile. :(
I noticed that switching from edit->view mode once message deletion is finished is a pretty significant hit to our performance, on the order of 4 s -> 0.75 s. Filed bug 999095.
Depends on: 999095
Attachment #778211 - Flags: review?(borja.bugzilla)
Depends on: 1053952
Mass closing of Gaia::SMS bugs. End of an era :(
Status: ASSIGNED → RESOLVED
Closed: 7 years ago
Resolution: --- → WONTFIX