1127550 - [MobileMessaging] Very slow sending while retrieving messages in a thread

Reporter

Description

•

9 years ago

[Blocking Requested - why for this release]:

STR:
1. have a thread with a lot of messages. The thread must take ~30sec to load completely. Most of the thread loading is invisible, but this is visible when looking at the CPU usage for example.
2. open this thread.
3. send a message right away.

=> Observe that the message is not added in the thread before some seconds.

I've added some logs and I see that we don't get the "sending" event until we retrieved all messages.

I see this on a v2.1 build on my dogfooding device. This is IMO a regression from 2.0 because I never noticed this issue before upgrading my build to v2.1. That's why I'm asking a blocker status.

QAWanted to test on newer builds. If you can't make it work I can try myself with my own data.

NI Bevis to look closer at this issue.

Flags: needinfo?(btseng)

Brogan Zumwalt [Inactive]

Updated

•

9 years ago

QA Contact: bzumwalt

Brogan Zumwalt [Inactive]

Comment 1

•

9 years ago

Unable to reproduce issue on Flame 2.1, 2.2, or 3.0

Sent message in thread containing over 100 messages (including three with over 1000 words of text) while thread was loading after rebooting device. Message sent in less than 4 seconds with "Sending..." message visible from the time the send button is pressed.

Device: Flame 2.1
Build ID: 20150129003733
Gaia: 9df43e6e696da43b6a1433283dc5b155987747bc
Gecko: 095a46f32cd4
Gonk: e7c90613521145db090dd24147afd5ceb5703190
Version: 34.0 (2.1)
Firmware Version: v18D-1
User Agent: Mozilla/5.0 (Mobile; rv:34.0) Gecko/34.0 Firefox/34.0

Device: Flame 2.2
Build ID: 20150129003432
Gaia: 6e494f1d2676d231abba7dcc2e2822d1170d2d02
Gecko: 5e6fac01a72f
Gonk: e7c90613521145db090dd24147afd5ceb5703190
Version: 37.0a2 (2.2)
Firmware Version: v18D-1
User Agent: Mozilla/5.0 (Mobile; rv:37.0) Gecko/37.0 Firefox/37.0

Device: Flame 3.0 Master
Build ID: 20150129010239
Gaia: 9d2378a9ef092ab1fc15c3a9f7fc4171aab59d57
Gecko: 6bfc0e1c4b29
Gonk: e7c90613521145db090dd24147afd5ceb5703190
Version: 38.0a1 (3.0)
Firmware Version: v18D-1
User Agent: Mozilla/5.0 (Mobile; rv:38.0) Gecko/38.0 Firefox/38.0

QA Whiteboard: [QAnalyst-Triage?]

Flags: needinfo?(ktucker)

Keywords: qawanted

KTucker [:KTucker][Inactive 3/4/2016]

Updated

•

9 years ago

QA Whiteboard: [QAnalyst-Triage?] → [QAnalyst-Triage+]

Flags: needinfo?(ktucker)

Bevis Tseng[:bevis][:btseng](Exited)

Assignee

Comment 2

•

9 years ago

(In reply to Julien Wajsberg [:julienw] from comment #0)
> [Blocking Requested - why for this release]:
> 
> STR:
> 1. have a thread with a lot of messages. The thread must take ~30sec to load
> completely. Most of the thread loading is invisible, but this is visible
> when looking at the CPU usage for example.
Does that means DOMCursor.continue() was requested continuously unexpectedly even when the message is not visible?
> 2. open this thread.
> 3. send a message right away.
> 
> => Observe that the message is not added in the thread before some seconds.
> 
> I've added some logs and I see that we don't get the "sending" event until
> we retrieved all messages.
If DOMCursor.continue() is requested continuously until all the messages are loaded,
then it is possible that the "sending" event is sent until all messages are retrieved,
because the "sending" event will be triggered after the outing message is saved to the DB.
> 
> I see this on a v2.1 build on my dogfooding device. This is IMO a regression
> from 2.0 because I never noticed this issue before upgrading my build to
> v2.1. That's why I'm asking a blocker status.
> 
> QAWanted to test on newer builds. If you can't make it work I can try myself
> with my own data.
> 
> NI Bevis to look closer at this issue.

Flags: needinfo?(btseng) → needinfo?(felash)

Julien Wajsberg [:julienw]

Reporter

Comment 3

•

9 years ago

Yes we call DOMCursor.continue() continuously (but still in an asynchronous way because this API is asynchronous, right?).

But we do that for a long time, I don't think we changed this between v2.0 and v2.1 :/

Flags: needinfo?(felash) → needinfo?(btseng)

Julien Wajsberg [:julienw]

Reporter

Comment 4

•

9 years ago

(In reply to Brogan Zumwalt [:BroganZ] from comment #1)
> Unable to reproduce issue on Flame 2.1, 2.2, or 3.0

ok I'll test myself.

> 
> Sent message in thread containing over 100 messages (including three with
> over 1000 words of text) while thread was loading after rebooting device.

100 messages is likely not enough. I think I have more thousands of message in this thread. This is not unlikely, when you use a phone there are some people you communicate a lot more with.

> Message sent in less than 4 seconds with "Sending..." message visible from
> the time the send button is pressed.

4 seconds is a lot, it should be less than 1 sec, and even less than 500ms.

Bevis Tseng[:bevis][:btseng](Exited)

Assignee

Comment 5

•

9 years ago

(In reply to Julien Wajsberg [:julienw] from comment #3)
> Yes we call DOMCursor.continue() continuously (but still in an asynchronous
> way because this API is asynchronous, right?).
> 
> But we do that for a long time, I don't think we changed this between v2.0
> and v2.1 :/

Is it possible to have the same messages loaded in v2.0 for further clarification?
This seems not reproducible by QA per comment comment1. :(
It will be helpful to know whether this is a regression or not.

Anyway, I'll try to see if I can reproduce this locally as well.

Flags: needinfo?(btseng)

Julien Wajsberg [:julienw]

Reporter

Comment 6

•

9 years ago

Actually I used the same message set in v2.0 than in v2.1 (my personal data set).

I'll see if I can reproduce easily with one of the performance workload, NI me.

Flags: needinfo?(felash)

Bevis Tseng[:bevis][:btseng](Exited)

Assignee

Comment 7

•

9 years ago

(In reply to Julien Wajsberg [:julienw] from comment #4)
> (In reply to Brogan Zumwalt [:BroganZ] from comment #1)
> > Unable to reproduce issue on Flame 2.1, 2.2, or 3.0
> 
> ok I'll test myself.
> 
> > 
> > Sent message in thread containing over 100 messages (including three with
> > over 1000 words of text) while thread was loading after rebooting device.
> 
> 100 messages is likely not enough. I think I have more thousands of message
> in this thread. This is not unlikely, when you use a phone there are some
> people you communicate a lot more with.
> 
> > Message sent in less than 4 seconds with "Sending..." message visible from
> > the time the send button is pressed.
> 
> 4 seconds is a lot, it should be less than 1 sec, and even less than 500ms.

It could be reasonable to have message "sent" in 4 seconds according to the network/modem condition.
But, for "sending", as you said it should be displayed as soon as possible after the outgoing message is saved.

Bevis Tseng[:bevis][:btseng](Exited)

Assignee

Comment 8

•

9 years ago

Set owner to myself and take this change to deeply review if there is any performance issue in current MobileMessageDB implementation. :)

Assignee: nobody → btseng

Julien Wajsberg [:julienw]

Reporter

Comment 9

•

9 years ago

(In reply to Bevis Tseng[:bevistseng][:btseng] from comment #7)

> > > Message sent in less than 4 seconds with "Sending..." message visible from
> > > the time the send button is pressed.
> > 
> > 4 seconds is a lot, it should be less than 1 sec, and even less than 500ms.
> 
> It could be reasonable to have message "sent" in 4 seconds according to the
> network/modem condition.
> But, for "sending", as you said it should be displayed as soon as possible
> after the outgoing message is saved.

ah yeah right, I misread the sentence. "sending" should be displayed quickly.

Julien Wajsberg [:julienw]

Reporter

Comment 10

•

9 years ago

I can't really reproduce on my flames, I reproduce only on my Open C. But I didn't use the same dataset, so I'll investigate more on monday.

Bevis Tseng[:bevis][:btseng](Exited)

Assignee

Comment 11

•

9 years ago

I am not able to reproduce this in normal use cases with more than 3K messages of a single thread in both v2.0 & m-c.

Hence, it doesn't seems to be a regression, IMO.

However, when enabling MobileMessageDB logs(which cause device performance lower), I found a bottleneck in MobileMessageDB that allows me to reproduce the symptom in comment 0 only if user send a SMS immediately right after entering a message thread.
(The delay I saw was less than 3 seconds after send button was pressed.)

So far, I'd like to summarize what I have found here for further reference/discussion:
1. In current implementation of |createMessageCursor|, there are 2 phases to be done as followed:
a) Collect |ALL| message ids of selected threads.
This requires a dedicated transaction to sms object store and this transaction will be completed
only after all message ids are loaded.
b) After the invocation of |getMessages| with first |continue| triggered internally,
whenever a message id is collected and a DOMCursor.continue() has been requested,
another transaction will be created to retrieve the specified message or more if |readAhead| is
configured.
2. The bottleneck is obvious in step a), if there is a WRITE request to the DB like sending a SMS.
3. To remove the bottleneck in a), there are 2 approaches to collect message ids on demand instead of
retrieving them all:
a) Define a threshold to retrieve new ids if the number of |retrieved ids| is less than this threshold:
- we have to know the number of messages in step a) which seems achievable by |IDBIndex.count|. [1]
- Instead of getting all message ids, whenever the number of collected ids is reduced than this
threshold, we start the retrieve the rest of message until the threshold is met or no more to
retrieved.
b) reuse the |SmsFilter| with start/end timestamp to load the message in the range of requested
timestamps.
- This requires changes in both UX/Gaia/Gecko.
- In gecko, currently, we don't support compound queries, i.e. define index with multiple keys
like |threadId| and |timestamp| which seems achievable by |IDBObjectStore::createIndex| with
multiple keypathes applied. [2]
- In UX/Gaia, we could consider to load the messages in a thread within 1 week, 1 month, etc.

NI Julien to double confirm if this finding is the root cause of the symptom in comment 0:
1. How many messages are stored in the same thread?
2. Is it only reproducible if user send the sms immediately right after selecting a thread?
(IMO, this is not a normal use case, because normal user always takes time to input something before
sending, unless there's already a draft to be sent.)
3. Anything other suspicious points?

If the problem looks different from your side, may I have your help to collect adb logcat with DEBUG = true in MobileMessageDB.jsm for further analysis?
https://dxr.mozilla.org/mozilla-central/source/dom/mobilemessage/gonk/MobileMessageDB.jsm#25

Thanks!

[1] https://developer.mozilla.org/en-US/docs/Web/API/IDBIndex.count
[2] http://www.w3.org/TR/IndexedDB/#widl-IDBObjectStore-createIndex-IDBIndex-DOMString-name-DOMString-sequence-DOMString--keyPath-IDBIndexParameters-optionalParameters

Julien Wajsberg [:julienw]

Reporter

Comment 12

•

9 years ago

(In reply to Bevis Tseng[:bevistseng][:btseng] from comment #11)
> 
> However, when enabling MobileMessageDB logs(which cause device performance
> lower), I found a bottleneck in MobileMessageDB that allows me to reproduce
> the symptom in comment 0 only if user send a SMS immediately right after
> entering a message thread. 

> 2. Is it only reproducible if user send the sms immediately right after
> selecting a thread?
>    (IMO, this is not a normal use case, because normal user always takes
> time to input something before 
>     sending, unless there's already a draft to be sent.)

yes, that's exactly this.

I have a lot of messages in my thread (I'll try to count them later, so I'm keeping the needinfo) and it looks like it takes time to load. Maybe my inbox is not retrieved yet either (I also have a lot of different conversations), and maybe both loads compete with each other :/



It looks like a regression for me because I haven't seen this ever when I was dogfooding my Flame v2.0, but I see this while dogfooding the Open C on v2.1.

Bevis Tseng[:bevis][:btseng](Exited)

Assignee

Comment 13

•

9 years ago

(In reply to Julien Wajsberg [:julienw] from comment #12)
> It looks like a regression for me because I haven't seen this ever when I
> was dogfooding my Flame v2.0, but I see this while dogfooding the Open C on
> v2.1.

This is weird to me because there is no only one minor change from v2.0 to v2.1 in MobileMessageDB and
that is bug 878533 which doesn't cause any performance impact. :-(

Since there is no DB version changed from v2.0 to v2.1,
Is it possible to clarify with the same data in Open C on v2.1 to be loaded on your Flame v2.0?

Julien Wajsberg [:julienw]

Reporter

Comment 14

•

9 years ago

This is what I want to try today, yes :)

Bevis Tseng[:bevis][:btseng](Exited)

Assignee

Comment 15

•

9 years ago

BTW, I am also interested in how long the delay of "sending" displayed is with how many messages you have in the same thread. :)
The reason to ask is that I only saw less than 3 second-delay with 3K messages only if I enable the log in MobileMessageDB.jsm in both v2.0 and m-c.

Thanks!

Julien Wajsberg [:julienw]

Reporter

Comment 16

•

9 years ago

I have 169 threads and 1869 messages in the thread where I see the issue.

And I still don't reproduce on my Flame (v2.1).

I just flashed a new version of v2.1 on my Open C, I'll see if I reproduce in the next days. I did a first try and I don't, but maybe the STR is not what I thought initially. Clearing nom for now.

blocking-b2g: 2.1? → ---

Flags: needinfo?(felash)

Wesley Huang [:wesley_huang] (EPM) (NI me)

Updated

•

9 years ago

blocking-b2g: --- → backlog

Whiteboard: [grooming]

Bevis Tseng[:bevis][:btseng](Exited)

Assignee

Updated

•

9 years ago

Blocks: b2g-sms, b2g-mms

Julien Wajsberg [:julienw]

Reporter

Comment 17

•

9 years ago

Bevis, it seems that I don't see this issue anymore now that I use a more recent v2.1 build.

If I see this again I'll reopen but let's close for now.

Status: NEW → RESOLVED

Closed: 9 years ago

Resolution: --- → WORKSFORME

Bevis Tseng[:bevis][:btseng](Exited)

Assignee

Comment 18

•

9 years ago

(In reply to Julien Wajsberg [:julienw] from comment #17)
> Bevis, it seems that I don't see this issue anymore now that I use a more
> recent v2.1 build.
> 
> If I see this again I'll reopen but let's close for now.

Sure, thanks for your information! :)

Nobody; OK to take it and work on it

Updated

•

9 years ago

blocking-b2g: backlog → ---

tracking-b2g: --- → backlog

Bugzilla

Quick Search

[MobileMessaging] Very slow sending while retrieving messages in a thread

Categories

(Firefox OS Graveyard :: RIL, defect)

Tracking

(tracking-b2g:backlog, b2g-v2.0 unaffected, b2g-v2.1 affected, b2g-v2.2 ?, b2g-master ?)

People

(Reporter: julienw, Assigned: bevis)

References

Details

(Whiteboard: [grooming])

Crash Data

Security

(public)

User Story

Description

Updated

Comment 1

Updated

Comment 2

Comment 3

Comment 4

Comment 5

Comment 6

Comment 7

Comment 8

Comment 9

Comment 10

Comment 11

Comment 12

Comment 13

Comment 14

Comment 15

Comment 16

Updated

Updated

Comment 17

Comment 18

Updated