Closed
Bug 964228
Opened 11 years ago
Closed 11 years ago
Race condition in NetworkStatsService.jsm
Categories
(Core :: Networking, defect)
Tracking
()
People
(Reporter: albert, Assigned: johnshih.bugs)
References
Details
Attachments
(7 files, 9 obsolete files)
35.55 KB,
text/x-csrc
|
Details | |
18.58 KB,
patch
|
johnshih.bugs
:
review+
|
Details | Diff | Splinter Review |
14.34 KB,
patch
|
albert
:
review+
|
Details | Diff | Splinter Review |
9.99 KB,
patch
|
albert
:
review+
|
Details | Diff | Splinter Review |
17.40 KB,
patch
|
johnshih.bugs
:
review+
fabrice
:
approval-mozilla-b2g28+
|
Details | Diff | Splinter Review |
18.60 KB,
patch
|
Details | Diff | Splinter Review | |
27.13 KB,
patch
|
Details | Diff | Splinter Review |
As seen in bug 961162 there is a race condition in 'saveStats' and 'updateCachedStats' functions. Is necessary to add a mechanism to ensure single or synchronized access to 'cachedStats' array.
Reporter | ||
Comment 1•11 years ago
|
||
Race condition can be identified because cachedStats array is cleared before calling _db.saveStats and produces the following error:
E/GeckoConsole( 3649): [JavaScript Error: "aStats is undefined" {file: "resource://gre/modules/NetworkStatsDB.jsm" line: 258}]
Comment 3•11 years ago
|
||
FYI:
I obtain the same error when I browse with Gecko-c5866e0 and Gaia-bb2e2f3:
E/GeckoConsole( 822): [JavaScript Error: "aStats is undefined" {file: "resource://gre/modules/NetworkStatsDB.jsm" line: 354}]
Regards
Comment 4•11 years ago
|
||
Occurs the same on v1.3 branch, after browse, when open datausage, I saw the same error:
E/GeckoConsole( 477): [JavaScript Error: "aStats is undefined" {file: "resource://gre/modules/NetworkStatsDB.jsm" line: 234}]
E/GeckoConsole( 477): [JavaScript Error: "aStats is undefined" {file: "resource://gre/modules/NetworkStatsDB.jsm" line: 234}]
status-b2g-v1.3:
--- → affected
Comment 5•11 years ago
|
||
I've been seeing occasionally this bug with the following STR:
1. Browse
2. Starts the data usage app.
Actual behaviour:
After launching the usage app, the data usage chart is not displayed.
Expected behavior:
Usage app can display data usage chart
After the error is produced if you reopen the application the behaviour is normal.
I tried to reproduce several times this bug and it doesn't occurs always, approximately the twenty percent of the time with the previous STR.
Regards
Sorry I was stuck on another emergency blocker this week. Will work on this bug ASAP. (As I said in Comment 2, I believe this bug will be covered by the work in Bug 928289, which is going to re-factor some logic of NetworkStatsService).
Comment 8•11 years ago
|
||
Please do not rely on the feature Bug 928289. 1.3+
blocking-b2g: 1.3? → 1.3+
In the patch I en-queue all the operations related to DB/cached read&write and It works for me.
But since I don't have a good STR to reproduce the race condition, I might need you or QA can help on test this solution.
Attachment #8384437 -
Flags: review?(gene.lian)
Attachment #8384437 -
Flags: review?(acperez)
Assignee | ||
Comment 10•11 years ago
|
||
Could you help on testing with the patch (or provide a good STR to reproduce the bug)? Actually I can't reproduce the bug on my own v1.3 build. But I still try to refactor the logic in NetworkStatsService and hope can solve the problem.
Flags: needinfo?(mri)
Comment 11•11 years ago
|
||
Hi,
i don't mind testing the patch. The STR to reproduce is explained on Comment 5.
1. Browse with the navigator.
2. Minimize the browser, although the page is not completely loaded.
3. Launchs Costcontrol application.
Anyway, I'm going to test now, and i'll give you my feedback ;).
Regards,
Mai
Flags: needinfo?(mri)
Assignee | ||
Comment 12•11 years ago
|
||
(In reply to marina rodríguez [:mai] from comment #11)
> Hi,
> i don't mind testing the patch. The STR to reproduce is explained on Comment
> 5.
> 1. Browse with the navigator.
> 2. Minimize the browser, although the page is not completely loaded.
> 3. Launchs Costcontrol application.
>
> Anyway, I'm going to test now, and i'll give you my feedback ;).
>
> Regards,
> Mai
Thanks for your help!
Reporter | ||
Comment 13•11 years ago
|
||
Comment on attachment 8384437 [details] [diff] [review]
(v1.3) Bug 964228: Fix race condition in NetworkStatsService.jsm. r=gene, albert
Review of attachment 8384437 [details] [diff] [review]:
-----------------------------------------------------------------
I can reproduce the bug in 1.3 and it is fixed with your patch
Thanks John!
::: dom/network/src/NetworkStatsService.jsm
@@ +549,5 @@
> }
>
> + // Prevent from executing processQueue multiple times while queue is already
> + // under process
> + if (this.isQueueRunning) {
'if' not needed, same 'if' statement is evaluated in processQueue function.
@@ +702,5 @@
> + queueType: QUEUE_TYPE_WRITE_CACHE });
> +
> + // Prevent from executing processQueue multiple times while queue is already
> + // undger process.
> + if (this.isQueueRunning) {
ditto
@@ +779,5 @@
> + this.updateQueue.push({ callbacks: [aCallback],
> + queueType: QUEUE_TYPE_UPDATE_CACHE });
> + // Prevent from executing processQueue multiple times while queue is already
> + // undger process.
> + if (this.isQueueRunning) {
ditto
Attachment #8384437 -
Flags: review?(acperez) → review+
Assignee | ||
Comment 14•11 years ago
|
||
(In reply to Albert [:albert] from comment #13)
> Comment on attachment 8384437 [details] [diff] [review]
> (v1.3) Bug 964228: Fix race condition in NetworkStatsService.jsm. r=gene,
> albert
>
> Review of attachment 8384437 [details] [diff] [review]:
> -----------------------------------------------------------------
>
> I can reproduce the bug in 1.3 and it is fixed with your patch
>
> Thanks John!
>
> ::: dom/network/src/NetworkStatsService.jsm
> @@ +549,5 @@
> > }
> >
> > + // Prevent from executing processQueue multiple times while queue is already
> > + // under process
> > + if (this.isQueueRunning) {
>
> 'if' not needed, same 'if' statement is evaluated in processQueue function.
>
> @@ +702,5 @@
> > + queueType: QUEUE_TYPE_WRITE_CACHE });
> > +
> > + // Prevent from executing processQueue multiple times while queue is already
> > + // undger process.
> > + if (this.isQueueRunning) {
>
> ditto
>
> @@ +779,5 @@
> > + this.updateQueue.push({ callbacks: [aCallback],
> > + queueType: QUEUE_TYPE_UPDATE_CACHE });
> > + // Prevent from executing processQueue multiple times while queue is already
> > + // undger process.
> > + if (this.isQueueRunning) {
>
> ditto
Yes, I also notice the duplicated logic here. But when I remove these check and test, the successive processQueue call (without callback) could occur, which doesn't happen with these check. I'm guessing there is a timing issue or other problems. Will investigate more on it tomorrow.
Assignee | ||
Comment 15•11 years ago
|
||
Patch update: remove redundant 'if' statement.
Attachment #8384437 -
Attachment is obsolete: true
Attachment #8384437 -
Flags: review?(gene.lian)
Attachment #8385143 -
Flags: review?(gene.lian)
Assignee | ||
Comment 16•11 years ago
|
||
Hi Albert,
After investigating deeper on the patch, I found another timing issue:
the error message is
E/GeckoConsole( 520): [JavaScript Error: "item is undefined" {file: "resource://gre/modules/NetworkStatsService.jsm" line: 584}]
It is caused by while a queue is under process (with multiple callbacks), another processQueue() is called. In this case, the processQueue() will be called one more time.
Please see the log below for more clear:
I/Gecko ( 520): jshih 4 processQueue start undefined
I/Gecko ( 520): jshih 4 queueType 0
I/Gecko ( 520): jshih 5 processQueue start true
I/Gecko ( 520): jshih 5 queue [{"netId":"00","callbacks":[null,null,null],"queueType":0}]
* Begins to handle execute callbacks
I/Gecko ( 520): jshih 6 processQueue start undefined
* Another processQueue is called
I/Gecko ( 520): jshih 6 queueType 1
I/Gecko ( 520): jshih 7 processQueue start undefined
I/Gecko ( 520): jshih 7 queue is running (length > 1). return.
I/Gecko ( 520): jshih 8 processQueue start undefined
I/Gecko ( 520): jshih 8 queue is running (length > 1). return.
I/Gecko ( 520): jshih 5 queueType 1
* Previous callbacks handling is ended. Cause executing processQueue one more time.
I/Gecko ( 520): jshih 9 processQueue start true
I/Gecko ( 520): jshih 9 queue [{"callbacks":[null],"queueType":1},{"callbacks":[null],"queueType":1},{"callbacks":[null],"queueType":1}]
I/Gecko ( 520): jshih 9 queueType 1
I/Gecko ( 520): jshih 10 processQueue start true
I/Gecko ( 520): jshih 10 queue [{"callbacks":[null],"queueType":1},{"callbacks":[null],"queueType":1}]
I/Gecko ( 520): jshih 10 queueType 1
I/Gecko ( 520): jshih 11 processQueue start true
I/Gecko ( 520): jshih 11 queue [{"callbacks":[null],"queueType":1}]
I/Gecko ( 520): jshih 11 turn isQueueRunning to false. return.
I/Gecko ( 520): jshih 12 processQueue start true
I/Gecko ( 520): jshih 12 queue []
* Error occurs, since there is no |callbacks| here.
What do you think on this case? Compare to my original method (add more check of isQueueRunning before each processQueue is called), I think the easy way might be adding a check (updateQueue is empty or not) before
let item = this.updateQueue.shift();
Assignee | ||
Comment 17•11 years ago
|
||
Debug message for processQueue
Assignee | ||
Comment 18•11 years ago
|
||
>
> What do you think on this case? Compare to my original method (add more
> check of isQueueRunning before each processQueue is called), I think the
> easy way might be adding a check (updateQueue is empty or not) before
> let item = this.updateQueue.shift();
One more thought! I think we can just change
if(this.updateQueue.length > 1)
to
if(this.updateQueue.length > 0)
Assignee | ||
Comment 19•11 years ago
|
||
Attachment #8385143 -
Attachment is obsolete: true
Attachment #8385143 -
Flags: review?(gene.lian)
Attachment #8385214 -
Flags: review?(gene.lian)
Attachment #8385214 -
Flags: review?(acperez)
Assignee | ||
Comment 20•11 years ago
|
||
patch for mozilla-central
Attachment #8385218 -
Flags: review?(gene.lian)
Attachment #8385218 -
Flags: review?(acperez)
status-b2g-v1.3T:
--- → affected
Assignee | ||
Comment 21•11 years ago
|
||
The modification here will be affected by Bug 922926, Bug 964270, which were tagged as 1.3T+. Ask for triage.
Assignee | ||
Comment 23•11 years ago
|
||
(In reply to Albert [:albert] from comment #22)
> John how do you reproduce it?
Firstly you should remove the all the 'if' statements you mentioned in my previous patch, and then remain
if(this.updateQueue.length > 1)
The STR is simple, just open Usage app (mostly occur in the first launch time)
the alternative way is
1. Open a browser
2. Navigate to a web (find one needs to load for a while, e.g., 'tw.yahoo.com', which I always test with)
3. During wed loading, press home button
4. Launch Usage app
It's easily reproducible for me.
Reporter | ||
Comment 24•11 years ago
|
||
Comment on attachment 8385214 [details] [diff] [review]
(v1.3) Bug 964228: Fix race condition in NetworkStatsService.jsm. r=gene, albert
Review of attachment 8385214 [details] [diff] [review]:
-----------------------------------------------------------------
Looks good for me.
::: dom/network/src/NetworkStatsService.jsm
@@ +578,5 @@
> // The caller is a function that has pushed new elements to the queue,
> // if isQueueRunning is false it means there is no processing currently
> // being done, so start.
> if (this.isQueueRunning) {
> + if(this.updateQueue.length > 0) {
Now that we are checking that length > 0, does make sense to have two if statements? I mean, we can return if isQueueRunning. Are you agree?
Attachment #8385214 -
Flags: review?(acperez) → review+
Reporter | ||
Comment 25•11 years ago
|
||
Comment on attachment 8385218 [details] [diff] [review]
Bug 964228: Fix race condition in NetworkStatsService.jsm. r=gene, albert
Review of attachment 8385218 [details] [diff] [review]:
-----------------------------------------------------------------
Looks fine.
Same comment as the one for 1.3 patch regarding the if in processQueue.
Can you launch a try to ensure nothing is broken?
Attachment #8385218 -
Flags: review?(acperez) → review+
Assignee | ||
Comment 26•11 years ago
|
||
Update based on Comment 24.
Attachment #8385214 -
Attachment is obsolete: true
Attachment #8385214 -
Flags: review?(gene.lian)
Attachment #8385806 -
Flags: review?(gene.lian)
Assignee | ||
Comment 27•11 years ago
|
||
update based on Comment 25
Attachment #8385218 -
Attachment is obsolete: true
Attachment #8385218 -
Flags: review?(gene.lian)
Attachment #8385807 -
Flags: review?(gene.lian)
Assignee | ||
Comment 28•11 years ago
|
||
Fix some errors founded by testing.
Attachment #8385807 -
Attachment is obsolete: true
Attachment #8385807 -
Flags: review?(gene.lian)
Attachment #8386610 -
Flags: review?(gene.lian)
Assignee | ||
Comment 29•11 years ago
|
||
tests update for corresponding change
Attachment #8386613 -
Flags: review?(gene.lian)
Attachment #8386613 -
Flags: review?(acperez)
Assignee | ||
Comment 30•11 years ago
|
||
fix nits
Attachment #8386613 -
Attachment is obsolete: true
Attachment #8386613 -
Flags: review?(gene.lian)
Attachment #8386613 -
Flags: review?(acperez)
Attachment #8386616 -
Flags: review?(gene.lian)
Attachment #8386616 -
Flags: review?(acperez)
Assignee | ||
Comment 31•11 years ago
|
||
Attachment #8385806 -
Attachment is obsolete: true
Attachment #8385806 -
Flags: review?(gene.lian)
Attachment #8386621 -
Flags: review?(gene.lian)
Assignee | ||
Comment 32•11 years ago
|
||
Fix nits.
Attachment #8386616 -
Attachment is obsolete: true
Attachment #8386616 -
Flags: review?(gene.lian)
Attachment #8386616 -
Flags: review?(acperez)
Attachment #8386707 -
Flags: review?(gene.lian)
Attachment #8386707 -
Flags: review?(acperez)
Assignee | ||
Comment 33•11 years ago
|
||
Test cases for v1.3
Attachment #8386709 -
Flags: review?(gene.lian)
Attachment #8386709 -
Flags: review?(acperez)
Assignee | ||
Comment 34•11 years ago
|
||
Comment 35•11 years ago
|
||
I don't really have time to review this until next week because I'm joining the work week with partners. Having Albert's review is enough to land if we're rushing to fix this for v1.3.
Again, please remember to get Fabrice's approval before uplifting.
Assignee | ||
Comment 36•11 years ago
|
||
[Approval Request Comment]
Bug caused by (feature/regressing bug #):
User impact if declined: Caused JS Error sometimes when launch CostControl App
Testing completed: yes
Risk to taking this patch (and alternatives if risky): low
String or UUID changes made by this patch: none
Attachment #8386621 -
Attachment is obsolete: true
Attachment #8386621 -
Flags: review?(gene.lian)
Attachment #8387347 -
Flags: review+
Flags: needinfo?(fabrice)
Assignee | ||
Comment 37•11 years ago
|
||
Comment on attachment 8386610 [details] [diff] [review]
Bug 964228 - Part 1: Fix race condition in NetworkStatsService.jsm. r=gene, albert
Review of attachment 8386610 [details] [diff] [review]:
-----------------------------------------------------------------
r=albert.
Attachment #8386610 -
Flags: review?(gene.lian) → review+
Assignee | ||
Comment 38•11 years ago
|
||
Comment on attachment 8386707 [details] [diff] [review]
Bug 964228 - Part 2: Tests update. r=gene, albert
Review of attachment 8386707 [details] [diff] [review]:
-----------------------------------------------------------------
Since Gene is not available for review, removing the flag.
Attachment #8386707 -
Flags: review?(gene.lian)
Assignee | ||
Comment 39•11 years ago
|
||
Comment on attachment 8386709 [details] [diff] [review]
(v1.3) Bug 964228 - Part 2: Tests update. r=gene, albert
Review of attachment 8386709 [details] [diff] [review]:
-----------------------------------------------------------------
Since Gene is not available for review, removing the flag.
Attachment #8386709 -
Flags: review?(gene.lian)
Comment 40•11 years ago
|
||
John, just ask for approval when it's ready, I track these.
Flags: needinfo?(fabrice)
Reporter | ||
Comment 41•11 years ago
|
||
Comment on attachment 8386707 [details] [diff] [review]
Bug 964228 - Part 2: Tests update. r=gene, albert
Review of attachment 8386707 [details] [diff] [review]:
-----------------------------------------------------------------
Looks fine
Attachment #8386707 -
Flags: review?(acperez) → review+
Reporter | ||
Comment 42•11 years ago
|
||
Comment on attachment 8386709 [details] [diff] [review]
(v1.3) Bug 964228 - Part 2: Tests update. r=gene, albert
Review of attachment 8386709 [details] [diff] [review]:
-----------------------------------------------------------------
Thanks John!
Attachment #8386709 -
Flags: review?(acperez) → review+
Updated•11 years ago
|
Component: General → Networking
Product: Firefox OS → Core
Assignee | ||
Comment 43•11 years ago
|
||
Fabrice,
I think it is ready for landing now, ask for approval.
Flags: needinfo?(fabrice)
Assignee | ||
Comment 44•11 years ago
|
||
Hi Joe,
As I mentioned in Comment 21, this bug should also be uplifted to 1.3T. Can you take a look and make a confirmation?
Flags: needinfo?(jcheng)
Comment 45•11 years ago
|
||
John, you need to ask for approval in the patch details page, and fill in the approval questionnaire.
Flags: needinfo?(fabrice)
Reporter | ||
Comment 46•11 years ago
|
||
John, 1.4 also need the patch.
Comment 48•11 years ago
|
||
since its 1.3+, it should also get uplifted to 1.3T
Flags: needinfo?(jcheng)
Assignee | ||
Comment 49•11 years ago
|
||
Comment on attachment 8387347 [details] [diff] [review]
(v1.3) Bug 964228 - Part 1: Fix race condition in NetworkStatsService.jsm. r=albert a=fabrice
Review of attachment 8387347 [details] [diff] [review]:
-----------------------------------------------------------------
[Approval Request Comment]
Bug caused by (feature/regressing bug #):
User impact if declined: Caused JS Error sometimes when launch CostControl App
Testing completed: yes
Risk to taking this patch (and alternatives if risky): low
String or UUID changes made by this patch: none
Attachment #8387347 -
Flags: approval-mozilla-b2g28?
Assignee | ||
Comment 50•11 years ago
|
||
(In reply to Albert [:albert] from comment #46)
> John, 1.4 also need the patch.
The patch for 1.4 is already provided, so I'm guessing you are saying 1.3t, right ? ;)
I'll provide the patch later.
Assignee | ||
Comment 51•11 years ago
|
||
Rebase for v1.3t
Assignee | ||
Comment 52•11 years ago
|
||
rebase for v1.3t
Updated•11 years ago
|
Attachment #8387347 -
Flags: approval-mozilla-b2g28? → approval-mozilla-b2g28+
Keywords: checkin-needed
Comment 53•11 years ago
|
||
https://hg.mozilla.org/integration/mozilla-inbound/rev/861f6c26ef3a
https://hg.mozilla.org/integration/mozilla-inbound/rev/49ec48262f79
Flags: in-testsuite+
Keywords: checkin-needed
Comment 54•11 years ago
|
||
https://hg.mozilla.org/mozilla-central/rev/861f6c26ef3a
https://hg.mozilla.org/mozilla-central/rev/49ec48262f79
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla30
Comment 55•11 years ago
|
||
https://hg.mozilla.org/releases/mozilla-b2g28_v1_3/rev/80069f20fa18
https://hg.mozilla.org/releases/mozilla-b2g28_v1_3/rev/2a8e950d9099
status-b2g-v1.4:
--- → fixed
status-firefox28:
--- → wontfix
status-firefox29:
--- → wontfix
status-firefox30:
--- → fixed
Comment 56•11 years ago
|
||
Goes without saying that this made for some good times when merging b2g28 to v1.3t. Please double-check the files patched in this bug on the 1.3t branch to make sure everything looks sane. Basically what I did was revert them to their pre-merge versions then manually apply the 1.3t patches attached to this bug, and committed the results as part of the merge cset. Everything went smoothly like that, but it would still be nice to get confirmation :)
Flags: needinfo?(jshih)
Assignee | ||
Comment 57•11 years ago
|
||
Thanks Ryan! I've verified the newest 1.3t build locally, and it works fine (both running device and test cases)
Flags: needinfo?(jshih)
Comment 58•11 years ago
|
||
Thank *you* for posting the 1.3t patches. I would have been completely lost without them :)
You need to log in
before you can comment on or make changes to this bug.
Description
•