Closed Bug 664371 Opened 14 years ago Closed 12 years ago

Intermittent tp5 "browser frozen" (or the lying "crash during run (stack found)" for Linux) or "timeout exceeded" while loading www.alipay.com

Categories

(Testing :: Talos, defect)

defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: philor, Unassigned)

References

Details

(Keywords: intermittent-failure, Whiteboard: [see comment 59 for occurence data])

http://tinderbox.mozilla.org/showlog.cgi?log=TraceMonkey/1308117068.1308119477.5380.gz
Rev3 MacOSX Snow Leopard 10.6.2 tracemonkey talos tp on 2011/06/14 22:51:08
s: talos-r3-snow-019
NOISE: Cycle 2: loaded http://localhost/page_load_test/tp5/goo.ne.jp/goo.ne.jp/index.html (next: http://localhost/page_load_test/tp5/alipay.com/www.alipay.com/index.html)
sh: line 1: 523 Terminated ../Nightly.app/Contents/MacOS/firefox-bin -foreground -profile /var/folders/H5/H5TD8hgwEqKq9hgKlayjWU+++TM/-Tmp-/tmpRlPpXp/profile -tp page_load_test/tp5/tp5.manifest -tpchrome -tpnoisy -tpformat tinderbox -tpcycles 10 > browser_output.txt
NOISE:
NOISE: __FAILbrowser frozen__FAIL
Error in collecting counter: Private Bytes
Error in collecting counter: RSS
http://tinderbox.mozilla.org/showlog.cgi?log=Mozilla-Inbound/1308750427.1308753803.324.gz
Rev3 MacOSX Snow Leopard 10.6.2 mozilla-inbound talos tp on 2011/06/22 06:47:07
s: talos-r3-snow-041
NOISE: Cycle 2: loaded http://localhost/page_load_test/tp5/goo.ne.jp/goo.ne.jp/index.html (next: http://localhost/page_load_test/tp5/alipay.com/www.alipay.com/index.html)
sh: line 1: 562 Terminated ../Nightly.app/Contents/MacOS/firefox-bin -foreground -profile /var/folders/H5/H5TD8hgwEqKq9hgKlayjWU+++TM/-Tmp-/tmprPdJyB/profile -tp page_load_test/tp5/tp5.manifest -tpchrome -tpnoisy -tpformat tinderbox -tpcycles 10 > browser_output.txt
Error in collecting counter: Private Bytes
Error in collecting counter: RSS
NOISE:
NOISE: __FAILbrowser frozen__FAIL
Error in collecting counter: Private Bytes
Error in collecting counter: RSS
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1308824371.1308826843.22719.gz (which, being Linux, claims it was a crash and has a stack for the hang, though nothing in it helps *me* any).
http://tinderbox.mozilla.org/showlog.cgi?log=TraceMonkey/1308963757.1308966219.24627.gz
Not sure why I put this in Talos, since it's always hanging in the same place, so the hypothetical person who hypothetically does record and replay ought to be able to hit the same hang by just loading some stuff, loading http://goo.ne.jp/, and then loading http://www.alipay.com/.
Component: Talos → General
OS: Mac OS X → All
Product: Testing → Core
QA Contact: talos → general
Hardware: x86 → All
Summary: Intermittent tp5 "browser frozen" → Intermittent tp5 "browser frozen" (or the lying "crash during run (stack found)" for Linux) while loading www.alipay.com
http://tinderbox.mozilla.org/showlog.cgi?log=Mozilla-Aurora/1312398730.1312400816.7884.gz
NOISE: Cycle 8: loaded http://localhost/page_load_test/tp5/goo.ne.jp/goo.ne.jp/index.html (next: http://localhost/page_load_test/tp5/alipay.com/www.alipay.com/index.html)
NOISE:
NOISE: __FAILbrowser frozen__FAIL
sh: line 1: 331 Terminated ../Aurora.app/Contents/MacOS/firefox-bin -foreground -profile /var/folders/H5/H5TD8hgwEqKq9hgKlayjWU+++TM/-Tmp-/tmpyRllpJ/profile -tp page_load_test/tp5/tp5.manifest -tpchrome -tpnoisy -tpformat tinderbox -tpcycles 10 > browser_output.txt
Failed tp5: Stopped Wed, 03 Aug 2011 12:46:43
FAIL: Busted: tp5
FAIL: browser frozen
Completed test tp5: Stopped Wed, 03 Aug 2011 12:46:43
RETURN: cycle time: 00:29:37<br>
talos-r3-snow-039: Stopped Wed, 03 Aug 2011 12:46:43
Sending results: Started Wed, 03 Aug 2011 12:46:43
RETURN:<br>
RETURN:<p style="font-size:smaller;">Details:<br>|</p>
Completed sending results: Stopped Wed, 03 Aug 2011 12:46:43
program finished with exit code 0
elapsedTime=1777.918994
TinderboxPrint:s: talos-r3-snow-039
TinderboxPrint:id:20110803094919
TinderboxPrint:<a href = "http://hg.mozilla.org/releases/mozilla-aurora/rev/17d82fcd27ad">rev:17d82fcd27ad</a>
TinderboxPrint:FAIL: Busted: tp5
TinderboxPrint:FAIL: browser frozen
TinderboxPrint: cycle time: 00:29:37<br>
TinderboxPrint:<br>
TinderboxPrint:<p style="font-size:smaller;">Details:<br>|</p>
=== Output ended ===
======== BuildStep ended ========
======== BuildStep started ========
reboot after 1 test run slave lost
=== Output ===
=== Output ended ===
======== BuildStep ended ========
Timebomb, zombocom, the clusters and timing are awfully suspicious.
Component: General → Talos
Product: Core → Testing
QA Contact: general → talos
Since talos is sort of opaque, dunno if this is the clue to what all the others are about, or something separate (which would be amazing, since it's right within that late-21:00-23:00 block last night):
https://tbpl.mozilla.org/php/getParsedLog.php?id=7017178&tree=Mozilla-Inbound
Rev3 WINNT 5.1 mozilla-inbound talos tp on 2011-10-24 21:47:55 PDT for push b87afa49527d
NOISE: Cycle 3: loaded http://localhost/page_load_test/tp5/goo.ne.jp/goo.ne.jp/index.html (next: http://localhost/page_load_test/tp5/alipay.com/www.alipay.com/index.html)
Traceback (most recent call last):
  File "bcontroller.py", line 235, in ?
    sys.exit(main())
  File "bcontroller.py", line 232, in main
    bcontroller.run()
  File "bcontroller.py", line 175, in run
    results_file = open(self.browser_log, "a")
IOError: [Errno 13] Permission denied: 'browser_output.txt'
Failed tp5:
Odd that it makes it to cycle 3 and only then has a problem with browser_output.txt. I suspect something else is reading or writing this file at the same time. This is probably a different problem from a crash or hang of the browser.
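If the "Permission denied" error really does come from another process briefly holding browser_output.txt (an unconfirmed guess), one way to tolerate it is to retry the open a few times before giving up. The sketch below is illustrative only: open_with_retry and its attempts and delay parameters are hypothetical names, not part of bcontroller.py, and it is written for Python 3, whereas the traceback above comes from an older Python 2 interpreter.

```python
import time


def open_with_retry(path, mode="a", attempts=5, delay=0.5):
    """Open path, retrying when the OS reports a permission error.

    Hypothetical helper, not taken from Talos. It assumes the error is
    transient, for example another process holding the file for a moment.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return open(path, mode)
        except PermissionError:
            # Out of retries: let the caller see the original error.
            if attempt == attempts:
                raise
            time.sleep(delay)
```

A retry like this would only hide the symptom. Finding out what else touches the file would still be needed to confirm the guess.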
Summary: Intermittent tp5 "browser frozen" (or the lying "crash during run (stack found)" for Linux) while loading www.alipay.com → Intermittent tp5 "browser frozen" (or the lying "crash during run (stack found)" for Linux) or "timeout exceeded" while loading www.alipay.com
This has a really odd history: it went to 0 for a while and then recently spiked. I'd start by looking at code checkins from around this time, both in talos and in Firefox, to see whether something we changed caused the behavior we're seeing. See: http://brasstacks.mozilla.com/orangefactor/?display=Bug&tree=mozilla-central&endday=2011-10-31&startday=2011-10-03&bugid=664371
Whiteboard: [orange] → [orange][see comment 59 for occurence data]
There was a big flare-up on Oct 29th (Saturday). I don't recall any talos or pageloader bits landing on the Friday before. Maybe DST changes caused problems?
Oops, comment 63 is actually a new bug: bug 701935.
https://tbpl.mozilla.org/php/getParsedLog.php?id=8324454&tree=Services-Central
(In reply to Phil Ringnalda (:philor) from comment #39)
> Timebomb, zombocom, the clusters and timing are awfully suspicious.
$5 on zombocom.
Depends on: 720852
Fixed by not hitting the network anymore; bets on zombocom pay off.
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Whiteboard: [orange][see comment 59 for occurence data] → [see comment 59 for occurence data]