Intermittent silent mochitest-2 failure after TEST-START | dom/tests/mochitest/dom-level0/test_setting_document.domain_idn.html or test_location.html or test_setting_document.domain_to_shortened_ipaddr.html in Android native fennec

RESOLVED FIXED

Status

()

defect
RESOLVED FIXED
8 years ago
7 years ago

People

(Reporter: mbrubeck, Unassigned)

Tracking

({intermittent-failure})

Firefox Tracking Flags

(Not tracked)

Details

(Whiteboard: [test which aborts the suite][android_tier_1])

Attachments

(1 attachment)

(Reporter)

Description

8 years ago
Android tests on the Birch branch seem to frequently just stop in test_setting_document.domain_idn.html:

https://tbpl.mozilla.org/php/getParsedLog.php?id=7652469&tree=Birch
Android Tegra 250 birch opt test mochitest-2 on 2011-11-29 23:08:52 PST for push c77033ad9656

11 INFO TEST-START | /tests/dom/tests/mochitest/dom-level0/test_location.html
12 INFO TEST-PASS | /tests/dom/tests/mochitest/dom-level0/test_location.html | got a new location object in the iframe
13 INFO TEST-PASS | /tests/dom/tests/mochitest/dom-level0/test_location.html | can read the new prop - 42 should equal 42
14 INFO TEST-PASS | /tests/dom/tests/mochitest/dom-level0/test_location.html | got a new location object
15 INFO TEST-PASS | /tests/dom/tests/mochitest/dom-level0/test_location.html | can read location.href
16 INFO TEST-PASS | /tests/dom/tests/mochitest/dom-level0/test_location.html | correctly threw a permission denied security error when reading location.href
17 INFO TEST-PASS | /tests/dom/tests/mochitest/dom-level0/test_location.html | correctly threw a permission denied security error an expando on location
18 INFO TEST-PASS | /tests/dom/tests/mochitest/dom-level0/test_location.html | got a new location object in the iframe
19 INFO TEST-PASS | /tests/dom/tests/mochitest/dom-level0/test_location.html | can still read old values of the location object - 42 should equal 42
20 INFO TEST-PASS | /tests/dom/tests/mochitest/dom-level0/test_location.html | location objects are distinct
21 INFO TEST-PASS | /tests/dom/tests/mochitest/dom-level0/test_location.html | can read location.href
22 INFO TEST-END | /tests/dom/tests/mochitest/dom-level0/test_location.html | finished in 1318ms
23 INFO TEST-START | /tests/dom/tests/mochitest/dom-level0/test_location_setters.html
24 INFO TEST-PASS | /tests/dom/tests/mochitest/dom-level0/test_location_setters.html | Location should not have become string - object should equal object
25 INFO TEST-PASS | /tests/dom/tests/mochitest/dom-level0/test_location_setters.html | href of content window's location should match url of current test - data:text/plain,1 should equal data:text/plain,1
26 INFO TEST-PASS | /tests/dom/tests/mochitest/dom-level0/test_location_setters.html | Location should not have become string - object should equal object
27 INFO TEST-PASS | /tests/dom/tests/mochitest/dom-level0/test_location_setters.html | href of content window's location should match url of current test - data:text/plain,2 should equal data:text/plain,2
28 INFO TEST-PASS | /tests/dom/tests/mochitest/dom-level0/test_location_setters.html | Location should not have become string - object should equal object
29 INFO TEST-PASS | /tests/dom/tests/mochitest/dom-level0/test_location_setters.html | href of content window's location should match url of current test - data:text/plain,3 should equal data:text/plain,3
30 INFO TEST-END | /tests/dom/tests/mochitest/dom-level0/test_location_setters.html | finished in 1076ms
31 INFO TEST-START | /tests/dom/tests/mochitest/dom-level0/test_setting_document.domain_idn.html
INFO | automation.py | Application ran for: 0:00:25.543786
INFO | automation.py | Reading PID log: /tmp/tmpObXNpPpidlog
getting files in '/mnt/sdcard/tests/profile/minidumps/'
WARNING | automationutils.processLeakLog() | refcount logging is off, so leaks can't be detected!

INFO | runtests.py | Running tests: end.
removing file: /mnt/sdcard/tests/logs/mochitest.log
program finished with exit code 0
elapsedTime=45.450814
TinderboxPrint: mochitest-plain_2<br/><em class="testfail">T-FAIL</em>
======== Finished 'python mochitest/runtestsremote.py ...' warnings (results: 1, elapsed: 45 secs) ========

Also seen in:
https://tbpl.mozilla.org/php/getParsedLog.php?id=7651539&tree=Birch
https://tbpl.mozilla.org/php/getParsedLog.php?id=7665505&tree=Birch
(Reporter)

Updated

8 years ago
Duplicate of this bug: 706497
(Reporter)

Comment 2

8 years ago
https://tbpl.mozilla.org/php/getParsedLog.php?id=7712025&tree=Birch

also this one failed just one test later:
https://tbpl.mozilla.org/php/getParsedLog.php?id=7716776&tree=Birch
54 INFO TEST-END | /tests/dom/tests/mochitest/dom-level0/test_setting_document.domain_idn.html | finished in 716ms
55 INFO TEST-START | /tests/dom/tests/mochitest/dom-level0/test_setting_document.domain_to_shortened_ipaddr.html
INFO | automation.py | Application ran for: 0:00:23.249400
INFO | automation.py | Reading PID log: /tmp/tmp_4rIDKpidlog
getting files in '/mnt/sdcard/tests/profile/minidumps/'
WARNING | automationutils.processLeakLog() | refcount logging is off, so leaks can't be detected!

INFO | runtests.py | Running tests: end.
removing file: /mnt/sdcard/tests/logs/mochitest.log
program finished with exit code 0
elapsedTime=51.101540
TinderboxPrint: mochitest-plain_2<br/><em class="testfail">T-FAIL</em>
======== Finished 'python mochitest/runtestsremote.py ...' warnings (results: 1, elapsed: 51 secs) ========
(Reporter)

Comment 4

8 years ago
and this one looks like it happened one test earlier (or at least, that's when the log stops):

https://tbpl.mozilla.org/php/getParsedLog.php?id=7708145&tree=Birch
10 INFO TEST-END | /tests/dom/tests/mochitest/dom-level0/test_innerWidthHeight_script.html | finished in 1237ms
11 INFO TEST-START | /tests/dom/tests/mochitest/dom-level0/test_location.html
12 INFO TEST-PASS | /tests/dom/tests/mochitest/dom-level0/test_location.html | got a new location object in the iframe
13 INFO TEST-PASS | /tests/dom/tests/mochitest/dom-level0/test_location.html | can read the new prop - 42 should equal 42
14 INFO TEST-PASS | /tests/dom/tests/mochitest/dom-level0/test_location.html | got a new location object
15 INFO TEST-PASS | /tests/dom/tests/mochitest/dom-level0/test_location.html | can read location.href
INFO | automation.py | Application ran for: 0:00:23.322967
INFO | automation.py | Reading PID log: /tmp/tmpy6nRs5pidlog
getting files in '/mnt/sdcard/tests/profile/minidumps/'
WARNING | automationutils.processLeakLog() | refcount logging is off, so leaks can't be detected!

INFO | runtests.py | Running tests: end.
failed to validate file when downloading /mnt/sdcard/tests/logs/mochitest.log!
removing file: /mnt/sdcard/tests/logs/mochitest.log
program finished with exit code 0
elapsedTime=49.849667
TinderboxPrint: mochitest-plain_2<br/><em class="testfail">T-FAIL</em>
Summary: Intermittent silent mochitest-2 failure after TEST-START | /tests/dom/tests/mochitest/dom-level0/test_setting_document.domain_idn.html in Android native fennec → Intermittent silent mochitest-2 failure after TEST-START | dom/tests/mochitest/dom-level0/test_setting_document.domain_idn.html or test_location.html or test_setting_document.domain_to_shortened_ipaddr.html in Android native fennec
Of the failures that I looked at, all have this in the logcat output:

12-02 18:29:02.707 E/AndroidRuntime( 1868): FATAL EXCEPTION: main
12-02 18:29:02.707 E/AndroidRuntime( 1868): java.lang.OutOfMemoryError: bitmap size exceeds VM budget
12-02 18:29:02.707 E/AndroidRuntime( 1868): 	at android.graphics.Bitmap.nativeCreate(Native Method)
12-02 18:29:02.707 E/AndroidRuntime( 1868): 	at android.graphics.Bitmap.createBitmap(Bitmap.java:468)
12-02 18:29:02.707 E/AndroidRuntime( 1868): 	at org.mozilla.gecko.gfx.GeckoSoftwareLayerClient.getBitmap(GeckoSoftwareLayerClient.java:185)
12-02 18:29:02.707 E/AndroidRuntime( 1868): 	at org.mozilla.gecko.GeckoApp$20.run(GeckoApp.java:1100)
12-02 18:29:02.707 E/AndroidRuntime( 1868): 	at android.os.Handler.handleCallback(Handler.java:587)
12-02 18:29:02.707 E/AndroidRuntime( 1868): 	at android.os.Handler.dispatchMessage(Handler.java:92)
12-02 18:29:02.707 E/AndroidRuntime( 1868): 	at android.os.Looper.loop(Looper.java:123)
12-02 18:29:02.707 E/AndroidRuntime( 1868): 	at org.mozilla.gecko.GeckoApp$26.run(GeckoApp.java:1335)
12-02 18:29:02.707 E/AndroidRuntime( 1868): 	at android.os.Handler.handleCallback(Handler.java:587)
12-02 18:29:02.707 E/AndroidRuntime( 1868): 	at android.os.Handler.dispatchMessage(Handler.java:92)
12-02 18:29:02.707 E/AndroidRuntime( 1868): 	at android.os.Looper.loop(Looper.java:123)
12-02 18:29:02.707 E/AndroidRuntime( 1868): 	at android.app.ActivityThread.main(ActivityThread.java:4627)
12-02 18:29:02.707 E/AndroidRuntime( 1868): 	at java.lang.reflect.Method.invokeNative(Native Method)
12-02 18:29:02.707 E/AndroidRuntime( 1868): 	at java.lang.reflect.Method.invoke(Method.java:521)
12-02 18:29:02.707 E/AndroidRuntime( 1868): 	at com.android.internal.os.ZygoteInit$MethodAndArgsCaller.run(ZygoteInit.java:868)
12-02 18:29:02.707 E/AndroidRuntime( 1868): 	at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:626)
12-02 18:29:02.707 E/AndroidRuntime( 1868): 	at dalvik.system.NativeStart.main(Native Method)
12-02 18:29:02.717 W/ActivityManager( 1017):   Force finishing activity org.mozilla.fennec/.App

We should be able to handle this in fennec code, even if it's just a catch(OutOfMemoryError)
More failure logs:

https://tbpl.mozilla.org/php/getParsedLog.php?id=7728009&tree=Birch
https://tbpl.mozilla.org/php/getParsedLog.php?id=7731731&tree=Birch

The first one fails at test_bug691215.html, second one at test_setting_document.domain_idn.html
(Reporter)

Comment 8

8 years ago
Comment on attachment 578870 [details] [diff] [review]
Catch OOM to prevent it killing fennec

Is there anything else we should also be doing, to avoid the OOM situation in the first place?  Are we trying to create especially large bitmaps here for some reason, or is it just that we are hitting the VM limit in general and this is where it happens to show up?
Attachment #578870 - Flags: review?(mbrubeck) → review+
Yeah, to create the thumbnail it looks like we first create a tile-size (1024x2048, currently) bitmap, and then extract the screen-sized top-left corner of it for the thumbnail. The tile bitmap allocation is where the OOM is happening, so avoiding that intermediate step would help. If we're only going to render the thumbnail scaled down, then a full-size bitmap is also an unnecessary intermediate step that costs memory, particularly on large-screen devices like tablets.
https://hg.mozilla.org/projects/birch/rev/1bc10492926e

We should probably file a new bug if we want to prevent allocating the memory in the first place as the change will be non-trivial.
Closing this bug since my patch seems to have fixed it. Bug 707675 can be used to track reducing the memory allocation.
Status: NEW → RESOLVED
Last Resolved: 8 years ago
Resolution: --- → FIXED
(Reporter)

Updated

8 years ago
Blocks: 707675
Whiteboard: [orange][test which aborts the suite][android_tier_1] → [test which aborts the suite][android_tier_1]
You need to log in before you can comment on or make changes to this bug.