Closed Bug 964191 Opened 10 years ago Closed 10 years ago

Crash [@ gdk_visual_get_blue_pixel_details ] under PK11PasswordPrompt during mozmill testrun

Categories

(Core :: Graphics, defect)

24 Branch
x86
Linux
defect
Not set
critical

Tracking

()

RESOLVED WONTFIX
Tracking Status
firefox-esr24 --- wontfix
firefox-esr31 --- unaffected

People

(Reporter: cosmin-malutan, Unassigned)

Details

(Keywords: crash, Whiteboard: [mozmill])

Crash Data

Attachments

(3 files, 5 obsolete files)

Attached file jenkins.log
Firefox 24.2.0esrpre crashed during a mozmill testrun on Ubuntu 13.10 (mm-ub-1310-32-4)
I submitted the crash:
https://crash-stats.mozilla.com/report/index/641c64aa-37a6-4d05-87b9-1239a2140127
I'm attaching the Jenkins log. The crash submit form was open while I was investigating a different failure, so I found the Jenkins log by searching by machine, build ID and testrun.
Attachment #8365838 - Attachment mime type: text/x-log → text/plain
Cosmin, this bug needs further investigation. Developers don't know how to read those logs or how to get this reproduced. Please explain what you have filed here as a crasher so other people can work on it. Which test was causing the crash? Is it reproducible?
Severity: normal → critical
Crash Signature: Firefox 24.2.0esrpre Crash Report [@ gdk_visual_get_blue_pixel_details ] → [@ gdk_visual_get_blue_pixel_details ]
Flags: needinfo?(cosmin.malutan)
Keywords: crash
Whiteboard: [mozmill]
It failed after the restart test restartTests/testPreferences_masterPassword/test2.js. I ran the testrun again with the affected build but it didn't fail.

One thing I noticed is that each time we opened the firefox we had a dump:
03:38:56 (process:7740): GLib-CRITICAL **: g_slice_set_config: assertion 'sys_page_size == 0' failed
And when it failed we had:
03:40:22 (firefox:7740): GLib-GObject-CRITICAL **: g_object_ref: assertion 'G_IS_OBJECT (object)' failed
Flags: needinfo?(cosmin.malutan)
(In reply to Cosmin Malutan from comment #2)
> It failed after restart test
> restartTests/testPreferences_masterPassword/test2.js, I ran the testran
> again with the affected build but it didn't failed.

You ran it once more? I would suggest you keep it running in a loop for about 100 times. Then also with --console-level=DEBUG so we get more information about the actual test step when it crashed.
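
The loop suggested here could be scripted roughly as follows. This is only a sketch: it assumes a crash can be detected either by the PROCESS-CRASH marker that mozmill prints in the logs on this bug or by a non-zero exit code (the exit-code behaviour is an assumption), and the helper name `run_until_crash` is hypothetical.

```python
import subprocess

# Marker taken from the mozmill output quoted on this bug.
CRASH_MARKER = "PROCESS-CRASH"


def run_until_crash(command, max_runs=100, marker=CRASH_MARKER):
    """Run `command` up to `max_runs` times.

    Returns (run_number, output) for the first run whose combined
    stdout/stderr contains `marker` or whose exit code is non-zero
    (assumed to indicate a failure). Returns (None, None) if every
    run was clean.
    """
    for run in range(1, max_runs + 1):
        result = subprocess.run(
            command,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            universal_newlines=True,
        )
        if marker in result.stdout or result.returncode != 0:
            return run, result.stdout
    return None, None


# Example invocation (paths are placeholders, adjust to your checkout):
# run_until_crash([
#     "mozmill",
#     "-m", "mozmill-tests/firefox/tests/functional/restartTests/"
#           "testPreferences_masterPassword/manifest.ini",
#     "-b", "builds/firefox/firefox",
#     "--console-level=DEBUG",
# ])
```

Keeping the output of the failing run means the DEBUG console log for the crashing test step is available without re-running.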

> One thing I noticed is that each time we opened the firefox we had a dump:
> 03:38:56 (process:7740): GLib-CRITICAL **: g_slice_set_config: assertion
> 'sys_page_size == 0' failed

Right, that shouldn't be the problem. It always happens and is being tracked on bug 833117.

> And when it failed we had:
> 03:40:22 (firefox:7740): GLib-GObject-CRITICAL **: g_object_ref: assertion
> 'G_IS_OBJECT (object)' failed

Interesting! The actual test step could give us a sign for which window this happened. But not sure if this caused the crash.
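
To separate the always-present g_slice_set_config message from the interesting assertions when scanning a log, a small filter along these lines might help. It is a sketch only: the regular expression is derived from the log lines quoted above, and the function name `unexpected_criticals` is made up for illustration.

```python
import re

# Matches lines such as:
# (firefox:7740): GLib-GObject-CRITICAL **: g_object_ref: assertion '...' failed
CRITICAL_RE = re.compile(
    r"\((?P<proc>[^:()]+):(?P<pid>\d+)\): (?P<domain>[\w-]+)-CRITICAL \*\*: "
    r"(?P<func>\w+): (?P<message>.*)"
)

# Known noise: happens on every start and is tracked in bug 833117.
KNOWN_NOISE = {"g_slice_set_config"}


def unexpected_criticals(lines, ignore=KNOWN_NOISE):
    """Return (domain, function, message) for each CRITICAL line not in `ignore`."""
    hits = []
    for line in lines:
        match = CRITICAL_RE.search(line)
        if match and match.group("func") not in ignore:
            hits.append(
                (match.group("domain"), match.group("func"), match.group("message"))
            )
    return hits
```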
(In reply to Cosmin Malutan from comment #2)
> And when it failed we had:
> 03:40:22 (firefox:7740): GLib-GObject-CRITICAL **: g_object_ref: assertion
> 'G_IS_OBJECT (object)' failed

Yes, that is bad.  May be the same issue as bug 887587.

An ASAN build might get closer to the source of the problem.
Depends on: 887587
Summary: Crash [@ gdk_visual_get_blue_pixel_details ] during mozmill testrun → Crash [@ gdk_visual_get_blue_pixel_details ] under PK11PasswordPrompt during mozmill testrun
Karl, do we have ASAN builds for ESR24? I can't find any, neither among the nightly builds nor the tinderbox builds.
Flags: needinfo?(karlt)
We only have 64-bit ASAN builds (because the sanitizing uses address space).
e.g. http://ftp.mozilla.org/pub/mozilla.org/firefox/tinderbox-builds/mozilla-central-linux64-asan/1390876568/

ASAN builds may be newer than ESR24.  I don't know.
Flags: needinfo?(karlt)
We will keep an eye on this crash. If it happens again and more often we might want to find a testcase for it. With it we could check whether this issue is still present on mozilla-central with an ASAN build.
I ran 100 testruns last night and it didn't reproduce.
This crashed again on mac 10.6 (mm-osx-106-4) with beta build 28.0b2#1 gu-IN

>08:54:31 2014-02-11 09:04:52.468 firefox[60437:8b03] invalid pixel format
>08:54:31 2014-02-11 09:04:52.469 firefox[60437:8b03] invalid context
>08:54:31 2014-02-11 09:04:52.469 firefox[60437:8b03] invalid pixel format
>08:54:31 2014-02-11 09:04:52.469 firefox[60437:8b03] invalid context
>08:54:31 TEST-PASS | restartTests/testPreferences_masterPassword/test1.js | testSetMasterPassword
>08:54:31 TEST-START | restartTests/testPreferences_masterPassword/test1.js | teardownModule
>08:54:31 TEST-END | restartTests/testPreferences_masterPassword/test1.js | finished in 5456ms
>08:55:32 RESULTS | Passed: 27
>08:55:32 RESULTS | Failed: 0
>08:55:32 RESULTS | Skipped: 5
>08:55:32 Traceback (most recent call last):
>08:55:32   File "/Users/mozauto/jenkins/workspace/ondemand_functional/mozmill-env-mac/python-lib/mozmill_automation/testrun.py", line 349, in run
>08:55:32     self.run_tests()
>08:55:32   File "/Users/mozauto/jenkins/workspace/ondemand_functional/mozmill-env-mac/python-lib/mozmill_automation/testrun.py", line 573, in run_tests
This is OS X and not Linux, Cosmin. Are you sure that this is the crash? If yes, we need the crash report.
Flags: needinfo?(cosmin.malutan)
I think this is the crash-report Cosmin referenced:
https://crash-stats.mozilla.com/report/index/306b49f6-e725-4cf7-9708-2a8102140204

This isn't the same crash.
Also the versions are wrong: the mentioned crash is on 29.0a1 from the 4th of Feb.

The crash report does reference this bug as possibly related.
I'm leaving the needinfo flag in case Cosmin wants to add any additional info
That's a totally different crash, yes. It has been already fixed by bug 962846, which is even referenced in the crash report. So no need to follow that further.
Flags: needinfo?(cosmin.malutan)
This crashed again on Ubuntu 13.10 x86(mm-ub-1310-32-4)
I will keep the node offline and try to reproduce.


https://crash-stats.mozilla.com/report/index/5576b7d3-7957-4f0b-95ff-cc7702140217
I ran the testrun in a loop 10 times and it did not reproduce.
I ran only testPreferences_masterPassword 2000 times and it still didn't fail.
It failed again on mm-ub-1310-32-4. I ran the master-password tests 20 times, and though the tests passed and I haven't seen a crash, once in test two I got:
>GLib-GObject-CRITICAL **: g_object_ref: assertion 'G_IS_OBJECT (object)' failed
Attached file dump.txt (obsolete) —
I reproduced the crash just by running the testPreferences_masterPassword:
>(process:15079): GLib-CRITICAL **: g_slice_set_config: assertion 'sys_page_size == 0' failed
>TEST-START | test1.js | setupModule
>TEST-START | test1.js | testSetMasterPassword
>TEST-PASS | test1.js | testSetMasterPassword
>TEST-START | test1.js | teardownModule
>TEST-END | test1.js | finished in 1284ms
>
>(process:15079): GLib-CRITICAL **: g_slice_set_config: assertion 'sys_page_size == 0' failed
>TEST-START | test2.js | setupModule
>TEST-START | test2.js | testInvokeMasterPassword
>
>(firefox:15079): GLib-GObject-CRITICAL **: g_object_ref: assertion 'G_IS_OBJECT (object)' failed
>
>(firefox:15079): GLib-GObject-CRITICAL **: g_object_unref: assertion 'G_IS_OBJECT (object)' failed
>
>(firefox:15079): GLib-GObject-CRITICAL **: g_object_ref: assertion 'G_IS_OBJECT (object)' failed
>
>(firefox:15079): GLib-GObject-CRITICAL **: g_object_unref: assertion 'G_IS_OBJECT (object)' failed
>
>(firefox:15079): GLib-GObject-CRITICAL **: g_object_ref: assertion 'G_IS_OBJECT (object)' failed
>PROCESS-CRASH | /home/mozauto/jenkins/workspace/mozilla-esr24_functional/mozmill-tests/firefox/tests/functional/restartTests/testPreferences_masterPassword/test2.js | application crashed [Unknown top frame]
>Crash dump filename: /tmp/tmphrBqYT.mozrunner/minidumps/38337121-b498-be38-10687ad1-5974b840.dmp
>No symbols path given, can't process dump.
>MINIDUMP_STACKWALK not set, can't process dump.
>mozcrash INFO | Saved minidump as /home/mozauto/.mozilla/firefox/Crash Reports/pending/38337121-b498-be38-10687ad1-5974b840.dmp
>mozcrash INFO | Saved app info as /home/mozauto/.mozilla/firefox/Crash Reports/pending/38337121-b498-be38-10687ad1-5974b840.extra
>
>(process:15147): GLib-CRITICAL **: g_slice_set_config: assertion 'sys_page_size == 0' failed
>TEST-START | test3.js | setupModule
>TEST-START | test3.js | testRemoveMasterPassword
>TEST-PASS | test3.js | testRemoveMasterPassword
>TEST-START | test3.js | teardownModule
>TEST-END | test3.js | finished in 599ms
>RESULTS | Passed: 2
>RESULTS | Failed: 1
>RESULTS | Skipped: 0
I attached the extra file.
Cosmin, the extra file doesn't contain any helpful information regarding this crash, so there is no need to attach it. Besides that, how often can you reproduce this crash? If it happens all the time, it would be great to run a test with the debug version of Firefox.
Attached file dump.txt (obsolete) —
I opened the actual dump file with WinDbg, and it looks like the system cannot find a file.
>Unable to load image /usr/lib/i386-linux-gnu/libgdk-x11-2.0.so.0.2400.20
I reproduced the crash once. Right now I'm running with the debug version; I hope I can reproduce it and bring more information.
Thanks
Attachment #8408169 - Attachment is obsolete: true
Not sure why you are trying to analyze a minidump file from Linux with WinDBG. This tool only works for minidumps created on Windows, as the name says. If you can reproduce on Linux, you should rather use gdb.
Prior to this I tried to use gdb but I got:
>"/home/cosmin/Desktop/dump.dmp" is not a core dump: File format not recognized
You should run mozmill with the debugger attached. See the (I think) --debugger option.
Attached file console log crashed (obsolete) —
I couldn't reproduce it with a debug build, but I could with a regular one and with the mozmill --debug option. This is the log; I will also attach one where it passes, so the two can be diffed.
Attached file console log passed (obsolete) —
I tried to find steps to reproduce this but I couldn't.
From the diff between the two logs it looks like it crashes when it handles this master-password prompt modal:
https://hg.mozilla.org/qa/mozmill-tests/file/mozilla-esr24/firefox/tests/functional/restartTests/testPreferences_masterPassword/test2.js#l97
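
For comparing a passing and a crashing console log, it helps to normalize the parts that differ on every run (process IDs, test durations, Jenkins timestamps) before diffing. The following is a minimal sketch using Python's difflib; the patterns are based on the log lines quoted on this bug, and the function names are hypothetical.

```python
import difflib
import re

TIMESTAMP_RE = re.compile(r"^\d{2}:\d{2}:\d{2} ")       # Jenkins prefix, e.g. "03:38:56 "
PID_RE = re.compile(r"\((process|firefox):\d+\)")       # e.g. "(firefox:15079)"
DURATION_RE = re.compile(r"finished in \d+ms")          # e.g. "finished in 1284ms"


def normalize(line):
    """Replace run-specific values so identical steps compare equal."""
    line = TIMESTAMP_RE.sub("", line)
    line = PID_RE.sub(r"(\1:PID)", line)
    line = DURATION_RE.sub("finished in Nms", line)
    return line.rstrip()


def diff_logs(passed_lines, crashed_lines):
    """Return a unified diff (list of lines) between two normalized logs."""
    passed = [normalize(line) for line in passed_lines]
    crashed = [normalize(line) for line in crashed_lines]
    return list(
        difflib.unified_diff(
            passed, crashed, fromfile="passed", tofile="crashed", lineterm="", n=1
        )
    )
```

The first lines that appear only on the "crashed" side then point at the test step that was running when things went wrong.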
Cosmin, I have not said that you should use the --debug option but the --debugger option. Please read my comments carefully enough.
If I run mozmill with both debugger arguments (--debugger=gdb --debugger-args=""):
>mozmill -m mozmill-tests/firefox/tests/functional/restartTests/testPreferences_masterPassword/manifest.ini -b builds/firefox/firefox --debugger=gdb --debugger-args=""
I get:
>/usr/bin/gdb: unrecognized option '-profile'
>Use `/usr/bin/gdb --help' for a complete list of options.
>TEST-UNEXPECTED-FAIL | Disconnect Error: Application unexpectedly closed
>RESULTS | Passed: 0
>RESULTS | Failed: 0
>RESULTS | Skipped: 0
>Traceback (most recent call last):

If I run mozmill with only (--debugger=gdb):
>mozmill -m mozmill-tests/firefox/tests/functional/restartTests/testPreferences_masterPassword/manifest.ini -b builds/firefox/firefox --debugger=gdb
I get:
>Reading symbols from /home/mozauto/jenkins/workspace/mozilla-esr24_functional/builds/firefox/firefox...(no debugging symbols found)...done.
>(gdb)
Well, read about the usage of gdb and start the testrun via 'r' (run). See the manual for further commands.
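
For reference, the "unrecognized option '-profile'" message above is what gdb prints when it tries to parse an application option itself. One possible cause (an assumption, not confirmed on this bug) is that the empty --debugger-args replaced a default `--args`. The sketch below builds a gdb command line by hand, outside of mozmill, that keeps the application arguments behind `--args` and optionally runs non-interactively; `gdb_command` is a hypothetical helper name.

```python
def gdb_command(binary, app_args, gdb="gdb", batch=True):
    """Build an argv list that starts `binary` with `app_args` under gdb."""
    cmd = [gdb]
    if batch:
        # -batch exits once the -ex commands are done: "run" starts the
        # program and "bt" prints a backtrace after it stops (e.g. on SIGSEGV).
        cmd += ["-batch", "-ex", "run", "-ex", "bt"]
    # Everything after --args is the program and its own arguments, so gdb
    # does not try to interpret options such as -profile.
    cmd += ["--args", binary] + list(app_args)
    return cmd
```

With `batch=False` the result drops you at the (gdb) prompt as in the output above, where `r` starts the run, `bt` prints the stack after a crash and `c` continues.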
Attached file crash_log (obsolete) —
I entered r to start the run, but then the application wouldn't close at the end.
In any case, mozcrash cannot process the dump because we have neither MINIDUMP_STACKWALK exported nor symbols available.

I took the stackwalk application and set it up as explained in:
https://developer.mozilla.org/en/docs/Mochitest#stacks

For symbols I used http://hg.mozilla.org/users/jwatt_jwatt.org/fetch-symbols/file/6f7ab0270fc6/fetch-symbols.py to download the symbols, then I hardcoded it in:
https://github.com/mozilla/gecko-dev/blob/master/testing/mozbase/mozcrash/mozcrash/mozcrash.py#L73

Doing so I've got this trace, I hope it helps, I don't know what further investigation I can do here.
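
As an alternative to hardcoding the symbols path in mozcrash.py, the two prerequisites from the earlier log ("No symbols path given" and "MINIDUMP_STACKWALK not set") can be checked up front and the stackwalk tool invoked directly. This is a sketch under the assumption that the binary accepts the minidump file followed by a symbols directory, as Breakpad's minidump_stackwalk does; `stackwalk_command` is a hypothetical helper, not part of mozcrash.

```python
import os


def stackwalk_command(dump_path, symbols_dir, env=None):
    """Return the argv for processing `dump_path`, or raise if setup is incomplete."""
    env = os.environ if env is None else env
    binary = env.get("MINIDUMP_STACKWALK")
    if not binary:
        raise RuntimeError("MINIDUMP_STACKWALK not set, can't process dump.")
    if not symbols_dir:
        raise RuntimeError("No symbols path given, can't process dump.")
    # Assumed invocation: <stackwalk binary> <minidump file> <symbols directory>
    return [binary, dump_path, symbols_dir]
```

The returned list can be passed to `subprocess.run` to get the symbolicated stack on stdout.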
Attachment #8408184 - Attachment is obsolete: true
Attachment #8408239 - Attachment is obsolete: true
Attachment #8408240 - Attachment is obsolete: true
You are not running Mozmill under gdb. In those cases no minidump will be created, but the debugger will stop execution. As said earlier please use gdb, read its documentation, or ask.
I ran the tests with --debugger=gdb; afterwards I had to enter "r" for the tests to start, but after the first test ends, Firefox restarts and the second test does not start. I then put all the test functions in a single file, but if it's not a restart test the crash doesn't reproduce.

Henrik, can you please advise what to do next here?
Looks like gdb isn't reattached to the process after a restart. You might wanna check quickly if that is a Mozmill or a general issue. If it's only for Mozmill we should get this fixed.

For now I would propose you prepare the profile until the step before the conflicting restart happens. Then you can run Mozmill against this single test module only with the state of the profile before the restart. I'm fairly sure this should reproduce the problem and we can catch the crash.
Attached file crash_log
I prepared the profile as instructed then I ran only the second test.
When it failed I gave the "bt" (print a stack trace) command and hit Enter until it finished printing the trace. Then I sent the "c" (continue) command, which unfroze FF and terminated the process.

Attached is the full log.
Attachment #8410227 - Attachment is obsolete: true
Is this really a debug version of Firefox as requested in comment 18 on this bug? It completely misses symbols.
It hasn't failed so far with the debug version; I will keep trying.
I ran the tests for 250 times with the debug build and it didn't fail.
(In reply to Cosmin Malutan from comment #36)
> I ran the tests for 250 times with the debug build and it didn't fail.

Sad to see. :/ So no detailed information for us here.

(In reply to Cosmin Malutan from comment #33)
> Created attachment 8410947 [details]

Karl, is there anything in here which could be helpful to see what the problem is?

Cosmin, can you create a minimized profile containing just the data necessary to crash Firefox during the next start? That would be very helpful to have. Please attach a zip archive here.
Flags: needinfo?(karlt)
It's looking like the same issue as bug 887587 and bug 879373.

I can't work out what has gone wrong from the stack because I suspect the error has happened before that point.

Bug 887587 comment 6 is our best lead so far, but this will be hard to track down.
Flags: needinfo?(karlt)
Attached file prf.zip
To reproduce this unzip this profile, and from a mozmill-environment run:
>mozmill --profile=prf  -t mozmill-tests/firefox/tests/functional/restartTests/testPreferences_masterPassword/test2.js -b <path to a esr24 build> --debugger=gdb
Then hit r. It will fail once in 20 runs.
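
To judge how many runs are needed before "did not reproduce" means anything, the quoted rate of one crash in 20 runs can be turned into a rough estimate. The sketch below assumes each run is independent and crashes with a fixed probability, which may well not hold for this bug; the function names are illustrative only.

```python
import math


def chance_of_at_least_one_crash(p, runs):
    """Probability of seeing one or more crashes in `runs` independent runs."""
    return 1.0 - (1.0 - p) ** runs


def runs_needed(p, confidence):
    """Smallest number of runs that shows a crash with at least `confidence`."""
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - p))
```

Under these assumptions, with p = 1/20, a batch of 20 runs shows the crash only about 64% of the time, and roughly 90 runs are needed to see it with 99% probability.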
The packaged profile doesn't seem to contain all the necessary data. setupModule in the test2.js file fails for me. I think it would be better if you could strip the profile even more, AND include a small testcase so we can strip the dependency on mozmill-tests completely.
Flags: needinfo?(cosmin.malutan)
Cosmin, any update here for the last request?
The profile works for me, it might not work if you ran it on another build than ESR24.
I haven't had time to make a testcase that doesn't depend on the libs; I will try to make one today. I ran this with ESR24.30 and ESR24.40 en-US but it didn't crash again.
Flags: needinfo?(cosmin.malutan)
We've had another crash, again ESR24, linux, from the crash log it looks identical:
https://crash-stats.mozilla.com/report/index/96b6b382-b560-4523-94ca-ae9eb2140730

Cosmin, you said you had some updates here?
Flags: needinfo?(cosmin.malutan)
I ran this 50 times today and it didn't reproduce. The command to reproduce this is the one I gave in comment 39; please make sure you use an esr24 build, which is the only one affected. I added the flag for 28 by mistake (comment 11).
So considering this happens rarely and only affects esr24 builds, I would say we wait for ESR31 to replace 24 and don't spend any more time here.
Flags: needinfo?(cosmin.malutan)
We lived with this crash for a long time, and putting any work into such a rarely happening bug doesn't seem to be wise. So I agree and also tend to say we should close the bug as wontfix for esr24. On all other current branches it works fine, and no crash is happening.

Silvestre, what do you think?
Flags: needinfo?(sledru)
That looks like the right choice to me.
ESR 31.0.0 is live now.
Flags: needinfo?(sledru)
Given that this crash is on esr24 only, we can mark the whole bug as wontfix.
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → WONTFIX
Version: Trunk → 24 Branch