Hit MOZ_CRASH(assertion failed: y2 > 1. / 12. && y2 <= 1.) at gfx/qcms/src/iccread.rs:1392
Categories
(Core :: Graphics: ImageLib, defect)
Tracking
()
Release | Tracking | Status |
---|---|---|
firefox-esr78 | --- | unaffected |
firefox-esr91 | --- | unaffected |
firefox92 | --- | disabled |
firefox93 | --- | fixed |
firefox94 | --- | verified |
People
(Reporter: tsmith, Assigned: jbauman)
References
(Regression)
Details
(4 keywords, Whiteboard: [bugmon:bisected,confirmed])
Attachments
(2 files)
14.62 KB, application/octet-stream
48 bytes, text/x-phabricator-request (pascalc: approval-mozilla-beta+)
Found while fuzzing m-c 20210907-eac402936496 (--enable-debug --enable-fuzzing)
Hit MOZ_CRASH(assertion failed: y2 > 1. / 12. && y2 <= 1.) at gfx/qcms/src/iccread.rs:1392
#0 0x7f3c91f6a0e5 in MOZ_Crash /builds/worker/workspace/obj-build/dist/include/mozilla/Assertions.h:256:3
#1 0x7f3c91f6a0e5 in RustMozCrash src/mozglue/static/rust/wrappers.cpp:18:3
#2 0x7f3c91f6a064 in mozglue_static::panic_hook::h63b3c2e6144e67e9 src/mozglue/static/rust/lib.rs:91:9
#3 0x7f3c91f69adb in core::ops::function::Fn::call::h0d4763c52fdc30fd /rustc/a178d0322ce20e33eac124758e837cbd80a6f633/library/core/src/ops/function.rs:70:5
#4 0x7f3c92d35218 in std::panicking::rust_panic_with_hook::h7ee9e1a2d0f8975a /rustc/a178d0322ce20e33eac124758e837cbd80a6f633/library/std/src/panicking.rs:626:17
#5 0x7f3c92d34c96 in std::panicking::begin_panic_handler::_$u7b$$u7b$closure$u7d$$u7d$::h8ab3b4491718b2c7 /rustc/a178d0322ce20e33eac124758e837cbd80a6f633/library/std/src/panicking.rs:517:13
#6 0x7f3c92d3103b in std::sys_common::backtrace::__rust_end_short_backtrace::hd489062ffa586a9f /rustc/a178d0322ce20e33eac124758e837cbd80a6f633/library/std/src/sys_common/backtrace.rs:141:18
#7 0x7f3c92d34c28 in rust_begin_unwind /rustc/a178d0322ce20e33eac124758e837cbd80a6f633/library/std/src/panicking.rs:515:5
#8 0x7f3c88f10070 in core::panicking::panic_fmt::hca6330e3e14086b4 /rustc/a178d0322ce20e33eac124758e837cbd80a6f633/library/core/src/panicking.rs:92:14
#9 0x7f3c88f0ffbc in core::panicking::panic::h1a48d878ff3dcd40 /rustc/a178d0322ce20e33eac124758e837cbd80a6f633/library/core/src/panicking.rs:50:5
#10 0x7f3c91010d6f in _$LT$qcms..iccread..curveType$u20$as$u20$core..convert..From$LT$qcms..iccread..TransferCharacteristics$GT$$GT$::from::_$u7b$$u7b$closure$u7d$$u7d$::ha817a1d3ccc3bdfb src/gfx/qcms/src/iccread.rs:1392:25
#11 0x7f3c91010d6f in qcms::iccread::build_trc_table::he6f51c21bd9a977f src/gfx/qcms/src/iccread.rs:920:22
#12 0x7f3c91010d6f in _$LT$qcms..iccread..curveType$u20$as$u20$core..convert..From$LT$qcms..iccread..TransferCharacteristics$GT$$GT$::from::hff68029ca57bb146 src/gfx/qcms/src/iccread.rs:1385:29
#13 0x7f3c910117ea in qcms::iccread::Profile::new_cicp::h90bf7e8ae50a442d src/gfx/qcms/src/iccread.rs:1545:21
#14 0x7f3c91003f59 in qcms_profile_create_cicp src/gfx/qcms/src/c_bindings.rs:71:5
#15 0x7f3c8aba1959 in mozilla::image::nsAVIFDecoder::Decode(mozilla::image::SourceBufferIterator&, mozilla::image::IResumable*) src/image/decoders/nsAVIFDecoder.cpp:1433:20
#16 0x7f3c8aba0058 in mozilla::image::nsAVIFDecoder::DoDecode(mozilla::image::SourceBufferIterator&, mozilla::image::IResumable*) src/image/decoders/nsAVIFDecoder.cpp:1144:25
#17 0x7f3c8aade2e7 in mozilla::image::Decoder::Decode(mozilla::image::IResumable*) src/image/Decoder.cpp:177:19
#18 0x7f3c8aae6bad in mozilla::image::DecodedSurfaceProvider::Run() src/image/DecodedSurfaceProvider.cpp:123:34
#19 0x7f3c8ab01533 in mozilla::image::DecodingTask::Run() src/image/DecodePool.cpp:146:12
#20 0x7f3c89100ddd in mozilla::TaskController::RunPoolThread() src/xpcom/threads/TaskController.cpp:287:33
#21 0x7f3c9f743957 in _pt_root src/nsprpub/pr/src/pthreads/ptthread.c:201:5
#22 0x7f3ca04bf608 in start_thread /build/glibc-eX1tMB/glibc-2.31/nptl/pthread_create.c:477:8
#23 0x7f3ca0087292 in clone /build/glibc-eX1tMB/glibc-2.31/misc/../sysdeps/unix/sysv/linux/x86_64/clone.S:95
Reporter | ||
Comment 2•4 years ago
|
||
A Pernosco session is available here: https://pernos.co/debug/uLyo26cfwLdJMr4q5BkJNw/index.html
Comment 3•4 years ago
|
||
Bugmon Analysis
Verified bug as reproducible on mozilla-central 20210908032417-a4d2ca53b2a4.
The bug appears to have been introduced in the following build range:
Start: 8eb9e75580b68837ee7c91d1c03c218521f51b34 (20210805151303)
End: 4aa3a54f7d7202bf6868a76b10e4669b56ed3c35 (20210805165711)
Pushlog: https://hg.mozilla.org/integration/autoland/pushloghtml?fromchange=8eb9e75580b68837ee7c91d1c03c218521f51b34&tochange=4aa3a54f7d7202bf6868a76b10e4669b56ed3c35
Assignee | ||
Comment 4•4 years ago
|
||
Ok, I see what's going on. Will have a patch up shortly.
Assignee | ||
Comment 5•4 years ago
|
||
The assertion is due to an inappropriate test of exact floating-point values.
build_trc_table() handles this saturating case, so there's no need to assert.
Add more test coverage to be certain no fuzzing inputs will lead to crashes.
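To illustrate why an exact floating-point range check is fragile here, the following is a minimal sketch, not the actual qcms code. The names `hlg_inverse_oetf`, `saturate_to_u16` and `build_table` are hypothetical, the constants are the HLG values from ITU-R BT.2100, and it assumes that "saturating" means clamping to [0, 1] before quantizing to 16 bits.

```rust
// Hypothetical sketch; names and table layout are illustrative, not the qcms API.

// HLG constants from ITU-R BT.2100.
const A: f64 = 0.17883277;
const B: f64 = 0.28466892; // 1 - 4a
const C: f64 = 0.55991073; // 0.5 - a * ln(4a)

/// Inverse HLG OETF: maps a non-linear signal value in [0, 1] to linear light.
/// Mathematically the upper branch lies in (1/12, 1], but rounding can leave
/// the computed value a hair outside that interval, so no assert is made.
fn hlg_inverse_oetf(v: f64) -> f64 {
    if v <= 0.5 {
        v * v / 3.0
    } else {
        (((v - C) / A).exp() + B) / 12.0
    }
}

/// Saturating conversion of a value nominally in [0, 1] to a 16-bit entry.
fn saturate_to_u16(y: f64) -> u16 {
    (y.clamp(0.0, 1.0) * 65535.0).round() as u16
}

/// Samples `f` at `n` evenly spaced points in [0, 1] and quantizes each sample.
/// `n` is chosen by the caller, not derived from untrusted input.
fn build_table(n: usize, f: impl Fn(f64) -> f64) -> Vec<u16> {
    assert!(n >= 2, "need at least two table entries");
    (0..n)
        .map(|i| saturate_to_u16(f(i as f64 / (n - 1) as f64)))
        .collect()
}
```

With this shape, a value such as 1.0000001 coming out of the curve is simply clamped when the table is built, so nothing upstream needs to assert on the exact range.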
Comment 6•4 years ago
|
||
:jbauman, since this bug contains a bisection range, could you fill (if possible) the regressed_by field?
For more information, please visit auto_nag documentation.
Assignee | ||
Comment 9•4 years ago
|
||
Ok, I think I see the issue.
It's frustrating that mach try auto didn't catch this: https://treeherder.mozilla.org/jobs?repo=try&revision=f668f74164d6cc6be378bdf18e6756b5b4d5132d&selectedTaskRun=PrMv6LymQTKyvw6VU_6ZoA.0
Though my other try chooser run did hit it and I failed to catch it. Should be a straightforward fix.
Assignee | ||
Comment 10•4 years ago
|
||
Fixed with new try runs that look good:
- https://treeherder.mozilla.org/jobs?repo=try&revision=4674388f4f4a366c42aab3bc2d2349daeae0d93b
- https://treeherder.mozilla.org/jobs?repo=try&revision=aae073d5f49e18f12b598734985e284574d2b239
Relanding queued
Comment 13•4 years ago
|
||
Backed out changeset bd23fb0c95cc (Bug 1729539) for causing web platform failures.
Backout link
Push with failures WdH1
Failure Log
Assignee | ||
Comment 14•4 years ago
|
||
I don't see how this error is related to https://hg.mozilla.org/integration/autoland/rev/bd23fb0c95cc, can you help?
Comment 15•4 years ago
|
||
Hi, according to the backfills the issue started from there: https://treeherder.mozilla.org/jobs?repo=autoland&group_state=expanded&selectedTaskRun=QtASV-GgT3KmuXdM6vohzQ.0&fromchange=23bc48417fa101b4d8e85f36fe294339e3eb53b4&tochange=91dd7cd03ca3a6ca1b0d0fa79563bfb9145111ad&searchStr=os%2Cx%2C10.15%2Cwebrender%2Copt%2Cweb%2Cplatform%2Ctests%2Ctest-macosx1015-64-qr%2Fopt-web-platform-tests-wdspec-headless-e10s%2Cwdh1
Assignee | ||
Comment 16•4 years ago
|
||
Based on multiple green runs of WdH1 in my pre-land try, I don't think this failure is being caused by bd23fb0c95cc.
The commit before mine has a lot of errors too, but doesn't look like it has the process exiting which is causing my commit to be flagged:
- Task before mine with test job https://treeherder.mozilla.org/jobs?repo=autoland&group_state=expanded&selectedTaskRun=Ww-Ep0n4T_qbAmIRb3PQ1Q.0&fromchange=52d6f0947939a6dfa3ed602f62d5fb72b525c432&tochange=91dd7cd03ca3a6ca1b0d0fa79563bfb9145111ad&searchStr=os%2Cx%2C10.15%2Cwebrender%2Copt%2Cweb%2Cplatform%2Ctests%2Ctest-macosx1015-64-qr%2Fopt-web-platform-tests-wdspec-headless-e10s%2Cwdh1
- Log of that job: https://treeherder.mozilla.org/logviewer?job_id=351587719&repo=autoland&lineNumber=8915
Additionally the changes in that revision seem far more related to the issue being seen. Given all that, I'd like to re-land unless you have objections.
Assignee | ||
Comment 17•4 years ago
|
||
Bug 1730234 also seems like it might be relevant, and predates my landing by several days.
Comment 18•4 years ago
|
||
Henrik, do you have any insight why the changes in this bug started a permanent wdspec failure?
Jon, could you reland after 12pm PDT - this will prevent potential conflicts with the next merge candidate for Nightly. Thank you.
Assignee | ||
Comment 19•4 years ago
|
||
Jon, could you reland after 12pm PDT - this will prevent potential conflicts with the next merge candidate for Nightly. Thank you.
Will do! Thanks for the guidance
Comment 20•4 years ago
|
||
I had a look and the problem here is actually a different failure. The ones mentioned above are just log output, and are expected due to an expected failing test.
So here are the actual failing lines:
https://treeherder.mozilla.org/logviewer?job_id=351587981&repo=autoland&lineNumber=39967-39978
Something is forcing Firefox to shut down after the initial browser window (toplevel-window-ready notification) has been opened. We never actually reach the marionette-startup-requested notification, which is sent out by Firefox when all windows have had their session fully restored.
Comment 22•4 years ago
|
||
Backed out for causing Wd failures.
Backout link: https://hg.mozilla.org/integration/autoland/rev/efe2adb390631eea3c4d0f6b8a3f5c7b3a0041bd
Failure log: https://treeherder.mozilla.org/logviewer?job_id=351687017&repo=autoland&lineNumber=7415
Assignee | ||
Comment 23•4 years ago
|
||
I don't think this is likely to be related to this change. See the WdH2 green in my pre-landing try. Is there something that implicates this particular revision? It wasn't the responsible revision for the last backout.
Comment 24•4 years ago
|
||
The Try push uses a base from Sept 8; might anything have regressed since then? Comment 20 might indicate a crash.
Is there something that implicates this particular revision? It wasn't the responsible revision for the last backout.
I partially fail to parse the question. Each time, it was https://phabricator.services.mozilla.com/D125006 that got backed out. Both WdH1 and WdH2 failed on macOS. The failing tasks didn't run initially for the pushes of bug 1729539; they only run for every 10th push. The code sheriffs added them to the previous pushes to identify with which push the failures started. The revision of that 10th push gets appended to the task name to indicate the tasks contain the same set of tests.
Assignee | ||
Comment 25•4 years ago
|
||
After running a lot of try runs, I'm becoming more and more convinced that while this change triggers a fault, it's not due to a fault in this code. Instead, I think that it's perturbing something that is making an otherwise rare or impossible fault condition elsewhere consistent. Unfortunately, I don't understand the failing test well enough to investigate effectively, so I'm at a bit of a loss for what to do. I'm going to continue breaking down my change and tweaking it to see if I can get it into a form which doesn't trigger the fault anymore, but I am concerned the real issue here is not getting investigated.
The change in D125006 is pretty small to begin with, but slicing it into orthogonal parts as finely as will compile/function, I've got (working all from the same base revision) this sequence of changes gradually building up to the original failing code:
1. No changes: pass
2. Remove just the assert line: pass
3. Fully remove the assert (eliminating an unnecessary variable binding): pass
4. derive Debug for qcms_CIE_xyY, qcms_CIE_xyYTRIPLE: pass
5. Change Try to TryFrom<ColourPrimaries> for qcms_CIE_xyYTRIPLE: pass
6. Change white_point to return a Result: fail
7. Change Try to TryFrom<TransferCharacteristics> for curveType: fail
8. Update rust unit tests: fail
Based on that, it seems like the problem is step 6, the addition of the white_point change. But I tried another run just removing that code and leaving the rest (note that all 8 steps are pretty orthogonal):
and continuing to remove other code that was failing:
I've run all the jobs at least twice and results are consistent. But I can't see any reason for the behavior. I'll keep trying some things, but so far my best guess is that this is causing a very specific perturbation in unrelated code. Note that I also can't reproduce this error locally despite being on a macOS system (though I'm using the 10.14 SDK, so maybe it's worth changing to 10.15 to be more like the try runners).
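For readers unfamiliar with the refactoring in steps 5 to 7, the sketch below shows the general shape of moving a conversion that can fail onto `TryFrom` and a `Result`-returning method. It is an illustration under assumptions, not the code from D125006: `Xy`, `XyTriple`, `Primaries` and `Unsupported` are hypothetical stand-ins for the qcms types, and only the BT.709 primaries and D65 white point values are real.

```rust
use std::convert::TryFrom;

/// CIE xy chromaticity coordinate (hypothetical stand-in for the qcms type).
#[derive(Debug, Clone, Copy, PartialEq)]
struct Xy {
    x: f64,
    y: f64,
}

/// Red, green and blue primaries (hypothetical stand-in).
#[derive(Debug, Clone, Copy, PartialEq)]
struct XyTriple {
    red: Xy,
    green: Xy,
    blue: Xy,
}

/// A tiny, hypothetical subset of colour primaries code points.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Primaries {
    Bt709,
    Unspecified,
}

/// Error returned instead of panicking on an unsupported value.
#[derive(Debug, PartialEq)]
struct Unsupported(Primaries);

impl Primaries {
    /// Returns the white point, or an error for values with no defined one.
    fn white_point(self) -> Result<Xy, Unsupported> {
        match self {
            // D65
            Primaries::Bt709 => Ok(Xy { x: 0.3127, y: 0.3290 }),
            Primaries::Unspecified => Err(Unsupported(self)),
        }
    }
}

impl TryFrom<Primaries> for XyTriple {
    type Error = Unsupported;

    /// Fallible conversion: the caller decides how to handle bad input.
    fn try_from(p: Primaries) -> Result<Self, Self::Error> {
        match p {
            Primaries::Bt709 => Ok(XyTriple {
                red: Xy { x: 0.64, y: 0.33 },
                green: Xy { x: 0.30, y: 0.60 },
                blue: Xy { x: 0.15, y: 0.06 },
            }),
            Primaries::Unspecified => Err(Unsupported(p)),
        }
    }
}
```

A caller can then propagate the error with `?` instead of hitting a panic on fuzzed input, for example `let triple = XyTriple::try_from(primaries)?;`.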
Comment 26•4 years ago
|
||
(In reply to Jon Bauman [:jbauman:] from comment #25)
What I can see so far it's only Wd2 in headless mode which is permanently failing on MacOS 10.15 (not 10.14 as you said above). I tried yesterday to reproduce it locally too, but it's not failing on MacOS 11.5.2. Maybe it's something specific for 10.15?
Nevertheless, I used the above try build to trigger some more jobs for that particular platform, just to see if other Wd2 jobs are failing too. The variations that I used are Fission and non-headless. As can be seen, only the headless tests are failing.
Given that your changes are graphics related I wonder if we could run a try build with some GFX logging via MOZ_LOG enabled? But not sure which type of logs the graphics component supports, and if that also includes Rust components.
Let's have a look at the differences...
Here are the logs from a working test:
[task 2021-09-16T19:07:55.558Z] 19:07:55 INFO - PID 1899 | 1631819275557 Marionette INFO Marionette enabled
[task 2021-09-16T19:07:55.611Z] 19:07:55 INFO - PID 1899 | 1631819275609 Marionette TRACE Received observer notification toplevel-window-ready
[task 2021-09-16T19:07:55.616Z] 19:07:55 INFO - PID 1899 | 1631819275615 geckodriver::marionette TRACE Connection refused (os error 61). Retrying in 100ms
[task 2021-09-16T19:07:55.624Z] 19:07:55 INFO - PID 1899 | DEBUG: Adding blocker ClientManagerService: start destroying IPC actors early for phase xpcom-will-shutdown
[task 2021-09-16T19:07:55.633Z] 19:07:55 INFO - PID 1899 | DEBUG: Adding blocker Flush WebExtension StartupCache for phase IOUtils: waiting for profileBeforeChange IO to complete
[task 2021-09-16T19:07:55.672Z] 19:07:55 INFO - PID 1899 | DEBUG: Adding blocker JSON store: writing data for phase IOUtils: waiting for profileBeforeChange IO to complete
[task 2021-09-16T19:07:55.706Z] 19:07:55 INFO - PID 1899 | DEBUG: Adding blocker MediaShutdownManager: shutdown for phase profile-before-change
[task 2021-09-16T19:07:55.718Z] 19:07:55 INFO - PID 1899 | DEBUG: Adding blocker JSON store: writing data for phase IOUtils: waiting for profileBeforeChange IO to complete
[task 2021-09-16T19:07:55.744Z] 19:07:55 INFO - PID 1899 | DEBUG: Adding blocker ServiceWorkerShutdownBlocker: shutting down Service Workers for phase profile-change-teardown
[task 2021-09-16T19:07:55.799Z] 19:07:55 INFO - PID 1899 | 1631819275798 geckodriver::marionette TRACE Connection refused (os error 61). Retrying in 100ms
[task 2021-09-16T19:07:55.868Z] 19:07:55 INFO - PID 1899 | DEBUG: Adding blocker GMPProvider for phase AddonManager: Waiting for providers to shut down.
[task 2021-09-16T19:07:55.873Z] 19:07:55 INFO - PID 1899 | DEBUG: Adding blocker ContentParent: id=142b83800 for phase xpcom-will-shutdown
[task 2021-09-16T19:07:55.874Z] 19:07:55 INFO - PID 1899 | DEBUG: Adding blocker ContentParent: id=142b83800 for phase profile-before-change
[task 2021-09-16T19:07:55.985Z] 19:07:55 INFO - PID 1899 | [GFX1-]: RenderCompositorSWGL failed mapping default framebuffer, no dt
[task 2021-09-16T19:07:56.032Z] 19:07:56 INFO - PID 1899 | DEBUG: Adding blocker PageActions: purging unregistered actions from cache for phase profile-before-change
[task 2021-09-16T19:07:56.038Z] 19:07:56 INFO - PID 1899 | DEBUG: Adding blocker Places Clients shutdown for phase profile-change-teardown
[task 2021-09-16T19:07:56.039Z] 19:07:56 INFO - PID 1899 | DEBUG: Adding blocker Places Connection shutdown for phase profile-before-change
[task 2021-09-16T19:07:56.040Z] 19:07:56 INFO - PID 1899 | 1631819276039 geckodriver::marionette TRACE Connection refused (os error 61). Retrying in 100ms
[task 2021-09-16T19:07:56.047Z] 19:07:56 INFO - PID 1899 | DEBUG: Adding blocker Remote Settings profile-before-change for phase profile-before-change
[task 2021-09-16T19:07:56.055Z] 19:07:56 INFO - PID 1899 | DEBUG: Adding blocker DoHController: clear state and remove observers for phase profile-before-change
[task 2021-09-16T19:07:56.207Z] 19:07:56 INFO - PID 1899 | 1631819276206 geckodriver::marionette TRACE Connection refused (os error 61). Retrying in 100ms
[task 2021-09-16T19:07:56.286Z] 19:07:56 INFO - PID 1899 | DEBUG: Adding blocker Sqlite.jsm shutdown blocker for phase profile-before-change
[task 2021-09-16T19:07:56.287Z] 19:07:56 INFO - PID 1899 | DEBUG: Adding blocker content-prefs.sqlite#0: waiting for shutdown for phase Sqlite.jsm: wait until all connections are closed
[task 2021-09-16T19:07:56.288Z] 19:07:56 INFO - PID 1899 | DEBUG: Adding blocker Closing ContentPrefService2 connection. for phase Sqlite.jsm: wait until all clients have completed their task
[task 2021-09-16T19:07:56.289Z] 19:07:56 INFO - PID 1899 | DEBUG: Adding blocker Transaction (0) for phase content-prefs.sqlite#0: waiting for clients
[task 2021-09-16T19:07:56.290Z] 19:07:56 INFO - PID 1899 | DEBUG: Adding blocker Transaction (1) for phase content-prefs.sqlite#0: waiting for clients
[task 2021-09-16T19:07:56.290Z] 19:07:56 INFO - PID 1899 | DEBUG: Completed blocker Transaction (0) for phase content-prefs.sqlite#0: waiting for clients
[task 2021-09-16T19:07:56.291Z] 19:07:56 INFO - PID 1899 | DEBUG: Completed blocker Transaction (1) for phase content-prefs.sqlite#0: waiting for clients
[task 2021-09-16T19:07:56.308Z] 19:07:56 INFO - PID 1899 | 1631819276307 geckodriver::marionette TRACE Connection refused (os error 61). Retrying in 100ms
[task 2021-09-16T19:07:56.315Z] 19:07:56 INFO - PID 1899 | DEBUG: Adding blocker sanitize.js: Sanitize on shutdown for phase Places Clients shutdown
[task 2021-09-16T19:07:56.381Z] 19:07:56 INFO - PID 1899 | DEBUG: Adding blocker Search service: shutting down for phase IOUtils: waiting for profileBeforeChange IO to complete
[task 2021-09-16T19:07:56.420Z] 19:07:56 INFO - PID 1899 | DEBUG: Adding blocker places.sqlite#1: waiting for shutdown for phase Sqlite.jsm: wait until all connections are closed
[task 2021-09-16T19:07:56.421Z] 19:07:56 INFO - PID 1899 | DEBUG: Adding blocker Places Expiration: shutdown for phase Places Connection shutdown
[task 2021-09-16T19:07:56.422Z] 19:07:56 INFO - PID 1899 | DEBUG: Adding blocker PlacesUtils wrapped connection closing as part of Places shutdown for phase Places Connection shutdown
[task 2021-09-16T19:07:56.422Z] 19:07:56 INFO - PID 1899 | DEBUG: Adding blocker PlacesUtils wrapped connection must be closed before Sqlite.jsm for phase Sqlite.jsm: wait until all clients have completed their task
[task 2021-09-16T19:07:56.423Z] 19:07:56 INFO - PID 1899 | DEBUG: Adding blocker places.sqlite#1: PlacesExpiration.jsm: setup (0) for phase places.sqlite#1: waiting for clients
[task 2021-09-16T19:07:56.424Z] 19:07:56 INFO - PID 1899 | DEBUG: Adding blocker places.sqlite#0: waiting for shutdown for phase Sqlite.jsm: wait until all connections are closed
[task 2021-09-16T19:07:56.425Z] 19:07:56 INFO - PID 1899 | DEBUG: Adding blocker PlacesUtils read-only connection closing as part of Places shutdown for phase Places Connection shutdown
[task 2021-09-16T19:07:56.426Z] 19:07:56 INFO - PID 1899 | DEBUG: Adding blocker PlacesUtils read-only connection must be closed before Sqlite.jsm for phase Sqlite.jsm: wait until all clients have completed their task
[task 2021-09-16T19:07:56.466Z] 19:07:56 INFO - PID 1899 | DEBUG: Adding blocker ContentParent: id=14f4d9800 for phase xpcom-will-shutdown
[task 2021-09-16T19:07:56.466Z] 19:07:56 INFO - PID 1899 | DEBUG: Adding blocker ContentParent: id=14f4d9800 for phase profile-before-change
[task 2021-09-16T19:07:56.471Z] 19:07:56 INFO - PID 1899 | DEBUG: Completed blocker places.sqlite#1: PlacesExpiration.jsm: setup (0) for phase places.sqlite#1: waiting for clients
[task 2021-09-16T19:07:56.499Z] 19:07:56 INFO - PID 1899 | DEBUG: Adding blocker JSON store: writing data for phase IOUtils: waiting for profileBeforeChange IO to complete
[task 2021-09-16T19:07:56.548Z] 19:07:56 INFO - PID 1899 | 1631819276547 geckodriver::marionette TRACE Connection refused (os error 61). Retrying in 100ms
[task 2021-09-16T19:07:56.689Z] 19:07:56 INFO - PID 1899 | DEBUG: Adding blocker JSON store: writing data for phase IOUtils: waiting for profileBeforeChange IO to complete
[task 2021-09-16T19:07:56.796Z] 19:07:56 INFO - PID 1899 | 1631819276795 geckodriver::marionette TRACE Connection refused (os error 61). Retrying in 100ms
[task 2021-09-16T19:07:56.871Z] 19:07:56 INFO - PID 1899 | 1631819276870 Marionette TRACE Received observer notification marionette-startup-requested
And here are the logs from a failing test where Firefox seems to shut down during startup:
[task 2021-09-16T19:07:54.676Z] 19:07:54 INFO - PID 1899 | 1631819274676 Marionette INFO Marionette enabled
[task 2021-09-16T19:07:54.696Z] 19:07:54 INFO - PID 1899 | 1631819274695 geckodriver::marionette TRACE Connection refused (os error 61). Retrying in 100ms
[task 2021-09-16T19:07:54.731Z] 19:07:54 INFO - PID 1899 | 1631819274730 Marionette TRACE Received observer notification toplevel-window-ready
[task 2021-09-16T19:07:54.744Z] 19:07:54 INFO - PID 1899 | DEBUG: Adding blocker ClientManagerService: start destroying IPC actors early for phase xpcom-will-shutdown
[task 2021-09-16T19:07:54.752Z] 19:07:54 INFO - PID 1899 | DEBUG: Adding blocker Flush WebExtension StartupCache for phase IOUtils: waiting for profileBeforeChange IO to complete
[task 2021-09-16T19:07:54.791Z] 19:07:54 INFO - PID 1899 | DEBUG: Adding blocker JSON store: writing data for phase IOUtils: waiting for profileBeforeChange IO to complete
[task 2021-09-16T19:07:54.826Z] 19:07:54 INFO - PID 1899 | DEBUG: Adding blocker MediaShutdownManager: shutdown for phase profile-before-change
[task 2021-09-16T19:07:54.838Z] 19:07:54 INFO - PID 1899 | DEBUG: Adding blocker JSON store: writing data for phase IOUtils: waiting for profileBeforeChange IO to complete
[task 2021-09-16T19:07:54.865Z] 19:07:54 INFO - PID 1899 | DEBUG: Adding blocker ServiceWorkerShutdownBlocker: shutting down Service Workers for phase profile-change-teardown
[task 2021-09-16T19:07:54.940Z] 19:07:54 INFO - PID 1899 | 1631819274939 geckodriver::marionette TRACE Connection refused (os error 61). Retrying in 100ms
[task 2021-09-16T19:07:55.166Z] 19:07:55 INFO - PID 1899 | 1631819275164 geckodriver::browser DEBUG Browser process stopped: exit status: 1
[task 2021-09-16T19:07:55.169Z] 19:07:55 INFO - PID 1899 | 1631819275165 webdriver::server DEBUG <- 500 Internal Server Error {"value":{"error":"unknown error","message":"Process unexpectedly closed with status 1","stacktrace":""}}
[task 2021-09-16T19:07:55.318Z] 19:07:55 INFO - STDOUT: ERROR
So the difference between both is the following line:
[task 2021-09-16T19:07:55.868Z] 19:07:55 INFO - PID 1899 | DEBUG: Adding blocker GMPProvider for phase AddonManager: Waiting for providers to shut down.
So the GMPProvider for AddonManager is not getting added as blocker for shutdown. So something between that and ServiceWorkerShutdownBlocker is triggering the shutdown. Sadly we do not yet collect minidumps for crashes in wdspec (see bug 1490906), which might have helped here. I'm working towards it but it might still take a bit.
I wonder if Mn jobs would also fail on MacOS when run in headless mode, but as it looks like they are not available for that platform even defined in the taskcluster configuration. These use Firefox in a similar way and might also trigger the same issue, where we would be able to catch a minidump. Maybe you could add the headless config - marionette-headless to test-platforms.yml and push such a try build?
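The comparison of the two logs above can be sketched mechanically. This is a hypothetical helper, not an existing tool: `message` and `first_missing` are made-up names, and the sketch assumes every relevant line has the form `... PID <n> | <message>`, with an optional leading millisecond timestamp in the message that should be ignored when comparing.

```rust
use std::collections::HashSet;

/// Extracts the message part of a task log line: the text after the first
/// " | ", with a leading run of digits (a timestamp) dropped if present.
fn message(line: &str) -> Option<&str> {
    let (_, rest) = line.split_once(" | ")?;
    let rest = rest.trim();
    let stripped = rest.trim_start_matches(|c: char| c.is_ascii_digit());
    if stripped.len() < rest.len() && stripped.starts_with(' ') {
        Some(stripped.trim_start())
    } else {
        Some(rest)
    }
}

/// Returns the first message of the passing log that never appears in the
/// failing log, regardless of the order in which the messages were logged.
fn first_missing<'a>(passing: &'a str, failing: &str) -> Option<&'a str> {
    let seen: HashSet<&str> = failing.lines().filter_map(message).collect();
    passing
        .lines()
        .filter_map(message)
        .find(|m| !seen.contains(*m))
}
```

Comparing by message content rather than by position matters here, because the two runs log the same early messages in a slightly different order.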
Assignee | ||
Comment 28•4 years ago
|
||
tl;dr I've queued for landing the minimal fix (just remove the assert) since I haven't seen that cause a backout-worthy failure in several runs. My hope is to uplift that for 93 since it will prevent a tab crash in the event of input which uses the HLG transfer function. That should be quite rare to nonexistent based on what we've seen so far with beta telemetry, but it's a simple fix.
(In reply to Henrik Skupin (:whimboo) [⌚️UTC+1] from comment #26)
What I can see so far it's only Wd2 in headless mode which is permanently failing on MacOS 10.15 (not 10.14 as you said above). I tried yesterday to reproduce it locally too, but it's not failing on MacOS 11.5.2. Maybe it's something specific for 10.15?
I interpreted the "OS X 10.15" in the job description to refer to the SDK used for building, not the OS it's actually running on, but I'm not yet very knowledgeable about the CI system generally. I was previously doing my local builds on the 10.14 SDK, but changed to 10.15 (my above comment was a typo) last night in an effort to reproduce. Unfortunately, I still wasn't able to reproduce locally, but it's probably worth trying again after changing my rust toolchain from 1.53 to 1.55 (which it seems most of the try servers are using, at least for macOS builds).
Given that your changes are graphics related I wonder if we could run a try build with some GFX logging via MOZ_LOG enabled? But not sure which type of logs the graphics component supports, and if that also includes Rust components.
I'll take a look at that, but probably in a separate bug I'll open for follow-up to this one.
I wonder if Mn jobs would also fail on MacOS when run in headless mode, but as it looks like they are not available for that platform even defined in the taskcluster configuration. These use Firefox in a similar way and might also trigger the same issue, where we would be able to catch a minidump. Maybe you could add the headless config - marionette-headless to test-platforms.yml and push such a try build?
Another thing I'll put on the to-do list when I can get to it. Right now my priority is getting as much AVIF polish and telemetry as possible uplifted for 93.
Comment 29•4 years ago
|
||
bugherder |
Assignee | ||
Comment 30•4 years ago
|
||
Comment on attachment 9240184 [details]
Bug 1729539 - Hit MOZ_CRASH(assertion failed: y2 > 1. / 12. && y2 <= 1.) at gfx/qcms/src/iccread.rs:1392. r=jrmuizel,tsmith
Beta/Release Uplift Approval Request
- User impact if declined: Tab crash on AVIF inputs using HLG transfer functions (currently extremely rare based on telemetry)
- Is this code covered by automated tests?: Yes
- Has the fix been verified in Nightly?: No
- Needs manual test from QE?: No
- If yes, steps to reproduce:
- List of other uplifts needed: None
- Risk to taking this patch: Low
- Why is the change risky/not risky? (and alternatives if risky): It's merely removing an assert that was ill-conceived to begin with
- String changes made/needed:
Comment 31•4 years ago
|
||
Bugmon Analysis
Verified bug as fixed on rev mozilla-central 20210917215008-186467330eb1.
Removing bugmon keyword as no further action possible. Please review the bug and re-add the keyword for further analysis.
Comment 32•4 years ago
|
||
(In reply to Jon Bauman [:jbauman:] from comment #28)
I interpreted the "OS X 10.15" in the job description to refer to the SDK used for building, not the OS it's actually running on, but I'm not yet very knowledgeable about the CI system generally.
10.15 there means that is the version of macOS those jobs are running on.
Bug 1475652 suggests we are using the 10.11 SDK to build in CI for Intel; I think we are using the 11 SDK to build for Apple silicon.
Comment 33•4 years ago
|
||
Comment on attachment 9240184 [details]
Bug 1729539 - Hit MOZ_CRASH(assertion failed: y2 > 1. / 12. && y2 <= 1.) at gfx/qcms/src/iccread.rs:1392. r=jrmuizel,tsmith
Approved for uplift in 93 beta 8, thanks.
Comment 34•4 years ago
|
||
bugherder uplift |