segfault in DoDatabaseWork (this makes Firefox 71 to crash at startup)
Categories
(Core :: Storage: IndexedDB, defect, P2)
Tracking
()
People
(Reporter: jh, Assigned: sg, NeedInfo)
References
(Regression)
Details
(Keywords: crash, regression)
Attachments
(5 files)
3.28 KB,
patch
|
Details | Diff | Splinter Review | |
3.02 KB,
patch
|
Details | Diff | Splinter Review | |
5.89 KB,
patch
|
froydnj
:
feedback+
|
Details | Diff | Splinter Review |
47 bytes,
text/x-phabricator-request
|
RyanVM
:
approval-mozilla-beta+
jcristau
:
approval-mozilla-release-
|
Details | Review |
47 bytes,
text/x-phabricator-request
|
pascalc
:
approval-mozilla-release-
|
Details | Review |
User Agent: Mozilla/5.0 (X11; Linux x86_64; rv:70.0) Gecko/20100101 Firefox/70.0
Steps to reproduce:
I build Firefox 71 with GCC 9 and LTO enabled
Actual results:
It segfaults on startup. The problem is with following code:
#5 mozilla::dom::indexedDB::(anonymous namespace)::ObjectStoreAddOrPutRequestOp::DoDatabaseWork (this=0x7fffd7687430, aConnection=0x7fffd92d6640)
at /aux/hubicka/firefox-2019-2/dom/indexedDB/ActorsParent.cpp:24402
24402 rv = aConnection->GetCachedStatement(
(gdb) l
24397 // detectable errors rather than corrupting data.
24398 DatabaseConnection::CachedStatement stmt;
24399 const auto& optReplaceDirective = (!mOverwrite || keyUnset)
24400 ? NS_LITERAL_CSTRING("")
24401 : NS_LITERAL_CSTRING("OR REPLACE ");
24402 rv = aConnection->GetCachedStatement(
24403 NS_LITERAL_CSTRING("INSERT ") + optReplaceDirective +
24404 NS_LITERAL_CSTRING("INTO object_data "
24405 "(object_store_id, key, file_ids, data) "
24406 "VALUES (:") +
(gdb)
24407 kStmtParamNameObjectStoreId + NS_LITERAL_CSTRING(", :") +
24408 kStmtParamNameKey + NS_LITERAL_CSTRING(", :") +
24409 kStmtParamNameFileIds + NS_LITERAL_CSTRING(", :") +
24410 kStmtParamNameData + NS_LITERAL_CSTRING(");"),
24411 &stmt);
24412 if (NS_WARN_IF(NS_FAILED(rv))) {
24413 return rv;
24414 }
24415
24416 rv = stmt->BindInt64ByName(kStmtParamNameObjectStoreId, osid);
Here GCC optimizes out constructors in
24399 const auto& optReplaceDirective = (!mOverwrite || keyUnset)
24400 ? NS_LITERAL_CSTRING("")
24401 : NS_LITERAL_CSTRING("OR REPLACE ");
Because their lifetime ends at the end of the statement. A
self-contained testcase is:
struct a {int val;
a(int v) : val(v) {}
};
int
test(int bar)
{
const struct a &ref=bar ? static_cast<const a &>(a(1)):static_cast <const a &>(a(2));
return ref.val;
}
Here initialization will be optimized out while
struct a {int val;
a(int v) : val(v) {}
};
int
test(int bar)
{
const struct a &ref=static_cast<const a &>(a(1));
return ref.val;
}
Reporter | ||
Comment 1•4 years ago
|
||
Let me see if the I cut&paste the code and get it formated nicely
24402 rv = aConnection->GetCachedStatement(
(gdb) l
24397 // detectable errors rather than corrupting data.
24398 DatabaseConnection::CachedStatement stmt;
24399 const auto& optReplaceDirective = (!mOverwrite || keyUnset)
24400 ? NS_LITERAL_CSTRING("")
24401 : NS_LITERAL_CSTRING("OR REPLACE ");
24402 rv = aConnection->GetCachedStatement(
24403 NS_LITERAL_CSTRING("INSERT ") + optReplaceDirective +
24404 NS_LITERAL_CSTRING("INTO object_data "
24405 "(object_store_id, key, file_ids, data) "
24406 "VALUES (:") +
(gdb)
24407 kStmtParamNameObjectStoreId + NS_LITERAL_CSTRING(", :") +
24408 kStmtParamNameKey + NS_LITERAL_CSTRING(", :") +
24409 kStmtParamNameFileIds + NS_LITERAL_CSTRING(", :") +
24410 kStmtParamNameData + NS_LITERAL_CSTRING(");"),
24411 &stmt);
24412 if (NS_WARN_IF(NS_FAILED(rv))) {
24413 return rv;
24414 }
24415
24416 rv = stmt->BindInt64ByName(kStmtParamNameObjectStoreId, osid);
Reporter | ||
Comment 2•4 years ago
|
||
Also reported as https://bugzilla.redhat.com/show_bug.cgi?id=1779082#
Reporter | ||
Comment 4•4 years ago
|
||
Reporter | ||
Comment 5•4 years ago
|
||
The lifetime of the objects in ?: was extended by 2017 DR as discussed https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92831
This is implemented by Clang7+ but not yet by GCC. So Clang 6 also miscompiles the simplified testcase.
I am attaching a workaround.
I needed to rebase the patch to make it apply correctly.
The version I rebased with comments documenting the issue is attached.
Comment 7•4 years ago
|
||
Thanks. I think dom/localstorage/ActorsParent.cpp needs also some care to get extensions working.
Comment 8•4 years ago
|
||
The patch is incomplete, I'm running asan build to identify all affected places.
Comment 10•4 years ago
|
||
This patch fixes the extensions for me.
Comment 11•4 years ago
|
||
Is it really Linux specific? (I arrived here by chain of "duplicate of")
User of uBlock Origin on Reddit thread reports that uBO fails to load properly on fresh profile -
https://www.reddit.com/r/uBlockOrigin/comments/e6lr3x/ublock_origin_no_filter_lists_appear/
This apparently started in 70 and continues in 71.
He uses Windows 10.
After turning uBO off/on this appears in console:
Webconsole context has changed
console.trace() Error: "An unexpected error occurred" start.js:298:13 <anonymous> moz-extension://7d76b2af-3a6a-40d5-9c27-8162efc90c3d/js/start.js:298
UnknownError: The operation failed for reasons unrelated to the database itself and not covered by any other error code. ExtensionStorageIDB.jsm:812
normalizeStorageError resource://gre/modules/ExtensionStorageIDB.jsm:812
method chrome://extensions/content/child/ext-storage.js:273
AsyncFunctionThrow self-hosted:693
Comment 12•4 years ago
|
||
Comment 13•4 years ago
|
||
Ok, so https://www.reddit.com/r/uBlockOrigin/comments/e6lr3x/ublock_origin_no_filter_lists_appear/ is resolved.
I narrowed it down to not being able to write to the profile directory, which is on a shared drive. That said I have full permissions to it and I am able to write to it without issue. To fix the issue I set the profile to save on my C drive in ProgramData. Now it is working as it should. I am not sure why it was not able to write to the shared drive.
Comment 14•4 years ago
|
||
Comment on attachment 9114512 [details] [diff] [review] workaround_dom_indexdb_actorsparent_allignment-v2.patch Review of attachment 9114512 [details] [diff] [review]: ----------------------------------------------------------------- It's nice to find the problem, but I don't think this patch is acceptable as it stands. We'd also need a patch against trunk. ::: firefox-71.0/dom/indexedDB/ActorsParent.cpp.gcc-workaround @@ -25869,5 @@ > } > } > > - const auto& comparisonChar = > - isIncreasingOrder ? NS_LITERAL_CSTRING(">") : NS_LITERAL_CSTRING("<"); Rather than substituting the expression everywhere, does something like: ``` NS_NAMED_LITERAL_CSTRING(kGreaterThan, ">"); NS_NAMED_LITERAL_CSTRING(kLessThan, "<"); const auto& comparisonChar = isIncreasingOrder ? kGreaterThan : kLessThan; ``` work? And similar changes to other places in the file? That would be a significantly more acceptable change.
Comment 16•4 years ago
•
|
||
The status of this code is funky. Bug 1168606 has landed what broke things in 71, but also landed things years ago, as well as in the 72 cycle. The code Nathan specifically points out in comment 14 doesn't exist anymore, and instead, uses something kind of like what he suggested, but using a GetSortKeyClause
function (per https://hg.mozilla.org/integration/autoland/rev/cf53f4e57313). OTOH, the others are still there, but indexTable and table have been using ?: for as long as the file has existed...
Comment 17•4 years ago
|
||
Simon, maybe you can investigate the issue per comment 16?
Assignee | ||
Comment 18•4 years ago
•
|
||
I will look into this.
Assignee | ||
Comment 19•4 years ago
|
||
Thanks for your work on the patches. I think that only minimal changes are necessary to fix this, only the reference needs to removed, so instead of, e.g.
const auto& optReplaceDirective = (!mOverwrite || keyUnset)
? NS_LITERAL_CSTRING("")
: NS_LITERAL_CSTRING("OR REPLACE ");
we can just write:
const auto optReplaceDirective = (!mOverwrite || keyUnset)
? NS_LITERAL_CSTRING("")
: NS_LITERAL_CSTRING("OR REPLACE ");
which does not rely on lifetime extension implemented differently by different compilers, and it does not impact the quality of the source code nor the generated code negatively, since it is equivalent in this case, given that both arms are temporaries.
I will prepare patches for trunk and uplift requests to beta and release to fix that.
Assignee | ||
Comment 20•4 years ago
|
||
(In reply to Jan Hubicka from comment #5)
The lifetime of the objects in ?: was extended by 2017 DR as discussed https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92831
This is implemented by Clang7+ but not yet by GCC. So Clang 6 also miscompiles the simplified testcase.
I am attaching a workaround.
In my understanding, this behaviour wasn't changed in the context of http://www.open-std.org/jtc1/sc22/wg21/docs/cwg_defects.html#1299, but merely made explicit. While it was obviously open to different interpretations before that, I don't think that the standard said before that that the lifetime of temporaries in the arms of ?:
should not be extended. The behaviour of gcc and clang <7 is rather surprising IMO.
We didn't catch this earlier, since while we have base Linux toolchain builds (linux64-base-toolchains and linux64-base-toolchains-clang) building with the oldest supported compiler version, I think that tests are only run using clang 9 builds. Mike, is that accurate?
Assignee | ||
Comment 21•4 years ago
|
||
(In reply to Mike Hommey [:glandium] (high latency) from comment #16)
OTOH, the others are still there, but indexTable and table have been using ?: for as long as the file has existed...
They have been using ?:
forever, but that's not the sole condition for the problem, but binding the result of ?:
with temporaries constructed in the arms to a const&
local variable constitutes the problem. This was only introduced by https://hg.mozilla.org/integration/autoland/rev/cddcfdcb52f6a465f62bdfb36bf9dc9a8dbd11a7
Assignee | ||
Comment 22•4 years ago
|
||
Updated•4 years ago
|
Assignee | ||
Comment 23•4 years ago
|
||
Assignee | ||
Comment 24•4 years ago
|
||
Jan, Martin could you please check if applying https://bugzilla.mozilla.org/attachment.cgi?id=9115405 to release fixes the issue with your toolchain? That one is the patch against release, which also fixes one additional code location that no longer exists on trunk and therefore is missing in the other patch, as Mike mentioned in https://bugzilla.mozilla.org/show_bug.cgi?id=1601707#c16
Assignee | ||
Comment 25•4 years ago
|
||
Comment on attachment 9115402 [details]
Bug 1601707 - Workaround for compilers that do not extend the lifetime of temporaries resulting from ?: expressions.
Beta/Release Uplift Approval Request
- User impact if declined: Optimized builds using gcc or clang<7 will crash on startup.
Note that this patch is the one that can be applied to trunk and beta, while the other patch is the one to be applied to release, which needs a fix at one additional code location.
- Is this code covered by automated tests?: Yes
- Has the fix been verified in Nightly?: No
- Needs manual test from QE?: No
- If yes, steps to reproduce:
- List of other uplifts needed: None
- Risk to taking this patch: Low
- Why is the change risky/not risky? (and alternatives if risky): The change only changes the type of some local (string) variables from a const reference to a const value, which are initialized from temporaries.
- String changes made/needed:
Assignee | ||
Updated•4 years ago
|
Assignee | ||
Updated•4 years ago
|
Updated•4 years ago
|
Updated•4 years ago
|
Updated•4 years ago
|
Updated•4 years ago
|
Updated•4 years ago
|
Comment 26•4 years ago
|
||
User impact if declined: Optimized builds using gcc or clang<7 will crash on startup.
Note this doesn't necessarily lead to crashes. In some cases, it "just" breaks addons.
only run using clang 9 builds. Mike, is that accurate?
Yes.
Comment 27•4 years ago
|
||
Thanks for the patch, I'll test it this week.
Comment 28•4 years ago
|
||
The bug doesn't impact Mozilla builds. We are half way to 72 with no dot release planned in 71 and distros can apply the fix to their toolchain as a temporary workaround. Marking 71 as wontfix.
Updated•4 years ago
|
Updated•4 years ago
|
Comment 29•4 years ago
|
||
Just a heads up, gcc-9.x branch has had the backport and is now working on expected. I do believe the best solution is provided, I doubt many distros are gonna roll a new update to the toolchain.
Comment 30•4 years ago
|
||
Looks like we are still waiting for more feedback. I don't want to land this in late beta 72 during the holidays so it will likely need to wait till 73/74 in January.
Updated•4 years ago
|
Comment 31•4 years ago
|
||
Comment on attachment 9115402 [details]
Bug 1601707 - Workaround for compilers that do not extend the lifetime of temporaries resulting from ?: expressions.
72 is now on m-r.
Assignee | ||
Comment 32•4 years ago
|
||
Still waiting for feedback from distro, but I am landing the patch to central, as the underlying issue affects our coverage builds as well, so we will hopefully see if it fixes the issue in Bug 1607001.
Comment 33•4 years ago
|
||
Pushed by sgiesecke@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/810f85be6ba9 Workaround for compilers that do not extend the lifetime of temporaries resulting from ?: expressions. r=dom-workers-and-storage-reviewers,janv
Comment 36•4 years ago
|
||
So I was told this issue is the cause of
https://github.com/vector-im/riot-web/issues/11711
which is not a crash but a massive problem with using local storage.
Apparently:
https://phabricator.services.mozilla.com/D56873
is supposed to fix this but I tried this on nixos and it didn‘t solve my problem.
Debian uses an alternative patch for firefox 71:
https://salsa.debian.org/mozilla-team/firefox/commit/e888ce1578d68116949782035f8149cdbc5e832c
I tried that one and that does actually solve the problem.
Comment 37•4 years ago
|
||
(In reply to Malte Brandy from comment #36)
So I was told this issue is the cause of
https://github.com/vector-im/riot-web/issues/11711
which is not a crash but a massive problem with using local storage.Apparently:
https://phabricator.services.mozilla.com/D56873
is supposed to fix this but I tried this on nixos and it didn‘t solve my problem.Debian uses an alternative patch for firefox 71:
https://salsa.debian.org/mozilla-team/firefox/commit/e888ce1578d68116949782035f8149cdbc5e832c
I tried that one and that does actually solve the problem.
It helps if you let people know what version you are working with. The patches both solve the same problem just different code sets. Many of the fixes in the debian patch where drop'd already in the beta/72.0 final hense it wouldn't solve your issue in 71.
Comment 38•4 years ago
|
||
I tested on 71.0.
Comment 39•4 years ago
|
||
bugherder |
Assignee | ||
Comment 40•4 years ago
|
||
(In reply to Malte Brandy from comment #36)
So I was told this issue is the cause of
https://github.com/vector-im/riot-web/issues/11711
which is not a crash but a massive problem with using local storage.Apparently:
https://phabricator.services.mozilla.com/D56873
is supposed to fix this but I tried this on nixos and it didn‘t solve my problem.
Sorry if that was not clear, but the patch you need is https://phabricator.services.mozilla.com/D56876. https://phabricator.services.mozilla.com/D56873 is missing one code location, which was removed in the meantime. This is probably the reason it doesn't solve the issue with FF 71.
Comment 41•4 years ago
|
||
This patch missed Fx73 moving to the Beta channel. Please nominate the patch for Beta approval assuming we'd like this fix on 73 :)
Assignee | ||
Comment 42•4 years ago
|
||
Comment on attachment 9115402 [details]
Bug 1601707 - Workaround for compilers that do not extend the lifetime of temporaries resulting from ?: expressions.
Beta/Release Uplift Approval Request
- User impact if declined: Optimized builds using gcc or clang<7 will crash on startup. gcc/gcov coverage builds will show several test failures, some permanent, some intermittent.
- Is this code covered by automated tests?: Yes
- Has the fix been verified in Nightly?: No
- Needs manual test from QE?: No
- If yes, steps to reproduce:
- List of other uplifts needed: None
- Risk to taking this patch: Low
- Why is the change risky/not risky? (and alternatives if risky): The change only changes the type of some local (string) variables from a const reference to a const value, which are initialized from temporaries.
- String changes made/needed:
Comment 43•4 years ago
|
||
(In reply to Simon Giesecke [:sg] [he/him] from comment #40)
Sorry if that was not clear, but the patch you need is https://phabricator.services.mozilla.com/D56876. https://phabricator.services.mozilla.com/D56873 is missing one code location, which was removed in the meantime. This is probably the reason it doesn't solve the issue with FF 71.
Thank you! That fixed it!
Comment 44•4 years ago
|
||
Comment on attachment 9115402 [details]
Bug 1601707 - Workaround for compilers that do not extend the lifetime of temporaries resulting from ?: expressions.
Fixes a crash with some older compilers. Approved for 73.0b2.
Comment 45•4 years ago
|
||
bugherder uplift |
Updated•4 years ago
|
Comment 46•4 years ago
|
||
Comment on attachment 9115402 [details]
Bug 1601707 - Workaround for compilers that do not extend the lifetime of temporaries resulting from ?: expressions.
let's leave this alone for 72.
Updated•4 years ago
|
Description
•