Closed
Bug 1221872
Opened 9 years ago
Closed 9 years ago
crash in js::jit::LRecoverInfo::appendResumePoint
Categories
(Core :: JavaScript Engine: JIT, defect)
Tracking
()
People
(Reporter: bc, Assigned: nbp)
References
Details
(Keywords: crash, reproducible, sec-high, Whiteboard: [after:2016-06-01;ni?:nbp][adv-main46-][adv-esr38.7+][adv-esr45.1-])
Crash Data
Attachments
(14 files, 1 obsolete file)
6.87 KB,
text/plain
|
Details | |
665 bytes,
patch
|
Details | Diff | Splinter Review | |
8.62 KB,
text/plain
|
Details | |
8.16 KB,
text/x-log
|
Details | |
8.22 KB,
text/plain
|
Details | |
189.23 KB,
text/plain
|
Details | |
162.46 KB,
text/plain
|
Details | |
11.13 KB,
text/plain
|
Details | |
1.87 KB,
patch
|
Details | Diff | Splinter Review | |
3.38 KB,
text/plain
|
Details | |
1.18 KB,
patch
|
sunfish
:
review+
ritu
:
approval-mozilla-aurora+
ritu
:
approval-mozilla-beta+
ritu
:
approval-mozilla-esr38+
Sylvestre
:
approval-mozilla-esr45-
abillings
:
sec-approval+
|
Details | Diff | Splinter Review |
5.40 KB,
patch
|
sunfish
:
review+
|
Details | Diff | Splinter Review |
956 bytes,
patch
|
sunfish
:
review+
|
Details | Diff | Splinter Review |
1.54 KB,
patch
|
jseward
:
review+
|
Details | Diff | Splinter Review |
This bug was filed from the Socorro interface and is
report bp-721e86d9-1968-43db-855c-106452151105.
=============================================================
1. http://www.digitalforsyth.org/photos/browse/people-individuals-baise-betty
2. Crash [@ js::jit::LRecoverInfo::appendResumePoint ]
Appears to be Windows, Linux and OSX, Beta/43, Aurora/44, Nighty/45. Automation only reproduced on debug though I found the same waiting for the page to complete loading with an opt build.
Reproduced in Bughunter and manually on Windows 7.
opt-asan builds on Linux have stacks like: InlineSpaghettiStackIterator begin storesBegin js::jit::LRecoverInfo::appendResumePoint(js::jit::MResumePoint*) init
Comment 1•9 years ago
|
||
Problem possibly originated due to an uninitialised value
in visitDominatorTree? I'll investigate a bit more.
Comment 2•9 years ago
|
||
It appears that MBasicBlock::MBasicBlock doesn't initialise domIndex_,
and that those uninitialised values are leaking out in MBasicBlock::dominates.
This doesn't seem to me to be good, and maybe should be fixed anyway.
Patch below. It gets rid of all reported uninitialised value complaints.
Curiously though, fixing that doesn't remove the crash. It still
looks like InlineSpaghettiStackIterator is dereferencing a field of a NULL
object, or something like that.
Comment 3•9 years ago
|
||
Initialises domIndex_ in MBasicBlock::MBasicBlock.
Assignee | ||
Comment 4•9 years ago
|
||
The valgrind report is interesting, but for comment 0, you might want to have a look at Bug 1186006.
Unfortunately this patch is currently blocked by our test suite, and I did not managed to reproduce so far, nor any of the fuzzers.
Assignee | ||
Comment 5•9 years ago
|
||
Comment on attachment 8683612 [details] [diff] [review]
bug1221872-1.diff
Review of attachment 8683612 [details] [diff] [review]:
-----------------------------------------------------------------
Honestly, this is strange that we have to initialize this field this early to mute valgrind, because the flag is supposed to be set by BuildDominatorTree[1], which precedes the value numbering [2], while no other transformations are supposed to add any new basic block, which would justify to rerun the BuildDominatorTree function.
[1] https://dxr.mozilla.org/mozilla-central/source/js/src/jit/Ion.cpp#1541
[2] https://dxr.mozilla.org/mozilla-central/source/js/src/jit/Ion.cpp#1620
Assignee | ||
Comment 6•9 years ago
|
||
I guess the problem might be within the ValueNumber:
https://dxr.mozilla.org/mozilla-central/source/js/src/jit/ValueNumbering.cpp#434
Comment 7•9 years ago
|
||
We could initialize domIndex_ to UINT32_MAX or something, then when we access it assert it's != that value.
(In reply to Nicolas B. Pierron [:nbp] from comment #4)
> Unfortunately this patch is currently blocked by our test suite, and I did
> not managed to reproduce so far, nor any of the fuzzers.
You can request temporary access to a test slave to investigate.
Comment 8•9 years ago
|
||
(In reply to Nicolas B. Pierron [:nbp] from comment #5)
Here are the same errors but this time with stack traces showing
where the uninitialised values were created. Maybe that helps
make sense of the sequence?
Assignee | ||
Comment 9•9 years ago
|
||
Attachment #8683672 -
Flags: review?(sunfish)
Attachment #8683672 -
Flags: feedback?(jseward)
Comment 10•9 years ago
|
||
(In reply to Nicolas B. Pierron [:nbp] from comment #9)
> Value Numbering: Account for changes after fixupOSROnlyLoop.
Hmm, this doesn't seem to have any effect. The uninitialised value uses
and the final invalid access, which kills the process, are still present.
Updated•9 years ago
|
Attachment #8683672 -
Flags: feedback?(jseward)
Comment 11•9 years ago
|
||
Comment on attachment 8683672 [details] [diff] [review]
Value Numbering: Account for changes after fixupOSROnlyLoop.
Review of attachment 8683672 [details] [diff] [review]:
-----------------------------------------------------------------
Looks reasonable to me. If you wanted to rename blocksRemoved_ to blocksAddedOrRemoved_, that'd be ok with me too.
Attachment #8683672 -
Flags: review?(sunfish) → review+
Reporter | ||
Comment 12•9 years ago
|
||
As originally filed, this is probably a dupe of bug 1186006. This is the second most common crash found in bughunter in the last week.
nbp: Do you think we could get this landed to see if it helps at all?
Comment 13•9 years ago
|
||
nbp: per comment 10, are we sure this is really the right fix?
Assignee | ||
Comment 14•9 years ago
|
||
(In reply to Bob Clary [:bc:] from comment #12)
> As originally filed, this is probably a dupe of bug 1186006. This is the
> second most common crash found in bughunter in the last week.
>
> nbp: Do you think we could get this landed to see if it helps at all?
So far I am unable to reproduce this issue. Valgrind stops reporting because of a large number of error message reported and prevents firefox from launching a window, and I cannot test with ASan because clang failed to compile firefox locally (at least) because of some system header issues in my toolchain.
Also, based on comment 10, the patch which is attached on this bug is not enough. I do not think I can go anywhere without good hints or the ability to reproduce. I will look at landing the current patch as part of another bug.
Reporter | ||
Comment 15•9 years ago
|
||
A quick run with an asan build from 20151205152029
http://hg.mozilla.org/mozilla-central/rev/cc9c6cd756cb744596ba039dcc5ad3065a7cc3ea
==28193==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000020 (pc 0x7efdc502f3dc sp 0x7efd9b63e1c0 bp 0x7efd9b63e1f0 T13)
#0 0x7efdc502f3db in InlineSpaghettiStackIterator /builds/slave/m-cen-l64-asan-000000000000000/build/src/js/src/jit/InlineList.h:640
Reporter | ||
Comment 16•9 years ago
|
||
ditto with today's asan build from http://ftp.mozilla.org/pub/firefox/tinderbox-builds/mozilla-central-linux64-asan/1450726979/
Reporter | ||
Updated•9 years ago
|
Group: core-security
Reporter | ||
Comment 17•9 years ago
|
||
Thread 12 (crashed)
0 XUL!js::jit::LRecoverInfo::appendResumePoint(js::jit::MResumePoint*) [InlineList.h:1ba4f6d91e63 : 639 + 0x0]
I made this security sensitive due to this medium exploitable crash on OSX as well as the valgrind run I am completing now.
Reporter | ||
Comment 18•9 years ago
|
||
This is from a mozilla-central build from 20151218111408 on fedora 23 x86_64
==32462== Conditional jump or move depends on uninitialised value(s)
==32462== at 0x98CF558: js::jit::ValueNumberer::visitDominatorTree(js::jit::MBasicBlock*) (js/src/jit/ValueNumbering.cpp:994)
==32462== Conditional jump or move depends on uninitialised value(s)
==32462== at 0x98CCFE7: js::jit::ValueNumberer::leader(js::jit::MDefinition*) (js/src/jit/ValueNumbering.cpp:638)
==32462== Conditional jump or move depends on uninitialised value(s)
==32462== at 0x982C616: js::jit::MPhi::foldsTernary() (js/src/jit/MIR.cpp:1685)
==32462== Invalid read of size 8
==32462== at 0x97FF807: InlineSpaghettiStackIterator (js/src/jit/InlineList.h:639)
Reporter | ||
Comment 19•9 years ago
|
||
I ran this url through bughunter again and could reproduce on the following:
> aurora debug: Linux 2.6.32 x86 Mac OS X 10.8 x86_64 Mac OS X 10.9 x86_64 Windows NT 6.1 x86 Mac OS X 10.6 x86_64
> aurora opt-asan: Linux 2.6.32 x86_64
> beta debug: Mac OS X 10.8 x86_64 Mac OS X 10.9 x86_64 Mac OS X 10.6 x86_64
> beta opt: Linux 2.6.32 x86
> beta opt-asan: Linux 2.6.32 x86_64
> nightly debug: Linux 2.6.32 x86 Mac OS X 10.8 x86_64 Mac OS X 10.9 x86_64 Mac OS X 10.6 x86_64
> nightly opt: Linux 2.6.32 x86_64
Assignee | ||
Comment 20•9 years ago
|
||
Thanks for the valgrind output, it sounds a bit more helpful than ASan, probably because of a lacks of ASan instrumentation in our LifoAlloc.
I found issues in the way I configured firefox, which might have caused a lot of valgrind issues. I will try it again as soon as it finish building.
Assignee | ||
Comment 21•9 years ago
|
||
(In reply to Nicolas B. Pierron [:nbp] from comment #20)
> I found issues in the way I configured firefox, which might have caused a
> lot of valgrind issues. I will try it again as soon as it finish building.
I am still unable to reproduce this issue under valgrind, and see any window when executed under valgrind because of issues within mesa-noglu-10.6.9/lib/libGL.so.1.2.0.
Assignee | ||
Comment 22•9 years ago
|
||
I am also unable to make the website mentioned in comment 0 crash while running without valgrind.
Reporter | ||
Comment 23•9 years ago
|
||
nbp: not sure what to say without more information about your system and which build types you have been trying. My Linux machines in bughunter are RHEL6 with memory ranging from 3G for 32bit to 4-5 G for 64bit, but my local machine is a real workstation running Fedora 23 with 16G.
I presume you have Ubuntu? And you have tested both opt and debug builds? I can reproduce on beta/44 and aurora/45 as well as nightly/45. Have you tried beta and aurora? I can also reproduce this with a fresh profile. Have you tried a fresh profile yet?
If you can't reproduce, perhaps we can find someone on the JS team who can?
Assignee | ||
Comment 24•9 years ago
|
||
So I figured out the problem with valgrind. Apparently x30 slower is really slow, and leave time for printing messages about the lack of window. Leaving it running for hours make it able to load the page, but unfortunately not enough, as the unresponsive script dialog kicks in.
I also accidentally killed a release version of Firefox by accidentally pasting the URL. So I am able to reproduce the issue, but not in a way which is manageable yet.
I will see if I can use "rr" to record a faulty run over the night.
(In reply to Bob Clary [:bc:] from comment #23)
> I presume you have Ubuntu?
I am running NixOS. On a laptop with 16G of RAM.
> And you have tested both opt and debug builds?
I only tested with debug builds so far.
> I
> can reproduce on beta/44 and aurora/45 as well as nightly/45. Have you tried
> beta and aurora?
No, I have not, because I cannot use the pre-build versions of Firefox. (no /lib)
> I can also reproduce this with a fresh profile. Have you
> tried a fresh profile yet?
I tried with a profile that I only use for testing. I will create a new profile and test with it.
> If you can't reproduce, perhaps we can find someone on the JS team who can?
If I have no recorded trace tomorrow, I will forward this bug to someone else. Unfortunately, based on the code location I am probably the best person to investigate :/
Reporter | ||
Comment 25•9 years ago
|
||
I forgot to mention that I do use valgrind builds from svn and not the packaged versions in case that helps.
I've never used rr before and not familiar with the details, but is that something I could do and share with you?
Assignee | ||
Comment 26•9 years ago
|
||
(In reply to Bob Clary [:bc:] from comment #25)
> I forgot to mention that I do use valgrind builds from svn and not the
> packaged versions in case that helps.
>
> I've never used rr before and not familiar with the details, but is that
> something I could do and share with you?
Unfortunately, even if you manage to reproduce it with rr, if we do not have the same CPU, then we cannot share the execution trace that easily.
The technique¹ seems to work, I restarted it on an optimized build, because the debug version settled on an unrelated assertion.
¹ the following shell script:
function which_url() {
local i=$(($1 % 20));
case $i in
0) echo 'http://www.digitalforsyth.org/photos/browse/people-individuals-baise-betty';;
*) echo '/home/nicolas/mirror/www.digitalforsyth.org/photos/browse/people-individuals-baise-betty'
esac
}
counter=0;
result=0;
while test $result -ne 11; do
timeout --preserve-status -k 10s -s QUIT 60s \
rr record \
/…/dist/bin/firefox -P debug-1 --no-remote $(which_url $counter) ;
result=$?;
counter=$(($counter + 1));
done
Assignee | ||
Comment 27•9 years ago
|
||
Ok, I have more than a 200 runs so far, and I am still unable to get Firefox crash with the original url of comment 0, within rr.
Sunfish, maybe you would have a bit more chance than me at reproducing this issue.
Flags: needinfo?(sunfish)
Comment 28•9 years ago
|
||
I am a little surprised at the difficulties reproducing it. I can get the
Valgrind reports and crash every time. STR in a bit of detail:
Fedora release 21 (Twenty One) (64 bit)
gcc (GCC) 4.9.2 20150212 (Red Hat 4.9.2-6)
valgrind-3.12.0.SVN (but I'm sure it doesn't matter much)
Mozconfig as shown below. Then:
(cd M_CENTRAL && hg pull)
hg clone M_CENTRAL REPRO-1221872
cd REPRO-1221872
./mach build
valgrind --vex-iropt-register-updates=allregs-at-mem-access --smc-check=all-non-file \
--fair-sched=yes --show-mismatched-frees=no \
./ff-Og-linux64/dist/bin/firefox-bin -P dev -no-remote 2>&1 \
| tee spew-01-mc
And load the URL in comment #0. It crashes almost immediately (a few seconds).
In total does not require more than about 4 CPU minutes from Valgrind.
Certainly not hours and hours, per comment #24. Also, with this mozconfig
this is no avalanche of errors from Valgrind. I get just 14 errors from
4 contexts.
-------- MOZCONFIG --------
. $topsrcdir/browser/config/mozconfig
mk_add_options MOZ_OBJDIR=@TOPSRCDIR@/ff-Og-linux64
ac_add_options --enable-tests
ac_add_options --enable-optimize="-g -Og"
ac_add_options --enable-debug-symbols
ac_add_options --enable-valgrind
ac_add_options --enable-profiling
ac_add_options --enable-elf-hack
ac_add_options --disable-crashreporter
ac_add_options --disable-jemalloc
mk_add_options MOZ_MAKE_FLAGS="-j8"
mk_add_options AUTOCLOBBER=1
Updated•9 years ago
|
Group: core-security → javascript-core-security
Comment 29•9 years ago
|
||
http://fashion.aeonsquare.net/tomicakoutu/ was reprodcuible today with a debug build from today but not sure if this is this bug or bug bug 1238592
Comment 30•9 years ago
|
||
https://crash-stats.mozilla.com/report/index/d0d48f58-0fc0-4d74-839b-381732160205 is the crash-stack i got for opt builds
Comment 31•9 years ago
|
||
(I don't have much experience running Firefox under valgrind.) When I try to reproduce the build and steps in comment #28, the resulting Firefox exits after a few seconds, and valgrind reports zero errors. Running the same Firefox with the same options outside of valgrind and it launches normally. Is there something extra I should do to run Firefox under valgrind?
Visiting the URLs in comment #0 and comment #29 outside of valgrind don't crash for me.
gcc (Debian 5.2.1-23) 5.2.1 20151028
valgrind-3.11.0
Comment 32•9 years ago
|
||
With current m-c sources, I cannot reproduce the crash any longer.
But I still get the uninitialised value warnings shown in comment 8,
both with
http://www.digitalforsyth.org/photos/browse/people-individuals-baise-betty
and
http://fashion.aeonsquare.net/tomicakoutu/
It's unclear to me why Dan and others can't repro this. I wonder if
there is some difference in JS script execution or some such, which
causes different paths in IM to be used.
Assignee | ||
Comment 33•9 years ago
|
||
(In reply to Julian Seward [:jseward] from comment #32)
> With current m-c sources, I cannot reproduce the crash any longer.
This might have been fixed by Bug 1238592, which fix the issue where instructions could be moved in basic blocks which had nullptr as resume points, and caused the current failure.
Then the only reason why would we have a a non-initialized dominators would be that we re-enter GVN after creating fixup blocks and without recomputing dominators.
Comment 34•9 years ago
|
||
(In reply to Nicolas B. Pierron [:nbp] from comment #33)
Thanks for the analysis.
> Then the only reason why would we have a a non-initialized dominators would
> be that we re-enter GVN after creating fixup blocks and without recomputing
> dominators.
Are you 100% sure this can never happen?
Comment 35•9 years ago
|
||
Is this about domIndex_? Why can't we do something like this:
(In reply to Jan de Mooij [:jandem] from comment #7)
> We could initialize domIndex_ to UINT32_MAX or something, then when we
> access it assert it's != that value.
At least that'd allow us to reproduce it without Valgrind.
Reporter | ||
Comment 36•9 years ago
|
||
This is still reproducible for me with the original url on Aurora Linux x86_64 though not with Nightly. I used mozregression and it pointed to Bug 1238592 as the fix.
I ran it under valgrind on mozilla central and hit several Conditional jump or move depends on uninitialised value(s) all with
==27132== Uninitialised value was created by a client request
==27132== at 0x98998E8: js::detail::BumpChunk::setBump(void*) (js/src/ds/LifoAlloc.h:84)
Comment 37•9 years ago
|
||
(In reply to Bob Clary [:bc:] from comment #36)
> This is still reproducible for me with the original url on Aurora Linux
> x86_64 though not with Nightly.
I can still repro on nightly.
Comment 38•9 years ago
|
||
> (In reply to Jan de Mooij [:jandem] from comment #7)
> > We could initialize domIndex_ to UINT32_MAX or something, then when we
> > access it assert it's != that value.
>
> At least that'd allow us to reproduce it without Valgrind.
Yeah, good suggestion. And indeed that does cause an immediate assertion
failure for me for the URL in comment #0.
Comment 39•9 years ago
|
||
Feeble patch that adds relevant assertions per comment 38.
Not suitable for committing.
Comment 40•9 years ago
|
||
Assignee | ||
Comment 41•9 years ago
|
||
I guess the problem might come from the removal of an edge into a for loop, which makes us add a fixup block, but not toggle the removeBlocks_ flag. Then when we iterate a second time we have no updated dominator tree information.
Another issue might just be that we iterate over the fixup block that we just added into the graph, even before running the dominator tree analysis again.
Flags: needinfo?(sunfish) → needinfo?(nicolas.b.pierron)
Assignee | ||
Comment 42•9 years ago
|
||
I can reproduce the same use-before-init in the JS Shell, with --ion-offthread-compile=off :
var f = function () {
f = function () {
return true;
};
return false;
};
var uneval_f = uneval(f);
f = eval(uneval_f);
function fixupOSR_7(c) {
var i = 1;
if (!f()) {
outer: while (i < c) {
var j = i++;
while (j > 4) {
f();
if (j % 2 == 0)
j = j / 2;
else
j = 3 * j + 1;
}
}
}
};
fixupOSR_7(100);
This is inspired from Bug 1238592, and made such that we add a fixup block, by removing the first edge and doing an OSR in the outer loop, but also made such that all the basic blocks remain in the MIR Graph.
Currently we only trigger the re-computation of the dominator indexes when one block got removed, but we should do that even if any edge is removed from the graph.
Now that I have a test case, I should be able to come up with a patch :)
Assignee | ||
Comment 43•9 years ago
|
||
(In reply to Nicolas B. Pierron [:nbp] from comment #41)
> Another issue might just be that we iterate over the fixup block that we
> just added into the graph, even before running the dominator tree analysis
> again.
Ok, this is the issue highlighted by the test case, which is likely to be the same issue seen on the website.
The other hypothesis would have been fixed by the attachment 8683672 [details] [diff] [review], but so far I did not managed to make any test case which triggers it. I think it might still be interesting to land it, but I will submit another patch which renames the variables name too.
Flags: needinfo?(nicolas.b.pierron)
Assignee | ||
Comment 44•9 years ago
|
||
Attachment #8718885 -
Flags: review?(sunfish)
Assignee | ||
Comment 45•9 years ago
|
||
Attachment #8718886 -
Flags: review?(sunfish)
Assignee | ||
Updated•9 years ago
|
Attachment #8683672 -
Attachment is obsolete: true
Assignee | ||
Comment 46•9 years ago
|
||
Attachment #8718887 -
Flags: review?(sunfish)
Assignee | ||
Comment 47•9 years ago
|
||
Attachment #8718888 -
Flags: review?(jseward)
Updated•9 years ago
|
Attachment #8718888 -
Flags: review?(jseward) → review+
Comment 48•9 years ago
|
||
(In reply to Nicolas B. Pierron [:nbp] from comment #47)
> Created attachment 8718888 [details] [diff] [review]
> Ensure that BasicBlock::domIndex are properly initialized.
With this patch and the ones in comment 44 and 45 applied too, I no longer
see the (relevant) valgrind complaints. So the fix looks good to me.
Assignee | ||
Updated•9 years ago
|
Assignee: nobody → nicolas.b.pierron
Status: NEW → ASSIGNED
Comment 49•9 years ago
|
||
Comment on attachment 8718885 [details] [diff] [review]
ValueNumbering: Set the dominator index of fixup blocks when they are created.
Review of attachment 8718885 [details] [diff] [review]:
-----------------------------------------------------------------
::: js/src/jit/ValueNumbering.cpp
@@ +438,5 @@
>
> graph_.insertBlockBefore(block, fake);
> fake->setImmediateDominator(fake);
> fake->addNumDominated(1);
> + fake->setDomIndex(fake->id());
Do we need dominates queries to work correctly for the OSR fixup block? If so, we should perhaps set fake's domIndex field to block->domIndex()-1. If not, setting it to some bogus value, and making dominates or perhaps even domIndex() itself assert that it doesn't see the bogus value, would be safer.
Attachment #8718885 -
Flags: review?(sunfish)
Assignee | ||
Comment 50•9 years ago
|
||
(In reply to Dan Gohman [:sunfish] from comment #49)
> Comment on attachment 8718885 [details] [diff] [review]
> ValueNumbering: Set the dominator index of fixup blocks when they are
> created.
>
> Review of attachment 8718885 [details] [diff] [review]:
> -----------------------------------------------------------------
>
> ::: js/src/jit/ValueNumbering.cpp
> @@ +438,5 @@
> >
> > graph_.insertBlockBefore(block, fake);
> > fake->setImmediateDominator(fake);
> > fake->addNumDominated(1);
> > + fake->setDomIndex(fake->id());
>
> Do we need dominates queries to work correctly for the OSR fixup block? If
> so, we should perhaps set fake's domIndex field to block->domIndex()-1. If
> not, setting it to some bogus value, and making dominates or perhaps even
> domIndex() itself assert that it doesn't see the bogus value, would be safer.
Yes, we need the dominates function to work properly after adding the fixup block, and before the end of the graph traversal, in visitDominatorTree [1]. The reason being that when we remove an edge, but no leading blocks, we add a fixup block at the end of the graph that we have not yet finished to traverse.
Setting the domIndex to the id of the block sounds like the safest way from my point of view, which makes the domination info valid, as the fixup block is not supped to be dominating any other block as it is the predecessor of a join block with the OSR block.
[1] https://dxr.mozilla.org/mozilla-central/source/js/src/jit/ValueNumbering.cpp?from=ValueNumbering.cpp#997-998
Assignee | ||
Comment 51•9 years ago
|
||
Comment on attachment 8718885 [details] [diff] [review]
ValueNumbering: Set the dominator index of fixup blocks when they are created.
Review of attachment 8718885 [details] [diff] [review]:
-----------------------------------------------------------------
Note that the fake->id() of the basic block correspond to the last block added [1], which correspond to the number of blocks added so far.
Also note that with fake->addNumDominated(1), this block would effectively be the only block dominated by it-self, and no other blocks will dominate this block. Which is exactly what this fixup block is supposed to be.
[1] https://dxr.mozilla.org/mozilla-central/source/js/src/jit/MIRGraph.cpp?from=insertBlockBefore#182
Attachment #8718885 -
Flags: review?(sunfish)
Updated•9 years ago
|
Attachment #8718886 -
Flags: review?(sunfish) → review+
Updated•9 years ago
|
Attachment #8718887 -
Flags: review?(sunfish) → review+
Comment 52•9 years ago
|
||
Comment on attachment 8718885 [details] [diff] [review]
ValueNumbering: Set the dominator index of fixup blocks when they are created.
Review of attachment 8718885 [details] [diff] [review]:
-----------------------------------------------------------------
Makes sense. Thanks for the explanation!
Attachment #8718885 -
Flags: review?(sunfish) → review+
Assignee | ||
Comment 53•9 years ago
|
||
Comment on attachment 8718885 [details] [diff] [review]
ValueNumbering: Set the dominator index of fixup blocks when they are created.
[Security approval request comment]
> How easily could an exploit be constructed based on the patch?
The patch highlight a non-initialized path, but making something go in this path is hard but not impossible. Then, this issue might cause GVN or any later phases to rerun on one single block (numdominated == 1), thus if somebody manage to reach this far, exploiting this issue still hard.
> Do comments in the patch, the check-in comment, or tests included in the patch paint a bulls-eye on the security problem?
No.
> Which older supported branches are affected by this flaw?
All.
> Do you have backports for the affected branches? If not, how different, hard to create, and risky will they be?
We need to make sure that the context is the same:
fake->addNumDominated(1);
Otherwise this patch should be trivial to backport.
> How likely is this patch to cause regressions; how much testing does it need?
This should not cause any regressions, as we initialize a field which used to remain uninitialized, and the initialization should be safe, as described in comment 50.
Attachment #8718885 -
Flags: sec-approval?
Comment 54•9 years ago
|
||
sec-approval+ for trunk.
We're going to need patches made and nominated for ESR38, Aurora, and Beta so they can land following it landing on trunk.
status-firefox44:
--- → wontfix
status-firefox45:
--- → affected
status-firefox46:
--- → affected
status-firefox47:
--- → affected
status-firefox-esr38:
--- → affected
status-firefox-esr45:
--- → affected
tracking-firefox45:
--- → +
tracking-firefox46:
--- → +
tracking-firefox47:
--- → +
tracking-firefox-esr38:
--- → 45+
Comment 55•9 years ago
|
||
Comment on attachment 8718885 [details] [diff] [review]
ValueNumbering: Set the dominator index of fixup blocks when they are created.
Note to Release Management: the fix here is pretty tiny.
Attachment #8718885 -
Flags: sec-approval? → sec-approval+
Assignee | ||
Comment 56•9 years ago
|
||
(In reply to Al Billings [:abillings] from comment #54)
> sec-approval+ for trunk.
>
> We're going to need patches made and nominated for ESR38, Aurora, and Beta
> so they can land following it landing on trunk.
I verified, the same patch applies cleanly on all branches. And the context line needed to ensure that this patch works correctly (as mentioned in comment 53) got added in Bug 1029830 (Sep 2014).
I will land this patch on trunk.
Assignee | ||
Comment 57•9 years ago
|
||
https://hg.mozilla.org/integration/mozilla-inbound/rev/47952b802aab
(ni? my-self for landing the rest of the patches in a few months.)
Flags: needinfo?(nicolas.b.pierron)
Assignee | ||
Comment 58•9 years ago
|
||
Comment on attachment 8718885 [details] [diff] [review]
ValueNumbering: Set the dominator index of fixup blocks when they are created.
Approval Request Comment
[Feature/regressing bug #]:
Bug 1029830
[User impact if declined]:
Keep a few crashes found by bug hunter.
[Describe test coverage new/current, TreeHerder]:
Just landed on inbound, no tests landed with it, but verified locally with the test case attached on this bug.
[Risks and why]:
Low.
[String/UUID change made/needed]:
N/A
Attachment #8718885 -
Flags: approval-mozilla-esr38?
Attachment #8718885 -
Flags: approval-mozilla-beta?
Attachment #8718885 -
Flags: approval-mozilla-aurora?
Comment 59•9 years ago
|
||
Status: ASSIGNED → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla47
Assignee | ||
Comment 60•9 years ago
|
||
(clear ni? to avoid release-management spam. Todo: make a bot to interpret the whiteboard)
Flags: needinfo?(nicolas.b.pierron)
Whiteboard: [after:2016-06-01;ni?:nbp]
Updated•9 years ago
|
Group: javascript-core-security → core-security-release
Comment on attachment 8718885 [details] [diff] [review]
ValueNumbering: Set the dominator index of fixup blocks when they are created.
Sec-high issues meet the uplift bar for esr38.7, beta45 and aurora46. Let's land it in all three branches.
Attachment #8718885 -
Flags: approval-mozilla-esr38?
Attachment #8718885 -
Flags: approval-mozilla-esr38+
Attachment #8718885 -
Flags: approval-mozilla-beta?
Attachment #8718885 -
Flags: approval-mozilla-beta+
Attachment #8718885 -
Flags: approval-mozilla-aurora?
Attachment #8718885 -
Flags: approval-mozilla-aurora+
Comment 62•9 years ago
|
||
Comment 63•9 years ago
|
||
Updated•9 years ago
|
Whiteboard: [after:2016-06-01;ni?:nbp] → [after:2016-06-01;ni?:nbp][adv-main45+][adv-esr38.7+]
Updated•9 years ago
|
Comment 64•9 years ago
|
||
I managed reproduce the initial issue on Firefox 45.0a1 (2015-11-04) under Windows 10 64-bit.
Verified fixed on Firefox 47.0a1 (2016-02-01/02) and Firefox 38.6.1esrpre tinderbox-build (20160229210832) across all platforms.
I am still able to reproduce the initial crash on 46.0a2 (2016-03-02), Firefox 45 RC (20160301003640) under Ubuntu 12.04 32-bit and Mac OS X 10.11:
[1] Ubuntu 12.04 32-bit:
bp-9aec4da2-b2d4-43f2-a4ef-bfd5a2160302
bp-30a36bc4-fef2-44a7-8658-4dd712160302
bp-249f4541-20d8-427a-ae55-25cc42160302
[2] Mac OS X 10.11:
bp-320b4273-e7ab-499c-b6c5-d4c412160302
bp-e6621954-8c37-4ba3-b126-3a2a72160302
I’ve got another crash report while testing on Firefox 45.0 ESR (20160301172432) under these 2 platforms (Ubuntu and Mac):
[1] Ubuntu: bp-8a0386bf-1fdb-4a25-a874-fafcc2160302 and bp-6f34d186-aa3a-4c26-9ff7-a991c2160302
[2] Mac: bp-7f330050-5221-426d-8d83-3c6032160302
Should I file a separate bug for this crash report?
I’ve encountered a different crash report while testing on Windows 10 64-bit using Firefox 46.0a2 (2016-03-02), Firefox 45 RC (20160301003640) and Firefox 45.0 ESR (20160301172432):
bp-d10cdb7e-9593-49fe-82e1-7e29d2160302
bp-8e991445-90db-48a2-87df-a8c272160302
bp-3a1c47a7-08d9-40b0-a23e-87d582160302
Is this crash tracked or should I file a new issues for it?
Nicolas, any thoughts about all these crashes ?
Flags: needinfo?(nicolas.b.pierron)
Assignee | ||
Comment 65•9 years ago
|
||
(In reply to Vasilica Mihasca, QA [:vasilica_mihasca] from comment #64)
> I managed reproduce the initial issue on Firefox 45.0a1 (2015-11-04) under
> Windows 10 64-bit.
> Verified fixed on Firefox 47.0a1 (2016-02-01/02) and Firefox 38.6.1esrpre
> tinderbox-build (20160229210832) across all platforms.
>
> I am still able to reproduce the initial crash on 46.0a2 (2016-03-02),
> Firefox 45 RC (20160301003640) under Ubuntu 12.04 32-bit and Mac OS X 10.11:
> [1] Ubuntu 12.04 32-bit:
> bp-9aec4da2-b2d4-43f2-a4ef-bfd5a2160302
> bp-30a36bc4-fef2-44a7-8658-4dd712160302
> bp-249f4541-20d8-427a-ae55-25cc42160302
> [2] Mac OS X 10.11:
> bp-320b4273-e7ab-499c-b6c5-d4c412160302
> bp-e6621954-8c37-4ba3-b126-3a2a72160302
The only reason I could think of which might explain why this issue is still present on beta / aurora but not in nightly would be because of Bug 1238592, which did not got backported because of the low volume of crashes (Bug 1238592 comment 30).
> I’ve got another crash report while testing on Firefox 45.0 ESR
> (20160301172432) under these 2 platforms (Ubuntu and Mac):
> [1] Ubuntu: bp-8a0386bf-1fdb-4a25-a874-fafcc2160302 and
> bp-6f34d186-aa3a-4c26-9ff7-a991c2160302
> [2] Mac: bp-7f330050-5221-426d-8d83-3c6032160302
>
> Should I file a separate bug for this crash report?
These are probably the same issue, but we should probably file an issue because of the lack of symbols for these builds.
> I’ve encountered a different crash report while testing on Windows 10 64-bit
> using Firefox 46.0a2 (2016-03-02), Firefox 45 RC (20160301003640) and
> Firefox 45.0 ESR (20160301172432):
> bp-d10cdb7e-9593-49fe-82e1-7e29d2160302
> bp-8e991445-90db-48a2-87df-a8c272160302
> bp-3a1c47a7-08d9-40b0-a23e-87d582160302
>
> Is this crash tracked or should I file a new issues for it?
No, these are the same as before, except that the stack is polluted and we might not be able to walk the stack properly.
> Nicolas, any thoughts about all these crashes ?
We should probably mark this bug as depending on Bug 1238592.
I guess the next question, is what is the volume of this signature, and would it justify to backport Bug 1238592?
Flags: needinfo?(sledru)
Assignee | ||
Updated•9 years ago
|
Flags: needinfo?(nicolas.b.pierron)
Comment 66•9 years ago
|
||
Nicolas, do you know how many crashes this is causing? merci
Flags: needinfo?(sledru) → needinfo?(nicolas.b.pierron)
Assignee | ||
Comment 67•9 years ago
|
||
(In reply to Sylvestre Ledru [:sylvestre] from comment #66)
> Nicolas, do you know how many crashes this is causing? merci
Crash-stat seems to show that we have about 600 crashes per month with this signature. [1] Which does not appear in the top300 crashers of 44.0.2.
What I know is that the first bugs which might have been related to these issues started to appear in November, but the issue was present in the code base for more than a year. (Bug 1029830)
The code which highlighted the previous issue came from an old version of JQuery (Bug 1238592 comment 8), but I have no idea what caused the issue to show up only recently.
I currently have a fix for Bug 1029830, in Bug 1252034, so I would not "yet" recommend to backport Bug 1029830 & Bug 1252034 until the code matured in other channels.
The crash in appendResumePoint is a null-deref and this should not be a security issue. In the cases as rare as the null-deref we do not have this null-deref issue, the generated code is supposed to be unreachable, and thus only exploitable if an attacker can get the control over the pc.
[1] https://crash-stats.mozilla.com/report/list?product=Firefox&range_unit=days&range_value=28&signature=js%3A%3Ajit%3A%3ALRecoverInfo%3A%3AappendResumePoint
Flags: needinfo?(nicolas.b.pierron)
Comment 68•9 years ago
|
||
I managed to reproduce this issue on Firefox 46.0a2 (2016-03-07)
bp-9bb4d4a1-8310-4e1a-98bc-051812160308 and Firefox 45 RC bp-10122308-3f10-4f40-b2bb-d1d2b2160308. Based on these crashes I am updating the status flags for Firefox 45 and Firefox 46 in affected.
(In reply to Nicolas B. Pierron [:nbp] from comment #65)
> > I’ve got another crash report while testing on Firefox 45.0 ESR
> > (20160301172432) under these 2 platforms (Ubuntu and Mac):
> > [1] Ubuntu: bp-8a0386bf-1fdb-4a25-a874-fafcc2160302 and
> > bp-6f34d186-aa3a-4c26-9ff7-a991c2160302
> > [2] Mac: bp-7f330050-5221-426d-8d83-3c6032160302
> >
> > Should I file a separate bug for this crash report?
>
> These are probably the same issue, but we should probably file an issue
> because of the lack of symbols for these builds.
I was no longer able to reproduce the above mentioned crash on Firefox 45 ESR (20160304113541) but I encountered the crash for which was reported this bug: bp-1fc1d2da-d4cf-4a05-af72-eec482160308 , bp-9e7f8ebc-0b35-4ddf-97a4-3be992160308 . In order to reflect this information, I am marking Firefox 45 ESR as affected as well.
> We should probably mark this bug as depending on Bug 1238592.
All these versions will be affected until Bug 1238592 will be fixed.
Verified again on Firefox 47.0a1 (2016-03-06) and Firefox 38.7.0 ESR (20160302171452) and this bug seems to be fixed across these versions.
Status: RESOLVED → VERIFIED
Depends on: 1238592
Comment 69•9 years ago
|
||
Too late for 45...
Comment 72•9 years ago
|
||
Yes we could still take the fix in bug 1238592. I commented there as well.
Comment 73•9 years ago
|
||
(In reply to Nicolas B. Pierron [:nbp] from comment #70)
> And for 46?
this landed already on beta and aurora, so is there anything left to do for us sheriffs to uplift ?
Flags: needinfo?(lhenry)
Comment 74•9 years ago
|
||
I think the affected flag just didn't get set to fixed for 46. Thanks Tomcat!
Flags: needinfo?(lhenry)
Comment 75•9 years ago
|
||
(In reply to Liz Henry (:lizzard) (needinfo? me) from comment #74)
> I think the affected flag just didn't get set to fixed for 46. Thanks Tomcat!
I updated the status flag for Firefox 46 based on my testing (See Comment 64, Comment 65 and Comment 68 ). This issue is still reproducible on Firefox 46 beta 2 (20160316065941) bp-c3291541-093d-4201-a398-23c062160317
Comment 76•9 years ago
|
||
The patches in this bug have landed on 45/46, but Vasilica can still reproduce because (we think) bug 1238592 is also required and has not landed on those branches.
tracking-firefox-esr45:
--- → 46+
Comment 77•9 years ago
|
||
Comment on attachment 8718885 [details] [diff] [review]
ValueNumbering: Set the dominator index of fixup blocks when they are created.
sec-high, taking it in esr45
Attachment #8718885 -
Flags: approval-mozilla-esr45+
Comment 78•9 years ago
|
||
Comment on attachment 8718885 [details] [diff] [review]
ValueNumbering: Set the dominator index of fixup blocks when they are created.
Tomcat told me it already landed in esr ( http://hg.mozilla.org/releases/mozilla-esr45/rev/ef3f3f8ed9bd ) oups.
Attachment #8718885 -
Flags: approval-mozilla-esr45+ → approval-mozilla-esr45-
Updated•9 years ago
|
Comment 79•9 years ago
|
||
(In reply to Daniel Veditz [:dveditz] from comment #76)
> The patches in this bug have landed on 45/46, but Vasilica can still
> reproduce because (we think) bug 1238592 is also required and has not landed
> on those branches.
Bug 1238592 landed now on esr45 -> https://hg.mozilla.org/releases/mozilla-esr45/rev/fdcfad0a693c
Comment 80•9 years ago
|
||
Since this work landed and had a positive effect we should keep it marked "fixed". Opening a new bug would have been better to keep track of any remaining open issues. We are trying out landing the work from Bug 1238592 on m-r and m-b for today's 46.0 RC build.
Updated•9 years ago
|
Whiteboard: [after:2016-06-01;ni?:nbp][adv-main45+][adv-esr38.7+] → [after:2016-06-01;ni?:nbp][adv-main46-][adv-esr38.7+][adv-esr45.1-]
Comment 81•8 years ago
|
||
Crash volume for signature 'js::jit::LRecoverInfo::appendResumePoint':
- nightly (version 50): 0 crash from 2016-06-06.
- aurora (version 49): 4 crashes from 2016-06-07.
- beta (version 48): 54 crashes from 2016-06-06.
- release (version 47): 86 crashes from 2016-05-31.
- esr (version 45): 22 crashes from 2016-04-07.
Crash volume on the last weeks:
Week N-1 Week N-2 Week N-3 Week N-4 Week N-5 Week N-6 Week N-7
- nightly 0 0 0 0 0 0 0
- aurora 1 2 0 0 0 1 0
- beta 18 4 8 9 4 7 0
- release 9 11 8 14 13 18 5
- esr 3 5 0 6 1 1 2
Affected platforms: Windows, Mac OS X
status-firefox48:
--- → affected
status-firefox49:
--- → affected
Comment 82•8 years ago
|
||
Crash volume for signature 'js::jit::LRecoverInfo::appendResumePoint':
- nightly(version 50):1 crash from 2016-06-06.
- aurora (version 49):5 crashes from 2016-06-07.
- beta (version 48):62 crashes from 2016-06-06.
- release(version 47):96 crashes from 2016-05-31.
- esr (version 45):23 crashes from 2016-04-07.
Crash volume on the last weeks:
W. N-1 W. N-2 W. N-3 W. N-4 W. N-5 W. N-6 W. N-7
- nightly 1 0 0 0 0 0 0
- aurora 1 1 2 0 0 0 1
- beta 10 18 5 8 9 4 7
- release 16 9 13 8 14 13 18
- esr 4 3 5 0 6 1 1
Affected platforms: Windows, Mac OS X
status-firefox50:
--- → affected
Updated•8 years ago
|
Group: core-security-release
You need to log in
before you can comment on or make changes to this bug.
Description
•