Closed Bug 794471 Opened 13 years ago Closed 13 years ago

Heap-use-after-free in mozilla::image::nsPNGDecoder::row_callback during WebGL conformance suite

Categories

(Core :: Graphics, defect)

18 Branch
x86_64
All
defect
Not set
normal

Tracking

()

RESOLVED WORKSFORME
Tracking Status
firefox18 + unaffected
firefox19 + unaffected
firefox20 + unaffected
firefox-esr17 --- wontfix

People

(Reporter: inferno, Assigned: joe)

Details

(5 keywords, Whiteboard: [asan] Affects Aurora still?)

Attachments

(3 files)

Attached file Testcase
Reproduces on trunk. Run the testcase as webgl_conformance\conformance\more\functions\test.html [Need to download WebGL Conformance Test Suite]

=================================================================
==29891== ERROR: AddressSanitizer heap-use-after-free on address 0x7fd8bb115c80 at pc 0x7fd8c69dbc78 bp 0x7fff49cfac90 sp 0x7fff49cfac88
WRITE of size 4 at 0x7fd8bb115c80 thread T0
    #0 0x7fd8c69dbc77 in mozilla::image::nsPNGDecoder::row_callback(png_struct_def*, unsigned char*, unsigned int, int) image/decoders/nsPNGDecoder.cpp:761
    #1 0x7fd8c9ff2346 in MOZ_PNG_push_have_row media/libpng/pngpread.c:1460
    #2 0x7fd8c9ff18e4 in MOZ_PNG_proc_IDAT_data media/libpng/pngpread.c:1131
0x7fd8bb115c80 is located 224256 bytes inside of 262144-byte region [0x7fd8bb0df080,0x7fd8bb11f080)
freed by thread T0 here:
    #0 0x433760 in free
    #1 0x7fd8c99c2cbd in ~gfxImageSurface gfx/thebes/gfxImageSurface.cpp:150
previously allocated by thread T0 here:
    #0 0x433a9a in posix_memalign
    #1 0x7fd8c99c249b in TryAllocAlignedBytes(unsigned long) gfx/thebes/gfxImageSurface.cpp:89
    #2 0x7fd8c698e470 in imgFrame::Init(int, int, int, int, gfxASurface::gfxImageFormat, unsigned char) image/src/imgFrame.cpp:192
    #3 0x7fd8c6979461 in mozilla::image::RasterImage::InternalAddFrame(unsigned int, int, int, int, int, gfxASurface::gfxImageFormat, unsigned char, unsigned char**, unsigned int*, unsigned int**, unsigned int*) image/src/RasterImage.cpp:1006
    #4 0x7fd8c697a207 in mozilla::image::RasterImage::EnsureFrame(unsigned int, int, int, int, int, gfxASurface::gfxImageFormat, unsigned char, unsigned char**, unsigned int*, unsigned int**, unsigned int*) image/src/RasterImage.cpp:1128
    #5 0x7fd8c697a6bf in mozilla::image::RasterImage::EnsureFrame(unsigned int, int, int, int, int, gfxASurface::gfxImageFormat, unsigned char**, unsigned int*) image/src/RasterImage.cpp:1173
    #6 0x7fd8c69da9d0 in mozilla::image::nsPNGDecoder::info_callback(png_struct_def*, png_info_def*) image/decoders/nsPNGDecoder.cpp:618
    #7 0x7fd8c9fefa3c in MOZ_PNG_push_have_info media/libpng/pngpread.c:1446
    #8 0x7fd8c9fedc05 in MOZ_PNG_proc_some_data media/libpng/pngpread.c:121
Shadow byte and word:
  0x1ffb17622b90: fd
  0x1ffb17622b90: fd fd fd fd fd fd fd fd
More shadow bytes:
  0x1ffb17622b70: fd fd fd fd fd fd fd fd
  0x1ffb17622b78: fd fd fd fd fd fd fd fd
  0x1ffb17622b80: fd fd fd fd fd fd fd fd
  0x1ffb17622b88: fd fd fd fd fd fd fd fd
=>0x1ffb17622b90: fd fd fd fd fd fd fd fd
  0x1ffb17622b98: fd fd fd fd fd fd fd fd
  0x1ffb17622ba0: fd fd fd fd fd fd fd fd
  0x1ffb17622ba8: fd fd fd fd fd fd fd fd
  0x1ffb17622bb0: fd fd fd fd fd fd fd fd
Stats: 236M malloced (239M for red zones) by 305903 calls
Stats: 32M realloced by 16369 calls
Stats: 194M freed by 159136 calls
Stats: 60M really freed by 70955 calls
Stats: 448M (114774 full pages) mmaped in 112 calls
  mmaps by size class: 8:196596; 9:40955; 10:12285; 11:12282; 12:3072; 13:2048; 14:1280; 15:256; 16:448; 17:1280; 18:240; 19:40; 20:20;
  mallocs by size class: 8:234746; 9:39731; 10:10208; 11:12486; 12:2555; 13:2077; 14:1511; 15:316; 16:636; 17:1345; 18:232; 19:42; 20:18;
  frees by size class: 8:109994; 9:27174; 10:5809; 11:9217; 12:1609; 13:1762; 14:1253; 15:257; 16:570; 17:1295; 18:139; 19:41; 20:16;
  rfrees by size class: 8:52331; 9:8931; 10:1749; 11:5560; 12:542; 13:496; 14:503; 15:109; 16:302; 17:419; 18:7; 19:5; 20:1;
Stats: malloc large: 1637 small slow: 1758
==29891== ABORTING
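The offset and region size quoted in the ASan report can be cross-checked with a few lines of Python. This is only a sanity check on the numbers in the report; the variable names are made up for illustration and are not part of any tool.

```python
# Addresses copied from the ASan report in this comment.
fault_addr = 0x7fd8bb115c80    # target of the 4-byte WRITE
region_start = 0x7fd8bb0df080  # start of the freed region
region_end = 0x7fd8bb11f080    # one past the end of the freed region

offset = fault_addr - region_start
region_size = region_end - region_start

print(offset)       # 224256
print(region_size)  # 262144

# The write lands strictly inside the already-freed buffer.
assert 0 <= offset < region_size
```

Both values agree with the "224256 bytes inside of 262144-byte region" line, so the write from row_callback falls within the buffer that ~gfxImageSurface released.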
Whiteboard: [asan]
Component: General → Graphics
Product: Firefox → Core
Summary: Heap-use-after-free in mozilla::image::nsPNGDecoder::row_callback → Heap-use-after-free in mozilla::image::nsPNGDecoder::row_callback during WebGL conformance suite
Over to Joe.
Assignee: nobody → joe
It is of some concern that this bug is discoverable since it happens during a conformance test. Joe any ideas?
I can't reproduce this bug on current trunk. Is a particular version of the WebGL conformance test suite needed? A standalone test might help.
Flags: needinfo?(inferno)
(In reply to Joe Drew (:JOEDREW! \o/) from comment #4)
> I can't reproduce this bug on current trunk. Is a particular version of the
> WebGL conformance test suite needed? A standalone test might help.

So, it does not reproduce for me anymore on trunk. Something has either fixed it or caused it to stop reproducing. You might want to try with an older build from https://people.mozilla.com/~choller/firefox/asan/ and see when it stopped reproducing. Btw, I am using this version of the WebGL conformance suite: Version 1.0.2 (beta), March 20, 2012.
Flags: needinfo?(inferno)
I can't even reproduce this in a build from 1 Aug 2012. No idea what's going on here.
Status: NEW → RESOLVED
Closed: 13 years ago
Resolution: --- → WORKSFORME
Joe: you're using an ASan build to test?
Flags: needinfo?(joe)
Some notes

1. You need a script to run many Firefox instances in parallel under a memory debugging tool like ASan to reproduce reliably. Using this script, I find that more than one instance crashes. Use prefs.js from https://blog.mozilla.org/security/2012/06/20/7-tips-for-fuzzing-firefox-more-effectively/.

    # Python script
    import os
    import time

    thread_num = 14  # number of Firefox instances to launch
    for i in range(0, thread_num):
        # Give each instance its own profile so they can run side by side.
        profile_dir = '/tmp/firefox_profile_%d' % i
        os.system('mkdir -p %s' % profile_dir)
        os.system('cp ~/prefs.js %s/' % profile_dir)
        # Start the instance in the background and keep its output in /tmp/<i>.log.
        cmd = 'path-to-firefox/objdir-ff-asan/dist/bin/firefox-bin -no-remote -profile "%s" "path-to-tests/webgl_conformance/conformance/more/functions/test.html" 2>&1 | tee /tmp/%d.log &' % (profile_dir, i)
        os.system(cmd)
        time.sleep(0.3)

2. Make sure that running a single instance of Firefox with the testcase does not cause the error "No webgl context found".
3. I am able to reproduce on the Sep 6 build a52b3e3632d5.
(In reply to Daniel Veditz [:dveditz] from comment #7)
> Joe: you're using an ASan build to test?

Yep.
Flags: needinfo?(joe)
So, this bug does affect Aurora. It successfully reproduces on https://people.mozilla.com/~choller/firefox/asan/20121024-mozilla-aurora-linux64-opt-c49b35c67d4a+asan.html. Please note that I forgot to mention in comment #0 that you need to run a lot of parallel instances, since this bug looks to be timing dependent. When I use the script mentioned in c#8, I am always able to reproduce in at least one Firefox instance.
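Since the launcher script in c#8 tees each instance's output to /tmp/<i>.log, the instance that hit the bug can be found by scanning those logs for the ASan error banner. A minimal sketch, assuming that log layout; the helper name crashed_logs is hypothetical and not part of any existing tool.

```python
import glob
import os

# Banner text that ASan prints at the top of an error report.
ASAN_MARKER = "ERROR: AddressSanitizer"

def crashed_logs(log_dir="/tmp"):
    """Return sorted paths of <N>.log files that contain an ASan error."""
    hits = []
    # "[0-9]*.log" matches 0.log, 1.log, ..., 13.log as written by the launcher.
    for path in sorted(glob.glob(os.path.join(log_dir, "[0-9]*.log"))):
        with open(path, errors="replace") as handle:
            if any(ASAN_MARKER in line for line in handle):
                hits.append(path)
    return hits
```

Calling crashed_logs() after the instances have run lists the logs worth opening; an empty list means no instance reported an ASan error in that run.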
Status: RESOLVED → REOPENED
Resolution: WORKSFORME → ---
Whiteboard: [asan] → [asan] Affects Aurora still?
Version: Trunk → 18 Branch
Dan is the whiteboard question still valid? Joe any traction?
Flags: needinfo?(dveditz)
Lowering severity for now because of the timing (memory?) conditions required, but there may be other ways to make this issue reproduce more reliably. Christoph: can you try this test in a TSAN build and see if it is in fact a timing issue?
Flags: needinfo?(dveditz) → needinfo?(cdiehl)
Keywords: sec-critical → sec-high
Some timing related issues are just due to GC scheduling, and thus not really thread-related. You could imagine there being some kind of image cache getting purged on a timer, or some such thing.
Both the free and the crash are on the same thread, so I don't think TSAN will help here. Anyway, timurrr@chromium.org is a good contact if you have any TSAN related questions.

WRITE of size 4 at 0x7fd8bb115c80 thread T0
freed by thread T0
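The same-thread observation can be checked mechanically by pulling the thread IDs out of the report. The following is a rough sketch that assumes the report wording shown in comment 0; the function name threads_in_report is invented for illustration.

```python
import re

# Excerpt of the ASan report from comment 0 (stack frames omitted).
report = """\
WRITE of size 4 at 0x7fd8bb115c80 thread T0
freed by thread T0 here:
previously allocated by thread T0 here:
"""

def threads_in_report(text):
    """Return the thread IDs for the bad access, the free, and the allocation."""
    patterns = {
        "access": r"^(?:READ|WRITE) of size \d+ at 0x[0-9a-f]+ thread (T\d+)",
        "free": r"^freed by thread (T\d+) here:",
        "alloc": r"^previously allocated by thread (T\d+) here:",
    }
    found = {}
    for name, pattern in patterns.items():
        match = re.search(pattern, text, re.MULTILINE)
        found[name] = match.group(1) if match else None
    return found

threads = threads_in_report(report)
# True only if all three events were found and share one thread ID.
same_thread = None not in threads.values() and len(set(threads.values())) == 1
print(threads, same_thread)
```

When all three events share one thread, as here, a data-race detector has little to report, which fits the suggestion that the trigger is scheduling of work on the main thread rather than cross-thread access.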
Tested successfully with changeset: 104223:a52b3e3632d5 on Linux 64bit/ASan with the provided testcase. I didn't even need to put it into the conformance test suite folder.
Flags: needinfo?(cdiehl) needinfo?(cdiehl) → needinfo?
The script for running multiple instances of Firefox is also not needed.
Minimized testcase which crashes with a use-after-free but with a completely different callstack. The DOM snippet is still the same; I only removed the WebGL and <<< tokens. In this crash we have multiple threads. This bug is also no longer reproducible on trunk, but it is with rev 104223.
Christoph, i think you are hitting the startup bug - https://bugzilla.mozilla.org/show_bug.cgi?id=737987. This bug is easily visible when you see nsXPCWrappedJS::Release in the stack. My bug and that startup bug are different.
decoder: can you please take a look at this?
Have tried it now with 4/8/12/14 and 18 instances - no crash here with 104223:a52b3e3632d5. This has been tested inside a 64-bit Linux VM with 7 GiB memory. If I run more instances, ASan produces an OOM crash. Abhishek: Are you sure the changeset is correct?
(In reply to Christoph Diehl [:cdiehl] from comment #21)
> Have tried it now with 4/8/12/14 and 18 instances - no crash here with
> 104223:a52b3e3632d5. This has been tested inside a Linux VM 64-bit with 7
> GiB memory. If I run more instances then ASan will produces an OOM crash.
>
> Abhishek: Are you sure the changeset it correct?

Yes, I remember explicitly picking a52b3e3632d5 and doing hg update with it. Also, https://people.mozilla.com/~choller/firefox/asan/20121024-mozilla-aurora-linux64-opt-c49b35c67d4a+asan.html was something I tested with (and where it reproduced), and that build is now gone :(

I just tested with the latest Aurora, https://people.mozilla.com/~choller/firefox/asan/20121108-mozilla-aurora-linux64-opt-8ae22fe748fc+asan.html, and there it does not seem to reproduce; I tried multiple times. There might be an interesting merge that you guys did that might have fixed it on Aurora as well. I can retry building a52b3e3632d5 over the weekend if needed.

It would be great if you guys could archive all the builds like commondatastorage.googleapis.com/chromium-browser-asan/index.html. A lot of time can be saved, since building is a pain; verifying isn't.
Flags: sec-bounty?
What's the state of this on 18 beta?
Is there still anything needed from me here?
(In reply to Robert Kaiser (:kairo@mozilla.com) from comment #23)
> What's the state of this on 18 beta?

Looks like we're still waiting for a patch.
(In reply to Al Billings [:abillings] from comment #25)
> (In reply to Robert Kaiser (:kairo@mozilla.com) from comment #23)
> > What's the state of this on 18 beta?
>
> Looks like we're still waiting for a patch.

Or perhaps someone who can still reproduce the problem (according to the reporter in comment 22 we may have fixed it). Is this bug headed for WFM?
Please re-open the bug and change the status flags if anybody is able to repro.
Status: REOPENED → RESOLVED
Closed: 13 years ago → 13 years ago
Resolution: --- → WORKSFORME
Flags: sec-bounty? → sec-bounty-
Group: core-security → core-security-release
Group: core-security-release