Closed
Bug 1277397
Opened 9 years ago
Closed 2 years ago
High memory usage triggered by a jpg
Categories
(Core :: Graphics: ImageLib, defect, P3)
Core
Graphics: ImageLib
Tracking
()
RESOLVED
FIXED
113 Branch
Tracking | Status | |
---|---|---|
firefox-esr52 | --- | wontfix |
firefox-esr78 | --- | wontfix |
firefox-esr91 | --- | wontfix |
firefox-esr102 | --- | wontfix |
firefox56 | --- | wontfix |
firefox57 | --- | wontfix |
firefox58 | --- | wontfix |
firefox59 | --- | wontfix |
firefox89 | --- | wontfix |
firefox90 | --- | wontfix |
firefox91 | --- | wontfix |
firefox111 | --- | wontfix |
firefox112 | --- | wontfix |
firefox113 | --- | fixed |
People
(Reporter: tsmith, Assigned: tnikkel)
References
(Blocks 2 open bugs)
Details
(Keywords: crash, csectype-oom, testcase, Whiteboard: [MemShrink:P2][gfx-noted])
Attachments
(3 files)
This test case triggers a large amount of memory to be consumed and then the browser process continues to grow until the system OOM killer terminates the process. I've seen it go up to ~10GB before the process was killed.
Updated•9 years ago
|
Whiteboard: [MemShrink]
Assignee | ||
Comment 1•9 years ago
|
||
The size indicated in the SOF (start of frame) chunk is 60000 x 60000 pixels. So we probably try to allocate a surface that size.
Updated•9 years ago
|
Flags: needinfo?(seth)
Whiteboard: [MemShrink] → [MemShrink][gfx-noted]
Comment 2•9 years ago
|
||
(In reply to Timothy Nikkel (:tnikkel) from comment #1)
> The size indicated in the SOF (start of frame) chunk is 60000 x 60000
> pixels. So we probably try to allocate a surface that size.
Yep, that'd require at least 14GB. Given Linux's overcommit behavior, we probably succeed in allocating that much and then trigger an OOM later as the decoder writes data into the surface.
However, what I'd like to know is, why are we even *attempting* to allocate a surface that larger? |image.mem.surfacecache.max_size_kb|'s default value limits us to 1GB of memory for images at the most. Decoding should've failed.
Flags: needinfo?(seth)
Comment 3•9 years ago
|
||
Yep, that's what happens on my system. We're able to display it when zoomed out because of downscale-during-decode, but decoding fails when zoomed in. We never even attempt to allocate a 60000x60000 surface.
Tyson, can you reproduce this in safe mode? Have you changed any image-related prefs in about:config? Can you verify that you're on Linux? (I assumed you were because you mentioned the OOM killer.)
Flags: needinfo?(twsmith)
Comment 4•9 years ago
|
||
Sounds like we need the SurfaceCache limit in bug 1281725 enforced for JPEG too. It may be good to put that logic somewhere in ImgLib that's shared across all formats.
Whiteboard: [MemShrink][gfx-noted] → [MemShrink:P2][gfx-noted]
Comment 5•9 years ago
|
||
(In reply to Jet Villegas (:jet) from comment #4)
> Sounds like we need the SurfaceCache limit in bug 1281725 enforced for JPEG
> too. It may be good to put that logic somewhere in ImgLib that's shared
> across all formats.
In this case my guess is that libjpeg-turbo itself is allocating the memory, so we can't trivially share the fix. libpng's deinterlacing system is also specific to libpng, I'm afraid.
Reporter | ||
Comment 6•9 years ago
|
||
(In reply to Seth Fowler [:seth] [:s2h] from comment #3)
> Tyson, can you reproduce this in safe mode?
Yes, I can repro in safe mode.
> Have you changed any image-related prefs in about:config?
Nope, I can repro with a fresh profile.
> Can you verify that you're on Linux? (I assumed you were because you mentioned the OOM killer.)
Yes I was running Linux.
Flags: needinfo?(twsmith)
Updated•8 years ago
|
Priority: -- → P3
Reporter | ||
Comment 7•7 years ago
|
||
Our fuzzers are still hitting this issue.
status-firefox57:
--- → affected
Updated•7 years ago
|
Has Regression Range: --- → no
status-firefox56:
--- → wontfix
status-firefox58:
--- → fix-optional
status-firefox-esr52:
--- → wontfix
Comment 8•7 years ago
|
||
Hi,
I've been poking around with this bug for the last few days, mostly been trying to find out where that huge amount of memory is being allocated.
Bare in mind I've never touched ImgLib before.
As far as I can tell, the memory gets allocated well before the JPEG gets decoded. Presumably something is reading the JPEG size (from the header, maybe?) and allocating some buffer somewhere.
I've been searching through nsJPEGDecoder->DecoderFactory->RasterImage->ImageFactory->imgRequest trying to find something that allocates memory, reads image size or anything relevant.
I checked out (via a lot of reading, and a lot of break-point setting) everything that looked like it might allocate the memory, or return the information used to allocate the memory, but not found anything yet.
No matter where I set a breakpoint the memory has already been allocated.
Safe to say the memory allocation happens way before libjpeg gets involved.
I feel like I'm looking in the wrong place here, any pointers or ideas of where to look would be useful.
Assignee | ||
Comment 9•7 years ago
|
||
If you break in nsJPEGDecoder::ReadJPEGData I would think that no frames or buffers would have been allocated by the first call of that function.
Comment 10•7 years ago
|
||
Nope.
`watch -n 1 'free -h'` still shows memory usage rising up even when breaking on entry to nsJPEGDecoder::ReadJPEGData.
The memory keeps rising , with gdb just sitting on the entry to the method, until it fills all the RAM and swap and gets purged by OOM killer.
Assignee | ||
Comment 11•7 years ago
|
||
Oh, are you attached to the parent process? The image decoding will happen in the child process, so you won't hit the breakpoint for the correct jpeg decoder. You can either attach to the child process or flip the pref browser.tabs.remote.autostart to go into single process mode.
Comment hidden (mozreview-request) |
Comment 13•7 years ago
|
||
Thanks for the tip.
It might be useful to put that tip in the MDN documentation somewhere, any thoughts on where that might live?
comment 5 is correct, it is libjpeg-turbo allocating the memory on the call to jpeg_consume_input.
Here: https://dxr.mozilla.org/mozilla-central/source/image/decoders/nsJPEGDecoder.cpp#474
I've added a check in the readJPEG method to check if decoding the image will allocate more memory than SurfaceCache::MaximumCapacity() similar to the check in nsPNGDecoder.
The patch compiles for me on Linux and successfully stops the test case image loading, and therefore filling up all my RAM.
When loading the test image FF just doesnt load the image.
I'm not sure how to report back a useful error to the user.
Assignee | ||
Comment 14•7 years ago
|
||
mozreview-review |
Comment on attachment 8935421 [details]
Bug 1277397 - High memory usage triggered by a jpg
https://reviewboard.mozilla.org/r/206316/#review212062
::: image/decoders/nsJPEGDecoder.cpp:384
(Diff revision 1)
> }
>
> +
> + //If the memory needed by the image is bigger than the SurfaceCache size limit,
> + //don't start decompressing. Bail out.
> + if ( ((mInfo.image_width * mInfo.image_height) *24) >
I think that libjpegturbo only makes this large allocation if mInfo.buffered_image is set to true below. So I think we only need this check if that is true. This will allow us to decode large images if we are downscaling them to a smaller size because we only need enough space for the downscaled size image.
Comment 15•7 years ago
|
||
(In reply to Timothy Nikkel (:tnikkel) from comment #14)
> Comment on attachment 8935421 [details]
> Bug 1277397 - High memory usage triggered by a jpg
>
> https://reviewboard.mozilla.org/r/206316/#review212062
>
> ::: image/decoders/nsJPEGDecoder.cpp:384
> (Diff revision 1)
> > }
> >
> > +
> > + //If the memory needed by the image is bigger than the SurfaceCache size limit,
> > + //don't start decompressing. Bail out.
> > + if ( ((mInfo.image_width * mInfo.image_height) *24) >
>
> I think that libjpegturbo only makes this large allocation if
> mInfo.buffered_image is set to true below. So I think we only need this
> check if that is true. This will allow us to decode large images if we are
> downscaling them to a smaller size because we only need enough space for the
> downscaled size image.
Hm. Given this new check is above where mInfo.buffered_image is determined, it would be nice if we changed mDecodeStyle at this point instead. I think it is better for the user to decode sequentially instead of progressively, if the alternative is not at all, no?
Comment hidden (mozreview-request) |
Comment 17•7 years ago
|
||
Oops - I didn't see comment 51 before I pushed that most recent change.
The above moves the check to after we assign mInfo.buffered_image and still succeeds in refusing to load the test image.
I havent managed to get the test image to load at all actually.
I'm not sure how to get FF to downscale it since the zoom level resets every time I reload the page, and zooming out when the image has already been not-loaded doesnt load the image.
Since I cant check that the image does load if we're downscaling, I'm concerned that the above patch doesnt work as we'd like.
Appreciate if that could be checked by someone.
I'd agree that the best option would be to at least try to load the image (sequentially).
I'll get on implementing that and see what happens.
Comment hidden (mozreview-request) |
Comment 19•7 years ago
|
||
The above patch checks the size of the image and sets mDecodeStyle to SEQUENTIAL.
Assuming the test case image is a plain white image (thats what I can see), this patch allows the image to be loaded and FF does not use a huge amount of memory when doing it.
Worth noting that when opening the test case FF complains of 'Corrupt JPEG data: 18 extraneous bytes before marked 0xdb'
GIMP also complains of the same thing and flat out can't open the image.
Comment 20•7 years ago
|
||
status-firefox59:
--- → ?
Assignee | ||
Comment 21•5 years ago
|
||
Added crashtest in bug 1633121.
Comment 22•4 years ago
|
||
Hey Taylor,
Can you still reproduce this issue or should we close it?
Flags: needinfo?(twsmith)
Reporter | ||
Comment 23•4 years ago
|
||
I can still reproduce this with m-c 20210705-4d0245a42361.
I used an ASan builds and set ASAN_OPTIONS=max_allocation_size_mb=512:allocator_may_return_null=false
to get the following stack trace:
==3898==ERROR: AddressSanitizer: requested allocation size 0x3b99d037 (0x3b99e038 after adjustments for alignment, red zones etc.) exceeds maximum supported size of 0x20000000 (thread T8 (TaskCon~read #1))
#0 0x55d3cbb76abd in malloc /builds/worker/fetches/llvm-project/llvm/projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:145:3
#1 0x7f50e4a9ee8d in alloc_large /builds/worker/checkouts/gecko/media/libjpeg/jmemmgr.c:391:29
#2 0x7f50e4a9f3eb in alloc_barray /builds/worker/checkouts/gecko/media/libjpeg/jmemmgr.c:527:28
#3 0x7f50e4aa0220 in realize_virt_arrays /builds/worker/checkouts/gecko/media/libjpeg/jmemmgr.c:741:26
#4 0x7f50e4a5c1b9 in master_selection /builds/worker/checkouts/gecko/media/libjpeg/jdmaster.c:573:3
#5 0x7f50e4a5c1b9 in jinit_master_decompress /builds/worker/checkouts/gecko/media/libjpeg/jdmaster.c:736:3
#6 0x7f50e4a28901 in jpeg_start_decompress /builds/worker/checkouts/gecko/media/libjpeg/jdapistd.c:49:5
#7 0x7f50df2b0dcf in mozilla::image::nsJPEGDecoder::ReadJPEGData(char const*, unsigned long) /builds/worker/checkouts/gecko/image/decoders/nsJPEGDecoder.cpp:400:11
#8 0x7f50df34f2e9 in operator() /builds/worker/checkouts/gecko/image/decoders/nsJPEGDecoder.cpp:186:34
#9 0x7f50df34f2e9 in mozilla::Maybe<mozilla::Variant<mozilla::image::TerminalState, mozilla::image::Yield> > mozilla::image::StreamingLexer<mozilla::image::nsJPEGDecoder::State, 16ul>::ContinueUnbufferedRead<mozilla::image::nsJPEGDecoder::DoDecode(mozilla::image::SourceBufferIterator&, mozilla::image::IResumable*)::$_5>(char const*, unsigned long, unsigned long, mozilla::image::nsJPEGDecoder::DoDecode(mozilla::image::SourceBufferIterator&, mozilla::image::IResumable*)::$_5) /builds/worker/checkouts/gecko/image/StreamingLexer.h:555:9
#10 0x7f50df2ad610 in UnbufferedRead<(lambda at /builds/worker/checkouts/gecko/image/decoders/nsJPEGDecoder.cpp:183:21)> /builds/worker/checkouts/gecko/image/StreamingLexer.h:501:12
#11 0x7f50df2ad610 in Lex<(lambda at /builds/worker/checkouts/gecko/image/decoders/nsJPEGDecoder.cpp:183:21)> /builds/worker/checkouts/gecko/image/StreamingLexer.h:469:26
#12 0x7f50df2ad610 in mozilla::image::nsJPEGDecoder::DoDecode(mozilla::image::SourceBufferIterator&, mozilla::image::IResumable*) /builds/worker/checkouts/gecko/image/decoders/nsJPEGDecoder.cpp:182:17
#13 0x7f50df157697 in mozilla::image::Decoder::Decode(mozilla::image::IResumable*) /builds/worker/checkouts/gecko/image/Decoder.cpp:177:19
#14 0x7f50df164d19 in mozilla::image::DecodedSurfaceProvider::Run() /builds/worker/checkouts/gecko/image/DecodedSurfaceProvider.cpp:123:34
#15 0x7f50df191dcc in mozilla::image::DecodingTask::Run() /builds/worker/checkouts/gecko/image/DecodePool.cpp:146:12
#16 0x7f50dc1d6254 in mozilla::TaskController::RunPoolThread() /builds/worker/checkouts/gecko/xpcom/threads/TaskController.cpp:287:33
#17 0x7f50f95b23fe in _pt_root /builds/worker/checkouts/gecko/nsprpub/pr/src/pthreads/ptthread.c:201:5
#18 0x7f50fd6d1608 in start_thread /build/glibc-eX1tMB/glibc-2.31/nptl/pthread_create.c:477:8
==3898==HINT: if you don't care about these errors you may set allocator_may_return_null=1
SUMMARY: AddressSanitizer: allocation-size-too-big /builds/worker/fetches/llvm-project/llvm/projects/compiler-rt/lib/asan/asan_malloc_linux.cpp:145:3 in malloc
Thread T8 (TaskCon~read #1) created by T0 (Web Content) here:
#0 0x55d3cbb60e6c in pthread_create /builds/worker/fetches/llvm-project/llvm/projects/compiler-rt/lib/asan/asan_interceptors.cpp:205:3
#1 0x7f50f95a2474 in _PR_CreateThread /builds/worker/checkouts/gecko/nsprpub/pr/src/pthreads/ptthread.c:458:14
#2 0x7f50f959394e in PR_CreateThread /builds/worker/checkouts/gecko/nsprpub/pr/src/pthreads/ptthread.c:533:12
#3 0x7f50dc1d6dbb in mozilla::TaskController::InitializeThreadPool() /builds/worker/checkouts/gecko/xpcom/threads/TaskController.cpp:181:10
#4 0x7f50dc1d832a in mozilla::TaskController::AddTask(already_AddRefed<mozilla::Task>&&) /builds/worker/checkouts/gecko/xpcom/threads/TaskController.cpp:357:7
#5 0x7f50de15a1ee in DispatchOffThreadTask() /builds/worker/checkouts/gecko/js/xpconnect/src/XPCJSContext.cpp:1182:26
#6 0x7f50e83f8583 in js::GlobalHelperThreadState::submitTask(js::GCParallelTask*, js::AutoLockHelperThreadState const&) /builds/worker/checkouts/gecko/js/src/vm/HelperThreads.cpp:2034:3
#7 0x7f50e8d3749b in startWithLockHeld /builds/worker/checkouts/gecko/js/src/gc/GCParallelTask.cpp:36:23
#8 0x7f50e8d3749b in js::GCParallelTask::start() /builds/worker/checkouts/gecko/js/src/gc/GCParallelTask.cpp:46:3
#9 0x7f50e8d3ecf6 in js::gc::GCRuntime::beginPreparePhase(JS::GCReason, js::gc::AutoGCSession&) /builds/worker/checkouts/gecko/js/src/gc/GC.cpp:4440:14
#10 0x7f50e8d545f5 in js::gc::GCRuntime::incrementalSlice(js::SliceBudget&, mozilla::Maybe<JS::GCOptions> const&, JS::GCReason) /builds/worker/checkouts/gecko/js/src/gc/GC.cpp:7135:12
#11 0x7f50e8d5945b in js::gc::GCRuntime::gcCycle(bool, js::SliceBudget const&, mozilla::Maybe<JS::GCOptions> const&, JS::GCReason) /builds/worker/checkouts/gecko/js/src/gc/GC.cpp:7656:3
#12 0x7f50e8d5ab82 in js::gc::GCRuntime::collect(bool, js::SliceBudget const&, mozilla::Maybe<JS::GCOptions> const&, JS::GCReason) /builds/worker/checkouts/gecko/js/src/gc/GC.cpp:7865:9
#13 0x7f50e8d22eea in gc /builds/worker/checkouts/gecko/js/src/gc/GC.cpp:7945:3
#14 0x7f50e8d22eea in js::gc::GCRuntime::freezeSelfHostingZone() /builds/worker/checkouts/gecko/js/src/gc/GC.cpp:1610:3
#15 0x7f50e860fc3f in InitSelfHostingFromStencil(JSContext*, JS::Handle<js::GlobalObject*>, js::frontend::CompilationInput&, js::frontend::CompilationStencil const&) /builds/worker/checkouts/gecko/js/src/vm/SelfHosting.cpp:2688:21
#16 0x7f50e86104e4 in JSRuntime::initSelfHosting(JSContext*, mozilla::Span<unsigned char const, 18446744073709551615ul>, bool (*)(JSContext*, mozilla::Span<unsigned char const, 18446744073709551615ul>)) /builds/worker/checkouts/gecko/js/src/vm/SelfHosting.cpp:2743:14
#17 0x7f50e83ff64d in JS::InitSelfHostedCode(JSContext*, mozilla::Span<unsigned char const, 18446744073709551615ul>, bool (*)(JSContext*, mozilla::Span<unsigned char const, 18446744073709551615ul>)) /builds/worker/checkouts/gecko/js/src/vm/Initialization.cpp:249:12
#18 0x7f50de159085 in XPCJSContext::Initialize() /builds/worker/checkouts/gecko/js/xpconnect/src/XPCJSContext.cpp:1380:8
#19 0x7f50de15ad42 in XPCJSContext::NewXPCJSContext() /builds/worker/checkouts/gecko/js/xpconnect/src/XPCJSContext.cpp:1416:23
#20 0x7f50de1edb68 in nsXPConnect::InitJSContext() /builds/worker/checkouts/gecko/js/xpconnect/src/nsXPConnect.cpp:84:25
#21 0x7f50dc266a8e in NS_InitXPCOM /builds/worker/checkouts/gecko/xpcom/build/XPCOMInit.cpp:491:5
#22 0x7f50e7dc2997 in XRE_InitEmbedding2(nsIFile*, nsIFile*, nsIDirectoryServiceProvider*) /builds/worker/checkouts/gecko/toolkit/xre/nsEmbedFunctions.cpp:189:8
#23 0x7f50dd4646aa in mozilla::ipc::ScopedXREEmbed::Start() /builds/worker/checkouts/gecko/ipc/glue/ScopedXREEmbed.cpp
#24 0x7f50e33c6c6b in mozilla::dom::ContentProcess::Init(int, char**) /builds/worker/checkouts/gecko/dom/ipc/ContentProcess.cpp:202:13
#25 0x7f50e7dc322c in XRE_InitChildProcess(int, char**, XREChildData const*) /builds/worker/checkouts/gecko/toolkit/xre/nsEmbedFunctions.cpp:708:21
#26 0x55d3cbbaa74d in content_process_main(mozilla::Bootstrap*, int, char**) /builds/worker/checkouts/gecko/browser/app/../../ipc/contentproc/plugin-container.cpp:57:28
#27 0x55d3cbbaab7d in main /builds/worker/checkouts/gecko/browser/app/nsBrowserApp.cpp:327:18
#28 0x7f50fd19e0b2 in __libc_start_main /build/glibc-eX1tMB/glibc-2.31/csu/../csu/libc-start.c:308:16
Blocks: oom-fuzz
status-firefox89:
--- → wontfix
status-firefox90:
--- → affected
status-firefox91:
--- → affected
status-firefox-esr78:
--- → affected
status-firefox-esr91:
--- → affected
Flags: needinfo?(twsmith)
Keywords: csectype-oom
Updated•4 years ago
|
Updated•4 years ago
|
QA Whiteboard: [qa-not-actionable]
Updated•3 years ago
|
Version: 46 Branch → unspecified
Updated•2 years ago
|
Severity: critical → S2
Assignee | ||
Comment 24•2 years ago
|
||
High mem usage only triggered by files specifically constructed to cause such a problem, lowering severity -> S3.
Severity: S2 → S3
Updated•2 years ago
|
See Also: → CVE-2023-32209
Updated•2 years ago
|
Blocks: CVE-2023-32209
Assignee | ||
Comment 25•2 years ago
|
||
This means that for trivial images whose header specifies large dimensions but with no image data we don't use a lot of memory.
Updated•2 years ago
|
Assignee: nobody → tnikkel
Status: NEW → ASSIGNED
Comment 26•2 years ago
|
||
Pushed by tnikkel@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/f2bfdba84905
Set a max memory limit for libjpeg turbo to use. r=gfx-reviewers,bradwerth
Comment 27•2 years ago
|
||
bugherder |
Status: ASSIGNED → RESOLVED
Closed: 2 years ago
status-firefox113:
--- → fixed
Resolution: --- → FIXED
Target Milestone: --- → 113 Branch
Updated•2 years ago
|
status-firefox111:
--- → wontfix
status-firefox112:
--- → wontfix
status-firefox-esr102:
--- → wontfix
You need to log in
before you can comment on or make changes to this bug.
Description
•