Closed Bug 204143 Opened 22 years ago Closed 9 years ago

crash, orkinHeap::Alloc uses new and throws on failure, oom

Categories

(MailNews Core :: Database, defect)

x86
All
defect
Not set
critical

Tracking

(Not tracked)

RESOLVED WORKSFORME

People

(Reporter: timeless, Unassigned)

Details

(Keywords: crash)

I believe I ran out of memory which BeOS was willing to give to mozilla.
stack:
  __throw
  __builtin_new
  orkinHeap::Alloc
  morkZone::zone_new_hunk
  morkZone::zone_grow_at
  morkZone::zone_new_chip
  morkZone::ZoneNewChip
  morkPool::NewFarBookAtomCopy
  ...
I believe the line that failed was:
  91 void* block = ::operator new(inSize);
which means bug 149032 wasn't enough
> smfr: ok, so.. can you help me? :)
<smfr> timeless: change all new() to new(std::nothrow) ?
> oh joy
<smfr> that needs doing anyway
<smfr> NS_NEWNOTHROW
This looks useful: http://cpptips.hyperformix.com/cpptips/std_no_throw
in this specific case it's
  #include <new>
  /*...*/
  void* block = ::operator new(inSize, std::nothrow);
macro land coming up...
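For illustration, the nothrow form suggested above could be wrapped roughly as follows. This is a minimal sketch, not the actual mork code: `AllocNoThrow` and `FreeBlock` are hypothetical names, and the real `orkinHeap::Alloc` takes an `nsIMdbEnv*` and reports errors through it.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Hypothetical stand-in for orkinHeap::Alloc (illustrative name and signature).
// On success, stores the block in *outBlock and returns true.
// On failure, stores nullptr and returns false instead of throwing.
bool AllocNoThrow(std::size_t inSize, void** outBlock) {
    assert(outBlock != nullptr);
    // The nothrow overload returns nullptr on failure rather than
    // throwing std::bad_alloc.
    *outBlock = ::operator new(inSize, std::nothrow);
    return *outBlock != nullptr;
}

// Memory obtained from the nothrow overload is released with the
// ordinary ::operator delete.
void FreeBlock(void* block) {
    ::operator delete(block);
}
```

The caller still has to check the return value; the nothrow form only turns the exception into a null pointer, it does not make the allocation succeed.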
Product: Browser → Seamonkey
Assignee: timeless → nobody
QA Contact: granrosebugs → build-config
Product: SeaMonkey → Core
Whiteboard: CLOSEME?
we haven't really resolved this, it's still hit all over. it might be one of the older bugs on the subject and it actually has the right starting point. that said, someone else will probably use some other bug to fix the problem (new throws an exception for oom).
Can someone triage this to a mork component so I don't have to get bugmail about it, at least?
Component: Build Config → Database
Product: Core → MailNews Core
QA Contact: build-config → database
I doubt this will really get fixed, given that mork is on its way out...
Whiteboard: CLOSEME?
Odd that you should comment on this bug. While tracking down crash reports for Songbird last week, I noticed a bunch of crashes on RaiseException out of new, from various parts of XUL, XPConnect, and other places. I haven't had a chance to check it out, but is XULRunner now building with exceptions on? Did exceptions get turned on a while back, maybe with the jemalloc code?
No. But ::operator new is not compiled by mozilla, it's provided by the CRT. It's possible that it will still throw regardless of whether exception handling is on in Mozilla or not.
ted: why bother, we're now getting these crashes everywhere, a few bug reports a week (bug 481401, bug 477571). we could move it to xpcom if you prefer, or build config ;-b
jcranmer: this really isn't specific to mork, and it does need to be fixed, i believe this bug is probably one of the (or even my) oldest reports on the problem
dbradley: we switched compilers a while back actually, when we moved from vc71 to vc8, these exceptions had to happen. as for why we're getting more oom crashes than we used to (as we've had vc8 for a while), i dunno, most likely CC is failing to retain memory or something is leaking (as one of the other bugs complains).
in the end we either have to try hacking jemalloc to do something evil (changing how operator new works in the crt), or we have to introduce something like the macro described by smfr in comment 0.
Apparently post VC6, new's implementation changed to always throw. So at least now all the platforms are in the same boat I think. So either new(std::nothrow) or set a new handler and do "something" when out of memory. But yes, this really isn't specific to mork anymore.
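The second option mentioned above, setting a new handler, can be sketched as below. What the handler should actually do on OOM is left open in this bug, so the body here is only an assumption: `OnOutOfMemory`, `InstallOomHandler` and the counter are hypothetical names, and the handler just records the event and uninstalls itself.

```cpp
#include <cassert>
#include <new>

// Counts handler invocations; purely illustrative.
static int gOomHandlerCalls = 0;

// Hypothetical new-handler. operator new calls the installed handler when an
// allocation fails and then retries, so a handler must either free memory,
// uninstall itself, throw, or terminate. This one records the event and
// uninstalls itself, so the retry fails in the default way
// (std::bad_alloc, or nullptr for the nothrow overload).
void OnOutOfMemory() {
    ++gOomHandlerCalls;
    std::set_new_handler(nullptr);
}

// Installs the handler and returns the previously installed one.
std::new_handler InstallOomHandler() {
    return std::set_new_handler(OnOutOfMemory);
}
```

A handler on its own does not stop plain `new` from throwing; it only provides a hook to release memory or log before the failure propagates, so call sites that cannot cope with exceptions would still need the nothrow form.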
OS: BeOS → All
Summary: orkinHeap::Alloc uses new and throws on failure → crash, orkinHeap::Alloc uses new and throws on failure, oom
same as the following...?
operator new(unsigned int) | orkinHeap::Alloc(nsIMdbEnv*, unsigned int, void**)
bp-a37b5722-061f-4050-a780-73ea12110117
Unhandled C++ Exception 0x778cfbae
  0 kernel32.dll RaiseException
  1 mozcrt19.dll _CxxThrowException throw.cpp:159
  2 mozcrt19.dll operator new objdir-tb/mozilla/memory/jemalloc/src/new.cpp:57
  3 thunderbird.exe orkinHeap::Alloc db/mork/src/orkinHeap.cpp:90
  4 thunderbird.exe morkNode::MakeNew db/mork/src/morkDeque.cpp:62
  5 thunderbird.exe morkFactory::OpenFileStore db/mork/src/morkFactory.cpp:545
  6 thunderbird.exe nsMsgDatabase::OpenMDB mailnews/db/msgdb/src/nsMsgDatabase.cpp:1140
  7 thunderbird.exe nsMsgDatabase::Open mailnews/db/msgdb/src/nsMsgDatabase.cpp:1026
  8 thunderbird.exe nsMailDatabase::Open mailnews/db/msgdb/src/nsMailDatabase.cpp:111
  9 thunderbird.exe nsMsgDBService::OpenFolderDB mailnews/db/msgdb/src/nsMsgDatabase.cpp:147
(In reply to comment #8)
> Apparently post VC6, new's implementation changed to always throw. So at
> least now all the platforms are in the same boat I think.
>
> So either new(std::nothrow) or set a new handler and do "something" when out
> of memory. But yes, this really isn't specific to mork anymore.
[@ morkZone::zone_new_hunk] also? example
bp-dfac1ee0-3fb0-4c30-b94d-253e52110526
SIGSEGV 0x0
  0 thunderbird-bin morkZone::zone_new_hunk morkZone.cpp:266
  1 thunderbird-bin morkZone::zone_grow_at morkZone.cpp:242
  2 thunderbird-bin morkZone::zone_new_chip morkZone.cpp:319
  3 thunderbird-bin morkZone::ZoneNewRun morkZone.cpp:398
  4 thunderbird-bin morkPool::NewCells morkPool.cpp:275
  5 thunderbird-bin morkPool::AddRowCells morkPool.cpp:321
  6 thunderbird-bin morkRow::NewCell morkRow.cpp:453
  7 thunderbird-bin morkRow::AddColumn morkRow.cpp:885
  8 thunderbird-bin morkRowObject::AddColumn morkRowObject.cpp:267
  9 thunderbird-bin nsAddrDatabase::AddCharStringColumn nsAddrDatabase.cpp:2182
Crash Signature: [@ operator new(unsigned int) | orkinHeap::Alloc(nsIMdbEnv*, unsigned int, void**) ] [@ morkZone::zone_new_hunk]
current crash sig looks to be
  moz_abort | arena_run_split | arena_run_alloc | arena_malloc | je_malloc | orkinHeap::Alloc(nsIMdbEnv*, unsigned int, void**)
morkZone::zone_new_hunk ends after TB16.0.2
Crash Signature: [@ operator new(unsigned int) | orkinHeap::Alloc(nsIMdbEnv*, unsigned int, void**) ] [@ morkZone::zone_new_hunk] → [@ operator new(unsigned int) | orkinHeap::Alloc(nsIMdbEnv*, unsigned int, void**) ] [@ morkZone::zone_new_hunk] [@ moz_abort | arena_run_split | arena_run_alloc | arena_malloc | je_malloc | orkinHeap::Alloc(nsIMdbEnv*, unsigned int, void**)]
As I had mentioned in https://bugzilla.mozilla.org/show_bug.cgi?id=533313, I hit this issue because the process reached the 2GB per-process hard limit (I was on a 32-bit system at the time), meaning the system couldn't allocate more memory to the process even though it had plenty to spare. There's really not much to do to work around that, except reduce memory usage and/or warn the user that Thunderbird is out of address space. I guess you could "delay" the issue if the user is running a system with a 3G/1G user-kernel split, but those users are rare... On 64-bit systems, a 64-bit version of TBird would have access to more virtual address space, making this a non-issue for all practical purposes.
Crash Signature: [@ operator new(unsigned int) | orkinHeap::Alloc(nsIMdbEnv*, unsigned int, void**) ] [@ morkZone::zone_new_hunk] [@ moz_abort | arena_run_split | arena_run_alloc | arena_malloc | je_malloc | orkinHeap::Alloc(nsIMdbEnv*, unsigned int, void**)] → [@ operator new(unsigned int) | orkinHeap::Alloc(nsIMdbEnv*, unsigned int, void**) ] [@ morkZone::zone_new_hunk] [@ moz_abort | arena_run_split | arena_run_alloc | arena_malloc | je_malloc | orkinHeap::Alloc(nsIMdbEnv*, unsigned int void**)] [@ moz_abo…
One problem could be that mork loads the whole database into memory, and so can OOM on huge databases. If we replaced it with a reasonable, more modern DB that doesn't need to put everything into memory, we'd probably be better off there.
(In reply to Robert Kaiser (:kairo@mozilla.com) from comment #15)
> One problem could be that mork loads the whole database into memory, and so
> can OOM on huge databases. If we'd replace it with a reasonable, more modern
> DB that doesn't need to put everything into memory, we'd probably be better
> there.
I believe it would be a mistake to assume big db is the majority of cases. And since Bug 723248 - add support for closing inactive databases - it should be more difficult to see runaway open dbs. But for sure many crashes are db related. The index db ranges from 2-20% of folder size (depending on what's in the headers and whether the user has custom headers defined in TB). So for example, if a user had many 4gb folders, 30 databases times 100 MB per index (just over 2%), that's 3GB. But if everything is working properly there should not be 30 db files open.
In my wanderings of the bugs, crash reports and user support reports, most OOM issues tie to
  lightning
  selecting thousands of messages (known flaw)
  corrupt index
  large numbers of contacts
  google contacts sync
And I think to a lesser extent
  mork issues
  corrupt panacea.dat (perhaps mork related)
but recently we do see reports of runaway open database counts, even with the Bug 723248 defaults of
  mail.db.idle_limit 30000
  mail.db.max_open 30
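The back-of-the-envelope estimate above (30 open databases at roughly 100 MB of index each) can be written out as a small calculation. This is only a sketch of the arithmetic: the function names are hypothetical, and it assumes each open database is held fully in memory and compares against the 2 GB per-process limit mentioned earlier in this bug for 32-bit builds.

```cpp
#include <cassert>
#include <cstdint>

// Total index memory in MB, assuming every open database is fully in memory.
constexpr std::uint64_t TotalIndexMB(std::uint64_t openDatabases,
                                     std::uint64_t indexMBPerDatabase) {
    return openDatabases * indexMBPerDatabase;
}

// True when the estimate exceeds a 2 GB (2048 MB) per-process limit.
constexpr bool ExceedsTwoGBLimit(std::uint64_t totalMB) {
    return totalMB > 2048;
}

// 30 databases x 100 MB each = 3000 MB, i.e. roughly 3 GB.
static_assert(TotalIndexMB(30, 100) == 3000, "30 x 100 MB is about 3 GB");
static_assert(ExceedsTwoGBLimit(TotalIndexMB(30, 100)), "over the 2 GB limit");
```

With the same per-index size, about 20 open databases would already be close to the 2 GB ceiling, which is why the open-database count matters even when no single database is huge.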
Of course, if we're using regular expressions, we might run into something like bug 837845, but open DB counts sound like something different. That said, the platform is seeing a ton of churn with JS improvements etc., Firefox doesn't have any of the mork or RDF stuff that mailnews still uses for some things, and AFAIK neither Thunderbird nor SeaMonkey has any reasonable regular perf/memory testing or monitoring. So it's quite possible that our code triggers some bad behavior in those platform changes, or the other way round, and that we take a long time to notice it.
re comment 17, absolutely correct on all counts. And one of our primary fears about TB now releasing essentially only ESR is that the time it takes us to detect most regressions becomes (theoretically) as long as a year, because of our smallish beta and alpha user base and the items you cited. For the record, we do have a few reported cases of more than 30 databases being open, despite Bug 723248, so it's not necessary to have huge databases to cause GBs of memory usage in such cases. And, I often forget to ask this of users, but addons like thunderbrowse can of course cause issues to arise that would normally only be seen in browser-land.
Removing myself from all the bugs I'm cc'ed on. Please NI me if you need something on MailNews Core bugs from me.
Crash Signature: , void**)] [@ moz_abort | arena_run_split | arena_run_alloc | arena_bin_nonfull_run_get | arena_malloc_small | arena_malloc | je_malloc | orkinHeap::Alloc(nsIMdbEnv*, unsigned int, void**)] → , void**)] [@ moz_abort | arena_run_split | arena_run_alloc | arena_bin_nonfull_run_get | arena_malloc_small | arena_malloc | je_malloc | orkinHeap::Alloc(nsIMdbEnv*, unsigned int, void**)] [@ operator new | orkinHeap::Alloc ] [@ moz_abort | arena_run_…
This may have morphed into something like https://crash-stats.mozilla.com/search/?signature=~morkZone&product=Thunderbird&_sort=-date&_facets=signature&_columns=date&_columns=signature&_columns=product&_columns=version&_columns=build_id&_columns=platform#crash-reports but the crash rates are low (probably much lower than when the bug was filed), and I don't see anything that seems actionable.
Status: NEW → RESOLVED
Closed: 9 years ago
Depends on: 149032
Resolution: --- → WORKSFORME
See Also: → 481401