Closed
Bug 958311
Opened 12 years ago
Closed 12 years ago
Assertion failure: progress <= progressMax
Categories
(Core :: Networking: Cache, defect)
Tracking
()
RESOLVED
FIXED
mozilla29
People
(Reporter: mayhemer, Assigned: mayhemer)
References
Details
Attachments
(2 files, 3 obsolete files)
|
8.05 KB,
patch
|
michal
:
review+
|
Details | Diff | Splinter Review |
|
1.59 KB,
patch
|
bajaj
:
approval-mozilla-aurora+
bajaj
:
approval-mozilla-beta+
|
Details | Diff | Splinter Review |
Assertion failure: progress <= progressMax (unexpected progress values), at ../../../../netwerk/protocol/http/nsHttpChannel.cpp:5413
TEST-UNEXPECTED-FAIL | /tests/content/media/test/test_bug495300.html | application terminated with exit code 11
PROCESS-CRASH | /tests/content/media/test/test_bug495300.html | application crashed [@ mozilla::net::nsHttpChannel::OnTransportStatus(nsITransport*, tag_nsresult, unsigned long long, unsigned long long)]
Return code: 1
One of the early patches for the blockers is causing this.
https://tbpl.mozilla.org/php/getParsedLog.php?id=32775056&tree=Try
https://tbpl.mozilla.org/?tree=Try&rev=2ff48b40e756
| Assignee | ||
Comment 1•12 years ago
|
||
| Assignee | ||
Comment 2•12 years ago
|
||
More bisecting: https://tbpl.mozilla.org/?tree=Try&rev=ab18348302b8
| Assignee | ||
Updated•12 years ago
|
| Assignee | ||
Updated•12 years ago
|
Assignee: nobody → honzab.moz
Status: RESOLVED → REOPENED
Resolution: DUPLICATE → ---
| Assignee | ||
Comment 5•12 years ago
|
||
A general problem.
status-b2g-v1.3:
--- → affected
status-b2g-v1.4:
--- → affected
status-firefox27:
--- → affected
status-firefox28:
--- → affected
status-firefox29:
--- → affected
Summary: HTTP cache v2 [dev]: Assertion failure: progress <= progressMax → Assertion failure: progress <= progressMax
| Assignee | ||
Comment 6•12 years ago
|
||
http://mozilla-releng-blobs.s3.amazonaws.com/blobs/try/sha512/8333e476796569e4ee45d191af11186be33fe64bef2d76eed76a1b5f1057932f9ed15104a83792f9fc557dcdd1e18b2bd2e49e5f0214317dbfcf595e3164deac
There are actually two problems that merging the management and IO thread reveals:
1. bad condition for mCachedContentIsPartial in nsHttpChannel::OnCacheEntryCheck when consumer of the same nsHttpChannel have set Range header (e.g. "Range: 0-") and the cached partial content is not resumable ; leads to:
- we do Range: 0- request, while mCachedContentIsPartial == true (should be false)
- we get 206 based on the Range: 0- request > ProcessPartialContent
- we suspend the http transaction
- we deliver cached data
- we resume the http transaction
- we push the network data (stream completely broken and progress goes over progressMax)
2. sequence of (for the same URL) cacheEntry->AsyncDoom(), cacheService->AsyncOpen(URI) can lead to reuse of the same underlying file:
- we call CacheEntry::AsyncDoom,
- what posts CacheEntry::DoomAlreadyRemoved to the MANAGEMENT level
- we call CacheEntry::AsyncOpen
- we call IOMan::OpenFile,
- what posts IOMan::OpenFileInternal to the OPEN level
- we process IOMan::OpenFileInternal on the OPEN level, since it is processed before the MANAGEMENT level
- we process CacheEntry::DoomAlreadyRemvoed on the MANAGEMENT level
- this calls IOMan::DoomFile,
- what posts IOMan::DoomFileInternal to the OPEN level
- we process IOMan::DoomFileInternal on the OPEN level,
- what (late after opening the same file/handle) finally removes and dooms the handle from the internal hashtables
Solution to 1 is clear from the patch.
Solution to 2 is following:
- CacheEntry::AsyncDoom() calls directly PurgeAndDoom() that removes the entry from the master hashtables (threadsafe) and immediately calls DoomAlreadyRemoved()
- DoomAlreadyRemoved now works on any thread and dooms the file instantly on the called thread
- the actual file dooming is asynchronous, but goes via a single thread and after bug 914644 has been fixed, dooming cannot "physically" happen after opening "physically" happens
https://tbpl.mozilla.org/?tree=Try&rev=b088fbd250df (checking only Linux Dbg M(1))
Attachment #8362604 -
Flags: feedback?(michal.novotny)
| Assignee | ||
Comment 7•12 years ago
|
||
> - we process IOMan::OpenFileInternal on the OPEN level, since it is
> processed before the MANAGEMENT level
Forgot to mention the most important aspect of this ordering problem: here we use the same handle whose file has already been scheduled for doom.
| Assignee | ||
Comment 8•12 years ago
|
||
| Assignee | ||
Comment 9•12 years ago
|
||
Michal, any idea what this could be?
22:24:11 INFO - ==3125==ERROR: AddressSanitizer: heap-use-after-free on address 0x6040008033d0 at pc 0x7f14e0dfb89b bp 0x7f14d8fab8c0 sp 0x7f14d8fab8b8
22:24:11 INFO - READ of size 4 at 0x6040008033d0 thread T11 (Cache2 I/O)
22:24:12 INFO - #0 0x7f14e0dfb89a in mozilla::net::CacheIndex::RemoveRecordFromFrecencyArray(mozilla::net::CacheIndexRecord*) /builds/slave/try-l64-asan-00000000000000000/build/netwerk/cache2/CacheIndex.cpp:2612
22:24:12 INFO - #1 0x7f14e0dfe06c in mozilla::net::CacheIndexEntryAutoManage::~CacheIndexEntryAutoManage() /builds/slave/try-l64-asan-00000000000000000/build/netwerk/cache2/CacheIndex.h:610
22:24:12 INFO - #2 0x7f14e0dee70f in ~CacheIndexEntryAutoManage /builds/slave/try-l64-asan-00000000000000000/build/netwerk/cache2/CacheIndex.h:598
22:24:12 INFO - #3 0x7f14e0dee70f in mozilla::net::CacheIndex::RemoveEntry(unsigned char const (*) [20]) /builds/slave/try-l64-asan-00000000000000000/build/netwerk/cache2/CacheIndex.cpp:673
22:24:12 INFO - #4 0x7f14e0dd0f11 in mozilla::net::CacheFileIOManager::CloseHandleInternal(mozilla::net::CacheFileHandle*) /builds/slave/try-l64-asan-00000000000000000/build/netwerk/cache2/CacheFileIOManager.cpp:1538
22:24:12 INFO - #5 0x7f14e0dda4d6 in mozilla::net::CloseHandleEvent::Run() /builds/slave/try-l64-asan-00000000000000000/build/netwerk/cache2/CacheFileIOManager.cpp:634
22:24:12 INFO - #6 0x7f14e0e1c382 in mozilla::net::CacheIOThread::LoopOneLevel(unsigned int) /builds/slave/try-l64-asan-00000000000000000/build/netwerk/cache2/CacheIOThread.cpp:220
22:24:12 INFO - #7 0x7f14e0e1bc7d in mozilla::net::CacheIOThread::ThreadFunc() /builds/slave/try-l64-asan-00000000000000000/build/netwerk/cache2/CacheIOThread.cpp:156
22:24:12 INFO - #8 0x7f14eec05da9 in _pt_root /builds/slave/try-l64-asan-00000000000000000/build/nsprpub/pr/src/pthreads/ptthread.c:205
22:24:12 INFO - #9 0x44cf03 in __asan::AsanThread::ThreadStart(unsigned long) /builds/slave/moz-toolchain/src/llvm/projects/compiler-rt/lib/asan/asan_thread.cc:138
22:24:12 INFO - #10 0x7f14f211fe99 in start_thread (/lib/x86_64-linux-gnu/libpthread.so.0+0x7e99)
NSPR logy:
http://mozilla-releng-blobs.s3.amazonaws.com/blobs/try/sha512/d94bb1bd021e99e516fbb13bbe01699e3553066a8f3ea02b8edee337a88eaeace3364429cdf638c7b6947d03eb335a9855d2c736659de9cccc71c7742288fb48
http://mozilla-releng-blobs.s3.amazonaws.com/blobs/try/sha512/2f7482476852818d565e31aa48b41a7f8af462fe7578b7b2f20ce6bd86b633b71916e6fefc600c8114dacc57669cf5e23018d338dd59af1a03c4cca10b4b1060
http://mozilla-releng-blobs.s3.amazonaws.com/blobs/try/sha512/e4a247016876f780d0789d610a597e51f025d9e92b788b587660344c8ab40a867700519349181e7db8aca02d23e0c5efeba66cfee2551cb11e04ff9b52960490
Flags: needinfo?(michal.novotny)
Comment 10•12 years ago
|
||
(In reply to Honza Bambas (:mayhemer) from comment #9)
> Michal, any idea what this could be?
>
> 22:24:11 INFO - ==3125==ERROR: AddressSanitizer: heap-use-after-free on
> address 0x6040008033d0 at pc 0x7f14e0dfb89b bp 0x7f14d8fab8c0 sp
> 0x7f14d8fab8b8
> 22:24:11 INFO - READ of size 4 at 0x6040008033d0 thread T11 (Cache2 I/O)
> 22:24:12 INFO - #0 0x7f14e0dfb89a in
> mozilla::net::CacheIndex::RemoveRecordFromFrecencyArray(mozilla::net::
> CacheIndexRecord*)
> /builds/slave/try-l64-asan-00000000000000000/build/netwerk/cache2/CacheIndex.
> cpp:2612
> 22:24:12 INFO - #1 0x7f14e0dfe06c in
> mozilla::net::CacheIndexEntryAutoManage::~CacheIndexEntryAutoManage()
> /builds/slave/try-l64-asan-00000000000000000/build/netwerk/cache2/CacheIndex.
> h:610
> 22:24:12 INFO - #2 0x7f14e0dee70f in ~CacheIndexEntryAutoManage
> /builds/slave/try-l64-asan-00000000000000000/build/netwerk/cache2/CacheIndex.
> h:598
> 22:24:12 INFO - #3 0x7f14e0dee70f in
> mozilla::net::CacheIndex::RemoveEntry(unsigned char const (*) [20])
> /builds/slave/try-l64-asan-00000000000000000/build/netwerk/cache2/CacheIndex.
> cpp:673
> 22:24:12 INFO - #4 0x7f14e0dd0f11 in
CacheIndex::RemoveRecordFromFrecencyArray() should not dereference the pointer to the record since it is not guaranteed that it still exists. It is sufficient to log the pointer. I'll include this change in the main CacheIndex patch.
Flags: needinfo?(michal.novotny)
| Assignee | ||
Comment 11•12 years ago
|
||
New run -a (index ripped of the series):
https://tbpl.mozilla.org/?tree=Try&rev=7c1d1ca3f804
| Assignee | ||
Comment 12•12 years ago
|
||
(In reply to Michal Novotny (:michal) from comment #10)
> CacheIndex::RemoveRecordFromFrecencyArray() should not dereference the
> pointer to the record since it is not guaranteed that it still exists. It is
> sufficient to log the pointer. I'll include this change in the main
> CacheIndex patch.
So, the method can be given an invalid pointer? And that invalid pointer may be included in the mFrecencyArray? Is this the point we remove the just-gone record from that array and no one else cannot in the meantime (between record's deleted and removed from those arrays) access it via mFrecencyArray? If so, then it's OK.
I didn't so far get to reviewing this code deeply enough to understand myself.
Comment 13•12 years ago
|
||
(In reply to Honza Bambas (:mayhemer) from comment #12)
> So, the method can be given an invalid pointer? And that invalid pointer
> may be included in the mFrecencyArray? Is this the point we remove the
> just-gone record from that array and no one else cannot in the meantime
> (between record's deleted and removed from those arrays) access it via
> mFrecencyArray? If so, then it's OK.
It is OK. We remove the pointer just after removing the entry from index (or marking as deleted). RemoveRecordFrom*Array() are called only from CacheIndexEntryAutoManage which is always created on the stack under the CacheIndex lock. This class is responsible for updating CacheIndexStats and for managing frecency and expirationTime arrays.
| Assignee | ||
Comment 14•12 years ago
|
||
Comment on attachment 8362604 [details] [diff] [review]
v1 (healthy)
(In reply to Honza Bambas (:mayhemer) from comment #11)
> New run -a (index ripped of the series):
> https://tbpl.mozilla.org/?tree=Try&rev=7c1d1ca3f804
Looks pretty green!
Asking for r.
Attachment #8362604 -
Attachment description: v1 → v1 (healthy)
Attachment #8362604 -
Flags: feedback?(michal.novotny) → review?(michal.novotny)
Comment 15•12 years ago
|
||
Comment on attachment 8362604 [details] [diff] [review]
v1 (healthy)
Review of attachment 8362604 [details] [diff] [review]:
-----------------------------------------------------------------
::: netwerk/cache2/CacheEntry.cpp
@@ +1418,4 @@
> mIsDoomed = true;
>
> + // This schedules dooming of the file, dooming is ensured to happen
> + // sooner then demand to open the same file made after this point
then -> than
@@ +1444,1 @@
> OnFileDoomed(NS_OK);
Why we return NS_OK even if CacheFile::Doom() fails synchronously?
Attachment #8362604 -
Flags: review?(michal.novotny)
| Assignee | ||
Comment 16•12 years ago
|
||
(In reply to Michal Novotny (:michal) from comment #15)
> Why we return NS_OK even if CacheFile::Doom() fails synchronously?
I was asking my self, but it was like this before... Will check on that.
| Assignee | ||
Comment 17•12 years ago
|
||
And since I'm paranoid: https://tbpl.mozilla.org/?tree=Try&rev=259373ec0076
| Assignee | ||
Comment 18•12 years ago
|
||
Attachment #8362604 -
Attachment is obsolete: true
Attachment #8363345 -
Flags: review?(michal.novotny)
Comment 19•12 years ago
|
||
Comment on attachment 8363345 [details] [diff] [review]
v2
Review of attachment 8363345 [details] [diff] [review]:
-----------------------------------------------------------------
::: netwerk/cache2/CacheFile.cpp
@@ +919,5 @@
>
> nsresult rv = NS_OK;
>
> if (mMemoryOnly) {
> // TODO what exactly to do here?
Please remove this comment since we've clarified what dooming memoryOnly CacheFile should do.
Attachment #8363345 -
Flags: review?(michal.novotny) → review+
| Assignee | ||
Comment 20•12 years ago
|
||
Status: ASSIGNED → RESOLVED
Closed: 12 years ago → 12 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla29
| Assignee | ||
Comment 21•12 years ago
|
||
[Approval Request Comment]
Bug caused by (feature/regressing bug #): bug 913807
User impact if declined: potential broken content shown on screen/media player under some relatively rare conditions
Testing completed (on m-c, etc.): landed on m-c on Jan 22 (one day ago)
Risk to taking this patch (and alternatives if risky): very low to none, this only bypasses HTTP cache loads, so only difference is performance/fresher content
String or IDL/UUID changes made by this patch: none
Attachment #8363051 -
Attachment is obsolete: true
Attachment #8363344 -
Attachment is obsolete: true
Attachment #8364459 -
Flags: approval-mozilla-beta?
Attachment #8364459 -
Flags: approval-mozilla-aurora?
Updated•12 years ago
|
Attachment #8364459 -
Flags: approval-mozilla-beta?
Attachment #8364459 -
Flags: approval-mozilla-beta+
Attachment #8364459 -
Flags: approval-mozilla-aurora?
Attachment #8364459 -
Flags: approval-mozilla-aurora+
| Assignee | ||
Comment 22•12 years ago
|
||
Comment on attachment 8364459 [details] [diff] [review]
v2 for backport
https://hg.mozilla.org/releases/mozilla-beta/rev/e9969e74c4a4
Comment 23•12 years ago
|
||
You need to log in
before you can comment on or make changes to this bug.
Description
•