Closed Bug 914644 Opened 7 years ago Closed 6 years ago

HTTP cache v2: failure of test_doomentry.js | 2147746065 == 0 | test_doomentry.js :: check_doom3 :: line 74

Categories

(Core :: Networking: Cache, defect)

All
Linux
defect
Not set

Tracking

()

RESOLVED FIXED
mozilla29
Tracking Status
firefox27 --- unaffected
firefox28 --- unaffected
firefox29 --- fixed
firefox-esr24 --- unaffected

People

(Reporter: mayhemer, Assigned: mayhemer)

References

(Blocks 1 open bug)

Details

(Keywords: intermittent-failure, Whiteboard: [cache2])

Attachments

(2 files, 5 obsolete files)

Attached file Run log
Very low rate.

 0:00.65 >>>>>>>
 0:00.65 TEST-INFO | (xpcshell/head.js) | test MAIN run_test pending (1)
 0:00.65 TEST-INFO | (xpcshell/head.js) | test pending (2)
 0:00.65 TEST-INFO | (xpcshell/head.js) | test MAIN run_test finished (2)
 0:00.65 TEST-INFO | (xpcshell/head.js) | running event loop
 0:00.65 TEST-PASS | c:/Mozilla/src/gum/_obj-browser-debug/_tests/xpcshell/netwerk/test/unit/test_doomentry.js | [check_doom1 : 48] 0 == 0
 0:00.66 TEST-PASS | c:/Mozilla/src/gum/_obj-browser-debug/_tests/xpcshell/netwerk/test/unit/test_doomentry.js | [check_doom2 : 54] 2147746065 == 21477460
65
 0:00.66 TEST-UNEXPECTED-FAIL | c:/Mozilla/src/gum/_obj-browser-debug/_tests/xpcshell/netwerk/test/unit/test_doomentry.js | 2147746065 == 0 - See followin
g stack:
 0:00.66 JS frame :: c:/Mozilla/src/gum/_obj-browser-debug/_tests/xpcshell/netwerk/test/unit/test_doomentry.js :: check_doom3 :: line 74
 0:00.66 JS frame :: c:/Mozilla/src/gum/_obj-browser-debug/_tests/xpcshell/netwerk/test/unit/test_doomentry.js :: doom/<.onCacheEntryDoomed :: line 15
 0:00.66 native frame :: <unknown filename> :: <TOP_LEVEL> :: line 0
 0:00.66 TEST-INFO | (xpcshell/head.js) | exiting test
 0:00.66 System JS : ERROR (null):0
 0:00.66                      uncaught exception: 2147500036
 0:00.66 WARNING: NS_ENSURE_SUCCESS(rv, rv) failed with result 0xC1F30001: file c:/Mozilla/src/gum/netwerk/cache2/CacheFileMetadata.cpp, line 218
 0:00.66 WARNING: NS_ENSURE_TRUE(compMgr) failed: file c:/Mozilla/src/gum/_obj-browser-debug/xpcom/build/nsComponentManagerUtils.cpp, line 58
 0:00.66 nsStringStats
 0:00.66  => mAllocCount:           2522
 0:00.66  => mReallocCount:          363
 0:00.66  => mFreeCount:            2522
 0:00.66  => mShareCount:           8889
 0:00.66  => mAdoptCount:            100
 0:00.66  => mAdoptFreeCount:        100
 0:00.66 <<<<<<<
 0:00.66 INFO | Result summary:
 0:00.66 INFO | Passed: 0
 0:00.66 INFO | Failed: 1
 0:00.66 INFO | Todo: 0
 0:00.66 INFO | Retried: 0
Whiteboard: [cache2]
This failure is not intermittent at all.  This is a regular failure on linux.  Hard to say when it has started.

CacheFileIOManager::DoomFileByKeyInternal doesn't find the file so it says it has not been able to doom (unexpected NS_ERROR_NOT_AVAILABLE bubbles to the test listener).  Which might seem correct, the file had not been written until that time.  But an entry exists and there is a _doomed_ handle for it:

CacheFileHandles::GetHandle() hash=ae330970e7e750fc2a6de17032001e3480a0fd38 found doomed handle 7f5bd97926a0, entry 7f5bd70a2cd0

GetHandle though returns NS_ERROR_NOT_AVAILABLE/nullptr since handle->IsDoomed() thus we cannot bypass file deletion at CacheFileIOManager::DoomFileByKeyInternal:

  if (handle) {
    if (handle->IsDoomed())
      return NS_OK;

    return DoomFileInternal(handle);
  }

That return NS_OK is actually a dead code.

Maybe just a "give me doomed handle" bool argument can do the job here?
Assignee: nobody → michal.novotny
OS: All → Linux
Summary: HTTP cache v2: intermittent failure of test_doomentry.js | 2147746065 == 0 | test_doomentry.js :: check_doom3 :: line 74 → HTTP cache v2: failure of test_doomentry.js | 2147746065 == 0 | test_doomentry.js :: check_doom3 :: line 74
Attached patch v1 (obsolete) — Splinter Review
As suggested in comment 1, this fixes the test for me on linux.

Pushed to gum as https://hg.mozilla.org/projects/gum/rev/3fad25133446
Assignee: michal.novotny → honzab.moz
Status: NEW → ASSIGNED
Attachment #829564 - Flags: review?(michal.novotny)
Still has problems, this time only intermittent:

https://tbpl.mozilla.org/php/getParsedLog.php?id=30353054&tree=Gum
[ No longer blocks enabling cache2, expected to be fixed after cache2 is on ]
No longer blocks: cache2enable
Blocks: cache2enable
No longer blocks: cache2enable
Blocks: cache2enable
No longer blocks: cache2enable
Comment on attachment 829564 [details] [diff] [review]
v1

Doesn't work anyway.  I need to look at this ones again.
Attachment #829564 - Flags: review?(michal.novotny)
Attached patch v1 (merged) (obsolete) — Splinter Review
Attachment #829564 - Attachment is obsolete: true
Another problem is following:
- this is about OS thread scheduling
- dooming is handled before opening (and writing) files
- the test writes and then immediately dooms
- if the IO thread doesn't process open+write sooner then we schedule doom, the file is not found and CacheFileIOManager::DoomFileByKeyInternal fails with NS_ERROR_NOT_AVAILABLE despite the fact the goal of dooming (a.k.a. getting rid of) the file actually succeeded...  hard to say what to do.


Simplest solution seems to me to merge DOOM and OPEN event levels into just a single OPEN level where dooms will be handled as well, in order with openings.  That preserves consistency when someone dooms a file sooner then it tries to open-read it.
Attached patch v2 (obsolete) — Splinter Review
Pushed to gum as https://hg.mozilla.org/projects/gum/rev/dbc8975938d4
- same as v1
- removed DOOM and DOOM_PRIORITY event levels (+the IMMEDIATE one that was unused)
Attachment #8356671 - Attachment is obsolete: true
Attachment #8357449 - Flags: review?(michal.novotny)
Attachment #8357449 - Flags: review?(michal.novotny) → review+
Comment on attachment 8357449 [details] [diff] [review]
v2

The problem is (yet more!) elsewhere... and a bit more complex.

The still appearing failure is caused by following scenario:
- precondition: IOThread start is delayed (a lot after we open/write/doom on the main thread in the test)
- we create CacheEntry
- we create CacheFile, with OPEN_TRUNCATE
- we demand CacheFileHandle, that is created asynchronously
- we write to the cache entry
- we close it
- we doom it:
  - CacheEntry::Doom
  - CacheFile::Doom
  - no handle -> DoomByKey scheduled *)
- till this point, nothing of the operations actually happened, since the IOThread sleeps
- now, IOThread gets its time
- DoomByKey is executed as the first *)
- we open
- we write
- we close
=> file's on disk + doom fails since the file was not there at the time

*) first, there is a typo when posting the DoomByKey event, instead a priority we pass nsIEventTarget::DISPATCH_NORMAL (argument is uint32_t so it builds).
second, if there were the OPEN priority used, the test would work, but just this test...  it's very hard to decide which priority is the correct one here

CacheFile::Doom must wait with dooming until it has the handle when it's still opening it (mOpeningFile).
Attachment #8357449 - Flags: review+ → review-
Depends on: 958311
Attached patch v2->v3 IDIFF (obsolete) — Splinter Review
selfexplanatory

https://tbpl.mozilla.org/?tree=Try&rev=ab18348302b8
Attachment #8358618 - Flags: feedback?(michal.novotny)
Comment on attachment 8358618 [details] [diff] [review]
v2->v3 IDIFF

Review of attachment 8358618 [details] [diff] [review]:
-----------------------------------------------------------------

::: netwerk/cache2/CacheFile.cpp
@@ +685,5 @@
>      }
>      else {
>        mHandle = aHandle;
>  
> +      if (mDoomAfterOpenListener) {

Other if-statements in OnFileOpened() must also check if there is mDoomAfterOpenListener and notify listener with NS_ERROR_NOT_AVAILABLE.

@@ +947,5 @@
>  
>    if (mHandle) {
>      rv = CacheFileIOManager::DoomFile(mHandle, listener);
> +  } else if (mOpeningFile) {
> +    mDoomAfterOpenListener = listener;

This won't work in case caller passed nullptr as callback. Listener as well as mDoomAfterOpenListener will be null and we won't doom the handle in OnFileOpened(). DoomFileHelper must be changed so that it can work with no callback.

@@ +952,2 @@
>    } else {
> +    // This should never happen and is potentially dangerous...

This really cannot happen (see the assert at the beginning of this method), so remove this block.
Attachment #8358618 - Flags: feedback?(michal.novotny) → feedback-
No longer depends on: 958311
(In reply to Michal Novotny (:michal) from comment #20)
> Comment on attachment 8358618 [details] [diff] [review]
> v2->v3 IDIFF
> 
> Review of attachment 8358618 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> ::: netwerk/cache2/CacheFile.cpp
> @@ +685,5 @@
> >      }
> >      else {
> >        mHandle = aHandle;
> >  
> > +      if (mDoomAfterOpenListener) {
> 
> Other if-statements in OnFileOpened() must also check if there is
> mDoomAfterOpenListener and notify listener with NS_ERROR_NOT_AVAILABLE.

Can you please direct me to those statements?  It's not clear to me which statement in OnFileOpened (in what class) you are referring to.
Attached patch v4 [healthy] (obsolete) — Splinter Review
This one seems to work for me.
Attachment #8357449 - Attachment is obsolete: true
Attachment #8358618 - Attachment is obsolete: true
Attachment #8359972 - Flags: review?(michal.novotny)
Attachment #8359972 - Attachment description: v4 → v4 [healthy]
(In reply to Honza Bambas (:mayhemer) from comment #25)
> Can you please direct me to those statements?  It's not clear to me which
> statement in OnFileOpened (in what class) you are referring to.

I meant these 2 other places:

http://hg.mozilla.org/projects/gum/annotate/fa46ddfd2751/netwerk/cache2/CacheFile.cpp#l608
If mDoomAfterOpenListener is non-null, then SetMemoryOnly() was called after Doom(). If there is a listener waiting for the result of doom operation, we need to notify it.

http://hg.mozilla.org/projects/gum/annotate/fa46ddfd2751/netwerk/cache2/CacheFile.cpp#l615
The same as above, if there is a doom listener waiting, it must be notified that the operation failed, because there is no handle to doom.
Comment on attachment 8359972 [details] [diff] [review]
v4 [healthy]

Review of attachment 8359972 [details] [diff] [review]:
-----------------------------------------------------------------

r- because of comment #28

::: netwerk/cache2/CacheFile.cpp
@@ +896,5 @@
>      // TODO what exactly to do here?
>      return NS_ERROR_NOT_AVAILABLE;
>    }
>  
> +  nsCOMPtr<CacheFileIOListener> listener = new DoomFileHelper(aCallback);

Just an optimization, we don't need to create the listener if we have a handle and caller didn't provide a callback. So this should look like:

nsCOMPtr<CacheFileIOListener> listener;

if (aCallback || !mHandle) {
  listener = new DoomFileHelper(aCallback);
}
Attachment #8359972 - Flags: review?(michal.novotny) → review-
(In reply to Michal Novotny (:michal) from comment #28)
> (In reply to Honza Bambas (:mayhemer) from comment #25)
> > Can you please direct me to those statements?  It's not clear to me which
> > statement in OnFileOpened (in what class) you are referring to.
> 
> I meant these 2 other places:
> 
> http://hg.mozilla.org/projects/gum/annotate/fa46ddfd2751/netwerk/cache2/
> CacheFile.cpp#l608
> If mDoomAfterOpenListener is non-null, then SetMemoryOnly() was called after
> Doom(). If there is a listener waiting for the result of doom operation, we
> need to notify it.

This seems to me a bit inconsistent.

Imagine following different (only by timing of OnFileOpened) scenarios:

1. File::Init(createNew) -> File::Doom() -> File::SetMemoryOnly -> File::OnFileOpened
2. File::Init(createNew) -> File::Doom() -> File::OnFileOpened -> File::SetMemoryOnly
3. File::Init(createNew) -> File::OnFileOpened -> File::Doom() -> File::SetMemoryOnly

in the first case, doom is NOT performed
in the second case, doom IS performed
in the third case, doom IS performed

IMO we should doom the handle always, regardless the memory only flag.


> 
> http://hg.mozilla.org/projects/gum/annotate/fa46ddfd2751/netwerk/cache2/
> CacheFile.cpp#l615
> The same as above, if there is a doom listener waiting, it must be notified
> that the operation failed, because there is no handle to doom.

This is one is clear.
(In reply to Honza Bambas (:mayhemer) from comment #30)
> 1. File::Init(createNew) -> File::Doom() -> File::SetMemoryOnly ->
> File::OnFileOpened
> 
> in the first case, doom is NOT performed
> 
> IMO we should doom the handle always, regardless the memory only flag.

I think we don't need to doom the handle since it is guaranteed that the entry is empty, file doesn't exist and the handle will be released as soon as possible. In fact, calling doom on such handle just delays its removing from the hash table. The question is whether we should return NS_ERROR_NOT_AVAILABLE or NS_OK to the listener. Probably NS_OK.
Attached patch v5Splinter Review
https://tbpl.mozilla.org/?tree=Try&rev=7bdd96561382
Attachment #8359972 - Attachment is obsolete: true
Attachment #8361135 - Flags: review?(michal.novotny)
Comment on attachment 8361135 [details] [diff] [review]
v5

Review of attachment 8361135 [details] [diff] [review]:
-----------------------------------------------------------------

I would prefer to put the logic of AutoFailDoomListener direcly into DoomListenerHelper, since AutoFailDoomListener makes method CacheFile::OnFileOpened() a bit hard to read and is IMO unnecessary. But this is just a suggestion.
Attachment #8361135 - Flags: review?(michal.novotny) → review+
(In reply to Michal Novotny (:michal) from comment #33)
> Comment on attachment 8361135 [details] [diff] [review]
> v5
> 
> Review of attachment 8361135 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> I would prefer to put the logic of AutoFailDoomListener direcly into
> DoomListenerHelper, since AutoFailDoomListener makes method
> CacheFile::OnFileOpened() a bit hard to read and is IMO unnecessary. But
> this is just a suggestion.

If you are not strongly against this class (that is necessary because of many nested early returns in OnFileOpened) then because of the time press I'd rather leave it.
OK, land it as it is.
https://hg.mozilla.org/mozilla-central/rev/062efe173113
Status: ASSIGNED → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla29
You need to log in before you can comment on or make changes to this bug.