Closed Bug 1258236 Opened 8 years ago Closed 8 years ago

Intermittent test_action-clear-snapshots_04.js,browser_memory_diff_01.js | Test timed out, on OS X debug starting with the backout of bug 1054759

Categories

(DevTools :: Memory, defect, P1)

x86_64
macOS
defect

Tracking

(firefox48 fixed)

RESOLVED FIXED
Firefox 48
Tracking Status
firefox48 --- fixed

People

(Reporter: philor, Assigned: fitzgen)

References

(Blocks 1 open bug)

Details

(Keywords: intermittent-failure)

Attachments

(1 file, 1 obsolete file)

I don't get it, but https://treeherder.mozilla.org/#/jobs?repo=mozilla-inbound&revision=8461e2ffc339&filter-searchStr=xpcshell - http://mxr.mozilla.org/mozilla-central/source/devtools/client/memory/test/unit/test_action-clear-snapshots_04.js started timing out on OS X debug (https://treeherder.mozilla.org/logviewer.html#?job_id=24199724&repo=mozilla-inbound) 50% of the time with the backout of something which just landed four days ago, even though test_action-clear-snapshots_04.js has been sitting unchanged since it landed in January.
Looks like an error reading the snapshot:

> 16:04:04     INFO -  PROCESS | 7058 | [7058] WARNING: '!codedStream.ReadVarint32(&size)', file /builds/slave/m-in-m64-d-0000000000000000000/build/src/devtools/shared/heapsnapshot/HeapSnapshot.cpp, line 131
> 16:04:04     INFO -  PROCESS | 7058 | [7058] WARNING: 'aRv.Failed()', file /builds/slave/m-in-m64-d-0000000000000000000/build/src/dom/workers/WorkerPrivate.cpp, line 5499
> 16:04:04     INFO -  PROCESS | 7058 | readSnapshot threw an exception: Error while performing task "readHeapSnapshot": [Exception... "Unexpected error"  nsresult: "0x8000ffff (NS_ERROR_UNEXPECTED)"  location: "JS frame :: resource://devtools/shared/heapsnapshot/HeapAnalysesWorker.js :: <TOP_LEVEL> :: line 31"  data: no]
> 16:04:04     INFO -  PROCESS | 7058 | @resource://devtools/shared/heapsnapshot/HeapAnalysesWorker.js:31:5
> 16:04:04     INFO -  PROCESS | 7058 | createHandler/<@resource://devtools/shared/worker/helper.js:85:24
> 16:04:04     INFO -  PROCESS | 7058 | EventHandlerNonNull*createTask@resource://devtools/shared/worker/helper.js:60:24
> 16:04:04     INFO -  PROCESS | 7058 | @resource://devtools/shared/heapsnapshot/HeapAnalysesWorker.js:29:1
> 16:04:04     INFO -  PROCESS | 7058 | console.error:
> 16:04:04     INFO -  PROCESS | 7058 |   readSnapshot threw an exception: Error while performing task "readHeapSnapshot": [Exception... "Unexpected error"  nsresult: "0x8000ffff (NS_ERROR_UNEXPECTED)"  location: "JS frame :: resource://devtools/shared/heapsnapshot/HeapAnalysesWorker.js :: <TOP_LEVEL> :: line 31"  data: no]
> 16:04:04     INFO -  PROCESS | 7058 | @resource://devtools/shared/heapsnapshot/HeapAnalysesWorker.js:31:5
> 16:04:04     INFO -  PROCESS | 7058 | createHandler/<@resource://devtools/shared/worker/helper.js:85:24
> 16:04:04     INFO -  PROCESS | 7058 | EventHandlerNonNull*createTask@resource://devtools/shared/worker/helper.js:60:24
> 16:04:04 INFO - PROCESS | 7058 | @resource://devtools/shared/heapsnapshot/HeapAnalysesWorker.js:29:1
Assignee: nobody → nfitzgerald
Priority: -- → P1
I can't reproduce this locally, even though I am on OSX 10.10. I ran it in a loop 100 times, no failures.
Summary: Intermittent test_action-clear-snapshots_04.js | Test timed out, on OS X debug starting with the backout of bug 1054759 → Intermittent test_action-clear-snapshots_04.js,browser_memory_diff_01.js | Test timed out, on OS X debug starting with the backout of bug 1054759
Fold reading the size of the next message and testing whether the stream has
more data into the same operation.

Try push: https://treeherder.mozilla.org/#/jobs?repo=try&revision=2415bf3e5ad7
Attachment #8734869 - Flags: review?(sphink)
Comment on attachment 8734869 [details] [diff] [review]
Change the way we test for end-of-stream in ReadHeapSnapshot

Review of attachment 8734869 [details] [diff] [review]:
-----------------------------------------------------------------

Feel free to disagree, but this seems a little weird to me.

::: devtools/shared/heapsnapshot/HeapSnapshot.cpp
@@ +382,5 @@
>  #undef GET_STRING_OR_REF_WITH_PROP_NAMES
>  #undef GET_STRING_OR_REF
>  
> +// Because protobuf messages aren't self-delimiting, we serialize each message
> +// preceeded by its size in bytes. When deserializing, we read this size and

*preceded

@@ +407,5 @@
>    // First is the metadata.
>  
>    protobuf::Metadata metadata;
> +  if (NS_WARN_IF(!readSizeOfNextMessage(gzipStream, &sizeOfMessage)))
> +    return false;

It feels weird to me to lump together actual errors and false returns due to the stream ending prematurely. I think I'd prefer it if readSizeOfNextMessage returned true on the end of stream, and this was split into

  if (NS_WARN_IF(!readSizeOfNextMessage(gzipStream, &sizeOfMessage))
    return false;
  if (NS_WARN_IF(sizeOfMessage == 0))
    return false;

...though if you do that, there's really no need for readSizeOfNextMessage at all, other than for giving a nice place to hang the comment.

@@ +441,5 @@
> +  // Test for the end of the stream. The protobuf library gives no way to tell
> +  // the difference between an underlying read error and the stream being
> +  // done. All we can do is attempt to read the size of the next message and
> +  // extrapolate guestimations from the result of that operation.
> +  while (readSizeOfNextMessage(gzipStream, &sizeOfMessage)) {

Here, the comment seems stale -- protobufs may not be able to tell you when the stream is done, but now that you're inserting sizes yourself, you always know whether you've hit the end or there's an error. This seems like a good place for the readSizeOfNextMessage comment to move.

Also, making readSizeOfNextMessage() false on either an error or EOS works here, but it's a little cryptic, and prevents a warning from being emitted for the error case. I think I'd still prefer

  while (true) {
    if (!codedStream.ReadVarint32(&sizeOfMessage)) {
      NS_WARNING("error reading from stream, expected size or EOF");
      return false;
    }
    if (sizeOfMessage == 0)
      break;
    ...
  }
Attachment #8734869 - Flags: review?(sphink)
Comment on attachment 8734869 [details] [diff] [review]
Change the way we test for end-of-stream in ReadHeapSnapshot

Review of attachment 8734869 [details] [diff] [review]:
-----------------------------------------------------------------

Ok, fitzgen explained to me over IRC that the problem is in ReadVarint32, in that there's no way to distinguish error from end of stream. So yeah, it seems this is the best you can do.
Attachment #8734869 - Flags: review+
Attachment #8734869 - Attachment is obsolete: true
Reminder to check the try push and see if the intermittent still seems to be around.
Flags: needinfo?(nfitzgerald)
Flags: needinfo?(nfitzgerald)
Keywords: checkin-needed
https://hg.mozilla.org/mozilla-central/rev/334c1b88d28a
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Target Milestone: --- → Firefox 48
Product: Firefox → DevTools
You need to log in before you can comment on or make changes to this bug.