Closed Bug 1042097 Opened 10 years ago Closed 10 years ago

Save Android tombstones to blobber

Categories

(Testing :: General, defect)

x86_64
Linux
defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED
mozilla34

People

(Reporter: gbrown, Assigned: gbrown)

References

Details

Attachments

(1 file)

In bug 803158 (and related), we planned long ago to deploy ndk-stack to foopies, check for tombstones and if found, generate ndk-stack reports in the test logs.

Let's do something simpler instead: Check for tombstones and save them to blobber when found. Test investigators can then pull the tombstones and run ndk-stack if desired -- no need to deploy another tool, less clutter in crash reports, and maybe we will actually get this done!
Depends on: 836334
On Android 2.3, there is no /data/tombstones directory. We should probably create it, similar to what we did recently for /data/anr.

On Android 4.0 and Android 4.2 x86, files are created in /data/tombstones, but I am having trouble retrieving them -- devicemanager.pullFile fails for some reason.
(In reply to Geoff Brown [:gbrown] from comment #1)
...caused by simple permission problems on /data/tombstones. Solved by liberal use of root=True!

Another trick here is that the tombstone files have names like "tombstone_00". Blobber refuses to upload those because there is no recognized file extension -- renaming to tombstone_00.txt does wonders!

For robocop runs, and Android x86, it is easy to generate 2 or more "tombstone_00" files, so I also had to watch for existing files and make unique file names.
Structurally, I'm using the same approach as we have used for ANR reports: delete any tombstones at the start of the test, then when checking for crashes at the end of a test, pull any tombstones found. 

To check for tombstones, automation simply pulls any files from the device's /data/tombstones directory and puts them in the MOZ_UPLOAD_DIR, for blobber processing. As I mentioned in a bug comment, the file name needs to be modified, adding a .txt extension and an integer for uniqueness.

Try run without any crashes: https://tbpl.mozilla.org/?tree=Try&rev=80a43eca764e. Note that a few of these runs actually have tombstones -- these seem to be from SIGSEGV's in shutdown which are not caught be breakpad -- cool!

Try run with intentional crash: https://tbpl.mozilla.org/?tree=Try&rev=7f3259bb065c. Note that some of these do not have tombstones, especially on Android 2.3. I've checked and the tombstone files just aren't there -- a crash does not guarantee the creation of a tombstone file. That's okay with me: We'll upload the tombstone when we can.
Attachment #8470334 - Flags: review?(jmaher)
> Note that a few of these runs actually have tombstones
> -- these seem to be from SIGSEGV's in shutdown which are
> not caught be breakpad -- cool!

I think we should make these fatal - what are your thoughts?
(In reply to Ed Morley [:edmorley] from comment #4)
> I think we should make these fatal - what are your thoughts?

I would not make them fatal. I think it is unlikely that a failure that late in shutdown will affect the user experience -- better to focus our efforts elsewhere.
Makes sense, sounds good to me :-)
Comment on attachment 8470334 [details] [diff] [review]
upload tombstone files to blobber

Oops - jmaher is away this week. Maybe dminor will take the review?
Attachment #8470334 - Flags: review?(jmaher) → review?(dminor)
Comment on attachment 8470334 [details] [diff] [review]
upload tombstone files to blobber

Review of attachment 8470334 [details] [diff] [review]:
-----------------------------------------------------------------

Overall, looks good to me. r+ once my concerns about the creation/existence of "/data/tombstones" are addressed.

::: build/mobile/remoteautomation.py
@@ +130,5 @@
>          else:
>              print "%s not found" % traces
>  
> +    def deleteTombstones(self):
> +        # delete any existing tombstone files, but ensure the directory exists

What are the consequences of "/data/tombstones" being missing?

It looks like we only do the mkdir after a run. If firefox can create the tombstone directory for us, then I don't see any reason to do the mkdir here. If it can't, then the mkdir should be part of a separate function called prior to the tests running.

Since the comment says "ensure" I think we should either raise an error if we are unable to create the directory, or change the wording in the comment.

@@ +161,5 @@
> +                except DMError:
> +                    # This may just indicate that no tombstone files are present
> +                    pass
> +                self.deleteTombstones()
> +                # add a .txt file extension to each tombstone file name, so 

nit: trailing whitespace

@@ +171,5 @@
> +                    for i in xrange(1, sys.maxint):
> +                        newname = "%s.%d.txt" % (f, i)
> +                        if not os.path.exists(newname):
> +                            os.rename(f, newname)
> +                            break

You could use
  for i, f in enumerate(glob.glob(...))
and then use the value of i as the unique integer to avoid the second loop.
Attachment #8470334 - Flags: review?(dminor) → review+
(In reply to Dan Minor [:dminor] from comment #8)
> What are the consequences of "/data/tombstones" being missing?

I'm glad you asked. I was not sure if a crash would create /data/tombstones and was being cautious. But I checked - https://tbpl.mozilla.org/?tree=Try&rev=05f1c2557dc0 - and indeed, the tombstones directory is created if missing and everything continues to work. I removed the mkdir/chmod.

> You could use
>   for i, f in enumerate(glob.glob(...))
> and then use the value of i as the unique integer to avoid the second loop.

I think that almost works, but doesn't (without additional provisions) in the case of, say, multiple crashes in a robocop run. For example, if robocop-2 has crashes in both testNewTab and testOverscroll, testNewTab writes tombstone_00 which is pulled and renamed to tombstone_00_1, then testOverscroll writes tombstone_00 and a new call to checkTombstones will try tombstone_00_1 again before blobber has cleaned out the upload directory. To handle that, I think we pretty much need a check for local file existence and a separate loop. I'll keep the original code.

https://hg.mozilla.org/integration/mozilla-inbound/rev/f2e3009e1bd1
https://hg.mozilla.org/mozilla-central/rev/f2e3009e1bd1
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla34
"/data/tombstones does not exist; tombstone check skipped"
https://tbpl.mozilla.org/php/getParsedLog.php?id=46167051&tree=Mozilla-Inbound

Is this expected?
Flags: needinfo?(gbrown)
Yes - that's fine.

Our Android 2.3 emulator image starts off without any /data/tombstones directory. If there are no crashes, the system does not create /data/tombstones. If there is a crash, the system may or may not create /data/tombstones (I'm not sure of the conditions that make a difference) and may or may not create a tombstone file.

If we find a tombstone, I'm happy to put it on blobber in case someone finds it interesting; if it's not there, I can live with that.
Flags: needinfo?(gbrown)
(In reply to Geoff Brown [:gbrown] from comment #12)
> Our Android 2.3 emulator image starts off without any /data/tombstones
> directory. If there are no crashes, the system does not create
> /data/tombstones.

Ah this was the piece I was missing; sounds fine to me :-)
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: